aegilops tauschii genome: Topics by Science.gov

Sample records for aegilops tauschii genome

Genome-wide analysis of the WRKY transcription factors in Aegilops tauschii.

PubMed

Ma, Jianhui; Zhang, Daijing; Shao, Yun; Liu, Pei; Jiang, Lina; Li, Chunxi

2014-01-01

The WRKY transcription factors (TFs) play important roles in responding to abiotic and biotic stress in plants. However, due to its unfinished genome sequencing, relatively few WRKY TFs with full-length coding sequences (CDSs) have been identified in wheat. Instead, the Aegilops tauschii genome, which is the D-genome progenitor of the hexaploid wheat genome, provides important resources for the discovery of new genes. In this study, we performed a bioinformatics analysis to identify WRKY TFs with full-length CDSs from the A. tauschii genome. A detailed evolutionary analysis for all these TFs was conducted, and quantitative real-time PCR was carried out to investigate the expression patterns of the abiotic stress-related WRKY TFs under different abiotic stress conditions in A. tauschii seedlings. A total of 93 WRKY TFs were identified from A. tauschii, and 79 of them were found to be newly discovered genes compared with wheat. Gene phylogeny, gene structure and chromosome location of the 93 WRKY TFs were fully analyzed. These studies provide a global view of the WRKY TFs from A. tauschii and a firm foundation for further investigations in both A. tauschii and wheat. © 2015 S. Karger AG, Basel.
Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

USDA-ARS's Scientific Manuscript database

Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...
Complete chloroplast genomes of Aegilops tauschii Coss. and Ae. cylindrica Host sheds light on plasmon D evolution.

PubMed

Gogniashvili, Mari; Jinjikhadze, Tamar; Maisaia, Inesa; Akhalkatsi, Maia; Kotorashvili, Adam; Kotaria, Nato; Beridze, Tengiz; Dudnikov, Alexander Ju

2016-11-01

Hexaploid wheat (Triticum aestivum L., genomes AABBDD) originated in South Caucasus by allopolyploidization of the cultivated Emmer wheat T. dicoccum (genomes AABB) with the Caucasian Ae. tauschii ssp. strangulata (genomes DD). Genetic variation of Ae. tauschii is an important natural resource, so it is of particular importance to investigate how this variation was formed during the evolutionary history of Ae. tauschii and how it is distributed across the species range. The D genome is also found in tetraploid Ae. cylindrica Host (2n = 28, CCDD). The plasmon diversity that exists in Triticum and Aegilops species is of great significance for understanding the evolution of these genera. In the present investigation the complete nucleotide sequences of plasmon D (chloroplast DNA) of nine accessions of Ae. tauschii and two accessions of Ae. cylindrica are presented. Twenty-eight SNPs are characteristic of both TauL1 and TauL2 accessions of Ae. tauschii using TauL3 as a reference. Four additional SNPs are observed for the TauL2 lineage. The longest (27 bp) indel is located in the intergenic spacer Rps15-ndhF of the SSC. This indel can be used for simple determination of the TauL3 lineage among Ae. tauschii accessions. In the case of Ae. cylindrica, 7 additional SNPs were observed. The phylogeny tree shows that chloroplast DNA of TauL1 and TauL2 diverged from the TauL3 lineage. The TauL1 lineage is relatively older than TauL2. The position of Ae. cylindrica accessions on the Ae. tauschii phylogeny tree constructed from chloroplast DNA variation data is intermediate between TauL1 and TauL2. The complete nucleotide sequence of chloroplast DNA of Ae. tauschii and Ae. cylindrica allows the origin and evolution of the D plasmon of genus Aegilops to be refined.
Gene Space Dynamics during the Evolution of Aegilops tauschii, Brachypodium distachyon, Oryza sativa, and Sorghum bicolor Genomes

USDA-ARS's Scientific Manuscript database

Nine different regions totaling 9.7 Mb of the 4.02 Gb Aegilops tauschii genome were sequenced using the Sanger sequencing technology and compared with orthologous Brachypodium distachyon, Oryza sativa (rice) and Sorghum bicolor (sorghum) genomic sequences. The ancestral gene content in these regio...
Bread wheat progenitors: Aegilops tauschii (DD genome) and Triticum dicoccoides (AABB genome) reveal differential antioxidative response under water stress.

PubMed

Suneja, Yadhu; Gupta, Anil Kumar; Bains, Navtej Singh

2017-01-01

Antioxidant enzymes are known to play a significant role in scavenging reactive oxygen species and maintaining cellular homeostasis. Activity of four antioxidant enzymes, viz. superoxide dismutase (SOD), catalase (CAT), ascorbate peroxidase (APX) and glutathione reductase (GR), was examined in the flag leaves of nine Aegilops tauschii and three Triticum dicoccoides accessions along with two bread wheat cultivars under irrigated and rain-fed conditions. These accessions were shortlisted from a larger set on the basis of field performance for a set of morpho-physiological traits. At anthesis, significant differences were observed in enzyme activities between the two environments. A 45% elevation in average GR activity was observed under rain-fed conditions. Genotypic variation was evident within each environment as well as in terms of response to the stress environment. Aegilops tauschii accession 3769 (86% increase in SOD, 41% in CAT, 72% in APX, 48% in GR activity) and acc. 14096 (37% increase in SOD, 32% CAT, 25% APX, 42% GR) showed up-regulation in the activity of all four studied antioxidant enzymes. Aegilops tauschii accessions 9809, 14189 and 14113 also seemed to have a strong induction mechanism, as elevated activity of at least three enzymes was observed in them under rain-fed conditions. T. dicoccoides, on the other hand, maintained active antioxidative machinery under irrigated conditions with relatively lower induction under stress. A significant positive correlation (r = 0.760) was identified between change in the activity of CAT and GR under stress. Changes in plant height, spike length and grain weight were recorded under stress and non-stress conditions, on the basis of which a cumulative tolerance index was deduced and accessions were ranked for drought tolerance. Overall, Ae. tauschii accessions 3769, 14096, 14113 (DD-genome) and T. dicoccoides accession 7054 (AABB-genome) may be used as donors to combine beneficial stress adaptive traits of all the three sub-genomes
Introgression lines of Triticum aestivum x Aegilops tauschii: Agronomic and nutritional value

USDA-ARS's Scientific Manuscript database

Eighty-five single homozygous substitution lines (SLs) of the Aegilops tauschii D genome in Chinese Spring (CS) hexaploid wheat (Triticum aestivum L.) genetic background were evaluated for agronomic, phenotypic and ionome profiles during three years of field experiments. An augmented design with a r...
Phenotypic and ionome profiling of Triticum aestivum x Aegilops tauschii introgression lines

USDA-ARS's Scientific Manuscript database

Eighty-four single homozygous introgressions of the Aegilops tauschii D-genome in the ‘Chinese Spring’ genetic background were used to study phenotypic and ionome profiles during two years of field experiments. An augmented design was used with a repeated check of a local bread wheat cultivar was im...
[Phylogenetic relationships and intraspecific variation of D-genome Aegilops L. as revealed by RAPD analysis].

PubMed

Goriunova, S V; Kochieva, E Z; Chikida, N N; Pukhal'skiĭ, V A

2004-05-01

RAPD analysis was carried out to study the genetic variation and phylogenetic relationships of polyploid Aegilops species, which contain the D genome as a component of the alloploid genome, and diploid Aegilops tauschii, which is a putative donor of the D genome for common wheat. In total, 74 accessions of six D-genome Aegilops species were examined. The highest intraspecific variation (0.03-0.21) was observed for Ae. tauschii. Intraspecific distances between accessions ranged from 0.007 to 0.067 in Ae. cylindrica, 0.017 to 0.047 in Ae. vavilovii, and 0.00 to 0.053 in Ae. juvenalis. Likewise, Ae. ventricosa and Ae. crassa showed low intraspecific polymorphism. The among-accession difference in alloploid Ae. ventricosa (genome DvNv) was similar to that of one parental species, Ae. uniaristata (N), and substantially lower than in the other parent, Ae. tauschii (D). The among-accession difference in Ae. cylindrica (CcDc) was considerably lower than in either parent, Ae. tauschii (D) or Ae. caudata (C). With the exception of Ae. cylindrica, all D-genome species, namely Ae. tauschii (D), Ae. ventricosa (DvNv), Ae. crassa (XcrDcr1 and XcrDcr1Dcr2), Ae. juvenalis (XjDjUj), and Ae. vavilovii (XvaDvaSva), formed a single polymorphic cluster, which was distinct from clusters of other species. The only exception, Ae. cylindrica, did not group with the other D-genome species, but clustered with Ae. caudata (C), a donor of the C genome. The cluster of these two species was clearly distinct from the cluster of the other D-genome species and close to a cluster of Ae. umbellulata (genome U) and Ae. ovata (genome UgMg). Thus, RAPD analysis was used for the first time to estimate and compare the interpopulation polymorphism and to establish the phylogenetic relationships of all diploid and alloploid D-genome Aegilops species.
Transcriptomic analysis of Aegilops tauschii during long-term salinity stress.

PubMed

Mansouri, Mehdi; Naghavi, Mohammad Reza; Alizadeh, Hoshang; Mohammadi-Nejad, Ghasem; Mousavi, Seyed Ahmad; Salekdeh, Ghasem Hosseini; Tada, Yuichi

2018-06-21

Aegilops tauschii is the diploid progenitor of the bread wheat D-genome. It originated from Iran and is a source of abiotic stress tolerance genes. However, little is known about the molecular events of salinity tolerance in Ae. tauschii. This study investigates the leaf transcriptional changes associated with long-term salt stress. Total RNA extracted from leaf tissues of control and salt-treated samples was sequenced using the Illumina technology, and more than 98 million high-quality reads were assembled into 255,446 unigenes with an average length of 1398 bp and an N50 of 2269 bp. Functional annotation of the unigenes showed that 93,742 (36.69%) had at least a significant BLAST hit in the SwissProt database, while 174,079 (68.14%) showed significant similarity to proteins in the NCBI nr database. Differential expression analysis identified 4506 salt stress-responsive unigenes. Bioinformatic analysis of the differentially expressed unigenes (DEUs) revealed a number of biological processes and pathways involved in the establishment of ion homeostasis, signaling processes, carbohydrate metabolism, and post-translational modifications. Fine regulation of starch and sucrose content may be important features involved in salt tolerance in Ae. tauschii. Moreover, 82% of DEUs mapped to the D-subgenome, including known QTL for salt tolerance, and these DEUs showed similar salt stress responses in other accessions of Ae. tauschii. These results could provide fundamental insight into the regulatory process underlying salt tolerance in Ae. tauschii and wheat and facilitate identification of genes involved in their salt tolerance mechanisms.
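The N50 statistic quoted for the unigene assembly above can be illustrated with a minimal Python sketch. It uses the common definition of N50 (the length L such that sequences of length L or longer contain at least half of all assembled bases); the function name `n50` and the toy lengths are assumptions for illustration and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical toy assembly (not data from the abstract).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(f"N50 = {toy_n50} bp, mean length = {toy_mean:.0f} bp")
```

Because N50 weights long sequences more heavily than the mean does, an N50 well above the average length, as in the figures reported above, is typical of transcriptome assemblies.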
Physical mapping of a large plant genome using global high-information-content-fingerprinting: the distal region of the wheat ancestor Aegilops tauschii chromosome 3DS

PubMed Central

2010-01-01

Background: Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat. Results: We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs. Conclusions: The physical map reported here is the first physical map using fingerprinting of a complete Triticeae genome. This study
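The recombination rates in the record above are ratios of genetic distance (cM) to physical distance (Mb). The Python sketch below shows that arithmetic under stated assumptions: the interval sizes are hypothetical values chosen only to reproduce the reported rates, and `estimate_gap_mb` is one plausible reading of how a gap could be sized from a local rate, not the authors' documented procedure.

```python
def recombination_rate(genetic_cm, physical_mb):
    """Recombination rate in cM/Mb for one interval."""
    if physical_mb <= 0:
        raise ValueError("physical distance must be positive")
    return genetic_cm / physical_mb


def estimate_gap_mb(genetic_gap_cm, local_rate_cm_per_mb):
    """Hypothetical gap estimate: genetic distance spanning a gap
    divided by the local recombination rate of flanking contigs."""
    if local_rate_cm_per_mb <= 0:
        raise ValueError("local rate must be positive")
    return genetic_gap_cm / local_rate_cm_per_mb


# Hypothetical intervals chosen to match the reported rates.
distal = recombination_rate(4.38, 2.0)     # 2.19 cM/Mb
proximal = recombination_rate(0.45, 5.0)   # 0.09 cM/Mb
fold_drop = distal / proximal

# Hypothetical gap spanning 0.5 cM in a region recombining at the distal rate.
gap_mb = estimate_gap_mb(0.5, distal)

print(f"distal {distal:.2f} cM/Mb, proximal {proximal:.2f} cM/Mb")
print(f"fold drop {fold_drop:.1f}, estimated gap {gap_mb:.2f} Mb")
```

With the rates reported in the abstract, the distal region recombines roughly 24 times more often per megabase than the proximal region.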
Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

PubMed Central

2011-01-01

Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequence. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was
Identification and Analysis of RNA Editing Sites in the Chloroplast Transcripts of Aegilops tauschii L.

PubMed Central

Wang, Mengxing; Liu, Hui; Ge, Lingqiao; Xing, Guangwei; Wang, Meng; Weining, Song; Nie, Xiaojun

2016-01-01

RNA editing is an important way to convert cytidine (C) to uridine (U) at specific sites within RNA molecules at a post-transcriptional level in the chloroplasts of higher plants. Although it has been systematically studied in many plants, little is known about RNA editing in the wheat D genome donor Aegilops tauschii L. Here, we investigated the chloroplast RNA editing of Ae. tauschii and compared it with other wheat relatives to trace the evolution of wheat. Through bioinformatics prediction, a total of 34 C-to-U editing sites were identified, 17 of which were validated using RT-PCR product sequencing. Furthermore, 60 sites were found by the RNA-Seq read mapping approach, 24 of which agreed with the prediction and six were validated experimentally. The editing sites were biased toward tCn or nCa trinucleotides and 5′-pyrimidines, which were consistent with the flanking bases of editing sites of other seed plants. Furthermore, the editing events could result in the alteration of the secondary structures and topologies of the corresponding proteins, suggesting that RNA editing might impact the function of target genes. Finally, comparative analysis found some evolutionarily conserved editing sites in wheat and two species-specific sites were also obtained. This study is the first to report on RNA editing in Aegilops tauschii L, which not only sheds light on the evolution of wheat from the point of view of RNA editing, but also lays a foundation for further studies to identify the mechanisms of C-to-U alterations. PMID:28042823
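The sequence-context bias described above (edited cytidines preferentially in tCn or nCa trinucleotides and preceded by a pyrimidine) can be checked for any candidate site with a few lines of Python. This is a minimal sketch: the function name `editing_context`, the 0-based coordinate convention, and the example sequences are assumptions for illustration, not part of the study's pipeline.

```python
def editing_context(seq, pos):
    """Classify the flanking bases of a candidate C-to-U editing site.

    seq: nucleotide sequence (DNA or RNA alphabet).
    pos: 0-based index of the edited cytidine.
    Returns a dict of booleans for the three context classes.
    """
    seq = seq.upper().replace("U", "T")
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytidine")
    # Use "N" when the site sits at a sequence boundary.
    upstream = seq[pos - 1] if pos > 0 else "N"
    downstream = seq[pos + 1] if pos + 1 < len(seq) else "N"
    return {
        "tCn": upstream == "T",
        "nCa": downstream == "A",
        "5'-pyrimidine": upstream in ("C", "T"),
    }


# Hypothetical candidate sites (not sequences from the paper).
print(editing_context("ATCAG", 2))
print(editing_context("GGCGT", 2))
```

Tallying these flags over a list of predicted sites would give the proportions needed to judge whether a site set shows the same bias.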
Chloroplast and nuclear microsatellite analysis of Aegilops cylindrica.

PubMed

Gandhi, Harish T; Vales, M Isabel; Watson, Christy J W; Mallory-Smith, Carol A; Mori, Naoki; Rehman, Maqsood; Zemetra, Robert S; Riera-Lizarazu, Oscar

2005-08-01

Aegilops cylindrica Host (2n = 4x = 28, genome CCDD) is an allotetraploid formed by hybridization between the diploid species Ae. tauschii Coss. (2n = 2x = 14, genome DD) and Ae. markgrafii (Greuter) Hammer (2n = 2x = 14, genome CC). Previous research has shown that Ae. tauschii contributed its cytoplasm to Ae. cylindrica. However, our analysis with chloroplast microsatellite markers showed that 1 of the 36 Ae. cylindrica accessions studied, TK 116 (PI 486249), had a plastome derived from Ae. markgrafii rather than Ae. tauschii. Thus, Ae. markgrafii has also contributed its cytoplasm to Ae. cylindrica. Our analysis of chloroplast and nuclear microsatellite markers also suggests that D-type plastome and the D genome in Ae. cylindrica were closely related to, and were probably derived from, the tauschii gene pool of Ae. tauschii. A determination of the likely source of the C genome and the C-type plastome in Ae. cylindrica was not possible.
MlNCD1: A novel Aegilops tauschii derived powdery mildew resistance gene identified in common wheat

USDA-ARS's Scientific Manuscript database

Powdery mildew is a major fungal disease in wheat, especially in cool maritime climates. A novel Aegilops tauschii derived wheat powdery mildew resistance gene present in the germplasm line NC96BGTD1 was genetically characterized as a monogenic trait in field trials using F2 and F4-derived lines fr...
Five Fatty Acyl-Coenzyme A Reductases Are Involved in the Biosynthesis of Primary Alcohols in Aegilops tauschii Leaves

PubMed Central

Wang, Meiling; Wu, Hongqi; Xu, Jing; Li, Chunlian; Wang, Yong; Wang, Zhonghua

2017-01-01

The diploid Aegilops tauschii is the D-genome donor to hexaploid wheat (Triticum aestivum) and represents a potential source for genetic study in common wheat. The ubiquitous wax covering the aerial parts of plants plays an important role in protecting plants against non-stomatal water loss. Cuticular waxes are complex mixtures of very-long-chain fatty acids, alkanes, primary and/or secondary alcohols, aldehydes, ketones, esters, triterpenes, sterols, and flavonoids. In the present work, primary alcohols were identified as the major components of leaf cuticular wax in Ae. tauschii, with C26:0-OH being the dominant primary alcohol. Analysis by scanning electron microscope revealed that dense platelet-shaped wax crystals were deposited on leaf surfaces of Ae. tauschii. Ten putative wax biosynthetic genes encoding fatty acyl-coenzyme A reductase (FAR) were identified in the genome of Ae. tauschii. Five of these genes, Ae.tFAR1, Ae.tFAR2, Ae.tFAR3, Ae.tFAR4, and Ae.tFAR6, were found expressed in the leaf blades. Heterologous expression of the five Ae.tFARs in yeast (Saccharomyces cerevisiae) showed that Ae.tFAR1, Ae.tFAR2, Ae.tFAR3, Ae.tFAR4, and Ae.tFAR6 were predominantly responsible for the accumulation of C16:0, C18:0, C26:0, C24:0, and C28:0 primary alcohols, respectively. In addition, nine Ae.tFAR paralogous genes were located on D chromosome of wheat and the wheat nullisomic–tetrasomic lines with the loss of Ae.tFAR3 and Ae.tFAR4 paralogous genes had significantly reduced levels of primary alcohols in the leaf blades. Collectively, these data suggest that Ae.tFAR1, Ae.tFAR2, Ae.tFAR3, Ae.tFAR4, and Ae.tFAR6 encode alcohol-forming FARs involved in the biosynthesis of primary alcohols in the leaf blades of Ae. tauschii. The information obtained in Ae. tauschii enables us to better understand wax biosynthesis in common wheat. PMID:28659955
A population of wheat multiple synthetic derivatives: an effective platform to explore, harness and utilize genetic diversity of Aegilops tauschii for wheat improvement.

PubMed

Gorafi, Yasir Serag Alnor; Kim, June-Sik; Elbashir, Awad Ahmed Elawad; Tsujimoto, Hisashi

2018-04-28

The multiple synthetic derivatives platform described in this study will provide an opportunity for effective utilization of Aegilops tauschii traits and genes for wheat breeding. Introducing genes from wild relatives is the best option to increase genetic diversity and discover new alleles necessary for wheat improvement. A population harboring genomic fragments from the diploid wheat progenitor Aegilops tauschii Coss. in the background of bread wheat (Triticum aestivum L.) was developed by crossing and backcrossing 43 synthetic wheat lines with the common wheat cultivar Norin 61. We named this population multiple synthetic derivatives (MSD). To validate the suitability of this population for wheat breeding and genetic studies, we randomly selected 400 MSD lines and genotyped them by using Diversity Array Technology sequencing markers. We scored black glume as a qualitative trait and heading time in two environments in Sudan as a quantitative trait. Our results showed high genetic diversity and less recombination which is expected from the nature of the population. Genome-wide association (GWA) analysis showed one QTL at the short arm of chromosome 1D different from those alleles reported previously indicating that black glume in the MSD population is controlled by new allele at the same locus. For heading time, from the two environments, GWA analysis revealed three QTLs on the short arms of chromosomes 2A, 2B and 2D and two on the long arms of chromosomes 5A and 5D. Using the MSD population, which represents the diversity of 43 Ae. tauschii accessions representing most of its natural habitat, QTLs or genes and desired phenotypes (such as drought, heat and salinity tolerance) could be identified and selected for utilization in wheat breeding.
Sequencing and comparative analyses of Aegilops tauschii chromosome arm 3DS revealed rapid evolution of Triticeae genome

USDA-ARS's Scientific Manuscript database

Bread wheat (Triticum aestivum, AABBDD) is an allohexaploid species derived from multiple rounds of interspecific hybridizations. A high-quality genome assembly of diploid Ae. tauschii, the donor of the wheat D genome, will provide a useful platform to study polyploid wheat evolution. A combination...
Aegilops tauschii Accessions with Geographically Diverse Origin Show Differences in Chromosome Organization and Polymorphism of Molecular Markers Linked to Leaf Rust and Powdery Mildew Resistance Genes.

PubMed

Majka, Maciej; Kwiatek, Michał T; Majka, Joanna; Wiśniewska, Halina

2017-01-01

Aegilops tauschii (2n = 2x = 14) is a diploid wild species which is reported as a donor of the D-genome of cultivated bread wheat. The main goal of this study was to examine the differences and similarities in chromosome organization among accessions of Ae. tauschii of geographically diverse origin, a species regarded as a potential source of genes for cereal breeding, especially those determining resistance to fungal diseases (i.e., leaf rust and powdery mildew). We established and compared the fluorescence in situ hybridization patterns of 21 accessions of Ae. tauschii using various repetitive sequences, mainly from the BAC library of wheat cultivar Chinese Spring. Results obtained for Ae. tauschii chromosomes revealed many similarities between the analyzed accessions; however, some hybridization patterns were specific to accessions that came from cognate regions of the world. The most noticeable differences were observed for accessions from China, which were characterized by the presence of distinct signals of pTa-535 in the interstitial region of chromosome 3D, lower intensity of pTa-86 signals in chromosome 2D, as well as a lack of additional signals of pTa-86 in chromosomes 1D, 5D, or 6D. Ae. tauschii of Chinese origin appeared homogeneous and separate from landraces that originated in western Asia. Ae. tauschii chromosomes showed similar hybridization patterns to wheat D-genome chromosomes, but some differences were also observed between the two species. Moreover, we identified a reciprocal translocation between the short arm of chromosome 1D and the long arm of chromosome 7D in an accession of Iranian origin. High polymorphism between the analyzed accessions and extensive allelic variation were revealed using molecular markers associated with resistance genes. The majority of the markers localized in chromosomes 1D and 2D showed diversity of banding patterns between accessions. The results imply that there is a moderate or high level of polymorphism in the genome of Ae
Characterization of morphology and resistance to Blumeria graminis of winter triticale monosomic addition lines with chromosome 2D of Aegilops tauschii.

PubMed

Majka, M; Kwiatek, M; Belter, J; Wiśniewska, H

2016-10-01

Allocation of chromosome 2D of Ae. tauschii in a triticale background resulted in changes of its organization, which is related to varied expression of genes determining agronomically important traits. Monosomic alien addition lines (MAALs) are crucial for the transfer of genes from wild relatives into cultivated varieties. Genetic stocks of this kind are used for physical mapping of specific chromosomes and for analyzing alien gene expression. The main aim of our study is to improve hexaploid triticale by transferring D-genome chromatin from Aegilops tauschii × Secale cereale (2n = 4x = 28, DDRR). In this paper, we demonstrate molecular cytogenetic analysis and SSR marker screening combined with phenotype analysis and evaluation of powdery mildew infection of triticale monosomic addition lines carrying chromosome 2D of Ae. tauschii. We confirmed the inheritance of chromosome 2D from the BC2F4 to the BC2F6 generation of triticale hybrids. Moreover, we unveiled a highly variable region on the short arm of chromosome 2D, where chromosome rearrangements were mapped. These events had a direct influence on the plant height of hybrids, which might be connected with changes at the Rht8 locus. We obtained 20 semi-dwarf plants of the BC2F6 generation carrying chromosome 2D with powdery mildew resistance and without changes in spike morphology, which can be used in triticale breeding programs.
Radiation hybrid maps of the D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes.

PubMed

Kumar, Ajay; Seetan, Raed; Mergoum, Mohamed; Tiwari, Vijay K; Iqbal, Muhammad J; Wang, Yi; Al-Azzam, Omar; Šimková, Hana; Luo, Ming-Cheng; Dvorak, Jan; Gu, Yong Q; Denton, Anne; Kilian, Andrzej; Lazo, Gerard R; Kianian, Shahryar F

2015-10-16

The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/sequence scaffolds for whole genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps, for it offers much higher and more uniform marker resolution across the length of the chromosome compared to genetic mapping and does not require marker polymorphism per se, as it is based on a presence (retention) vs. absence (deletion) marker assay. In this study, a 178 line RH panel was genotyped with SSRs and DArT markers to develop the first high resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with: 1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) an RH map of the D-genome of reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored with a BAC contig based physical map and draft sequence of the D-genome of Ae. tauschii. A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR, which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb/cR or 0.07 cM/cR. The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the position of markers on AL8/78 maps and the four published maps ranged from 0.75 to 0.92, suggesting a good agreement in marker order. With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average
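The map statistics in the record above follow from simple ratios, which the short Python sketch below reproduces from the reported figures. It assumes, as the numbers suggest but the abstract does not state, that the average spacing was obtained by dividing total map length by the number of unique positions; the variable names are illustrative.

```python
# Figures reported in the abstract.
total_map_length_cr = 14706.7   # total RH map length, centiRays
unique_positions = 503          # unique mapped positions
avg_spacing_cm = 2.1            # reported cM equivalent of the average spacing
avg_spacing_mb = 9.8            # reported Mb equivalent of the average spacing

# Assumed derivation: average spacing = map length / unique positions.
avg_spacing_cr = total_map_length_cr / unique_positions

# Mapping resolution: physical or genetic distance per centiRay.
resolution_mb_per_cr = avg_spacing_mb / avg_spacing_cr
resolution_cm_per_cr = avg_spacing_cm / avg_spacing_cr

print(f"average spacing: {avg_spacing_cr:.1f} cR")
print(f"resolution: {resolution_mb_per_cr:.2f} Mb/cR, "
      f"{resolution_cm_per_cr:.2f} cM/cR")
```

The rounded results agree with the 29.2 cR, 0.34 Mb/cR and 0.07 cM/cR values given in the abstract.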

Genetic mapping reveals a dominant awn-inhibiting gene related to differentiation of the variety anathera in the wild diploid wheat Aegilops tauschii.

PubMed

Nishijima, Ryo; Ikeda, Tatsuya M; Takumi, Shigeo

2018-02-01

Aegilops tauschii, a wild wheat relative, is the D-genome donor of common wheat. Subspecies and varieties of Ae. tauschii are traditionally classified based on differences in their inflorescence architecture. However, the genetic information for their diversification has been quite limited in the wild wheat relatives. The variety anathera has no awn on the lemma, but the genetic basis for this diagnostic character is unknown. Wide variations in awn length traits at the top and middle of the spike were found in the Ae. tauschii core collection, and the awn length at the middle of the spike was significantly smaller in the eastward-dispersed sublineage than in the other sublineages. To clarify loci controlling the awnless phenotype of var. anathera, we measured awn length in an intervariety F2 mapping population, and found that the F2 individuals could be divided into two groups mainly based on the awn length at the middle of the spike, namely short and long awn groups, significantly fitting a 3:1 segregation ratio, which indicated that a single locus controls the awnless phenotype. The awnless locus, Anathera (Antr), was assigned to the distal region of the short arm of chromosome 5D. Quantitative trait locus analysis using the awn length data of each F2 individual showed that only one major locus was at the same chromosomal position as Antr. These results suggest that a single dominant allele determines the awnless diagnostic character in the variety anathera. The Antr dominant allele is a novel gene inhibiting awn elongation in wheat and its relatives.
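A 3:1 segregation ratio of the kind reported above is conventionally assessed with a chi-square goodness-of-fit test. The Python sketch below shows the calculation under stated assumptions: the F2 counts are hypothetical (the abstract gives none), the function name is illustrative, and 3.841 is the standard critical value for one degree of freedom at the 5% level.

```python
def chi_square_3_to_1(dominant_count, recessive_count):
    """Chi-square statistic for fit to a 3:1 (dominant:recessive) ratio."""
    total = dominant_count + recessive_count
    if total == 0:
        raise ValueError("no individuals scored")
    expected_dominant = total * 0.75
    expected_recessive = total * 0.25
    return ((dominant_count - expected_dominant) ** 2 / expected_dominant
            + (recessive_count - expected_recessive) ** 2 / expected_recessive)


CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

# Hypothetical F2 counts: 78 short-awn and 22 long-awn plants.
chi2 = chi_square_3_to_1(78, 22)
fits_3_to_1 = chi2 < CRITICAL_VALUE_DF1_ALPHA_05

print(f"chi-square = {chi2:.2f}; consistent with 3:1: {fits_3_to_1}")
```

A statistic below the critical value means the 3:1 hypothesis is not rejected, which is the outcome expected when a single dominant allele controls the trait.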
Identification and mapping of Sr46 from Aegilops tauschii accession CIae 25 conferring resistance to race TTKSK (Ug99) of wheat stem rust pathogen.

PubMed

Yu, Guotai; Zhang, Qijun; Friesen, Timothy L; Rouse, Matthew N; Jin, Yue; Zhong, Shaobin; Rasmussen, Jack B; Lagudah, Evans S; Xu, Steven S

2015-03-01

Mapping studies confirm that resistance to Ug99 race of stem rust pathogen in Aegilops tauschii accession CIae 25 is conditioned by Sr46 and markers linked to the gene were developed for marker-assisted selection. The race TTKSK (Ug99) of Puccinia graminis f. sp. tritici, the causal pathogen for wheat stem rust, is considered a major threat to global wheat production. To address this threat, researchers across the world have been devoted to identifying TTKSK-resistant genes. Here, we report the identification and mapping of a stem rust resistance gene in Aegilops tauschii accession CIae 25 that confers resistance to TTKSK and the development of molecular markers for the gene. An F2 population of 710 plants from an Ae. tauschii cross CIae 25 × AL8/78 was first evaluated against race TPMKC. A set of 14 resistant and 116 susceptible F2:3 families from the F2 plants were then evaluated for their reactions to TTKSK. Based on the tests, 179 homozygous susceptible F2 plants were selected as the mapping population to identify the simple sequence repeat (SSR) and sequence tagged site (STS) markers linked to the gene by bulk segregant analysis. A dominant stem rust resistance gene was identified and mapped with 16 SSR and five new STS markers to the deletion bin 2DS5-0.47-1.00 of chromosome arm 2DS in which Sr46 was located. Molecular marker and stem rust tests on CIae 25 and two Ae. tauschii accessions carrying Sr46 confirmed that the gene in CIae 25 is Sr46. This study also demonstrated that Sr46 is temperature-sensitive, being less effective at low temperatures. The marker validation indicated that two closely linked markers Xgwm210 and Xwmc111 can be used for marker-assisted selection of Sr46 in wheat breeding programs.
Genetic diversity of avenin-like b genes in Aegilops tauschii Coss.

PubMed

Cao, Dong; Wang, Hongxia; Zhang, Bo; Liu, Baolong; Liu, Dengcai; Chen, Wenjie; Zhang, Huaigang

2018-02-01

Avenin-like storage proteins influence the rheological properties and processing quality in common wheat, and the discovery of new alleles will benefit wheat quality improvement. In this study, 13 avenin-like b alleles (TaALPb7D-A-M) were discovered in 108 Aegilops tauschii Coss. accessions. Ten alleles were reported for the first time, while the remaining three alleles were the same as alleles in other species. A total of 15 nucleotide changes were detected in the 13 alleles, resulting in only 11 amino acid changes because of synonymous mutations. Alleles TaALPb7D-E, TaALPb7D-G, and TaALPb7D-J encoded the same protein. These polymorphic sites existed in the N-terminus, Repetitive region (Left), Repetitive region (Right) and C-terminus domains, with no polymorphisms in the signal peptide sequence nor in those encoding the 18 conserved cysteine residues. Phylogenetic analysis divided the TaALPb7Ds into four clades. The Ae. tauschii alleles were distributed in all four clades, while the alleles derived from common wheat, TaALPb7D-G and TaALPb7D-C, belonged to clade III and IV, respectively. Alleles TaALPb7D-G and TaALPb7D-C were the most widely distributed, being present in nine and six countries, respectively. Iran and Turkey exhibited the highest genetic diversity with respect to TaALPb7D alleles, accessions from these countries carrying seven and six alleles, respectively, which implied that these countries were the centers of origin of the avenin-like b gene. The new alleles discovered and the phylogenetic analysis of avenin-like b genes will provide breeding materials and a theoretical basis for wheat quality improvement.
Genome Comparisons Reveal a Dominant Mechanism of Chromosome Number Reduction in Grasses and Accelerated Genome Evolution in Triticeae

USDA-ARS's Scientific Manuscript database

Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...
Introgression of wheat DNA markers from A, B and D genomes in early generation progeny of Aegilops cylindrica Host x Triticum aestivum L. hybrids.

PubMed

Schoenenberger, N; Felber, F; Savova-Bianchi, D; Guadagnuolo, R

2005-11-01

Introgression from allohexaploid wheat (Triticum aestivum L., AABBDD) to allotetraploid jointed goatgrass (Aegilops cylindrica Host, CCDD) can take place in areas where the two species grow in sympatry and hybridize. Wheat and Ae. cylindrica share the D genome, issued from the common diploid ancestor Aegilops tauschii Coss. It has been proposed that the A and B genome of bread wheat are secure places to insert transgenes to avoid their introgression into Ae. cylindrica because during meiosis in pentaploid hybrids, A and B genome chromosomes form univalents and tend to be eliminated whereas recombination takes place only in D genome chromosomes. Wheat random amplified polymorphic DNA (RAPD) fragments, detected in intergeneric hybrids and introgressed to the first backcross generation with Ae. cylindrica as the recurrent parent and having a euploid Ae. cylindrica chromosome number or one supernumerary chromosome, were assigned to wheat chromosomes using Chinese Spring nulli-tetrasomic wheat lines. Introgressed fragments were not limited to the D genome of wheat, but specific fragments of A and B genomes were also present in the BC1. Their presence indicates that DNA from any of the wheat genomes can introgress into Ae. cylindrica. Successfully located RAPD fragments were then converted into highly specific and easy-to-use sequence characterised amplified regions (SCARs) through sequencing and primer design. Subsequently these markers were used to characterise introgression of wheat DNA into a BC1S1 family. Implications for risk assessment of genetically modified wheat are discussed.
Genetic mapping of a novel recessive allele for non-glaucousness in wild diploid wheat Aegilops tauschii: implications for the evolution of common wheat.

PubMed

Nishijima, Ryo; Tanaka, Chisa; Yoshida, Kentaro; Takumi, Shigeo

2018-04-01

Cuticular wax on the aerial surface of plants has a protective function against many environmental stresses. The bluish-whitish appearance of wheat leaves and stems is called glaucousness. Most modern cultivars of polyploid wheat species exhibit the glaucous phenotype, while in a wild wheat progenitor, Ae. tauschii, both glaucous and non-glaucous accessions exist. Iw2, a wax inhibitor locus on the short arm of chromosome 2D, is the main contributor to this phenotypic variation in Ae. tauschii, and the glaucous/non-glaucous phenotype of Ae. tauschii is usually inherited by synthetic hexaploid wheat. However, a few synthetic lines show the glaucous phenotype although the parental Ae. tauschii accessions are non-glaucous. Molecular marker genotypes indicate that the exceptional non-glaucous Ae. tauschii accessions share the same genotype in the Iw2 chromosomal region as glaucous accessions, suggesting that these accessions have a different causal locus for their phenotype. This locus was assigned to the long arm of chromosome 3D using an F2 mapping population and designated W4, a novel glaucous locus in Ae. tauschii. The dominant W4 allele confers glaucousness, consistent with phenotypic observation of Ae. tauschii accessions and the derived synthetic lines. These results implied that glaucous accessions of Ae. tauschii with the W2W2iw2iw2W4W4 genotype could have been the D-genome donor of common wheat.
Intraspecific lineage divergence and its association with reproductive trait change during species range expansion in central Eurasian wild wheat Aegilops tauschii Coss. (Poaceae).

PubMed

Matsuoka, Yoshihiro; Takumi, Shigeo; Kawahara, Taihachi

2015-09-30

How species ranges form in landscapes is a matter of long-standing evolutionary interest. However, little is known about how natural phenotypic variations of ecologically important traits contribute to species range expansion. In this study, we examined the phylogeographic patterns of phenotypic changes in life history (seed production) and phenological (flowering time) traits during the range expansion of Aegilops tauschii Coss. from the Transcaucasus and Middle East to central Asia. Our comparative analyses of the patterns of natural variations for those traits and their association with the intraspecific lineage structure showed that (1) the eastward expansion to Asia was driven by an intraspecific sublineage (named TauL1b), (2) high seed production ability likely had an important role at the initial dispersal stage of TauL1b's expansion to Asia, and (3) the phenological change to early flowering phenotypes was one of the key adaptation events for TauL1b to further expand its range in Asia. This study provides for the first time a broad picture of the process of Ae. tauschii's eastward range expansion in which life history and phenological traits may have had respective roles in its dispersal and adaptation in Asia. The clear association of seed production and flowering time patterns with the intraspecific lineage divergence found in this study invites further genetic research to bring the mechanistic understanding of the changes in these key functional traits during range expansion within reach.
Harnessing NGS and Big Data Optimally: Comparison of miRNA Prediction from Assembled versus Non-assembled Sequencing Data--The Case of the Grass Aegilops tauschii Complex Genome.

PubMed

Budak, Hikmet; Kantar, Melda

2015-07-01

MicroRNAs (miRNAs) are small, endogenous, non-coding RNA molecules that regulate gene expression at the post-transcriptional level. As high-throughput next generation sequencing (NGS) and Big Data rapidly accumulate for various species, efforts for in silico identification of miRNAs intensify. Surprisingly, the effect of the input genomics sequence on the robustness of miRNA prediction was not evaluated in detail to date. In the present study, we performed a homology-based miRNA and isomiRNA prediction of the 5D chromosome of bread wheat progenitor, Aegilops tauschii, using two distinct sequence data sets as input: (1) raw sequence reads obtained from 454-GS FLX Titanium sequencing platform and (2) an assembly constructed from these reads. We also compared this method with a number of available plant sequence datasets. We report here the identification of 62 and 22 miRNAs from raw reads and the assembly, respectively, of which 16 were predicted with high confidence from both datasets. While raw reads promoted sensitivity with the high number of miRNAs predicted, 55% (12 out of 22) of the assembly-based predictions were supported by previous observations, bringing specificity forward compared to the read-based predictions, of which only 37% were supported. Importantly, raw reads could identify several repeat-related miRNAs that could not be detected with the assembly. However, raw reads could not capture 6 miRNAs, for which the stem-loops could only be covered by the relatively longer sequences from the assembly. In summary, the comparison of miRNA datasets obtained by these two strategies revealed that utilization of raw reads, as well as assemblies for in silico prediction, have distinct advantages and disadvantages. Consideration of these important nuances can benefit future miRNA identification efforts in the current age of NGS and Big Data driven life sciences innovation.
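The counts reported in the abstract above can be cross-checked with simple set arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 16 miRNAs predicted from both datasets are the full overlap between the read-based (62) and assembly-based (22) predictions.

```python
def compare_prediction_sets(n_reads, n_assembly, n_shared):
    """Split two overlapping prediction sets into exclusive parts and their union."""
    return {
        "reads_only": n_reads - n_shared,
        "assembly_only": n_assembly - n_shared,
        "union": n_reads + n_assembly - n_shared,
    }


# Counts as reported in the abstract: 62 from raw reads, 22 from the assembly, 16 from both.
counts = compare_prediction_sets(62, 22, 16)

# Share of assembly-based predictions supported by previous observations (12 of 22).
assembly_supported_pct = round(100 * 12 / 22)

print(counts, assembly_supported_pct)
```

Under that assumption the assembly-only count is 6, which matches the six miRNAs the abstract says raw reads could not capture, and 12 of 22 rounds to the reported 55%.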
Characterization of high molecular weight glutenin subunits in Thinopyrum intermedium, Th. bessarabicum, Lophopyrum elongatum, Aegilops markgrafii, and their addition lines in wheat

USDA-ARS's Scientific Manuscript database

High molecular weight (HMW) glutenin subunits (GSs) play an important role in determining dough viscoelastic properties and end-use quality in cultivated wheat, and they are also excellent protein markers for genotype identification. The HMW-GSs in wheat species (Triticum ssp.) and Aegilops tauschii...
Structural variation and rates of genome evolution in the grass family seen through comparison of sequences of genomes greatly differing in size.

PubMed

Dvorak, Jan; Wang, Le; Zhu, Tingting; Jorgensen, Chad M; Deal, Karin R; Dai, Xiongtao; Dawson, Matthew W; Müller, Hans-Georg; Luo, Ming-Cheng; Ramasamy, Ramesh K; Dehghani, Hamid; Gu, Yong Q; Gill, Bikram S; Distelfeld, Assaf; Devos, Katrien M; Qi, Peng; You, Frank M; Gulick, Patrick J; McGuire, Patrick E

2018-05-16

Homology was searched with genes annotated in the Aegilops tauschii pseudomolecules against genes annotated in the pseudomolecules of tetraploid wild emmer wheat, Brachypodium distachyon, sorghum, and rice. Similar searches were initiated with genes annotated in the rice pseudomolecules. Matrices of colinear genes and rearrangements in their order were constructed. Optical Bionano genome maps were constructed and used to validate rearrangements unique to the wild emmer and Ae. tauschii genomes. Most common rearrangements were short paracentric inversions and short intrachromosomal translocations. Intrachromosomal translocations outnumbered segmental intrachromosomal duplications. The densities of paracentric inversion lengths were approximated by exponential distributions in all six genomes. Densities of colinear genes along the Ae. tauschii chromosomes were highly correlated with meiotic recombination rates but those of rearrangements were not, suggesting different causes of the erosion of gene colinearity and evolution of major chromosome rearrangements. Frequent rearrangements sharing breakpoints suggested that chromosomes have been rearranged recurrently at some sites. The distal 4 Mb of the short arms of rice chromosomes Os11 and Os12 and corresponding regions in the sorghum, B. distachyon, and Triticeae genomes contain clusters of interstitial translocations including from 1 to 7 colinear genes. The rates of acquisition of major rearrangements were greater in the wild emmer wheat and Ae. tauschii genomes than in the lineage preceding their divergence or in the B. distachyon, rice, and sorghum lineages. It is suggested that synergy between large quantities of dynamic transposable elements and annual growth habit caused the fast evolution of the Triticeae genomes. This article is protected by copyright. All rights reserved.
Cytoplasmic genome substitution in wheat affects the nuclear-cytoplasmic cross-talk leading to transcript and metabolite alterations

PubMed Central

2013-01-01

Background Alloplasmic lines provide a unique tool to study nuclear-cytoplasmic interactions. Three alloplasmic lines, with nuclear genomes from Triticum aestivum and harboring cytoplasm from Aegilops uniaristata, Aegilops tauschii and Hordeum chilense, were investigated by transcript and metabolite profiling to identify the effects of cytoplasmic substitution on nuclear-cytoplasmic signaling mechanisms. Results In combining the wheat nuclear genome with a cytoplasm of H. chilense, 540 genes were significantly altered, whereas 11 and 28 genes were significantly changed in the alloplasmic lines carrying the cytoplasm of Ae. uniaristata or Ae. tauschii, respectively. We identified the RNA maturation-related process as one of the most sensitive to a perturbation of the nuclear-cytoplasmic interaction. Several key components of the ROS chloroplast retrograde signaling, together with the up-regulation of the ROS scavenging system, showed that changes in the chloroplast genome have a direct impact on nuclear-cytoplasmic cross-talk. Remarkably, the H. chilense alloplasmic line down-regulated some genes involved in the determination of cytoplasmic male sterility without expressing the male sterility phenotype. Metabolic profiling showed a comparable response of the central metabolism of the alloplasmic and euplasmic lines to light, while exposing larger metabolite alterations in the H. chilense alloplasmic line as compared with the Aegilops lines, in agreement with the transcriptomic data. Several stress-related metabolites, remarkably raffinose, were altered in content in the H. chilense alloplasmic line when exposed to high light, while amino acids, as well as organic acids were significantly decreased. Alterations in the levels of transcript, related to raffinose, and the photorespiration-related metabolisms were associated with changes in the level of related metabolites. Conclusion The replacement of a wheat cytoplasm with the cytoplasm of a related species affects
Global gene expression profiling related to temperature-sensitive growth abnormalities in interspecific crosses between tetraploid wheat and Aegilops tauschii

PubMed Central

Sakaguchi, Kouhei; Ohno, Ryoko; Yoshida, Kentaro

2017-01-01

Triploid wheat hybrids between tetraploid wheat and Aegilops tauschii sometimes show abnormal growth phenotypes, and the growth abnormalities inhibit generation of wheat synthetic hexaploids. In type II necrosis, one of the growth abnormalities, necrotic cell death accompanied by marked growth repression occurs only under low temperature conditions. At normal temperature, the type II necrosis lines show grass-clump dwarfism with no necrotic symptoms, excess tillers, severe dwarfism and delayed flowering. Here, we report comparative expression analyses to elucidate the molecular mechanisms of the temperature-dependent phenotypic plasticity in the triploid wheat hybrids. We compared gene and small RNA expression profiles in crown tissues to characterize the temperature-dependent phenotypic plasticity. No up-regulation of defense-related genes was observed under the normal temperature, and down-regulation of wheat APETALA1-like MADS-box genes, considered to act as flowering promoters, was found in the grass-clump dwarf lines. Some microRNAs, including miR156, were up-regulated, whereas the levels of transcripts of the miR156 target genes SPLs, known to inhibit tiller and branch number, were reduced in crown tissues of the grass-clump dwarf lines at the normal temperature. Unusual expression of the miR156/SPLs module could explain the grass-clump dwarf phenotype. Dramatic alteration of gene expression profiles, including miRNA levels, in crown tissues is associated with the temperature-dependent phenotypic plasticity in type II necrosis/grass-clump dwarf wheat hybrids. PMID:28463975
Genetic structure of Aegilops cylindrica Host in its native range and in the United States of America.

PubMed

Gandhi, Harish T; Vales, M Isabel; Mallory-Smith, Carol; Riera-Lizarazu, Oscar

2009-10-01

Chloroplast and nuclear microsatellite markers were used to study genetic diversity and genetic structure of Aegilops cylindrica Host collected in its native range and in adventive sites in the USA. Our analysis suggests that Ae. cylindrica, an allotetraploid, arose from multiple hybridizations between Ae. markgrafii (Greuter) Hammer. and Ae. tauschii Coss. presumably along the Fertile Crescent, where the geographic distributions of its diploid progenitors overlap. However, the center of genetic diversity of this species now encompasses a larger area including northern Iraq, eastern Turkey, and Transcaucasia. Although the majority of accessions of Ae. cylindrica (87%) had D-type plastomes derived from Ae. tauschii, accessions with C-type plastomes (13%), derived from Ae. markgrafii, were also observed. This corroborates a previous study suggesting the dimaternal origin of Ae. cylindrica. Model-based and genetic distance-based clustering using both chloroplast and nuclear markers indicated that Ae. tauschii ssp. tauschii contributed one of its D-type plastomes and its D genome to Ae. cylindrica. Analysis of genetic structure using nuclear markers suggested that Ae. cylindrica accessions could be grouped into three subpopulations (arbitrarily named N-K1, N-K2, and N-K3). Members of the N-K1 subpopulation were the most numerous in its native range and members of the N-K2 subpopulation were the most common in the USA. Our analysis also indicated that Ae. cylindrica accessions in the USA were derived from a few founder genotypes. The frequency of Ae. cylindrica accessions with the C-type plastome in the USA (approximately 24%) was substantially higher than in its native range of distribution (approximately 3%) and all C-type Ae. cylindrica in the USA except one belonged to subpopulation N-K2. The high frequency of the C-type plastome in the USA may reflect a favorable nucleo-cytoplasmic combination.
Hybrid incompatibilities in interspecific crosses between tetraploid wheat and its wild diploid relative Aegilops umbellulata.

PubMed

Okada, Moeko; Yoshida, Kentaro; Takumi, Shigeo

2017-12-01

Hybrid abnormalities, severe growth abortion and grass-clump dwarfism, were found in the tetraploid wheat/Aegilops umbellulata hybrids, and the gene expression changes were conserved in the hybrids with those in other wheat synthetic hexaploids. Aegilops umbellulata Zhuk., a diploid goatgrass species with a UU genome, has been utilized as a genetic resource for wheat breeding. Here, we examine the reproductive barriers between tetraploid wheat cultivar Langdon (Ldn) and various Ae. umbellulata accessions by conducting interspecific crossings. Through systematic cross experiments, three types of hybrid incompatibilities were found: seed production failure in crosses, hybrid growth abnormalities and sterility in the ABU hybrids. Hybrid incompatibilities were widely distributed over the entire range of the natural species, and in about 50% of the cross combinations between tetraploid Ldn and Ae. umbellulata accessions, ABU F1 hybrids showed one of two abnormal growth phenotypes: severe growth abortion (SGA) or grass-clump dwarfism. Expression of the shoot meristem maintenance-related and cell cycle-related genes was markedly repressed in crown tissues of hybrids showing SGA, suggesting dysfunction of mitotic cell division in the shoot apices. The grass-clump dwarf phenotype may be explained by down-regulation of wheat APETALA1-like MADS box genes, which act as flowering promoters, and altered expression in crown tissues of the miR156/SPLs module, which controls tiller number and branching. These gene expression changes in growth abnormalities were well conserved between the Ldn/Ae. umbellulata plants and interspecific hybrids from crosses of Ldn and wheat D-genome progenitor Ae. tauschii.
Dynamic nucleolar activity in wheat × Aegilops hybrids: evidence of C-genome dominance.

PubMed

Mirzaghaderi, Ghader; Abdolmalaki, Zinat; Zohouri, Mohsen; Moradi, Zeinab; Mason, Annaliese S

2017-08-01

NOR loci of the C-subgenome are dominant in wheat × Aegilops interspecific hybrids, which may have evolutionary implications for wheat group genome dynamics and evolution. After interspecific hybridisation, some genes are often expressed from only one of the progenitor species, shaping subsequent allopolyploid genome evolution processes. A well-known example is nucleolar dominance, i.e. the formation of cell nucleoli from chromosomes of only one parental species. We studied nucleolar organizing regions (NORs) in diploid Aegilops markgrafii (syn: Ae. caudata; CC), Ae. umbellulata (UU), allotetraploids Aegilops cylindrica (CcCcDcDc) and Ae. triuncialis (CtCtUtUt), synthetic interspecific F1 hybrids between these two allotetraploids and bread wheat (Triticum aestivum, AABBDD) and in F3 generation hybrids with genome composition AABBDDCtCtUtUt using silver staining and fluorescence in situ hybridization (FISH). In Ae. markgrafii (CC), NORs of both 1C and 5C or only 5C chromosome pairs were active in different individual cells, while only NORs on 1U chromosomes were active in Ae. umbellulata (UU). Although all 35S rDNA loci of the Ct subgenome (located on 1Ct and 5Ct) were active in Ae. triuncialis, only one pair (occupying either 1Cc or 5Cc) was active in Ae. cylindrica, depending on the genotype studied. These C-genome expression patterns were transmitted to the F1 and F3 generations. Wheat chromosome NOR activity was variable in Ae. triuncialis × T. aestivum F1 seeds, but silenced by the F3 generation. No effect of maternal or paternal cross direction was observed. These results indicate that C-subgenome NOR loci are dominant in wheat × Aegilops interspecific hybrids, which may have evolutionary implications for wheat group genome dynamics and allopolyploid evolution.
Gametocidal genes of Aegilops: segregation distorters in wheat-Aegilops wide hybridization.

PubMed

Niranjana, M

2017-08-01

Aegilops is a genus belonging to the family Poaceae, which has played an indispensable role in the evolution of bread wheat and continues to do so by transferring genes through wide hybridization. Although Aegilops is the secondary gene pool of wheat, gene transfer from it poses difficulties, and segregation distortion is common. Gametocidal genes are the most well characterized class of segregation distorters reported in interspecific crosses of wheat with Aegilops. These "selfish" genetic elements ensure their preferential transmission to progeny at the cost of gametes lacking them without providing any phenotypic benefits to the plant, thereby causing a proportional reduction in fertility. Gametocidal genes (Gc) have been reported in different species of Aegilops belonging to the sections Aegilops (Ae. geniculata and Ae. triuncialis), Cylindropyrum (Ae. caudata and Ae. cylindrica), and Sitopsis (Ae. longissima, Ae. sharonensis, and Ae. speltoides). Gametocidal activity is mostly confined to homeologous groups 2, 3, and 4 of the C, S, S1, Ssh, and Mg genomes. Removal of such genes is necessary for successful alien gene introgression and can be achieved by mutagenesis or allosyndetic pairing. However, there are some instances where Gc genes are constructively utilized for development of deletion stocks in wheat, improving genetic variability and chromosome engineering.
[Detection of the introgression of genome elements of Aegilops cylindrica Host into the Triticum aestivum L. genome by ISSR and SSR analysis].

PubMed

Galaev, A V; Babaiants, L T; Sivolap, Iu M

2004-12-01

To reveal sites of the donor genome in wheat lines that acquired resistance to fungal diseases through crosses with Aegilops cylindrica, a comparative analysis of introgressive and parental forms was conducted. Two systems of PCR analysis, ISSR and SSR-PCR, were employed. Using 7 ISSR primers on the genotypes of 30 individual BC1F9 plants belonging to lines 5/55-91 and 5/20-91, 19 ISSR loci were revealed and assigned to introgressive fragments of the Aegilops cylindrica genome in Triticum aestivum. The 40 pairs of SSR primers allowed the detection of seven introgressive alleles; three of these alleles were located on common wheat chromosomes in the B genome and four in the D genome. Based on data of microsatellite analysis, it was assumed that the telomeric region of the long arm of common wheat chromosome 6A also changed. ISSR and SSR methods were shown to be effective for detecting variability caused by introgression of foreign genetic material into the genome of common wheat.
The origin of the B-genome of bread wheat (Triticum aestivum L.).

PubMed

Haider, N

2013-03-01

Understanding the origin of cultivated wheats would further their genetic improvement. The hexaploid bread wheat (Triticum aestivum L., AABBDD) is believed to have originated through one or more rare hybridization events between Aegilops tauschii (DD) and the tetraploid T. turgidum (AABB). The progenitor of the A-genome of the tetraploid and hexaploid wheats has generally been accepted to be T. urartu. In spite of the large number of attempts and published reports about the origin of the B-genome in cultivated wheats, the donor of the B-genome is still unknown and controversial, and the question remains open. This genome has been found to be closely related to the S-genome of the Sitopsis section (Ae. speltoides, Ae. longissima, Ae. sharonensis, Ae. searsii, and Ae. bicornis) of the genus Aegilops L. Among Sitopsis species, the most positive evidence has been accumulated for Ae. speltoides as the progenitor of the B-genome. Therefore, one or more of the Sitopsis species were proposed frequently as the B-genome donor. Although several reviews have been written on the origin of the genomes of wheat over the years, this paper will attempt for the first time to review the immense literature on the subject, with a particular emphasis on the B-genome, which has attracted huge attention over some 100 years. The ambiguity and conflicting results in most of the methods employed in deducing the precise B-genome donor/s to bread wheat are also discussed.
Map-based analysis of the tenacious glume gene Tg-B1 of wild emmer and its role in wheat domestication

USDA-ARS's Scientific Manuscript database

The domestication of wheat was instrumental in spawning the civilization of humankind, and it occurred through genetic mutations that gave rise to types with non-fragile rachises, soft glumes, and free-threshing seed. The Tg-D1 gene on chromosome 2D of Aegilops tauschii, the D-genome progenitor of ...
[Genetic determination of wheat resistance to Puccinia graminis f. sp. tritici deriving from Aegilops cylindrica, Triticum erebuni and amphidiploid 4].

PubMed

Babaiants, O V; Babaiants, L T; Horash, A F; Vasil'ev, A A; Trackovetskaia, V A; Paliasn'iĭĭ, V A

2012-01-01

The lines of winter soft wheat developed in the Plant Breeding and Genetics Institute contain new effective introgressive Sr-genes. Line 85/06 possesses the SrAc1 gene; lines 47/06, 54/06, 82/06, 85/06, 87/06, 238/06, and 367/06 possess SrAc1 and SrAc2, derived from Aegilops cylindrica; line 352/06 possesses SrTe1 and SrTe2 from Triticum erebuni; and line 12/86-04 possesses SrAd1 and SrAd2 from Amphidiploid 4 (Triticum dicoccoides x Triticum tauschii).

Unlocking Triticeae genomics to sustainably feed the future

PubMed Central

Mochida, Keiichi; Shinozaki, Kazuo

2013-01-01

The tribe Triticeae includes the major crops wheat and barley. Within the last few years, the whole genomes of four Triticeae species—barley, wheat, Tausch’s goatgrass (Aegilops tauschii) and wild einkorn wheat (Triticum urartu)—have been sequenced. The availability of these genomic resources for Triticeae plants and innovative analytical applications using next-generation sequencing technologies are helping to revitalize our approaches in genetic work and to accelerate improvement of the Triticeae crops. Comparative genomics and integration of genomic resources from Triticeae plants and the model grass Brachypodium distachyon are aiding the discovery of new genes and functional analyses of genes in Triticeae crops. Innovative approaches and tools such as analysis of next-generation populations, evolutionary genomics and systems approaches with mathematical modeling are new strategies that will help us discover alleles for adaptive traits to future agronomic environments. In this review, we provide an update on genomic tools for use with Triticeae plants and Brachypodium and describe emerging approaches toward crop improvements in Triticeae. PMID:24204022
Genetic diversity among synthetic hexaploid wheat accessions with resistance to several fungal diseases

USDA-ARS's Scientific Manuscript database

Synthetic hexaploid wheat (SHW) is known to be an excellent vehicle for transferring large genetic variations especially the many useful traits present in the D genome of Aegilops tauschii Coss (2n=2x=14, DD) for improvement of cultivated wheat (Triticum aestivum L., 2n=6x=42, AABBDD). The objectiv...
Molecular characterization of the celiac disease epitope domains in α-gliadin genes in Aegilops tauschii and hexaploid wheats (Triticum aestivum L.).

PubMed

Xie, Zhenze; Wang, Congyan; Wang, Ke; Wang, Shunli; Li, Xiaohui; Zhang, Zhao; Ma, Wujun; Yan, Yueming

2010-11-01

Nineteen novel full-ORF α-gliadin genes and 32 pseudogenes containing at least one stop codon were cloned and sequenced from three Aegilops tauschii accessions (T15, T43 and T26) and two bread wheat cultivars (Gaocheng 8901 and Zhongyou 9507). Analysis of three typical α-gliadin genes (Gli-At4, Gli-G1 and Gli-Z4) revealed some InDels and a considerable number of SNPs among them. Most of the pseudogenes resulted from C-to-T changes that generated in-frame TAG or TAA stop codons. The putative proteins of both Gli-At3 and Gli-Z7 genes contained an extra cysteine residue in the unique domain II. Analysis of toxic epitopes among 19 deduced α-gliadins demonstrated that 14 of these contained 1-5 T cell stimulatory toxic epitopes while the other 5 did not contain any toxic epitopes. The number of glutamine residues in the two specific polyglutamine domains ranged from 7 to 27, indicating high variation in length. According to the numbers of 4 T cell stimulatory toxic epitopes and glutamine residues in the two polyglutamine domains among the 19 α-gliadin genes, 2 were assigned to chromosome 6A, 5 to chromosome 6B and 12 to chromosome 6D. These results were consistent with those from wheat cv. Chinese Spring nulli-tetrasomic and phylogenetic analysis. Secondary structure prediction showed that all α-gliadins had a high content of β-strands and that most of the α-helices and β-strands were present in two unique domains. Phylogenetic analysis demonstrated that α-gliadin genes had a high homology with γ-gliadin, B-hordein, and LMW-GS genes and that they diverged approximately 39 MYA. Finally, the five α-gliadin genes were successfully expressed in E. coli, and their expression level peaked 4 h after IPTG induction, indicating that the α-gliadin genes can be expressed at a high level under the control of the T7 promoter.
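The pseudogene mechanism described above (a C-to-T change creating an in-frame TAG or TAA) can be illustrated with a minimal sketch. The glutamine codons CAG and CAA become the stop codons TAG and TAA when their first base changes from C to T. The function name and the toy sequences below are hypothetical and are not taken from the study.

```python
# Minimal sketch: locate the first in-frame stop codon in a coding sequence.
# The sequences are hypothetical toy examples, not data from the abstract.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_in_frame_stop(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

intact = "ATGCAACAGCCA"   # codons: ATG CAA CAG CCA, no stop codon
mutated = "ATGCAATAGCCA"  # C->T at the first base of CAG gives TAG

print(first_in_frame_stop(intact))   # None
print(first_in_frame_stop(mutated))  # 2
```

In a real coding sequence the terminal stop codon would also be reported, so a pseudogene screen would look for a stop codon before the final codon.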
The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum.

PubMed

Zimin, Aleksey V; Puiu, Daniela; Hall, Richard; Kingan, Sarah; Clavijo, Bernardo J; Salzberg, Steven L

2017-11-01

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components. © The Author 2017. Published by Oxford University Press.
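As a rough illustration of the N50 statistic quoted above, the sketch below uses the common definition: the contig length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly size. The function name and the toy contig lengths are assumptions for illustration. The final lines only restate the abstract's own figures as a fraction.

```python
# Minimal sketch of N50, assuming the common "half of total length" definition.
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")

toy_contigs = [2, 3, 4, 5, 6]  # hypothetical lengths, total 20
print(n50(toy_contigs))        # 5, because 6 + 5 = 11 reaches half of 20

# Share of the assembly attributed to the D genome, from the reported values.
assembly_bp = 15_344_693_583
d_genome_bp = 4_179_762_575
d_fraction = d_genome_bp / assembly_bp
print(f"{d_fraction:.1%}")     # 27.2%
```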
Allelic variations of α-gliadin genes from species of Aegilops section Sitopsis and insights into evolution of α-gliadin multigene family among Triticum and Aegilops.

PubMed

Huang, Zhuo; Long, Hai; Wei, Yu-Ming; Yan, Ze-Hong; Zheng, You-Liang

2016-04-01

The α-gliadins account for 15-30 % of the total storage protein in wheat endosperm and play important roles in the dough extensibility and nutritional quality. On the other hand, they act as a main source of toxic peptides triggering celiac disease. In this study, 37 α-gliadins were isolated from three species of Aegilops section Sitopsis. Sequence similarity and phylogenetic analyses revealed novel allelic variation at Gli-2 loci of species of Sitopsis and regular organization of motifs in their repetitive domain. Based on comprehensive analyses of a large number of known sequences of bread wheat and its diploid genome progenitors, the distributions of four T cell epitopes and length variations of two polyglutamine domains were analyzed. Additionally, according to the organization of repeat motifs, we classified the α-gliadins of Triticum and Aegilops into eight types. Their most recent common ancestor and putative divergence patterns were further considered. This study provides new insights into the allelic variations of α-gliadins in Aegilops section Sitopsis, as well as evolution of α-gliadin multigene family among Triticum and Aegilops species.
PGSB PlantsDB: updates to the database framework for comparative plant genome research.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

2016-01-04

PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex Triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genetic effect of the Aegilops caudata plasmon on the manifestation of the Ae. cylindrica genome.

PubMed

Tsunewaki, Koichiro; Mori, Naoki; Takumi, Shigeo

2014-01-01

In the course of reconstructing Aegilops caudata from its own genome (CC) and its plasmon, which had passed half a century in common wheat (genome AABBDD), we produced alloplasmic Ae. cylindrica (genome CCDD) with the plasmon of Ae. caudata. This line, designated (caudata)-CCDD, was found to express male sterility in its second substitution backcross generation (SB2) of (caudata)-AABBCCDD pollinated three times with the Ae. cylindrica pollen. We repeatedly backcrossed these SB2 plants with the Ae. cylindrica pollen until the SB5 generation, and SB5F2 progeny were produced by self-pollination of the SB5 plants. Thirteen morphological and physiological characters, including pollen and seed fertilities, of the (caudata)-CCDD SB5F2 were compared with those of the euplasmic Ae. cylindrica. The results indicated that the male sterility expressed by (caudata)-CCDD was due to genetic incompatibility between the Ae. cylindrica genome and Ae. caudata plasmon that did not affect any other characters of Ae. cylindrica. Also, we report that the genome integrity functions in keeping the univalent transmission rate high.
Identification of PmTA1662 from Aegilops tauschii

USDA-ARS's Scientific Manuscript database

Powdery mildew remains a significant threat to wheat (Triticum aestivum L.) production, and the rapid breakdown of race-specific resistance to Blumeria graminis (DC.) f. sp. tritici (Bgt) reinforces the need to identify novel sources of resistance. The D-genome progenitor species of hexaploid wheat,...
The abundance of homoeologue transcripts is disrupted by hybridization and is partially restored by genome doubling in synthetic hexaploid wheat.

PubMed

Hao, Ming; Li, Aili; Shi, Tongwei; Luo, Jiangtao; Zhang, Lianquan; Zhang, Xuechuan; Ning, Shunzong; Yuan, Zhongwei; Zeng, Deying; Kong, Xingchen; Li, Xiaolong; Zheng, Hongkun; Lan, Xiujin; Zhang, Huaigang; Zheng, Youliang; Mao, Long; Liu, Dengcai

2017-02-10

The formation of an allopolyploid is a two-step process, comprising an initial wide hybridization event, which is later followed by a whole genome doubling. Both processes can affect the transcription of homoeologues. Here, RNA-Seq was used to obtain the genome-wide leaf transcriptome of two independent Triticum turgidum × Aegilops tauschii allotriploids (F1), along with their spontaneous allohexaploids (S1) and their parental lines. The resulting sequence data were then used to characterize variation in homoeologue transcript abundance. The hybridization event strongly down-regulated D-subgenome homoeologues, but this effect was in many cases reversed by whole genome doubling. The suppression of D-subgenome homoeologue transcription resulted in a marked frequency of parental transcription level dominance, especially with respect to genes encoding proteins involved in photosynthesis. Singletons (genes where no homoeologues were present) were frequently transcribed in both the allotriploid and allohexaploid plants. The implication is that whole genome doubling helps to overcome the phenotypic weakness of the allotriploid, restoring a more favourable gene dosage in genes experiencing transcription level dominance in hexaploid wheat.
Flow sorting of C-genome chromosomes from wild relatives of wheat Aegilops markgrafii, Ae. triuncialis and Ae. cylindrica, and their molecular organization.

PubMed

Molnár, István; Vrána, Jan; Farkas, András; Kubaláková, Marie; Cseh, András; Molnár-Láng, Márta; Doležel, Jaroslav

2015-08-01

Aegilops markgrafii (CC) and its natural hybrids Ae. triuncialis (U(t)U(t)C(t)C(t)) and Ae. cylindrica (D(c)D(c)C(c)C(c)) represent a rich reservoir of useful genes for improvement of bread wheat (Triticum aestivum), but the limited information available on their genome structure and the shortage of molecular (cyto-) genetic tools hamper the utilization of the extant genetic diversity. This study provides the complete karyotypes in the three species obtained after fluorescent in situ hybridization (FISH) with repetitive DNA probes, and evaluates the potential of flow cytometric chromosome sorting. The flow karyotypes obtained after the analysis of 4',6-diamidino-2-phenylindole (DAPI)-stained chromosomes were characterized and the chromosome content of the peaks on the flow karyotypes was determined by FISH. Twenty-nine conserved orthologous set (COS) markers covering all seven wheat homoeologous chromosome groups were used for PCR with DNA amplified from flow-sorted chromosomes and genomic DNA. FISH with repetitive DNA probes revealed that chromosomes 4C, 5C, 7C(t), T6U(t)S.6U(t)L-5C(t)L, 1C(c) and 5D(c) could be sorted with purities ranging from 66 to 91 %, while the remaining chromosomes could be sorted in groups of 2-5. This identified a partial wheat-C-genome homology for group 4 and 5 chromosomes. In addition, 1C chromosomes were homologous with group 1 of wheat; a small segment from group 2 indicated 1C-2C rearrangement. An extensively rearranged structure of chromosome 7C relative to wheat was also detected. The possibility of purifying Aegilops chromosomes provides an attractive opportunity to investigate the structure and evolution of the Aegilops C genome and to develop molecular tools to facilitate the identification of alien chromatin and support alien introgression breeding in bread wheat. © The Author 2015. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. 
Radiation hybrid maps of D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes

USDA-ARS's Scientific Manuscript database

The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high-resolution genome maps saturated with ordered markers to assist in anchoring and orienting BAC contigs/ sequence scaffolds for whole genome sequence assembly. Radiation hybrid (RH) mapping has proven to be an e...
Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

PubMed Central

2010-01-01

Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The
Novel x-type high-molecular-weight glutenin genes from Aegilops tauschii and their implications on the wheat origin and evolution mechanism of Glu-D1-1 proteins.

PubMed

Zhang, Yanzhen; Li, Xiaohui; Wang, Aili; An, Xueli; Zhang, Qian; Pei, Yuhe; Gao, Liyan; Ma, Wujun; Appels, Rudi; Yan, Yueming

2008-01-01

Two new x-type high-molecular-weight glutenin subunits with similar size to 1Dx5, designated 1Dx5*t and 1Dx5.1*t in Aegilops tauschii, were identified by SDS-PAGE, RP-HPLC, and MALDI-TOF-MS. The coding sequences were isolated by AS-PCR and the complete ORFs were obtained. Allele 1Dx5*t consists of 2481 bp encoding a mature protein of 827 residues with deduced Mr of 85,782 Da whereas 1Dx5.1*t comprises 2526 bp encoding 842 residues with Mr of 87,663 Da. The deduced Mr's of both genes were consistent with those determined by MALDI-TOF-MS. Molecular structure analysis showed that the repeat motifs of 1Dx5*t were correspondingly closer to the consensus compared to 1Dx5.1*t and 1Dx5 subunits. A total of 11 SNPs (3 in 1Dx5*t and 8 in 1Dx5.1*t) and two indels in 1Dx5*t were identified, among which 8 SNPs were due to C-T or A-G transitions (an average of 73%). Expression of the cloned ORFs and N-terminal sequencing confirmed the authenticities of the two genes. Interestingly, several hybrid clones of 1Dx5*t expressed a slightly smaller protein relative to the authentic subunit present in seed proteins; this was confirmed to result from a deletion of 180 bp through illegitimate recombination as well as an in-frame stop codon. Network analysis demonstrated that 1Dx5*t, 1Dx2t, 1Dx1.6t, and 1Dx2.2* represent a root within a network and correspond to the common ancestors of the other Glu-D-1-1 alleles in an associated star-like phylogeny, suggesting that there were at least four independent origins of hexaploid wheat. In addition to unequal homologous recombination, duplication and deletion of large fragments occurring in Glu-D-1-1 alleles were attributed to illegitimate recombination.
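The transition count reported above can be checked with a short sketch. A transition is a purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/T) substitution. The function name is hypothetical; the numbers are those given in the abstract. The last lines note that the reported ORF lengths divided by three equal the reported residue counts.

```python
# Minimal sketch: classify a substitution as a transition, then re-derive
# the percentages quoted in the abstract. The function name is illustrative.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions; False otherwise."""
    pair = {ref.upper(), alt.upper()}
    return pair == PURINES or pair == PYRIMIDINES

print(is_transition("C", "T"))  # True
print(is_transition("A", "C"))  # False (a transversion)

# 8 of the 11 reported SNPs were transitions.
transition_pct = 100 * 8 / 11
print(round(transition_pct))    # 73

# Reported ORF lengths (bp) and their codon counts.
codons_1dx5t = 2481 // 3        # 827
codons_1dx51t = 2526 // 3       # 842
```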
Wheat-specific gene, ribosomal protein l21, used as the endogenous reference gene for qualitative and real-time quantitative polymerase chain reaction detection of transgenes.

PubMed

Liu, Yi-Ke; Li, He-Ping; Huang, Tao; Cheng, Wei; Gao, Chun-Sheng; Zuo, Dong-Yun; Zhao, Zheng-Xi; Liao, Yu-Cai

2014-10-29

Wheat-specific ribosomal protein L21 (RPL21) is an endogenous reference gene suitable for genetically modified (GM) wheat identification. This taxon-specific RPL21 sequence displayed high homogeneity in different wheat varieties. Southern blots revealed 1 or 3 copies, and sequence analyses showed one amplicon in common wheat. Combined analyses with sequences from common wheat (AABBDD) and three diploid ancestral species, Triticum urartu (AA), Aegilops speltoides (BB), and Aegilops tauschii (DD), demonstrated the presence of this amplicon in the AA genome. Using conventional qualitative polymerase chain reaction (PCR), the limit of detection was 2 copies of wheat haploid genome per reaction. In the quantitative real-time PCR assay, limits of detection and quantification were about 2 and 8 haploid genome copies, respectively, the latter of which is 2.5-4-fold lower than other reported wheat endogenous reference genes. Construct-specific PCR assays were developed using RPL21 as an endogenous reference gene, and as little as 0.5% of GM wheat contents containing Arabidopsis NPR1 were properly quantified.
Allocation of the S-genome chromosomes of Aegilops variabilis Eig. carrying powdery mildew resistance in triticale (× Triticosecale Wittmack).

PubMed

Kwiatek, M; Belter, J; Majka, M; Wiśniewska, H

2016-03-01

It has been hypothesized that the powdery mildew adult plant resistance (APR) controlled by the Pm13 gene in Aegilops longissima Schweinf. & Muschl. (S(l)S(l)) was transferred to Aegilops variabilis Eig. (UUSS) in the course of evolution. Molecular marker analysis and visual evaluation of powdery mildew symptoms in Ae. variabilis and the Ae. variabilis × Secale cereale amphiploid forms (2n = 6x = 42, UUSSRR) showed, respectively, the presence of a product corresponding to the Pm13 marker and a lower infection level than in the susceptible model. This study also describes the transfer of Ae. variabilis Eig. (2n = 4x = 28, U(v)U(v)S(v)S(v)) chromosomes carrying powdery mildew resistance into triticale (× Triticosecale Wittm., 2n = 6x = 42, AABBRR) using Ae. variabilis × S. cereale amphiploid forms. The individual chromosomes of Ae. variabilis, triticale 'Lamberto' and hybrids were characterized by genomic and fluorescence in situ hybridization (GISH/FISH). The chromosome configurations of the hybrid forms obtained were studied at first metaphase of meiosis of pollen mother cells (PMCs) using GISH. The statistical analysis showed that S-genome chromosome pairing and transmission to subsequent hybrid generations was diploid-like and had no influence on the pairing of triticale chromosomes. The cytogenetic study of hybrid forms was supported by marker-assisted selection using the Pm13 marker and visual evaluation of natural infection by Blumeria graminis, which allowed the selection of addition or substitution lines of hybrids carrying chromosome 3S(v) that were tolerant to powdery mildew infection.
[Molecular-genetic analysis of wheat (T. aestivum L.) genome with introgression of Ae. cylindrica Host genetic elements].

PubMed

Galaev, A V; Sivolap, Iu M

2005-01-01

Wheat-Aegilops hybrid plants, Triticum aestivum L. (2n = 42) x Aegilops cylindrica Host (2n = 28), were investigated using microsatellite markers. In two BC1F9 lines, genome modifications were detected that involved either the loss of DNA fragments of the initial variety or the appearance of Aegilops genome elements. In some of the hybrids investigated, new amplicons absent from the parental plants were found. No substitution of wheat chromosomes by Aegilops chromosomes was revealed. Analysis of microsatellite loci in BC2F5 plants showed stable introgression of Aegilops genetic elements into wheat, elimination of some transferred Aegilops DNA fragments in the course of backcrossing, and a decrease in the size of introgressed elements after backcrossing. Introgressive lines were classified according to genome changes.
Addition of Aegilops U and M Chromosomes Affects Protein and Dietary Fiber Content of Wholemeal Wheat Flour.

PubMed

Rakszegi, Marianna; Molnár, István; Lovegrove, Alison; Darkó, Éva; Farkas, András; Láng, László; Bedő, Zoltán; Doležel, Jaroslav; Molnár-Láng, Márta; Shewry, Peter

2017-01-01

Cereal grain fiber is an important health-promoting component in the human diet. One option to improve dietary fiber content and composition in wheat is to introduce genes from its wild relatives Aegilops biuncialis and Aegilops geniculata. This study showed that the addition of chromosomes 2Ug, 4Ug, 5Ug, 7Ug, 2Mg, 5Mg, and 7Mg of Ae. geniculata and 3Ub, 2Mb, 3Mb, and 7Mb of Ae. biuncialis into bread wheat increased the seed protein content. Chromosomes 1Ug and 1Mg increased the proportion of polymeric glutenin proteins, while the addition of chromosomes 1Ub and 6Ub led to its decrease. Both Aegilops species had higher proportions of β-glucan compared to arabinoxylan (AX) than wheat lines, and elevated β-glucan content was also observed in wheat chromosome addition lines 5U, 7U, and 7M. The AX content in wheat was increased by the addition of chromosomes 5Ug, 7Ug, and 1Ub, while water-soluble AX was increased by the addition of chromosomes 5U, 5M, and 7M, and to a lesser extent by chromosomes 3, 4, 6Ug, and 2Mb. Chromosomes 5Ug and 7Mb also affected the structure of wheat AX, as shown by the pattern of oligosaccharides released by digestion with endoxylanase. These results will help to map genomic regions responsible for edible fiber content in Aegilops and will contribute to the efficient transfer of wild alleles in introgression breeding programs to obtain wheat varieties with improved health benefits. Key Message: Addition of Aegilops U- and M-genome chromosomes 5 and 7 improves seed protein and fiber content and composition in wheat.
Visualization of A- and B-genome chromosomes in wheat (Triticum aestivum L.) x jointed goatgrass (Aegilops cylindrica Host) backcross progenies.

PubMed

Wang, Z N; Hang, A; Hansen, J; Burton, C; Mallory-Smith, C A; Zemetra, R S

2000-12-01

Wheat (Triticum aestivum) and jointed goatgrass (Aegilops cylindrica) can cross with each other, and their self-fertile backcross progenies frequently have extra chromosomes and chromosome segments, presumably retained from wheat, raising the possibility that a herbicide resistance gene might transfer from wheat to jointed goatgrass. Genomic in situ hybridization (GISH) was used to clarify the origin of these extra chromosomes. By using T. durum DNA (AABB genome) as a probe and jointed goatgrass DNA (CCDD genome) as blocking DNA, one, two, and three A- or B-genome chromosomes were identified in three BC2S2 individuals where 2n = 29, 30, and 31 chromosomes, respectively. A translocation between wheat and jointed goatgrass chromosomes was also detected in an individual with 30 chromosomes. In pollen mother cells with meiotic configuration of 14 II + 2 I, the two univalents were identified as being retained from the A or B genome of wheat. By using Ae. markgrafii DNA (CC genome) as a probe and wheat DNA (AABBDD genome) as blocking DNA, 14 C-genome chromosomes were visualized in all BC2S2 individuals. The GISH procedure provides a powerful tool to detect A- or B-genome chromatin in a jointed goatgrass background, making it possible to assess the risk of transfer of herbicide resistance genes located on the A or B genome of wheat to jointed goatgrass.
Variation in Susceptibility to Wheat dwarf virus among Wild and Domesticated Wheat

PubMed Central

Nygren, Jim; Shad, Nadeem; Kvarnheden, Anders; Westerbergh, Anna

2015-01-01

We investigated the variation in plant response in host-pathogen interactions between wild (Aegilops spp., Triticum spp.) and domesticated wheat (Triticum spp.) and Wheat dwarf virus (WDV). The distribution of WDV and its wild host species overlaps in Western Asia in the Fertile Crescent, suggesting a coevolutionary relationship. Bread wheat originates from a natural hybridization between wild emmer wheat (carrying the A and B genomes) and the wild D genome donor Aegilops tauschii, followed by polyploidization and domestication. We studied whether the strong selection during these evolutionary processes, leading to genetic bottlenecks, may have resulted in a loss of resistance in domesticated wheat. In addition, we investigated whether putative fluctuations in intensity of selection imposed on the host-pathogen interactions have resulted in a variation in susceptibility to WDV. To test our hypotheses we evaluated eighteen wild and domesticated wheat taxa, directly or indirectly involved in wheat evolution, for traits associated with WDV disease such as leaf chlorosis, different growth traits and WDV content. The plants were exposed to viruliferous leafhoppers (Psammotettix alienus) in a greenhouse trial and evaluated at two time points. We found three different plant response patterns: i) continuous reduction in growth over time, ii) weak response at an early stage of plant development but a much stronger response at a later stage, and iii) remission of symptoms over time. Variation in susceptibility may be explained by differences in the intensity of natural selection, shaping the coevolutionary interaction between WDV and the wild relatives. However, genetic bottlenecks during wheat evolution have not had a strong impact on WDV resistance. Further, this study indicates that the variation in susceptibility may be associated with the genome type and that the ancestor Ae. tauschii may be useful as genetic resource for the improvement of WDV resistance in wheat. PMID
Genetic map of Triticum turgidum based on a hexaploid wheat population without genetic recombination for D genome.

PubMed

Zhang, Li; Luo, Jiang-Tao; Hao, Ming; Zhang, Lian-Quan; Yuan, Zhong-Wei; Yan, Ze-Hong; Liu, Ya-Xi; Zhang, Bo; Liu, Bao-Long; Liu, Chun-Ji; Zhang, Huai-Gang; Zheng, You-Liang; Liu, Deng-Cai

2012-08-13

A synthetic doubled-haploid hexaploid wheat population, SynDH1, derived from the spontaneous chromosome doubling of triploid F1 hybrid plants obtained from the cross of hybrids Triticum turgidum ssp. durum line Langdon (LDN) and ssp. turgidum line AS313, with Aegilops tauschii ssp. tauschii accession AS60, was previously constructed. SynDH1 is a tetraploidization-hexaploid doubled haploid (DH) population because it contains recombinant A and B chromosomes from two different T. turgidum genotypes, while all the D chromosomes from Ae. tauschii are homogenous across the whole population. This paper reports the construction of a genetic map using this population. Of the 606 markers used to assemble the genetic map, 588 (97%) were assigned to linkage groups. These included 513 Diversity Arrays Technology (DArT) markers, 72 simple sequence repeat (SSR), one insertion site-based polymorphism (ISBP), and two high-molecular-weight glutenin subunit (HMW-GS) markers. These markers were assigned to the 14 chromosomes, covering 2048.79 cM, with a mean distance of 3.48 cM between adjacent markers. This map showed good coverage of the A and B genome chromosomes, apart from 3A, 5A, 6A, and 4B. Compared with previously reported maps, most shared markers showed highly consistent orders. This map was successfully used to identify five quantitative trait loci (QTL), including two for spikelet number on chromosomes 7A and 5B, two for spike length on 7A and 3B, and one for 1000-grain weight on 4B. However, differences in crossability QTL between the two T. turgidum parents may explain the segregation distortion regions on chromosomes 1A, 3B, and 6B. A genetic map of T. turgidum including 588 markers was constructed using a synthetic doubled haploid (SynDH) hexaploid wheat population. Five QTLs for three agronomic traits were identified from this population. However, more markers are needed to increase the density and resolution of this map in future studies.
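The summary statistics in this abstract can be re-derived with simple arithmetic, sketched below using only the reported figures. The variable names are illustrative. The reported mean of 3.48 cM matches total map length divided by the number of markers. Dividing by the number of intervals instead (markers minus linkage groups) is shown only for comparison and is an assumption about an alternative convention, not something the abstract states.

```python
# Minimal sketch re-deriving the map statistics from the reported figures.
marker_counts = {"DArT": 513, "SSR": 72, "ISBP": 1, "HMW-GS": 2}
markers_tested = 606
map_length_cm = 2048.79
linkage_groups = 14  # the 14 A- and B-genome chromosomes

assigned = sum(marker_counts.values())
assigned_pct = 100 * assigned / markers_tested
mean_per_marker = map_length_cm / assigned
# Alternative convention (assumption): divide by the number of intervals.
mean_per_interval = map_length_cm / (assigned - linkage_groups)

print(assigned)                     # 588
print(round(assigned_pct))          # 97
print(round(mean_per_marker, 2))    # 3.48
print(round(mean_per_interval, 2))  # 3.57
```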

Regions of the bread wheat D genome associated with variation in key photosynthesis traits and shoot biomass under both well watered and water deficient conditions.

PubMed

Osipova, Svetlana; Permyakov, Alexey; Permyakova, Marina; Pshenichnikova, Tatyana; Verkhoturov, Vasiliy; Rudikovsky, Alexandr; Rudikovskaya, Elena; Shishparenok, Alexandr; Doroshkov, Alexey; Börner, Andreas

2016-05-01

A quantitative trait locus (QTL) approach was taken to reveal the genetic basis in wheat of traits associated with photosynthesis during a period of exposure to water deficit stress. The performance, with respect to shoot biomass, gas exchange and chlorophyll fluorescence, leaf pigment content and the activity of various ascorbate-glutathione cycle enzymes and catalase, of a set of 80 wheat lines, each containing a single chromosomal segment introgressed from the bread wheat D genome progenitor Aegilops tauschii, was monitored in plants exposed to various water regimes. Four of the seven D genome chromosomes (1D, 2D, 5D, and 7D) carried clusters of both major (LOD >3.0) and minor (LOD between 2.0 and 3.0) QTL. A major QTL underlying the activity of glutathione reductase was located on chromosome 2D, and another, controlling the activity of ascorbate peroxidase, on chromosome 7D. A region of chromosome 2D defined by the microsatellite locus Xgwm539 and a second on chromosome 7D flanked by the marker loci Xgwm1242 and Xgwm44 harbored a number of QTL associated with the water deficit stress response.
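The major/minor distinction used above can be written as a small classifier. This is only a sketch of the thresholds quoted in the abstract (major: LOD > 3.0; minor: LOD between 2.0 and 3.0). The function name is hypothetical, and treating the boundaries 2.0 and 3.0 as inclusive for "minor" is an assumption, since the abstract does not say how ties are handled.

```python
# Minimal sketch: label a QTL by its LOD score using the abstract's thresholds.
# Boundary handling (2.0 and 3.0 counted as "minor") is an assumption.
def classify_qtl(lod, minor_threshold=2.0, major_threshold=3.0):
    """Return "major", "minor", or None for a LOD score."""
    if lod > major_threshold:
        return "major"
    if lod >= minor_threshold:
        return "minor"
    return None

for score in (3.5, 2.4, 1.2):  # hypothetical LOD scores
    print(score, classify_qtl(score))
```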
Breeding value of primary synthetic wheat genotypes for grain yield

USDA-ARS's Scientific Manuscript database

To introduce new genetic diversity into the bread wheat gene pool from its progenitor, Aegilops tauschii (Coss.) Schmalh, 33 primary synthetic hexaploid wheat genotypes (SYN) were crossed to 20 spring bread wheat (BW) cultivars at the International Wheat and Maize Improvement Center. Modified single...
A Candidate Gene for Aphid Resistance in Wheat

USDA-ARS's Scientific Manuscript database

The greenbug, Schizaphis graminum (Rondani), is an important aphid pest of small grain crops in many parts of the world. A single dominant gene, Gb3, which originated from Aegilops tauschii, has shown consistent and durable resistance against prevailing greenbug biotypes in wheat fields. Previously, we mapp...
Cloning and function validation of a nb-arc-lrr-type candidate gene for the greenbug aphid resistance locus, Gb3, in wheat

USDA-ARS's Scientific Manuscript database

The greenbug, Schizaphis graminum, is one of the most important aphid pests of small grain crops in many parts of the world. A single dominant gene, Gb3, which originated from Aegilops tauschii, has shown consistent and durable resistance against prevailing greenbug biotypes in wheat fields. A fine genetic ...
Sequencing of Chloroplast Genomes from Wheat, Barley, Rye and Their Relatives Provides a Detailed Insight into the Evolution of the Triticeae Tribe

PubMed Central

Middleton, Christopher P.; Senerchia, Natacha; Stein, Nils; Akhunov, Eduard D.; Keller, Beat

2014-01-01

Using Roche/454 technology, we sequenced the chloroplast genomes of 12 Triticeae species, including bread wheat, barley and rye, as well as the diploid progenitors and relatives of bread wheat Triticum urartu, Aegilops speltoides and Ae. tauschii. Two wild tetraploid taxa, Ae. cylindrica and Ae. geniculata, were also included. Additionally, we incorporated wild Einkorn wheat Triticum boeoticum and its domesticated form T. monococcum and two Hordeum spontaneum (wild barley) genotypes. Chloroplast genomes were used for overall sequence comparison, phylogenetic analysis and dating of divergence times. We estimate that barley diverged from rye and wheat approximately 8–9 million years ago (MYA). The genome donors of hexaploid wheat diverged between 2.1–2.9 MYA, while rye diverged from Triticum aestivum approximately 3–4 MYA, more recently than previously estimated. Interestingly, the A genome taxa T. boeoticum and T. urartu were estimated to have diverged approximately 570,000 years ago. As these two have a reproductive barrier, the divergence time estimate also provides an upper limit for the time required for the formation of a species boundary between the two. Furthermore, we conclusively show that the chloroplast genome of hexaploid wheat was contributed by the B genome donor and that this unknown species diverged from Ae. speltoides about 980,000 years ago. Additionally, sequence alignments identified a translocation of a chloroplast segment to the nuclear genome which is specific to the rye/wheat lineage. We propose the presented phylogeny and divergence time estimates as a reference framework for future studies on Triticeae. PMID:24614886
Introgression of Aegilops speltoides segments in Triticum aestivum and the effect of the gametocidal genes.

PubMed

King, Julie; Grewal, Surbhi; Yang, Cai-Yun; Hubbart Edwards, Stella; Scholefield, Duncan; Ashling, Stephen; Harper, John A; Allen, Alexandra M; Edwards, Keith J; Burridge, Amanda J; King, Ian P

2018-02-12

Bread wheat (Triticum aestivum) has been through a severe genetic bottleneck as a result of its evolution and domestication. It is therefore essential that new sources of genetic variation are generated and utilized. This study aimed to generate genome-wide introgressed segments from Aegilops speltoides. Introgressions generated from this research will be made available for phenotypic analysis. Aegilops speltoides was crossed as the male parent to T. aestivum 'Paragon'. The interspecific hybrids were then backcrossed to Paragon. Introgressions were detected and characterized using the Affymetrix Axiom Array and genomic in situ hybridization (GISH). Recombination in the gametes of the F1 hybrids was at a level where it was possible to generate a genetic linkage map of Ae. speltoides. This was used to identify 294 wheat/Ae. speltoides introgressions. Introgressions from all seven linkage groups of Ae. speltoides were found, including both large and small segments. Comparative analysis showed that overall macro-synteny is conserved between Ae. speltoides and T. aestivum, but that Ae. speltoides does not contain the 4A/5A/7B translocations present in wheat. Aegilops speltoides has been reported to carry gametocidal genes, i.e. genes that ensure their transmission through the gametes to the next generation. Transmission rates of the seven Ae. speltoides linkage groups introgressed into wheat varied. A 100 % transmission rate of linkage group 2 demonstrates the presence of the gametocidal genes on this chromosome. A high level of recombination occurs between the chromosomes of wheat and Ae. speltoides, leading to the generation of large numbers of introgressions with the potential for exploitation in breeding programmes. Due to the gametocidal genes, all germplasm developed will always contain a segment from Ae. speltoides linkage group 2S, in addition to an introgression from any other linkage group. © The Authors 2017. Published by Oxford University Press on behalf of
A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.

PubMed

Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio

2016-01-01

The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories: motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.
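One of the three pattern categories named above, a motif flanked by two copies of another motif, can be illustrated with plain string search. This is a simplified sketch and not the Mamushka implementation, which the abstract does not describe in detail; the function names, the exact-match search, and the `max_gap` parameter are assumptions introduced here.

```python
def find_all(seq: str, motif: str) -> list[int]:
    """Return the start index of every (possibly overlapping) exact match."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def flanked_hits(seq: str, outer: str, inner: str, max_gap: int = 50):
    """Find inner-motif matches lying between two non-overlapping outer copies.

    Returns (left_outer_start, inner_start, right_outer_start) tuples.
    max_gap (hypothetical parameter) caps the distance between the end of
    the left outer copy and the start of the right outer copy.
    """
    outer_pos = find_all(seq, outer)
    inner_pos = find_all(seq, inner)
    results = []
    for idx, left in enumerate(outer_pos):
        left_end = left + len(outer)
        for right in outer_pos[idx + 1:]:
            if right < left_end:
                continue  # the two outer copies overlap
            if right - left_end > max_gap:
                break  # positions are sorted, so later copies are farther away
            for p in inner_pos:
                if left_end <= p and p + len(inner) <= right:
                    results.append((left, p, right))
    return results


if __name__ == "__main__":
    # Toy sequence for illustration only: ACGT ... GGCC ... ACGT
    print(flanked_hits("TTACGTGGGCCACGTAA", outer="ACGT", inner="GGCC"))
```

A real nested-TE search would need approximate matching and de novo motif discovery; the sketch only shows the positional test that defines the "flanked" category.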
Impact of transgene genome location on gene migration from herbicide-resistant wheat (Triticum aestivum L.) to jointed goatgrass (Aegilops cylindrica Host).

PubMed

Rehman, Maqsood; Hansen, Jennifer L; Mallory-Smith, Carol A; Zemetra, Robert S

2017-08-01

Wheat (Triticum aestivum) (ABD) and jointed goatgrass (Aegilops cylindrica) (CD) can cross and produce hybrids that can backcross to either parent. Such backcrosses can result in progeny with chromosomes and/or chromosome segments retained from wheat. Thus, a herbicide resistance gene could migrate from wheat to jointed goatgrass. In theory, the risk of gene migration from herbicide-resistant wheat to jointed goatgrass is more likely if the gene is located on the D genome and less likely if the gene is located on the A or B genome of wheat. BC1 populations (jointed goatgrass as a recurrent parent) were analyzed for chromosome numbers and transgene transmission rates under sprayed and non-sprayed conditions. Transgene retention in the non-sprayed BC1 generation for the A, B and D genomes was 84, 60 and 64% respectively. In the sprayed populations, the retention was 81, 59 and 74% respectively. The gene transmission rates were higher than the expected 50% or less under sprayed and non-sprayed conditions, possibly owing to meiotic chromosome restitution and/or chromosome non-disjunction. Such high transmission rates in the BC1 generation negate the benefits of gene placement for reducing the potential of gene migration from wheat to jointed goatgrass. © 2016 Society of Chemical Industry.
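The comparison with the expected 50% transmission rate is simple arithmetic and can be tabulated directly from the percentages quoted in the abstract. A minimal sketch follows; the dictionary layout and the function name `excess_over_expected` are illustrative choices, and 50% is used as the reference even though the abstract states the expectation as "50% or less".

```python
EXPECTED = 50.0  # reference transmission rate (%) quoted in the abstract

# Transgene retention (%) in the BC1 generation, as reported in the abstract.
retention = {
    "non-sprayed": {"A": 84.0, "B": 60.0, "D": 64.0},
    "sprayed": {"A": 81.0, "B": 59.0, "D": 74.0},
}


def excess_over_expected(table, expected=EXPECTED):
    """Return retention minus the expected rate, in percentage points."""
    return {
        condition: {genome: value - expected for genome, value in genomes.items()}
        for condition, genomes in table.items()
    }


excess = excess_over_expected(retention)

if __name__ == "__main__":
    for condition, genomes in excess.items():
        for genome, points in genomes.items():
            print(f"{condition:12s} {genome} genome: +{points:.0f} points")
```

Every entry is positive, which is the sense in which the abstract says the rates exceeded expectation regardless of genome location.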
Accelerated Senescence and Enhanced Disease Resistance in Hybrid Chlorosis Lines Derived from Interspecific Crosses between Tetraploid Wheat and Aegilops tauschii

PubMed Central

Tosa, Yukio; Yoshida, Kentaro; Park, Pyoyun; Takumi, Shigeo

2015-01-01

Hybrid chlorosis, a type of hybrid incompatibility, has frequently been reported in inter- and intraspecific crosses of allopolyploid wheat. In a previous study, we reported some types of growth abnormalities such as hybrid necrosis and observed hybrid chlorosis with mild or severe abnormalities in wheat triploids obtained in crosses between tetraploid wheat cultivar Langdon and four Ae. tauschii accessions and in their derived synthetic hexaploids. However, the molecular mechanisms underlying hybrid chlorosis are not well understood. Here, we compared cytology and gene expression in leaves to characterize the abnormal growth in wheat synthetics showing mild and severe chlorosis. In addition, we compared disease resistance to wheat blast fungus. In total 55 and 105 genes related to carbohydrate metabolism and 53 and 89 genes for defense responses were markedly up-regulated in the mild and severe chlorosis lines, respectively. Abnormal chloroplasts formed in the mesophyll cells before the leaves yellowed in the hybrid chlorosis lines. The plants with mild chlorosis showed increased resistance to wheat blast and powdery mildew fungi, although among the agricultural traits examined, only two, third internode length and maturation time, differed significantly between the wild type and plants showing mild chlorosis. These observations suggest that senescence might be accelerated in hybrid chlorosis lines of wheat synthetics. Moreover, in wheat synthetics showing mild chlorosis, the negative effects on biomass can be minimized, and they may show substantial fitness under pathogen-polluted conditions. PMID:25806790
Unreduced gamete formation in wheat × Aegilops spp. hybrids is genotype specific and prevented by shared homologous subgenomes.

PubMed

Fakhri, Zhaleh; Mirzaghaderi, Ghader; Ahmadian, Samira; Mason, Annaliese S

2016-05-01

The presence of homologous subgenomes inhibited unreduced gamete formation in wheat × Aegilops interspecific hybrids. Unreduced gamete rates were under the control of the wheat nuclear genome. Production of unreduced gametes is common among interspecific hybrids, and may be affected by parental genotypes and genomic similarity. In the present study, five cultivars of Triticum aestivum and two tetraploid Aegilops species (i.e. Ae. triuncialis and Ae. cylindrica) were reciprocally crossed to produce 20 interspecific hybrid combinations. These hybrids comprised two different types: T. aestivum × Aegilops triuncialis; 2n = ABDU(t)C(t) (which lack a common subgenome) and T. aestivum × Ae. cylindrica; 2n = ABDD(c)C(c) (which share a common subgenome). The frequency of unreduced gametes in F1 hybrids was estimated in sporads from the frequency of dyads, and the frequency of viable pollen, germinated pollen and seed set were recorded. Different meiotic abnormalities recorded in the hybrids included precocious chromosome migration to the poles at metaphase I and II, laggards in anaphase I and II, micronuclei and chromosome stickiness, failure in cell wall formation, premature cytokinesis and microspore fusion. The mean frequency of restitution meiosis was 10.1 %, and the mean frequency of unreduced viable pollen was 4.84 % in T. aestivum × Ae. triuncialis hybrids. By contrast, in T. aestivum × Ae. cylindrica hybrids no meiotic restitution was observed, and a low rate of viable gametes (0.3 %) was recorded. This study presents evidence that high levels of homologous pairing between the D and D(c) subgenomes may interfere with meiotic restitution and the formation of unreduced gametes. Variation in unreduced gamete production was also observed between T. aestivum × Ae. triuncialis hybrid plants, suggesting genetic control of this trait.
Study of improving the quality of bread and wheat-aegilops hybrids with the biotechnological ways

NASA Astrophysics Data System (ADS)

Ganbarzada, Aygun; Hasanova, Sudaba

2016-08-01

People's great need for bread demands an increase in high-quality grain crops. At present, different methods of biochemistry, genetics and molecular biology are widely used in the selection process to solve this problem. The aims were to investigate the biochemical peculiarities of wheat-Aegilops hybrids and to define the correlations between these characteristics, and to investigate the technological peculiarities of wheat-Aegilops hybrids and to define the relation between their main biochemical and technological characteristics. The investigation led to the following conclusions. According to their morphological and biochemical characteristics, the wheat-Aegilops hybrids have approached wheat. The electrophoretic spectra of the wheat-Aegilops hybrids that are stable in their morphological characteristics are both homogeneous and heterogeneous. Some groups of protein components have been inherited by the hybrids from their parents. However, spontaneous hybridisation results in components of other, unknown wheats appearing in these electrophoretic spectra. There is a relation between the electrophoretic spectra and the indicators of grain quality.
Low-molecular-weight glutenin subunits from the 1U genome of Aegilops umbellulata confer superior dough rheological properties and improve breadmaking quality of bread wheat.

PubMed

Wang, Jian; Wang, Chang; Zhen, Shoumin; Li, Xiaohui; Yan, Yueming

2018-04-01

Wheat-related genomes may carry new glutenin genes with the potential for quality improvement of breadmaking. In this study, we estimated the gluten quality properties of the wheat line CNU609 derived from crossing between Chinese Spring (CS, Triticum aestivum L., 2n = 6x = 42, AABBDD) and the wheat-Aegilops umbellulata (2n = 2x = 14, UU) 1U(1B) substitution line, and investigated the function of 1U-encoded low-molecular-weight glutenin subunits (LMW-GS). The main quality parameters of CNU609 were significantly improved due to introgression of the 1U genome, including dough development time, stability time, farinograph quality number, gluten index, loaf size and inner structure. Glutenin analysis showed that CNU609 and CS had the same high-molecular-weight glutenin subunit (HMW-GS) composition, but CNU609 carried eight specific 1U genome-encoded LMW-GS. The introgression of the 1U-encoded LMW-GS led to more and larger protein body formation in the CNU609 endosperm. Two new LMW-m type genes from the 1U genome, designated Glu-U3a and Glu-U3b, were cloned and characterized. Secondary structure prediction implied that both Glu-U3a and Glu-U3b encode subunits with high α-helix and β-strand content that could benefit the formation of superior gluten structure. Our results indicate that the 1U genome has superior LMW-GS that can be used as new gene resources for wheat gluten quality improvement. © 2017 Society of Chemical Industry.
Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome

USDA-ARS's Scientific Manuscript database

Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat ...
A comparative analysis of chromosome pairing at metaphase I in interspecific hybrids between durum wheat (Triticum turgidum L.) and the most widespread Aegilops species.

PubMed

Cifuentes, M; Garcia-Agüero, V; Benavente, E

2010-07-01

Homoeologous metaphase I (MI) associations in hybrids between durum wheat and its wild allotetraploid relatives Aegilops neglecta, Ae. triuncialis and Ae. ventricosa have been characterized by a genomic in situ hybridization procedure that allows simultaneous discrimination of A, B and wild species genomes. Earlier results in equivalent hybrids with the wild species Ae. cylindrica and Ae. geniculata have also been considered to comparatively assay the MI pairing pattern of the durum wheat x Aegilops interspecific combinations more likely to occur in nature. The general picture can be drawn as follows. A and B wheat genomes pair with each other less than the 2 wild constituent genomes do in any of the hybrid combinations examined. Interspecific wheat-wild associations account for 60-70% of total MI pairing in all hybrids, except in that derived from Ae. triuncialis, but the A genome is always the wheat partner most frequently involved in MI pairing with the wild homoeologues. Hybrids with Ae. cylindrica, Ae. geniculata and Ae. ventricosa showed similar reduced levels of MI association and virtually identical MI pairing patterns. However, certain recurrent differences were found when the pattern of homoeologous pairing of hybrids from either Ae. triuncialis or Ae. neglecta was contrasted to that observed in the other durum wheat hybrid combinations. In the former case, a remarkable preferential pairing between the wild species constituent genomes U(t) and C(t) seems to be the reason, whereas a general promotion of homoeologous pairing, qualitatively similar to that observed under the effect of the ph1c mutation, appears to occur in the hybrid with Ae. neglecta. It is further discussed whether the results reported here can be extrapolated to the corresponding bread wheat hybrid combinations. Copyright 2010 S. Karger AG, Basel.
Reconciling the evolutionary origin of bread wheat (Triticum aestivum).

PubMed

El Baidouri, Moaine; Murat, Florent; Veyssiere, Maeva; Molinier, Mélanie; Flores, Raphael; Burlot, Laura; Alaux, Michael; Quesneville, Hadi; Pont, Caroline; Salse, Jérôme

2017-02-01

The origin of bread wheat (Triticum aestivum; AABBDD) has been a subject of controversy and of intense debate in the scientific community over the last few decades. In 2015, three articles published in New Phytologist discussed the origin of hexaploid bread wheat (AABBDD) from the diploid progenitors Triticum urartu (AA), a relative of Aegilops speltoides (BB) and Triticum tauschii (DD). Access to new genomic resources since 2013 has offered the opportunity to gain novel insights into the paleohistory of modern bread wheat, allowing characterization of its origin from its diploid progenitors at unprecedented resolution. We propose a reconciled evolutionary scenario for the modern bread wheat genome based on the complementary investigation of transposable element and mutation dynamics between diploid, tetraploid and hexaploid wheat. In this scenario, the structural asymmetry observed between the A, B and D subgenomes in hexaploid bread wheat derives from the cumulative effect of diploid progenitor divergence, the hybrid origin of the D subgenome, and subgenome partitioning following the polyploidization events. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Complete characterization of wheat-alien metaphase I pairing in interspecific hybrids between durum wheat (Triticum turgidum L.) and jointed goatgrass (Aegilops cylindrica Host).

PubMed

Cifuentes, Marta; Benavente, Elena

2009-05-01

The pattern of homoeologous metaphase I (MI) pairing has been fully characterized in durum wheat x Aegilops cylindrica hybrids (2n = 4x = 28, ABC(c)D(c)) by an in situ hybridization procedure that has permitted individual discrimination of every wheat and wild constituent genome. One of the three hybrid genotypes examined carried the ph1c mutation. In all cases, MI associations between chromosomes of both species represented around two-thirds of the total. The main results from the analysis are as follows: (a) the A genome chromosomes are involved in wheat-wild MI pairing more frequently than the B genome partners, irrespective of the alien genome considered; (b) both durum wheat genomes pair preferentially with the D(c) genome of jointed goatgrass. These findings are discussed in relation to the potential of genetic transference between wheat crops and this weedy relative. It can also be highlighted that inactivation of Ph1 provoked a relatively higher promotion of MI associations involving the B genome.
[Hybrids of Aegilops cylindrica Host with Triticum durum Desf. and T. aestivum L].

PubMed

Avsenin, V I; Motsnyĭ, A I; Rybalka, A I; Faĭt, V I

2003-01-01

Hybrids of durum and bread wheat with Ae. cylindrica have been obtained without using an embryo rescue technique. The hybrid output (as a percentage of pollinated flowers) under field conditions was 1.0, 15.3 and 10.0% in the crosses T. durum x Ae. cylindrica, Ae. cylindrica x T. durum and T. aestivum x Ae. cylindrica, respectively. A high level of meiotic chromosome pairing between the homologous D genomes of bread wheat and Aegilops has been revealed (c = 80.0-83.7%). The possibility of homoeologous pairing between wheat and Ae. cylindrica chromosomes has been shown. No correlation was found between the levels of homologous and homoeologous pairing. The possibilities of genetic material interchange, including between the tetraploid species, as well as the use of Ae. cylindrica cytoplasm for durum wheat breeding are discussed.
Genomic Imprinting Was Evolutionarily Conserved during Wheat Polyploidization.

PubMed

Yang, Guanghui; Liu, Zhenshan; Gao, Lulu; Yu, Kuohai; Feng, Man; Yao, Yingyin; Peng, Huiru; Hu, Zhaorong; Sun, Qixin; Ni, Zhongfu; Xin, Mingming

2018-01-01

Genomic imprinting is an epigenetic phenomenon that causes genes to be differentially expressed depending on their parent of origin. To evaluate the evolutionary conservation of genomic imprinting and the effects of ploidy on this process, we investigated parent-of-origin-specific gene expression patterns in the endosperm of diploid (Aegilops spp), tetraploid, and hexaploid wheat (Triticum spp) at various stages of development via high-throughput transcriptome sequencing. We identified 91, 135, and 146 maternally or paternally expressed genes (MEGs or PEGs, respectively) in diploid, tetraploid, and hexaploid wheat, respectively, 52.7% of which exhibited dynamic expression patterns at different developmental stages. Gene Ontology enrichment analysis suggested that MEGs and PEGs were involved in metabolic processes and DNA-dependent transcription, respectively. Nearly half of the imprinted genes exhibited conserved expression patterns during wheat hexaploidization. In addition, 40% of the homoeolog pairs originating from whole-genome duplication were consistently maternally or paternally biased in the different subgenomes of hexaploid wheat. Furthermore, imprinted expression was found for 41.2% and 50.0% of homolog pairs that evolved by tandem duplication after genome duplication in tetraploid and hexaploid wheat, respectively. These results suggest that genomic imprinting was evolutionarily conserved between closely related Triticum and Aegilops species and in the face of polyploid hybridization between species in these genera. © 2018 American Society of Plant Biologists. All rights reserved.
Genome-wide identification and analysis of the MADS-box gene family in bread wheat (Triticum aestivum L.)

PubMed Central

Yang, Congcong; Ding, Puyang; Liu, Yaxi; Qiao, Linyi; Chang, Zhijian; Geng, Hongwei; Wang, Penghao; Jiang, Qiantao; Wang, Jirui; Chen, Guoyue; Wei, Yuming; Zheng, Youliang; Lan, Xiujin

2017-01-01

The MADS-box genes encode transcription factors with key roles in plant growth and development. A comprehensive analysis of the MADS-box gene family in bread wheat (Triticum aestivum) has not yet been conducted, and our understanding of their roles in stress is rather limited. Here, we report the identification and characterization of the MADS-box gene family in wheat. A total of 180 MADS-box genes classified as 32 Mα, 5 Mγ, 5 Mδ, and 138 MIKC types were identified. Evolutionary analysis of the orthologs among T. urartu, Aegilops tauschii and wheat as well as homeologous sequences analysis among the three sub-genomes in wheat revealed that gene loss and chromosomal rearrangements occurred during and/or after the origin of bread wheat. Forty wheat MADS-box genes that were expressed throughout the investigated tissues and development stages were identified. The genes that were regulated in response to both abiotic stresses (i.e., phosphorus deficiency, drought, heat, and combined drought and heat) and biotic stresses (i.e., Fusarium graminearum, Septoria tritici, stripe rust and powdery mildew) were detected as well. A few notable MADS-box genes were specifically expressed in a single tissue and those showed relatively higher expression differences between the stress and control treatment. The expression patterns of considerable MADS-box genes differed from those of their orthologs in Brachypodium, rice, and Arabidopsis. Collectively, the present study provides new insights into the possible roles of MADS-box genes in response to stresses and will be valuable for further functional studies of important candidate MADS-box genes. PMID:28742823
Genome-wide identification and analysis of the MADS-box gene family in bread wheat (Triticum aestivum L.).

PubMed

Ma, Jian; Yang, Yujie; Luo, Wei; Yang, Congcong; Ding, Puyang; Liu, Yaxi; Qiao, Linyi; Chang, Zhijian; Geng, Hongwei; Wang, Penghao; Jiang, Qiantao; Wang, Jirui; Chen, Guoyue; Wei, Yuming; Zheng, Youliang; Lan, Xiujin

2017-01-01

The MADS-box genes encode transcription factors with key roles in plant growth and development. A comprehensive analysis of the MADS-box gene family in bread wheat (Triticum aestivum) has not yet been conducted, and our understanding of their roles in stress is rather limited. Here, we report the identification and characterization of the MADS-box gene family in wheat. A total of 180 MADS-box genes classified as 32 Mα, 5 Mγ, 5 Mδ, and 138 MIKC types were identified. Evolutionary analysis of the orthologs among T. urartu, Aegilops tauschii and wheat as well as homeologous sequences analysis among the three sub-genomes in wheat revealed that gene loss and chromosomal rearrangements occurred during and/or after the origin of bread wheat. Forty wheat MADS-box genes that were expressed throughout the investigated tissues and development stages were identified. The genes that were regulated in response to both abiotic stresses (i.e., phosphorus deficiency, drought, heat, and combined drought and heat) and biotic stresses (i.e., Fusarium graminearum, Septoria tritici, stripe rust and powdery mildew) were detected as well. A few notable MADS-box genes were specifically expressed in a single tissue and those showed relatively higher expression differences between the stress and control treatment. The expression patterns of considerable MADS-box genes differed from those of their orthologs in Brachypodium, rice, and Arabidopsis. Collectively, the present study provides new insights into the possible roles of MADS-box genes in response to stresses and will be valuable for further functional studies of important candidate MADS-box genes.
Genomic Imprinting Was Evolutionarily Conserved during Wheat Polyploidization

PubMed Central

Yang, Guanghui; Liu, Zhenshan; Gao, Lulu; Yu, Kuohai; Feng, Man; Peng, Huiru; Sun, Qixin; Ni, Zhongfu

2018-01-01

Genomic imprinting is an epigenetic phenomenon that causes genes to be differentially expressed depending on their parent of origin. To evaluate the evolutionary conservation of genomic imprinting and the effects of ploidy on this process, we investigated parent-of-origin-specific gene expression patterns in the endosperm of diploid (Aegilops spp), tetraploid, and hexaploid wheat (Triticum spp) at various stages of development via high-throughput transcriptome sequencing. We identified 91, 135, and 146 maternally or paternally expressed genes (MEGs or PEGs, respectively) in diploid, tetraploid, and hexaploid wheat, respectively, 52.7% of which exhibited dynamic expression patterns at different developmental stages. Gene Ontology enrichment analysis suggested that MEGs and PEGs were involved in metabolic processes and DNA-dependent transcription, respectively. Nearly half of the imprinted genes exhibited conserved expression patterns during wheat hexaploidization. In addition, 40% of the homoeolog pairs originating from whole-genome duplication were consistently maternally or paternally biased in the different subgenomes of hexaploid wheat. Furthermore, imprinted expression was found for 41.2% and 50.0% of homolog pairs that evolved by tandem duplication after genome duplication in tetraploid and hexaploid wheat, respectively. These results suggest that genomic imprinting was evolutionarily conserved between closely related Triticum and Aegilops species and in the face of polyploid hybridization between species in these genera. PMID:29298834
[Allelic variation at high-molecular-weight glutenin subunit loci in Aegilops biuncialis Vis].

PubMed

Kozub, N A; Sozinov, I A; Ksinias, I N; Sozinov, A A

2011-09-01

Alleles at the high-molecular-weight glutenin subunit loci Glu-U1 and Glu-M(b)1 were analyzed in the tetraploid species Aegilops biuncialis (UUM(b)M(b)). The material for the investigation included the collection of 39 accessions of Ae. biuncialis from Ukraine (the Crimea), one Hellenic accession, one accession of unknown origin, F2 seeds from different crosses, as well as samples from natural populations from the Crimea. Ae. umbellulata and Ae. comosa accessions were used to allocate components of the HMW glutenin subunit patterns of Ae. biuncialis to U or M(b) genomes. Eight alleles were identified at the Glu-U1 locus and ten alleles were revealed at the Glu-M(b)1 locus. Among alleles at the Glu-M(b)1 locus of Ae. biuncialis there were two alleles controlling the y-type subunit only and one allele encoding the x-subunit only.
Discovery and characterization of two new stem rust resistance genes in Aegilops sharonensis.

PubMed

Yu, Guotai; Champouret, Nicolas; Steuernagel, Burkhard; Olivera, Pablo D; Simmons, Jamie; Williams, Cole; Johnson, Ryan; Moscou, Matthew J; Hernández-Pinzón, Inmaculada; Green, Phon; Sela, Hanan; Millet, Eitan; Jones, Jonathan D G; Ward, Eric R; Steffenson, Brian J; Wulff, Brande B H

2017-06-01

We identified two novel wheat stem rust resistance genes, Sr-1644-1Sh and Sr-1644-5Sh, in Aegilops sharonensis that are effective against widely virulent African races of the wheat stem rust pathogen. Stem rust is one of the most important diseases of wheat in the world. When single stem rust resistance (Sr) genes are deployed in wheat, they are often rapidly overcome by the pathogen. To this end, we initiated a search for novel sources of resistance in diverse wheat relatives and identified the wild goatgrass species Aegilops sharonensis (Sharon goatgrass) as a rich reservoir of resistance to wheat stem rust. The objectives of this study were to discover and map novel Sr genes in Ae. sharonensis and to explore the possibility of identifying new Sr genes by genome-wide association study (GWAS). We developed two biparental populations between resistant and susceptible accessions of Ae. sharonensis and performed QTL and linkage analysis. In an F6 recombinant inbred line and an F2 population, two genes were identified that mapped to the short arm of chromosome 1S(sh), designated as Sr-1644-1Sh, and the long arm of chromosome 5S(sh), designated as Sr-1644-5Sh. The gene Sr-1644-1Sh confers a high level of resistance to race TTKSK (a member of the Ug99 race group), while the gene Sr-1644-5Sh conditions strong resistance to TRTTF, another widely virulent race found in Yemen. Additionally, GWAS was conducted on 125 diverse Ae. sharonensis accessions for stem rust resistance. The gene Sr-1644-1Sh was detected by GWAS, while Sr-1644-5Sh was not detected, indicating that the effectiveness of GWAS might be affected by marker density, population structure, low allele frequency and other factors.
Generation of Wheat Transcription Factor FOX Rice Lines and Systematic Screening for Salt and Osmotic Stress Tolerance.

PubMed

Wu, Jinxia; Zhang, Zhiguo; Zhang, Qian; Liu, Yayun; Zhu, Butuo; Cao, Jian; Li, Zhanpeng; Han, Longzhi; Jia, Jizeng; Zhao, Guangyao; Sun, Xuehui

2015-01-01

Transcription factors (TFs) play important roles in plant growth, development, and responses to environmental stress. In this study, we collected 1,455 full-length (FL) cDNAs of TFs, representing 45 families, from wheat and its relatives Triticum urartu, Aegilops speltoides, Aegilops tauschii, Triticum carthlicum, and Triticum aestivum. More than 15,000 T0 TF FOX (Full-length cDNA Over-eXpressing) rice lines were generated; of these, 10,496 lines set seeds. About 14.88% of the T0 plants showed obvious phenotypic changes. T1 lines (5,232 lines) were screened for salt and osmotic stress tolerance using 150 mM NaCl and 20% (v/v) PEG-4000, respectively. Among them, five lines (591, 746, 1647, 1812, and J4065) showed enhanced salt stress tolerance, five lines (591, 746, 898, 1078, and 1647) showed enhanced osmotic stress tolerance, and three lines (591, 746, and 1647) showed both salt and osmotic stress tolerance. Further analysis of the T-DNA flanking sequences showed that line 746 over-expressed TaEREB1, line 898 over-expressed TabZIPD, and lines 1812 and J4065 over-expressed TaOBF1a and TaOBF1b, respectively. The enhanced salt and osmotic stress tolerance of lines 898 and 1812 was confirmed by retransformation of the respective genes. Our results demonstrate that a heterologous FOX system may be used as an alternative genetic resource for the systematic functional analysis of the wheat genome.
Decomposing Additive Genetic Variance Revealed Novel Insights into Trait Evolution in Synthetic Hexaploid Wheat.

PubMed

Jighly, Abdulqader; Joukhadar, Reem; Singh, Sukhwinder; Ogbonnaya, Francis C

2018-01-01

Whole genome duplication (WGD) is an evolutionary phenomenon, which causes significant changes to genomic structure and trait architecture. In recent years, a number of studies decomposed the additive genetic variance explained by different sets of variants. However, they investigated diploid populations only and none of the studies examined any polyploid organism. In this research, we extended the application of this approach to polyploids, to differentiate the additive variance explained by the three subgenomes and seven sets of homoeologous chromosomes in synthetic allohexaploid wheat (SHW) to gain a better understanding of trait evolution after WGD. Our SHW population was generated by crossing improved durum parents (Triticum turgidum; 2n = 4x = 28, AABB subgenomes) with the progenitor species Aegilops tauschii (syn. Ae. squarrosa, T. tauschii; 2n = 2x = 14, DD subgenome). The population was phenotyped for 10 fungal/nematode resistance traits as well as two abiotic stresses. We showed that the wild D subgenome dominated the additive effect and this dominance affected the A more than the B subgenome. We provide evidence that this dominance was not inflated by population structure, relatedness among individuals or by longer linkage disequilibrium blocks observed in the D subgenome within the population used for this study. The cumulative size of the three homoeologs of the seven chromosomal groups showed a weak but significant positive correlation with their cumulative explained additive variance. Furthermore, an average of 69% for each chromosomal group's cumulative additive variance came from one homoeolog that had the highest explained variance within the group across all 12 traits. We hypothesize that structural and functional changes during diploidization may explain chromosomal group relations as allopolyploids keep balanced dosage for many genes. Our results contribute to a better understanding of trait evolution mechanisms in polyploidy, which will
Global transgenerational gene expression dynamics in two newly synthesized allohexaploid wheat (Triticum aestivum) lines

PubMed Central

2012-01-01

Background: Alteration in gene expression resulting from allopolyploidization is a prominent feature in plants, but its spectrum and extent are not fully known. Common wheat (Triticum aestivum) was formed via allohexaploidization about 10,000 years ago, and became the most important crop plant. To gain further insights into the genome-wide transcriptional dynamics associated with the onset of common wheat formation, we conducted microarray-based genome-wide gene expression analysis on two newly synthesized allohexaploid wheat lines with chromosomal stability and a genome constitution analogous to that of the present-day common wheat. Results: Multi-color GISH (genomic in situ hybridization) was used to identify individual plants from two nascent allohexaploid wheat lines between Triticum turgidum (2n = 4x = 28; genome BBAA) and Aegilops tauschii (2n = 2x = 14; genome DD), which had a stable chromosomal constitution analogous to that of common wheat (2n = 6x = 42; genome BBAADD). Genome-wide analysis of gene expression was performed for these allohexaploid lines along with their parental plants from T. turgidum and Ae. tauschii, using the Affymetrix Gene Chip Wheat Genome-Array. Comparison with the parental plants coupled with inclusion of empirical mid-parent values (MPVs) revealed that whereas the great majority of genes showed the expected parental additivity, two major patterns of alteration in gene expression in the allohexaploid lines were identified: parental dominance expression and non-additive expression. Genes involved in each of the two altered expression patterns could be classified into three distinct groups, stochastic, heritable and persistent, based on their transgenerational heritability and inter-line conservation. Strikingly, whereas both altered patterns of gene expression showed a propensity of inheritance, identity of the involved genes was highly stochastic, consistent with the involvement of diverse Gene Ontology (GO) terms. Nonetheless, those
Generation of amphidiploids from hybrids of wheat and related species from the genera Aegilops, Secale, Thinopyrum, and Triticum as a source of genetic variation for wheat improvement.

PubMed

Nemeth, Csilla; Yang, Cai-yun; Kasprzak, Paul; Hubbart, Stella; Scholefield, Duncan; Mehra, Surbhi; Skipper, Emma; King, Ian; King, Julie

2015-02-01

We aim to improve diversity of domesticated wheat by transferring genetic variation for important target traits from related wild and cultivated grass species. The present study describes the development of F1 hybrids between wheat and related species from the genera Aegilops, Secale, Thinopyrum, and Triticum and production of new amphidiploids. Amphidiploid lines were produced from 20 different distant relatives. Both colchicine and caffeine were successfully used to double the chromosome numbers. The genomic constitution of the newly formed amphidiploids derived from seven distant relatives was determined using genomic in situ hybridization (GISH). Altogether, 42 different plants were analysed, 19 using multicolour GISH separating the chromosomes from the A, B, and D genomes of wheat, as well as the distant relative, and 23 using single colour GISH. Restructuring of the allopolyploid genome, both chromosome losses and aneuploidy, was detected in all the genomes contained by the amphidiploids. From the observed chromosome numbers there is an indication that in amphidiploids the B genome of wheat suffers chromosome losses less frequently than the other wheat genomes. Phenotyping to realize the full potential of the wheat-related grass germplasm is underway, linking the analyzed genotypes to agronomically important target traits.
Transfer of useful variability of high grain iron and zinc from Aegilops kotschyi into wheat through seed irradiation approach.

PubMed

Verma, Shailender Kumar; Kumar, Satish; Sheikh, Imran; Malik, Sachin; Mathpal, Priyanka; Chugh, Vishal; Kumar, Sundip; Prasad, Ramasare; Dhaliwal, Harcharan Singh

2016-01-01

The aim of this study was to transfer the 2S chromosomal fragment(s) of Aegilops kotschyi (2S(k)) into the bread wheat genome, which could lead to the biofortification of wheat with high grain iron and zinc content. Wheat-Ae. kotschyi 2A/2S(k) substitution lines with high grain iron and zinc content were used to transfer the gene/loci for high grain Fe and Zn content into wheat using a seed irradiation approach. Bread wheat plants derived from 40 krad-irradiated seeds showed the presence of univalents and multivalents during meiotic metaphase-I. Genomic in situ hybridization analysis of seed irradiation hybrid F2 seedlings showed several terminal and interstitial signals, indicating the introgression of Ae. kotschyi chromosome segments. This demonstrates the efficacy of the seed radiation hybrid approach in gene transfer experiments. All the radiation-treated hybrid plants with high grain Fe and Zn content were analyzed with wheat group 2 chromosome-specific polymorphic simple sequence repeat markers to identify the introgression of small alien chromosome fragment(s). Radiation-induced hybrids showed a more than 65% increase in grain iron content and a 54% increase in grain Zn content, with a better harvest index than the elite wheat cultivar WL711, indicating effective and compensating translocations of 2S(k) fragments into the wheat genome.
Lr41, Lr39, and a leaf rust resistance gene from Aegilops cylindrica may be allelic and are located on wheat chromosome 2DS.

PubMed

Singh, Sukhwinder; Franks, C D; Huang, L; Brown-Guedira, G L; Marshall, D S; Gill, B S; Fritz, A

2004-02-01

The leaf rust resistance gene Lr41 in wheat germplasm KS90WGRC10 and a resistance gene in wheat breeding line WX93D246-R-1 were transferred to Triticum aestivum from Aegilops tauschii and Ae. cylindrica, respectively. The leaf rust resistance gene in WX93D246-R-1 was located on wheat chromosome 2D by monosomic analysis. Molecular marker analysis of F(2) plants from non-critical crosses determined that this gene is 11.2 cM distal to marker Xgwm210 on the short arm of 2D. No susceptible plants were detected in a population of 300 F(2) plants from a cross between WX93D246-R-1 and TA 4186 (Lr39), suggesting that the gene in WX93D246-R-1 is the same as, or closely linked to, Lr39. In addition, no susceptible plants were detected in a population of 180 F(2) plants from the cross between KS90WGRC10 and WX93D246-R-1. The resistance gene in KS90WGRC10, Lr41, was previously reported to be located on wheat chromosome 1D. In this study, no genetic association was found between Lr41 and 51 markers located on chromosome 1D. A population of 110 F(3) lines from a cross between KS90WGRC10 and TAM 107 was evaluated with polymorphic SSR markers from chromosome 2D and marker Xgdm35 was found to be 1.9 cM proximal to Lr41. When evaluated with diverse isolates of Puccinia triticina, similar reactions were observed on WX93D246-R-1, KS90WGRC10, and TA 4186. The results of mapping, allelism, and race specificity test indicate that these germplasms likely have the same gene for resistance to leaf rust.
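The reasoning behind the allelism tests in this abstract can be illustrated with a short calculation. The sketch below rests on an assumption not stated in the abstract: if the two parents carried different, unlinked dominant resistance genes, about 1/16 of the F(2) plants would be expected to be susceptible. Under that assumption, it computes how likely it would be to observe no susceptible plants at all in populations of the reported sizes. The function name and default value are hypothetical.

```python
def prob_no_susceptible(n_plants: int, susceptible_fraction: float = 1 / 16) -> float:
    """Probability that none of n_plants F2 progeny is susceptible.

    Assumes each plant is susceptible independently with probability
    susceptible_fraction (1/16 for two unlinked dominant genes; this
    inheritance model is an assumption, not taken from the abstract).
    """
    return (1.0 - susceptible_fraction) ** n_plants


# Population sizes reported in the abstract.
p_300 = prob_no_susceptible(300)  # WX93D246-R-1 x TA 4186 (Lr39)
p_180 = prob_no_susceptible(180)  # KS90WGRC10 x WX93D246-R-1

print(f"P(no susceptible plants | n=300) = {p_300:.2e}")
print(f"P(no susceptible plants | n=180) = {p_180:.2e}")
```

Both probabilities are very small under the two-unlinked-genes assumption, which is consistent with the abstract's conclusion that the genes are the same or closely linked.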
Introgression of a new stem rust resistance gene from Aegilops markgrafii into wheat

USDA-ARS's Scientific Manuscript database

In a prior study, we reported that an Alcedo/Aegilops markgrafii disomic addition line, AIII(D) (2n=44), was resistant to three races of the Ug99 lineage and five North American races of stem rust pathogen in wheat and the resistance originated from the alien chromosome. In this study, our objectiv...
Mapping of powdery mildew resistance gene Pm53 introgressed from Aegilops speltoides into soft red winter wheat.

PubMed

Petersen, Stine; Lyerly, Jeanette H; Worthington, Margaret L; Parks, Wesley R; Cowger, Christina; Marshall, David S; Brown-Guedira, Gina; Murphy, J Paul

2015-02-01

A powdery mildew resistance gene was introgressed from Aegilops speltoides into winter wheat and mapped to chromosome 5BL. Closely linked markers will permit marker-assisted selection for the resistance gene. Powdery mildew of wheat (Triticum aestivum L.) is a major fungal disease in many areas of the world, caused by Blumeria graminis f. sp. tritici (Bgt). Host plant resistance is the preferred form of disease prevention because it is both economical and environmentally sound. Identification of new resistance sources and closely linked markers enable breeders to utilize these new sources in marker-assisted selection as well as in gene pyramiding. Aegilops speltoides (2n = 2x = 14, genome SS), has been a valuable disease resistance donor. The powdery mildew resistant wheat germplasm line NC09BGTS16 (NC-S16) was developed by backcrossing an Ae. speltoides accession, TAU829, to the susceptible soft red winter wheat cultivar 'Saluda'. NC-S16 was crossed to the susceptible cultivar 'Coker 68-15' to develop F2:3 families for gene mapping. Greenhouse and field evaluations of these F2:3 families indicated that a single gene, designated Pm53, conferred resistance to powdery mildew. Bulked segregant analysis showed that multiple simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers specific to chromosome 5BL segregated with the resistance gene. The gene was flanked by markers Xgwm499, Xwmc759, IWA6024 (0.7 cM proximal) and IWA2454 (1.8 cM distal). Pm36, derived from a different wild wheat relative (T. turgidum var. dicoccoides), had previously been mapped to chromosome 5BL in a durum wheat line. Detached leaf tests revealed that NC-S16 and a genotype carrying Pm36 differed in their responses to each of three Bgt isolates. Pm53 therefore appears to be a new source of powdery mildew resistance.
Identification of ecogeographical gaps in the Spanish Aegilops collections with potential tolerance to drought and salinity

PubMed Central

Parra-Quijano, Mauricio; Iriondo, Jose María

2017-01-01

Drought, one of the most important abiotic stress factors limiting biomass, significantly reduces crop productivity. Salinization also affects the productivity of both irrigated and rain-fed wheat crops. Species of genus Aegilops can be considered crop wild relatives (CWR) of wheat and have been widely used as gene sources in wheat breeding, especially in providing resistance to pests and diseases. Five species (Ae. biuncialis, Ae. geniculata, Ae. neglecta, Ae. triuncialis and Ae. ventricosa) are included in the Spanish National Inventory of CWRs. This study aimed to identify ecogeographic gaps in the Spanish Network on Plant Genetic Resources for Food and Agriculture (PGRFA) with potential tolerance to drought and salinity. Data on the Spanish populations of the target species collected and conserved in genebanks of the Spanish Network on PGRFA and data on other population occurrences in Spain were compiled and assessed for their geo-referencing quality. The records with the best geo-referencing quality values were used to identify the ecogeographical variables that might be important for Aegilops distribution in Spain. These variables were then used to produce ecogeographic land characterization maps for each species, allowing us to identify populations from low and non-represented ecogeographical categories in ex situ collections. Predictive characterization strategy was used to identify 45 Aegilops populations in these ecogeographical gaps with potential tolerance to drought and salinity conditions. Further efforts are being made to collect and evaluate these populations. PMID:28761779
Integrated physical map of bread wheat chromosome arm 7DS to facilitate gene cloning and comparative studies.

PubMed

Tulpová, Zuzana; Luo, Ming-Cheng; Toegelová, Helena; Visendi, Paul; Hayashi, Satomi; Vojta, Petr; Paux, Etienne; Kilian, Andrzej; Abrouk, Michaël; Bartoš, Jan; Hajdúch, Marián; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

2018-03-08

Bread wheat (Triticum aestivum L.) is a staple food for a significant part of the world's population. The growing demand on its production can be satisfied by improving yield and resistance to biotic and abiotic stress. Knowledge of the genome sequence would aid in discovering genes and QTLs underlying these traits and provide a basis for genomics-assisted breeding. Physical maps and BAC clones associated with them have been valuable resources from which to generate a reference genome of bread wheat and to assist map-based gene cloning. As a part of a joint effort coordinated by the International Wheat Genome Sequencing Consortium, we have constructed a BAC-based physical map of bread wheat chromosome arm 7DS consisting of 895 contigs and covering 94% of its estimated length. By anchoring BAC contigs to one radiation hybrid map and three high resolution genetic maps, we assigned 73% of the assembly to a distinct genomic position. This map integration, interconnecting a total of 1713 markers with ordered and sequenced BAC clones from a minimal tiling path, provides a tool to speed up gene cloning in wheat. The process of physical map assembly included the integration of the 7DS physical map with a whole-genome physical map of Aegilops tauschii and a 7DS Bionano genome map, which together enabled efficient scaffolding of physical-map contigs, even in the non-recombining region of the genetic centromere. Moreover, this approach facilitated a comparison of bread wheat and its ancestor at BAC-contig level and revealed a reconstructed region in the 7DS pericentromere. Copyright © 2018. Published by Elsevier B.V.
[The detection of nonallelic to known genes of resistance to Tilletia caries (DC) Tul. in wheat strains from interspecific hybridization (Triticum aestivum x Aegilops cylindrica)].

PubMed

Babaiants, L T; Dubinina, L A; Iushchenko, G M

2000-01-01

Hybridological analysis established that winter bread wheat lines 1/74-91, 3/36-91 and 5/55-91 possess a single dominant gene for resistance to bunt (Tilletia caries (DC) Tul.), whereas lines 8/2-91, 5/43-91, 4/11-91 and 8/16-91 have two independent dominant genes for this character. These genes, which originated from Aegilops cylindrica, are not identical to the Bt1-Bt17 genes and have not been described to date. The lines were obtained from crosses between the winter bread wheat variety Odeskaya polukarlikovaya and Aegilops cylindrica.
Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

PubMed

Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

2018-02-01

This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

PubMed

Anderson, Olin D; Huo, Naxin; Gu, Yong Q

2013-06-01

The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Random chromosome elimination in synthetic Triticum-Aegilops amphiploids leads to development of a stable partial amphiploid with high grain micro- and macronutrient content and powdery mildew resistance.

PubMed

Tiwari, Vijay K; Rawat, Nidhi; Neelam, Kumari; Kumar, Sundip; Randhawa, Gursharn S; Dhaliwal, Harcharan S

2010-12-01

Synthetic amphiploids are the immortal sources for studies on crop evolution, genome dissection, and introgression of useful variability from related species. Cytological analysis of synthetic decaploid wheat (Triticum aestivum L.) - Aegilops kotschyi Boiss. amphiploids (AABBDDUkUkSkSk) showed some univalents from the C1 generation onward followed by chromosome elimination. Most of the univalents came to metaphase I plate after the reductional division of paired chromosomes and underwent equational division leading to their elimination through laggards and micronuclei. Substantial variation in the chromosome number of pollen mother cells from different tillers, spikelets, and anthers of some plants also indicated somatic chromosome elimination. Genomic in situ hybridization, fluorescence in situ hybridization, and simple sequence repeat markers analysis of two amphiploids with reduced chromosomes indicated random chromosome elimination of various genomes with higher sensitivity of D followed by the Sk and Uk genomes to elimination, whereas 1D chromosome was preferentially eliminated in both the amphiploids investigated. One of the partial amphiploids, C4 T. aestivum 'Chinese Spring' - Ae. kotschyi 396 (2n = 58), with 34 T. aestivum, 14 Uk, and 10 Sk had stable meiosis and high fertility. The partial amphiploids with white glumes, bold seeds, and tough rachis with high grain macro- and micronutrients and resistance to powdery mildew could be used for T. aestivum biofortification and transfer of powdery mildew resistance.
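The chromosome counts in this abstract can be cross-checked with simple bookkeeping. The sketch below assumes the base number x = 7 per genome set, which matches the 2n = 2x = 14 and 2n = 6x = 42 counts given elsewhere in this listing. The dictionary keys are illustrative labels; the observed counts are those reported for the partial amphiploid.

```python
BASE = 7  # chromosomes per genome set (x = 7, assumed for all genomes here)

# Expected complement of the full decaploid AABBDDUkUkSkSk.
full = {"T. aestivum (AABBDD)": 6 * BASE, "Uk": 2 * BASE, "Sk": 2 * BASE}

# Counts reported for the stable partial amphiploid (2n = 58).
observed = {"T. aestivum (AABBDD)": 34, "Uk": 14, "Sk": 10}

# Chromosomes eliminated from each component relative to the full decaploid.
lost = {genome: full[genome] - observed[genome] for genome in full}

total_full = sum(full.values())
total_observed = sum(observed.values())

print("full decaploid 2n =", total_full)
print("partial amphiploid 2n =", total_observed)
print("chromosomes lost:", lost)
```

Under this assumption the partial amphiploid retains the complete Uk complement and has lost chromosomes from the wheat and Sk complements.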
Genome Evolution Due to Allopolyploidization in Wheat

PubMed Central

Feldman, Moshe; Levy, Avraham A.

2012-01-01

The wheat group has evolved through allopolyploidization, namely, through hybridization among species from the plant genera Aegilops and Triticum followed by genome doubling. This speciation process has been associated with ecogeographical expansion and with domestication. In the past few decades, we have searched for explanations for this impressive success. Our studies attempted to probe the bases for the wide genetic variation characterizing these species, which accounts for their great adaptability and colonizing ability. Central to our work was the investigation of how allopolyploidization alters genome structure and expression. We found in wheat that allopolyploidy accelerated genome evolution in two ways: (1) it triggered rapid genome alterations through the instantaneous generation of a variety of cardinal genetic and epigenetic changes (which we termed “revolutionary” changes), and (2) it facilitated sporadic genomic changes throughout the species’ evolution (i.e., evolutionary changes), which are not attainable at the diploid level. Our major findings in natural and synthetic allopolyploid wheat indicate that these alterations have led to the cytological and genetic diploidization of the allopolyploids. These genetic and epigenetic changes reflect the dynamic structural and functional plasticity of the allopolyploid wheat genome. The significance of this plasticity for the successful establishment of wheat allopolyploids, in nature and under domestication, is discussed. PMID:23135324
[Detection of the introgression of genome elements of Aegilops cylindrica Host. into Triticum aestivum L. genome with ISSR-analysis].

PubMed

Galaev, A V; Babaiants, L T; Sivolap, Iu M

2003-01-01

A comparative analysis of introgressive and parental forms of wheat was carried out to reveal regions of the donor genome carrying new loci of resistance to fungal diseases. Using the ISSR method, 124 ISSR loci were detected in the genomes of 18 individual plants of introgressive line 5/20-91; 17 of these loci were related to introgressed fragments of the Ae. cylindrica genome in T. aestivum. The ISSR method was shown to be effective for detecting variability caused by the introgression of alien genetic material into the T. aestivum genome.

A novel chimeric low-molecular-weight glutenin subunit gene from the wild relatives of wheat Aegilops kotschyi and Ae. juvenalis: evolution at the Glu-3 loci.

PubMed

Li, Xiaohui; Ma, Wujun; Gao, Liyan; Zhang, Yanzhen; Wang, Aili; Ji, Kangmin; Wang, Ke; Appels, Rudi; Yan, Yueming

2008-09-01

Four LMW-m and one novel chimeric (between LMW-i and LMW-m types) low-molecular-weight glutenin subunit (LMW-GS) genes from Aegilops neglecta (UUMM), Ae. kotschyi (UUSS), and Ae. juvenalis (DDMMUU) were isolated and characterized. Sequence structures showed that the 4 LMW-m-type genes, assigned to the M genome of Ae. neglecta, displayed a high homology with those from hexaploid common wheat. The novel chimeric gene, designated AjkLMW-i, was isolated from both Ae. kotschyi and Ae. juvenalis and shown to be located on the U genome. Phylogenetic analysis demonstrated that it had higher identity to the LMW-m-type than the LMW-i-type genes. A total of 20 single nucleotide polymorphisms (SNPs) were detected among the 4 LMW-m genes, with 13 of these being nonsynonymous SNPs that resulted in amino acid substitutions in the deduced mature proteins. The divergence time estimation showed that the M and D genomes were closely related and diverged at 5.42 million years ago (MYA) while the differentiation between the U and A genomes was 6.82 MYA. We propose that, in addition to homologous recombination, an illegitimate recombination event on the U genome may have occurred 6.38 MYA and resulted in the generation of the chimeric gene AjkLMW-i, which may be an important genetic mechanism for the origin and evolution of LMW-GS Glu-3 alleles as well as other prolamin genes.
Molecular analysis, cytogenetics and fertility of introgression lines from transgenic wheat to Aegilops cylindrica Host.

PubMed

Schoenenberger, Nicola; Guadagnuolo, Roberto; Savova-Bianchi, Dessislava; Küpfer, Philippe; Felber, François

2006-12-01

Natural hybridization and backcrossing between Aegilops cylindrica and Triticum aestivum can lead to introgression of wheat DNA into the wild species. Hybrids between Ae. cylindrica and wheat lines bearing herbicide resistance (bar), reporter (gus), fungal disease resistance (kp4), and increased insect tolerance (gna) transgenes were produced by pollination of emasculated Ae. cylindrica plants. F1 hybrids were backcrossed to Ae. cylindrica under open-pollination conditions, and first backcrosses were selfed using pollen bags. Female fertility of F1 ranged from 0.03 to 0.6%. Eighteen percent of the sown BC1s germinated and flowered. Chromosome numbers ranged from 30 to 84 and several of the plants bore wheat-specific sequence-characterized amplified regions (SCARs) and the bar gene. Self fertility in two BC1 plants was 0.16 and 5.21%, and the others were completely self-sterile. Among 19 BC1S1 individuals one plant was transgenic, had 43 chromosomes, contained the bar gene, and survived glufosinate treatments. The other BC1S1 plants had between 28 and 31 chromosomes, and several of them carried SCARs specific to wheat A and D genomes. Fertility of these plants was higher under open-pollination conditions than by selfing and did not necessarily correlate with even or euploid chromosome number. Some individuals having supernumerary wheat chromosomes recovered full fertility.
Development of wheat-Aegilops speltoides recombinants and simple PCR-based markers for stem rust resistance genes on the 2S#1 chromosome

USDA-ARS's Scientific Manuscript database

Wild relatives of wheat are important but underutilized resources for new rust resistance genes because linked negative traits often hinder deployment of these genes in commercial wheats. Here we report reduced alien chromatin recombinants derived from E.R. Sears wheat-Aegilops speltoides translocat...
Line differences in Cor/Lea and fructan biosynthesis-related gene transcript accumulation are related to distinct freezing tolerance levels in synthetic wheat hexaploids.

PubMed

Yokota, Hirokazu; Iehisa, Julio C M; Shimosaka, Etsuo; Takumi, Shigeo

2015-03-15

In common wheat, cultivar differences in freezing tolerance are considered to be mainly due to allelic differences at two major loci controlling freezing tolerance. One of the two loci, Fr-2, is coincident with a cluster of genes encoding C-repeat binding factors (CBFs), which induce downstream Cor/Lea genes during cold acclimation. Here, we conducted microarray analysis to study comprehensive changes in gene expression profile under long-term low-temperature (LT) treatment and to identify other LT-responsive genes related to cold acclimation in leaves of seedlings and crown tissues of a synthetic hexaploid wheat line. The microarray analysis revealed marked up-regulation of a number of Cor/Lea genes and fructan biosynthesis-related genes under the long-term LT treatment. For validation of the microarray data, we selected four synthetic wheat lines that contain the A and B genomes from the tetraploid wheat cultivar Langdon and the diverse D genomes originating from different Aegilops tauschii accessions with distinct levels of freezing tolerance after cold acclimation. Quantitative RT-PCR showed increased transcript levels of the Cor/Lea, CBF, and fructan biosynthesis-related genes in more freezing-tolerant lines than in sensitive lines. After a 14-day LT treatment, a significant difference in fructan accumulation was observed among the four lines. Therefore, the fructan biosynthetic pathway is associated with cold acclimation in development of wheat freezing tolerance and is another pathway related to diversity in freezing tolerance, in addition to the CBF-mediated Cor/Lea expression pathway. Copyright © 2014 Elsevier GmbH. All rights reserved.
Fine genetic mapping of greenbug aphid resistance gene Gb3 in Aegilops tauschii

USDA-ARS's Scientific Manuscript database

The greenbug is a serious aphid pest of wheat and sorghum in the southern High Plains of the US. The greenbug resistance gene Gb3, which originated from goatgrass, has shown consistent and durable resistance against prevailing greenbug biotypes in wheat fields for more than 30 years. Our goal is to clone...
Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene.

PubMed

Guo, Zhiai; Song, Yanxia; Zhou, Ronghua; Ren, Zhenglong; Jia, Jizeng

2010-02-01

Ppd-D1 is one of the most potent genes affecting the photoperiod response of wheat (Triticum aestivum). Only two alleles, insensitive Ppd-D1a and sensitive Ppd-D1b, were known previously, and these did not adequately explain the broad adaptation of wheat to photoperiod variation. In this study, five diagnostic molecular markers were employed to identify Ppd-D1 haplotypes in 492 wheat varieties from diverse geographic locations and 55 accessions of Aegilops tauschii, the D genome donor species of wheat. Six Ppd-D1 haplotypes, designated I-VI, were identified. Types II, V and VI were considered to be more ancient and types I, III and IV were considered to be derived from type II. The transcript abundances of the Ppd-D1 haplotypes showed continuous variation, being highest for haplotype I, lowest for haplotype III, and correlating negatively with varietal differences in heading time. These haplotypes also significantly affected other agronomic traits. The distribution frequency of Ppd-D1 haplotypes showed partial correlations with both latitudes and altitudes of wheat cultivation regions. The evolution, expression and distribution of Ppd-D1 haplotypes were consistent evidentially with each other. What was regarded as a pair of alleles in the past can now be considered a series of alleles leading to continuous variation.
Introgression of an imidazolinone-resistance gene from winter wheat (Triticum aestivum L.) into jointed goatgrass (Aegilops cylindrica Host).

PubMed

Perez-Jones, Alejandro; Mallory-Smith, Carol A; Hansen, Jennifer L; Zemetra, Robert S

2006-12-01

Imidazolinone-resistant winter wheat (Triticum aestivum L.) is being commercialized in the USA. This technology allows wheat growers to selectively control jointed goatgrass (Aegilops cylindrica Host), a weed that is especially problematic because of its close genetic relationship with wheat. However, the potential movement of the imidazolinone-resistance gene from winter wheat to jointed goatgrass is a concern. Winter wheat and jointed goatgrass have the D genome in common and can hybridize and backcross under natural field conditions. Since the imidazolinone-resistance gene (Imi1) is located on the D genome, it is possible for resistance to be transferred to jointed goatgrass via hybridization and backcrossing. To study the potential for gene movement, BC(2)S(2) plants were produced artificially using imidazolinone-resistant winter wheat (cv. FS-4) as the female parent and a native jointed goatgrass collection as the male recurrent parent. FS-4, the jointed goatgrass collection, and 18 randomly selected BC(2)S(2) populations were treated with imazamox. The percentage of survival was 100% for the FS-4, 0% for the jointed goatgrass collection and 6 BC(2)S(2) populations, 40% or less for 2 BC(2)S(2) populations, and 50% or greater for the remaining 10 BC(2)S(2) populations. Chromosome counts in BC(2)S(3) plants showed a restoration of the chromosome number of jointed goatgrass, with four out of four plants examined having 28 chromosomes. Sequencing of AHASL1D in BC(2)S(3) plants derived from BC(2)S(2)-6 revealed the sexual transmission of Imi1 from FS-4 to jointed goatgrass. Imi1 conferred resistance to the imidazolinone herbicide imazamox, as shown by the in vitro assay for acetohydroxyacid synthase (AHAS) activity.
Molecular and phylogenetic characterization of the homoeologous EPSP Synthase genes of allohexaploid wheat, Triticum aestivum (L.).

PubMed

Aramrak, Attawan; Kidwell, Kimberlee K; Steber, Camille M; Burke, Ian C

2015-10-23

5-Enolpyruvylshikimate-3-phosphate synthase (EPSPS) is the sixth and penultimate enzyme in the shikimate biosynthesis pathway, and is the target of the herbicide glyphosate. The EPSPS genes of allohexaploid wheat (Triticum aestivum, AABBDD) have not been well characterized. Herein, the three homoeologous copies of the allohexaploid wheat EPSPS gene were cloned and characterized. Genomic and coding DNA sequences of EPSPS from the three related genomes of allohexaploid wheat were isolated using PCR and inverse PCR approaches from soft white spring "Louise". Development of genome-specific primers allowed the mapping and expression analysis of TaEPSPS-7A1, TaEPSPS-7D1, and TaEPSPS-4A1 on chromosomes 7A, 7D, and 4A, respectively. Sequence alignments of cDNA sequences from wheat and wheat relatives served as a basis for phylogenetic analysis. The three genomic copies of wheat EPSPS differed by insertion/deletion and single nucleotide polymorphisms (SNPs), largely in intron sequences. RT-PCR analysis and cDNA cloning revealed that EPSPS is expressed from all three genomic copies. However, TaEPSPS-4A1 is expressed at much lower levels than TaEPSPS-7A1 and TaEPSPS-7D1 in wheat seedlings. Phylogenetic analysis of 1190-bp cDNA clones from wheat and wheat relatives revealed that: 1) TaEPSPS-7A1 is most similar to EPSPS from the tetraploid AB genome donor, T. turgidum (99.7 % identity); 2) TaEPSPS-7D1 most resembles EPSPS from the diploid D genome donor, Aegilops tauschii (100 % identity); and 3) TaEPSPS-4A1 resembles EPSPS from the diploid B genome relative, Ae. speltoides (97.7 % identity). Thus, EPSPS sequences in allohexaploid wheat are preserved from the two most recent ancestors. The wheat EPSPS genes are more closely related to Lolium multiflorum and Brachypodium distachyon than to Oryza sativa (rice). The three related EPSPS homoeologues of wheat exhibited conservation of the exon/intron structure and of coding region sequence, but contained significant sequence
ThMYC4E, candidate Blue aleurone 1 gene controlling the associated trait in Triticum aestivum

PubMed Central

Chen, Wenjie; Zhang, Bo; Wang, Daowen; Liu, Dengcai; Zhang, Huaigang

2017-01-01

Blue aleurone is a useful and interesting trait in common wheat that was derived from related species. Here, transcriptomes of blue and white aleurone were compared for isolating Blue aleurone 1 (Ba1) transferred from Thinopyrum ponticum. Among the genes involved in anthocyanin biosynthesis, only a basic helix-loop-helix (bHLH) transcription factor, ThMYC4E, had a higher transcript level in the blue aleurone phenotype, and was homologous to the genes on chromosome 4 of Triticum aestivum. ThMYC4E carried the characteristic domains (bHLH-MYC_N, HLH and ACT-like) of a bHLH transcription factor, and clustered with genes regulating anthocyanin biosynthesis upon phylogenetic analysis. The over-expression of ThMYC4E regulated anthocyanin biosynthesis with the coexpression of the MYB transcription factor ZmC1 from maize. ThMYC4E existed in the genomes of the addition, substitution and near isogenic lines with the blue aleurone trait derived from Th. ponticum, and could not be detected in any germplasm of T. urartu, T. monococcum, T. turgidum, Aegilops tauschii or T. aestivum with white aleurone. These results suggested that ThMYC4E is a candidate Ba1 gene controlling the blue aleurone trait in T. aestivum genotypes carrying Th. ponticum introgression. The ThMYC4E isolation aids in better understanding the genetic mechanisms of the blue aleurone trait and in its more effective use during wheat breeding. PMID:28704468
Contrasting patterns of evolution of 45S and 5S rDNA families uncover new aspects in the genome constitution of the agronomically important grass Thinopyrum intermedium (Triticeae).

PubMed

Mahelka, Václav; Kopecky, David; Baum, Bernard R

2013-09-01

We employed sequencing of clones and in situ hybridization (genomic and fluorescent in situ hybridization [GISH and rDNA-FISH]) to characterize both the sequence variation and genomic organization of 45S (herein ITS1-5.8S-ITS2 region) and 5S (5S gene + nontranscribed spacer) ribosomal DNA (rDNA) families in the allohexaploid grass Thinopyrum intermedium. Both rDNA families are organized within several rDNA loci within all three subgenomes of the allohexaploid species. Both families have undergone different patterns of evolution. The 45S rDNA family has evolved in a concerted manner: internal transcribed spacer (ITS) sequences residing within the arrays of two subgenomes out of three got homogenized toward one major ribotype, whereas the third subgenome contained a minor proportion of distinct unhomogenized copies. Homogenization mechanisms such as unequal crossover and/or gene conversion were coupled with the loss of certain 45S rDNA loci. Unlike in the 45S family, the data suggest that neither interlocus homogenization among homeologous chromosomes nor locus loss occurred in 5S rDNA. Consistently with other Triticeae, the 5S rDNA family in intermediate wheatgrass comprised two distinct array types-the long- and short-spacer unit classes. Within the long and short units, we distinguished five and three different types, respectively, likely representing homeologous unit classes donated by putative parental species. Although the major ITS ribotype corresponds in our phylogenetic analysis to the E-genome species, the minor ribotype corresponds to Dasypyrum. 5S sequences suggested the contributions from Pseudoroegneria, Dasypyrum, and Aegilops. The contribution from Aegilops to the intermediate wheatgrass' genome is a new finding with implications in wheat improvement. We discuss rDNA evolution and potential origin of intermediate wheatgrass.
Analysis of the Gli-D2 locus identifies a genetic target for simultaneously improving the breadmaking and health-related traits of common wheat.

PubMed

Li, Da; Jin, Huaibing; Zhang, Kunpu; Wang, Zhaojun; Wang, Faming; Zhao, Yue; Huo, Naxin; Liu, Xin; Gu, Yong Q; Wang, Daowen; Dong, Lingli

2018-05-11

Gliadins are a major component of wheat seed proteins. However, the complex homoeologous Gli-2 loci (Gli-A2, -B2 and -D2) that encode the α-gliadins in commercial wheat are still poorly understood. Here we analyzed the Gli-D2 locus of Xiaoyan 81 (Xy81), a winter wheat cultivar. A total of 421.091 kb of the Gli-D2 sequence was assembled from sequencing multiple bacterial artificial clones, and 10 α-gliadin genes were annotated. Comparative genomic analysis showed that Xy81 carried only eight of the α-gliadin genes of the D genome donor Aegilops tauschii, with two of them each experiencing a tandem duplication. A mutant line lacking Gli-D2 (DLGliD2) consistently exhibited better breadmaking quality and dough functionalities than its progenitor Xy81, but without penalties in other agronomic traits. It also had an elevated lysine content in the grains. Transcriptome analysis verified the lack of Gli-D2 α-gliadin gene expression in DLGliD2. Furthermore, the transcript and protein levels of protein disulfide isomerase were both upregulated in DLGliD2 grains. Consistent with this finding, DLGliD2 had increased disulfide content in the flour. Our work sheds light on the structure and function of Gli-D2 in commercial wheat, and suggests that the removal of Gli-D2 and the gliadins specified by it is likely to be useful for simultaneously enhancing the end-use and health-related traits of common wheat. Because gliadins and homologous proteins are widely present in grass species, the strategy and information reported here may be broadly useful for improving the quality traits of diverse cereal crops. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
Diversification of the celiac disease α-gliadin complex in wheat: a 33-mer peptide with six overlapping epitopes, evolved following polyploidization.

PubMed

Ozuna, Carmen V; Iehisa, Julio C M; Giménez, María J; Alvarez, Juan B; Sousa, Carolina; Barro, Francisco

2015-06-01

The gluten proteins from wheat, barley and rye are responsible both for celiac disease (CD) and for non-celiac gluten sensitivity, two pathologies affecting up to 6-8% of the human population worldwide. The wheat α-gliadin proteins contain three major CD immunogenic peptides: p31-43, which induces the innate immune response; the 33-mer, formed by six overlapping copies of three highly stimulatory epitopes; and an additional DQ2.5-glia-α3 epitope which partially overlaps with the 33-mer. Next-generation sequencing (NGS) and Sanger sequencing of α-gliadin genes from diploid and polyploid wheat provided six types of α-gliadins (named 1-6) with strong differences in their frequencies in diploid and polyploid wheat, and in the presence and abundance of these CD immunogenic peptides. Immunogenic variants of the p31-43 peptide were found in most of the α-gliadins. Variants of the DQ2.5-glia-α3 epitope were associated with specific types of α-gliadins. Remarkably, only type 1 α-gliadins contained 33-mer epitopes. Moreover, the full immunodominant 33-mer fragment was only present in hexaploid wheat at low abundance, probably as the result of allohexaploidization events from subtype 1.2 α-gliadins found only in Aegilops tauschii, the D-genome donor of hexaploid wheat. Type 3 α-gliadins seem to be the ancestral type as they are found in most of the α-gliadin-expressing Triticeae species. These findings are important for reducing the incidence of CD by the breeding/selection of wheat varieties with low stimulatory capacity of T cells. Moreover, advanced genome-editing techniques (TALENs, CRISPR) will be easier to implement on the small group of α-gliadins containing only immunogenic peptides. © 2015 Society for Experimental Biology and John Wiley & Sons Ltd.
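The description of the 33-mer as six overlapping copies of three epitopes can be checked with a simple overlapping substring scan. The sketch below is illustrative only: the peptide and the three 9-residue epitope cores are the sequences commonly cited in the celiac disease literature, not values given in this abstract, and the helper name `find_overlapping` is hypothetical.

```python
# Commonly cited 33-mer alpha-gliadin peptide (assumption: not stated in the abstract).
PEPTIDE_33MER = "LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF"

# Commonly cited 9-residue epitope cores (assumption: names and sequences from the literature).
EPITOPES = {
    "DQ2.5-glia-a1a": "PFPQPQLPY",
    "DQ2.5-glia-a1b": "PYPQPQLPY",
    "DQ2.5-glia-a2": "PQPQLPYPQ",
}


def find_overlapping(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every match, allowing matches to overlap."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one residue only, so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


hits = {name: find_overlapping(PEPTIDE_33MER, seq) for name, seq in EPITOPES.items()}
total_copies = sum(len(positions) for positions in hits.values())

for name, positions in hits.items():
    print(name, positions)
print("total epitope copies:", total_copies)
```

Under these assumed sequences the scan finds one, two and three copies of the three epitopes, six in total. The same scan could be run over translated α-gliadin reads to flag which gliadin types carry a full or partial 33-mer.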
[Molecular marker mapping of the gene resistant to common bunt transferred from Aegilops cylindrica into bread wheat].

PubMed

Galaev, A V; Babaiants, L T; Sivolap, Iu M

2006-01-01

Introgression lines 5/55-91 and 378/2000 of bread wheat contain the gene of resistance to Tilletia caries (DC.) Tul. transferred from Aegilops cylindrica Host. Using bulked segregant analysis with ISSR and SSR PCR, the linkage of microsatellite locus Xgwm 259 with the gene of common bunt resistance was identified in the F2 population of 378/2000 x Lutestens 23397. DNA mapping made it possible to localize this highly effective gene in the intercalary region of the long arm of wheat chromosome 1B at a distance of 7.6-8.5 cM from the microsatellite Xgwm 259 locus, which can thus be used in wheat breeding to select genotypes resistant to common bunt.
Potential Implications of Climate Change on Aegilops Species Distribution: Sympatry of These Crop Wild Relatives with the Major European Crop Triticum aestivum and Conservation Issues.

PubMed

Ostrowski, Marie-France; Prosperi, Jean-Marie; David, Jacques

2016-01-01

Gene flow from crop to wild relatives is a common phenomenon which can lead to reduced adaptation of the wild relatives to natural ecosystems and/or increased adaptation to agrosystems (weediness). With global warming, wild relative distributions will likely change, thus modifying the width and/or location of co-occurrence zones where crop-wild hybridization events could occur (sympatry). This study investigates current and 2050 projected changes in sympatry levels between cultivated wheat and six of the most common Aegilops species in Europe. Projections were generated using MaxEnt on presence-only data, bioclimatic variables, and considering two migration hypotheses and two 2050 climate scenarios (RCP4.5 and RCP8.5). Overall, a general decline in suitable climatic conditions for Aegilops species outside the European zone and a parallel increase in Europe were predicted. If no migration could occur, the decline was predicted to be more acute outside than within the European zone. The potential sympatry level in Europe by 2050 was predicted to increase at a higher rate than species richness, and most expansions were predicted to occur in three countries, which are currently among the top four wheat producers in Europe: Russia, France and Ukraine. The results are also discussed with regard to conservation issues of these crop wild relatives.
Development of a wheat-Aegilops searsii substitution line with positively affecting Chinese steamed bread quality

PubMed Central

Du, Xuye; Ma, Xin; Min, Jingzhi; Zhang, Xiaocun; Jia, Zhenzhen

2018-01-01

A wheat-Aegilops searsii substitution line GL1402, in which chromosome 1B was substituted with 1Ss from Ae. searsii, was developed and detected using SDS-PAGE and GISH. The SDS-PAGE analysis showed that the HMW-GS encoded by the Glu-B1 loci of Chinese Spring was replaced by the HMW-GS encoded by the Glu-1Ss loci of Ae. searsii. Glutenin macropolymer (GMP) investigation showed that GL1402 had a much higher GMP content than Chinese Spring did. A dough quality comparison of GL1402 and Chinese Spring indicated that GL1402 showed a significantly higher protein content and middle peak time (MPT), and a smaller right peak slope (RPS). Quality tests of Chinese steamed bread (CSB) showed that the GL1402 also produced good steamed bread quality. These results suggested that the substitution line is a valuable breeding material for improving the wheat processing quality.
OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species.

PubMed

Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q

2015-07-01

Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
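The idea of intersecting and disjoining orthologous clusters across species can be illustrated with plain set operations. This is a minimal sketch, not OrthoVenn's implementation or API: the species labels, cluster IDs and the function `venn_regions` are all hypothetical, and the clustering step itself is assumed to have been done already.

```python
from itertools import combinations

# Hypothetical input: for each species, the IDs of clusters containing at least one of its proteins.
clusters_by_species = {
    "Ae_tauschii": {"c1", "c2", "c3", "c5"},
    "T_urartu": {"c1", "c2", "c4"},
    "H_vulgare": {"c1", "c3", "c4", "c6"},
}


def venn_regions(groups: dict[str, set[str]]) -> dict[frozenset[str], set[str]]:
    """Map each combination of species to the clusters found in exactly those species."""
    regions = {}
    names = list(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Clusters shared by every species in the combination ...
            inside = set.intersection(*(groups[n] for n in combo))
            # ... minus clusters that also occur in any species outside it.
            others = [groups[n] for n in names if n not in combo]
            outside = set.union(*others) if others else set()
            regions[frozenset(combo)] = inside - outside
    return regions


regions = venn_regions(clusters_by_species)
for combo, members in regions.items():
    print(sorted(combo), sorted(members))
```

Each cluster lands in exactly one region, so the region sizes are the numbers that would label the cells of a Venn diagram.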
[Effect of an introgression from Aegilops cylindrica Host on manifestation of productivity traits in winter common wheat F2 plants].

PubMed

Kozub, N A; Sozinov, I A; Sozinov, A A

2004-12-01

The effect of introgression of a chromosome 1D segment from Aegilops cylindrica to winter common wheat on productivity traits in F2 plants was studied using storage protein loci as genetic markers. An allele of the gliadin-coding Gli-D1 locus served as a marker of the introgression. Using two- and three-locus interaction models, it was shown that the introgression tagged with Gli-D1 affected the manifestation of productivity traits (productive tillering, grain weight per plant and grain number per plant) through interaction with other marker storage protein loci: Glu-B1, Glu-D1, and Gli-B2.
Development of a wheat-Aegilops searsii substitution line with positively affecting Chinese steamed bread quality.

PubMed

Du, Xuye; Ma, Xin; Min, Jingzhi; Zhang, Xiaocun; Jia, Zhenzhen

2018-03-01

A wheat-Aegilops searsii substitution line GL1402, in which chromosome 1B was substituted with 1Ss from Ae. searsii, was developed and detected using SDS-PAGE and GISH. The SDS-PAGE analysis showed that the HMW-GS encoded by the Glu-B1 loci of Chinese Spring was replaced by the HMW-GS encoded by the Glu-1Ss loci of Ae. searsii. Glutenin macropolymer (GMP) investigation showed that GL1402 had a much higher GMP content than Chinese Spring did. A dough quality comparison of GL1402 and Chinese Spring indicated that GL1402 showed a significantly higher protein content and middle peak time (MPT), and a smaller right peak slope (RPS). Quality tests of Chinese steamed bread (CSB) showed that the GL1402 also produced good steamed bread quality. These results suggested that the substitution line is a valuable breeding material for improving the wheat processing quality.
Dynamic evolution of resistance gene analogs in the orthologous genomic regions of powdery mildew resistance gene MlIW170 in Triticum dicoccoides and Aegilops tauschii

USDA-ARS's Scientific Manuscript database

Wheat is one of the most important staple grain crops in the world. Powdery mildew disease caused by Blumeria graminis f.sp. tritici can result in significant losses in both grain yield and quality in wheat. In this study, the wheat powdery mildew resistance gene MlIW170 locus located on the short ...

Recurrent Deletions of Puroindoline Genes at the Grain Hardness Locus in Four Independent Lineages of Polyploid Wheat

PubMed Central

Li, Wanlong; Huang, Li; Gill, Bikram S.

2008-01-01

Polyploidy is known to induce numerous genetic and epigenetic changes but little is known about their physiological bases. In wheat, grain texture is mainly determined by the Hardness (Ha) locus consisting of genes Puroindoline a (Pina) and b (Pinb). These genes are conserved in diploid progenitors but were deleted from the A and B genomes of tetraploid Triticum turgidum (AB). We now report the recurrent deletions of Pina-Pinb in other lineages of polyploid wheat. We analyzed the Ha haplotype structure in 90 diploid and 300 polyploid accessions of Triticum and Aegilops spp. Pin genes were conserved in all diploid species and deletion haplotypes were detected in all polyploid Triticum and most of the polyploid Aegilops spp. Two Pina-Pinb deletion haplotypes were found in hexaploid wheat (Triticum aestivum; ABD). Pina and Pinb were eliminated from the G genome, but maintained in the A genome of tetraploid Triticum timopheevii (AG). Subsequently, Pina and Pinb were deleted from the A genome but retained in the Am genome of hexaploid Triticum zhukovskyi (AmAG). Comparison of deletion breakpoints demonstrated that the Pina-Pinb deletion occurred independently and recurrently in the four polyploid wheat species. The implications of Pina-Pinb deletions for polyploid-driven evolution of gene and genome and its possible physiological significance are discussed. PMID:18024553
Ecological genomics of natural plant populations: the Israeli perspective.

PubMed

Nevo, Eviatar

2009-01-01

The genomic era revolutionized evolutionary population biology. The ecological genomics of the wild progenitors of wheat and barley reviewed here has been central in the research program of the Institute of Evolution, University of Haifa, since 1975 (http://evolution.haifa.ac.il). We explored the following questions: (1) How much of the genomic and phenomic diversity of wild progenitors of cultivars (wild emmer wheat, Triticum dicoccoides, the progenitor of most wheat, plus wild relatives of the Aegilops species; wild barley, Hordeum spontaneum, the progenitor of cultivated barley; wild oat, Avena sterilis, the progenitor of cultivated oats; and wild lettuce species, Lactuca, the progenitor and relatives of cultivated lettuce) is adaptive and processed by natural selection at both coding and noncoding genomic regions? (2) What is the origin and evolution of genomic adaptation and speciation processes and their regulation by mutation, recombination, and transposons under spatiotemporal variables and stressful macrogeographic and microgeographic environments? (3) How many genetic resources are harbored in the wild progenitors for crop improvement? We advanced ecological genetics into ecological genomics and analyzed (regionally across Israel and the entire Near East Fertile Crescent and locally at microsites, focusing on the "Evolution Canyon" model) hundreds of populations and thousands of genotypes for protein (allozyme) and deoxyribonucleic acid (DNA) (coding and noncoding) diversity, partly combined with phenotypic diversity. The environmental stresses analyzed included abiotic (climatic and microclimatic, edaphic) and biotic (pathogens, demographic) stresses. Recently, we introduced genetic maps, cloning, and transformation of candidate genes. Our results indicate abundant genotypic and phenotypic diversity in natural plant populations. The organization and evolution of molecular and organismal diversity in plant populations, at all genomic regions and
Cytogenetic and molecular markers for detecting Aegilops uniaristata chromosomes in a wheat background.

PubMed

Gong, Wenping; Li, Guangrong; Zhou, Jianping; Li, Genying; Liu, Cheng; Huang, Chengyan; Zhao, Zhendong; Yang, Zujun

2014-09-01

Aegilops uniaristata has many agronomically useful traits that can be used for wheat breeding. So far, a Triticum turgidum - Ae. uniaristata amphiploid and one set of Chinese Spring (CS) - Ae. uniaristata addition lines have been produced. To guide Ae. uniaristata chromatin transformation from these lines into cultivated wheat through chromosome engineering, reliable cytogenetic and molecular markers specific for Ae. uniaristata chromosomes need to be developed. Standard C-banding shows that C-bands mainly exist in the centromeric regions of Ae. uniaristata but rarely at the distal ends. Fluorescence in situ hybridization (FISH) using (GAA)8 as a probe showed that the hybridization signals of chromosomes 1N-7N differ from one another; thus (GAA)8 can be used to identify all Ae. uniaristata chromosomes in a wheat background simultaneously. Moreover, a total of 42 molecular markers specific for Ae. uniaristata chromosomes were developed by screening expressed sequence tag - sequence tagged site (EST-STS), expressed sequence tag - simple sequence repeat (EST-SSR), and PCR-based landmark unique gene (PLUG) primers. The markers were subsequently localized using the CS - Ae. uniaristata addition lines and different wheat cultivars as controls. The cytogenetic and molecular markers developed herein will be helpful for screening and identifying wheat - Ae. uniaristata progeny.
Dynamics and Differential Proliferation of Transposable Elements During the Evolution of the B and A Genomes of Wheat

PubMed Central

Charles, Mathieu; Belcram, Harry; Just, Jérémy; Huneau, Cécile; Viollet, Agnès; Couloux, Arnaud; Segurens, Béatrice; Carter, Meredith; Huteau, Virginie; Coriton, Olivier; Appels, Rudi; Samain, Sylvie; Chalhoub, Boulos

2008-01-01

Transposable elements (TEs) constitute >80% of the wheat genome but their dynamics and contribution to size variation and evolution of wheat genomes (Triticum and Aegilops species) remain unexplored. In this study, 10 genomic regions have been sequenced from wheat chromosome 3B and used to constitute, along with all publicly available genomic sequences of wheat, 1.98 Mb of sequence (from 13 BAC clones) of the wheat B genome and 3.63 Mb of sequence (from 19 BAC clones) of the wheat A genome. Analysis of TE sequence proportions (as percentages), ratios of complete to truncated copies, and estimation of insertion dates of class I retrotransposons showed that specific types of TEs have undergone waves of differential proliferation in the B and A genomes of wheat. While both genomes show similar rates and relatively ancient proliferation periods for the Athila retrotransposons, the Copia retrotransposons proliferated more recently in the A genome whereas Gypsy retrotransposon proliferation is more recent in the B genome. It was possible to estimate for the first time the proliferation periods of the abundant CACTA class II DNA transposons, relative to that of the three main retrotransposon superfamilies. Proliferation of these TEs started prior to and overlapped with that of the Athila retrotransposons in both genomes. However, they also proliferated during the same periods as Gypsy and Copia retrotransposons in the A genome, but not in the B genome. As estimated from their insertion dates and confirmed by PCR-based tracing analysis, the majority of differential proliferation of TEs in B and A genomes of wheat (87 and 83%, respectively), leading to rapid sequence divergence, occurred prior to the allotetraploidization event that brought them together in Triticum turgidum and Triticum aestivum, <0.5 million years ago. More importantly, the allotetraploidization event appears to have neither enhanced nor repressed retrotranspositions. We discuss the apparent proliferation
[Molecular cytogenetic identification of Aegilops ventricosa x Aegilops cylindrica amphiploid SDAU18].

PubMed

Wang, Yu Hai; Bao, Yin Guang; Hao, Yuan Feng; Yuan, Yuan Yuan; Zhao, Chun Hua; Wang, Qing Zhuan; Wang, Hong Gang

2009-02-01

SDAU18, an amphiploid of Ae. ventricosa with Ae. cylindrica, was identified by cytological analysis, seed storage protein electrophoresis, genomic in situ hybridization (GISH) and inoculation assessment. The results are as follows: The chromosome number of root tip cells (RTCs) of SDAU18 plants varied from 52 to 56. Twenty-eight bivalents were observed at MI in most PMCs of SDAU18 plants with 56 chromosomes; a few univalents and multivalents also existed in some PMCs at MI, and the average chromosome configuration was 2n = 56 = 3.21 I + 19.78 II (ring) + 6.50 II (rod) + 0.01 III + 0.04 IV (ring) + 0.01 IV (rod). There were both Ae. ventricosa-specific bands and Ae. cylindrica-specific bands in the seed storage protein electrophoretogram of SDAU18; furthermore, SDAU18 had one novel HMW-GS not found in the parents and two novel ones not found in common wheats. GISH of RTC spreads of SDAU18 was carried out by labeling the total genomic DNA of Ae. ventricosa and Ae. cylindrica as probes in turn, and using that of the other parent as block. In each case, the green hybridization signal was observed on 14 of the 56 chromosomes in RTCs of SDAU18. SDAU18 was immune to powdery mildew and stripe rusts. SDAU18 was an amphiploid of Ae. ventricosa with Ae. cylindrica, and had very important significance in wheat breeding and genetic improvement.
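The average meiotic configuration can be sanity-checked by weighting each configuration by the number of chromosomes it contains. The sketch below reads the reported values as univalents (I), ring and rod bivalents (II), trivalents (III), and ring and rod quadrivalents (IV); that reading of the ring/rod labels is an assumption, although it does not affect the total.

```python
# Mean number of each configuration per cell, as reported in the abstract.
# Each entry: (label, chromosomes per configuration, mean count per cell).
configuration = [
    ("univalents (I)", 1, 3.21),
    ("ring bivalents (II)", 2, 19.78),
    ("rod bivalents (II)", 2, 6.50),
    ("trivalents (III)", 3, 0.01),
    ("ring quadrivalents (IV)", 4, 0.04),
    ("rod quadrivalents (IV)", 4, 0.01),
]

# Total chromosomes accounted for = sum of (chromosomes per configuration x mean count).
total = sum(size * mean for _, size, mean in configuration)
print(f"chromosomes accounted for: {total:.2f}")
```

The weighted sum comes to 56, which agrees with the stated 2n = 56.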
Genetic variation of jointed goatgrass (Aegilops cylindrica Host.) from Iran using RAPD-PCR and SDS-PAGE of seed proteins.

PubMed

Farkhari, M; Naghavi, M R; Pyghambari, S A; Sabokdast

2007-09-01

Genetic variation of 28 populations of jointed goatgrass (Aegilops cylindrica Host.), collected from different parts of Iran, was evaluated using both RAPD-PCR and SDS-PAGE of seed proteins. The diversity within and between populations for the three-band High Molecular Weight (HMW) glutenin subunit pattern was extremely low. Out of 15 RAPD primers screened, 14 generated 133 reproducible fragments, of which 92 (69%) were polymorphic. Genetic similarity calculated from the RAPD data ranged from 0.64 to 0.98. A dendrogram was prepared on the basis of a similarity matrix using the UPGMA algorithm and separated the 28 populations into two groups. Populations of the same origin did not always cluster together, and populations of very diverse geographical origins could not always be distinguished. Our results show that, compared to seed storage proteins, RAPD is suitable for genetic diversity assessment in Ae. cylindrica populations.
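The polymorphism percentage and a band-sharing similarity of the kind that feeds a UPGMA dendrogram can be reproduced in a few lines. This is a sketch under stated assumptions: the abstract does not say which similarity coefficient was used, so the Jaccard coefficient here is a stand-in, and the three presence/absence profiles are hypothetical.

```python
# Figures reported in the abstract.
total_fragments = 133
polymorphic_fragments = 92
percent_polymorphic = 100 * polymorphic_fragments / total_fragments
print(f"{percent_polymorphic:.1f}% polymorphic")


def jaccard_similarity(a, b):
    """Bands shared by both profiles divided by bands present in either (1 = band present)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


# Hypothetical RAPD presence/absence profiles for three populations over eight bands.
profiles = {
    "pop_A": [1, 1, 0, 1, 0, 1, 1, 0],
    "pop_B": [1, 1, 0, 1, 1, 1, 0, 0],
    "pop_C": [0, 1, 1, 0, 1, 0, 1, 1],
}

# Pairwise similarities; a matrix like this is the usual input to UPGMA clustering.
names = list(profiles)
similarity = {
    (p, q): jaccard_similarity(profiles[p], profiles[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
for pair, value in similarity.items():
    print(pair, round(value, 2))
```

With real data, each profile would hold one entry per scored fragment (133 here), and the resulting matrix would be passed to a hierarchical clustering routine using average linkage.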
Development of wheat-Aegilops speltoides recombinants and simple PCR-based markers for Sr32 and a new stem rust resistance gene on the 2S#1 chromosome.

PubMed

Mago, Rohit; Verlin, Dawn; Zhang, Peng; Bansal, Urmil; Bariana, Harbans; Jin, Yue; Ellis, Jeffrey; Hoxha, Sami; Dundas, Ian

2013-12-01

Wheat-Aegilops speltoides recombinants carrying stem rust resistance genes Sr32 and SrAes1t effective against Ug99 and PCR markers for marker-assisted selection. Wild relatives of wheat are important resources for new rust resistance genes but are underutilized because the valuable resistances are often linked to negative traits that prevent deployment of these genes in commercial wheats. Here, we report ph1b-induced recombinants with reduced alien chromatin derived from E.R. Sears' wheat-Aegilops speltoides 2D-2S#1 translocation line C82.2, which carries the widely effective stem rust resistance gene Sr32. Infection type assessments of the recombinants showed that the original translocation in fact carries two stem rust resistance genes, Sr32 on the short arm and a previously undescribed gene SrAes1t on the long arm of chromosome 2S#1. Recombinants with substantially shortened alien chromatin were produced for both genes, which confer resistance to stem rust races in the TTKSK (Ug99) lineage and representative races of all Australian stem rust lineages. Selected recombinants were backcrossed into adapted Australian cultivars and PCR markers were developed to facilitate the incorporation of these genes into future wheat varieties. Our recombinants and those from several other labs now show that Sr32, Sr39, and SrAes7t on the short arm and Sr47 and SrAes1t on the long arm of 2S#1 form two linkage groups and at present no rust races are described that can distinguish these resistance specificities.
Expression pattern of salt tolerance-related genes in Aegilops cylindrica.

PubMed

Arabbeigi, Mahbube; Arzani, Ahmad; Majidi, Mohammad Mahdi; Sayed-Tabatabaei, Badraldin Ebrahim; Saha, Prasenjit

2018-02-01

Aegilops cylindrica, a salt-tolerant gene pool of wheat, is a useful plant model for understanding the mechanism of salt tolerance. A salt-tolerant genotype, USL26, and a salt-sensitive genotype, K44, of A. cylindrica, originating from Uremia Salt Lake shores in Northwest Iran and a non-saline Kurdestan province in West Iran, respectively, were identified based on screening evaluation and used for this work. The objective of the current study was to investigate the expression patterns of four genes related to ion homeostasis in this species. Under treatment of 400 mM NaCl, USL26 showed significantly higher root and shoot dry matter levels and K+ concentrations, together with lower Na+ concentrations, than the K44 genotype. A. cylindrica HKT1;5 (AecHKT1;5), SOS1 (AecSOS1), NHX1 (AecNHX1) and VP1 (AecVP1) were partially sequenced to design gene-specific primers. Quantitative real-time PCR showed a differential expression pattern of these genes between the two genotypes and between the root and shoot tissues. Expression of AecHKT1;5 and AecSOS1 was greater in the roots than in the shoots of USL26, while AecNHX1 and AecVP1 were equally expressed in both tissues of USL26 and K44. The higher transcripts of AecHKT1;5 in the roots versus the shoots could explain both the lower Na+ in the shoots and the much lower Na+ and higher K+ concentrations in the roots/shoots of USL26 compared to K44. Therefore, the involvement of AecHKT1;5 in shoot-to-root handover of Na+, possibly in combination with the exclusion of excessive Na+ from the root in the salt-tolerant genotype, is suggested.
Mapping of stripe rust resistance gene in an Aegilops caudata introgression line in wheat and its genetic association with leaf rust resistance.

PubMed

Toor, Puneet Inder; Kaur, Satinder; Bansal, Mitaly; Yadav, Bharat; Chhuneja, Parveen

2016-12-01

A pair of stripe rust and leaf rust resistance genes was introgressed from Aegilops caudata, a nonprogenitor diploid species with the CC genome, to cultivated wheat. Inheritance and genetic mapping of the stripe rust resistance gene in a backcross-recombinant inbred line (BC-RIL) population derived from the cross of a wheat-Ae. caudata introgression line (IL) T291-2 (pau16060) with wheat cv. PBW343 is reported here. Segregation of BC-RILs for stripe rust resistance depicted a single major gene conditioning adult plant resistance (APR), with stripe rust reaction varying from TR-20MS in resistant RILs, signifying the presence of some minor genes as well. Genetic association with leaf rust resistance revealed that the two genes are located at a recombination distance of 13%. IL T291-2 had earlier been reported to carry introgressions on wheat chromosomes 2D, 3D, 4D, 5D, 6D and 7D. Genetic mapping indicated the introgression of the stripe rust resistance gene on wheat chromosome 5DS in the region carrying leaf rust resistance gene LrAc, but as an independent introgression. Simple sequence repeat (SSR) and sequence-tagged site (STS) markers designed from the survey sequence data of 5DS enriched the target region harbouring the stripe and leaf rust resistance genes. The stripe rust resistance locus, temporarily designated as YrAc, mapped at the distal-most end of 5DS, linked with a group of four colocated SSRs and two resistance gene analogue (RGA)-STS markers at a distance of 5.3 cM. LrAc mapped at a distance of 9.0 cM from YrAc and at 2.8 cM from RGA-STS marker Ta5DS_2737450. YrAc and LrAc appear to be the candidate genes for marker-assisted enrichment of the wheat gene pool for rust resistance.
Agronomic Traits and Molecular Marker Identification of Wheat–Aegilops caudata Addition Lines

PubMed Central

Gong, Wenping; Han, Ran; Li, Haosheng; Song, Jianmin; Yan, Hongfei; Li, Genying; Liu, Aifeng; Cao, Xinyou; Guo, Jun; Zhai, Shengnan; Cheng, Dungong; Zhao, Zhendong; Liu, Cheng; Liu, Jianjun

2017-01-01

Aegilops caudata is an important gene source for wheat breeding. Intensive evaluation of its utilization value is an essential first step prior to its application in breeding. In this research, the agronomical and quality traits of Triticum aestivum-Ae. caudata additions B–G (homoeologous groups not identified) were analyzed and evaluated. Disease resistance tests showed that chromosome D of Ae. caudata might possess leaf rust resistance, and chromosome E might carry stem rust and powdery mildew resistance genes. Investigations into agronomical traits suggested that the introduction of the Ae. caudata chromosome in addition line F could reduce plant height. Grain quality tests showed that the introduction of chromosomes E or F into wheat could increase its protein and wet gluten content. Therefore, wheat-Ae. caudata additions D–F are all potentially useful candidates for chromosome engineering activities to create useful wheat-alien chromosome introgressions. A total of 55 EST-based molecular markers were developed and then used to identify the chromosome homoeologous group of each of the Ae. caudata B–G chromosomes. Marker analysis indicated that the Ae. caudata chromosomes in addition lines B to G were structurally altered; therefore, a large population combined with intensive screening pressure should be taken into consideration when inducing and screening for wheat-Ae. caudata compensating translocations. Marker data also indicated that the Ae. caudata chromosomes in addition lines C–F were 5C, 6C, 7C, and 3C, respectively, while the homoeologous groups of chromosomes B and G of Ae. caudata are as yet undetermined and need further research. PMID:29075275
Diversity of fungal endophytes in recent and ancient wheat ancestors Triticum dicoccoides and Aegilops sharonensis.

PubMed

Ofek-Lalzar, Maya; Gur, Yonatan; Ben-Moshe, Sapir; Sharon, Or; Kosman, Evsey; Mochli, Elad; Sharon, Amir

2016-10-01

Endophytes have profound impacts on plants, including beneficial effects on agriculturally important traits. We hypothesized that endophytes in wild plants include beneficial endophytes that are absent or underrepresented in domesticated crops. In this work, we studied the structure of endophyte communities in wheat-related grasses, Triticum dicoccoides and Aegilops sharonensis, and compared it to an endophyte community from wheat (T. aestivum). Endophytes were isolated by cultivation and by cultivation-independent methods. In total, 514 intergenic spacer region sequences from single cultures were analyzed. Categorization at 97% sequence similarity resulted in 67 operational taxonomic units (OTUs) that were evenly distributed between the different plant species. A narrow core community of Alternaria spp. was found in all samples, but each plant species also contained a significant portion of unique endophytes. The cultivation-independent analysis identified a larger number of OTUs than the cultivation method, half of which were singletons or doubletons. For OTUs with a relative abundance >0.5%, similar numbers were obtained by both methods. Collectively, our data show that wild grass relatives of wheat contain a wealth of taxonomically diverse fungal endophytes that are not found in modern wheat, some of which belong to taxa with known beneficial effects. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Wheat multiple synthetic derivatives: a new source for heat stress tolerance adaptive traits

PubMed Central

Elbashir, Awad Ahmed Elawad; Gorafi, Yasir Serag Alnor; Tahir, Izzat Sidahmed Ali; Kim, June-Sik; Tsujimoto, Hisashi

2017-01-01

Heat stress is detrimental to wheat (Triticum aestivum L.) productivity. In this study, we aimed to select heat-tolerant plants from a multiple synthetic derivatives (MSD) population and evaluate their agronomic and physiological traits. We selected six tolerant plants from the population with the background of the cultivar ‘Norin 61’ (N61) and established six MNH (MSD population of N61 selected as heat stress-tolerant) lines. We grew these lines with N61 in the field and growth chamber. In the field, we used optimum and late sowings to ensure plant exposure to heat. In the growth chamber, in addition to N61, we used the heat-tolerant cultivars ‘Gelenson’ and ‘Bacanora’. We confirmed that MNH2 and MNH5 lines acquired heat tolerance. These lines had higher photosynthesis and stomatal conductance and exhibited no reduction in grain yield and biomass under heat stress compared to N61. We noticed that N61 had relatively good adaptability to heat stress. Our results indicate that the MSD population includes the diversity of Aegilops tauschii and is a promising resource to uncover useful quantitative traits derived from this wild species. Selected lines could be useful for heat stress tolerance breeding. PMID:28744178
Introgression of leaf rust and stripe rust resistance from Sharon goatgrass (Aegilops sharonensis Eig) into bread wheat (Triticum aestivum L.).

PubMed

Millet, E; Manisterski, J; Ben-Yehuda, P; Distelfeld, A; Deek, J; Wan, A; Chen, X; Steffenson, B J

2014-06-01

Leaf rust and stripe rust are devastating wheat diseases, causing significant yield losses in many regions of the world. The use of resistant varieties is the most efficient way to protect wheat crops from these diseases. Sharon goatgrass (Aegilops sharonensis or AES), which is a diploid wild relative of wheat, exhibits a high frequency of leaf and stripe rust resistance. We used the resistant AES accession TH548 and induced homoeologous recombination by the ph1b allele to obtain resistant wheat recombinant lines carrying AES chromosome segments in the genetic background of the spring wheat cultivar Galil. The gametocidal effect from AES was overcome by using an "anti-gametocidal" wheat mutant. These recombinant lines were found resistant to highly virulent races of the leaf and stripe rust pathogens in Israel and the United States. Molecular DArT analysis of the different recombinant lines revealed different lengths of AES segments on wheat chromosome 6B, which indicates the location of both resistance genes.
Rapid evolutionary dynamics in a 2.8-Mb chromosomal region containing multiple prolamin and resistance gene families in Aegilops tauschii

USDA-ARS's Scientific Manuscript database

The prolamin (seed storage proteins high in glutamine and proline) and resistance gene families are important in domesticated bread wheat (Triticum aestivum) food uses and in defense against pathogen attacks, respectively. To better understand the evolution of these multi-gene families, the DNA se...
The 2NS Translocation from Aegilops ventricosa Confers Resistance to the Triticum Pathotype of Magnaporthe oryzae

PubMed Central

Cruz, C.D.; Peterson, G.L.; Bockus, W.W.; Kankanala, P.; Dubcovsky, J.; Jordan, K.W.; Akhunov, E.; Chumley, F.; Baldelomar, F.D.; Valent, B.

2016-01-01

Wheat blast is a serious disease caused by the fungus Magnaporthe oryzae (Triticum pathotype) (MoT). The objective of this study was to determine the effect of the 2NS translocation from Aegilops ventricosa (Zhuk.) Chennav on wheat head and leaf blast resistance. Disease phenotyping experiments were conducted in growth chamber, greenhouse, and field environments. Among 418 cultivars of wheat (Triticum aestivum L.), those with 2NS had 50.4 to 72.3% less head blast than those without 2NS when inoculated with an older MoT isolate under growth chamber conditions. When inoculated with recently collected isolates, cultivars with 2NS had 64.0 to 80.5% less head blast. Under greenhouse conditions when lines were inoculated with an older MoT isolate, those with 2NS had a significant head blast reduction. With newer isolates, not all lines with 2NS showed a significant reduction in head blast, suggesting that the genetic background and/or environment may influence the expression of any resistance conferred by 2NS. However, when near-isogenic lines (NILs) with and without 2NS were planted in the field, there was strong evidence that 2NS conferred resistance to head blast. Results from foliar inoculations suggest that the resistance to head infection that is imparted by the 2NS translocation does not confer resistance to foliar disease. In conclusion, the 2NS translocation was associated with significant reductions in head blast in both spring and winter wheat. PMID:27814405
Discovery and molecular mapping of a new gene conferring resistance to stem rust, Sr53, derived from Aegilops geniculata and characterization of spontaneous translocation stocks with reduced alien chromatin.

PubMed

Liu, Wenxuan; Rouse, Matthew; Friebe, Bernd; Jin, Yue; Gill, Bikram; Pumphrey, Michael O

2011-07-01

This study reports the discovery and molecular mapping of a resistance gene effective against stem rust races RKQQC and TTKSK (Ug99) derived from Aegilops geniculata (2n = 4x = 28, U(g)U(g)M(g)M(g)). Two populations from the crosses TA5599 (T5DL-5M(g)L·5M(g)S)/TA3809 (ph1b mutant in Chinese Spring background) and TA5599/Lakin were developed and used for genetic mapping to identify markers linked to the resistance gene. Further molecular and cytogenetic characterization resulted in the identification of nine spontaneous recombinants with shortened Ae. geniculata segments. Three of the wheat-Ae. geniculata recombinants (U6154-124, U6154-128, and U6200-113) are interstitial translocations (T5DS·5DL-5M(g)L-5DL), with 20-30% proximal segments of 5M(g)L translocated to 5DL; the other six are recombinants (T5DL-5M(g)L·5M(g)S) that have shortened segments of 5M(g)L with fraction lengths (FL) of 0.32-0.45 compared with FL 0.55 for the 5M(g)L segment in the original translocation donor, TA5599. Recombinants U6200-64, U6200-117, and U6154-124 carry the stem rust resistance gene Sr53 with the same infection type as TA5599, the resistance gene donor. All recombinants were confirmed to be genetically compensating on the basis of genomic in situ hybridization and molecular marker analysis with chromosome 5D- and 5M(g)-specific SSR/STS-PCR markers. These recombinants between wheat and Ae. geniculata will provide another source for wheat stem rust resistance breeding and for physical mapping of the resistance locus and crossover hot spots between wheat chromosome 5D and chromosome 5M(g)L of Ae. geniculata.
Composition, variation, expression and evolution of low-molecular-weight glutenin subunit genes in Triticum urartu.

PubMed

Luo, Guangbin; Zhang, Xiaofei; Zhang, Yanlin; Yang, Wenlong; Li, Yiwen; Sun, Jiazhu; Zhan, Kehui; Zhang, Aimin; Liu, Dongcheng

2015-02-28

Wheat (AABBDD, 2n = 6x = 42) is a major dietary component for many populations across the world. Bread-making quality of wheat is mainly determined by glutenin subunits, but it remains challenging to elucidate the composition and variation of low-molecular-weight glutenin subunit (LMW-GS) genes, the major components of glutenin subunits in hexaploid wheat. This problem, however, can be greatly simplified by characterizing the LMW-GS genes in Triticum urartu, the A-genome donor of hexaploid wheat. In the present study, we exploited the high-throughput molecular marker system, gene cloning, proteomic methods and molecular evolutionary genetic analysis to reveal the composition, variation, expression and evolution of LMW-GS genes in a T. urartu population from the Fertile Crescent region. Eight LMW-GS genes, including four m-type, one s-type and three i-type, were characterized in the T. urartu population. Six or seven genes, the highest number at the Glu-A3 locus, were detected in each accession. Three i-type genes, each containing more than six allelic variants, were tightly linked because of their co-segregation in every accession. Only 2-3 allelic variants were detected for each m- and s-type gene. The m-type gene, TuA3-385, for which homologs were previously characterized only at the Glu-D3 locus in common wheat and Aegilops tauschii, was detected at the Glu-A3 locus in T. urartu. TuA3-460 was the first s-type gene identified at the Glu-A3 locus. Proteomic analysis showed 1-4 genes, mainly i-type, expressed in individual accessions. About 62% of accessions had three active i-type genes, rather than one or two as in common wheat. Southeastern Turkey might be the center of origin and diversity for T. urartu due to its abundance of LMW-GS genes/genotypes. Phylogenetic reconstruction demonstrated that the characterized T. urartu might be the direct donor of the Glu-A3 locus in common wheat varieties. Compared with the Glu-A3 locus in common wheat, a large number of highly
Analysis of ATP6 sequence diversity in the Triticum-Aegilops group of species reveals the crucial role of rearrangement in mitochondrial genome evolution

USDA-ARS's Scientific Manuscript database

Mutation and chromosomal rearrangements are the two main forces of increasing genetic diversity for natural selection to act upon, and ultimately drive the evolutionary process. Although genome evolution is a function of both forces, simultaneously, the ratio of each can be varied among different ge...
Development of wheat lines carrying stem rust resistance gene Sr39 with reduced Aegilops speltoides chromatin and simple PCR markers for marker-assisted selection.

PubMed

Mago, Rohit; Zhang, P; Bariana, H S; Verlin, D C; Bansal, U K; Ellis, J G; Dundas, I S

2009-11-01

The use of major resistance genes is a cost-effective strategy for preventing stem rust epidemics in wheat crops. The stem rust resistance gene Sr39 provides resistance to all currently known pathotypes of Puccinia graminis f. sp. tritici (Pgt) including Ug99 (TTKSK) and was introgressed together with leaf rust resistance gene Lr35 conferring adult plant resistance to P. triticina (Pt), into wheat from Aegilops speltoides. It has not been used extensively in wheat breeding because of the presumed but as yet undocumented negative agronomic effects associated with Ae. speltoides chromatin. This investigation reports the production of a set of recombinants with shortened Ae. speltoides segments through induction of homoeologous recombination between the wheat and the Ae. speltoides chromosome. Simple PCR-based DNA markers were developed for resistant and susceptible genotypes (Sr39#22r and Sr39#50s) and validated across a set of recombinant lines and wheat cultivars. These markers will facilitate the pyramiding of ameliorated sources of Sr39 with other stem rust resistance genes that are effective against the Pgt pathotype TTKSK and its variants.
Chemical interactions between plants in Mediterranean vegetation: the influence of selected plant extracts on Aegilops geniculata metabolome.

PubMed

Scognamiglio, Monica; Fiumano, Vittorio; D'Abrosca, Brigida; Esposito, Assunta; Choi, Young Hae; Verpoorte, Robert; Fiorentino, Antonio

2014-10-01

Allelopathy is the chemically mediated communication among plants. While on one hand there is growing interest in the field, on the other hand it is still debated as doubts exist at different levels. A number of compounds have been reported for their ability to influence plant growth, but the existence of this phenomenon in the field has rarely been demonstrated. Furthermore, only a few studies have reported the uptake and the effects at the molecular level of the allelochemicals. Allelopathy has been reported in some plants of Mediterranean vegetation and could contribute to structuring this ecosystem. Sixteen plants of Mediterranean vegetation have been selected and studied by an NMR-based metabolomics approach. The extracts of these donor plants have been characterized in terms of chemical composition and the effects on a selected receiving plant, Aegilops geniculata, have been studied both at the morphological and at the metabolic level. Most of the plant extracts employed in this study were found to have an activity, which could be correlated with the presence of flavonoids and hydroxycinnamate derivatives. These plant extracts affected the receiving plant in different ways, with different rates of growth inhibition at the morphological level. The results of metabolomic analysis of treated plants suggested the induction of oxidative stress in all the receiving plants treated with active donor plant extracts, although differences were observed among the responses. Finally, the uptake and transport into receiving plant leaves of different metabolites present in the extracts added to the culture medium were observed. Copyright © 2014 Elsevier Ltd. All rights reserved.

[Analysis of storage proteins (prolamines, puroindolines and waxy) in common wheat lines Triticum aestivum L. x (Triticum timopheevii Zhuk. x Triticum tauschii) with complex resistance to fungal infections].

PubMed

Obukhova, L V; Laĭkova, L I; Shumnyĭ, V K

2010-06-01

Storage proteins (prolamines, puroindolines, and Waxy) were studied in common wheat introgression lines obtained with the use of the Saratovskaya 29 (S29) cultivar line and synthetic hexaploid wheat (Triticum timopheevii Zhuk. x T. tauschii) (Sintetik, Sin.) and displaying complex resistance to fungal infections. Comparative analysis of storage proteins in the introgression lines of common wheat Triticum aestivum L. and in the parental forms revealed the only line (BC5) having a substitution at the Gli-B2 locus from Sintetik. Hybrid lines subjected to nine backcrosses with the recurrent parental form S29 and selections for resistance to pathogens can be considered as nearly isogenic for the selected trait and retaining the allelic composition of (1) prolamines responsible for the bread-making quality, (2) puroindolines associated with grain texture, and (3) Waxy proteins responsible for nutritive qualities. These lines are valuable as donors of immunity in breeding programs without the loss of the quality of flour and grain as compared to the S29 line and are also important in searching for genes determining resistance to leaf and stem rust and to powdery mildew. The amphiploid has a number of characters (silent Glu-A1 locus and Ha genotype) that can negatively affect the quality of flour and grain and thus should be taken into account when choosing this donor.
Molecular and cytogenetic characterization of a durum wheat-Aegilops speltoides chromosome translocation conferring resistance to stem rust.

PubMed

Faris, Justin D; Xu, Steven S; Cai, Xiwen; Friesen, Timothy L; Jin, Yue

2008-01-01

Stem rust is a serious disease of wheat that has caused historical epidemics, but it has not been a threat in recent decades in North America owing to the eradication of the alternative host and deployment of resistant cultivars. However, the recent emergence of Ug99 (or race TTKS) poses a threat to global wheat production because most currently grown wheat varieties are susceptible. In this study, we evaluated a durum wheat-Aegilops speltoides chromosome translocation line (DAS15) for reaction to Ug99 and six other races of stem rust, and used molecular and cytogenetic tools to characterize the translocation. DAS15 was resistant to all seven races of stem rust. Two durum-Ae. speltoides translocated chromosomes were detected in DAS15. One translocation involved the short arm, centromere, and a major portion of the long arm of Ae. speltoides chromosome 2S and a small terminal segment from durum chromosome arm 2BL. Thus, this translocated chromosome is designated T2BL-2SL*2SS. Cytogenetic mapping assigned the resistance gene(s) in DAS15 to the Ae. speltoides segment in T2BL-2SL*2SS. The Ae. speltoides segment in the other translocated chromosome did not harbour stem rust resistance. A comparison of DAS15 and the wheat stocks carrying the Ae. speltoides-derived resistance genes Sr32 and Sr39 indicated that the stem rust resistance gene present in DAS15 is likely novel and will be useful for developing germplasm with resistance to Ug99. Efforts to reduce Ae. speltoides chromatin in T2BL-2SL*2SS are currently in progress.
[Chromosomal localization of the speltoidy gene, introgressed into bread wheat from Aegilops speltoides Tausch., and its interaction with the Q gene of Triticum spelta L.].

PubMed

Simonov, A V; Pshenichnikova, T A

2012-11-01

The differences between bread wheat (Triticum aestivum L.) and spelt (Triticum spelta L.) in the shape of the spike and threshing character are determined by the allelic status of one major Q gene, mapped to the long arm of chromosome 5A. This gene is a member of the APETALA2 family of transcription factors and plays an important role in domestication of wheat. In the present study, using monosomic analysis, we determined the chromosomal localization of the Q(S) gene, introgressed into bread wheat from Aegilops speltoides Tausch. and homoallelic to the Q gene. It was demonstrated that the Q(S) gene was located in chromosome 5A of the bread wheat line from the Arsenal collection. This gene conferred spike speltoidy in the line itself, as well as in its hybrids with bread wheat cultivars. The Q(S) gene dominated over the bread wheat Q gene and was equally effective in the homo-, hemi-, and heterozygous states. In hybrids between the introgression line and a number of spring spelt accessions, interaction between the Q and Q(S) genes was observed, manifested as the formation of a superspeltoid spike.
High mature grain phytase activity in the Triticeae has evolved by duplication followed by neofunctionalization of the purple acid phosphatase phytase (PAPhy) gene

PubMed Central

Brinch-Pedersen, Henrik

2013-01-01

The phytase activity in food and feedstuffs is an important nutritional parameter. Members of the Triticeae tribe accumulate purple acid phosphatase phytases (PAPhy) during grain filling. This accumulation elevates mature grain phytase activities (MGPA) up to levels between ~650 FTU/kg for barley and 6000 FTU/kg for rye. This is notably more than other cereals. For instance, rice, maize, and oat have MGPAs below 100 FTU/kg. The cloning and characterization of the PAPhy gene complement from wheat, barley, rye, einkorn, and Aegilops tauschii is reported here. The Triticeae PAPhy genes generally consist of a set of paralogues, PAPhy_a and PAPhy_b, and have been mapped to Triticeae chromosomes 5 and 3, respectively. The promoters share a conserved core, but the PAPhy_a promoter has acquired a novel cis-acting regulatory element for expression during grain filling while the PAPhy_b promoter has maintained the archaic function and drives expression during germination. Brachypodium is the only sequenced Poaceae sharing the PAPhy duplication. As for the Triticeae, the duplication is reflected in a high MGPA of ~4200 FTU/kg in Brachypodium. The sequence conservation of the paralogous loci on Brachypodium chromosomes 1 and 2 does not extend beyond the PAPhy gene. The results indicate that a single-gene segmental duplication may have enabled the evolution of high MGPA by creating functional redundancy of the parent PAPhy gene. This implies that similar MGPA levels may be out of reach in breeding programs for some Poaceae, e.g. maize and rice, whereas Triticeae breeders should focus on PAPhy_a. PMID:23918958
Gametocidal Factor Transferred from Aegilops geniculata Roth Can Be Adapted for Large-Scale Chromosome Manipulations in Cereals

PubMed Central

Kwiatek, Michał T.; Wiśniewska, Halina; Ślusarkiewicz-Jarzina, Aurelia; Majka, Joanna; Majka, Maciej; Belter, Jolanta; Pudelska, Hanna

2017-01-01

Segregation distorters are curious, evolutionarily selfish genetic elements, which distort Mendelian segregation in their favor at the expense of others. Those agents include gametocidal factors (Gc), which ensure their preferential transmission by triggering damages in cells lacking them via chromosome break induction. Hence, we hypothesized that the gametocidal system can be adapted for chromosome manipulations between Triticum and Secale chromosomes in hexaploid triticale (×Triticosecale Wittmack). In this work we studied the little-known gametocidal action of a Gc factor located on Aegilops geniculata Roth chromosome 4Mg. Our results indicate that the initiation of the gametocidal action takes place at anaphase II of meiosis of pollen mother cells. Hence, we induced androgenesis at postmeiotic pollen divisions (via anther cultures) in monosomic 4Mg addition plants of hexaploid triticale (AABBRR) followed by production of doubled haploids, to maintain the chromosome aberrations caused by the gametocidal action. This approach enabled us to obtain a large number of plants with two copies of particular chromosome translocations, which were identified by the use of cytomolecular methods. We obtained 41 doubled haploid triticale lines and 17 of them carried chromosome aberrations that included plants with the following chromosome sets: 40T+Dt2RS+Dt2RL (5 lines), 40T+N2R (1), 38T+D4RS.4BL (3), 38T+D5BS-5BL.5RL (5), and 38T+D7RS.3AL (3). The results show that the application of the Gc mechanism in combination with production of doubled haploid lines provides a sufficiently large population of homozygous doubled haploid individuals with two identical copies of translocation chromosomes. In our opinion, this approach will be a valuable tool for the production of novel plant material, which could be used for gene tracking studies, genetic mapping, and finally to enhance the diversity of cereals. PMID:28396677
Inducing rye 1R chromosome structural changes in common wheat cv. Chinese spring by the gametocidal chromosome 2C of Aegilops cylindrica.

PubMed

Shi, Fang; Liu, Kun-Fan; Endo, Takashi R; Wang, Dao-Wen

2005-05-01

To generate 1R deletion and translocation lines, we introduced a 2C chromosome, which was derived from Aegilops cylindrica and was known to have a gametocidal function when added monosomically into common wheat cv. Chinese Spring (CS) and its derivative, into a wheat-rye 1R chromosome disomic addition line (CS-1R"). When the individuals with chromosome constitution 21" + 1R" + 2C' (2n = 45) were selfed, the 1R chromosome structural changes were found to be induced with high frequency (24.1%) among the progenies. By using C-banding and GISH analysis, we analyzed 1R structural changes in 46 F3 individuals, which came from 23 F2 plants. The rearranged 1R chromosomes could be characterized in about 85% of the F3 individuals. This included telosome 1RL (39.1%), iso-chromosome 1RL (2.2%), whole arm translocation involving 1RL (32.6%), telosome 1RS (4.3%), iso-chromosome 1RS (4.3%), and 1R deletion mutant with break point in the long arm (2.2%). The mutant 1R lines obtained in this study will potentially be useful in mapping the chromosome locations of agronomically important genes located in 1R. This study also demonstrated that molecular markers might be used to identify wheat chromosome arm involved in translocation with 1R.
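The category percentages in this abstract are consistent with whole-number counts out of the 46 F3 individuals analyzed. The Python sketch below shows that reconstruction; the per-category counts are back-calculated here from the reported percentages and are an assumption, not figures given by the authors.

```python
# F3 individuals analyzed, as stated in the abstract.
f3_individuals = 46

# Assumed counts per category, back-calculated from the reported percentages.
assumed_counts = {
    "telosome 1RL": 18,
    "iso-chromosome 1RL": 1,
    "whole arm translocation involving 1RL": 15,
    "telosome 1RS": 2,
    "iso-chromosome 1RS": 2,
    "1R deletion (break point in long arm)": 1,
}

# Percentage of the 46 individuals falling in each category.
percentages = {
    name: round(100 * count / f3_individuals, 1)
    for name, count in assumed_counts.items()
}
characterized_share = 100 * sum(assumed_counts.values()) / f3_individuals

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"characterized: {characterized_share:.1f}%")
```

Under these assumed counts, 39 of the 46 individuals fall into a characterized category, which agrees with the "about 85%" stated in the abstract.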
Evolutionary dynamics of retrotransposons assessed by high-throughput sequencing in wild relatives of wheat.

PubMed

Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian

2013-01-01

Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach bypassing the required step of a complete reference genome to assess the evolutionary trajectories of high copy number TE families from genome snapshot with high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes assessed the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, allowing to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.
Cytogenetic analysis and mapping of leaf rust resistance in Aegilops speltoides Tausch derived bread wheat line Selection2427 carrying putative gametocidal gene(s).

PubMed

Niranjana, M; Vinod; Sharma, J B; Mallick, Niharika; Tomar, S M S; Jha, S K

2017-12-01

Leaf rust (Puccinia triticina) is a major biotic stress affecting wheat yields worldwide. Host-plant resistance is the best method for controlling leaf rust. Aegilops speltoides is a good source of resistance against wheat rusts. To date, five Lr genes, Lr28, Lr35, Lr36, Lr47, and Lr51, have been transferred from Ae. speltoides to bread wheat. In Selection2427, a bread wheat introgressed line with Ae. speltoides as the donor parent, a dominant gene for leaf rust resistance was mapped to the long arm of chromosome 3B (LrS2427). None of the Lr genes introgressed from Ae. speltoides have been mapped to chromosome 3B. Since none of the designated seedling leaf rust resistance genes have been located on chromosome 3B, LrS2427 seems to be a novel gene. Selection2427 showed a unique property typical of gametocidal genes: when crossed to other bread wheat cultivars, the F1 showed partial pollen sterility and poor seed setting, whilst Selection2427 showed reasonable male and female fertility. Accidental co-transfer of gametocidal genes with LrS2427 may have occurred in Selection2427. Though LrS2427 did not show any segregation distortion and assorted independently of putative gametocidal gene(s), its utilization will be difficult due to the selfish behavior of gametocidal genes.
Isolation and sequence analysis of the wheat B genome subtelomeric DNA.

PubMed

Salina, Elena A; Sergeeva, Ekaterina M; Adonina, Irina G; Shcherban, Andrey B; Afonnikov, Dmitry A; Belcram, Harry; Huneau, Cecile; Chalhoub, Boulos

2009-09-05

Telomeric and subtelomeric regions are essential for genome stability and regular chromosome replication. In this work, we have characterized the wheat BAC (bacterial artificial chromosome) clones containing Spelt1 and Spelt52 sequences, which belong to the subtelomeric repeats of the B/G genomes of wheats and Aegilops species from the section Sitopsis. The BAC library from Triticum aestivum cv. Renan was screened using Spelt1 and Spelt52 as probes. Nine positive clones were isolated; of them, clone 2050O8 was localized mainly to the distal parts of wheat chromosomes by in situ hybridization. The distribution of the other clones indicated the presence of different types of repetitive sequences in BACs. Use of different approaches allowed us to prove that seven of the nine isolated clones belonged to the subtelomeric chromosomal regions. Clone 2050O8 was sequenced and its sequence of 119,737 bp was annotated. It is composed of 33% transposable elements (TEs), 8.2% Spelt52 (namely, the subfamily Spelt52.2) and five non-TE-related genes. DNA transposons are predominant, making up 24.6% of the entire BAC clone, whereas retroelements account for 8.4% of the clone length. The full-length CACTA transposon Caspar covers 11,666 bp, encoding a transposase and CTG-2 proteins, and this transposon accounts for 40% of the DNA transposons. The in situ hybridization data for 2050O8 derived subclones in combination with the BLAST search against wheat mapped ESTs (expressed sequence tags) suggest that clone 2050O8 is located in the terminal bin 4BL-10 (0.95-1.0). Additionally, four of the predicted 2050O8 genes showed significant homology to four putative orthologous rice genes in the distal part of rice chromosome 3S and confirm the synteny to wheat 4BL. Satellite DNA sequences from the subtelomeric regions of diploid wheat progenitor can be used for selecting the BAC clones from the corresponding regions of hexaploid wheat chromosomes. It has been demonstrated for the first time
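The composition figures quoted for clone 2050O8 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and it assumes the 40% figure for Caspar is expressed relative to the DNA-transposon fraction of the clone.

```python
# Figures taken from the abstract for BAC clone 2050O8.
BAC_LENGTH_BP = 119_737
DNA_TRANSPOSON_PCT = 24.6   # share of the whole clone
RETROELEMENT_PCT = 8.4      # share of the whole clone
CASPAR_BP = 11_666          # full-length CACTA transposon Caspar

# DNA transposons plus retroelements should reproduce the 33% TE content.
te_total_pct = DNA_TRANSPOSON_PCT + RETROELEMENT_PCT

# Approximate length of all DNA transposons, then Caspar's share of it.
dna_transposon_bp = BAC_LENGTH_BP * DNA_TRANSPOSON_PCT / 100
caspar_share_pct = 100 * CASPAR_BP / dna_transposon_bp

print(f"TE content: {te_total_pct:.1f}% of the clone")
print(f"DNA transposons: about {dna_transposon_bp:,.0f} bp")
print(f"Caspar: {caspar_share_pct:.1f}% of the DNA transposons")
```

Both derived values agree with the abstract: the two TE classes sum to 33%, and Caspar comes out at roughly 40% of the DNA-transposon sequence.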
Genotype-by-sequencing facilitates genetic mapping of a stem rust resistance locus in Aegilops umbellulata, a wild relative of cultivated wheat.

PubMed

Edae, Erena A; Olivera, Pablo D; Jin, Yue; Poland, Jesse A; Rouse, Matthew N

2016-12-15

Wild relatives of wheat play a significant role in wheat improvement as a source of genetic diversity. Stem rust disease of wheat causes significant yield losses at the global level and stem rust pathogen race TTKSK (Ug99) is virulent to most previously deployed resistance genes. Therefore, the objective of this study was to identify loci conferring resistance to stem rust pathogen races including Ug99 in an Aegilops umbellulata bi-parental mapping population using genotype-by-sequencing (GBS) SNP markers. A bi-parental F2:3 population derived from a cross made between stem rust resistant accession PI 298905 and stem rust susceptible accession PI 542369 was used for this study. F2 individuals were evaluated with stem rust race TTTTF followed by testing F2:3 families with races TTTTF and TTKSK. The segregation pattern of resistance to both stem rust races suggested the presence of one resistance gene. A genetic linkage map, comprising 1,933 SNP markers, was created for all seven chromosomes of Ae. umbellulata using GBS. A major stem rust resistance QTL that explained 80% and 52% of the phenotypic variations for TTTTF and TTKSK, respectively, was detected on chromosome 2U of Ae. umbellulata. The novel resistance gene for stem rust identified in this study can be transferred to commercial wheat varieties assisted by the tightly linked markers identified here. These markers, identified through our mapping approach, can be used to identify and track the resistance gene in marker-assisted breeding in wheat.
New insights into structural organization and gene duplication in a 1.75-Mb chromosomal region harboring the alpha-gliadin gene family in Aegilops tauschii

USDA-ARS's Scientific Manuscript database

Among the wheat prolamins important for its end-use traits, alpha-gliadins are abundant and also a major cause of food-related allergies and intolerances. Previous studies of various wheat species estimated that between 25 and 150 alpha-gliadin genes reside in the Gli-2 locus regions. To better understand...
GenomeFingerprinter: the genome fingerprint and the universal genome fingerprint analysis for systematic comparative genomics.

PubMed

Ai, Yuncan; Ai, Hannan; Meng, Fanmei; Zhao, Lei

2013-01-01

No attention has been paid to comparing sets of genome sequences that cross genetic components and biological categories, with far divergence over a large size range. We define this as systematic comparative genomics and aim to develop the methodology. First, we create a method, GenomeFingerprinter, to unambiguously produce a set of three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections, to illustrate the genome fingerprint of a given genome sequence. Second, we develop a set of concepts and tools, and thereby establish a method called the universal genome fingerprint analysis (UGFA). Particularly, we define the total genetic component configuration (TGCC) (including chromosome, plasmid, and phage) for describing a strain as a systematic unit, the universal genome fingerprint map (UGFM) of TGCC for differentiating strains as a universal system, and the systematic comparative genomics (SCG) for comparing a set of genomes crossing genetic components and biological categories. Third, we construct a method of quantitative analysis to compare two genomes by using the outcome dataset of genome fingerprint analysis. Specifically, we define the geometric center and its geometric mean for a given genome fingerprint map, followed by the Euclidean distance, the differentiate rate, and the weighted differentiate rate to quantitatively describe the difference between two genomes of comparison. Moreover, we demonstrate the applications through case studies on various genome sequences, giving tremendous insights into the critical issues in microbial genomics and taxonomy. We have created a method, GenomeFingerprinter, for rapidly computing, geometrically visualizing, and intuitively comparing a set of genomes at genome fingerprint level, and hence established a method called the universal genome fingerprint analysis, as well as developed a method of quantitative analysis of the outcome dataset. These have set
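To make the quantitative step in this abstract concrete, the sketch below computes a geometric center for each of two three-dimensional point sets and the Euclidean distance between those centers. It is a minimal illustration under stated assumptions, not the GenomeFingerprinter implementation: the abstract does not say how coordinates are derived from a sequence, so the point sets are hypothetical toy data, the geometric center is assumed to be the coordinate-wise mean, and the differentiate rate and weighted differentiate rate are omitted because they are not defined here.

```python
import math


def geometric_center(points):
    """Coordinate-wise mean of a 3-D point set (assumed definition)."""
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def center_distance(points_a, points_b):
    """Euclidean distance between the geometric centers of two point sets."""
    return math.dist(geometric_center(points_a), geometric_center(points_b))


# Hypothetical toy "fingerprints"; real ones would come from genome sequences.
fingerprint_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
fingerprint_b = [(3.0, 4.0, 0.0), (5.0, 4.0, 0.0), (5.0, 6.0, 0.0), (3.0, 6.0, 0.0)]

print(geometric_center(fingerprint_a))
print(geometric_center(fingerprint_b))
print(center_distance(fingerprint_a, fingerprint_b))
```

A single distance between centers discards the shape of each trajectory, which is presumably why the abstract lists further measures alongside it.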
Genome-wide characterization of JASMONATE-ZIM DOMAIN transcription repressors in wheat (Triticum aestivum L.).

PubMed

Wang, Yukun; Qiao, Linyi; Bai, Jianfang; Wang, Peng; Duan, Wenjing; Yuan, Shaohua; Yuan, Guoliang; Zhang, Fengting; Zhang, Liping; Zhao, Changping

2017-02-13

The JASMONATE-ZIM DOMAIN (JAZ) repressor family proteins are jasmonate co-receptors and transcriptional repressors in the jasmonic acid (JA) signaling pathway, and they play important roles in regulating the growth and development of plants. Recently, a growing number of studies on the JAZ gene family have been reported in many plants. Although the genome sequencing of common wheat (Triticum aestivum L.) and its relatives is complete, our knowledge about this gene family remains limited. Fourteen JAZ genes were identified in the wheat genome. Structural analysis revealed that the TaJAZ proteins in wheat were as conserved as those in other plants, but also had structural characteristics of their own. By phylogenetic analysis, all JAZ proteins from wheat and other plants were clustered into 11 sub-groups (G1-G11), and TaJAZ proteins shared a high degree of similarity with some JAZ proteins from Aegilops tauschii, Brachypodium distachyon and Oryza sativa. The Ka/Ks ratios of TaJAZ genes ranged from 0.0016 to 0.6973, suggesting that the TaJAZ family had undergone purifying selection in wheat. Gene expression patterns obtained by quantitative real-time PCR (qRT-PCR) revealed differential temporal and spatial regulation of TaJAZ genes under multifarious abiotic stress treatments of high salinity, drought, cold and phytohormone. Among these, TaJAZ7, 8 and 12 were specifically expressed in the anther tissues of the thermosensitive genic male sterile (TGMS) wheat line BS366 and normal control wheat line Jing411. Compared with the gene expression patterns in the normal wheat line Jing411, TaJAZ7, 8 and 12 had different expression patterns in abnormally dehiscent anthers of BS366 at the heading stage 6, suggesting that specific up- or down-regulation of these genes might be associated with the abnormal anther dehiscence in the TGMS wheat line. This study analyzed the size and composition of the JAZ gene family in wheat, and investigated stress responsive and differential tissue-specific expression profiles of each
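The inference from Ka/Ks ratios to purifying selection in this abstract follows the conventional reading of the statistic: values below 1 indicate purifying selection, values near 1 indicate neutral evolution, and values above 1 indicate positive selection. The sketch below applies that rule to the two extremes reported for the TaJAZ genes. The function name and the tolerance band around 1 are illustrative assumptions, not part of the study.

```python
def selection_regime(ka_ks, tolerance=0.05):
    """Classify a Ka/Ks ratio by the conventional rule of thumb.

    `tolerance` is an arbitrary band around 1.0 treated as neutral;
    it is an assumption made for this illustration.
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1 - tolerance:
        return "purifying"
    if ka_ks > 1 + tolerance:
        return "positive"
    return "neutral"


# Lowest and highest Ka/Ks values reported for the TaJAZ genes.
reported_range = (0.0016, 0.6973)

for value in reported_range:
    print(f"Ka/Ks = {value}: {selection_regime(value)} selection")
```

Because even the highest reported ratio is well below 1, every TaJAZ gene in the reported range falls in the purifying class under this rule.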
Characterization and Expression Analysis of Phytoene Synthase from Bread Wheat (Triticum aestivum L.)

PubMed Central

Flowerika; Alok, Anshu; Kumar, Jitesh; Thakur, Neha; Pandey, Ashutosh; Pandey, Ajay Kumar; Upadhyay, Santosh Kumar; Tiwari, Siddharth

2016-01-01

Phytoene synthase (PSY) regulates the first committed step of the carotenoid biosynthetic pathway in plants. The present work reports identification and characterization of the three PSY genes (TaPSY1, TaPSY2 and TaPSY3) in wheat (Triticum aestivum L.). The TaPSY1, TaPSY2, and TaPSY3 genes consisted of three homoeologs on the long arm of group 7 chromosome (7L), short arm of group 5 chromosome (5S), and long arm of group 5 chromosome (5L), respectively, in each subgenome (A, B, and D), with a similarity range from 89% to 97%. The protein sequence analysis demonstrated that TaPSY1 and TaPSY3 retain most of the conserved motifs for enzyme activity. Phylogenetic analysis of all TaPSY revealed an evolutionary relationship among PSY proteins of various monocot species. TaPSY derived from A and D subgenomes shared proximity to the PSY of Triticum urartu and Aegilops tauschii, respectively. The differential expression of TaPSY1, TaPSY2, and TaPSY3 in the various tissues, seed development stages, and stress treatments suggested their role in plant development and under stress conditions. TaPSY3 showed higher expression in all tissues, followed by TaPSY1. The presence of multiple stress-responsive cis-regulatory elements in the promoter region of TaPSY3, correlated with its higher expression during drought and heat stresses, suggested a role in these conditions. The expression pattern of TaPSY3 was correlated with the accumulation of β-carotene in the seed developmental stages. Bacterial complementation assay has validated the functional activity of each TaPSY protein. Hence, TaPSY can be explored in developing genetically improved wheat crop. PMID:27695116
The gametocidal chromosome as a tool for chromosome manipulation in wheat.

PubMed

Endo, T R

2007-01-01

Many alien chromosomes have been introduced into common wheat (the genus Triticum) from related wild species (the genus Aegilops). Some alien chromosomes have unique genes that secure their existence in the host by causing chromosome breakage in the gametes lacking them. Such chromosomes or genes, called gametocidal (Gc) chromosomes or Gc genes, are derived from different genomes (C, S, S(l) and M(g)) and belong to three different homoeologous groups 2, 3 and 4. The Gc genes of the C and M(g) genomes induce mild, or semi-lethal, chromosome mutations in euploid and alien addition lines of common wheat. Thus, induced chromosomal rearrangements have been identified and established in wheat stocks carrying deletions of wheat and alien (rye and barley) chromosomes or wheat-alien translocations. The gametocidal chromosomes isolated in wheat to date are reviewed here, focusing on their feature as a tool for chromosome manipulation.
Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line.

PubMed

Liu, Huitao; Cui, Peng; Zhan, Kehui; Lin, Qiang; Zhuo, Guoyin; Guo, Xiaoli; Ding, Feng; Yang, Wenlong; Liu, Dongcheng; Hu, Songnian; Yu, Jun; Zhang, Aimin

2011-03-29

Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line. The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18S, 26S, and 5S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences of higher plants had mostly divergent multiple reorganizations during the mtDNA evolution of higher plants. The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non
Whole-genome sequencing for comparative genomics and de novo genome assembly.

PubMed

Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

2015-01-01

Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).
Wheat plant selection for high yields entailed improvement of leaf anatomical and biochemical traits including tolerance to non-optimal temperature conditions.

PubMed

Brestic, Marian; Zivcak, Marek; Hauptvogel, Pavol; Misheva, Svetlana; Kocheva, Konstantina; Yang, Xinghong; Li, Xiangnan; Allakhverdiev, Suleyman I

2018-05-01

Assessment of photosynthetic traits and temperature tolerance was performed on a field-grown modern genotype (MG) and a local landrace (LR) of wheat (Triticum aestivum L.) as well as the wild relative species (Aegilops cylindrica Host.). The comparison was based on measurements of the gas exchange (A/ci, light and temperature response curves), slow and fast chlorophyll fluorescence kinetics, and some growth and leaf parameters. In MG, we observed the highest CO2 assimilation rate [Formula: see text] electron transport rate (Jmax) and maximum carboxylation rate [Formula: see text]. The Aegilops leaves had substantially lower values of all photosynthetic parameters; this fact correlated with its lower biomass production. The mesophyll conductance was almost the same in Aegilops and MG, despite the significant differences in leaf phenotype. In contrast, in LR with a higher dry mass per leaf area, the half mesophyll conductance (gm) values indicated more limited CO2 diffusion. In Aegilops, we found much lower carboxylation capacity; this can be attributed mainly to thin leaves and lower Rubisco activity. The difference in CO2 assimilation rate between MG and others was diminished because of its higher mitochondrial respiration activity indicating more intense metabolism. Assessment of temperature response showed lower temperature optimum and a narrow ecological valence (i.e., the range determining the tolerance limits of a species to an environmental factor) in Aegilops. In addition, analysis of photosynthetic thermostability identified the LR as the most sensitive. Our results support the idea that the selection for high yields was accompanied by the increase of photosynthetic productivity through unintentional improvement of leaf anatomical and biochemical traits including tolerance to non-optimal temperature conditions.
Ensembl Genomes 2016: more genomes, more complexity

PubMed Central

Kersey, Paul Julian; Allen, James E.; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J.; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J.; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K.; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D.; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello–Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M.; Howe, Kevin L.; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M.

2016-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. PMID:26578574
Ensembl Genomes 2016: more genomes, more complexity.

PubMed

Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

2016-01-04

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Enabling functional genomics with genome engineering

PubMed Central

Hilton, Isaac B.; Gersbach, Charles A.

2015-01-01

Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. PMID:26430154
phiGENOME: an integrative navigation throughout bacteriophage genomes.

PubMed

Stano, Matej; Klucar, Lubos

2011-11-01

phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.
Navigating yeast genome maintenance with functional genomics.

PubMed

Measday, Vivien; Stirling, Peter C

2016-03-01

Maintenance of genome integrity is a fundamental requirement of all organisms. To address this, organisms have evolved extremely faithful modes of replication, DNA repair and chromosome segregation to combat the deleterious effects of an unstable genome. Nonetheless, a small amount of genome instability is the driver of evolutionary change and adaptation, and thus a low level of instability is permitted in populations. While defects in genome maintenance almost invariably reduce fitness in the short term, they can create an environment where beneficial mutations are more likely to occur. The importance of this fact is clearest in the development of human cancer, where genome instability is a well-established enabling characteristic of carcinogenesis. This raises the crucial question: what are the cellular pathways that promote genome maintenance and what are their mechanisms? Work in model organisms, in particular the yeast Saccharomyces cerevisiae, has provided the global foundations of genome maintenance mechanisms in eukaryotes. The development of pioneering genomic tools in S. cerevisiae, such as the systematic creation of mutants in all nonessential and essential genes, has enabled whole-genome approaches to identifying genes with roles in genome maintenance. Here, we review the extensive whole-genome approaches taken in yeast, with an emphasis on functional genomic screens, to understand the genetic basis of genome instability, highlighting a range of genetic and cytological screening modalities. By revealing the biological pathways and processes regulating genome integrity, these analyses contribute to the systems-level map of the yeast cell and inform studies of human disease, especially cancer. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Enabling functional genomics with genome engineering.

PubMed

Hilton, Isaac B; Gersbach, Charles A

2015-10-01

Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. © 2015 Hilton and Gersbach; Published by Cold Spring Harbor Laboratory Press.
Genome Maps, a new generation genome browser.

PubMed

Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

2013-07-01

Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.
Genome Maps, a new generation genome browser

PubMed Central

Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

2013-01-01

Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. PMID:23748955
GenomeD3Plot: a library for rich, interactive visualizations of genomic data in web applications.

PubMed

Laird, Matthew R; Langille, Morgan G I; Brinkman, Fiona S L

2015-10-15

A simple static image of genomes and associated metadata is very limiting, as researchers expect rich, interactive tools similar to the web applications found in the post-Web 2.0 world. GenomeD3Plot is a lightweight visualization library written in JavaScript using the D3 library. GenomeD3Plot provides a rich API to allow the rapid visualization of complex genomic data using a convenient standards-based JSON configuration file. When integrated into existing web services, GenomeD3Plot allows researchers to interact with data, dynamically alter the view, or even resize or reposition the visualization in their browser window. In addition, GenomeD3Plot has built-in functionality to export any resulting genome visualization in PNG or SVG format for easy inclusion in manuscripts or presentations. GenomeD3Plot is being utilized in the recently released Islandviewer 3 (www.pathogenomics.sfu.ca/islandviewer/) to visualize predicted genomic islands with other genome annotation data. However, its features enable it to be more widely applicable for dynamic visualization of genomic data in general. GenomeD3Plot is licensed under the GNU-GPL v3 at https://github.com/brinkmanlab/GenomeD3Plot/. Contact: brinkman@sfu.ca. © The Author 2015. Published by Oxford University Press.
Homoeolog-specific transcriptional bias in allopolyploid wheat

PubMed Central

2010-01-01

Background: Interaction between parental genomes is accompanied by global changes in gene expression which, eventually, contribute to growth vigor and the broader phenotypic diversity of allopolyploid species. In order to gain a better understanding of the effects of allopolyploidization on the regulation of diverged gene networks, we performed a genome-wide analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat created by the hybridization of a tetraploid derivative of hexaploid wheat with the diploid ancestor of the wheat D genome Ae. tauschii. Results: Affymetrix wheat genome arrays were used for both the discovery of divergent homoeolog-specific mutations and analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat. More than 34,000 detectable parent-specific features (PSF) distributed across the wheat genome were used to assess AB genome (could not differentiate A and B genome contributions) and D genome parental expression in the allopolyploid transcriptome. In the re-synthesized polyploid, 81% of PSFs detected mid-parent levels of gene expression, and only 19% of PSFs showed evidence of non-additive expression. Non-additive expression in both AB and D genomes was strongly biased toward up-regulation of parental type of gene expression with only 6% and 11% of genes, respectively, being down-regulated. Of all the non-additive gene expression, 84% can be explained by differences in the parental genotypes used to make the allopolyploid. Homoeolog-specific co-regulation of several functional gene categories was found, particularly genes involved in photosynthesis and protein biosynthesis in wheat. Conclusions: Here, we have demonstrated that the establishment of interactions between the diverged regulatory networks in allopolyploids is accompanied by massive homoeolog-specific up- and down-regulation of gene expression. This study provides insights into interactions between homoeologous genomes and their role
GenomeGraphs: integrated genomic data visualization with R.

PubMed

Durinck, Steffen; Bullard, James; Spellman, Paul T; Dudoit, Sandrine

2009-01-06

Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses. We developed GenomeGraphs, as an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system. GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints.

PubMed

Glusman, Gustavo; Mauldin, Denise E; Hood, Leroy E; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into "genome fingerprints" via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics.
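The timing figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not the authors' code; it assumes only the numbers stated in the abstract (2504 genomes, 67 s in total, about 21 μs per comparison) and the standard count of unordered pairs, n(n-1)/2. The variable names are illustrative.

```python
from math import comb

# Figures taken from the abstract.
n_genomes = 2504
reported_total_s = 67.0          # high-quality all-against-all run
reported_per_comparison_s = 21e-6

# All-against-all on n genomes means one comparison per unordered pair.
pairs = comb(n_genomes, 2)

# Total time implied by the rounded per-comparison figure.
estimated_total_s = pairs * reported_per_comparison_s

# Per-comparison time implied by the reported total.
implied_per_comparison_us = reported_total_s / pairs * 1e6

print(f"pairwise comparisons: {pairs}")
print(f"total at 21 us each:  {estimated_total_s:.1f} s")
print(f"implied per-pair cost: {implied_per_comparison_us:.1f} us")
```

The two reported figures agree to within rounding: about 3.13 million comparisons at roughly 21 μs each comes to a little under 67 s.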
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints

PubMed Central

Glusman, Gustavo; Mauldin, Denise E.; Hood, Leroy E.; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into “genome fingerprints” via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics. PMID:29018478
The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

PubMed

Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

2013-02-01

Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.
Genome-wide analysis of short interspersed nuclear elements SINES revealed high sequence conservation, gene association and retrotranspositional activity in wheat

PubMed Central

Ben-David, Smadar; Yaakov, Beery; Kashkush, Khalil

2013-01-01

Short interspersed nuclear elements (SINEs) are non-autonomous non-LTR retroelements that are present in most eukaryotic species. While SINEs have been intensively investigated in humans and other animal systems, they are poorly studied in plants, especially in wheat (Triticum aestivum). We used quantitative PCR of various wheat species to determine the copy number of a wheat SINE family, termed Au SINE, combined with computer-assisted analyses of the publicly available 454 pyrosequencing database of T. aestivum. In addition, we utilized site-specific PCR on 57 Au SINE insertions, transposon methylation display and transposon display on newly formed wheat polyploids to assess retrotranspositional activity, epigenetic status and genetic rearrangements in Au SINE, respectively. We retrieved 3706 different insertions of Au SINE from the 454 pyrosequencing database of T. aestivum, and found that most of the elements are inserted in A/T-rich regions, while approximately 38% of the insertions are associated with transcribed regions, including known wheat genes. We observed typical retrotransposition of Au SINE in the second generation of a newly formed wheat allohexaploid, and massive hypermethylation in CCGG sites surrounding Au SINE in the third generation. Finally, we observed huge differences in the copy numbers in diploid Triticum and Aegilops species, and a significant increase in the copy numbers in natural wheat polyploids, but no significant increase in the copy number of Au SINE in the first four generations for two of three newly formed allopolyploid species used in this study. Our data indicate that SINEs may play a prominent role in the genomic evolution of wheat through stress-induced activation. PMID:23855320
Genome-wide analysis of short interspersed nuclear elements SINES revealed high sequence conservation, gene association and retrotranspositional activity in wheat.

PubMed

Ben-David, Smadar; Yaakov, Beery; Kashkush, Khalil

2013-10-01

Short interspersed nuclear elements (SINEs) are non-autonomous non-LTR retroelements that are present in most eukaryotic species. While SINEs have been intensively investigated in humans and other animal systems, they are poorly studied in plants, especially in wheat (Triticum aestivum). We used quantitative PCR of various wheat species to determine the copy number of a wheat SINE family, termed Au SINE, combined with computer-assisted analyses of the publicly available 454 pyrosequencing database of T. aestivum. In addition, we utilized site-specific PCR on 57 Au SINE insertions, transposon methylation display and transposon display on newly formed wheat polyploids to assess retrotranspositional activity, epigenetic status and genetic rearrangements in Au SINE, respectively. We retrieved 3706 different insertions of Au SINE from the 454 pyrosequencing database of T. aestivum, and found that most of the elements are inserted in A/T-rich regions, while approximately 38% of the insertions are associated with transcribed regions, including known wheat genes. We observed typical retrotransposition of Au SINE in the second generation of a newly formed wheat allohexaploid, and massive hypermethylation in CCGG sites surrounding Au SINE in the third generation. Finally, we observed huge differences in the copy numbers in diploid Triticum and Aegilops species, and a significant increase in the copy numbers in natural wheat polyploids, but no significant increase in the copy number of Au SINE in the first four generations for two of three newly formed allopolyploid species used in this study. Our data indicate that SINEs may play a prominent role in the genomic evolution of wheat through stress-induced activation. © 2013 Ben-Gurion University The Plant Journal © 2013 John Wiley & Sons Ltd.
Family genome browser: visualizing genomes with pedigree information.

PubMed

Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

2015-07-15

Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes at both the individual level and the variant level effectively, through integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaby, Ian K.; Blaby-Haas, Crysten E.

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE PAGES

Blaby, Ian K.; Blaby-Haas, Crysten E.

2017-03-21

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
WheatGenome.info: A Resource for Wheat Genomics Resource.

PubMed

Lai, Kaitao

2016-01-01

An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ .
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

PubMed Central

Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

2016-01-01

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564
Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

PubMed

Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

2016-01-01

One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. Copyright © 2016 Elsevier Ltd. All rights reserved.

Genome U-Plot: a whole genome visualization.

PubMed

Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

2018-05-15

The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. There are several diverse advantages regarding both these representations, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allow researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making apparent important features and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. Contact: vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.
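One of the readability metrics named in this abstract, the number of line crossings, can be illustrated with a short sketch. This is a generic computational-geometry example, not the Genome U-Plot implementation; the abstract does not say how the authors compute the metric. It assumes that links between genomic positions are drawn as straight 2D segments, and the function names and sample coordinates are hypothetical.

```python
from itertools import combinations

def orientation(p, q, r):
    """Return +1, -1 or 0 for the sign of the cross product (q - p) x (r - p)."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)

def segments_cross(a, b):
    """True when two segments properly cross, i.e. each one has its
    endpoints strictly on opposite sides of the other."""
    p1, p2 = a
    p3, p4 = b
    return (orientation(p1, p2, p3) * orientation(p1, p2, p4) < 0
            and orientation(p3, p4, p1) * orientation(p3, p4, p2) < 0)

def count_crossings(segments):
    """Count crossing pairs by checking every unordered pair (O(n^2))."""
    return sum(segments_cross(a, b) for a, b in combinations(segments, 2))

# Hypothetical layout: four links drawn as straight lines.
links = [
    ((0, 0), (4, 4)),   # crosses the next link at (2, 2)
    ((0, 4), (4, 0)),
    ((5, 0), (5, 4)),   # off to the side, crosses nothing
    ((0, 2), (1, 2)),   # short link that stops before reaching either diagonal
]
print(count_crossings(links))
```

A lower count for the same set of links under a different layout would indicate a less cluttered drawing. Touching endpoints and collinear overlaps are deliberately not counted here.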
Whole-exome/genome sequencing and genomics.

PubMed

Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

2013-12-01

As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.
A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

PubMed

Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

2016-12-01

The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
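The genome counts in this abstract imply how large the resource was before the expansion. The sketch below uses only the three numbers stated above and assumes that the 305 and 193 genomes are the only additions and do not overlap; the variable names are illustrative.

```python
# Counts taken from the abstract.
total_after_expansion = 1121
previously_unpublished = 305
from_recent_datasets = 193

# Assumption: the two groups are the only additions and are disjoint.
newly_added = previously_unpublished + from_recent_datasets
implied_previous_size = total_after_expansion - newly_added

print(f"newly added genomes:     {newly_added}")
print(f"implied earlier release: {implied_previous_size}")
```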
Visualization for genomics: the Microbial Genome Viewer.

PubMed

Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

2004-07-22

A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

PubMed

Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

2016-01-04

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Tapping the promise of genomics in species with complex, nonmodel genomes.

PubMed

Hirsch, Candice N; Buell, C Robin

2013-01-01

Genomics is enabling a renaissance in all disciplines of plant biology. However, many plant genomes are complex and remain recalcitrant to current genomic technologies. The complexities of these nonmodel plant genomes are attributable to gene and genome duplication, heterozygosity, ploidy, and/or repetitive sequences. Methods are available to simplify the genome and reduce these barriers, including inbreeding and genome reduction, making these species amenable to current sequencing and assembly methods. Some, but not all, of the complexities in nonmodel genomes can be bypassed by sequencing the transcriptome rather than the genome. Additionally, comparative genomics approaches, which leverage phylogenetic relatedness, can aid in the interpretation of complex genomes. Although there are limitations in accessing complex nonmodel plant genomes using current sequencing technologies, genome manipulation and resourceful analyses can allow access to even the most recalcitrant plant genomes.
Ensembl genomes 2016: more genomes, more complexity

USDA-ARS's Scientific Manuscript database

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent...
The Sequenced Angiosperm Genomes and Genome Databases.

PubMed

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.
The Sequenced Angiosperm Genomes and Genome Databases

PubMed Central

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology. PMID:29706973
GenColors-based comparative genome databases for small eukaryotic genomes.

PubMed

Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

2013-01-01

Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

PubMed

Manolio, Teri A

2016-10-01

Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. Published by Elsevier Ireland Ltd.
Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

PubMed

Machado, Henrique; Gram, Lone

2017-01-01

Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
The coffee genome hub: a resource for coffee genomes

PubMed Central

Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan

2015-01-01

The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. PMID:25392413
Genome size analyses of Pucciniales reveal the largest fungal genomes.

PubMed

Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

2014-01-01

Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.
Genome size analyses of Pucciniales reveal the largest fungal genomes

PubMed Central

Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G.; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T.; Loureiro, João; Talhinhas, Pedro

2014-01-01

Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research. PMID:25206357
Home - The Cancer Genome Atlas - Cancer Genome - TCGA

Cancer.gov

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.
Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

PubMed Central

Machado, Henrique; Gram, Lone

2017-01-01

Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms. PMID:28706512
Center for Cancer Genomics | Office of Cancer Genomics

Cancer.gov

The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approaches, CCG aims to accelerate structural, functional and computational research to explore cancer mechanisms, discover new cancer targets, and develop new therapeutics.
Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes.

PubMed

Janicki, Mateusz; Rooke, Rebecca; Yang, Guojun

2011-08-01

A major portion of most eukaryotic genomes are transposable elements (TEs). During evolution, TEs have introduced profound changes to genome size, structure, and function. As integral parts of genomes, the dynamic presence of TEs will continue to be a major force in reshaping genomes. Early computational analyses of TEs in genome sequences focused on filtering out "junk" sequences to facilitate gene annotation. When the high abundance and diversity of TEs in eukaryotic genomes were recognized, these early efforts transformed into the systematic genome-wide categorization and classification of TEs. The availability of genomic sequence data reversed the classical genetic approaches to discovering new TE families and superfamilies. Curated TE databases and their accurate annotation of genome sequences in turn facilitated the studies on TEs in a number of frontiers including: (1) TE-mediated changes of genome size and structure, (2) the influence of TEs on genome and gene functions, (3) TE regulation by host, (4) the evolution of TEs and their population dynamics, and (5) genomic scale studies of TE activity. Bioinformatics and genomic approaches have become an integral part of large-scale studies on TEs to extract information with pure in silico analyses or to assist wet lab experimental studies. The current revolution in genome sequencing technology facilitates further progress in the existing frontiers of research and emergence of new initiatives. The rapid generation of large-sequence datasets at record low costs on a routine basis is challenging the computing industry on storage capacity and manipulation speed and the bioinformatics community for improvement in algorithms and their implementations.
Comparative primate genomics: emerging patterns of genome content and dynamics

PubMed Central

Rogers, Jeffrey; Gibbs, Richard A.

2014-01-01

Preface: Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

Comparative primate genomics: emerging patterns of genome content and dynamics.

PubMed

Rogers, Jeffrey; Gibbs, Richard A

2014-05-01

Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future.
RPAN: rice pan-genome browser for ∼3000 rice genomes.

PubMed

Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

2017-01-25

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
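The definition above (a pan-genome as the union of per-accession gene sets, with presence/absence variations recorded per gene) can be illustrated with a minimal Python sketch. The accession and gene names are hypothetical toy data, and the helper functions `pan_and_core` and `pav_matrix` are illustrative only; they are not part of RPAN or the 3K RGP pipeline.

```python
from typing import Dict, Set, Tuple


def pan_and_core(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan-genome, core genome) for a mapping accession -> gene set.

    The pan-genome is the union of all gene sets; the core genome is the
    intersection (genes present in every accession).
    """
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)
    core = set.intersection(*all_sets) if all_sets else set()
    return pan, core


def pav_matrix(gene_sets: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """Build a presence/absence table: gene -> {accession: 1 or 0}."""
    pan, _ = pan_and_core(gene_sets)
    return {
        gene: {acc: int(gene in genes) for acc, genes in gene_sets.items()}
        for gene in sorted(pan)
    }


# Hypothetical toy data: three accessions with small gene sets.
accessions = {
    "acc1": {"geneA", "geneB", "geneC"},
    "acc2": {"geneA", "geneC", "geneD"},
    "acc3": {"geneA", "geneB", "geneD", "geneE"},
}

pan, core = pan_and_core(accessions)
pav = pav_matrix(accessions)
dispensable = pan - core  # genes showing presence/absence variation

print(sorted(pan))
print(sorted(core))
print(pav["geneE"])
```

In this toy case the only core gene is `geneA`; every other gene is absent from at least one accession and would therefore appear as a PAV.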
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome

USDA-ARS's Scientific Manuscript database

CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of human, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific t...
Genomics Portals: integrative web-platform for mining genomics data.

PubMed

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data

PubMed Central

2010-01-01

Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
The Perennial Ryegrass GenomeZipper: Targeted Use of Genome Resources for Comparative Grass Genomics

PubMed Central

Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F.X.; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

2013-01-01

Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species. PMID:23184232
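The in silico anchoring idea described above (placing an unmapped gene between mapped markers according to the order of its orthologs in a reference genome) can be sketched roughly as follows. This is not the GenomeZipper implementation: the function `anchor_gene`, the tuple layout, and the linear interpolation between flanking markers are assumptions made purely for illustration, and the marker data are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

# (ortholog order index in the reference genome,
#  linkage group in the target species,
#  genetic map position in cM in the target species)
Anchor = Tuple[int, str, float]


def anchor_gene(ref_index: int, anchors: List[Anchor]) -> Optional[Tuple[str, float]]:
    """Predict (linkage group, cM) for a gene whose ortholog sits at ref_index.

    The gene is placed by linear interpolation between the two mapped markers
    that flank it in reference order. Returns None when there is no marker on
    one side, or when the flanking markers lie on different linkage groups
    (a putative synteny break).
    """
    ordered = sorted(anchors)
    indices = [a[0] for a in ordered]
    i = bisect_left(indices, ref_index)

    # Exact hit: the gene's ortholog is itself a mapped marker.
    if i < len(ordered) and indices[i] == ref_index:
        return ordered[i][1], ordered[i][2]

    # Need one flanking marker on each side.
    if i == 0 or i == len(ordered):
        return None

    left, right = ordered[i - 1], ordered[i]
    if left[1] != right[1]:
        return None

    frac = (ref_index - left[0]) / (right[0] - left[0])
    return left[1], left[2] + frac * (right[2] - left[2])


# Hypothetical mapped markers.
markers = [(10, "LG1", 5.0), (20, "LG1", 15.0), (30, "LG2", 2.0)]

print(anchor_gene(15, markers))  # between two LG1 markers
print(anchor_gene(25, markers))  # flanked by LG1 and LG2 markers
```

Refusing to place genes across a linkage-group boundary is a deliberately conservative choice in this sketch; it mirrors the abstract's emphasis on unambiguous assignment, which was achieved for 3,315 of the 8,876 previously unmapped genes (about 37%).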
Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute’s genomic medicine portfolio

PubMed Central

Manolio, Teri A.

2016-01-01

Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual’s genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of “Genomic Medicine Meetings,” under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI’s genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. PMID:27612677
Comparative genome analysis in the integrated microbial genomes (IMG) system.

PubMed

Markowitz, Victor M; Kyrpides, Nikos C

2007-01-01

Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries, and occurrence profile tools for comparing genes, pathways, and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.
Comparing Mycobacterium tuberculosis genomes using genome topology networks.

PubMed

Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

2015-02-14

Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/ , and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. With this tool, many immunogenic-related and drug resistance-related genes
A universal genomic coordinate translator for comparative genomics

PubMed Central

2014-01-01

Background: Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases as N² with the number of available genomes, N. Results: Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of
A universal genomic coordinate translator for comparative genomics.

PubMed

Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

2014-06-30

Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases as N² with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
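The idea of translating coordinates indirectly through a graph of pairwise maps, rather than through all-to-all alignments, can be illustrated with a toy sketch. This is not Kraken's implementation: the genome names, the block format (source start, source end, target start) and the function names below are hypothetical, strand is ignored, and each genome pair has a single ungapped block.

```python
from collections import deque

# Hypothetical toy synteny maps: (source, target) -> list of
# (src_start, src_end, tgt_start) blocks. No A->C map is provided.
PAIRWISE = {
    ("A", "B"): [(0, 100, 1000)],
    ("B", "C"): [(1000, 1100, 50)],
}

def translate(pos, blocks):
    """Map a position through one pairwise map; None if no block covers it."""
    for src_start, src_end, tgt_start in blocks:
        if src_start <= pos < src_end:
            return tgt_start + (pos - src_start)
    return None

def translate_via_graph(pos, source, target):
    """Breadth-first search over the pairwise maps, translating at each hop."""
    queue = deque([(source, pos)])
    seen = {source}
    while queue:
        genome, p = queue.popleft()
        if genome == target:
            return p
        for (src, tgt), blocks in PAIRWISE.items():
            if src == genome and tgt not in seen:
                q = translate(p, blocks)
                if q is not None:
                    seen.add(tgt)
                    queue.append((tgt, q))
    return None
```

With these toy maps, a position in genome A reaches genome C by way of B, so only N-1 pairwise maps are needed to connect N genomes instead of one map per genome pair.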
Microbial genomic taxonomy

PubMed Central

2013-01-01

A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, <10 in Karlin genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups. PMID:24365132
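The numeric criteria listed in this abstract can be written as a single check. The sketch below is only an illustration of those thresholds; the function and parameter names are hypothetical, and it assumes all percentages are given on a 0-100 scale.

```python
def same_species(ani, aai, mla_identity, karlin, ggdh):
    """Return True if a strain pair meets every threshold quoted in the abstract.

    ani, aai, mla_identity and ggdh are percentages (0-100);
    karlin is the Karlin genomic signature value.
    """
    return (
        ani > 95.0             # Average Nucleotide Identity
        and aai > 95.0         # Average Amino Acid Identity
        and mla_identity > 95.0  # identity over multiple alignment genes
        and karlin < 10.0      # Karlin genomic signature
        and ggdh > 70.0        # in silico genome-to-genome hybridization
    )
```

A pair that fails any one criterion is rejected, which mirrors the abstract's wording that strains of the same species share all of these properties.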
Integrated genome browser: visual analytics platform for genomics.

PubMed

Freese, Nowlan H; Norris, David C; Loraine, Ann E

2016-07-15

Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB's ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu. © The Author 2016. Published by Oxford University Press.
PlantRGDB: A Database of Plant Retrocopied Genes.

PubMed

Wang, Yi

2017-01-01

RNA-based gene duplication, known as retrocopy, plays important roles in gene origination and genome evolution. The genomes of many plants have been sequenced, offering an opportunity to annotate and mine the retrocopies in plant genomes. However, comprehensive and unified annotation of retrocopies in these plants is still lacking. In this study I constructed the PlantRGDB (Plant Retrocopied Gene DataBase), the first database of plant retrocopies, to provide a putatively complete centralized list of retrocopies in plant genomes. The database is freely accessible at http://probes.pw.usda.gov/plantrgdb or http://aegilops.wheat.ucdavis.edu/plantrgdb. It currently integrates 49 plant species and 38,997 retrocopies along with characterization information. PlantRGDB provides a user-friendly web interface for searching, browsing and downloading the retrocopies in the database. PlantRGDB also offers graphical viewer-integrated sequence information for displaying the structure of each retrocopy. The attributes of the retrocopies of each species are reported using a browse function. In addition, useful tools, such as an advanced search and BLAST, are available to search the database more conveniently. In conclusion, the database will provide a web platform for obtaining valuable insight into the generation of retrocopies and will supplement research on gene duplication and genome evolution in plants. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Genome build information is an essential part of genomic track files.

PubMed

Kanduri, Chakravarthi; Domanska, Diana; Hovig, Eivind; Sandve, Geir Kjetil

2017-09-14

Genomic locations are represented as coordinates on a specific genome build version, but the build information is frequently missing when coordinates are provided. We show that this information is essential to correctly interpret and analyse the genomic intervals contained in genomic track files. Although not a substitute for best practices, we also provide a tool to predict the genome build version of genomic track files.
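One simple way to see why coordinates alone can hint at a build is to test whether every interval fits inside the chromosome lengths of each candidate build. The abstract does not describe how the authors' tool works, so the sketch below is an assumed heuristic only; the build names and chromosome lengths are made up for illustration.

```python
# Hypothetical chromosome lengths for two made-up builds (not real assemblies).
BUILDS = {
    "buildA": {"chr1": 1000, "chr2": 800},
    "buildB": {"chr1": 900, "chr2": 850},
}

def compatible_builds(intervals):
    """Return the builds in which every (chrom, start, end) interval fits."""
    result = []
    for name, lengths in BUILDS.items():
        if all(
            chrom in lengths and 0 <= start < end <= lengths[chrom]
            for chrom, start, end in intervals
        ):
            result.append(name)
    return result
```

An interval near a chromosome end can rule out a build, while intervals far from the ends are compatible with several builds, which is why explicit build metadata remains necessary.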
Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.

2005-08-26

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
Cytogenetic and molecular identification of three Triticum aestivum-Leymus racemosus translocation addition lines.

PubMed

Wang, Le; Yuan, Jianhua; Bie, Tongde; Zhou, Bo; Chen, Peidu

2009-06-01

Chromosome 2C from Aegilops cylindrica has the ability to induce chromosome breakage in common wheat (Triticum aestivum). In the BC(1)F(3) generation of the T. aestivum cv. Chinese Spring and a hybrid between T. aestivum-Leymus racemosus Lr.7 addition line and T. aestivum-Ae. cylindrica 2C addition line, three disomic translocation addition lines (2n = 44) were selected by mitotic chromosome C-banding and genomic in situ hybridization. We further characterized these T. aestivum-L. racemosus translocation addition lines, NAU636, NAU637 and NAU638, by chromosome C-banding, in situ hybridization using the A- and D-genome-specific bacterial artificial chromosome (BAC) clones 676D4 and 9M13; plasmids pAs1 and pSc119.2, and 45S rDNA; as well as genomic DNA of L. racemosus as probes, in combination with double ditelosomic test cross and SSR marker analysis. The translocation chromosomes were designated as T3AS-Lr7S, T6BS-Lr7S, and T5DS-Lr7L. The translocation line T3AS-Lr7S was highly resistant to Fusarium head blight and will be useful germplasm for resistance breeding.
Preferential elimination of chromosome 1D from homoeologous group-1 alien addition lines in hexaploid wheat.

PubMed

Garg, Monika; Elamein, Hala M M; Tanaka, Hiroyuki; Tsujimoto, Hisashi

2007-10-01

Alien chromosome addition lines are useful genetic material for studying the effect of an individual chromosome in the same genetic background. However, addition lines are sometimes unstable and tend to lose the alien chromosome in subsequent generations. In this study, we report preferential removal of chromosome 1D rather than the alien chromosome from homoeologous group-1 addition lines. The Agropyron intermedium chromosome 1Agi (1E) addition line, created in the background of 'Vilmorin 27', showed loss of a part of chromosome 1D, thereby losing its HMW glutenin locus. Even in the case of Aegilops longissima and Ae. peregrina, the genomes of which are closer to the B genome than D genome, chromosome 1D was lost from chromosome 1Sl and 1Sv addition lines in cv. 'Chinese Spring' rather than chromosome 1B during transfer from one generation to another. A similar result was also observed in the case of a chromosome 1E disomic addition line of Ag. elongatum and alloplasmic common wheat line with Ag. intermedium ssp. trichophorum cytoplasm. The reason for this strange observation is thought to lie in the history of wheat evolution, the size of chromosome 1D compared to 1A and 1B, or differing pollen competition abilities.
Goodbye genome paper, hello genome report: the increasing popularity of 'genome announcements' and their impact on science.

PubMed

Smith, David Roy

2017-05-01

Next-generation sequencing technologies have revolutionized genomics and altered the scientific publication landscape. Life-science journals abound with genome papers: peer-reviewed descriptions of newly sequenced chromosomes. Although they once filled the pages of Nature and Science, genome papers are now mostly relegated to journals with low impact factors. Some have forecast the death of the genome paper and argued that they are using up valuable resources and not advancing science. However, the publication rate of genome papers is on the rise. This increase is largely because some journals have created a new category of manuscript called genome reports, which are short, fast-tracked papers describing a chromosome sequence(s), its GenBank accession number and little else. In 2015, for example, more than 2000 genome reports were published, and 2016 is poised to bring even more. Here, I highlight the growing popularity of genome reports and discuss their merits, drawbacks and impact on science and the academic publication infrastructure. Genome reports can be excellent assets for the research community, but they are also being used as quick and easy routes to a publication, and in some instances they are not peer reviewed. One of the best arguments for genome reports is that they are a citable, user-generated genomic resource providing essential methodological and biological information, which may not be present in the sequence database. But they are expensive and time-consuming avenues for achieving such a goal. © The Author 2016. Published by Oxford University Press.
Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

PubMed

Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

2014-01-01

Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

Application of resequencing to rice genomics, functional genomics and evolutionary analysis

PubMed Central

2014-01-01

Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357
OryzaGenome: Genome Diversity Database of Wild Oryza Species.

PubMed

Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi-Xuan; Han, Bin; Kurata, Nori

2016-01-01

The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
GenomeVIP: a cloud platform for genomic variant discovery and interpretation

PubMed Central

Mashl, R. Jay; Scott, Adam D.; Huang, Kuan-lin; Wyczalkowski, Matthew A.; Yoon, Christopher J.; Niu, Beifang; DeNardo, Erin; Yellapantula, Venkata D.; Handsaker, Robert E.; Chen, Ken; Koboldt, Daniel C.; Ye, Kai; Fenyö, David; Raphael, Benjamin J.; Wendl, Michael C.; Ding, Li

2017-01-01

Identifying genomic variants is a fundamental first step toward the understanding of the role of inherited and acquired variation in disease. The accelerating growth in the corpus of sequencing data that underpins such analysis is making the data-download bottleneck more evident, placing substantial burdens on the research community to keep pace. As a result, the search for alternative approaches to the traditional “download and analyze” paradigm on local computing resources has led to a rapidly growing demand for cloud-computing solutions for genomics analysis. Here, we introduce the Genome Variant Investigation Platform (GenomeVIP), an open-source framework for performing genomics variant discovery and annotation using cloud- or local high-performance computing infrastructure. GenomeVIP orchestrates the analysis of whole-genome and exome sequence data using a set of robust and popular task-specific tools, including VarScan, GATK, Pindel, BreakDancer, Strelka, and Genome STRiP, through a web interface. GenomeVIP has been used for genomic analysis in large-data projects such as the TCGA PanCanAtlas and in other projects, such as the ICGC Pilots, CPTAC, ICGC-TCGA DREAM Challenges, and the 1000 Genomes SV Project. Here, we demonstrate GenomeVIP's ability to provide high-confidence annotated somatic, germline, and de novo variants of potential biological significance using publicly available data sets. PMID:28522612
Pre-genomic, genomic and post-genomic study of microbial communities involved in bioenergy.

PubMed

Rittmann, Bruce E; Krajmalnik-Brown, Rosa; Halden, Rolf U

2008-08-01

Microorganisms can produce renewable energy in large quantities and without damaging the environment or disrupting food supply. The microbial communities must be robust and self-stabilizing, and their essential syntrophies must be managed. Pre-genomic, genomic and post-genomic tools can provide crucial information about the structure and function of these microbial communities. Applying these tools will help accelerate the rate at which microbial bioenergy processes move from intriguing science to real-world practice.
Decoding the genome beyond sequencing: the new phase of genomic research.

PubMed

Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J

2011-10-01

While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems; Genes code parts like protein and RNA, but the genome codes the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory which calls for the departure of gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing. Copyright © 2011 Elsevier Inc. All rights reserved.
Genome Surfing As Driver of Microbial Genomic Diversity.

PubMed

Choudoir, Mallory J; Panke-Buisse, Kevin; Andam, Cheryl P; Buckley, Daniel H

2017-08-01

Historical changes in population size, such as those caused by demographic range expansions, can produce nonadaptive changes in genomic diversity through mechanisms such as gene surfing. We propose that demographic range expansion of a microbial population capable of horizontal gene exchange can result in genome surfing, a mechanism that can cause widespread increase in the pan-genome frequency of genes acquired by horizontal gene exchange. We explain that patterns of genetic diversity within Streptomyces are consistent with genome surfing, and we describe several predictions for testing this hypothesis both in Streptomyces and in other microorganisms. Copyright © 2017 Elsevier Ltd. All rights reserved.
The Genomic HyperBrowser: an analysis web server for genome-scale data

PubMed Central

Sandve, Geir K.; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K.; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalaš, Matúš; Lien, Tonje; Rye, Morten B.; Frigessi, Arnoldo; Hovig, Eivind

2013-01-01

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens for a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome. PMID:23632163
The Genomic HyperBrowser: an analysis web server for genome-scale data.

PubMed

Sandve, Geir K; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalas, Matús; Lien, Tonje; Rye, Morten B; Frigessi, Arnoldo; Hovig, Eivind

2013-07-01

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens for a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.
Fungal Genomics Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.
Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Muchero, Wellington; Labbe, Jessy L; Priya, Ranjan

2014-01-01

To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large scale SNP discovery and identification of large scale polymorphisms are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized-traits.
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome.

PubMed

Wang, Yi; Liu, Xianju; Ren, Chong; Zhong, Gan-Yuan; Yang, Long; Li, Shaohua; Liang, Zhenchang

2016-04-21

CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of humans, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific target sites for CRISPR/Cas9 have been computationally identified for several annual model and crop species, but such sites have not been reported for perennial, woody fruit species. In this study, we identified and characterized five types of CRISPR/Cas9 target sites in the widely cultivated grape species Vitis vinifera and developed a user-friendly database for editing grape genomes in the future. A total of 35,767,960 potential CRISPR/Cas9 target sites were identified from grape genomes in this study. Among them, 22,597,817 target sites were mapped to specific genomic locations and 7,269,788 were found to be highly specific. Protospacers and PAMs were found to distribute uniformly and abundantly in the grape genomes. They were present in all the structural elements of genes with the coding region having the highest abundance. Five PAM types, TGG, AGG, GGG, CGG and NGG, were observed. With the exception of the NGG type, they were abundantly present in the grape genomes. Synteny analysis of similar genes revealed that the synteny of protospacers matched the synteny of homologous genes. A user-friendly database containing protospacers and detailed information of the sites was developed and is available for public use at the Grape-CRISPR website ( http://biodb.sdau.edu.cn/gc/index.html ). Grape genomes harbour millions of potential CRISPR/Cas9 target sites. These sites are widely distributed among and within chromosomes with predominant abundance in the coding regions of genes. We developed a publicly-accessible Grape-CRISPR database for facilitating the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes. Among
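The kind of scan that enumerates candidate sites can be sketched as a search for a 20-nt protospacer followed by an NGG PAM on both strands. This is a minimal illustration, not the authors' pipeline: the 20-nt protospacer length, the function names and the output tuple layout are assumptions, and no specificity filtering or genome mapping is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_targets(seq):
    """List (strand, start, protospacer, pam) for every 20-nt + NGG site.

    For the '-' strand, start is given in reverse-complement coordinates.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A zero-width lookahead lets overlapping sites be reported.
        for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits
```

Grouping hits by the `pam` field would separate the TGG, AGG, GGG and CGG types named in the abstract; sites with an ambiguous base in the PAM are skipped by this pattern.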
Genome projects and the functional-genomic era.

PubMed

Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

2005-12-01

The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.
GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

PubMed

Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

2017-01-25

Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and paired-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.
Genomes as geography: using GIS technology to build interactive genome feature maps

PubMed Central

Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J

2006-01-01

Background: Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results: We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. Conclusion: Mapping tools developed specifically for geographic data can be
Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

PubMed Central

Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

2015-01-01

Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486
Dramatic improvement in genome assembly achieved using doubled-haploid genomes.

PubMed

Zhang, Hong; Tan, Engkong; Suzuki, Yutaka; Hirose, Yusuke; Kinoshita, Shigeharu; Okano, Hideyuki; Kudoh, Jun; Shimizu, Atsushi; Saito, Kazuyoshi; Watabe, Shugo; Asakawa, Shuichi

2014-10-27

Improvement in de novo assembly of large genomes is still to be desired. Here, we improved draft genome sequence quality by employing doubled-haploid individuals. We sequenced wildtype and doubled-haploid Takifugu rubripes genomes, under the same conditions, using the Illumina platform and assembled contigs with SOAPdenovo2. We observed 5.4-fold and 2.6-fold improvement in the sizes of the N50 contig and scaffold of doubled-haploid individuals, respectively, compared to the wildtype, indicating that the use of a doubled-haploid genome aids in accurate genome analysis.
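For readers unfamiliar with the N50 statistic quoted above, a minimal Python sketch follows. It uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length); the function name and the toy lengths in the usage note are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated; the
    N50 is the length at which the running sum first reaches at least
    half of the total assembled length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 2 >= total:
            return length
```

For example, `n50([2, 3, 4, 5, 6, 7, 8, 9, 10])` returns 8, because 10 + 9 + 8 = 27 is exactly half of the total of 54. A fold improvement such as the one reported is then simply the ratio of two N50 values computed this way.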
WheatGenome.info: an integrated database and portal for wheat genome information.

PubMed

Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David

2012-02-01

Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species.

PubMed

Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

2008-06-23

The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the
Personal genomics services: whose genomes?

PubMed

Gurwitz, David; Bregman-Eschet, Yael

2009-07-01

New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below.
Insights into structural variations and genome rearrangements in prokaryotic genomes.

PubMed

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Schistosoma comparative genomics: integrating genome structure, parasite biology and anthelmintic discovery

PubMed Central

Swain, Martin T.; Larkin, Denis M.; Caffrey, Conor R.; Davies, Stephen J.; Loukas, Alex; Skelly, Patrick J.; Hoffmann, Karl F.

2011-01-01

Schistosoma genomes provide a comprehensive resource for identifying the molecular processes that shape parasite evolution and for discovering novel chemotherapeutic or immunoprophylactic targets. Here, we demonstrate how intra- and intergenus comparative genomics can be used to drive these investigations forward, illustrate the advantages and limitations of these approaches and review how post genomic technologies offer complementary strategies for genome characterisation. While sequencing and functional characterisation of other schistosome/platyhelminth genomes continues to expedite anthelmintic discovery, we contend that future priorities should equally focus on improving assembly quality, and chromosomal assignment, of existing schistosome/platyhelminth genomes. PMID:22024648
Short and long-term genome stability analysis of prokaryotic genomes.

PubMed

Brilli, Matteo; Liò, Pietro; Lacroix, Vincent; Sagot, Marie-France

2013-05-08

Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often enables the characterization of pathogens. There is therefore a strong interest in understanding the variability of this trait and the possible correlations with life-style. Two kinds of events affect genome organization: on one hand translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but they alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. We developed an approach where we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In the first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species allowed us to identify genomes with diverging gene order with respect to their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In the second part of the text, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. In this work, we studied genome organization divergence taking into account the contribution of both genome order rearrangements and genome content dynamics. By studying species with multiple sequenced genomes available, we were
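To make the idea of backbone gene order conservation concrete, here is a minimal Python sketch. It is not one of the authors' stability estimators: it is a simple stand-in that restricts two gene orders to their shared genes (the backbone) and reports the fraction of neighbouring gene pairs in the first genome that are also neighbours in the second. The function name, the treatment of chromosomes as linear and unsigned, and the assumption that each gene occurs once per genome are all choices made for this example.

```python
def backbone_adjacency_conservation(order_a, order_b):
    """Fraction of backbone adjacencies of genome A conserved in genome B.

    `order_a` and `order_b` are lists of gene identifiers in chromosomal
    order; each identifier is assumed to occur at most once per genome.
    Genes present in only one genome (insertions or deletions) are
    dropped first, so only rearrangements of shared genes are scored.
    """
    shared = set(order_a) & set(order_b)
    backbone_a = [gene for gene in order_a if gene in shared]
    backbone_b = [gene for gene in order_b if gene in shared]
    if len(backbone_a) < 2:
        return 1.0  # no adjacencies to break
    # Unordered pairs, so an inverted neighbour pair still counts as adjacent.
    adjacent_in_b = {frozenset(pair) for pair in zip(backbone_b, backbone_b[1:])}
    pairs_in_a = [frozenset(pair) for pair in zip(backbone_a, backbone_a[1:])]
    conserved = sum(1 for pair in pairs_in_a if pair in adjacent_in_b)
    return conserved / len(pairs_in_a)
```

With `order_a = ["g1", "g2", "g3", "g4", "g5"]` and `order_b = ["g1", "g2", "x", "g4", "g3", "g5"]`, the inserted gene `x` is ignored and two of the four backbone adjacencies survive, giving 0.5. Scoring the adjacencies without first dropping unshared genes would instead capture the neighbourhood-breaking effect of insertions and deletions that the abstract treats as the second kind of event.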
Rapid construction of genome map for large yellow croaker (Larimichthys crocea) by the whole-genome mapping in BioNano Genomics Irys system.

PubMed

Xiao, Shijun; Li, Jiongtang; Ma, Fengshou; Fang, Lujing; Xu, Shuangbin; Chen, Wei; Wang, Zhi Yong

2015-09-03

Large yellow croaker (Larimichthys crocea) is an important commercial fish in China and East-Asia. The annual production of the species from the aqua-farming industry is about 90 thousand tons. In spite of its economic importance, genetic studies of economic traits and genomic selections of the species are hindered by the lack of genomic resources. Specifically, a whole-genome physical map of large yellow croaker is still missing. The traditional BAC-based fingerprint method is extremely time- and labour-consuming. Here we report the first genome map construction using the high-throughput whole-genome mapping technique by nanochannel arrays in the BioNano Genomics Irys system. For an optimal marker density of ~10 per 100 kb, the nicking endonuclease Nt.BspQ1 was chosen for the genome map generation. 645,305 DNA molecules with a total length of ~112 Gb were labelled and detected, covering more than 160X of the large yellow croaker genome. Employing the IrysView package and signature patterns in raw DNA molecules, a whole-genome map of large yellow croaker was assembled into 686 maps with a total length of 727 Mb, which was consistent with the estimated genome size. The N50 length of the whole-genome map, including 126 maps, was up to 1.7 Mb. The excellent hybrid alignment with large yellow croaker draft genome validated the consensus genome map assembly and highlighted a promising application of whole-genome mapping on draft genome sequence super-scaffolding. The genome map data of large yellow croaker are accessible on lycgenomics.jmu.edu.cn/pm. Using the state-of-the-art whole-genome mapping technique in the Irys system, the first whole-genome map for large yellow croaker has been constructed and thus greatly facilitates the ongoing genomic and evolutionary studies for the species. To our knowledge, this is the first public report on genome map construction by the whole-genome mapping for aquatic organisms. Our study demonstrates a promising application of the whole-genome
Translational genomics for plant breeding with the genome sequence explosion.

PubMed

Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha

2016-04-01

The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Deep whole-genome sequencing of 90 Han Chinese genomes.

PubMed

Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

2017-09-01

Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the
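The low-frequency definition used above (minor allele frequency < 5%) can be sketched in a few lines of Python. This is only an illustration of the arithmetic for a single biallelic site in diploid samples, not the study's variant-calling workflow; the function names and the encoding of genotypes as alternate-allele counts (0, 1 or 2, with `None` for a missing call) are assumptions made for this example.

```python
def minor_allele_frequency(alt_counts):
    """Minor allele frequency at one biallelic site in diploid samples.

    `alt_counts` holds one entry per individual: the number of
    alternate alleles carried (0, 1 or 2), or None if the genotype
    could not be called. Missing calls are excluded from the total.
    """
    called = [count for count in alt_counts if count is not None]
    if not called:
        raise ValueError("no called genotypes at this site")
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def is_low_frequency(alt_counts, threshold=0.05):
    """True if the site is polymorphic and its MAF is below `threshold`."""
    maf = minor_allele_frequency(alt_counts)
    return 0.0 < maf < threshold
```

In a panel of 90 individuals, a variant carried by three heterozygotes has a frequency of 3/180, about 1.7%, and so falls in the low-frequency class under this definition.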
Whole-genome alignment.

PubMed

Dewey, Colin N

2012-01-01

Whole-genome alignment (WGA) is the prediction of evolutionary relationships at the nucleotide level between two or more genomes. It combines aspects of both colinear sequence alignment and gene orthology prediction, and is typically more challenging to address than either of these tasks due to the size and complexity of whole genomes. Despite the difficulty of this problem, numerous methods have been developed for its solution because WGAs are valuable for genome-wide analyses, such as phylogenetic inference, genome annotation, and function prediction. In this chapter, we discuss the meaning and significance of WGA and present an overview of the methods that address it. We also examine the problem of evaluating whole-genome aligners and offer a set of methodological challenges that need to be tackled in order to make the most effective use of our rapidly growing databases of whole genomes.
Genomic Data Commons and Genomic Cloud Pilots - Google Hangout

Cancer.gov

Join us for a live, moderated discussion about two NCI efforts to expand access to cancer genomics data: the Genomic Data Commons and Genomic Cloud Pilots. NCI subject matter experts will include Louis M. Staudt, M.D., Ph.D., Director, Center for Cancer Genomics, and Warren Kibbe, Ph.D., Director, NCI Center for Biomedical Informatics and Information Technology. The discussion will be moderated by Anthony Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics and Information Technology. We welcome your questions before and during the Hangout on Twitter using the hashtag #AskNCI.
Structural Genomics: Correlation Blocks, Population Structure, and Genome Architecture

PubMed Central

Hu, Xin-Sheng; Yeh, Francis C.; Wang, Zhiquan

2011-01-01

An integration of the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into the genomic evolution in natural or artificial populations. Here, we assess the inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly defined as a DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations. Effects of evolutionary forces (selection, migration, genetic drift, and mutation) on the pattern of genome-wide correlation blocks are discussed. We briefly discuss the associations between the pattern of correlation blocks and genome assembly features in eukaryotic organisms, including the impacts of multigene families, the perturbation of transposable elements, and the repetitive nongenic sequences and GC-rich isochores. Our review suggests that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying the genomic evolution at the population level. PMID:21886455
The genomes and comparative genomics of Lactobacillus delbrueckii phages.

PubMed

Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

2011-07-01

Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.
Evolution of genome size and genomic GC content in carnivorous holokinetics (Droseraceae).

PubMed

Veleba, Adam; Šmarda, Petr; Zedek, František; Horová, Lucie; Šmerda, Jakub; Bureš, Petr

2017-02-01

Studies in the carnivorous family Lentibulariaceae in the last years resulted in the discovery of the smallest plant genomes and an unusual pattern of genomic GC content evolution. However, scarcity of genomic data in other carnivorous clades still prevents a generalization of the observed patterns. Here the aim was to fill this gap by mapping genome evolution in the second largest carnivorous family, Droseraceae, where this evolution may be affected by chromosomal holokinetism in Drosera. Methods: The genome size and genomic GC content of 71 Droseraceae species were measured by flow cytometry. A dated phylogeny was constructed, and the evolution of both genomic parameters and their relationship to species climatic niches were tested using phylogeny-based statistics. The 2C genome size of Droseraceae varied between 488 and 10 927 Mbp, and the GC content ranged between 37.1 and 44.7%. The genome sizes and genomic GC content of carnivorous and holocentric species did not differ from those of their non-carnivorous and monocentric relatives. The genomic GC content positively correlated with genome size and annual temperature fluctuations. The genome size and chromosome numbers were inversely correlated in the Australian clade of Drosera. Conclusions: Our results indicate that neither carnivory (nutrient scarcity) nor the holokinetism have a prominent effect on size and DNA base composition of Droseraceae genomes. However, the holokinetic drive seems to affect karyotype evolution in one of the major clades of Drosera. Our survey confirmed that the evolution of GC content is tightly connected with the evolution of genome size and also with environmental conditions. © The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Rewriting the blueprint of life by synthetic genomics and genome engineering.

PubMed

Annaluru, Narayana; Ramalingam, Sivaprakash; Chandrasegaran, Srinivasan

2015-06-16

Advances in DNA synthesis and assembly methods over the past decade have made it possible to construct genome-size fragments from oligonucleotides. Early work focused on synthesis of small viral genomes, followed by hierarchical synthesis of wild-type bacterial genomes and subsequently on transplantation of synthesized bacterial genomes into closely related recipient strains. More recently, a synthetic designer version of yeast Saccharomyces cerevisiae chromosome III has been generated, with numerous changes from the wild-type sequence without having an impact on cell fitness and phenotype, suggesting plasticity of the yeast genome. A project to generate the first synthetic yeast genome--the Sc2.0 Project--is currently underway.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.

PubMed

Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

2017-01-04

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Brief Guide to Genomics: DNA, Genes and Genomes

MedlinePlus

... Sheets A Brief Guide to Genomics About NHGRI Research About the International HapMap Project Biological Pathways Chromosome Abnormalities Chromosomes Cloning Comparative Genomics DNA Microarray Technology DNA Sequencing Deoxyribonucleic Acid ( ...
Novel genomes and genome constitutions identified by GISH and 5S rDNA and knotted1 genomic sequences in the genus Setaria.

PubMed

Zhao, Meicheng; Zhi, Hui; Doust, Andrew N; Li, Wei; Wang, Yongfang; Li, Haiquan; Jia, Guanqing; Wang, Yongqiang; Zhang, Ning; Diao, Xianmin

2013-04-11

The Setaria genus is increasingly of interest to researchers, as its two species, S. viridis and S. italica, are being developed as models for understanding C4 photosynthesis and plant functional genomics. The genome constitution of Setaria species has been studied in the diploid species S. viridis, S. adhaerans and S. grisebachii, where three genomes A, B and C were identified respectively. Two allotetraploid species, S. verticillata and S. faberi, were found to have AABB genomes, and one autotetraploid species, S. queenslandica, with an AAAA genome, has also been identified. The genomes and genome constitutions of most other species remain unknown, even though it was thought there are approximately 125 species in the genus distributed world-wide. GISH was performed to detect the genome constitutions of Eurasia species of S. glauca, S. plicata, and S. arenaria, with the known A, B and C genomes as probes. No or very poor hybridization signal was detected indicating that their genomes are different from those already described. GISH was also performed reciprocally between S. glauca, S. plicata, and S. arenaria genomes, but no hybridization signals between each other were found. The two sets of chromosomes of S. lachnea both hybridized strong signals with only the known C genome of S. grisebachii. Chromosomes of Qing 9, an accession formerly considered as S. viridis, hybridized strong signal only to B genome of S. adherans. Phylogenetic trees constructed with 5S rDNA and knotted1 markers, clearly classify the samples in this study into six clusters, matching the GISH results, and suggesting that the F genome of S. arenaria is basal in the genus. Three novel genomes in the Setaria genus were identified and designated as genome D (S. glauca), E (S. plicata) and F (S. arenaria) respectively. The genome constitution of tetraploid S. lachnea is putatively CCC'C'. Qing 9 is a B genome species indigenous to China and is hypothesized to be a newly identified species. The
Novel genomes and genome constitutions identified by GISH and 5S rDNA and knotted1 genomic sequences in the genus Setaria

PubMed Central

2013-01-01

Background The Setaria genus is increasingly of interest to researchers, as its two species, S. viridis and S. italica, are being developed as models for understanding C4 photosynthesis and plant functional genomics. The genome constitution of Setaria species has been studied in the diploid species S. viridis, S. adhaerans and S. grisebachii, where three genomes A, B and C were identified respectively. Two allotetraploid species, S. verticillata and S. faberi, were found to have AABB genomes, and one autotetraploid species, S. queenslandica, with an AAAA genome, has also been identified. The genomes and genome constitutions of most other species remain unknown, even though it was thought there are approximately 125 species in the genus distributed world-wide. Results GISH was performed to detect the genome constitutions of Eurasia species of S. glauca, S. plicata, and S. arenaria, with the known A, B and C genomes as probes. No or very poor hybridization signal was detected indicating that their genomes are different from those already described. GISH was also performed reciprocally between S. glauca, S. plicata, and S. arenaria genomes, but no hybridization signals between each other were found. The two sets of chromosomes of S. lachnea both hybridized strong signals with only the known C genome of S. grisebachii. Chromosomes of Qing 9, an accession formerly considered as S. viridis, hybridized strong signal only to B genome of S. adherans. Phylogenetic trees constructed with 5S rDNA and knotted1 markers, clearly classify the samples in this study into six clusters, matching the GISH results, and suggesting that the F genome of S. arenaria is basal in the genus. Conclusions Three novel genomes in the Setaria genus were identified and designated as genome D (S. glauca), E (S. plicata) and F (S. arenaria) respectively. The genome constitution of tetraploid S. lachnea is putatively CCC’C’. Qing 9 is a B genome species indigenous to China and is hypothesized to be
Ensembl Genomes 2013: scaling up access to genome-wide data

USDA-ARS?s Scientific Manuscript database

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...
Bacterial Genome Instability

PubMed Central

Darmon, Elise

2014-01-01

SUMMARY Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease. PMID:24600039
Genome-derived vaccines.

PubMed

De Groot, Anne S; Rappuoli, Rino

2004-02-01

Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed of novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed of genome-predicted, assembled and engineered T- and B-cell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.
Comparative genomics of Lactobacillus

PubMed Central

Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.

2011-01-01

Summary The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
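The pan genome, core genome and signature group genes described in this record can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each genome has already been reduced to a set of ortholog-group identifiers, and the function names and toy data are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan, core) for a dict mapping strain name -> iterable of
    ortholog-group IDs. Pan = union of all sets, core = intersection."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def signature_genes(genomes, group):
    """Ortholog groups present in every genome of `group` and absent
    from every genome outside it."""
    inside = [set(g) for name, g in genomes.items() if name in group]
    outside = [set(g) for name, g in genomes.items() if name not in group]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    return shared - set().union(*outside)


# Toy input (hypothetical strain names and ortholog-group IDs).
toy = {
    "strain1": {"og_a", "og_b", "og_c"},
    "strain2": {"og_a", "og_b", "og_d"},
    "strain3": {"og_a", "og_e"},
}
pan, core = pan_and_core(toy)
signature = signature_genes(toy, {"strain1", "strain2"})
```

In this toy case the core genome is the single group shared by all three strains, and the signature set for the first two strains is the group they share that the third lacks.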

PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

PubMed Central

Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

2008-01-01

Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client
Inverse Symmetry in Complete Genomes and Whole-Genome Inverse Duplication

PubMed Central

Kong, Sing-Guan; Fan, Wen-Lang; Chen, Hong-Da; Hsu, Zi-Ting; Zhou, Nengji; Zheng, Bo; Lee, Hoong-Chien

2009-01-01

The cause of symmetry is usually subtle, and its study often leads to a deeper understanding of the bearer of the symmetry. To gain insight into the dynamics driving the growth and evolution of genomes, we conducted a comprehensive study of textual symmetries in 786 complete chromosomes. We focused on symmetry based on our belief that, in spite of their extreme diversity, genomes must share common dynamical principles and mechanisms that drive their growth and evolution, and that the most robust footprints of such dynamics are symmetry related. We found that while complement and reverse symmetries are essentially absent in genomic sequences, inverse–complement plus reverse–symmetry is prevalent in complex patterns in most chromosomes, a vast majority of which have near maximum global inverse symmetry. We also discovered relations that can quantitatively account for the long observed but unexplained phenomenon of k-mer skews in genomes. Our results suggest segmental and whole-genome inverse duplications are important mechanisms in genome growth and evolution, probably because they are efficient means by which the genome can exploit its double-stranded structure to enrich its code-inventory. PMID:19898631
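Inverse (complement plus reverse) symmetry can be checked on a single strand by comparing how often each k-mer occurs with how often its reverse complement occurs. The sketch below is an illustration under stated assumptions, not the measure used in the record above: the score definition and the function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count all overlapping k-mers on the given strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def inverse_symmetry_score(seq, k):
    """Score in [0, 1]; 1.0 means every k-mer occurs exactly as often
    as its reverse complement on the same strand."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 1.0
    # Include reverse complements that never occur, so the sum is symmetric.
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    return 1.0 - diff / (2 * total)


symmetric = inverse_symmetry_score("ACGT", 2)   # sequence equals its own reverse complement
skewed = inverse_symmetry_score("AAAA", 2)      # AA occurs, TT never does
```

A sequence that is its own reverse complement scores 1.0, while a homopolymer run scores 0.0, since none of its k-mers is matched by a reverse complement.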
Genomic Data Commons | Office of Cancer Genomics

Cancer.gov

The NCI’s Center for Cancer Genomics launches the Genomic Data Commons (GDC), a unified data sharing platform for the cancer research community. The mission of the GDC is to enable data sharing across the entire cancer research community, to ultimately support precision medicine in oncology.
Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuo, Alan; Grigoriev, Igor

2009-04-17

Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

PubMed

Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

2016-01-01

Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.
JGI Fungal Genomics Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor V.

2011-03-14

Genomes of energy and environment fungi are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here.
Genomic Encyclopedia of Fungi

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.
The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

PubMed

Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

2016-10-11

Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.
Genome expansion via lineage splitting and genome reduction in the cicada endosymbiont Hodgkinia.

PubMed

Campbell, Matthew A; Van Leuven, James T; Meister, Russell C; Carey, Kaitlin M; Simon, Chris; McCutcheon, John P

2015-08-18

Comparative genomics from mitochondria, plastids, and mutualistic endosymbiotic bacteria has shown that the stable establishment of a bacterium in a host cell results in genome reduction. Although many highly reduced genomes from endosymbiotic bacteria are stable in gene content and genome structure, organelle genomes are sometimes characterized by dramatic structural diversity. Previous results from Candidatus Hodgkinia cicadicola, an endosymbiont of cicadas, revealed that some lineages of this bacterium had split into two new cytologically distinct yet genetically interdependent species. It was hypothesized that the long life cycle of cicadas in part enabled this unusual lineage-splitting event. Here we test this hypothesis by investigating the structure of the Ca. Hodgkinia genome in one of the longest-lived cicadas, Magicicada tredecim. We show that the Ca. Hodgkinia genome from M. tredecim has fragmented into multiple new chromosomes or genomes, with at least some remaining partitioned into discrete cells. We also show that this lineage-splitting process has resulted in a complex of Ca. Hodgkinia genomes that are 1.1-Mb pairs in length when considered together, an almost 10-fold increase in size from the hypothetical single-genome ancestor. These results parallel some examples of genome fragmentation and expansion in organelles, although the mechanisms that give rise to these extreme genome instabilities are likely different.
Primer in Genetics and Genomics, Article 2-Advancing Nursing Research With Genomic Approaches.

PubMed

Lee, Hyunhwa; Gill, Jessica; Barr, Taura; Yun, Sijung; Kim, Hyungsuk

2017-03-01

Nurses investigate reasons for variable patient symptoms and responses to treatments to inform how best to improve outcomes. Genomics has the potential to guide nursing research exploring contributions to individual variability. This article is meant to serve as an introduction to the novel methods available through genomics for addressing this critical issue and includes a review of methodological considerations for selected genomic approaches. This review presents essential concepts in genetics and genomics that will allow readers to identify upcoming trends in genomics nursing research and improve research practice. It introduces general principles of genomic research and provides an overview of the research process. It also highlights selected nursing studies that serve as clinical examples of the use of genomic technologies. Finally, the authors provide suggestions about how to apply genomic technology in nursing research along with directions for future research. Using genomic approaches in nursing research can advance the understanding of the complex pathophysiology of disease susceptibility and different patient responses to interventions. Nurses should be incorporating genomics into education, clinical practice, and research as the influence of genomics in health-care research and practice continues to grow. Nurses are also well placed to translate genomic discoveries into improved methods for patient assessment and intervention.
Genomics for Everyone

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chain, Patrick

Genomics — the genetic mapping and DNA sequencing of sets of genes or the complete genomes of organisms, along with related genome analysis and database work — is emerging as one of the transformative sciences of the 21st century. But current bioinformatics tools are not accessible to most biological researchers. Now, a new computational and web-based tool called EDGE Bioinformatics is working to fulfill the promise of democratizing genomics.
Enabling responsible public genomics.

PubMed

Conley, John M; Doerr, Adam K; Vorhaus, Daniel B

2010-01-01

As scientific understandings of genetics advance, researchers require increasingly rich datasets that combine genomic data from large numbers of individuals with medical and other personal information. Linking individuals' genetic data and personal information precludes anonymity and produces medically significant information--a result not contemplated by the established legal and ethical conventions governing human genomic research. To pursue the next generation of human genomic research and commerce in a responsible fashion, scientists, lawyers, and regulators must address substantial new issues, including researchers' duties with respect to clinically significant data, the challenges to privacy presented by genomic data, the boundary between genomic research and commerce, and the practice of medicine. This Article presents a new model for understanding and addressing these new challenges--a "public genomics" premised on the idea that ethically, legally, and socially responsible genomics research requires openness, not privacy, as its organizing principle. Responsible public genomics combines the data contributed by informed and fully consenting information altruists and the research potential of rich datasets in a genomic commons that is freely and globally available. This Article examines the risks and benefits of this public genomics model in the context of an ambitious genetic research project currently under way--the Personal Genome Project. This Article also (i) demonstrates that large-scale genomic projects are desirable, (ii) evaluates the risks and challenges presented by public genomics research, and (iii) determines that the current legal and regulatory regimes restrict beneficial and responsible scientific inquiry while failing to adequately protect participants. The Article concludes by proposing a modified normative and legal framework that embraces and enables a future of responsible public genomics.
Breeding-assisted genomics.

PubMed

Poland, Jesse

2015-04-01

The revolution of inexpensive sequencing has ushered in an unprecedented age of genomics. The promise of using this technology to accelerate plant breeding is being realized with a vision of genomics-assisted breeding that will lead to rapid genetic gain for expensive and difficult traits. The reality is now that robust phenotypic data is an increasing limiting resource to complement the current wealth of genomic information. While genomics has been hailed as the discipline to fundamentally change the scope of plant breeding, a more symbiotic relationship is likely to emerge. In the context of developing and evaluating large populations needed for functional genomics, none excel in this area more than plant breeders. While genetic studies have long relied on dedicated, well-structured populations, the resources dedicated to these populations in the context of readily available, inexpensive genotyping is making this philosophy less tractable relative to directly focusing functional genomics on material in breeding programs. Through shifting effort for basic genomic studies from dedicated structured populations, to capturing the entire scope of genetic determinants in breeding lines, we can move towards not only furthering our understanding of functional genomics in plants, but also rapidly improving crops for increased food security, availability and nutrition. Copyright © 2015 Elsevier Ltd. All rights reserved.
Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus

PubMed Central

Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

2016-01-01

Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859
MIPS plant genome information resources.

PubMed

Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

2007-01-01

The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.
Genomic Advances to Improve Biomass for Biofuels (Genomics and Bioenergy)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rokhsar, Daniel

2008-02-11

Lawrence Berkeley National Lab bioscientist Daniel Rokhsar discusses genomic advances to improve biomass for biofuels. He presented his talk Feb. 11, 2008 in Berkeley, California as part of Berkeley Lab's community lecture series. Rokhsar works with the U.S. Department of Energy's Joint Genome Institute and Berkeley Lab's Genomics Division.
New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison

PubMed Central

Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.

2011-01-01

Background: Helicobacter pylori has a reduced genome and lives in a tough environment for long-term persistence. It evolved with its particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand genomic adaptation of this particular bacterium. Principal Findings: We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor to shape the genome structure. Illegitimate recombination not only led to genomic inversion but also inverted fragment duplication, both of which contributed to the creation of new genes and gene family, and further, homological recombination contributed to events of inversion. Based on the information of genomic rearrangement, the first genome scaffold structure of H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes could particularly adapt to human stomach niche. H. pylori contains high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through the change of DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments frequently transferred within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation on the genomes, which corresponds to a special feature of Helicobacter species: extremely high HPN composition of genome. Conclusion: Our research demonstrated that both genome content and structure of H. pylori have been highly adapted to its particular life style. PMID:21387011
Genomics for Everyone

ScienceCinema

Chain, Patrick

2018-05-31

Genomics (the genetic mapping and DNA sequencing of sets of genes or the complete genomes of organisms, along with related genome analysis and database work) is emerging as one of the transformative sciences of the 21st century. But current bioinformatics tools are not accessible to most biological researchers. Now, a new computational and web-based tool called EDGE Bioinformatics is working to fulfill the promise of democratizing genomics.
Ensembl Genomes 2013: scaling up access to genome-wide data.

PubMed

Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

2014-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.
Toward genome-enabled mycology.

PubMed

Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W

2013-01-01

Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.

The Genome Portal of the Department of Energy Joint Genome Institute

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nordberg, Henrik; Cantor, Michael; Dushekyo, Serge

2014-03-14

The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. Genome Portal in the past 2 years was significantly updated, with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI. A critical aspect of handling big data in genomics is the development of visualization and analysis tools that allow scientists to derive meaning from what are otherwise terabases of inert sequence. An interactive visualization tool developed in the group allows us to explore contigs resulting from a single metagenome assembly. Implemented with modern web technologies that take advantage of the power of the computer's graphical processing unit (GPU), the tool allows the user to easily navigate over 100,000 data points in multiple dimensions, among many biologically meaningful parameters of a dataset such as relative abundance, contig length, and G+C content.
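Two of the per-contig parameters named in this abstract, contig length and G+C content, are simple to compute from sequence alone. The following is a minimal Python sketch, not the JGI tool's code: the function names and the example contigs are hypothetical, and ambiguous bases such as N are assumed to be excluded from the G+C denominator.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Empty or fully ambiguous sequences get 0.0 rather than a division error.
    return gc / total if total else 0.0


def contig_summary(contigs: dict[str, str]) -> dict[str, tuple[int, float]]:
    """Map each contig name to (length in bases, G+C fraction)."""
    return {name: (len(seq), gc_content(seq)) for name, seq in contigs.items()}


# Hypothetical contigs, for illustration only.
contigs = {"contig_1": "ATGCGC", "contig_2": "ATATNN"}
summary = contig_summary(contigs)
for name, (length, gc) in summary.items():
    print(f"{name}\tlength={length}\tGC={gc:.3f}")
```

Relative abundance, the third parameter mentioned, would additionally need read-depth data and is left out of this sketch.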
Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes.

PubMed

Puigbò, Pere; Lobkovsky, Alexander E; Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2014-08-21

Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of event across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species. We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically seven and twenty times less than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction, which is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about tenfold more gene families than the typical genome in the group although some groups appear to have vast, 'open' supergenomes. Reconstruction of evolution for groups of closely related bacteria and archaea reveals an extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size.
From genomics to chemical genomics: new developments in KEGG

PubMed Central

Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika

2006-01-01

The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
SMART on FHIR Genomics: facilitating standardized clinico-genomic apps.

PubMed

Alterovitz, Gil; Warner, Jeremy; Zhang, Peijin; Chen, Yishen; Ullman-Cullere, Mollie; Kreda, David; Kohane, Isaac S

2015-11-01

Supporting clinical decision support for personalized medicine will require linking genome and phenome variants to a patient's electronic health record (EHR), at times on a vast scale. Clinico-genomic data standards will be needed to unify how genomic variant data are accessed from different sequencing systems. A specification for the basis of a clinico-genomic standard, building upon the current Health Level Seven International Fast Healthcare Interoperability Resources (FHIR®) standard, was developed. An FHIR application protocol interface (API) layer was attached to proprietary sequencing platforms and EHRs in order to expose gene variant data for presentation to the end-user. Three representative apps based on the SMART platform were built to test end-to-end feasibility, including integration of genomic and clinical data. Successful design, deployment, and use of the API was demonstrated and adopted by HL7 Clinical Genomics Workgroup. Feasibility was shown through development of three apps by various types of users with background levels and locations. This prototyping work suggests that an entirely data (and web) standards-based approach could prove both effective and efficient for advancing personalized medicine. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

PubMed Central

Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

2016-01-01

Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as restriction-modification systems-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141
Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

PubMed

Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

2014-07-01

Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Between Two Fern Genomes

PubMed Central

2014-01-01

Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969
Exploring Other Genomes: Bacteria.

ERIC Educational Resources Information Center

Flannery, Maura C.

2001-01-01

Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)
Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups.

PubMed

Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F; Espinoza-Rojas, Daniela A; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G; Figueroa, Jaime E; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J

2017-01-01

Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis , functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these genes could be
Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

PubMed Central

Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F.; Espinoza-Rojas, Daniela A.; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G.; Figueroa, Jaime E.; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J.

2017-01-01

Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these genes could be
A Genome-Wide Landscape of Retrocopies in Primate Genomes.

PubMed

Navarro, Fábio C P; Galante, Pedro A F

2015-07-29

Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old World Monkeys, including humans), but a surprisingly large number of retrocopies (∼10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates.

PubMed

Nakatani, Yoichiro; Takeda, Hiroyuki; Kohara, Yuji; Morishita, Shinichi

2007-09-01

Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleost fishes, amphibians, reptiles, and marsupials.
The human genome contracts again.

PubMed

Pavlichin, Dmitri S; Weissman, Tsachy; Yona, Golan

2013-09-01

The number of human genomes that have been sequenced completely for different individuals has increased rapidly in recent years. Storing and transferring complete genomes between computers for the purpose of applying various applications and analysis tools will soon become a major hurdle, hindering the analysis phase. Therefore, there is a growing need to compress these data efficiently. Here, we describe a technique to compress human genomes based on entropy coding, using a reference genome and known Single Nucleotide Polymorphisms (SNPs). Furthermore, we explore several intrinsic features of genomes and information in other genomic databases to further improve the compression attained. Using these methods, we compress James Watson's genome to 2.5 megabytes (MB), improving on recent work by 37%. Similar compression is obtained for most genomes available from the 1000 Genomes Project. Our biologically inspired techniques promise even greater gains for genomes of lower organisms and for human genomes as more genomic data become available. Code is available at sourceforge.net/projects/genomezip/
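The core idea behind reference-based compression, storing only where an individual genome differs from a shared reference, can be illustrated with a short Python sketch. This is not the authors' method: it handles single-base substitutions only, applies no entropy coding, and ignores known SNP databases. The function names and toy sequences are hypothetical.

```python
def encode_against_reference(reference: str, genome: str) -> list[tuple[int, str]]:
    """Return (position, base) pairs where genome differs from reference.

    Assumption: both sequences have equal length, so only substitutions
    (no insertions or deletions) are represented.
    """
    if len(reference) != len(genome):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, genome)) if a != b]


def decode_with_reference(reference: str, variants: list[tuple[int, str]]) -> str:
    """Rebuild the individual genome from the reference plus stored variants."""
    bases = list(reference)
    for position, base in variants:
        bases[position] = base
    return "".join(bases)


# Toy sequences, for illustration only.
reference = "ACGTACGTAC"
individual = "ACGTTCGTAA"

variants = encode_against_reference(reference, individual)
restored = decode_with_reference(reference, variants)
print(variants)                 # two substitutions instead of ten bases
print(restored == individual)   # the round trip is lossless
```

A real compressor along the lines the abstract describes would go further, for example by entropy-coding the variant list and exploiting the fact that most variants fall at already catalogued SNP positions.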
Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

PubMed

Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

2014-06-01

To what extent the genomes of different species belonging to one genus can be diverse and the relationship between genomic differentiation and environmental factor remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycle in the cold marine environments.
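For readers unfamiliar with the term, a pan-genome is the union of gene families found across a set of genomes, and the core genome is their intersection. The Python sketch below shows that set arithmetic under simplifying assumptions: each genome is already reduced to a set of gene-family identifiers, and the strain names and gene labels are hypothetical, not data from the study.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan-genome, core genome) for a mapping of strain -> gene families."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)           # families present in at least one strain
    core = set.intersection(*gene_sets)     # families present in every strain
    return pan, core


# Hypothetical strains and gene families, for illustration only.
genomes = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}

pan, core = pan_and_core(genomes)
print(len(pan), sorted(core))
```

A pan-genome is usually called open when the union keeps growing as further genomes are added; judging that requires an accumulation curve over many genomes, which this sketch does not attempt.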
Funding Opportunity: Genomic Data Centers

Cancer.gov

Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,
MicroScope: a platform for microbial genome annotation and comparative genomics

PubMed Central

Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C.; Médigue, C.

2009-01-01

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope’s rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of
MicroScope: a platform for microbial genome annotation and comparative genomics.

PubMed

Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

2009-01-01

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of
Effects of sample treatments on genome recovery via single-cell genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clingenpeel, Scott; Schwientek, Patrick; Hugenholtz, Philip

2014-06-13

It is known that single-cell genomics is a powerful tool for accessing genetic information from uncultivated microorganisms. Methods of handling samples before single-cell genomic amplification may affect the quality of the genomes obtained. Using three bacterial strains we demonstrate that, compared to cryopreservation, lower-quality single-cell genomes are recovered when the sample is preserved in ethanol or if the sample undergoes fluorescence in situ hybridization, while sample preservation in paraformaldehyde renders it completely unsuitable for sequencing.
Comparative genomics of wild type yeast strains unveils important genome diversity

PubMed Central

Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

2008-01-01

Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome
Autopolyploidy genome duplication preserves other ancient genome duplications in Atlantic salmon (Salmo salar).

PubMed

Christensen, Kris A; Davidson, William S

2017-01-01

Salmonids (e.g. Atlantic salmon, Pacific salmon, and trouts) have a long legacy of genome duplication. In addition to three ancient genome duplications that all teleosts are thought to share, salmonids have had one additional genome duplication. We explored a methodology for untangling these duplications from each other to better understand them in Atlantic salmon. In this methodology, homeologous regions (paralogous/duplicated genomic regions originating from a whole genome duplication) from the most recent genome duplication were assumed to have duplicated genes at greater density and have greater sequence similarity. This assumption was used to differentiate duplicated gene pairs in Atlantic salmon that are either from the most recent genome duplication or from earlier duplications. From a comparison with multiple vertebrate species, it is clear that Atlantic salmon have retained more duplicated genes from ancient genome duplications than other vertebrates--often at higher density in the genome and containing fewer synonymous mutations. It may be that polysomic inheritance is the mechanism responsible for maintaining ancient gene duplicates in salmonids. Polysomic inheritance (when multiple chromosomes pair during meiosis) is thought to be relatively common in salmonids compared to other vertebrate species. These findings illuminate how genome duplications may not only increase the number of duplicated genes, but may also be involved in the maintenance of them from previous genome duplications as well.

An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes

PubMed Central

Cho, Yun Sung; Kim, Hyunho; Kim, Hak-Min; Jho, Sungwoong; Jun, JeHoon; Lee, Yong Joo; Chae, Kyun Shik; Kim, Chang Geun; Kim, Sangsoo; Eriksson, Anders; Edwards, Jeremy S.; Lee, Semin; Kim, Byung Chul; Manica, Andrea; Oh, Tae-Kwang; Church, George M.; Bhak, Jong

2016-01-01

Human genomes are routinely compared against a universal reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically relevant or personal reference. Here we report a hybrid assembly of a Korean reference genome (KOREF) for constructing personal and ethnic references by combining sequencing and mapping methods. We also build its consensus variome reference, providing information on millions of variants from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. We find that the ethnically relevant consensus reference can be beneficial for efficient variant detection. Systematic comparison of human assemblies shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structure variations. In the era of large-scale population genome projects, the leveraging of ethnicity-specific genome assemblies as well as the human reference genome will accelerate mapping all human genome diversity. PMID:27882922
The Oxytricha trifallax Macronuclear Genome: A Complex Eukaryotic Genome with 16,000 Tiny Chromosomes

PubMed Central

Swart, Estienne C.; Bracht, John R.; Magrini, Vincent; Minx, Patrick; Chen, Xiao; Zhou, Yi; Khurana, Jaspreet S.; Goldman, Aaron D.; Nowacki, Mariusz; Schotanus, Klaas; Jung, Seolkyoung; Fulton, Robert S.; Ly, Amy; McGrath, Sean; Haub, Kevin; Wiggins, Jessica L.; Storton, Donna; Matese, John C.; Parsons, Lance; Chang, Wei-Jen; Bowen, Michael S.; Stover, Nicholas A.; Jones, Thomas A.; Eddy, Sean R.; Herrick, Glenn A.; Doak, Thomas G.; Wilson, Richard K.; Mardis, Elaine R.; Landweber, Laura F.

2013-01-01

The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor “silent” germline micronuclear genome by a process of “unscrambling” and fragmentation. The tiny macronuclear “nanochromosomes” typically encode single, protein-coding genes (a small portion, 10%, encode 2–8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing
Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform.

PubMed

Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Wang, Zhenyi; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Cheng, Rui; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-Chen; Paterson, Andrew H; Wang, Xiyin

2017-05-01

Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). © 2017 American Society of Plant Biologists. All Rights Reserved.
Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

PubMed

Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

2017-04-28

Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
[Landscape and ecological genomics].

PubMed

Tetushkin, E Ia

2013-10-01

Landscape genomics is the modern version of landscape genetics, a discipline that arose approximately 10 years ago as a combination of population genetics, landscape ecology, and spatial statistics. It studies the effects of environmental variables on gene flow and other microevolutionary processes that determine genetic connectivity and variations in populations. In contrast to population genetics, it operates at the level of individual specimens rather than at the level of population samples. Another important difference between landscape genetics and genomics and population genetics is that, in the former, the analysis of gene flow and local adaptations takes quantitative account of landforms and features of the matrix, i.e., hostile spaces that separate species habitats. Landscape genomics is a part of population ecogenomics, which, along with community genomics, is a major part of ecological genomics. One of the principal purposes of landscape genomics is the identification and differentiation of various genome-wide and locus-specific effects. The approaches and computation tools developed for combined analysis of genomic and landscape variables make it possible to detect adaptation-related genome fragments, which facilitates the planning of conservation efforts and the prediction of species' fate in response to expected changes in the environment.
Informational laws of genome structures

PubMed Central

Bonnici, Vincenzo; Manca, Vincenzo

2016-01-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
Informational laws of genome structures

NASA Astrophysics Data System (ADS)

Bonnici, Vincenzo; Manca, Vincenzo

2016-06-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
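As a rough illustration of the quantities named in this abstract, the following minimal Python sketch computes the Shannon entropy of a sequence's empirical k-mer distribution and a value of k derived from k = lg2(n). It is not the authors' implementation: the function names `kmer_entropy` and `suggested_k` are hypothetical, and rounding lg2(n) to the nearest integer is an assumption, since the abstract does not say how a non-integer value is handled.

```python
import math
from collections import Counter


def kmer_entropy(genome: str, k: int) -> float:
    """Shannon entropy, in bits, of the empirical k-mer distribution."""
    if k <= 0 or k > len(genome):
        raise ValueError("k must satisfy 1 <= k <= len(genome)")
    # Count every overlapping window of length k.
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def suggested_k(genome_length: int) -> int:
    """k = lg2(n); rounding to the nearest integer is an assumption."""
    return max(1, round(math.log2(genome_length)))
```

For a sequence in which all four bases are equally frequent, `kmer_entropy(seq, 1)` reaches its maximum of 2 bits; a homopolymer gives 0.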
Navigating protected genomics data with UCSC Genome Browser in a Box.

PubMed

Haeussler, Maximilian; Raney, Brian J; Hinrichs, Angie S; Clawson, Hiram; Zweig, Ann S; Karolchik, Donna; Casper, Jonathan; Speir, Matthew L; Haussler, David; Kent, W James

2015-03-01

Genome Browser in a Box (GBiB) is a small virtual machine version of the popular University of California Santa Cruz (UCSC) Genome Browser that can be run on a researcher's own computer. Once GBiB is installed, a standard web browser is used to access the virtual server and add personal data files from the local hard disk. Annotation data are loaded on demand through the Internet from UCSC or can be downloaded to the local computer for faster access. Software downloads and installation instructions are freely available for non-commercial use at https://genome-store.ucsc.edu/. GBiB requires the installation of open-source software VirtualBox, available for all major operating systems, and the UCSC Genome Browser, which is open source and free for non-commercial use. Commercial use of GBiB and the Genome Browser requires a license (http://genome.ucsc.edu/license/). © The Author 2014. Published by Oxford University Press.
How genome complexity can explain the difficulty of aligning reads to genomes.

PubMed

Phan, Vinhthuy; Gao, Shanshan; Tran, Quang; Vo, Nam S

2015-01-01

Although it is frequently observed that aligning short reads to genomes becomes harder if they contain complex repeat patterns, there has not been much effort to quantify the relationship between complexity of genomes and difficulty of short-read alignment. Existing measures of sequence complexity seem unsuitable for the understanding and quantification of this relationship. We investigated several measures of complexity and found that length-sensitive measures of complexity had the highest correlation to accuracy of alignment. In particular, the rate of distinct substrings of length k, where k is similar to the read length, correlated very highly to alignment performance in terms of precision and recall. We showed how to compute this measure efficiently in linear time, making it useful in practice to estimate quickly the difficulty of alignment for new genomes without having to align reads to them first. We showed how the length-sensitive measures could provide additional information for choosing aligners that would align consistently accurately on new genomes. We formally established a connection between genome complexity and the accuracy of short-read aligners. The relationship between genome complexity and alignment accuracy provides additional useful information for selecting suitable aligners for new genomes. Further, this work suggests that the complexity of genomes sometimes should be thought of in terms of specific computational problems, such as the alignment of short reads to genomes.
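The length-sensitive measure described in this abstract, the rate of distinct substrings of length k, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `distinct_kmer_rate` is hypothetical, dividing by the number of length-k windows is an assumed normalization, and the hash-set approach shown here is a simple stand-in rather than the linear-time method the authors describe.

```python
def distinct_kmer_rate(genome: str, k: int) -> float:
    """Fraction of length-k windows in the genome that are distinct.

    Values near 1.0 mean almost every k-mer is unique (few repeats);
    values near 0.0 mean the sequence is highly repetitive at length k.
    """
    n_windows = len(genome) - k + 1
    if k <= 0 or n_windows <= 0:
        raise ValueError("k must satisfy 1 <= k <= len(genome)")
    # A set keeps one copy of each distinct substring of length k.
    distinct = {genome[i:i + k] for i in range(n_windows)}
    return len(distinct) / n_windows
```

Choosing k close to the read length, as the abstract suggests, makes a low rate a hint that many reads would map equally well to several locations.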
Phytozome Comparative Plant Genomics Portal

DOE Office of Scientific and Technical Information (OSTI.GOV)

Goodstein, David; Batra, Sajeev; Carlson, Joseph

2014-09-09

The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes
A Taste of Algal Genomes from the Joint Genome Institute

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuo, Alan; Grigoriev, Igor

Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

PubMed

Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

2016-01-01

PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

PubMed

Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

2013-01-01

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal
Next-Generation Genomics Facility at C-CAMP: Accelerating Genomic Research in India

PubMed Central

S, Chandana; Russiachand, Heikham; H, Pradeep; S, Shilpa; M, Ashwini; S, Sahana; B, Jayanth; Atla, Goutham; Jain, Smita; Arunkumar, Nandini; Gowda, Malali

2014-01-01

Next-Generation Sequencing (NGS; http://www.genome.gov/12513162) is a recent life-sciences technological revolution that allows scientists to decode genomes or transcriptomes at a much faster rate with a lower cost. Genomic-based studies are proceeding at a relatively slow pace in India due to the non-availability of genomics experts, trained personnel and dedicated service providers. Using NGS there is a lot of potential to study India's national diversity (of all kinds). We at the Centre for Cellular and Molecular Platforms (C-CAMP) have launched the Next Generation Genomics Facility (NGGF) to provide genomics service to scientists, to train researchers and also work on national and international genomic projects. We have HiSeq1000 from Illumina and GS-FLX Plus from Roche454. The long reads from GS FLX Plus, and high sequence depth from HiSeq1000, are the best and ideal hybrid approaches for de novo and re-sequencing of genomes and transcriptomes. At our facility, we have sequenced around 70 different organisms comprising more than 388 genomes and 615 transcriptomes – prokaryotes and eukaryotes (fungi, plants and animals). In addition we have optimized other unique applications such as small RNA (miRNA, siRNA etc), long Mate-pair sequencing (2 to 20 Kb), Coding sequences (Exome), Methylome (ChIP-Seq), Restriction Mapping (RAD-Seq), Human Leukocyte Antigen (HLA) typing, mixed genomes (metagenomes) and target amplicons, etc. Translating DNA sequence data from NGS sequencer into meaningful information is an important exercise. Under NGGF, we have bioinformatics experts and high-end computing resources to dissect NGS data such as genome assembly and annotation, gene expression, target enrichment, variant calling (SSR or SNP), comparative analysis etc. Our services (sequencing and bioinformatics) have been utilized by more than 45 organizations (academia and industry) both within India and outside, resulting in several publications in peer-reviewed journals and several genomic
The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes.

PubMed

Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars

2017-02-10

The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
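The relative entropy described in this abstract, the divergence between observed trinucleotide frequencies and those expected from mononucleotide frequencies, can be sketched as a Kullback-Leibler divergence. The Python below is an illustrative reading, not the authors' code: the function name `trinucleotide_relative_entropy` is hypothetical, and both the use of overlapping windows (rather than in-frame codons) and the base-2 logarithm are assumptions.

```python
import math
from collections import Counter


def trinucleotide_relative_entropy(seq: str) -> float:
    """KL divergence (bits) of observed trinucleotide frequencies from the
    frequencies expected if bases were independent draws from the
    sequence's own mononucleotide composition."""
    n = len(seq)
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    mono = Counter(seq)
    # Assumption: overlapping windows, not codon-frame triplets.
    tri = Counter(seq[i:i + 3] for i in range(n - 2))
    total = n - 2
    divergence = 0.0
    for triplet, count in tri.items():
        observed = count / total
        expected = math.prod(mono[base] / n for base in triplet)
        divergence += observed * math.log2(observed / expected)
    return divergence
```

A value of 0 means the trinucleotide usage is fully explained by base composition; larger values indicate more structure beyond composition, which is the quantity the abstract reports as higher in core genomes.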
GenomePeek—an online tool for prokaryotic genome and metagenome analysis

DOE PAGES

McNair, Katelyn; Edwards, Robert A.

2015-06-16

As increases in prokaryotic sequencing take place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.
Orthology for comparative genomics in the mouse genome database.

PubMed

Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

2015-08-01

The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
GenomeVista

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poliakov, Alexander; Couronne, Olivier

2002-11-04

Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
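The last step of the pipeline described above, finding conserved segments in a finished global alignment, can be sketched as a sliding-window identity scan. This is an illustration only, not the VISTA implementation: the function name, the window length, and the identity threshold are assumptions, and the filtering of coding regions is omitted.

```python
def conserved_segments(aln_a, aln_b, window=100, min_identity=0.7):
    """Return merged (start, end) alignment-column intervals in which every
    contributing window reaches the identity threshold.

    aln_a, aln_b: two rows of a pairwise alignment, with '-' for gaps.
    window, min_identity: illustrative defaults chosen for this sketch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have equal length")
    n = len(aln_a)
    # 1 where the column is an identical, non-gap pair; 0 otherwise
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]
    segments = []
    if n < window:
        return segments
    running = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right
            running += match[start + window - 1] - match[start - 1]
        if running / window >= min_identity:
            end = start + window
            if segments and start <= segments[-1][1]:
                # Overlapping or adjacent qualifying windows are merged
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments
```

Coordinates are alignment columns, half-open. Mapping them back to genome coordinates would require subtracting the gap columns of the base sequence.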
Comparative genomics of Eucalyptus and Corymbia reveals low rates of genome structural rearrangement.

PubMed

Butler, J B; Vaillancourt, R E; Potts, B M; Lee, D J; King, G J; Baten, A; Shepherd, M; Freeman, J S

2017-05-22

Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomic information between these important taxa and provide further insights into the rate of structural change in tree genomes. We constructed three high density linkage maps for two Corymbia species (Corymbia citriodora subsp. variegata and Corymbia torelliana) which were used to compare genome structure between both species and Eucalyptus grandis. Genome structure was highly conserved between the Corymbia species. However, the comparison of Corymbia and E. grandis suggests large (from 1-13 MB) intra-chromosomal rearrangements have occurred on seven of the 11 chromosomes. Most rearrangements were supported through comparisons of the three independent Corymbia maps to the E. grandis genome sequence, and to other independently constructed Eucalyptus linkage maps. These are the first large scale chromosomal rearrangements discovered between eucalypts. Nonetheless, in the general context of plants, the genomic structure of the two genera was remarkably conserved; adding to a growing body of evidence that conservation of genome structure is common amongst woody angiosperms.
Comparative Genomics in Drosophila.

PubMed

Oti, Martin; Pane, Attilio; Sammeth, Michael

2018-01-01

Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have tremendously contributed to unveil the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes from 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data that were produced over the past 15 years is far from being fully explored. In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Setting off from alignments of the entire genomic sequences, the degree of conservation can be separately evaluated for every region of the genome, providing already first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed nevertheless apparently noncoding regions, which were once considered "junk" DNA. Eventually, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding

Ebolavirus comparative genomics

DOE PAGES

Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...

2015-07-14

The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

PubMed Central

Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

2009-01-01

Background: Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results: The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion: The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385
Insights into conifer giga-genomes.

PubMed

De La Torre, Amanda R; Birol, Inanc; Bousquet, Jean; Ingvarsson, Pär K; Jansson, Stefan; Jones, Steven J M; Keeling, Christopher I; MacKay, John; Nilsson, Ove; Ritland, Kermit; Street, Nathaniel; Yanchuk, Alvin; Zerbe, Philipp; Bohlmann, Jörg

2014-12-01

Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world's forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20-30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes. © 2014 American Society of Plant Biologists. All Rights Reserved.
Experimental Induction of Genome Chaos.

PubMed

Ye, Christine J; Liu, Guo; Heng, Henry H

2018-01-01

Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.
Pseudomonas Genome Database: facilitating user-friendly, comprehensive comparisons of microbial genomes.

PubMed

Winsor, Geoffrey L; Van Rossum, Thea; Lo, Raymond; Khaira, Bhavjinder; Whiteside, Matthew D; Hancock, Robert E W; Brinkman, Fiona S L

2009-01-01

Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license.
Substantial genome synteny preservation among woody angiosperm species: comparative genomics of Chinese chestnut (Castanea mollissima) and plant reference genomes.

PubMed

Staton, Margaret; Zhebentyayeva, Tetyana; Olukolu, Bode; Fang, Guang Chen; Nelson, Dana; Carlson, John E; Abbott, Albert G

2015-10-05

Chinese chestnut (Castanea mollissima) has emerged as a model species for the Fagaceae family with extensive genomic resources including a physical map, a dense genetic map and quantitative trait loci (QTLs) for chestnut blight resistance. These resources enable comparative genomics analyses relative to model plants. We assessed the degree of conservation between the chestnut genome and other well annotated and assembled plant genomic sequences, focusing on the QTL regions of most interest to the chestnut breeding community. The integrated physical and genetic map of Chinese chestnut has been improved to now include 858 shared sequence-based markers. The utility of the integrated map has also been improved through the addition of 42,970 BAC (bacterial artificial chromosome) end sequences spanning over 26 million bases of the estimated 800 Mb chestnut genome. Synteny analysis between chestnut and ten model plant species was conducted on a macro-syntenic scale using sequences from both individual probes and BAC end sequences across the chestnut physical map. Blocks of synteny with chestnut were found in all ten reference species, with the percent of the chestnut physical map that could be aligned ranging from 10 to 39 %. The integrated genetic and physical map was utilized to identify BACs that spanned the three previously identified QTL regions conferring blight resistance. The clones were pooled and sequenced, yielding 396 sequence scaffolds covering 13.9 Mbp. Comparative genomic analysis on a microsyntenic scale, using the QTL-associated genomic sequence, identified synteny from chestnut to other plant genomes ranging from 5.4 to 12.9 % of the genome sequences aligning. On both the macro- and micro-synteny levels, the peach, grape and poplar genomes were found to be the most structurally conserved with chestnut. Interestingly, these results did not strictly follow the expectation that decreased phylogenetic distance would correspond to increased levels of genome
Novel bacteriophages containing a genome of another bacteriophage within their genomes.

PubMed

Swanson, Maud M; Reavy, Brian; Makarova, Kira S; Cock, Peter J; Hopkins, David W; Torrance, Lesley; Koonin, Eugene V; Taliansky, Michael

2012-01-01

A novel bacteriophage infecting Staphylococcus pasteuri was isolated during a screen for phages in Antarctic soils. The phage named SpaA1 is morphologically similar to phages of the family Siphoviridae. The 42,784 bp genome of SpaA1 is a linear, double-stranded DNA molecule with 3' protruding cohesive ends. The SpaA1 genome encompasses 63 predicted protein-coding genes which cluster within three regions of the genome, each of apparently different origin, in a mosaic pattern. In two of these regions, the gene sets resemble those in prophages of Bacillus thuringiensis kurstaki str. T03a001 (genes involved in DNA replication/transcription, cell entry and exit) and B. cereus AH676 (additional regulatory and recombination genes), respectively. The third region represents an almost complete genome (except for the short terminal segments) of a distinct bacteriophage, MZTP02. Nearly the same gene module was identified in prophages of B. thuringiensis serovar monterrey BGSC 4AJ1 and B. cereus Rock4-2. These findings suggest that MZTP02 can be shuttled between genomes of other bacteriophages and prophages, leading to the formation of chimeric genomes. The presence of a complete phage genome in the genome of other phages apparently has not been described previously and might represent a 'fast track' route of virus evolution and horizontal gene transfer. Another phage (BceA1) nearly identical in sequence to SpaA1, and also including the almost complete MZTP02 genome within its own genome, was isolated from a bacterium of the B. cereus/B. thuringiensis group. Remarkably, both SpaA1 and BceA1 phages can infect B. cereus and B. thuringiensis, but only one of them, SpaA1, can infect S. pasteuri. This finding is best compatible with a scenario in which MZTP02 was originally contained in BceA1 infecting Bacillus spp, the common hosts for these two phages, followed by emergence of SpaA1 infecting S. pasteuri.
Comparative genomics meets topology: a novel view on genome median and halving problems.

PubMed

Alexeev, Nikita; Avdeyev, Pavel; Alekseyev, Max A

2016-11-11

Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems solution space. We show that the restricted variants of genome median and halving problems are, in fact, closely related. We demonstrate that these problems have a neat topological interpretation in terms of embedded graphs and polygon gluings. We illustrate how such interpretation can lead to solutions to these problems in particular cases. This study provides an unexpected link between comparative genomics and topology, and demonstrates advantages of solving genome median and halving problems within the topological framework.
Genomic mutation consequence calculator.

PubMed

Major, John E

2007-11-15

The genomic mutation consequence calculator (GMCC) is a tool that will reliably and quickly calculate the consequence of arbitrary genomic mutations. GMCC also reports supporting annotations for the specified genomic region. The particular strength of the GMCC is it works in genomic space, not simply in spliced transcript space as some similar tools do. Within gene features, GMCC can report on the effects on splice site, UTR and coding regions in all isoforms affected by the mutation. A considerable number of genomic annotations are also reported, including: genomic conservation score, known SNPs, COSMIC mutations, disease associations and others. The manual interface also offers link outs to various external databases and resources. In batch mode, GMCC returns a csv file which can easily be parsed by the end user. GMCC is intended to support the many tumor resequencing efforts, but can be useful to any study investigating genomic mutations.
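To illustrate what working in genomic space rather than spliced transcript space involves, the sketch below classifies a single-base substitution given genomic coordinates and an exon list. It is not GMCC code. It assumes a plus-strand gene whose exons are entirely coding, uses the standard genetic code, and treats the first and last two intronic bases as splice sites. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def consequence(genome, exons, pos, alt):
    """Classify a substitution at 0-based genomic position `pos`.

    exons: list of half-open (start, end) intervals on the plus strand,
    assumed here to contain coding sequence only (no UTRs).
    """
    exons = sorted(exons)
    cds = "".join(genome[s:e] for s, e in exons).upper()
    offset = 0  # number of coding bases in the exons before the current one
    for start, end in exons:
        if start <= pos < end:
            cds_pos = offset + (pos - start)
            codon_start = cds_pos - cds_pos % 3
            ref_codon = cds[codon_start:codon_start + 3]
            if len(ref_codon) < 3:
                return "incomplete_codon"
            i = cds_pos - codon_start
            alt_codon = ref_codon[:i] + alt.upper() + ref_codon[i + 1:]
            ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
            if ref_aa == alt_aa:
                return "synonymous"
            if alt_aa == "*":
                return "nonsense"
            if ref_aa == "*":
                return "stop_lost"
            return "missense"
        offset += end - start
    if exons and exons[0][0] <= pos < exons[-1][1]:
        # Inside the gene but outside every exon: an intron
        near_boundary = any(
            0 <= pos - end < 2 or 0 < start - pos <= 2 for start, end in exons
        )
        return "splice_site" if near_boundary else "intronic"
    return "intergenic"
```

A transcript-space tool would see only the concatenated coding sequence, so the intronic and splice-site cases could not be reported at all. Handling several isoforms would mean repeating the call once per exon list.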
Multiple genome alignment for identifying the core structure among moderately related microbial genomes.

PubMed

Uchiyama, Ikuo

2008-10-31

Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
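The idea of defining a core by conserved gene order can be illustrated for the simplest case of two genomes. The sketch below is a simplification, not the authors' multiple-alignment method: it assumes each orthologous group (OG) occurs at most once per genome and reports maximal runs of shared OGs that are consecutive in both genomes, in either orientation. The function name and the minimum run length are hypothetical.

```python
def collinear_segments(order_a, order_b, min_len=3):
    """Return runs of OGs whose relative order is conserved in both genomes.

    order_a, order_b: OG identifiers in chromosomal order; each OG is
    assumed to appear at most once per genome.
    """
    set_a = set(order_a)
    # Positions in genome B after dropping OGs that are absent from genome A
    shared_b = [og for og in order_b if og in set_a]
    pos_b = {og: i for i, og in enumerate(shared_b)}
    shared_a = [og for og in order_a if og in pos_b]

    segments, run, direction = [], [], 0
    for og in shared_a:
        if run:
            step = pos_b[og] - pos_b[run[-1]]
            # Extend the run if the OG is adjacent in B and the orientation
            # (+1 forward, -1 inverted) agrees with the rest of the run
            if step in (1, -1) and (len(run) == 1 or step == direction):
                direction = step
                run.append(og)
                continue
            if len(run) >= min_len:
                segments.append(run)
        run, direction = [og], 0
    if len(run) >= min_len:
        segments.append(run)
    return segments
```

For example, with `order_a = ["g1", "g2", "g3", "g4", "x", "g5", "g6", "g7"]` and `order_b = ["g7", "g6", "g5", "y", "g1", "g2", "g3", "g4"]`, the function returns one forward segment (g1 to g4) and one inverted segment (g5 to g7). Short isolated matches are discarded, which reflects the abstract's point that long conserved segments are more likely to reflect vertical inheritance.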
EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

PubMed

Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

2017-08-01

Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we proposed EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In the previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling system. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html. Contact: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
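A gene presence/absence call of the kind described above can be sketched from per-gene read coverage. This is not EUPAN's interface: the function names, the input layout, and the 0.95 coverage threshold are assumptions made for illustration.

```python
def call_pav(coverage, min_fraction=0.95):
    """Turn coverage fractions into presence/absence calls.

    coverage: {gene: {sample: fraction of the gene body covered by reads}}
    min_fraction: hypothetical threshold above which a gene counts as present.
    """
    return {
        gene: {sample: frac >= min_fraction for sample, frac in per_sample.items()}
        for gene, per_sample in coverage.items()
    }


def split_core_distributed(pav):
    """Core genes are present in every sample; all others are distributed."""
    core = sorted(g for g, calls in pav.items() if all(calls.values()))
    core_set = set(core)
    distributed = sorted(g for g in pav if g not in core_set)
    return core, distributed
```

With `coverage = {"geneA": {"s1": 1.0, "s2": 0.98}, "geneB": {"s1": 0.99, "s2": 0.10}}`, `geneA` is called core and `geneB` distributed. A threshold on covered fraction rather than on depth is one way such calls can remain usable at low sequencing depth.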
Big Data and Genome Editing Technology: A New Paradigm of Cardiovascular Genomics.

PubMed

Krittanawong, Chayakrit; Sun, Tao; Herzog, Eyal

2017-01-01

Opinion Statements: Cardiovascular diseases (CVDs) encompass a range of conditions extending from congenital heart disease to acute coronary syndrome, most of which are heterogeneous in nature and some of which involve multiple genetic loci. However, the pathogenesis of most CVDs remains incompletely understood. The advance in genome-editing technologies, an engineering process of DNA sequences at precise genomic locations, has enabled a new paradigm in which the human genome can be precisely modified to achieve a therapeutic effect. Genome-editing includes the correction of genetic variants that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. Site-specific genome engineering can use nucleases (known as molecular scissors), including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) systems, to provide remarkable opportunities for developing novel therapies in cardiovascular clinical care. Here we discuss genetic polymorphisms and mechanistic insights in CVDs with an emphasis on the impact of genome-editing technologies. The current challenges and future prospects for genome-editing technologies in cardiovascular medicine are also discussed. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes

PubMed Central

Cannon, Steven B.; Sterck, Lieven; Rombauts, Stephane; Sato, Shusei; Cheung, Foo; Gouzy, Jérôme; Wang, Xiaohong; Mudge, Joann; Vasdewani, Jayprakash; Schiex, Thomas; Spannagl, Manuel; Monaghan, Erin; Nicholson, Christine; Humphray, Sean J.; Schoof, Heiko; Mayer, Klaus F. X.; Rogers, Jane; Quétier, Francis; Oldroyd, Giles E.; Debellé, Frédéric; Cook, Douglas R.; Retzel, Ernest F.; Roe, Bruce A.; Town, Christopher D.; Tabata, Satoshi; Van de Peer, Yves; Young, Nevin D.

2006-01-01

Genome sequencing of the model legumes, Medicago truncatula and Lotus japonicus, provides an opportunity for large-scale sequence-based comparison of two genomes in the same plant family. Here we report synteny comparisons between these species, including details about chromosome relationships, large-scale synteny blocks, microsynteny within blocks, and genome regions lacking clear correspondence. The Lotus and Medicago genomes share a minimum of 10 large-scale synteny blocks, each with substantial collinearity and frequently extending the length of whole chromosome arms. The proportion of genes syntenic and collinear within each synteny block is relatively homogeneous. Medicago–Lotus comparisons also indicate similar and largely homogeneous gene densities, although gene-containing regions in Mt occupy 20–30% more space than Lj counterparts, primarily because of larger numbers of Mt retrotransposons. Because the interpretation of genome comparisons is complicated by large-scale genome duplications, we describe synteny, synonymous substitutions and phylogenetic analyses to identify and date a probable whole-genome duplication event. There is no direct evidence for any recent large-scale genome duplication in either Medicago or Lotus but instead a duplication predating speciation. Phylogenetic comparisons place this duplication within the Rosid I clade, clearly after the split between legumes and Salicaceae (poplar). PMID:17003129
Ethical aspects of genome diversity research: genome research into cultural diversity or cultural diversity in genome research?

PubMed

Ilkilic, Ilhan; Paul, Norbert W

2009-03-01

The goal of the Human Genome Diversity Project (HGDP) was to reconstruct the history of human evolution and the historical and geographical distribution of populations with the help of scientific research. Through this kind of research, the entire spectrum of genetic diversity to be found in the human species was to be explored with the hope of generating a better understanding of the history of humankind. An important part of this genome diversity research consists in taking blood and tissue samples from indigenous populations. For various reasons, it has not been possible to execute this project in the planned scope and form to date. Nevertheless, genomic diversity research addresses complex issues which prove to be highly relevant from the perspective of research ethics, transcultural medical ethics, and cultural philosophy. In the article at hand, we discuss these ethical issues as illustrated by the HGDP. This investigation focuses on the confrontation of culturally diverse images of humans and their cosmologies within the framework of genome diversity research and the ethical questions it raises. We argue that in addition to complex questions pertaining to research ethics such as informed consent and autonomy of probands, genome diversity research also has a cultural-philosophical, meta-ethical, and phenomenological dimension which must be taken into account in ethical discourses. Acknowledging this fact, we attempt to show the limits of current guidelines used in international genome diversity studies, following this up by a formulation of theses designed to facilitate an appropriate inquiry and ethical evaluation of intercultural dimensions of genome research.
The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

PubMed

Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

2015-04-01

Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.
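The step of building an updated reference from first-round substitutions and short indels, before a second round of mapping, can be sketched as follows. This is not the authors' pipeline: the function name and the variant tuple layout are assumptions, and real workflows would read variants from VCF files.

```python
def apply_variants(reference, variants):
    """Return a copy of `reference` with substitutions and short indels applied.

    variants: iterable of (pos, ref, alt) with 0-based `pos`; `ref` must match
    the reference at that position. Deletions and insertions are expressed by
    differing lengths of `ref` and `alt`.
    """
    updated = reference
    previous_start = len(reference)
    # Apply from right to left so earlier coordinates stay valid
    for pos, ref, alt in sorted(variants, reverse=True):
        if pos + len(ref) > previous_start:
            raise ValueError(f"variant at {pos} overlaps the next variant")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at {pos}")
        updated = updated[:pos] + alt + updated[pos + len(ref):]
        previous_start = pos
    return updated
```

For instance, `apply_variants("ACGTACGT", [(1, "C", "T"), (4, "AC", "A"), (7, "T", "TGG")])` applies one substitution, one deletion and one insertion. Reads are then remapped to the returned sequence, which lies closer to the sampled genome than the original reference does.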
Genomes by design

PubMed Central

Haimovich, Adrian D.; Muir, Paul; Isaacs, Farren J.

2016-01-01

Next-generation DNA sequencing has revealed the complete genome sequences of numerous organisms, establishing a fundamental and growing understanding of genetic variation and phenotypic diversity. Engineering at the gene, network and whole-genome scale aims to introduce targeted genetic changes both to explore emergent phenotypes and to introduce new functionalities. Expansion of these approaches into massively parallel platforms establishes the ability to generate targeted genome modifications, elucidating causal links between genotype and phenotype, as well as the ability to design and reprogramme organisms. In this Review, we explore techniques and applications in genome engineering, outlining key advances and defining challenges. PMID:26260262
GI-POP: a combinational annotation and genomic island prediction pipeline for ongoing microbial genome projects.

PubMed

Lee, Chi-Ching; Chen, Yi-Ping Phoebe; Yao, Tzu-Jung; Ma, Cheng-Yu; Lo, Wei-Cheng; Lyu, Ping-Chiang; Tang, Chuan Yi

2013-04-10

Sequencing of microbial genomes is important because microbes carry antibiotic and pathogenetic activities. However, even with the help of new assembling software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes whose sequencing is still ongoing. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of the ongoing genome projects in contigs or scaffolds can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project. Copyright © 2012 Elsevier B.V. All rights reserved.
AGAPE (Automated Genome Analysis PipelinE) for Pan-Genome Analysis of Saccharomyces cerevisiae

PubMed Central

Song, Giltae; Dickins, Benjamin J. A.; Demeter, Janos; Engel, Stacia; Dunn, Barbara; Cherry, J. Michael

2015-01-01

The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community. PMID:25781462
Personal genomes in progress: from the human genome project to the personal genome project.

PubMed

Lunshof, Jeantine E; Bobe, Jason; Aach, John; Angrist, Misha; Thakuria, Joseph V; Vorhaus, Daniel B; Hoehe, Margret R; Church, George M

2010-01-01

The cost of a diploid human genome sequence has dropped from about $70M to $2000 since 2007--even as the standards for redundancy have increased from 7x to 40x in order to improve call rates. Coupled with the low return on investment for common single-nucleotide polymorphisms, this has caused a significant rise in interest in correlating genome sequences with comprehensive environmental and trait data (GET). The costs of electronic health records, imaging, and microbial, immunological, and behavioral data are also dropping quickly. Sharing such integrated GET datasets and their interpretations with a diversity of researchers and research subjects highlights the need for informed-consent models capable of addressing novel privacy and other issues, as well as for flexible data-sharing resources that make materials and data available with minimum restrictions on use. This article examines the Personal Genome Project's effort to develop a GET database as a public genomics resource broadly accessible to both researchers and research participants, while pursuing the highest standards in research ethics.
Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

PubMed

ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

2018-05-15

We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depth. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified as H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the NADH gene of the mitochondrial genome ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights into Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

St2-80: a new FISH marker for St genome and genome analysis in Triticeae.

PubMed

Wang, Long; Shi, Qinghua; Su, Handong; Wang, Yi; Sha, Lina; Fan, Xing; Kang, Houyang; Zhang, Haiqin; Zhou, Yonghong

2017-07-01

The St genome is one of the most fundamental genomes in Triticeae. Repetitive sequences are widely used to distinguish different genomes or species. The primary objectives of this study were to (i) screen a new sequence that could easily distinguish the chromosome of the St genome from those of other genomes by fluorescence in situ hybridization (FISH) and (ii) investigate the genome constitution of some species that remain uncertain and controversial. We used degenerated oligonucleotide primer PCR (Dop-PCR), Dot-blot, and FISH to screen for a new marker of the St genome and to test the efficiency of this marker in the detection of the St chromosome at different ploidy levels. Signals produced by a new FISH marker (denoted St2-80) were present on the entire arm of chromosomes of the St genome, except in the centromeric region. On the contrary, St2-80 signals were present in the terminal region of chromosomes of the E, H, P, and Y genomes. No signal was detected in the A and B genomes, and only weak signals were detected in the terminal region of chromosomes of the D genome. St2-80 signals were obvious and stable in chromosomes of different genomes, whether diploid or polyploid. Therefore, St2-80 is a potential and useful FISH marker that can be used to distinguish the St genome from those of other genomes in Triticeae.
Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.

PubMed

van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F

2015-09-17

In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the
Genome Editing Tools in Plants

PubMed Central

Mohanta, Tapan Kumar; Bashir, Tufail; Hashem, Abeer; Bae, Hanhong

2017-01-01

Genome editing tools have the potential to change the genomic architecture of a genome at precise locations, with desired accuracy. These tools have been efficiently used for trait discovery and for the generation of plants with high crop yields and resistance to biotic and abiotic stresses. Due to complex genomic architecture, it is challenging to edit all of the genes/genomes using a particular genome editing tool. Therefore, to overcome this challenging task, several genome editing tools have been developed to facilitate efficient genome editing. Some of the major genome editing tools used to edit plant genomes are: Homologous recombination (HR), zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), pentatricopeptide repeat proteins (PPRs), the CRISPR/Cas9 system, RNA interference (RNAi), cisgenesis, and intragenesis. In addition, site-directed sequence editing and oligonucleotide-directed mutagenesis have the potential to edit the genome at the single-nucleotide level. Recently, adenine base editors (ABEs) have been developed to mutate A-T base pairs to G-C base pairs. ABEs use deoxyadeninedeaminase (TadA) with catalytically impaired Cas9 nickase to mutate A-T base pairs to G-C base pairs. PMID:29257124
The platypus genome unraveled.

PubMed

O'Brien, Stephen J

2008-06-13

The genome of the platypus has been sequenced, assembled, and annotated by an international genomics team. Like the animal itself the platypus genome contains an amalgam of mammal, reptile, and bird-like features.
The Simons Genome Diversity Project: 300 genomes from 142 diverse populations.

PubMed

Mallick, Swapan; Li, Heng; Lipson, Mark; Mathieson, Iain; Gymrek, Melissa; Racimo, Fernando; Zhao, Mengyao; Chennagiri, Niru; Nordenfelt, Susanne; Tandon, Arti; Skoglund, Pontus; Lazaridis, Iosif; Sankararaman, Sriram; Fu, Qiaomei; Rohland, Nadin; Renaud, Gabriel; Erlich, Yaniv; Willems, Thomas; Gallo, Carla; Spence, Jeffrey P; Song, Yun S; Poletti, Giovanni; Balloux, Francois; van Driem, George; de Knijff, Peter; Romero, Irene Gallego; Jha, Aashish R; Behar, Doron M; Bravi, Claudio M; Capelli, Cristian; Hervig, Tor; Moreno-Estrada, Andres; Posukh, Olga L; Balanovska, Elena; Balanovsky, Oleg; Karachanak-Yankova, Sena; Sahakyan, Hovhannes; Toncheva, Draga; Yepiskoposyan, Levon; Tyler-Smith, Chris; Xue, Yali; Abdullah, M Syafiq; Ruiz-Linares, Andres; Beall, Cynthia M; Di Rienzo, Anna; Jeong, Choongwon; Starikovskaya, Elena B; Metspalu, Ene; Parik, Jüri; Villems, Richard; Henn, Brenna M; Hodoglugil, Ugur; Mahley, Robert; Sajantila, Antti; Stamatoyannopoulos, George; Wee, Joseph T S; Khusainova, Rita; Khusnutdinova, Elza; Litvinov, Sergey; Ayodo, George; Comas, David; Hammer, Michael F; Kivisild, Toomas; Klitz, William; Winkler, Cheryl A; Labuda, Damian; Bamshad, Michael; Jorde, Lynn B; Tishkoff, Sarah A; Watkins, W Scott; Metspalu, Mait; Dryomov, Stanislav; Sukernik, Rem; Singh, Lalji; Thangaraj, Kumarasamy; Pääbo, Svante; Kelso, Janet; Patterson, Nick; Reich, David

2016-10-13

Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.
Molluscan Evolutionary Genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simison, W. Brian; Boore, Jeffrey L.

2005-12-01

In the last 20 years there have been dramatic advances in techniques of high-throughput DNA sequencing, most recently accelerated by the Human Genome Project, a program that has determined the three billion base pair code on which we are based. Now this tremendous capability is being directed at other genome targets that are being sampled across the broad range of life. This opens up opportunities as never before for evolutionary and organismal biologists to address questions of both processes and patterns of organismal change. We stand at the dawn of a new 'modern synthesis' period, paralleling that of the early 20th century when the fledgling field of genetics first identified the underlying basis for Darwin's theory. We must now unite the efforts of systematists, paleontologists, mathematicians, computer programmers, molecular biologists, developmental biologists, and others in the pursuit of discovering what genomics can teach us about the diversity of life. Genome-level sampling for mollusks to date has mostly been limited to mitochondrial genomes and it is likely that these will continue to provide the best targets for broad phylogenetic sampling in the near future. However, we are just beginning to see an inroad into complete nuclear genome sequencing, with several mollusks and other eutrochozoans having been selected for work about to begin. Here, we provide an overview of the state of molluscan mitochondrial genomics, highlight a few of the discoveries from this research, outline the promise of broadening this dataset, describe upcoming projects to sequence whole mollusk nuclear genomes, and challenge the community to prepare for making the best use of these data.
[Parental genome imprinting].

PubMed

Babinet, C

1993-01-01

In recent years, genetic as well as experimental embryology methods have uncovered a very important feature of mammalian embryonic development: female and male genomic complements are differentially imprinted in such a way that the contributions of both a maternally and a paternally derived genome are absolutely necessary for the embryo to complete its normal development. Differential genomic imprinting therefore seems to add a new and essential kind of information to that already contained in the genomic sequences. The differential imprinting should be imposed on the genetic material during gametogenesis and persist throughout somatic development after fertilization. It should then be erased in the germ cell line and be established again in sperm and egg genomes. The recent discovery of several imprinted mouse genes should make it possible to address the question of the molecular mechanisms of imprinting.
Chromium and Genomic Stability

PubMed Central

Wise, Sandra S.; Wise, John Pierce

2014-01-01

Many metals serve as micronutrients which protect against genomic instability. Chromium is most abundant in its trivalent and hexavalent forms. Trivalent chromium has historically been considered an essential element, though recent data indicate that while it can have pharmacological effects and value, it is not essential. There are no data indicating that trivalent chromium promotes genomic stability and, instead may promote genomic instability. Hexavalent chromium is widely accepted as highly toxic and carcinogenic with no nutritional value. Recent data indicate that it causes genomic instability and also has no role in promoting genomic stability. PMID:22192535
Comparative genomics reveals insights into avian genome evolution and adaptation

PubMed Central

Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

2015-01-01

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE PAGES

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; ...

2016-01-01

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but
Alignment of Common Wheat and Other Grass Genomes Establishes a Comparative Genomics Research Platform

PubMed Central

Sun, Sangrong; Wang, Jinpeng; Yu, Jigao; Meng, Fanbo; Xia, Ruiyan; Wang, Li; Wang, Zhenyi; Ge, Weina; Liu, Xiaojian; Li, Yuxian; Liu, Yinzhe; Yang, Nanshan; Wang, Xiyin

2017-01-01

Grass genomes are complicated structures as they share a common tetraploidization, and particular genomes have been further affected by extra polyploidizations. These events and the following genomic re-patternings have resulted in a complex, interweaving gene homology both within a genome, and between genomes. Accurately deciphering the structure of these complicated plant genomes would help us better understand their compositional and functional evolution at multiple scales. Here, we build on our previous research by performing a hierarchical alignment of the common wheat genome vis-à-vis eight other sequenced grass genomes with most up-to-date assemblies, and annotations. With this data, we constructed a list of the homologous genes, and then, in a layer-by-layer process, separated their orthology, and paralogy that were established by speciations and recursive polyploidizations, respectively. Compared with the other grasses, the far fewer collinear outparalogous genes within each of three subgenomes of common wheat suggest that homoeologous recombination, and genomic fractionation should have occurred after its formation. In sum, this work contributes to the establishment of an important and timely comparative genomics platform for researchers in the grass community and possibly beyond. Homologous gene list can be found in Supplemental material. PMID:28912789
The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.

2011-04-29

In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.
GenomicTools: a computational platform for developing high-throughput analytics in genomics.

PubMed

Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

2012-01-15

Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
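To make the idea of "operations between sets of genomic regions" concrete, here is a minimal sketch of one such operation, intersection. It does not use or reproduce the GenomicTools command-line tools or C++ API; the function name `intersect_regions` and the `(chrom, start, end)` half-open interval layout are assumptions chosen for this illustration.

```python
from collections import defaultdict


def intersect_regions(a, b):
    """Return the overlapping parts of two region sets.

    Each region is a (chrom, start, end) tuple with a half-open interval
    [start, end). This layout is an assumption made for the sketch.
    """
    # Index the second set by chromosome, sorted by start coordinate.
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    overlaps = []
    for chrom, start, end in sorted(a):
        for b_start, b_end in by_chrom.get(chrom, []):
            if b_start >= end:
                break  # sorted by start: no later interval can overlap
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:
                overlaps.append((chrom, lo, hi))
    return overlaps


peaks = [("chr1", 100, 200), ("chr2", 50, 60)]
genes = [("chr1", 150, 250), ("chr1", 190, 195), ("chr3", 1, 2)]
print(intersect_regions(peaks, genes))
```

This simple version rescans each chromosome's intervals for every query region, which is fine for small inputs; tools aimed at large datasets typically stream sorted inputs instead so that memory stays low.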
On genomics, kin, and privacy

PubMed Central

Telenti, Amalio; Ayday, Erman; Hubaux, Jean Pierre

2014-01-01

The storage of greater numbers of exomes or genomes raises the question of loss of privacy for the individual and for families if genomic data are not properly protected. Access to genome data may result from a personal decision to disclose, or from gaps in protection. In either case, revealing genome data has consequences beyond the individual, as it compromises the privacy of family members. Increasing availability of genome data linked or linkable to metadata through online social networks and services adds one additional layer of complexity to the protection of genome privacy. The field of computer science and information technology offers solutions to secure genomic data so that individuals, medical personnel or researchers can access only the subset of genomic information required for healthcare or dedicated studies. PMID:25254097
Whole-genome sequence-based genomic prediction in laying chickens with different genomic relationship matrices to account for genetic architecture.

PubMed

Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner

2017-01-16

With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336 K segregating SNPs (array data) that included 157 K genic SNPs (i.e. SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, -(log10 P) from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable selection based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively. Prediction with -(log10 P) or squares of SNP effects as weighting factors for building a genomic relationship matrix or BLUP|GA did not increase accuracy, compared to that with identical weights
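The trait-specific genomic relationship matrix mentioned in the abstract above can be sketched as follows. The abstract does not give the formula, so this uses a commonly cited VanRaden-style weighted construction, G = Z D Z' / sum_j(2 p_j (1 - p_j) d_j), as an assumption; the function name `weighted_grm` and the toy genotypes are hypothetical.

```python
def weighted_grm(genotypes, weights):
    """Build a SNP-weighted genomic relationship matrix.

    `genotypes` holds one list of 0/1/2 allele counts per individual and
    `weights` one non-negative weight per SNP. The VanRaden-style scaling
    used here is an assumption, not the formula from the study above.
    """
    n = len(genotypes)
    m = len(weights)
    # Allele frequency of the counted allele at each SNP.
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    # Centre genotypes by twice the allele frequency.
    z = [[ind[j] - 2.0 * freqs[j] for j in range(m)] for ind in genotypes]
    # Weighted sum of expected SNP variances, used as the scaling factor.
    scale = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    if scale == 0:
        raise ValueError("all weighted SNPs are monomorphic")
    return [
        [sum(weights[j] * z[i][j] * z[k][j] for j in range(m)) / scale
         for k in range(n)]
        for i in range(n)
    ]


toy_genotypes = [[0, 2], [2, 0], [1, 1]]
identical = weighted_grm(toy_genotypes, [1.0, 1.0])  # identical weights
for row in identical:
    print(row)
```

Passing all-equal weights gives the unweighted matrix; replacing them with, for example, -(log10 P) values from an association scan would give the kind of trait-specific matrix the abstract compares.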
Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

PubMed

Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

2015-01-01

Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
A comprehensive crop genome research project: the Superhybrid Rice Genome Project in China.

PubMed

Yu, Jun; Wong, Gane Ka-Shu; Liu, Siqi; Wang, Jian; Yang, Huanming

2007-06-29

In May 2000, the Beijing Institute of Genomics formally announced the launch of a comprehensive crop genome research project on rice genomics, the Chinese Superhybrid Rice Genome Project. SRGP is not simply a sequencing project targeted to a single rice (Oryza sativa L.) genome, but a full-swing research effort with an ultimate goal of providing inclusive basic genomic information and molecular tools not only to understand biology of the rice, both as an important crop species and a model organism of cereals, but also to focus on a popular superhybrid rice landrace, LYP9. We have completed the first phase of SRGP and provide the rice research community with a finished genome sequence of an indica variety, 93-11 (the paternal cultivar of LYP9), together with ample data on subspecific (between subspecies) polymorphisms, transcriptomes and proteomes, useful for within-species comparative studies. In the second phase, we have acquired the genome sequence of the maternal cultivar, PA64S, together with the detailed catalogues of genes uniquely expressed in the parental cultivars and the hybrid as well as allele-specific markers that distinguish parental alleles. Although SRGP in China is not an open-ended research programme, it has been designed to pave a way for future plant genomics research and application, such as to interrogate fundamentals of plant biology, including genome duplication, polyploidy and hybrid vigour, as well as to provide genetic tools for crop breeding and to carry along a social burden: leading a fight against the world's hunger. It began with genomics, the newly developed and industry-scale research field, and from the world's most populous country. In this review, we summarize our scientific goals and noteworthy discoveries that exploit new territories of systematic investigations on basic and applied biology of rice and other major cereal crops.
Inferring the Minimal Genome of Mesoplasma florum by Comparative Genomics and Transposon Mutagenesis.

PubMed

Baby, Vincent; Lachance, Jean-Christophe; Gagnon, Jules; Lucier, Jean-François; Matteau, Dominick; Knight, Tom; Rodrigue, Sébastien

2018-01-01

The creation and comparison of minimal genomes will help better define the most fundamental mechanisms supporting life. Mesoplasma florum is a near-minimal, fast-growing, nonpathogenic bacterium potentially amenable to genome reduction efforts. In a comparative genomic study of 13 M. florum strains, including 11 newly sequenced genomes, we have identified the core genome and open pangenome of this species. Our results show that all of the strains have approximately 80% of their gene content in common. Of the remaining 20%, 17% of the genes were found in multiple strains and 3% were unique to any given strain. On the basis of random transposon mutagenesis, we also estimated that ~290 out of 720 genes are essential for M. florum L1 in rich medium. We next evaluated different genome reduction scenarios for M. florum L1 by using gene conservation and essentiality data, as well as comparisons with the first working approximation of a minimal organism, Mycoplasma mycoides JCVI-syn3.0. Our results suggest that 409 of the 473 M. mycoides JCVI-syn3.0 genes have orthologs in M. florum L1. Conversely, 57 putatively essential M. florum L1 genes have no homolog in M. mycoides JCVI-syn3.0. This suggests differences in minimal genome compositions, even for these evolutionarily closely related bacteria. IMPORTANCE The last years have witnessed the development of whole-genome cloning and transplantation methods and the complete synthesis of entire chromosomes. Recently, the first minimal cell, Mycoplasma mycoides JCVI-syn3.0, was created. Despite these milestone achievements, several questions remain to be answered. For example, is the composition of minimal genomes virtually identical in phylogenetically related species? On the basis of comparative genomics and transposon mutagenesis, we investigated this question by using an alternative model, Mesoplasma florum, that is also amenable to genome reduction efforts. Our results suggest that the creation of additional minimal
Using comparative genome analysis to identify problems in annotated microbial genomes.

PubMed

Poptsova, Maria S; Gogarten, J Peter

2010-07-01

Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.

A Trichosporonales genome tree based on 27 haploid and three evolutionarily conserved 'natural' hybrid genomes.

PubMed

Takashima, Masako; Sriswasdi, Sira; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Sugita, Takashi; Iwasaki, Wataru

2018-01-01

To construct a backbone tree consisting of basidiomycetous yeasts, draft genome sequences from 25 species of Trichosporonales (Tremellomycetes, Basidiomycota) were generated. In addition to the hybrid genomes of Trichosporon coremiiforme and Trichosporon ovoides that we described previously, we identified an interspecies hybrid genome in Cutaneotrichosporon mucoides (formerly Trichosporon mucoides). This hybrid genome had a gene retention rate of ~55%, and its closest haploid relative was Cutaneotrichosporon dermatis. After constructing the C. mucoides subgenomes, we generated a phylogenetic tree using genome data from the 27 haploid species and the subgenome data from the three hybrid genome species. It was a high-quality tree with 100% bootstrap support for all of the branches. The genome-based tree provided superior resolution compared with previous multi-gene analyses. Although our backbone tree does not include all Trichosporonales genera (e.g. Cryptotrichosporon), it will be valuable for future analyses of genome data. Interest in interspecies hybrid fungal genomes has recently increased because they may provide a basis for new technologies. The three Trichosporonales hybrid genomes described in this study are different from well-characterized hybrid genomes (e.g. those of Saccharomyces pastorianus and Saccharomyces bayanus) because these hybridization events probably occurred in the distant evolutionary past. Hence, they will be useful for studying genome stability following hybridization and speciation events. Copyright © 2017 John Wiley & Sons, Ltd.
The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

PubMed Central

Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

2014-01-01

Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848
The Simons Genome Diversity Project: 300 genomes from 142 diverse populations

PubMed Central

Mallick, Swapan; Li, Heng; Lipson, Mark; Mathieson, Iain; Gymrek, Melissa; Racimo, Fernando; Zhao, Mengyao; Chennagiri, Niru; Nordenfelt, Susanne; Tandon, Arti; Skoglund, Pontus; Lazaridis, Iosif; Sankararaman, Sriram; Fu, Qiaomei; Rohland, Nadin; Renaud, Gabriel; Erlich, Yaniv; Willems, Thomas; Gallo, Carla; Spence, Jeffrey P.; Song, Yun S.; Poletti, Giovanni; Balloux, Francois; van Driem, George; de Knijff, Peter; Romero, Irene Gallego; Jha, Aashish R.; Behar, Doron M.; Bravi, Claudio M.; Capelli, Cristian; Hervig, Tor; Moreno-Estrada, Andres; Posukh, Olga L.; Balanovska, Elena; Balanovsky, Oleg; Karachanak-Yankova, Sena; Sahakyan, Hovhannes; Toncheva, Draga; Yepiskoposyan, Levon; Tyler-Smith, Chris; Xue, Yali; Abdullah, M. Syafiq; Ruiz-Linares, Andres; Beall, Cynthia M.; Di Rienzo, Anna; Jeong, Choongwon; Starikovskaya, Elena B.; Metspalu, Ene; Parik, Jüri; Villems, Richard; Henn, Brenna M.; Hodoglugil, Ugur; Mahley, Robert; Sajantila, Antti; Stamatoyannopoulos, George; Wee, Joseph T. S.; Khusainova, Rita; Khusnutdinova, Elza; Litvinov, Sergey; Ayodo, George; Comas, David; Hammer, Michael; Kivisild, Toomas; Klitz, William; Winkler, Cheryl; Labuda, Damian; Bamshad, Michael; Jorde, Lynn B.; Tishkoff, Sarah A.; Watkins, W. Scott; Metspalu, Mait; Dryomov, Stanislav; Sukernik, Rem; Singh, Lalji; Thangaraj, Kumarasamy; Pääbo, Svante; Kelso, Janet; Patterson, Nick; Reich, David

2016-01-01

We report the Simons Genome Diversity Project (SGDP) dataset: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioral modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that in other non-Africans. PMID:27654912
Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

PubMed

Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

2015-03-17

The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related to the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.
The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

PubMed

Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

2013-01-01

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
Microbial Genomes Multiply

NASA Technical Reports Server (NTRS)

Doolittle, Russell F.

2002-01-01

The publication of the first complete sequence of a bacterial genome in 1995 was a signal event, underscored by the fact that the article has been cited more than 2,100 times during the intervening seven years. It was a marvelous technical achievement, made possible by automatic DNA-sequencing machines. The feat is the more impressive in that complete genome sequencing has now been adopted in many different laboratories around the world. Four years ago in these columns I examined the situation after a dozen microbial genomes had been completed. Now, with upwards of 60 microbial genome sequences determined and twice that many in progress, it seems reasonable to assess just what is being learned. Are new concepts emerging about how cells work? Have there been practical benefits in the fields of medicine and agriculture? Is it feasible to determine the genomic sequence of every bacterial species on Earth? The answers to these questions may be Yes, Perhaps, and No, respectively.
Efficient Breeding by Genomic Mating.

PubMed

Akdemir, Deniz; Sánchez, Julio I

2016-01-01

Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we proposed an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem, and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that the current approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.
The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects

PubMed Central

Papanicolaou, Alexie

2016-01-01

Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called “genome projects”. The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure. PMID:27006757
The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

PubMed

Papanicolaou, Alexie

2016-01-01

Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.
Approaches to Fungal Genome Annotation

PubMed Central

Haas, Brian J.; Zeng, Qiandong; Pearson, Matthew D.; Cuomo, Christina A.; Wortman, Jennifer R.

2011-01-01

Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center’s production genome annotation environment. PMID:22059117
Comparative genomics reveals insights into avian genome evolution and adaptation.

PubMed

Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

2014-12-12

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.
Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

PubMed

Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

2018-04-12

The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.
Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition

PubMed Central

Lefébure, Tristan; Stanhope, Michael J

2007-01-01

Background The genus Streptococcus is one of the most diverse and important human and agricultural pathogens. This study employs comparative evolutionary analyses of 26 Streptococcus genomes to yield an improved understanding of the relative roles of recombination and positive selection in pathogen adaptation to their hosts. Results Streptococcus genomes exhibit extreme levels of evolutionary plasticity, with high levels of gene gain and loss during species and strain evolution. S. agalactiae has a large pan-genome, with little recombination in its core-genome, while S. pyogenes has a smaller pan-genome and much more recombination of its core-genome, perhaps reflecting the greater habitat, and gene pool, diversity for S. agalactiae compared to S. pyogenes. Core-genome recombination was evident in all lineages (18% to 37% of the core-genome judged to be recombinant), while positive selection was mainly observed during species differentiation (from 11% to 34% of the core-genome). Positive selection pressure was unevenly distributed across lineages and biochemical main role categories. S. suis was the lineage with the greatest level of positive selection pressure, the largest number of unique loci selected, and the largest amount of gene gain and loss. Conclusion Recombination is an important evolutionary force in shaping Streptococcus genomes, not only in the acquisition of significant portions of the genome as lineage specific loci, but also in facilitating rapid evolution of the core-genome. Positive selection, although undoubtedly a slower process, has nonetheless played an important role in adaptation of the core-genome of different Streptococcus species to different hosts. PMID:17475002
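The core-genome and pan-genome terms used in the record above can be made concrete with set operations. This is a minimal sketch assuming each genome has already been reduced to a set of orthologous gene-family identifiers. The function name `core_and_pan_genome`, the strain labels and the gene names are hypothetical and do not come from the study.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(
    gene_sets: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Dict[str, Set[str]]]:
    """Return (core, pan, unique-per-genome) from per-genome gene-family sets.

    core:   families present in every genome
    pan:    families present in at least one genome
    unique: families found in exactly one genome, keyed by that genome
    """
    if not gene_sets:
        raise ValueError("need at least one genome")
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)
    core = set(all_sets[0]).intersection(*all_sets[1:])
    unique = {}
    for name, genes in gene_sets.items():
        # Everything seen in any other genome is excluded from 'unique'.
        others = set().union(
            *(g for other, g in gene_sets.items() if other != name)
        )
        unique[name] = set(genes) - others
    return core, pan, unique


if __name__ == "__main__":
    # Made-up toy data: three strains with a handful of gene families each.
    toy = {
        "strain_A": {"gyrA", "recA", "cps1"},
        "strain_B": {"gyrA", "recA", "cps2", "tetM"},
        "strain_C": {"gyrA", "recA", "cps1", "tetM"},
    }
    core, pan, unique = core_and_pan_genome(toy)
    print(sorted(core), len(pan), unique)
```

In practice the hard part is the upstream ortholog clustering that produces the gene-family identifiers. The set arithmetic itself stays this simple.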
Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

PubMed Central

2012-01-01

Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains, as previously reported
Genome Improvement at JGI-HAGSC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform

PubMed Central

Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-chen; Paterson, Andrew H.

2017-01-01

Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). PMID:28325848
Genomic taxonomy of vibrios

PubMed Central

Thompson, Cristiane C; Vicente, Ana Carolina P; Souza, Rangel C; Vasconcelos, Ana Tereza R; Vesth, Tammi; Alves, Nelson; Ussery, David W; Iida, Tetsuya; Thompson, Fabiano L

2009-01-01

Background Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA), supertrees, Average Amino Acid Identity (AAI), genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.). A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in the birth of the online
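The four species thresholds quoted in the abstract above can be written down as a simple check. This is only an illustrative sketch: the function name `same_vibrio_species` and its parameter names are hypothetical, and the four input values are assumed to have been computed elsewhere by the respective tools.

```python
def same_vibrio_species(mlsa_identity, aai, signature_dissimilarity, proteome_identity):
    """Return True if all four thresholds quoted in the abstract are met.

    Identity values are percentages on a 0-100 scale. How each measure is
    computed is outside the scope of this sketch.
    """
    return (
        mlsa_identity > 95.0                 # > 95% DNA identity in MLSA
        and aai > 96.0                       # > 96% average amino acid identity
        and signature_dissimilarity <= 10.0  # <= 10 genome signature dissimilarity
        and proteome_identity > 61.0         # > 61% proteome identity
    )
```

A pair of strains with values such as `(98.0, 97.5, 4.0, 80.0)` would pass, while a pair sitting exactly on the 95% MLSA boundary would not, because that threshold is strict.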
The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

PubMed Central

Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

2013-01-01

Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. Methodology/Principal Findings We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The complete G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the locations of cp-like migrations and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520
What Does it Mean to be Genomically Literate? National Human Genome Research Institute Meeting Report

PubMed Central

Hurle, Belen; Citrin, Toby; Jenkins, Jean F.; Kaphingst, Kimberly A.; Lamb, Neil; Roseman, Jo Ellen; Bonham, Vence L.

2014-01-01

Genomic discoveries will increasingly advance the science of medicine. Limited genomic literacy may adversely impact the public’s understanding and use of the power of genetics and genomics in health care and public health. In November 2011, a meeting was held by the National Human Genome Research Institute to examine the challenge of achieving genomic literacy for the general public, from K-12 to adult education. The role of the media in disseminating scientific messages and in perpetuating, or reducing, misconceptions was also discussed. Workshop participants agreed that genomic literacy will only be achieved through active engagement between genomics experts and the varied constituencies that comprise the public. This report summarizes the background, content, and outcomes from this meeting, including recommendations for a research agenda to inform decisions about how to advance genomic literacy in our society. PMID:23448722
Company profile: Complete Genomics Inc.

PubMed

Reid, Clifford

2011-02-01

Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.

The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution.

PubMed

Baniaga, Anthony E; Arrigo, Nils; Barker, Michael S

2016-06-03

The haploid nuclear genome size (1C DNA) of vascular land plants varies over several orders of magnitude. Much of this observed diversity in genome size is due to the proliferation and deletion of transposable elements. To date, all vascular land plant lineages with extremely small nuclear genomes represent recently derived states, having ancestors with much larger genome sizes. The Selaginellaceae represent an ancient lineage with extremely small genomes. It is unclear how small nuclear genomes evolved in Selaginella. We compared the rates of nuclear genome size evolution in Selaginella and major vascular plant clades in a comparative phylogenetic framework. For the analyses, we collected 29 new flow cytometry estimates of haploid genome size in Selaginella to augment publicly available data. Selaginella possess some of the smallest known haploid nuclear genome sizes, as well as the lowest rate of genome size evolution observed across all vascular land plants included in our analyses. Additionally, our analyses provide strong support for a history of haploid nuclear genome size stasis in Selaginella. Our results indicate that Selaginella, similar to other early diverging lineages of vascular land plants, has relatively low rates of genome size evolution. Further, our analyses highlight that a rapid transition to a small genome size is only one route to an extremely small genome. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
3D genomics imposes evolution of the domain model of eukaryotic genome organization.

PubMed

Razin, Sergey V; Vassetzky, Yegor S

2017-02-01

The hypothesis that the genome is composed of a patchwork of structural and functional domains (units) that may be either active or repressed was proposed almost 30 years ago. Here, we examine the evolution of the domain model of eukaryotic genome organization in view of the expansion of genome-scale techniques in the twenty-first century that have provided us with a wealth of information on genome organization, folding, and functioning.
Genome Sequencing of Steroid Producing Bacteria Using Ion Torrent Technology and a Reference Genome.

PubMed

Sola-Landa, Alberto; Rodríguez-García, Antonio; Barreiro, Carlos; Pérez-Redondo, Rosario

2017-01-01

Next-Generation Sequencing technology has greatly eased bacterial genome sequencing, and several tens of thousands of genomes have been sequenced during the last 10 years. Most genome projects are published as draft versions; however, for certain applications the complete genome sequence is required. In this chapter, we describe the strategy that allowed the complete genome sequencing of Mycobacterium neoaurum NRRL B-3805, an industrial strain exploited for steroid production, using Ion Torrent sequencing reads and the genome of a close strain as the reference. This protocol can be applied to analyze the genetic variations between closely related strains; for example, to elucidate the point mutations between a parental strain and a random mutagenesis-derived mutant.
AnnotateGenomicRegions: a web application.

PubMed

Zammataro, Luca; DeMolfetta, Rita; Bucci, Gabriele; Ceol, Arnaud; Muller, Heiko

2014-01-01

Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions.
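The core operation described above, reporting annotations that overlap or neighbor a query region, can be illustrated with a small interval check. This is a hedged sketch and not the application's actual code: the function `annotate_regions`, the tuple layout, the `flank` parameter, and the 0-based half-open coordinate convention are all assumptions made for the example.

```python
def annotate_regions(regions, features, flank=0):
    """List, for each query region, the features that overlap it.

    regions:  iterable of (chrom, start, end) tuples
    features: iterable of (chrom, start, end, name) tuples
    flank:    extra distance in bp on both sides, to also catch neighbors

    Coordinates are 0-based and half-open (an assumption of this sketch),
    so two intervals that merely touch do not count as overlapping.
    """
    result = []
    for chrom, start, end in regions:
        hits = [
            name
            for f_chrom, f_start, f_end, name in features
            if f_chrom == chrom and f_start < end + flank and f_end > start - flank
        ]
        result.append(((chrom, start, end), hits))
    return result
```

With `flank=0` only true overlaps are reported; a positive `flank` widens the query so that nearby features are returned as well. The linear scan is fine for small inputs; a real tool would more likely use an interval index.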
Ensembl comparative genomics resources

PubMed Central

Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

2016-01-01

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847
Active Transposition in Genomes

PubMed Central

Huang, Cheng Ran Lisa; Burns, Kathleen H.; Boeke, Jef D.

2013-01-01

Transposons are DNA sequences capable of moving in genomes. Early evidence showed their accumulation in many species and suggested their continued activity in at least isolated organisms. In the past decade, with the development of various genomic technologies, it has become abundantly clear that ongoing activity is the rule rather than the exception. Active transposons of various classes are observed throughout plants and animals, including humans. They continue to create new insertions, have an enormous variety of structural and functional impact on genes and genomes, and play important roles in genome evolution. Transposon activities have been identified and measured by employing various strategies. Here, we summarize evidence of current transposon activity in various plant and animal genomes. PMID:23145912
Barcodes for genomes and applications

PubMed Central

Zhou, Fengfeng; Olman, Victor; Xu, Ying

2008-01-01

Background Each genome has a stable distribution of the combined frequency for each k-mer and its reverse complement measured in sequence fragments as short as 1000 bps across the whole genome, for 1 < k < 6. The collection of these k-mer frequency distributions is unique to each genome and termed the genome's barcode. Results We found that for each genome, the majority of its short sequence fragments have highly similar barcodes while sequence fragments with different barcodes typically correspond to genes that are horizontally transferred or highly expressed. This observation has led to new and more effective ways for addressing two challenging problems: the metagenome binning problem and the identification of horizontally transferred genes. Our barcode-based metagenome binning algorithm substantially improves the state of the art in terms of both binning accuracies and the scope of applicability. Other attractive properties of genome barcodes include (a) the barcodes have different and identifiable characteristics for different classes of genomes like prokaryotes, eukaryotes, mitochondria and plastids, and (b) barcode similarities are generally proportional to the genomes' phylogenetic closeness. Conclusion These and other properties of genome barcodes make them a new and effective tool for studying numerous genome and metagenome analysis problems. PMID:19091119
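The barcode idea described above, the combined frequency of each k-mer and its reverse complement measured per fragment, can be sketched as follows. The function names, the default `k=4`, and the use of non-overlapping fragments are assumptions for illustration; the abstract does not specify these details.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Use one key for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def fragment_profile(fragment, k=4):
    """Combined relative frequency of each k-mer and its reverse complement."""
    fragment = fragment.upper()
    counts = Counter()
    for i in range(len(fragment) - k + 1):
        kmer = fragment[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows with N or other symbols
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    return {key: (counts[key] / total if total else 0.0) for key in keys}


def barcode(genome, k=4, fragment_size=1000):
    """One profile per non-overlapping fragment; the rows together form the barcode."""
    return [
        fragment_profile(genome[start:start + fragment_size], k)
        for start in range(0, len(genome) - fragment_size + 1, fragment_size)
    ]
```

Rows of the resulting matrix that differ markedly from the rest would, following the abstract, be candidates for horizontally transferred or highly expressed genes; how to measure that difference is left open here.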
Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

PubMed

Kroneis, Thomas; El-Heliebi, Amin

2015-01-01

Understanding details of a complex biological system makes it necessary to dismantle it down to its components. Immunostaining techniques allow identification of several distinct cell types thereby giving an inside view of intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex® technology allowing downstream genomic analysis such as array-comparative genomic hybridization.
ChloroMitoCU: Codon patterns across organelle genomes for functional genomics and evolutionary applications.

PubMed

Sablok, Gaurav; Chen, Ting-Wen; Lee, Chi-Ching; Yang, Chi; Gan, Ruei-Chi; Wegrzyn, Jill L; Porta, Nicola L; Nayak, Kinshuk C; Huang, Po-Jung; Varotto, Claudio; Tang, Petrus

2017-06-01

Organelle genomes are widely thought to have arisen from reduction events involving cyanobacterial and archaeal genomes, in the case of chloroplasts, or α-proteobacterial genomes, in the case of mitochondria. Heterogeneity in base composition and codon preference has long been the subject of investigation of topics ranging from phylogenetic distortion to the design of overexpression cassettes for transgenic expression. From the overexpression point of view, it is critical to systematically analyze the codon usage patterns of the organelle genomes. In light of the importance of codon usage patterns in the development of hyper-expression organelle transgenics, we present ChloroMitoCU, the first-ever curated, web-based reference catalog of the codon usage patterns in organelle genomes. ChloroMitoCU contains the pre-compiled codon usage patterns of 328 chloroplast genomes (29,960 CDS) and 3,502 mitochondrial genomes (49,066 CDS), enabling genome-wide exploration and comparative analysis of codon usage patterns across species. ChloroMitoCU allows the phylogenetic comparison of codon usage patterns across organelle genomes, the prediction of codon usage patterns based on user-submitted transcripts or assembled organelle genes, and comparative analysis with the pre-compiled patterns across species of interest. ChloroMitoCU can increase our understanding of the biased patterns of codon usage in organelle genomes across multiple clades. ChloroMitoCU can be accessed at: http://chloromitocu.cgu.edu.tw/. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
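A codon usage pattern of the kind catalogued above is, at its simplest, a table of relative codon frequencies per coding sequence. The sketch below shows that basic computation only; the function name `codon_usage` is hypothetical, and the resource itself may use different measures that the abstract does not describe.

```python
from collections import Counter


def codon_usage(cds):
    """Relative frequency of each codon in one coding sequence.

    Assumes `cds` is in frame, starting at the first codon position.
    A trailing partial codon and codons with ambiguous bases are ignored.
    """
    cds = cds.upper().replace("U", "T")
    counts = Counter()
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if set(codon) <= set("ACGT"):
            counts[codon] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}
```

For the toy sequence `ATGGCTGCTTAA` the four codons are ATG, GCT, GCT and TAA, so GCT accounts for half of the usage.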
Insights into Conifer Giga-Genomes

PubMed Central

De La Torre, Amanda R.; Birol, Inanc; Bousquet, Jean; Ingvarsson, Pär K.; Jansson, Stefan; Jones, Steven J.M.; Keeling, Christopher I.; MacKay, John; Nilsson, Ove; Ritland, Kermit; Street, Nathaniel; Yanchuk, Alvin; Zerbe, Philipp; Bohlmann, Jörg

2014-01-01

Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world’s forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20–30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes. PMID:25349325
Genome chaos: survival strategy during crisis.

PubMed

Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

2014-01-01

Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability of observing its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.
Office of Cancer Genomics

Cancer.gov

The mission of the NCI’s Office of Cancer Genomics (OCG) is to enhance the understanding of the molecular mechanisms of cancer, advance and accelerate genomics science and technology development, and efficiently translate the genomics data to improve cancer research, prevention, early detection, diagnosis and treatment.
Whole genome annotation and comparative genomic analyses of bio-control fungus Purpureocillium lilacinum.

PubMed

Prasad, Pushplata; Varshney, Deepti; Adholeya, Alok

2015-11-25

The fungus Purpureocillium lilacinum is widely known as a biological control agent against plant parasitic nematodes. This research article presents the genomic annotation of the first draft whole genome sequence of P. lilacinum. The study aims to decipher the putative genetic components of the fungus involved in nematode pathogenesis by performing comparative genomic analysis with nine closely related fungal species in Hypocreales. De novo genomic assembly was done and a total of 301 scaffolds were constructed for P. lilacinum genomic DNA. By employing structural genome prediction models, 13,266 genes coding for proteins were predicted in the genome. Approximately 73% of the predicted genes were functionally annotated using Blastp, InterProScan and Gene Ontology. A 14.7% fraction of the predicted genes shared significant homology with genes in the Pathogen Host Interactions (PHI) database. The phylogenomic analysis carried out using the maximum likelihood RAxML algorithm provided insight into the evolutionary relationship of P. lilacinum. In congruence with other closely related species in the Hypocreales, namely Metarhizium spp., Pochonia chlamydosporia, Cordyceps militaris, Trichoderma reesei and Fusarium spp., P. lilacinum has large gene sets coding for G-protein coupled receptors (GPCRs), proteases, glycoside hydrolases and carbohydrate esterases that are required for degradation of nematode-egg shell components. Screening of the genome by the Antibiotics & Secondary Metabolite Analysis Shell (AntiSMASH) pipeline indicated that the genome potentially codes for a variety of secondary metabolites, possibly required for adaptation to heterogeneous lifestyles reported for P. lilacinum. Significant up-regulation of subtilisin-like serine protease genes in the presence of nematode eggs in quantitative real-time analyses suggested a potential role of serine proteases in nematode pathogenesis. The data offer a better understanding of the Purpureocillium lilacinum genome and will
Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

PubMed Central

Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F

2007-01-01

Background Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA
Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

PubMed

Christen, Matthias; Deutsch, Samuel; Christen, Beat

2015-08-21

Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
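One of the synthesis constraints mentioned above is GC content. As a rough illustration of how GC-rich stretches might be located before any recoding, the sketch below scans a sequence in windows. It is not the Genome Calligrapher algorithm: the function names, the window size, the step, and the 0.7 threshold are placeholder assumptions, since the abstract gives no such parameters.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def high_gc_windows(seq, window=100, step=50, threshold=0.7):
    """Return (start, gc) for each window whose GC fraction exceeds `threshold`.

    Window size, step and threshold are arbitrary illustration values.
    """
    flagged = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        gc = gc_content(seq[start:start + window])
        if gc > threshold:
            flagged.append((start, gc))
    return flagged
```

Windows flagged this way would be the natural places to look for synonymous codon replacements that lower GC content while leaving the encoded protein unchanged.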
A dictionary based informational genome analysis

PubMed Central

2012-01-01

Background In the post-genomic era several methods of computational genomics are emerging to understand how the whole information is structured within genomes. The literature of the last five years accounts for several alignment-free methods, which have arisen as alternative metrics for dissimilarity of biological sequences. Among these, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying the methodology outlined here carried out some computations on genomic data. We computed informational indexes and built genomic dictionaries of different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, and index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of keen interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many lines of investigation, namely exported to other contexts of computational genomics, as a basis for discrimination of genomic pathologies. PMID:22985068
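A genomic dictionary of k-mers, as described in this record, can be sketched in a few lines of Python. The abstract does not define its informational indexes, so the Shannon entropy of the empirical k-mer distribution is used here as one plausible example; the function names are illustrative assumptions and are not taken from the authors' software.

```python
from collections import Counter
from math import log2


def kmer_dictionary(genome, k):
    """Count every word of length k occurring in the sequence (overlapping windows)."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def empirical_entropy(counts):
    """Shannon entropy (in bits) of the empirical k-mer frequency distribution."""
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def repeats(counts):
    """Words that occur more than once, i.e. the repeats of length k."""
    return {word: n for word, n in counts.items() if n > 1}


if __name__ == "__main__":
    toy_genome = "ACGTACGT"
    dictionary = kmer_dictionary(toy_genome, 2)
    print(dict(dictionary))
    print(repeats(dictionary))
    print(empirical_entropy(dictionary))
```

On real genomes the same counting is usually done with streaming or disk-backed k-mer counters, since a Counter held in memory becomes impractical for large k.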
AnnotateGenomicRegions: a web application

PubMed Central

2014-01-01

Background Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Results Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. Conclusions The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions. PMID:24564446
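The core operation described in this record, reporting annotations that overlap or neighbor a query region, can be sketched as follows. This is a minimal illustration and not the AnnotateGenomicRegions implementation: the Interval type, the half-open coordinate convention and the flank parameter are assumptions chosen for the example.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive
    name: str


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(regions, features, flank=0):
    """Map each region name to the names of features that overlap it.

    With flank > 0 the region is widened on both sides first, so that
    neighboring features within `flank` bases are reported as well.
    """
    result = {}
    for region in regions:
        widened = Interval(region.chrom, max(0, region.start - flank),
                           region.end + flank, region.name)
        result[region.name] = [f.name for f in features if overlaps(widened, f)]
    return result


if __name__ == "__main__":
    features = [
        Interval("chr1", 100, 200, "geneA"),
        Interval("chr1", 250, 300, "cpg1"),
        Interval("chr2", 100, 200, "geneB"),
    ]
    regions = [Interval("chr1", 150, 160, "peak1"), Interval("chr1", 210, 220, "peak2")]
    print(annotate(regions, features))
    print(annotate(regions, features, flank=50))
```

The linear scan over all features keeps the sketch short. For genome-scale annotation sets an interval tree or sorted-sweep approach would be the more usual choice.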
Genomic Species Are Ecological Species as Revealed by Comparative Genomics in Agrobacterium tumefaciens

PubMed Central

Lassalle, Florent; Campillo, Tony; Vial, Ludovic; Baude, Jessica; Costechareyre, Denis; Chapulliot, David; Shams, Malek; Abrouk, Danis; Lavire, Céline; Oger-Desfeux, Christine; Hommais, Florence; Guéguen, Laurent; Daubin, Vincent; Muller, Daniel; Nesme, Xavier

2011-01-01

The definition of bacterial species is based on genomic similarities, giving rise to the operational concept of genomic species, but the reasons for the occurrence of differentiated genomic species remain largely unknown. We used the Agrobacterium tumefaciens species complex and particularly the genomic species presently called genomovar G8, which includes the sequenced strain C58, to test the hypothesis of genomic species having specific ecological adaptations possibly involved in the speciation process. We analyzed the gene repertoire specific to G8 to identify potential adaptive genes. By hybridizing 25 strains of A. tumefaciens on DNA microarrays spanning the C58 genome, we highlighted the presence and absence of genes homologous to C58 in the taxon. We found 196 genes specific to genomovar G8 that were mostly clustered into seven genomic islands on the C58 genome (one on the circular chromosome and six on the linear chromosome), suggesting higher plasticity and a major adaptive role of the latter. Clusters encoded putative functional units, four of which had been verified experimentally. The combination of G8-specific functions defines a hypothetical species primary niche for G8 related to commensal interaction with a host plant. This supports the idea that the G8 ancestor was able to exploit a new ecological niche, possibly initiating ecological isolation and thus speciation. Searching genomic data for synapomorphic traits is a powerful way to describe bacterial species. This procedure allowed us to find such phenotypic traits specific to genomovar G8 and thus propose a Latin binomial, Agrobacterium fabrum, for this bona fide genomic species. PMID:21795751
Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

PubMed

Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

2017-07-05

Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economically important caridean shrimp, is a potentially ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large complex genome (5.73 Gb), with a size twice that of many decapod shrimps. As such, comparative genomic analyses were employed to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined as the major factor influencing genome expansion. A total of 2 Gb repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Retrotransposons of both recent (Jockey and Gypsy) and ancestral (DIRS) origin were responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.
An Exploration into Fern Genome Space.

PubMed

Wolf, Paul G; Sessa, Emily B; Marchant, Daniel Blaine; Li, Fay-Wei; Rothfels, Carl J; Sigel, Erin M; Gitzendanner, Matthew A; Visger, Clayton J; Banks, Jo Ann; Soltis, Douglas E; Soltis, Pamela S; Pryer, Kathleen M; Der, Joshua P

2015-08-26

Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

PubMed

Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

2014-01-01

A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
Ensembl comparative genomics resources.

PubMed

Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

2016-01-01

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.
Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

PubMed

Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

2016-01-01

Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Complete sequence of the first chimera genome constructed by cloning the whole genome of Synechocystis strain PCC6803 into the Bacillus subtilis 168 genome.

PubMed

Watanabe, Satoru; Shiwa, Yuh; Itaya, Mitsuhiro; Yoshikawa, Hirofumi

2012-12-01

Genome synthesis of existing or designed genomes is made feasible by the first successful cloning of a cyanobacterium, Synechocystis PCC6803, in Gram-positive, endospore-forming Bacillus subtilis. Whole-genome sequence analysis of the isolate and parental B. subtilis strains provides clues for identifying single nucleotide polymorphisms (SNPs) in the 2 complete bacterial genomes in one cell.
Genome databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Courteau, J.

1991-10-11

Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.
A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

PubMed

Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D

2012-06-07

Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generating annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.
Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

PubMed

Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

2012-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.
Genomic understanding of dinoflagellates.

PubMed

Lin, Senjie

2011-01-01

The phylum of dinoflagellates is characterized by many unusual and interesting genomic and physiological features, the imprint of which, in its immense genome, remains elusive. Much novel understanding has been achieved in the last decade on various aspects of dinoflagellate biology, but most remarkably about the structure, expression pattern and epigenetic modification of protein-coding genes in the nuclear and organellar genomes. Major findings include: 1) the great diversity of dinoflagellates, especially at the base of the dinoflagellate tree of life; 2) mini-circularization of the genomes of typical dinoflagellate plastids (with three membranes, chlorophylls a, c1 and c2, and carotenoid peridinin), the scrambled mitochondrial genome and the extensive mRNA editing occurring in both systems; 3) ubiquitous spliced leader trans-splicing of nuclear-encoded mRNA and demonstrated potential as a novel tool for studying dinoflagellate transcriptomes in mixed cultures and natural assemblages; 4) existence and expression of histones and other nucleosomal proteins; 5) a ribosomal protein set expected of typical eukaryotes; 6) genetic potential of non-photosynthetic solar energy utilization via proton-pump rhodopsin; 7) gene candidates in the toxin synthesis pathways; and 8) evidence of a highly redundant, high gene number and highly recombined genome. Despite this progress, much more work awaits genome-wide transcriptome and whole genome sequencing in order to unfold the molecular mechanisms underlying the numerous mysterious attributes of dinoflagellates. Copyright © 2011 Institut Pasteur. Published by Elsevier SAS. All rights reserved.
Saccharomyces cerevisiae: gene annotation and genome variability, state of the art through comparative genomics.

PubMed

Louis, Ed

2011-01-01

In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.
Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

PubMed

Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

2016-07-01

This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data

PubMed Central

Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

2017-01-01

The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data—previously only browseable through our FTP site—by focusing on particular samples, populations or data sets of interest. PMID:27638885
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration

PubMed Central

Thorvaldsdóttir, Helga; Mesirov, Jill P.

2013-01-01

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today’s sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license. PMID:22517427
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

PubMed

Thorvaldsdóttir, Helga; Robinson, James T; Mesirov, Jill P

2013-03-01

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.
The Banana Genome Hub

PubMed Central

Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

2013-01-01

Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability for data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967
Genomic Medicine and Lung Diseases

PubMed Central

Center, David M.; Schwartz, David A.; Solway, Julian; Gail, Dorothy; Laposky, Aaron D.

2012-01-01

The recent explosion of genomic data and technology points to opportunities to redefine lung diseases at the molecular level; to apply integrated genomic approaches to elucidate mechanisms of lung pathophysiology; and to improve early detection, diagnosis, and treatment of lung diseases. Research is needed to translate genomic discoveries into clinical applications, such as detecting preclinical disease, predicting patient outcomes, guiding treatment choices, and most of all identifying potential therapeutic targets for lung diseases. The Division of Lung Diseases in the National Heart, Lung, and Blood Institute convened a workshop, “Genomic Medicine and Lung Diseases,” to discuss the potential for integrated genomics and systems approaches to advance 21st century pulmonary medicine and to evaluate the most promising opportunities for this next phase of genomics research to yield clinical benefit. Workshop sessions included (1) molecular phenotypes, molecular biomarkers, and therapeutics; (2) new technology and opportunity; (3) integrative genomics; (4) molecular anatomy of the lung; (5) novel data and information platforms; and (6) recommendations for exceptional research opportunities in lung genomics research. PMID:22652029
Genome constraint through sexual reproduction: application of 4D-Genomics in reproductive biology.

PubMed

Horne, Steven D; Abdallah, Batoul Y; Stevens, Joshua B; Liu, Guo; Ye, Karen J; Bremer, Steven W; Heng, Henry H Q

2013-06-01

Assisted reproductive technologies have been used to achieve pregnancies since the first successful test tube baby was born in 1978. Infertile couples are at an increased risk for multiple miscarriages and the application of current protocols is associated with high first-trimester miscarriage rates. Among the contributing factors of these higher rates is a high incidence of fetal aneuploidy. Numerous studies support that protocols including ovulation-induction, sperm cryostorage, density-gradient centrifugation, and embryo culture can induce genome instability, but the general mechanism is less clear. Application of the genome theory and 4D-Genomics recently led to the establishment of a new paradigm for sexual reproduction; sex primarily constrains genome integrity that defines the biological system rather than just providing genetic diversity at the gene level. We therefore propose that application of assisted reproductive technologies can bypass this sexual reproduction filter as well as potentially induce additional system instability. We have previously demonstrated that a single-cell resolution genomic approach, such as spectral karyotyping to trace stochastic genome level alterations, is effective for pre- and post-natal analysis. We propose that monitoring overall genome alteration at the karyotype level alongside the application of assisted reproductive technologies will improve the efficacy of the techniques while limiting stress-induced genome instability. The development of more single-cell based cytogenomic technologies is needed in order to better understand the system dynamics associated with infertility and the potential impact that assisted reproductive technologies have on genome instability. Importantly, this approach will be useful in studying the potential for diseases to arise as a result of bypassing the filter of sexual reproduction.
The dog genome map and its use in mammalian comparative genomics.

PubMed

Switonski, Marek; Szczerbal, Izabela; Nowacka, Joanna

2004-01-01

The dog genome organization was extensively studied in the last ten years. The most important achievements are the well-developed marker genome maps, including over 3200 marker loci, and a survey of the DNA genome sequence. This knowledge, along with the most advanced map of the human genome, turned out to be very useful in comparative genomic studies. On the one hand, it has promoted the development of marker genome maps of other species of the family Canidae (red fox, arctic fox, Chinese raccoon dog) as well as studies on the evolution of their karyotype. But the most important approach is the comparative analysis of human and canine hereditary diseases. At present, causative gene mutations are known for 30 canine hereditary diseases. A majority of them have human counterparts with similar clinical and molecular features. Studies on identification of genes having a major impact on some multifactorial diseases (hip dysplasia, epilepsy) and cancers (multifocal renal cystadenocarcinoma and nodular dermatofibrosis) are advanced. Very promising are the results of gene therapy for certain canine monogenic diseases (haemophilia, hereditary retinal dystrophy, mucopolysaccharidosis), which have human equivalents. The above-mentioned examples prove a very important model role of the dog in studies of human genetic diseases. On the other hand, the identification of gene mutations responsible for hereditary diseases has a substantial impact on breeding strategy in the dog.
Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

PubMed

Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

2016-01-04

The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Human genetics and genomics a decade after the release of the draft sequence of the human genome.

PubMed

Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

2011-10-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.

Human genetics and genomics a decade after the release of the draft sequence of the human genome

PubMed Central

2011-01-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605
Organizational heterogeneity of vertebrate genomes.

PubMed

Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

2012-01-01

Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
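The k-mer scoring idea behind Compositional Spectra analysis can be illustrated with a minimal Python sketch. It counts exact occurrences of overlapping fixed-length words and normalizes them to frequencies; the scoring used in the study itself may differ (for example in how matches are defined), and the function names `kmer_spectrum` and `spectrum_distance`, as well as the choice of an L1 distance, are assumptions made here for illustration only.

```python
from collections import Counter

DNA = set("ACGT")

def kmer_spectrum(seq, k):
    """Relative frequency of every length-k word in seq (overlapping windows).

    Windows containing anything other than A, C, G, T (e.g. N) are skipped.
    Exact-match counting is an assumption of this sketch.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= DNA
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}

def spectrum_distance(a, b):
    """L1 distance between two spectra (an illustrative choice of metric)."""
    return sum(abs(a.get(x, 0.0) - b.get(x, 0.0)) for x in set(a) | set(b))

if __name__ == "__main__":
    left = kmer_spectrum("ACGTACGTACGT", 2)
    right = kmer_spectrum("AAAATTTTAAAA", 2)
    print(spectrum_distance(left, right))
```

Comparing spectra computed on different chromosomes or genome fractions in this way gives a rough sense of how a heterogeneity score could be derived, without claiming to reproduce the published measure.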
Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

PubMed

Lee, Hayan; Schatz, Michael C

2012-08-15

Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
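To make the notion of per-position mappability concrete, the following Python sketch marks each start position whose length-k word occurs exactly once in the sequence. This is not the GMS, which the abstract describes as a weighted probability of unambiguous read mapping; it is a much cruder uniqueness proxy, and the names `unique_kmer_mask` and `unique_fraction` are hypothetical. Reverse complements and sequencing errors are ignored.

```python
from collections import Counter

def unique_kmer_mask(genome, k):
    """True at each start position whose length-k word occurs once in genome.

    Simplified proxy only: forward strand, exact matches, no read errors.
    """
    genome = genome.upper()
    starts = range(len(genome) - k + 1)
    counts = Counter(genome[i:i + k] for i in starts)
    return [counts[genome[i:i + k]] == 1 for i in starts]

def unique_fraction(genome, k):
    """Share of start positions that are unique at word length k."""
    mask = unique_kmer_mask(genome, k)
    return sum(mask) / len(mask) if mask else 0.0

if __name__ == "__main__":
    toy = "ACGTACGTTT"
    print(unique_kmer_mask(toy, 4))
    print(unique_fraction(toy, 4))
```

Positions that stay non-unique even at longer word lengths are the kind of repeat-rich regions where, as the abstract notes, adding coverage does not help.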
The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

PubMed

Argout, X; Martin, G; Droc, G; Fouet, O; Labadie, K; Rivals, E; Aury, J M; Lanaud, C

2017-09-15

Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large insert size mate paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2 while the scaffold N50 size increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in the cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).
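The scaffold N50 quoted above is a standard assembly contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal Python sketch follows; the function name `n50` and the toy lengths are illustrative and are not taken from the cacao assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")

if __name__ == "__main__":
    # Toy scaffold lengths; total is 32, so half is 16.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because the statistic depends only on the sorted lengths, merging scaffolds (as in the move from 4,792 to 554 scaffolds) tends to raise N50 even when the total assembled length barely changes.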
Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia

PubMed Central

Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M.

2016-01-01

Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints. PMID:28175287
Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

PubMed

Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

2006-11-01

The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these aforementioned diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.
Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia.

PubMed

Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M; Ruiz-Herrera, Aurora

2016-12-01

Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.
The emerging genomics and systems biology research lead to systems genomics studies.

PubMed

Yang, Mary Qu; Yoshigoe, Kenji; Yang, William; Tong, Weida; Qin, Xiang; Dunker, A; Chen, Zhongxue; Arbania, Hamid R; Liu, Jun S; Niemierko, Andrzej; Yang, Jack Y

2014-01-01

Synergistically integrating multi-layer genomic data at systems level not only can lead to deeper insights into the molecular mechanisms related to disease initiation and progression, but also can guide pathway-based biomarker and drug target identification. With the advent of high-throughput next-generation sequencing technologies, sequencing both DNA and RNA has generated multi-layer genomic data that can provide DNA polymorphism, non-coding RNA, messenger RNA, gene expression, isoform and alternative splicing information. Systems biology, on the other hand, studies complex biological systems, particularly through the systematic study of complex molecular interactions within specific cells or organisms. Genomics and molecular systems biology can be merged into the study of genomic profiles and implicated biological functions at cellular or organism level. The prospectively emerging field can be referred to as systems genomics or genomic systems biology. The Mid-South Bioinformatics Centre (MBC) and Joint Bioinformatics Ph.D. Program of University of Arkansas at Little Rock and University of Arkansas for Medical Sciences are particularly interested in promoting education and research advancement in this prospectively emerging field. Based on past investigations and research outcomes, MBC is further utilizing differential gene and isoform/exon expression from RNA-seq and co-regulation from ChIP-seq specific for different phenotypes, in combination with protein-protein interactions and protein-DNA interactions, to construct high-level gene networks for an integrative genome-phenome investigation at systems biology level.
Theory of microbial genome evolution

NASA Astrophysics Data System (ADS)

Koonin, Eugene

Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% `ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.
Genome Variation Map: a data repository of genome variations in BIG Data Center.

PubMed

Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

2018-01-04

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides user-friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome Variation Map: a data repository of genome variations in BIG Data Center

PubMed Central

Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

2018-01-01

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides user-friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473
The bonobo genome compared with the chimpanzee and human genomes

PubMed Central

Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

2012-01-01

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832
Fungal Genomics for Energy and Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor V.

2013-03-11

Genomes of fungi relevant to energy and environment are the focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Sequencing Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity at the genome level at scale and is open for users to nominate new species for sequencing. Over 200 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics leads to developing parts lists for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.
Mutational Dynamics of Aroid Chloroplast Genomes

PubMed Central

Ahmed, Ibrar; Biggs, Patrick J.; Matthews, Peter J.; Collins, Lesley J.; Hendy, Michael D.; Lockhart, Peter J.

2012-01-01

A characteristic feature of eukaryote and prokaryote genomes is the co-occurrence of nucleotide substitution and insertion/deletion (indel) mutations. Although similar observations have also been made for chloroplast DNA, genome-wide associations have not been reported. We determined the chloroplast genome sequences for two morphotypes of taro (Colocasia esculenta; family Araceae) and compared these with four publicly available aroid chloroplast genomes. Here, we report the extent of genome-wide association between direct and inverted repeats, indels, and substitutions in these aroid chloroplast genomes. We suggest that alternative but not mutually exclusive hypotheses explain the mutational dynamics of chloroplast genome evolution. PMID:23204304
Lophotrochozoan mitochondrial genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valles, Yvonne; Boore, Jeffrey L.

2005-10-01

Progress in both molecular techniques and phylogenetic methods has challenged many of the interpretations of traditional taxonomy. One example is in the recognition of the animal superphylum Lophotrochozoa (annelids, mollusks, echiurans, platyhelminthes, brachiopods, and other phyla), although the relationships within this group and the inclusion of some phyla remain uncertain. While much of this progress in phylogenetic reconstruction has been based on comparing single gene sequences, we are beginning to see the potential of comparing large-scale features of genomes, such as the relative order of genes. Even though tremendous progress is being made on the sequence determination of whole nuclear genomes, the dataset of choice for genome-level characters for many animals across a broad taxonomic range remains mitochondrial genomes. We review here what is known about mitochondrial genomes of the lophotrochozoans and discuss the promise that this dataset will enable insight into their relationships.
Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

PubMed Central

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

2014-01-01

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb, with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study proved successful in dissecting the genome of octoploid F. x ananassa and appears promising for the analysis of other polyploid plant species. PMID:24282021
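The N50 length quoted for FANhybrid_r1.2 is a standard assembly statistic: the length of the sequence at which the running total, counted from the longest sequence downward, first reaches half of the assembly. The following is a minimal sketch of that calculation, not the pipeline used in the study; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Hypothetical contig lengths in bp (not data from the strawberry assembly).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 400: 500 + 400 = 900 >= 1500 / 2
```

Read this way, an N50 of 5137 bp indicates that half of the 173.2 Mb reference lies in sequences of at least 5137 bp.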
Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

PubMed

Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

2014-12-01

Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
The SickKids Genome Clinic: developing and evaluating a pediatric model for individualized genomic medicine.

PubMed

Bowdin, S C; Hayeems, R Z; Monfared, N; Cohn, R D; Meyn, M S

2016-01-01

Our increasing knowledge of how genomic variants affect human health and the falling costs of whole-genome sequencing are driving the development of individualized genomic medicine. This new clinical paradigm uses knowledge of an individual's genomic variants to anticipate, diagnose and manage disease. While individualized genetic medicine offers the promise of transformative change in health care, it forces us to reconsider existing ethical, scientific and clinical paradigms. The potential benefits of pre-symptomatic identification of at-risk individuals, improved diagnostics, individualized therapy, accurate prognosis and avoidance of adverse drug reactions coexist with the potential risks of uninterpretable results, psychological harm, outmoded counseling models and increased health care costs. Here we review the challenges, opportunities and limits of integrating genomic analysis into pediatric clinical practice and describe a model for implementing individualized genomic medicine. Our multidisciplinary team of bioinformaticians, health economists, health services and policy researchers, ethicists, geneticists, genetic counselors and clinicians has designed a 'Genome Clinic' research project that addresses multiple challenges in pediatric genomic medicine--ranging from development of bioinformatics tools for the clinical assessment of genomic variants and the discovery of disease genes to health policy inquiries, assessment of clinical care models, patient preference and the ethics of consent. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
From Genomics to Gene Therapy: Induced Pluripotent Stem Cells Meet Genome Editing.

PubMed

Hotta, Akitsu; Yamanaka, Shinya

2015-01-01

The advent of induced pluripotent stem (iPS) cells has opened up numerous avenues of opportunity for cell therapy, including the initiation in September 2014 of the first human clinical trial to treat dry age-related macular degeneration. In parallel, advances in genome-editing technologies by site-specific nucleases have dramatically improved our ability to edit endogenous genomic sequences at targeted sites of interest. In fact, clinical trials have already begun to implement this technology to control HIV infection. Genome editing in iPS cells is a powerful tool and enables researchers to investigate the intricacies of the human genome in a dish. In the near future, the groundwork laid by such an approach may expand the possibilities of gene therapy for treating congenital disorders. In this review, we summarize the exciting progress being made in the utilization of genomic editing technologies in pluripotent stem cells and discuss remaining challenges toward gene therapy applications.
The tiger genome and comparative analysis with lion and snow leopard genomes.

PubMed

Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

2013-01-01

Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species.

The tiger genome and comparative analysis with lion and snow leopard genomes

PubMed Central

Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

2013-01-01

Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858
GenomeDiagram: a python package for the visualization of large-scale genomic data.

PubMed

Pritchard, Leighton; White, Jennifer A; Birch, Paul R J; Toth, Ian K

2006-03-01

We present GenomeDiagram, a flexible, open-source Python module for the visualization of large-scale genomic, comparative genomic and other data with reference to a single chromosome or other biological sequence. GenomeDiagram may be used to generate publication-quality vector graphics, rastered images and in-line streamed graphics for webpages. The package integrates with datatypes from the BioPython project, and is available for Windows, Linux and Mac OS X systems. GenomeDiagram is freely available as source code (under GNU Public License) at http://bioinf.scri.ac.uk/lp/programs.html, and requires Python 2.3 or higher, and recent versions of the ReportLab and BioPython packages. A user manual, example code and images are available at http://bioinf.scri.ac.uk/lp/programs.html.
Human Genome Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1993-01-01

The DOE Human Genome program has grown tremendously, as shown by the marked increase in the number of genome-funded projects since the last workshop held in 1991. The abstracts in this book describe the genome research of DOE-funded grantees and contractors and invited guests, and all projects are represented at the workshop by posters. The 3-day meeting includes plenary sessions on ethical, legal, and social issues pertaining to the availability of genetic data; sequencing techniques, informatics support; and chromosome and cDNA mapping and sequencing.
Genomics education for the public: perspectives of genomic researchers and ELSI advisors.

PubMed

Dressler, Lynn G; Jones, Sondra Smolek; Markey, Janell M; Byerly, Katherine W; Roberts, Megan C

2014-03-01

For more than two decades genomic education of the public has been a significant challenge. As genomic information becomes integrated into daily life and routine clinical care, the need for public education is even more critical. We conducted a pilot study to learn how genomic researchers and ethical, legal, and social implications advisors who were affiliated with large-scale genomic variation studies have approached the issue of educating the public about genomics. Semi-structured telephone interviews were conducted with researchers and advisors associated with the SNP/HAPMAP studies and the Cancer Genome Atlas Study. Respondents described approach(es) associated with educating the public about their study. Interviews were audio-recorded, transcribed, coded, and analyzed by team review. Although few respondents described formal educational efforts, most provided recommendations for what should/could be done, emphasizing the need for an overarching entity(s) to take responsibility to lead the effort to educate the public. Opposing views were described related to: who this should be; the overall goal of the educational effort; and the educational approach. Four thematic areas emerged: What is the rationale for educating the public about genomics?; Who is the audience?; Who should be responsible for this effort?; and What should the content be? Policy issues associated with these themes included the need to agree on philosophical framework(s) to guide the rationale, content, and target audiences for education programs; coordinate previous/ongoing educational efforts; and develop a centralized knowledge base. Suggestions for next steps are presented. A complex interplay of philosophical, professional, and cultural issues can create impediments to genomic education of the public. Many challenges, however, can be addressed by agreement on a guiding philosophical framework(s) and identification of a responsible entity(s) to provide leadership for developing
Evolutionary genomics of Entamoeba

PubMed Central

Weedall, Gareth D.; Hall, Neil

2011-01-01

Entamoeba histolytica is a human pathogen that causes amoebic dysentery and leads to significant morbidity and mortality worldwide. Understanding the genome and evolution of the parasite will help explain how, when and why it causes disease. Here we review current knowledge about the evolutionary genomics of Entamoeba: how differences between the genomes of different species may help explain different phenotypes, and how variation among E. histolytica parasites reveals patterns of population structure. The imminent expansion of the amount of genome data will greatly improve our knowledge of the genus and of pathogenic species within it. PMID:21288488
Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

PubMed Central

2011-01-01

Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to the zebrafish genome and established over 3,100 microsyntenies, covering over 50% of
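The BES figures in the record above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the abstract (65,720 clean BES, 42,522,168 bp, 2.5% of the genome); the implied genome size is a derived estimate, not a value reported in the abstract, and the variable names are illustrative.

```python
# Figures taken from the abstract.
clean_bes = 65_720        # clean BAC end sequences
total_bp = 42_522_168     # total bases in the clean BES
genome_fraction = 0.025   # "2.5% of common carp genome"

# Average read length implied by the totals (the abstract reports 647 bp).
mean_read_length = total_bp / clean_bes

# Genome size implied by the 2.5% figure; a derived estimate only.
implied_genome_size = total_bp / genome_fraction

print(f"{mean_read_length:.0f} bp")           # 647 bp
print(f"{implied_genome_size / 1e9:.2f} Gb")  # 1.70 Gb
```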
Whole genome sequence analysis of BT-474 using Complete Genomics' standard and long fragment read technologies.

PubMed

Ciotlos, Serban; Mao, Qing; Zhang, Rebecca Yu; Li, Zhenyu; Chin, Robert; Gulbahce, Natali; Liu, Sophie Jia; Drmanac, Radoje; Peters, Brock A

2016-01-01

The cell line BT-474 is a popular cell line for studying the biology of cancer and developing novel drugs. However, there is no complete, published genome sequence for this highly utilized scientific resource. In this study we sought to provide a comprehensive and useful data set for the scientific community by generating a whole genome sequence for BT-474. Five μg of genomic DNA, isolated from an early passage of the BT-474 cell line, was used to generate a whole genome sequence (114X coverage) using Complete Genomics' standard sequencing process. To provide additional variant phasing and structural variation data we also processed and analyzed two separate libraries of 5 and 6 individual cells to depths of 99X and 87X, respectively, using Complete Genomics' Long Fragment Read (LFR) technology. BT-474 is a highly aneuploid cell line with an extremely complex genome sequence. This ~300X total coverage genome sequence provides a more complete understanding of this highly utilized cell line at the genomic level.
A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.

PubMed

Moraes, Fernanda; Góes, Andréa

2016-05-06

The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of genome arose. The ENCODE project was initiated in 2003 and targeted to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. Besides, a gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, HGP and ENCODE projects accomplished a great map of the human genome, but the data generated still requires further in depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016. © 2016 The International Union of Biochemistry and Molecular Biology.
[Genome editing of industrial microorganism].

PubMed

Zhu, Linjiang; Li, Qi

2015-03-01

Genome editing is defined as highly effective and precise modification of the cellular genome on a large scale. In recent years, such genome-editing methods have developed rapidly in the field of industrial strain improvement. These quickly evolving methods thoroughly change the old, inefficient mode of genetic modification, which is "one modification, one selection marker, and one target site". Highly effective modes of genome editing have been developed, including simultaneous modification of multiple genes; highly effective insertion, replacement, and deletion of target genes at the genome scale; and cut-and-paste of large DNA fragments. These new tools for microbial genome editing will certainly be applied widely, increase the efficiency of industrial strain improvement, and promote the transformation of the traditional fermentation industry and the rapid development of novel industrial biotechnology such as the production of biofuels and biomaterials. The technological principles of these genome-editing methods and their applications are summarized in this review, which can benefit the engineering and construction of industrial microorganisms.
Genome-wide analysis of wild-type Epstein-Barr virus genomes derived from healthy individuals of the 1,000 Genomes Project.

PubMed

Santpere, Gabriel; Darre, Fleur; Blanco, Soledad; Alcami, Antonio; Villoslada, Pablo; Mar Albà, M; Navarro, Arcadi

2014-04-01

Most people in the world (∼90%) are infected by the Epstein-Barr virus (EBV), which establishes itself permanently in B cells. Infection by EBV is related to a number of diseases including infectious mononucleosis, multiple sclerosis, and different types of cancer. So far, only seven complete EBV strains have been described, all of them coming from donors presenting EBV-related diseases. To perform a detailed comparative genomic analysis of EBV including, for the first time, EBV strains derived from healthy individuals, we reconstructed EBV sequences infecting lymphoblastoid cell lines (LCLs) from the 1000 Genomes Project. As strain B95-8 was used to transform B cells to obtain LCLs, it is always present, but a specific deletion in its genome sets it apart from natural EBV strains. After studying hundreds of individuals, we determined the presence of natural EBV in at least 10 of them and obtained a set of variants specific to wild-type EBV. By mapping the natural EBV reads into the EBV reference genome (NC007605), we constructed nearly complete wild-type viral genomes from three individuals. Adding them to the five disease-derived EBV genomic sequences available in the literature, we performed an in-depth comparative genomic analysis. We found that latency genes harbor more nucleotide diversity than lytic genes and that six out of nine latency-related genes, as well as other genes involved in viral attachment and entry into host cells, packaging, and the capsid, present the molecular signature of accelerated protein evolution rates, suggesting rapid host-parasite coevolution.
Fueling the Future with Fungal Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor V.

2014-10-27

Genomes of fungi relevant to energy and environment are in focus of the JGI Fungal Genomic Program. One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts and pathogens) and biorefinery processes (cellulose degradation and sugar fermentation) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Science Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity on genome level at scale and is open for users to nominate new species for sequencing. Over 400 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics will lead to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such ‘parts’ suggested by comparative genomics and functional analysis in these areas are presented here.
Scanning the human genome at kilobase resolution.

PubMed

Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming

2008-05-01

Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
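Steps (1), (2) and (4) of the DGS description above can be illustrated in silico. The sketch below is a simplified toy model and not the authors' implementation: the recognition site `CATG`, the tag length of 3, the cut position at the start of each site, and the function names `digest`, `ditag` and `reference_ditags` are all assumptions made for illustration.

```python
def digest(sequence, site):
    """Split a sequence at every occurrence of a recognition site.

    Simplification: the cut is placed at the start of each site, so every
    fragment after the first begins with the site itself.
    """
    fragments = []
    start = 0
    position = sequence.find(site, 1)
    while position != -1:
        fragments.append(sequence[start:position])
        start = position
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


def ditag(fragment, tag_length):
    """Join the two end tags of a fragment; None if the fragment is too short."""
    if len(fragment) < 2 * tag_length:
        return None
    return fragment[:tag_length] + fragment[-tag_length:]


def reference_ditags(sequence, site, tag_length):
    """Map each reference ditag to the start coordinates of its fragments."""
    index = {}
    offset = 0
    for fragment in digest(sequence, site):
        tag = ditag(fragment, tag_length)
        if tag is not None:
            index.setdefault(tag, []).append(offset)
        offset += len(fragment)
    return index


# Hypothetical reference sequence with two recognition sites.
reference = "AAAA" + "CATG" + "CCCCCCC" + "CATG" + "TTTTTTTT"
index = reference_ditags(reference, "CATG", 3)
print(index)                # {'CATCCC': [4], 'CATTTT': [15]}
print(index.get("CATTTT"))  # [15]: an observed ditag located on the reference
```

An experimentally collected ditag that is absent from the reference index, or that pairs tags from different reference fragments, is the kind of signal that would point to a structural difference under this simplified model.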
Genomic individuality and its biological implications.

PubMed

Zhao, J

1996-06-01

It is a widely accepted fundamental concept that all somatic genomes of a human individual are identical to each other. The theoretical basis of this concept is that all of these somatic genomes are the descendants of the genome of a single fertilized cell as well as the simple replicated products of asexual reproduction, thus not forming any new recombined genomes. The question here is whether such a concept might only represent one side of somatic genome biology and, even worse, whether it has perhaps already led to a very prevalent misconception that within the organism body, there exists no variability among individual somatic genomes. A hypothesis, called genomic individuality, is proposed, simply saying that every individual somatic genome, perhaps with rare exceptions, has its own unique or individual 'genetic identity' or 'fingerprint', which is characterized by its distinctive sequences or patterns of deoxyribonucleic acid molecules, or both. Thus, no two somatic genomes can be identical to each other in every or all aspects, and consequently, there must be a great deal of genomic variation present within the body of any multicellular organism. The concept or hypothesis of genomic individuality would not only provide a more complete understanding of genome biology, but also suggest a new insight into the studies of the biology of cells and organisms.
Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.

The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler, et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.
The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

PubMed

Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

2017-01-04

The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data (previously only browseable through our FTP site) by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Listeria Genomics

NASA Astrophysics Data System (ADS)

Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.
[Preface for genome editing special issue].

PubMed

Gu, Feng; Gao, Caixia

2017-10-25

Genome editing technology, as an innovative biotechnology, has been widely used to edit the genomes of model organisms, animals, plants and microbes. CRISPR/Cas9-based genome editing technology shows great value and potential in the dissection of functional genomics, improved breeding and genetic disease treatment. In the present special issue, the principles and applications of genome editing techniques are summarized. The advantages and disadvantages of current genome editing technology and its future prospects are also highlighted.
An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.

PubMed

Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora

2015-05-01

Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. © 2015 WILEY Periodicals, Inc.
Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

PubMed

Civáň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

2014-04-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.
Analyses of Charophyte Chloroplast Genomes Help Characterize the Ancestral Chloroplast Genome of Land Plants

PubMed Central

Civáň, Peter; Foster, Peter G.; Embley, Martin T.; Séneca, Ana; Cox, Cymon J.

2014-01-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes. PMID:24682153

Anticipation of Personal Genomics Data Enhances Interest and Learning Environment in Genomics and Molecular Biology Undergraduate Courses

PubMed Central

Weber, K. Scott; Jensen, Jamie L.; Johnson, Steven M.

2015-01-01

An important discussion at colleges is centered on determining more effective models for teaching undergraduates. As personalized genomics has become more common, we hypothesized it could be a valuable tool to make science education more hands on, personal, and engaging for college undergraduates. We hypothesized that providing students with personal genome testing kits would enhance the learning experience of students in two undergraduate courses at Brigham Young University: Advanced Molecular Biology and Genomics. These courses have an emphasis on personal genomics the last two weeks of the semester. Students taking these courses were given the option to receive personal genomics kits in 2014, whereas in 2015 they were not. Students sent their personal genomics samples in on their own and received the data after the course ended. We surveyed students in these courses before and after the two-week emphasis on personal genomics to collect data on whether anticipation of obtaining their own personal genomic data impacted undergraduate student learning. We also tested to see if specific personal genomic assignments improved the learning experience by analyzing the data from the undergraduate students who completed both the pre- and post-course surveys. Anticipation of personal genomic data significantly enhanced student interest and the learning environment based on the time students spent researching personal genomic material and their self-reported attitudes compared to those who did not anticipate getting their own data. Personal genomics homework assignments significantly enhanced the undergraduate student interest and learning based on the same criteria and a personal genomics quiz. We found that for the undergraduate students in both molecular biology and genomics courses, incorporation of personal genomic testing can be an effective educational tool in undergraduate science education. PMID:26241308
Anticipation of Personal Genomics Data Enhances Interest and Learning Environment in Genomics and Molecular Biology Undergraduate Courses.

PubMed

Weber, K Scott; Jensen, Jamie L; Johnson, Steven M

2015-01-01

An important discussion at colleges is centered on determining more effective models for teaching undergraduates. As personalized genomics has become more common, we hypothesized it could be a valuable tool to make science education more hands on, personal, and engaging for college undergraduates. We hypothesized that providing students with personal genome testing kits would enhance the learning experience of students in two undergraduate courses at Brigham Young University: Advanced Molecular Biology and Genomics. These courses have an emphasis on personal genomics the last two weeks of the semester. Students taking these courses were given the option to receive personal genomics kits in 2014, whereas in 2015 they were not. Students sent their personal genomics samples in on their own and received the data after the course ended. We surveyed students in these courses before and after the two-week emphasis on personal genomics to collect data on whether anticipation of obtaining their own personal genomic data impacted undergraduate student learning. We also tested to see if specific personal genomic assignments improved the learning experience by analyzing the data from the undergraduate students who completed both the pre- and post-course surveys. Anticipation of personal genomic data significantly enhanced student interest and the learning environment based on the time students spent researching personal genomic material and their self-reported attitudes compared to those who did not anticipate getting their own data. Personal genomics homework assignments significantly enhanced the undergraduate student interest and learning based on the same criteria and a personal genomics quiz. We found that for the undergraduate students in both molecular biology and genomics courses, incorporation of personal genomic testing can be an effective educational tool in undergraduate science education.
Comparative Genomics in Homo sapiens.

PubMed

Oti, Martin; Sammeth, Michael

2018-01-01

Genomes can be compared at different levels of divergence, either between species or within species. Within species, genomes can be compared between different subpopulations, such as human subpopulations from different continents. Investigating the genomic differences between different human subpopulations is important when studying complex diseases that are affected by many genetic variants, as the variants involved can differ between populations. The 1000 Genomes Project collected genome-scale variation data for 2504 human individuals from 26 different populations, enabling a systematic comparison of variation between human subpopulations. In this chapter, we present, step by step, a basic protocol for the identification of population-specific variants employing the 1000 Genomes data. These variants are subsequently screened for those that affect the proteome or RNA splice sites, to investigate potentially biologically relevant differences between the populations.
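As a rough illustration of the population-specific variant idea in the record above, the following Python sketch flags variants whose alternate allele is seen in one population and is absent from all others. It is not the chapter's protocol: the function name, the thresholds and the allele-frequency table are hypothetical and invented for illustration.

```python
def population_specific(allele_freqs, min_freq=0.05, max_other=0.0):
    """Map each population-specific variant to the population that carries it.

    allele_freqs: {variant_id: {population: alternate-allele frequency}}
    A variant counts as specific to a population when its frequency there is
    at least min_freq and its frequency everywhere else is at most max_other.
    Both thresholds are illustrative assumptions.
    """
    specific = {}
    for variant, freqs in allele_freqs.items():
        for pop, freq in freqs.items():
            others = [f for p, f in freqs.items() if p != pop]
            if freq >= min_freq and all(f <= max_other for f in others):
                specific[variant] = pop
    return specific


# Made-up frequencies for three populations (not real 1000 Genomes values).
example = {
    "var1": {"AFR": 0.12, "EUR": 0.0, "EAS": 0.0},   # only seen in AFR
    "var2": {"AFR": 0.30, "EUR": 0.25, "EAS": 0.20},  # shared by all
    "var3": {"AFR": 0.0, "EUR": 0.0, "EAS": 0.01},    # too rare to report
}
print(population_specific(example))
```

In practice the frequencies would be computed from genotype calls, and the thresholds would be chosen to suit the study.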
Pan-Genomic Analysis Provides Insights into the Genomic Variation and Evolution of Salmonella Paratyphi A

PubMed Central

Chen, Chunxia; Cui, Xiaoying; Yu, Jun; Xiao, Jingfa; Kan, Biao

2012-01-01

Salmonella Paratyphi A (S. Paratyphi A) is a highly adapted, human-specific pathogen that causes paratyphoid fever. Cases of paratyphoid fever have recently been increasing, and the disease is becoming a major public health concern, especially in Eastern and Southern Asia. To investigate the genomic variation and evolution of S. Paratyphi A, a pan-genomic analysis was performed on five newly sequenced S. Paratyphi A strains and two other reference strains. A whole genome comparison revealed that the seven genomes are collinear and that their organization is highly conserved. The high rate of substitutions in part of the core genome indicates that there are frequent homologous recombination events. Based on the changes in the pan-genome size and cluster number (both in the core functional genes and core pseudogenes), it can be inferred that the sharply increasing number of pseudogene clusters may have strong correlation with the inactivation of functional genes, and indicates that the S. Paratyphi A genome is being degraded. PMID:23028950
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies.

PubMed

Card, Daren C; Schield, Drew R; Reyes-Velasco, Jacobo; Fujita, Matthew K; Andrew, Audra L; Oyler-McCance, Sara J; Fike, Jennifer A; Tomback, Diana F; Ruggiero, Robert P; Castoe, Todd A

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5-5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies

USGS Publications Warehouse

Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (~3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

PubMed

Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

2016-12-01

In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
WhopGenome: high-speed access to whole-genome variation and sequence data in R.

PubMed

Wittelsbürger, Ulrich; Pfeifer, Bastian; Lercher, Martin J

2015-02-01

The statistical programming language R has become a de facto standard for the analysis of many types of biological data, and is well suited for the rapid development of new algorithms. However, variant call data from population-scale resequencing projects are typically too large to be read and processed efficiently with R's built-in I/O capabilities. WhopGenome can efficiently read whole-genome variation data stored in the widely used variant call format (VCF) into several R data types. VCF files can be accessed either on local hard drives or on remote servers. WhopGenome can associate variants with annotations such as those available from the UCSC genome browser, and can accelerate the reading process by filtering loci according to user-defined criteria. WhopGenome can also read other Tabix-indexed files and create indices to allow fast selective access to FASTA-formatted sequence files. The WhopGenome R package is available on CRAN at http://cran.r-project.org/web/packages/WhopGenome/. A Bioconductor package has been submitted. © The Author 2014. Published by Oxford University Press. All rights reserved.
WordSeeker: concurrent bioinformatics software for discovering genome-wide patterns and word-based genomic signatures

PubMed Central

2010-01-01

Background: An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods: This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results: A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion: WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
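The two worker steps named in the record above (enumerating a subset of the DNA word space, then scoring words against a Markov chain model) can be sketched in a few lines of Python. This is a minimal single-process illustration, not WordSeeker's implementation: the function names, the first-order background model, the observed/expected score and the use of a word prefix to define one worker's share are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def build_model(seq):
    """Count nucleotides, dinucleotides, and outgoing transitions per nucleotide."""
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    out = Counter()
    for pair, n in di.items():
        out[pair[0]] += n
    return mono, di, out


def expected_count(word, seq_len, mono, di, out):
    """Expected occurrences of word under a first-order Markov chain."""
    p = mono[word[0]] / seq_len
    for a, b in zip(word, word[1:]):
        if out[a] == 0:
            return 0.0
        p *= di[a + b] / out[a]
    return p * (seq_len - len(word) + 1)


def score_words(seq, k, prefix=""):
    """Score every observed word of length k that starts with prefix.

    The prefix stands in for one worker's share of the word space.
    The score is the observed count divided by the expected count.
    """
    seq = seq.upper()
    mono, di, out = build_model(seq)
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    scores = {}
    for tail in product("ACGT", repeat=k - len(prefix)):
        word = prefix + "".join(tail)
        exp = expected_count(word, len(seq), mono, di, out)
        if observed[word] and exp > 0:
            scores[word] = observed[word] / exp
    return scores


# One "worker" handles all 3-letter words beginning with A.
print(score_words("ACGTACGTACGT", 3, prefix="A"))
```

Splitting by prefix keeps the shares disjoint, so in principle each share could be scored by a separate process and the results merged by a controller.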
An archaeal genomic signature

NASA Technical Reports Server (NTRS)

Graham, D. E.; Overbeek, R.; Olsen, G. J.; Woese, C. R.

2000-01-01

Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).
One Bacterial Cell, One Complete Genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos

2010-04-26

While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200 to 900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.
National Plant Genome Initiative

DTIC Science & Technology

2004-01-01

trials have also identified new objectives for vegetable breeding programs, expedited by knowledge and tools from crop genomics and farmer demand...The same tools and resources are being applied to develop improved crops and new breeding strategies, as well. With the sequencing of the rice genome...marker-assisted breeding strategies for wheat • Establishment of a comparative cereal genomics database, Gramene, which uses the complete rice
Rice functional genomics research in China.

PubMed

Han, Bin; Xue, Yongbiao; Li, Jiayang; Deng, Xing-Wang; Zhang, Qifa

2007-06-29

Rice functional genomics is a scientific approach that seeks to identify and define the function of rice genes, and uncover when and how genes work together to produce phenotypic traits. Rapid progress in rice genome sequencing has facilitated research in rice functional genomics in China. The Ministry of Science and Technology of China has funded two major rice functional genomics research programmes to build up the infrastructure for functional genomics studies, such as developing rice functional genomics tools and resources. The programmes were also aimed at cloning and functional analyses of a number of genes controlling important agronomic traits from rice. National and international collaborations on rice functional genomics studies are accelerating rice gene discovery and application.
Genome Information Broker (GIB): data retrieval and comparative analysis system for completed microbial genomes and more

PubMed Central

Fumoto, Masaki; Miyazaki, Satoru; Sugawara, Hideaki

2002-01-01

Genome Information Broker (GIB) is a powerful tool for the study of comparative genomics. GIB allows users to retrieve and display partial and/or whole genome sequences together with the relevant biological annotation. GIB has accumulated all the completed microbial genomes and has recently been expanded to include Arabidopsis thaliana genome data from DDBJ/EMBL/GenBank. In the near future, hundreds of genome sequences will be determined. In order to handle such large volumes of data, we have enhanced the GIB architecture by using XML, CORBA and distributed RDBs. We introduce the new GIB here. GIB is freely accessible at http://gib.genes.nig.ac.jp/. PMID:11752256
GI-SVM: A sensitive method for predicting genomic islands based on unannotated sequence of a single genome.

PubMed

Lu, Bingxin; Leong, Hon Wai

2016-02-01

Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaptation and enable antibiotic resistance. Many methods have been proposed to predict GIs, but most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GIs based solely on the sequence of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on a one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. In addition, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GIs in newly sequenced genomes.
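The "composition bias in terms of k-mer content" mentioned in the record above can be made concrete with a small Python sketch that turns sliding windows of a genome into normalized k-mer frequency vectors. This covers only a plausible feature-extraction step; it is not GI-SVM's code, the one-class SVM itself is omitted, and the function names, window size, step and k are hypothetical choices for illustration.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k):
    """Normalized k-mer frequencies, in lexicographic order over A, C, G, T."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with other symbols
    return [counts[m] / total if total else 0.0 for m in kmers]


def window_profiles(genome, window, step, k):
    """Return (start, profile) pairs for each full window along the genome."""
    genome = genome.upper()
    return [
        (start, kmer_profile(genome[start:start + window], k))
        for start in range(0, len(genome) - window + 1, step)
    ]


# Toy sequence: the second window has a visibly different composition.
for start, profile in window_profiles("ACGTACGTGGGGGGGG", window=8, step=8, k=1):
    print(start, profile)
```

Windows whose vectors deviate from the bulk of the genome are the kind of outliers a one-class classifier could be trained to flag.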
Butterfly genomics eclosing.

PubMed

Beldade, P; McMillan, W O; Papanicolaou, A

2008-02-01

Technological and conceptual advances of the last decade have led to an explosion of genomic data and the emergence of new research avenues. Evolutionary and ecological functional genomics, with its focus on the genes that affect ecological success and adaptation in natural populations, benefits immensely from a phylogenetically widespread sampling of biological patterns and processes. Among those organisms outside established model systems, butterflies offer exceptional opportunities for multidisciplinary research on the processes generating and maintaining variation in ecologically relevant traits. Here we highlight research on wing color pattern variation in two groups of Nymphalid butterflies, the African species Bicyclus anynana (subfamily Satyrinae) and species of the South American genus Heliconius (subfamily Heliconiinae), which are emerging as important systems for studying the nature and origins of functional diversity. Growing genomic resources including genomic and cDNA libraries, dense genetic maps, high-density gene arrays, and genetic transformation techniques are extending current gene mapping and expression profiling analysis and enabling the next generation of research questions linking genes, development, form, and fitness. Efforts to develop such resources in Bicyclus and Heliconius underscore the general challenges facing the larger research community and highlight the need for a community-wide effort to extend ongoing functional genomic research on butterflies.
Theory of prokaryotic genome evolution.

PubMed

Sela, Itamar; Wolf, Yuri I; Koonin, Eugene V

2016-10-11

Bacteria and archaea typically possess small genomes that are tightly packed with protein-coding genes. The compactness of prokaryotic genomes is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. Here, by fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. These results suggest that the number of genes in prokaryotic genomes reflects the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias (i.e., the rate of deletion of genetic material being slightly greater than the rate of acquisition). Thus, new genes acquired by microbial genomes, on average, appear to be adaptive. The tight spacing of protein-coding genes likely results from a combination of the deletion bias and purifying selection that efficiently eliminates nonfunctional, noncoding sequences.
Fueling Future with Algal Genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

Algae constitute a major component of fundamental eukaryotic diversity, play profound roles in the carbon cycle, and are prominent candidates for biofuel production. The US Department of Energy Joint Genome Institute (JGI) is leading the world in algal genome sequencing (http://jgi.doe.gov/Algae) and contributes of the algal genome projects worldwide (GOLD database, 2012). The sequenced algal genomes offer catalogs of genes, networks, and pathways. The sequenced first-of-its-kind genomes of a haptophyte E. huxleyi, chlorarachniophyte B. natans, and cryptophyte G. theta fill the gaps in the eukaryotic tree of life and carry unique genes and pathways as well as molecular fossils of secondary endosymbiosis. Natural adaptation to conditions critical for industrial production is encoded in algal genomes, for example, growth of A. anophagefferens at very high cell densities during harmful algal blooms or a global distribution across diverse environments of E. huxleyi, able to live on sparse nutrients due to its expanded pan-genome. Communications and signaling pathways can be derived from simple symbiotic systems like lichens or complex marine algae metagenomes. Collectively these datasets derived from algal genomics contribute to building a comprehensive parts list essential for algal biofuel development.
Microbial Lifestyle and Genome Signatures

PubMed Central

Dutta, Chitra; Paul, Sandip

2012-01-01

Microbes are known for their unique ability to adapt to varying lifestyles and environments, even extreme or adverse ones. The genomic architecture of a microbe may bear the signatures not only of its phylogenetic position, but also of the kind of lifestyle to which it is adapted. The present review aims to provide an account of the specific genome signatures observed in microbes acclimatized to distinct lifestyles or ecological niches. Niche-specific signatures identified at different levels of microbial genome organization, such as base composition, GC-skew, purine-pyrimidine ratio, dinucleotide abundance, codon bias, and oligonucleotide composition, are discussed. Among the specific cases highlighted in the review are the phenomena of genome shrinkage in obligatory host-restricted microbes, genome expansion in strictly intra-amoebal pathogens, strand-specific codon usage in intracellular species, acquisition of genome islands in pathogenic or symbiotic organisms, discriminatory genomic traits of marine microbes with distinct trophic strategies, and conspicuous sequence features of certain extremophiles like those adapted to high temperature or high salinity. PMID:23024607
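Several of the base-composition signatures named in the abstract above reduce to simple counts over a sequence. The following Python sketch is an illustration only and is not taken from the review: the function name `base_composition` is hypothetical, GC skew is computed with the conventional definition (G - C)/(G + C), and ambiguous bases such as N are simply ignored.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return simple composition signatures for a DNA string.

    Hypothetical helper for illustration; only A, C, G and T are counted,
    so ambiguity codes (N, R, Y, ...) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = g + c
    return {
        # Fraction of counted bases that are G or C.
        "gc_content": gc / total,
        # Conventional GC skew: (G - C) / (G + C); 0.0 when there is no G or C.
        "gc_skew": (g - c) / gc if gc else 0.0,
        # Purines (A, G) over pyrimidines (C, T).
        "purine_pyrimidine_ratio": (a + g) / (c + t) if (c + t) else float("inf"),
    }
```

In practice such statistics are usually computed in sliding windows along a chromosome rather than over the whole sequence; the whole-sequence form is used here only to keep the sketch short.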
The Giardia genome project database.

PubMed

McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L

2000-08-15

The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.

Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

PubMed

Sanitá Lima, Matheus; Smith, David Roy

2017-11-06

Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
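The "transcripts covered ≥85% of the genome" figure in the abstract above is a breadth-of-coverage statistic. The Python sketch below shows one way such a fraction could be computed from mapped-read intervals; it is not the authors' pipeline. The function name `covered_fraction` and the half-open, 0-based interval convention are assumptions made for the example.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a genome covered by at least one interval.

    Hypothetical helper for illustration. Intervals are (start, end) pairs,
    0-based and half-open, e.g. the reference spans of mapped RNA-seq reads.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(intervals):
        # Clip to the genome and skip anything already counted.
        start = max(start, current_end, 0)
        end = min(end, genome_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length
```

With this convention, a genome would meet the threshold reported in the abstract when `covered_fraction(read_spans, genome_length) >= 0.85`, where `read_spans` is a placeholder for whatever list of alignments the mapping step produced.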
Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

PubMed Central

2010-01-01

Background: A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion: BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions: We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence. PMID:21092105
Clinical genomics information management software linking cancer genome sequence and clinical decisions.

PubMed

Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

2013-09-01

Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician. Copyright © 2013 Elsevier Inc. All rights reserved.
Multiplexed precision genome editing with trackable genomic barcodes in yeast.

PubMed

Roy, Kevin R; Smith, Justin D; Vonesch, Sibylle C; Lin, Gen; Tu, Chelsea Szu; Lederer, Alex R; Chu, Angela; Suresh, Sundari; Nguyen, Michelle; Horecka, Joe; Tripathi, Ashutosh; Burnett, Wallace T; Morgan, Maddison A; Schulz, Julia; Orsley, Kevin M; Wei, Wu; Aiyar, Raeka S; Davis, Ronald W; Bankaitis, Vytas A; Haber, James E; Salit, Marc L; St Onge, Robert P; Steinmetz, Lars M

2018-07-01

Our understanding of how genotype controls phenotype is limited by the scale at which we can precisely alter the genome and assess the phenotypic consequences of each perturbation. Here we describe a CRISPR-Cas9-based method for multiplexed accurate genome editing with short, trackable, integrated cellular barcodes (MAGESTIC) in Saccharomyces cerevisiae. MAGESTIC uses array-synthesized guide-donor oligos for plasmid-based high-throughput editing and features genomic barcode integration to prevent plasmid barcode loss and to enable robust phenotyping. We demonstrate that editing efficiency can be increased more than fivefold by recruiting donor DNA to the site of breaks using the LexA-Fkh1p fusion protein. We performed saturation editing of the essential gene SEC14 and identified amino acids critical for chemical inhibition of lipid signaling. We also constructed thousands of natural genetic variants, characterized guide mismatch tolerance at the genome scale, and ascertained that cryptic Pol III termination elements substantially reduce guide efficacy. MAGESTIC will be broadly useful to uncover the genetic basis of phenotypes in yeast.
MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

PubMed

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2015-01-01

The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Exploring cancer genomic data from the cancer genome atlas project.

PubMed

Lee, Ju-Seog

2016-11-01

The Cancer Genome Atlas (TCGA) has compiled genomic, epigenomic, and proteomic data from more than 10,000 samples derived from 33 types of cancer, aiming to improve our understanding of the molecular basis of cancer development. The availability of this genome-wide information provides an unprecedented opportunity for uncovering new key regulators of signaling pathways or new roles of pre-existing members in pathways. To take advantage of this advance, it will be necessary to learn systematic approaches that can help to uncover novel genes reflecting genetic alterations, prognosis, or response to treatments. This minireview describes the updated status of the TCGA project and explains how to use TCGA data. [BMB Reports 2016; 49(11): 607-611].
Advances in computer simulation of genome evolution: toward more realistic evolutionary genomics analysis by approximate bayesian computation.

PubMed

Arenas, Miguel

2015-04-01

NGS technologies provide fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not straightforward, because complex evolutionary processes act on this material, such as inversions, translocations, and other genome rearrangements that, in addition to their inherent complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events have begun to emerge. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, that may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.
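Approximate Bayesian computation, mentioned in the abstract above, can be illustrated with its simplest variant, rejection sampling. The Python sketch below uses a deliberately toy model that is not from the letter: each of `n_branches` lineages independently carries a rearrangement with unknown probability theta, the summary statistic is the count of rearranged lineages, and the prior on theta is uniform. All names and default values are hypothetical.

```python
import random


def simulate_rearranged(theta, n_branches, rng):
    """Toy simulator: number of branches carrying a rearrangement."""
    return sum(rng.random() < theta for _ in range(n_branches))


def abc_rejection(observed, n_branches, n_draws=5000, tolerance=2, seed=0):
    """Rejection ABC for theta under the toy model above.

    Draw theta from the prior, simulate data, and keep theta whenever the
    simulated summary statistic is within `tolerance` of the observed one.
    The kept values approximate the posterior distribution of theta.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # uniform prior on [0, 1)
        simulated = simulate_rearranged(theta, n_branches, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted
```

For example, `abc_rejection(observed=30, n_branches=100)` returns a list of accepted theta values whose mean and spread summarize the approximate posterior. Realistic genome-evolution models would replace the toy simulator with one that generates inversions, translocations, and other rearrangements, and would use richer summary statistics.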
The interface of genomic technologies and nursing.

PubMed

Loescher, Lois J; Merkle, Carrie J

2005-01-01

(a) to summarize views of the interface of technology, genomic technology, and nursing; (b) provide an overview of current and emerging genomic technologies; (c) present clinical exemplars of uses of genomic technology in two disease conditions; and (d) list genomic-focused nursing research on genomic technologies. A discussion of genomic technology in the context of nurses' views of technology, the importance of genomic technology for nurses, linking the central dogma of molecular biology to state-of-the-art tests and assays, and nurses' current use of technologies. Human genome discoveries will continue to be an integral part of disease prevention, diagnosis, treatment, and management. These discoveries also have the potential for being integrated into nursing science. Genomic technologies are becoming a driving force in patient management, so that nurses will be unable to provide quality care without knowledge of the types of genomic technologies, the rationale for their use, and the possible sequelae that can result from genetic diagnosis or treatment. Many nurses already are using genomic technologies to conduct genomic-focused nursing research. The biobehavioral nature of much of this research further indicates the important contributions of nurses in genomics.
The evolutionary processes of mitochondrial and chloroplast genomes differ from those of nuclear genomes

NASA Astrophysics Data System (ADS)

Korpelainen, Helena

2004-11-01

This paper first introduces our present knowledge of the origin of mitochondria and chloroplasts, and the organization and inheritance patterns of their genomes, and then carries on to review the evolutionary processes influencing mitochondrial and chloroplast genomes. The differences in evolutionary phenomena between the nuclear and cytoplasmic genomes are highlighted. It is emphasized that varying inheritance patterns and copy numbers among different types of genomes, and the potential advantage achieved through the transfer of many cytoplasmic genes to the nucleus, have important implications for the evolution of nuclear, mitochondrial and chloroplast genomes. Cytoplasmic genes transferred to the nucleus have joined the more strictly controlled genetic system of the nuclear genome, including also sexual recombination, while genes retained within the cytoplasmic organelles can be involved in selection and drift processes both within and among individuals. Within-individual processes can be either intra- or intercellular. In the case of heteroplasmy, which is attributed to mutations or biparental inheritance, within-individual selection on cytoplasmic DNA may provide a mechanism by which the organism can adapt rapidly. The inheritance of cytoplasmic genomes is not universally maternal. The presence of a range of inheritance patterns indicates that different strategies have been adopted by different organisms. On the other hand, the variability occasionally observed in the inheritance mechanisms of cytoplasmic genomes reduces heritability and increases environmental components in phenotypic features and, consequently, decreases the potential for adaptive evolution.
Understanding Genomic Knowledge in Rural Appalachia: The West Virginia Genome Community Project.

PubMed

Mallow, Jennifer A; Theeke, Laurie A; Crawford, Patricia; Prendergast, Elizabeth; Conner, Chuck; Richards, Tony; McKown, Barbara; Bush, Donna; Reed, Donald; Stabler, Meagan E; Zhang, Jianjun; Dino, Geri; Barr, Taura L

Rural communities have limited knowledge about genetics and genomics and are also underrepresented in genomic education initiatives. The purpose of this project was to assess genomic and epigenetic knowledge and beliefs in rural West Virginia. A total of 93 participants from three communities participated in focus groups and 68 participants completed a demographic survey. The age of the respondents ranged from 21 to 81 years. Most respondents had a household income of less than $40,000, were female and most were married, completed at least a HS/GED or some college education working either part-time or full-time. A Community Based Participatory Research process with focus groups and demographic questionnaires was used. Most participants had a basic understanding of genetics and epigenetics, but not genomics. Participants reported not knowing much of their family history and that their elders did not discuss such information. If the conversations occurred, it was only during times of crisis or an illness event. Mental health and substance abuse are topics that are not discussed with family in this rural population. Most of the efforts surrounding genetic/genomic understanding have focused on urban populations. This project is the first of its kind in West Virginia and has begun to lay the much needed infrastructure for developing educational initiatives and extending genomic research projects into our rural Appalachian communities. By empowering the public with education, regarding the influential role genetics, genomics, and epigenetics have on their health, we can begin to tackle the complex task of initiating behavior changes that will promote the health and well-being of individuals, families and communities.
Understanding Genomic Knowledge in Rural Appalachia: The West Virginia Genome Community Project

PubMed Central

Mallow, Jennifer A.; Theeke, Laurie A.; Crawford, Patricia; Prendergast, Elizabeth; Conner, Chuck; Richards, Tony; McKown, Barbara; Bush, Donna; Reed, Donald; Stabler, Meagan E.; Zhang, Jianjun; Dino, Geri; Barr, Taura L.

2016-01-01

Purpose: Rural communities have limited knowledge about genetics and genomics and are also underrepresented in genomic education initiatives. The purpose of this project was to assess genomic and epigenetic knowledge and beliefs in rural West Virginia. Sample: A total of 93 participants from three communities participated in focus groups and 68 participants completed a demographic survey. The age of the respondents ranged from 21 to 81 years. Most respondents had a household income of less than $40,000, were female and most were married, completed at least a HS/GED or some college education working either part-time or full-time. Method: A Community Based Participatory Research process with focus groups and demographic questionnaires was used. Findings: Most participants had a basic understanding of genetics and epigenetics, but not genomics. Participants reported not knowing much of their family history and that their elders did not discuss such information. If the conversations occurred, it was only during times of crisis or an illness event. Mental health and substance abuse are topics that are not discussed with family in this rural population. Conclusions: Most of the efforts surrounding genetic/genomic understanding have focused on urban populations. This project is the first of its kind in West Virginia and has begun to lay the much needed infrastructure for developing educational initiatives and extending genomic research projects into our rural Appalachian communities. By empowering the public with education, regarding the influential role genetics, genomics, and epigenetics have on their health, we can begin to tackle the complex task of initiating behavior changes that will promote the health and well-being of individuals, families and communities. PMID:27212895
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

PubMed

Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

PubMed Central

Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423
Two Low Coverage Bird Genomes and a Comparison of Reference-Guided versus De Novo Genome Assemblies

PubMed Central

Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies. PMID:25192061
Invited review: Inbreeding in the genomics era: Inbreeding, inbreeding depression, and management of genomic variability.

PubMed

Howard, Jeremy T; Pryce, Jennie E; Baes, Christine; Maltecca, Christian

2017-08-01

Traditionally, pedigree-based relationship coefficients have been used to manage the inbreeding and degree of inbreeding depression that exists within a population. The widespread incorporation of genomic information in dairy cattle genetic evaluations allows for the opportunity to develop and implement methods to manage populations at the genomic level. As a result, the realized proportion of the genome that 2 individuals share can be more accurately estimated instead of using pedigree information to estimate the expected proportion of shared alleles. Furthermore, genomic information allows genome-wide relationship or inbreeding estimates to be augmented to characterize relationships for specific regions of the genome. Region-specific stretches can be used to more effectively manage areas of low genetic diversity or areas that, when homozygous, result in reduced performance across economically important traits. The use of region-specific metrics should allow breeders to more precisely manage the trade-off between the genetic value of the progeny and undesirable side effects associated with inbreeding. Methods tailored toward more effectively identifying regions affected by inbreeding and their associated use to manage the genome at the herd level, however, still need to be developed. We have reviewed topics related to inbreeding, measures of relatedness, genetic diversity and methods to manage populations at the genomic level, and we discuss future challenges related to managing populations through implementing genomic methods at the herd and population levels. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing

PubMed Central

Staton, Margaret; Best, Teodora; Khodwekar, Sudhir; Owusu, Sandra; Xu, Tao; Xu, Yi; Jennings, Tara; Cronn, Richard; Arumuganathan, A. Kathiravetpilla; Coggeshall, Mark; Gailing, Oliver; Liang, Haiying; Romero-Severson, Jeanne; Schlarbaum, Scott; Carlson, John E.

2015-01-01

Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence. PMID:26698853
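Identifying genomic SSRs, as described in the abstract above, amounts to scanning sequence for short motifs repeated in tandem. The Python sketch below is a minimal illustration using a regular-expression back-reference; it is not the tool the authors used. The function name `find_ssrs` and the default thresholds (dinucleotide motifs, at least five tandem copies) are assumptions for the example, since the abstract does not state the search criteria.

```python
import re


def find_ssrs(seq, motif_len=2, min_repeats=5):
    """Find perfect tandem repeats of a fixed motif length.

    Hypothetical helper for illustration. Returns a list of
    (start, motif, repeat_count) tuples with 0-based start positions.
    Motifs made of a single repeated base (e.g. "AA") are skipped so that
    homopolymer runs are not reported as dinucleotide repeats.
    """
    if motif_len < 1 or min_repeats < 2:
        raise ValueError("motif_len must be >= 1 and min_repeats >= 2")
    # Group 1 captures the motif; \1{n,} requires n further tandem copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits
```

Primer design, as done for the SSR loci in the study, would then operate on the sequence flanking each reported start position; that step is outside the scope of this sketch.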
Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

PubMed

Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

2016-01-01

Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.
Genomics Education for the Public: Perspectives of Genomic Researchers and ELSI Advisors

PubMed Central

Jones, Sondra Smolek; Markey, Janell M.; Byerly, Katherine W.; Roberts, Megan C.

2014-01-01

Aims: For more than two decades genomic education of the public has been a significant challenge. As genomic information becomes integrated into daily life and routine clinical care, the need for public education is even more critical. We conducted a pilot study to learn how genomic researchers and ethical, legal, and social implications advisors who were affiliated with large-scale genomic variation studies have approached the issue of educating the public about genomics. Methods/Results: Semi-structured telephone interviews were conducted with researchers and advisors associated with the SNP/HAPMAP studies and the Cancer Genome Atlas Study. Respondents described approach(es) associated with educating the public about their study. Interviews were audio-recorded, transcribed, coded, and analyzed by team review. Although few respondents described formal educational efforts, most provided recommendations for what should/could be done, emphasizing the need for an overarching entity(s) to take responsibility to lead the effort to educate the public. Opposing views were described related to: who this should be; the overall goal of the educational effort; and the educational approach. Four thematic areas emerged: What is the rationale for educating the public about genomics?; Who is the audience?; Who should be responsible for this effort?; and What should the content be? Policy issues associated with these themes included the need to agree on philosophical framework(s) to guide the rationale, content, and target audiences for education programs; coordinate previous/ongoing educational efforts; and develop a centralized knowledge base. Suggestions for next steps are presented. Conclusion: A complex interplay of philosophical, professional, and cultural issues can create impediments to genomic education of the public. Many challenges, however, can be addressed by agreement on a guiding philosophical framework(s) and identification of a responsible entity(s) to provide
Translational Genomics: Practical Applications of the Genomic Revolution in Breast Cancer.

PubMed

Yates, Lucy R; Desmedt, Christine

2017-06-01

The genomic revolution has fundamentally changed our perception of breast cancer. It is now apparent from DNA-based massively parallel sequencing data that at the genomic level, every breast cancer is unique and shaped by the mutational processes to which it was exposed during its lifetime. More than 90 breast cancer driver genes have been identified as recurrently mutated, and many occur at low frequency across the breast cancer population. Certain cancer genes are associated with traditionally defined histologic subtypes, but genomic intertumoral heterogeneity exists even between cancers that appear the same under the microscope. Most breast cancers contain subclonal populations, many of which harbor driver alterations, and subclonal structure is typically remodeled over time, across metastasis and as a consequence of treatment interventions. Genomics is deepening our understanding of breast cancer biology, contributing to an accelerated phase of targeted drug development and providing insights into resistance mechanisms. Genomics is also providing tools necessary to deliver personalized cancer medicine, but a number of challenges must still be addressed. Clin Cancer Res; 23(11); 2630-9. ©2017 AACR. See all articles in this CCR Focus section, "Breast Cancer Research: From Base Pairs to Populations."
High-density marker profiling confirms ancestral genomes of Avena species and identifies D-genome chromosomes of hexaploid oat.

PubMed

Yan, Honghai; Bekele, Wubishet A; Wight, Charlene P; Peng, Yuanying; Langdon, Tim; Latta, Robert G; Fu, Yong-Bi; Diederichsen, Axel; Howarth, Catherine J; Jellen, Eric N; Boyle, Brian; Wei, Yuming; Tinker, Nicholas A

2016-11-01

Genome analysis of 27 oat species identifies ancestral groups, delineates the D genome, and identifies ancestral origin of 21 mapped chromosomes in hexaploid oat. We investigated genomic relationships among 27 species of the genus Avena using high-density genetic markers revealed by genotyping-by-sequencing (GBS). Two methods of GBS analysis were used: one based on tag-level haplotypes that were previously mapped in cultivated hexaploid oat (A. sativa), and one intended to sample and enumerate tag-level haplotypes originating from all species under investigation. Qualitatively, both methods gave similar predictions regarding the clustering of species and shared ancestral genomes. Furthermore, results were consistent with previous phylogenies of the genus obtained with conventional approaches, supporting the robustness of whole genome GBS analysis. Evidence is presented to justify the final and definitive classification of the tetraploids A. insularis, A. maroccana (=A. magna), and A. murphyi as containing D-plus-C genomes, and not A-plus-C genomes, as is most often specified in past literature. Through electronic painting of the 21 chromosome representations in the hexaploid oat consensus map, we show how the relative frequency of matches between mapped hexaploid-derived haplotypes and AC (DC)-genome tetraploids vs. A- and C-genome diploids can accurately reveal the genome origin of all hexaploid chromosomes, including the approximate positions of inter-genome translocations. Evidence is provided that supports the continued classification of a diverged B genome in AB tetraploids, and it is confirmed that no extant A-genome diploids, including A. canariensis, are similar enough to the D genome of tetraploid and hexaploid oat to warrant consideration as a D-genome diploid.
