direct genomic values: Topics by Science.gov

Sample records for direct genomic values

Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation.

PubMed

Saatchi, Mahdi; McClure, Mathew C; McKay, Stephanie D; Rolf, Megan M; Kim, JaeWoo; Decker, Jared E; Taxis, Tasia M; Chapple, Richard H; Ramey, Holly R; Northcutt, Sally L; Bauck, Stewart; Woodward, Brent; Dekkers, Jack C M; Fernando, Rohan L; Schnabel, Robert D; Garrick, Dorian J; Taylor, Jeremy F

2011-11-28

Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy.
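The cross-validation design in this abstract (train on four of the five clusters, validate in the one held out) can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the cluster labels have already been computed (for example by K-means on pedigree relationships), and the function name `leave_one_group_out` and the animal IDs are hypothetical.

```python
from typing import Dict, List, Tuple


def leave_one_group_out(groups: Dict[str, int]) -> List[Tuple[List[str], List[str]]]:
    """Return one (training_ids, validation_ids) pair per distinct group label.

    `groups` maps an animal ID to its cluster label. Each label is held out
    once for validation while all other labels form the training set.
    """
    splits = []
    for held_out in sorted(set(groups.values())):
        train = sorted(a for a, g in groups.items() if g != held_out)
        valid = sorted(a for a, g in groups.items() if g == held_out)
        splits.append((train, valid))
    return splits


# Hypothetical labels: ten animals spread over five clusters (illustration only).
example_groups = {f"sire{i}": i % 5 for i in range(10)}
splits = leave_one_group_out(example_groups)
```

With five cluster labels this yields five splits, and no animal appears in both the training and validation set of the same split.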
Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation

PubMed Central

2011-01-01

Background Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Methods Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Results Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
Conclusions These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy. PMID:22122853
Comparison of Bayesian models to estimate direct genomic values in multi-breed commercial beef cattle

USDA-ARS's Scientific Manuscript database

Background Several studies have examined the accuracy of genomic selection both within and across purebred beef or dairy populations. However, the accuracy of direct genomic breeding values (DGVs) has been less well studied in crossbred or admixed cattle populations. We used a population of 3,240 cr...
Visualization of the transmission of direct genomic values for paternal and maternal chromosomes for 15 traits in U.S. Brown Swiss, Holstein, and Jersey cattle

USDA-ARS's Scientific Manuscript database

Reliable haplotypes are available for 171,420 Brown Swiss, Holstein, and Jersey bulls and cows that received genomic evaluations in April 2012. Differences in least-squares means of direct genomic values (DGV) for paternal and maternal haplotypes of Bos taurus autosome (BTA) 1, 6, 14, and 18 for lif...
Approximation of reliability of direct genomic breeding values

USDA-ARS's Scientific Manuscript database

Two methods to efficiently approximate theoretical genomic reliabilities are presented. The first method is based on the direct inverse of the left hand side (LHS) of mixed model equations. It uses the genomic relationship matrix for a small subset of individuals with the highest genomic relationshi...
Dissection of genomic correlation matrices using multivariate factor analysis in dairy and dual-purpose cattle breeds

USDA-ARS's Scientific Manuscript database

SNP effects estimated in genomic selection programs allow for the prediction of direct genomic values (DGV) both at genome-wide and chromosomal level. As a consequence, genome-wide (G_GW) or chromosomal (G_CHR) correlation matrices between genomic predictions for different traits can be calculated. ...
New Views on Strand Asymmetry in Insect Mitochondrial Genomes

PubMed Central

Wei, Shu-Jun; Shi, Min; Chen, Xue-Xin; Sharkey, Michael J.; van Achterberg, Cornelis; Ye, Gong-Yin; He, Jun-Hua

2010-01-01

Strand asymmetry in nucleotide composition is a remarkable feature of animal mitochondrial genomes. Understanding the mutation processes that shape strand asymmetry is essential for comprehensive knowledge of genome evolution, demographical population history and accurate phylogenetic inference. Previous studies found that the relative contributions of different substitution types to strand asymmetry are associated with replication alone or both replication and transcription. However, the relative contributions of replication and transcription to strand asymmetry remain unclear. Here we conducted a broad survey of strand asymmetry across 120 insect mitochondrial genomes, with special reference to the correlation between the signs of skew values and replication orientation/gene direction. The results show that the sign of GC skew on entire mitochondrial genomes is reversed in all species of three distantly related families of insects, Philopteridae (Phthiraptera), Aleyrodidae (Hemiptera) and Braconidae (Hymenoptera); the replication-related elements in the A+T-rich regions of these species are inverted, confirming that reversal of strand asymmetry (GC skew) was caused by inversion of replication origin; and finally, the sign of GC skew value is associated with replication orientation but not with gene direction, while that of AT skew value varies with gene direction, replication and codon positions used in analyses. These findings show that deaminations during replication and other mutations contribute more than selection on amino acid sequences to strand compositions of G and C, and that the replication process has a stronger effect on A and T content than does transcription. Our results may contribute to genome-wide studies of replication and transcription mechanisms. PMID:20856815
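The skew values discussed in this abstract can be illustrated with a short Python sketch. The abstract does not state its formulas, so the sketch assumes the conventional definitions, GC skew = (G - C)/(G + C) and AT skew = (A - T)/(A + T), computed on one strand. The function names are hypothetical.

```python
def skew(seq: str, first: str, second: str) -> float:
    """Return (n_first - n_second) / (n_first + n_second) for one strand."""
    seq = seq.upper()
    n_first, n_second = seq.count(first), seq.count(second)
    if n_first + n_second == 0:
        raise ValueError("sequence contains neither base")
    return (n_first - n_second) / (n_first + n_second)


def gc_skew(seq: str) -> float:
    """Conventional GC skew (assumed definition): positive when G exceeds C."""
    return skew(seq, "G", "C")


def at_skew(seq: str) -> float:
    """Conventional AT skew (assumed definition): positive when A exceeds T."""
    return skew(seq, "A", "T")
```

Under these definitions a sign reversal of GC skew simply means that C outnumbers G on the strand where G would usually be in excess.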
Value-based genomics.

PubMed

Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi

2018-03-20

Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics.
Value-based genomics

PubMed Central

Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi

2018-01-01

Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics. PMID:29644010
Improving draft genome contiguity with reference-derived in silico mate-pair libraries.

PubMed

Grau, José Horacio; Hackl, Thomas; Koepfli, Klaus-Peter; Hofreiter, Michael

2018-05-01

Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low coverage data and/or only distantly related reference genome assemblies are available. In order to improve genome contiguity, we have developed Cross-Species Scaffolding, a new pipeline that imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. We show how genome assembly metrics and gene prediction dramatically improve with our pipeline by assembling two primate genomes solely based on ∼30x coverage of shotgun sequencing data.
Short communication: Genotyping of cows to speed up availability of genomic estimated breeding values for direct health traits in Austrian Fleckvieh (Simmental) cattle--genetic and economic aspects.

PubMed

Egger-Danner, C; Schwarzenbacher, H; Willam, A

2014-07-01

The aim of this study was to quantify the impact of genotyping cows with reliable phenotypes for direct health traits on annual monetary genetic gain (AMGG) and discounted profit. The calculations were based on a deterministic approach using ZPLAN software (University of Hohenheim, Stuttgart, Germany). It was assumed that increases in reliability of the total merit index (TMI) of 5, 15, and 25 percentage points were achieved through genotyping 5,000, 25,000, and 50,000 cows, respectively. Costs for phenotyping, genotyping, and genomic estimated breeding values vary between €150 and €20 per cow. The gain in genotyping cows for traits with medium to high heritability is more than for direct health traits with low heritability. The AMGG is increased by 1.5% if the reliability of TMI is 5 percentage points higher (i.e., 5,000 cows genotyped) and 6.53% higher AMGG can be expected when the reliability of TMI is increased by 25 percentage points (i.e., 50,000 cows genotyped). The discounted profit depends not only on the costs of genotyping but also on the population size. This study indicates that genotyping cows with reliable phenotypes is feasible to speed up the availability of genomic estimated breeding values for direct health traits. But, because of the huge amount of valid phenotypes and genotypes needed to establish an efficient genomic evaluation, it is likely that financial constraints will be the main limiting factor for implementation into breeding programs such as Fleckvieh Austria. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Genomic prediction using different estimation methodology, blending and cross-validation techniques for growth traits and visual scores in Hereford and Braford cattle.

PubMed

Campos, G S; Reimann, F A; Cardoso, L L; Ferreira, C E R; Junqueira, V S; Schmidt, P I; Braccini Neto, J; Yokoo, M J I; Sollero, B P; Boligon, A A; Cardoso, F F

2018-05-07

The objective of the present study was to evaluate the accuracy and bias of direct and blended genomic predictions using different methods and cross-validation techniques for growth traits (weight and weight gains) and visual scores (conformation, precocity, muscling and size) obtained at weaning and at yearling in Hereford and Braford breeds. Phenotypic data contained 126,290 animals belonging to the Delta G Connection genetic improvement program, and a set of 3,545 animals genotyped with the 50K chip and 131 sires with the 777K. After quality control, 41,045 markers remained for all animals. An animal model was used to estimate (co)variance components and to predict breeding values, which were later used to calculate the deregressed estimated breeding values (DEBV). Animals with genotype and phenotype for the traits studied were divided into four or five groups by random and k-means clustering cross-validation strategies. The accuracies of the direct genomic values (DGV) were of moderate to high magnitude for weaning and yearling traits, ranging from 0.19 to 0.45 for the k-means and 0.23 to 0.78 for random clustering among all traits. The greatest gain in relation to the pedigree BLUP (PBLUP) was 9.5% with the BayesB method with both the k-means and the random clustering. Blended genomic value accuracies ranged from 0.19 to 0.56 for k-means and from 0.21 to 0.82 for random clustering. The analyses using the historical pedigree and phenotypes contributed additional information to calculate the GEBV and in general, the largest gains were for the single-step (ssGBLUP) method in bivariate analyses with a mean increase of 43.00% among all traits measured at weaning and of 46.27% for those evaluated at yearling. The accuracy values for the marker effects estimation methods were lower for k-means clustering, indicating that the training set relationship to the selection candidates is a major factor affecting accuracy of genomic predictions. 
The gains in accuracy obtained with genomic blending methods, mainly ssGBLUP in bivariate analyses, indicate that genomic predictions should be used as a tool to improve genetic gains in relation to the traditional PBLUP selection.
Genomic analysis and geographic visualization of H5N1 and SARS-CoV.

PubMed

Hill, Andrew W; Alexandrov, Boyan; Guralnick, Robert P; Janies, Daniel

2007-10-11

Emerging infectious diseases and organisms present critical issues of national security, public health, and economic welfare. We still understand little about the zoonotic potential of many viruses. To this end, we are developing novel database tools to manage comparative genomic datasets. These tools add value because they allow us to summarize the direction, frequency and order of genomic changes. We will perform numerous real world tests with our tools with both Avian Influenza and Coronaviruses.
Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.

PubMed

Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G

2010-06-01

The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
Prediction of genomic breeding values for dairy traits in Italian Brown and Simmental bulls using a principal component approach.

PubMed

Pintus, M A; Gaspa, G; Nicolazzi, E L; Vicario, D; Rossoni, A; Ajmone-Marsan, P; Nardone, A; Dimauro, C; Macciotta, N P P

2012-06-01

The large number of markers available compared with phenotypes represents one of the main issues in genomic selection. In this work, principal component analysis was used to reduce the number of predictors for calculating genomic breeding values (GEBV). Bulls of 2 cattle breeds farmed in Italy (634 Brown and 469 Simmental) were genotyped with the 54K Illumina beadchip (Illumina Inc., San Diego, CA). After data editing, 37,254 and 40,179 single nucleotide polymorphisms (SNP) were retained for Brown and Simmental, respectively. Principal component analysis carried out on the SNP genotype matrix extracted 2,257 and 3,596 new variables in the 2 breeds, respectively. Bulls were sorted by birth year to create reference and prediction populations. The effect of principal components on deregressed proofs in reference animals was estimated with a BLUP model. Results were compared with those obtained by using SNP genotypes as predictors with either the BLUP or Bayes_A method. Traits considered were milk, fat, and protein yields, fat and protein percentages, and somatic cell score. The GEBV were obtained for prediction population by blending direct genomic prediction and pedigree indexes. No substantial differences were observed in squared correlations between GEBV and EBV in prediction animals between the 3 methods in the 2 breeds. The principal component analysis method allowed for a reduction of about 90% in the number of independent variables when predicting direct genomic values, with a substantial decrease in calculation time and without loss of accuracy. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Short communication: Implementation of a breeding value for heat tolerance in Australian dairy cattle.

PubMed

Nguyen, Thuy T T; Bowman, Phil J; Haile-Mariam, Mekonnen; Nieuwhof, Gert J; Hayes, Benjamin J; Pryce, Jennie E

2017-09-01

Excessive ambient temperature and humidity can impair milk production and fertility of dairy cows. Selection for heat-tolerant animals is one possible option to mitigate the effects of heat stress. To enable selection for this trait, we describe the development of a heat tolerance breeding value for Australian dairy cattle. We estimated the direct genomic values of decline in milk, fat, and protein yield per unit increase of temperature-humidity index (THI) using 46,726 single nucleotide polymorphisms and a reference population of 2,236 sires and 11,853 cows for Holsteins and 506 sires and 4,268 cows for Jerseys. This new direct genomic value is the Australian genomic breeding value for heat tolerance (HT ABVg). The components of the HT ABVg are the decline in milk, fat, and protein per unit increase in THI when THI increases above the threshold of 60. These components are weighted by their respective economic values, assumed to be equivalent to the weights applied to milk, fat, and protein yield in the Australian selection indices. Within each breed, the HT ABVg is then standardized to have a mean of 100 and standard deviation (SD) of 5, which is consistent with the presentation of breeding values for many other traits in Australia. The HT ABVg ranged from -4 to +3 SD in Holsteins and -3 to +4 SD in Jerseys. The mean reliabilities of HT ABVg among validation sires, calculated from the prediction error variance and additive genetic variance, were 38% in both breeds. The range in ABVg and their reliability suggests that HT can be improved using genomic selection. There has been a deterioration in the genetic trend of HT, and to moderate the decline it is suggested that the HT ABVg should be included in a multitrait economic index with other traits that contribute to farm profit. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
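The standardization step in this abstract (rescaling within each breed to a mean of 100 and an SD of 5) can be sketched in Python. This is an illustrative assumption about the arithmetic, not the published procedure: the function name `standardize` is hypothetical, and the use of the population SD rather than the sample SD is a choice made here, since the abstract does not specify it.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float],
                target_mean: float = 100.0,
                target_sd: float = 5.0) -> List[float]:
    """Rescale values linearly so they have the requested mean and SD.

    Uses the population standard deviation (an assumption of this sketch).
    """
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("values have zero variance and cannot be standardized")
    return [target_mean + target_sd * (v - m) / s for v in values]


# Hypothetical raw index values for one breed (illustration only).
scaled = standardize([1.0, 2.0, 3.0, 4.0])
```

On this scale, an animal at -4 SD has a value of 100 - 4 × 5 = 80 and one at +3 SD has a value of 100 + 3 × 5 = 115.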
Should direct-to-consumer personalized genomic medicine remain unregulated?: a rebuttal of the defenses.

PubMed

Valles, Sean A

2012-01-01

Direct-to-consumer personalized genomic medicine has recently grown into a small industry that sells mail-order DNA sample kits and then provides disease risk assessments, typically based upon results from genome-trait association studies. The companies selling these services have been largely exempted from FDA regulation in the United States. Testing kit companies and their supporters have defended the industry's unregulated status using two arguments. First, defenders have argued that mere absence of harm is all that must be proved for mail-order tests to be acceptable. Second, defenders of mail-order testing have argued that there is an individual right to the tests' information. This article rebuts these arguments. The article demonstrates that the direct-to-consumer market has resulted in the sidelining of clinical utility (medical value to patients), leading to the development of certain mail-order tests that do not promote customers' interests and to defenders' downplaying of a potentially damaging empirical study of mail-order genomic testing's effects on consumers. The article also shows that the notion of an individual right to these tests rests on a flawed reading of the key service provided by mail-order companies, which is the provision of medical interpretations, not simply genetic information. Absent these two justifications, there is no reason to exempt direct-to-consumer personalized genomic medicine from stringent federal oversight.
Selective intra-dinucleotide interactions and periodicities of bases separated by K sites: a new vision and tool for phylogeny analyses.

PubMed

Valenzuela, Carlos Y

2017-02-13

Direct tests of the random or non-random distribution of nucleotides on genomes have been devised to test the hypothesis of neutral, nearly-neutral or selective evolution. These tests are based on the direct base distribution and are independent of the functional (coding or non-coding) or structural (repeated or unique sequences) properties of the DNA. The first approach described the longitudinal distribution of bases in tandem repeats under the Bose-Einstein statistics. A huge deviation from randomness was found. A second approach was the study of the base distribution within dinucleotides whose bases were separated by 0, 1, 2… K nucleotides. Again, an enormous difference from the random distribution was found with significances out of tables and programs. These test values were periodical and included the 16 dinucleotides. For example, a high "positive" (more observed than expected dinucleotides) value, found in dinucleotides whose bases were separated by (3K + 2) sites, was preceded by two smaller "negative" (fewer observed than expected dinucleotides) values, whose bases were separated by (3K) or (3K + 1) sites. We examined mtDNAs, prokaryote genomes and some eukaryote chromosomes and found that the significant non-random interactions and periodicities were present up to 1000 or more sites of base separation and in human chromosome 21 until separations of more than 10 million sites. Each nucleotide has its own significant value of its distance to neutrality; this yields 16 hierarchical significances. A three dimensional table with the number of sites of separation between the bases and the 16 significances (the third dimension is the dinucleotide, individual or taxon involved) gives directly an evolutionary state of the analyzed genome that can be used to obtain phylogenies. An example is provided.
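The idea of counting dinucleotides whose bases are separated by K sites, and comparing observed with expected counts, can be sketched in Python. This is an illustration only: the abstract does not give the test statistic, so the expected counts here assume independent bases drawn from the sequence's own base frequencies, and both function names are hypothetical.

```python
from collections import Counter
from typing import Dict


def separated_pair_counts(seq: str, k: int) -> Counter:
    """Count base pairs (x, y) where y lies k sites after x (k = 0: adjacent)."""
    seq = seq.upper()
    step = k + 1
    return Counter(seq[i] + seq[i + step] for i in range(len(seq) - step))


def expected_pair_counts(seq: str, k: int) -> Dict[str, float]:
    """Expected counts if bases were independent (an assumption of this sketch)."""
    seq = seq.upper()
    n_pairs = max(0, len(seq) - (k + 1))
    freq = Counter(seq)
    total = len(seq)
    return {x + y: n_pairs * (freq[x] / total) * (freq[y] / total)
            for x in freq for y in freq}


# Toy sequence (illustration only): compare observed and expected at k = 3.
observed = separated_pair_counts("ACGTACGT", 3)
expected = expected_pair_counts("ACGTACGT", 3)
```

A "positive" value in the abstract's sense corresponds to an observed count above the expected count for a given dinucleotide and separation, and a "negative" value to an observed count below it.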
Genome Alignment Spanning Major Poaceae Lineages Reveals Heterogeneous Evolutionary Rates and Alters Inferred Dates for Key Evolutionary Events.

PubMed

Wang, Xiyin; Wang, Jingpeng; Jin, Dianchuan; Guo, Hui; Lee, Tae-Ho; Liu, Tao; Paterson, Andrew H

2015-06-01

Multiple comparisons among genomes can clarify their evolution, speciation, and functional innovations. To date, the genome sequences of eight grasses representing the most economically important Poaceae (grass) clades have been published, and their genomic-level comparison is an essential foundation for evolutionary, functional, and translational research. Using a formal and conservative approach, we aligned these genomes. Direct comparison of paralogous gene pairs all duplicated simultaneously reveals striking variation in evolutionary rates among whole genomes, with nucleotide substitution slowest in rice and up to 48% faster in other grasses, adding a new dimension to the value of rice as a grass model. We reconstructed ancestral genome contents for major evolutionary nodes, potentially contributing to understanding the divergence and speciation of grasses. Recent fossil evidence suggests revisions of the estimated dates of key evolutionary events, implying that the pan-grass polyploidization occurred ∼96 million years ago and could not be related to the Cretaceous-Tertiary mass extinction as previously inferred. Adjusted dating to reflect both updated fossil evidence and lineage-specific evolutionary rates suggested that maize subgenome divergence and maize-sorghum divergence were virtually simultaneous, a coincidence that would be explained if polyploidization directly contributed to speciation. This work lays a solid foundation for Poaceae translational genomics. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Systematic bias in genomic classification due to contaminating non-neoplastic tissue in breast tumor samples.

PubMed

Elloumi, Fathi; Hu, Zhiyuan; Li, Yan; Parker, Joel S; Gulley, Margaret L; Amos, Keith D; Troester, Melissa A

2011-06-30

Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample comprised of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification. To assess the performance characteristics of genomic classification to systematic error from normal contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors. Simulations of normal tissue contamination caused misclassification of tumors in all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had unpredictable direction of bias (either higher or lower risk of relapse resulted from normal contamination), one signature showed predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination and these corrections improved sensitivity and negative predictive value. For all three assays quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability. Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.

Controlling new knowledge: Genomic science, governance and the politics of bioinformatics.

PubMed

Salter, Brian; Salter, Charlotte

2017-04-01

The rise of bioinformatics is a direct response to the political difficulties faced by genomics in its quest to be a new biomedical innovation, and the value of bioinformatics lies in its role as the bridge between the promise of genomics and its realization in the form of health benefits. Western scientific elites are able to use their close relationship with the state to control and facilitate the emergence of new domains compatible with the existing distribution of epistemic power - all within the embrace of public trust. The incorporation of bioinformatics as the saviour of genomics had to be integrated with the operation of two key aspects of governance in this field: the definition and ownership of the new knowledge. This was achieved mainly by the development of common standards and by the promotion of the values of communality, open access and the public ownership of data to legitimize and maintain the governance power of publicly funded genomic science. Opposition from industry advocating the private ownership of knowledge has been largely neutered through the institutions supporting the science-state concordat. However, in order for translation into health benefits to occur and public trust to be assured, genomic and clinical data have to be integrated and knowledge ownership agreed upon across the separate and distinct governance territories of scientist, clinical medicine and society. Tensions abound as science seeks ways of maintaining its control of knowledge production through the negotiation of new forms of governance with the institutions and values of clinicians and patients.
Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

PubMed

Kärkkäinen, Hanni P; Sillanpää, Mikko J

2013-09-04

Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
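
The threshold idea can be illustrated with a toy mapping from an unobserved Gaussian liability to the observed binary, ordinal, or censored record. This is only a sketch of the data model, not the authors' generalized expectation maximization algorithm. The threshold values and function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical sketch of how one latent liability yields different observed data.

def to_ordinal(liability, thresholds):
    """Ordinal category = number of (sorted) thresholds at or below the liability."""
    return bisect_right(thresholds, liability)


def to_binary(liability, cutoff):
    """Case-control coding: 1 if the liability reaches the cutoff, else 0."""
    return int(liability >= cutoff)


def right_censor(liability, limit):
    """Censored Gaussian record: (recorded value, was_censored)."""
    return (min(liability, limit), liability > limit)


liabilities = [-1.2, -0.1, 0.4, 1.7]   # made-up latent values
thresholds = [-0.5, 0.5, 1.5]          # made-up ordinal cut points

ordinal = [to_ordinal(x, thresholds) for x in liabilities]
binary = [to_binary(x, 0.5) for x in liabilities]
censored = [right_censor(x, 1.0) for x in liabilities]

print(ordinal)   # four categories keep more of the ordering
print(binary)    # dichotomizing collapses the first three animals together
print(censored)
```

The ordinal and censored codings retain information about the liability that the case-control coding discards, which is the extra information the abstract refers to.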
Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

PubMed Central

Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

2013-01-01

Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
Technical note: Equivalent genomic models with a residual polygenic effect.

PubMed

Liu, Z; Goddard, M E; Hayes, B J; Reinhardt, F; Reents, R

2016-03-01

Routine genomic evaluations in animal breeding are usually based on either a BLUP with genomic relationship matrix (GBLUP) or single nucleotide polymorphism (SNP) BLUP model. For a multi-step genomic evaluation, these 2 alternative genomic models were proven to give equivalent predictions for genomic reference animals. The model equivalence was verified also for young genotyped animals without phenotypes. Due to incomplete linkage disequilibrium of SNP markers to genes or causal mutations responsible for genetic inheritance of quantitative traits, SNP markers cannot explain all the genetic variance. A residual polygenic effect is normally fitted in the genomic model to account for the incomplete linkage disequilibrium. In this study, we start by showing the proof that the multi-step GBLUP and SNP BLUP models are equivalent for the reference animals, when they have a residual polygenic effect included. Second, the equivalence of both multi-step genomic models with a residual polygenic effect was also verified for young genotyped animals without phenotypes. Additionally, we derived formulas to convert genomic estimated breeding values of the GBLUP model to its components, direct genomic values and residual polygenic effect. Third, we made a proof that the equivalence of these 2 genomic models with a residual polygenic effect holds also for single-step genomic evaluation. Both the single-step GBLUP and SNP BLUP models lead to equal prediction for genotyped animals with phenotypes (e.g., reference animals), as well as for (young) genotyped animals without phenotypes. Finally, these 2 single-step genomic models with a residual polygenic effect were proven to be equivalent for estimation of SNP effects, too. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
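
The basic equivalence can be checked numerically on a toy data set. The sketch below is a simplified illustration, not the paper's derivation: it omits fixed effects and the residual polygenic effect, takes G = ZZ' without scaling, and uses a made-up genotype matrix `Z`, phenotypes `y`, and variance ratio `lam`. All helper names are hypothetical.

```python
# Hypothetical sketch: SNP-BLUP and GBLUP give the same direct genomic values
# when G = Z Z' and both use the same variance ratio lam (no polygenic effect).

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def add_ridge(a, lam):
    """Return a + lam * I for a square matrix a."""
    return [[a[i][j] + (lam if i == j else 0.0) for j in range(len(a))]
            for i in range(len(a))]


Z = [[1.0, -1.0, 0.0],    # 4 animals x 3 SNP, made-up centered genotypes
     [0.0, 1.0, 1.0],
     [-1.0, 0.0, 1.0],
     [1.0, 1.0, -1.0]]
y = [1.2, -0.4, 0.3, 0.9]  # made-up phenotypes
lam = 2.0                  # assumed residual-to-SNP variance ratio

# SNP-BLUP: solve (Z'Z + lam I) u = Z'y, then direct genomic value = Z u.
Zt = transpose(Z)
u_hat = solve(add_ridge(matmul(Zt, Z), lam), matvec(Zt, y))
dgv_snp_blup = matvec(Z, u_hat)

# GBLUP: solve (G + lam I) alpha = y, then direct genomic value = G alpha.
G = matmul(Z, Zt)
alpha = solve(add_ridge(G, lam), y)
dgv_gblup = matvec(G, alpha)

print(dgv_snp_blup)
print(dgv_gblup)
```

The two vectors agree to rounding error because Z(Z'Z + λI)⁻¹Z' = ZZ'(ZZ' + λI)⁻¹ for any λ > 0.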
TU-CD-BRB-12: Radiogenomics of MRI-Guided Prostate Cancer Biopsy Habitats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoyanova, R; Lynne, C; Abraham, S

2015-06-15

Purpose: Diagnostic prostate biopsies are subject to sampling bias. We hypothesize that quantitative imaging with multiparametric (MP)-MRI can more accurately direct targeted biopsies to index lesions associated with highest risk clinical and genomic features. Methods: Regionally distinct prostate habitats were delineated on MP-MRI (T2-weighted, perfusion and diffusion imaging). Directed biopsies were performed on 17 habitats from 6 patients using MRI-ultrasound fusion. Biopsy location was characterized with 52 radiographic features. Transcriptome-wide analysis of 1.4 million RNA probes was performed on RNA from each habitat. Genomic features with insignificant expression values (<0.25) and interquartile range <0.5 were filtered, leaving a total of 212 genes. Correlation between imaging features, genes and a 22 feature genomic classifier (GC), developed as a prognostic assay for metastasis after radical prostatectomy, was investigated. Results: High quality genomic data was derived from 17 (100%) biopsies. Using the 212 ‘unbiased’ genes, the samples clustered by patient origin in unsupervised analysis. When only prostate cancer related genomic features were used, hierarchical clustering revealed samples clustered by needle-biopsy Gleason score (GS). Similarly, principal component analysis of the imaging features found the primary source of variance segregated the samples into high (≥7) and low (6) GS. Pearson’s correlation analysis of genes with significant expression showed two main patterns of gene expression clustering prostate peripheral and transitional zone MRI features. Two-way hierarchical clustering of GC with radiomics features resulted in the expected groupings of high and low expressed genes in this metastasis signature. Conclusions: MP-MRI-targeted diagnostic biopsies can potentially improve risk stratification by directing pathological and genomic analysis to clinically significant index lesions. 
As determinant lesions are more reliably identified, targeting with radiotherapy should improve outcome. This is the first demonstration of a link between quantitative imaging features (radiomics) with genomic features in MRI-directed prostate biopsies. The research was supported by NIH-NCI R01 CA 189295 and R01 CA 189295; E Davicioni is partial owner of GenomeDx Biosciences, Inc. M Takhar, N Erho, L Lam, C Buerki and E Davicioni are current employees at GenomeDx Biosciences, Inc.
Evaluation of FTA® paper for storage of oral meta-genomic DNA.

PubMed

Foitzik, Magdalena; Stumpp, Sascha N; Grischke, Jasmin; Eberhard, Jörg; Stiesch, Meike

2014-10-01

The purpose of the present study was to evaluate the short-term storage of meta-genomic DNA from native oral biofilms on FTA® paper. Thirteen volunteers of both sexes received an acrylic splint for intraoral biofilm formation over a period of 48 hours. The biofilms were collected, resuspended in phosphate-buffered saline, and either stored on FTA® paper or directly processed by standard laboratory DNA extraction. The nucleic acid extraction efficiencies were evaluated by 16S rDNA targeted SSCP fingerprinting. The acquired banding pattern of FTA-derived meta-genomic DNA was compared to a standard DNA preparation protocol. Sensitivity and positive predictive values were calculated. The volunteers showed inter-individual differences in their bacterial species composition. A total of 200 bands were found for both methods and 85% of the banding patterns were equal, representing a sensitivity of 0.941 and a false-negative predictive value of 0.059. Meta-genomic DNA sampling, extraction, and adhesion using FTA® paper is a reliable method for storage of microbial DNA for a short period of time.
Controlling new knowledge: Genomic science, governance and the politics of bioinformatics

PubMed Central

Salter, Brian; Salter, Charlotte

2017-01-01

The rise of bioinformatics is a direct response to the political difficulties faced by genomics in its quest to be a new biomedical innovation, and the value of bioinformatics lies in its role as the bridge between the promise of genomics and its realization in the form of health benefits. Western scientific elites are able to use their close relationship with the state to control and facilitate the emergence of new domains compatible with the existing distribution of epistemic power – all within the embrace of public trust. The incorporation of bioinformatics as the saviour of genomics had to be integrated with the operation of two key aspects of governance in this field: the definition and ownership of the new knowledge. This was achieved mainly by the development of common standards and by the promotion of the values of communality, open access and the public ownership of data to legitimize and maintain the governance power of publicly funded genomic science. Opposition from industry advocating the private ownership of knowledge has been largely neutered through the institutions supporting the science-state concordat. However, in order for translation into health benefits to occur and public trust to be assured, genomic and clinical data have to be integrated and knowledge ownership agreed upon across the separate and distinct governance territories of scientist, clinical medicine and society. Tensions abound as science seeks ways of maintaining its control of knowledge production through the negotiation of new forms of governance with the institutions and values of clinicians and patients. PMID:28056721
Impact of direct-to-consumer genomic testing at long term follow-up.

PubMed

Bloss, Cinnamon S; Wineinger, Nathan E; Darst, Burcu F; Schork, Nicholas J; Topol, Eric J

2013-06-01

There are few empirical data to inform the debate surrounding the use and regulation of direct-to-consumer (DTC) genome-wide disease risk tests. This study aimed to determine the long term psychological, behavioural, and clinical impacts of genomic risk testing for common disease. The Scripps Genomic Health Initiative is a prospective longitudinal cohort study of adults who purchased the Navigenics Health Compass, a commercially available genomic test. Web based assessments were administered at baseline, short (3 months), and long term (1 year) follow-up. 2240 participants completed either or both follow-ups and a subset of 1325 completed long term follow-up. There were no significant differences from baseline in anxiety (p=0.50), fat intake (p=0.34), or exercise (p=0.39) at long term follow-up, and 96.8% of the sample had no test related distress. Longitudinal linear mixed model analyses were consistent with results of cross-sectional analyses. Screening test completion was associated with sharing genomic test results with a physician (36.0% shared; p<0.001) and perceived utility of the test (61.5% high perceived utility; p=0.002), but was not associated with the genomic risk estimate values themselves. Over a third of DTC genomic test recipients shared their results with their own physician during an approximate 1 year follow-up period, and this sharing was associated with higher screening test completion. Genomic testing was not associated with long term psychological risks, and most participants reportedly perceived the test to be of high personal utility.
Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

PubMed

Chin, Chen-Shan; Alexander, David H; Marks, Patrick; Klammer, Aaron A; Drake, James; Heiner, Cheryl; Clum, Alicia; Copeland, Alex; Huddleston, John; Eichler, Evan E; Turner, Stephen W; Korlach, Jonas

2013-06-01

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
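
Consensus accuracy figures such as 99.999% are often restated on the Phred scale, Q = -10 log10(error probability). The sketch below assumes that conventional definition and is not part of HGAP itself. The function names are hypothetical.

```python
import math

# Hypothetical helpers converting between accuracy and Phred-scaled quality.

def phred_from_accuracy(accuracy):
    """Phred quality Q = -10 * log10(1 - accuracy)."""
    error = 1.0 - accuracy
    if error <= 0.0:
        raise ValueError("accuracy must be below 1.0 for a finite Phred score")
    return -10.0 * math.log10(error)


def accuracy_from_phred(q):
    """Inverse mapping: accuracy = 1 - 10 ** (-Q / 10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


q_consensus = phred_from_accuracy(0.99999)
print(round(q_consensus, 3))          # about Q50, i.e. one error per 100,000 bases
print(accuracy_from_phred(30.0))      # Q30 corresponds to 99.9% accuracy
```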
Accuracy of the unified approach in maternally influenced traits - illustrated by a simulation study in the honey bee (Apis mellifera)

PubMed Central

2013-01-01

Background: The honey bee is an economically important species. With a rapid decline of the honey bee population, it is necessary to implement an improved genetic evaluation methodology. In this study, we investigated the applicability of the unified approach and its impact on the accuracy of estimation of breeding values for maternally influenced traits on a simulated dataset for the honey bee. Due to the limitation to the number of individuals that can be genotyped in a honey bee population, the unified approach can be an efficient strategy to increase the genetic gain and to provide a more accurate estimation of breeding values. We calculated the accuracy of estimated breeding values for two evaluation approaches, the unified approach and the traditional pedigree based approach. We analyzed the effects of different heritabilities as well as genetic correlation between direct and maternal effects on the accuracy of estimation of direct, maternal and overall breeding values (sum of maternal and direct breeding values). The genetic and reproductive biology of the honey bee was accounted for by taking into consideration characteristics such as colony structure, uncertain paternity, overlapping generations and polyandry. In addition, we used a modified numerator relationship matrix and a realistic genome for the honey bee. Results: For all values of heritability and correlation, the accuracy of overall estimated breeding values increased significantly with the unified approach. The increase in accuracy was always higher for the case when there was no correlation as compared to the case where a negative correlation existed between maternal and direct effects. Conclusions: Our study shows that the unified approach is a useful methodology for genetic evaluation in honey bees, and can contribute immensely to the improvement of traits of apicultural interest such as resistance to Varroa or production and behavioural traits. 
In particular, the study is of great interest for cases where negative correlation between maternal and direct effects and uncertain paternity exist, thus, is of relevance for other species as well. The study also provides an important framework for simulating genomic and pedigree datasets that will prove to be helpful for future studies. PMID:23647776
Accuracy of the unified approach in maternally influenced traits--illustrated by a simulation study in the honey bee (Apis mellifera).

PubMed

Gupta, Pooja; Reinsch, Norbert; Spötter, Andreas; Conrad, Tim; Bienefeld, Kaspar

2013-05-06

The honey bee is an economically important species. With a rapid decline of the honey bee population, it is necessary to implement an improved genetic evaluation methodology. In this study, we investigated the applicability of the unified approach and its impact on the accuracy of estimation of breeding values for maternally influenced traits on a simulated dataset for the honey bee. Due to the limitation to the number of individuals that can be genotyped in a honey bee population, the unified approach can be an efficient strategy to increase the genetic gain and to provide a more accurate estimation of breeding values. We calculated the accuracy of estimated breeding values for two evaluation approaches, the unified approach and the traditional pedigree based approach. We analyzed the effects of different heritabilities as well as genetic correlation between direct and maternal effects on the accuracy of estimation of direct, maternal and overall breeding values (sum of maternal and direct breeding values). The genetic and reproductive biology of the honey bee was accounted for by taking into consideration characteristics such as colony structure, uncertain paternity, overlapping generations and polyandry. In addition, we used a modified numerator relationship matrix and a realistic genome for the honey bee. For all values of heritability and correlation, the accuracy of overall estimated breeding values increased significantly with the unified approach. The increase in accuracy was always higher for the case when there was no correlation as compared to the case where a negative correlation existed between maternal and direct effects. Our study shows that the unified approach is a useful methodology for genetic evaluation in honey bees, and can contribute immensely to the improvement of traits of apicultural interest such as resistance to Varroa or production and behavioural traits. 
In particular, the study is of great interest for cases where negative correlation between maternal and direct effects and uncertain paternity exist, thus, is of relevance for other species as well. The study also provides an important framework for simulating genomic and pedigree datasets that will prove to be helpful for future studies.
Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast

PubMed Central

Oud, Bart; Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

2012-01-01

Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. PMID:22152095
Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast.

PubMed

Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

2012-03-01

Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
A criticism of the value of midparent in polyploidization.

PubMed

Gianinetti, A

2013-11-01

The hypothesis of genetic additivity states that the effects of different alleles, or different genes, add up to produce the phenotype. When considering the F1 progeny of a cross, the hypothesis of additivity of the genetic dosages provided by the parents is tested against the mid-parent value (MPV), which is the average of parental phenotypes and represents the reference value for genetic additivity. Non-additive effects (genetic interactions) are typically measured as deviations from MPV. Recently, however, the use of MPV has been directly transposed to the study of genetic additivity in newly synthesized plant polyploids, assuming that they should as well display mid-parent expression patterns for additive traits. It is shown here that this direct transposition is incorrect. It is suggested that, in neo-polyploids, mid-parent expression has to be reconsidered in terms of reduced genetic additivity. Homeostatic mechanisms are deemed to be the obvious ones responsible for this effect. Genomes are therefore ruled by negative epistasis, and heterosis in allopolyploids is due to a decreased interaction of the parental repressive systems. It is contended that focalizing on the right perspective has relevant theoretical consequences and makes the studies of neo-polyploids very important for our understanding of how genomes work.
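
For reference, the mid-parent value and the deviation used to measure non-additive effects in a conventional diploid cross can be written out as below. This is a minimal sketch of the standard definitions only; it does not implement the reconsidered neo-polyploid interpretation argued for in the abstract. The phenotype values and function names are hypothetical.

```python
# Hypothetical sketch of the mid-parent value (MPV) and deviation from it.

def mid_parent_value(parent1, parent2):
    """MPV: the average of the two parental phenotypes."""
    return (parent1 + parent2) / 2.0


def deviation_from_mpv(f1, parent1, parent2):
    """Positive or negative departure of the F1 phenotype from additivity."""
    return f1 - mid_parent_value(parent1, parent2)


mpv = mid_parent_value(10.0, 14.0)              # made-up parental phenotypes
deviation = deviation_from_mpv(13.5, 10.0, 14.0)  # made-up F1 phenotype
print(mpv, deviation)
```

A deviation of zero is what strict additivity predicts; a non-zero value is conventionally read as a genetic interaction.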
Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers.

PubMed

Weigel, K A; de los Campos, G; González-Recio, O; Naya, H; Wu, X L; Long, N; Rosa, G J M; Gianola, D

2009-10-01

The objective of the present study was to assess the predictive ability of subsets of single nucleotide polymorphism (SNP) markers for development of low-cost, low-density genotyping assays in dairy cattle. Dense SNP genotypes of 4,703 Holstein bulls were provided by the USDA Agricultural Research Service. A subset of 3,305 bulls born from 1952 to 1998 was used to fit various models (training set), and a subset of 1,398 bulls born from 1999 to 2002 was used to evaluate their predictive ability (testing set). After editing, data included genotypes for 32,518 SNP and August 2003 and April 2008 predicted transmitting abilities (PTA) for lifetime net merit (LNM$), the latter resulting from progeny testing. The Bayesian least absolute shrinkage and selection operator method was used to regress August 2003 PTA on marker covariates in the training set to arrive at estimates of marker effects and direct genomic PTA. The coefficient of determination (R²) from regressing the April 2008 progeny test PTA of bulls in the testing set on their August 2003 direct genomic PTA was 0.375. Subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP were created by choosing equally spaced and highly ranked SNP, with the latter based on the absolute value of their estimated effects obtained from the training set. The SNP effects were re-estimated from the training set for each subset of SNP, and the 2008 progeny test PTA of bulls in the testing set were regressed on corresponding direct genomic PTA. The R² values for subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP with largest effects (evenly spaced SNP) were 0.184 (0.064), 0.236 (0.111), 0.269 (0.190), 0.289 (0.179), 0.307 (0.228), 0.313 (0.268), and 0.322 (0.291), respectively. 
These results indicate that a low-density assay comprising selected SNP could be a cost-effective alternative for selection decisions and that significant gains in predictive ability may be achieved by increasing the number of SNP allocated to such an assay from 300 or fewer to 1,000 or more.
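
The two subset strategies compared above, and the R² used to score them, can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the authors' pipeline: the effect estimates are made up, markers are assumed to be listed in map order, and the function names are hypothetical.

```python
import math

# Hypothetical sketch of SNP subset selection and the R^2 summary statistic.

def top_effect_subset(effects, k):
    """Indices of the k SNP with the largest absolute estimated effects."""
    ranked = sorted(range(len(effects)), key=lambda i: abs(effects[i]),
                    reverse=True)
    return sorted(ranked[:k])


def evenly_spaced_subset(n_snp, k):
    """Indices of k SNP spread evenly over n_snp markers listed in map order."""
    if not 0 < k <= n_snp:
        raise ValueError("k must be between 1 and n_snp")
    # Integer arithmetic picks the midpoint of each of k equal-width bins.
    return [(2 * i + 1) * n_snp // (2 * k) for i in range(k)]


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy / math.sqrt(sxx * syy)) ** 2


effects = [0.02, -0.40, 0.05, 0.31, -0.01, 0.12, -0.25, 0.03]  # made-up estimates
largest = top_effect_subset(effects, 3)
spaced = evenly_spaced_subset(len(effects), 3)
print(largest, spaced)
```

In the study, marker effects were re-estimated within each subset before the test-set regression; the sketch covers only the selection step and the scoring statistic.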
Genomic tests for ovarian cancer detection and management.

PubMed

Myers, Evan R; Havrilesky, Laura J; Kulasingam, Shalini L; Sanders, Gillian D; Cline, Kathryn E; Gray, Rebecca N; Berchuck, Andrew; McCrory, Douglas C

2006-10-01

To assess the evidence that the use of genomic tests for ovarian cancer screening, diagnosis, and treatment leads to improved outcomes. PubMed and reference lists of recent reviews. We evaluated tests for: (a) single gene products; (b) genetic variations affecting risk of ovarian cancer; (c) gene expression; and (d) proteomics. For tests covered in recent evidence reports (cancer antigen 125 [CA-125] and breast cancer genes 1 and 2 [BRCA1/2]), we added studies published subsequent to the reports. We sought evidence on: (a) the analytic performance of tests in clinical laboratories; (b) the sensitivity and specificity of tests in different patient populations; (c) the clinical impact of testing in asymptomatic women, women with suspected ovarian cancer, and women with diagnosed ovarian cancer; (d) the harms of genomic testing; and (e) the impact of direct-to-consumer and direct-to-physician advertising on appropriate use of tests. We also constructed a computer simulation model to test the impact of different assumptions about ovarian cancer natural history on the relative effectiveness of different strategies. There are reasonable data on the clinical laboratory performance of most radioimmunoassays, but the majority of the data on other genomic tests comes from research laboratories. Genomic test sensitivity/specificity estimates are limited by small sample sizes, spectrum bias, and unrealistically large prevalences of ovarian cancer; in particular, estimates of positive predictive values derived from most of the studies are substantially higher than would be expected in most screening or diagnostic settings. We found no evidence relevant to the question of the impact of genomic tests on health outcomes in asymptomatic women. Although there is a relatively large literature on the association of test results and various clinical outcomes, the clinical utility of changing management based on these results has not been evaluated. 
We found no evidence that genomic tests for ovarian cancer have unique harms beyond those common to other tests for genetic susceptibility or other tests used in screening, diagnosis, and management of ovarian cancer. Studies of a direct-to-consumer campaign for BRCA1/2 testing suggest increased utilization, but the effect on "appropriateness" was unclear. Model simulations suggest that annual screening, even with a highly sensitive test, will not reduce ovarian cancer mortality by more than 50 percent; frequent screening has a very low positive predictive value, even with a highly specific test. Although research remains promising, adaptation of genomic tests into clinical practice must await appropriately designed and powered studies in relevant clinical settings.
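
The dependence of positive predictive value on prevalence follows directly from Bayes' rule and explains why estimates from high-prevalence study samples do not carry over to screening. The sketch below uses made-up sensitivity, specificity, and prevalence figures chosen only for illustration; they are not taken from the report, and the function name is hypothetical.

```python
# Hypothetical sketch: positive predictive value (PPV) from Bayes' rule.

def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = P(disease | positive test)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    if true_positive + false_positive == 0.0:
        raise ValueError("no positive tests are possible with these inputs")
    return true_positive / (true_positive + false_positive)


# Same assumed test (95% sensitive, 99% specific) in two settings.
ppv_screening = positive_predictive_value(0.95, 0.99, 0.0004)  # rare disease
ppv_enriched = positive_predictive_value(0.95, 0.99, 0.5)      # case-rich study sample
print(f"screening PPV: {ppv_screening:.3f}")
print(f"enriched-sample PPV: {ppv_enriched:.3f}")
```

Under these assumed inputs nearly all positives in the screening setting are false positives, while the same test looks excellent in a sample where half the participants have the disease.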
Twenty years of artificial directional selection have shaped the genome of the Italian Large White pig breed.

PubMed

Schiavo, G; Galimberti, G; Calò, D G; Samorè, A B; Bertolini, F; Russo, V; Gallo, M; Buttazzoni, L; Fontanesi, L

2016-04-01

In this study, we investigated at the genome-wide level if 20 years of artificial directional selection based on boar genetic evaluation obtained with a classical BLUP animal model shaped the genome of the Italian Large White pig breed. The most influential boars of this breed (n = 192), born from 1992 (the beginning of the selection program of this breed) to 2012, with an estimated breeding value reliability of >0.85, were genotyped with the Illumina Porcine SNP60 BeadChip. After grouping the boars in eight classes according to their year of birth, filtered single nucleotide polymorphisms (SNPs) were used to evaluate the effects of time on genotype frequency changes using multinomial logistic regression models. Of these markers, 493 had a Bonferroni-corrected P < 0.10. However, there was an increasing number of SNPs with a decreasing level of allele frequency changes over time, representing a continuous profile across the genome. The largest proportion of the 493 SNPs was on porcine chromosome (SSC) 7, SSC2, SSC8 and SSC18 for a total of 204 haploblocks. Functional annotations of genomic regions, including the 493 shifted SNPs, reported a few Gene Ontology terms that might underlie the biological processes that contributed to increase performances of the pigs over the 20 years of the selection program. The obtained results indicated that the genome of the Italian Large White pigs was shaped by a directional selection program derived by the application of methodologies assuming the infinitesimal model that captured a continuous trend of allele frequency changes in the boar population. © 2015 Stichting International Foundation for Animal Genetics.
Multilocus approaches for the measurement of selection on correlated genetic loci.

PubMed

Gompert, Zachariah; Egan, Scott P; Barrett, Rowan D H; Feder, Jeffrey L; Nosil, Patrik

2017-01-01

The study of ecological speciation is inherently linked to the study of selection. Methods for estimating phenotypic selection within a generation based on associations between trait values and fitness (e.g. survival) of individuals are established. These methods attempt to disentangle selection acting directly on a trait from indirect selection caused by correlations with other traits via multivariate statistical approaches (i.e. inference of selection gradients). The estimation of selection on genotypic or genomic variation could also benefit from disentangling direct and indirect selection on genetic loci. However, achieving this goal is difficult with genomic data because the number of potentially correlated genetic loci (p) is very large relative to the number of individuals sampled (n). In other words, the number of model parameters exceeds the number of observations (p ≫ n). We present simulations examining the utility of whole-genome regression approaches (i.e. Bayesian sparse linear mixed models) for quantifying direct selection in cases where p ≫ n. Such models have been used for genome-wide association mapping and are common in artificial breeding. Our results show they hold promise for studies of natural selection in the wild and thus of ecological speciation. But we also demonstrate important limitations to the approach and discuss study designs required for more robust inferences. © 2016 John Wiley & Sons Ltd.
Genomic basis of the differences between cider and dessert apple varieties

PubMed Central

Leforestier, Diane; Ravon, Elisa; Muranty, Hélène; Cornille, Amandine; Lemaire, Christophe; Giraud, Tatiana; Durel, Charles-Eric; Branca, Antoine

2015-01-01

Unraveling the genomic processes at play during variety diversification is of fundamental interest for understanding evolution, but also of applied interest in crop science. It can indeed provide knowledge on the genetic bases of traits for crop improvement and germplasm diversity management. Apple is one of the most important fruit crops in temperate regions, having both great economic and cultural values. Sweet dessert apples are used for direct consumption, while bitter cider apples are used to produce cider. Several important traits are known to differentiate the two variety types, in particular fruit size, biennial versus annual fruit bearing, and bitterness, caused by a higher content in polyphenols. Here, we used an Illumina 8k SNP chip on two core collections, of 48 dessert and 48 cider apples, respectively, for identifying genomic regions responsible for the differences between cider and dessert apples. The genome-wide level of genetic differentiation between cider and dessert apples was low, although 17 candidate regions showed signatures of divergent selection, displaying either outlier FST values or significant association with phenotypic traits (bitter versus sweet fruits). These candidate regions encompassed 420 genes involved in a variety of functions and metabolic pathways, including several colocalizations with QTLs for polyphenol compounds. PMID:26240603
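The outlier FST scan described above can be illustrated with a minimal sketch. The abstract does not say which FST estimator the authors used, so this assumes a simple two-population, equal-weight Wright-style estimator computed from allele frequencies. The function names, marker names, frequencies, and threshold are all hypothetical.

```python
def wright_fst(p_cider, p_dessert):
    """Per-SNP FST for two groups from the frequency of one allele.

    Assumption: equal group weights and FST = (H_T - H_S) / H_T, where H_T is
    the expected heterozygosity of the pooled groups and H_S the mean
    within-group expected heterozygosity.
    """
    for p in (p_cider, p_dessert):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p_cider + p_dessert) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # marker is monomorphic in both groups
    h_within = (2 * p_cider * (1 - p_cider) + 2 * p_dessert * (1 - p_dessert)) / 2
    return (h_total - h_within) / h_total


def fst_outliers(frequencies, threshold):
    """Return marker names whose FST exceeds a user-chosen threshold."""
    return [name for name, (p1, p2) in frequencies.items()
            if wright_fst(p1, p2) > threshold]


# Hypothetical allele frequencies (cider, dessert) for three markers.
snps = {"snp_a": (0.50, 0.52), "snp_b": (0.10, 0.90), "snp_c": (0.30, 0.35)}
print(fst_outliers(snps, threshold=0.25))
```

In practice the threshold would more likely come from the empirical genome-wide FST distribution than from a fixed cut-off.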
Signatures of co-evolutionary host-pathogen interactions in the genome of the entomopathogenic nematode Steinernema carpocapsae.

PubMed

Flores-Ponce, Mitzi; Vallebueno-Estrada, Miguel; González-Orozco, Eduardo; Ramos-Aboites, Hilda E; García-Chávez, J Noé; Simões, Nelson; Montiel, Rafael

2017-04-26

The entomopathogenic nematode Steinernema carpocapsae has been used worldwide as a biocontrol agent for insect pests, making it an interesting model for understanding parasite-host interactions. Two models propose that these interactions are co-evolutionary processes in such a way that equilibrium is never reached. In one model, known as "arms race", new alleles in relevant genes are fixed in both host and pathogens by directional positive selection, producing recurrent and alternating selective sweeps. In the other model, known as "trench warfare", persistent dynamic fluctuations in allele frequencies are sustained by balancing selection. There are some examples of genes evolving according to both models; however, it is not clear to what extent these interactions might alter genome-level evolutionary patterns and intraspecific diversity. Here we investigate some of these aspects by studying genomic variation in S. carpocapsae and other pathogenic and free-living nematodes from phylogenetic clades IV and V. To look for signatures of an arms-race dynamic, we conducted massive scans to detect directional positive selection in interspecific data. In free-living nematodes, we detected a significantly higher proportion of genes with sites under positive selection than in parasitic nematodes. However, in these genes, we found more enriched Gene Ontology terms in parasites. To detect possible effects of dynamic polymorphism interactions, we looked for signatures of balancing selection in intraspecific genomic data. The observed distribution of Tajima's D values in S. carpocapsae was more skewed to positive values and significantly different from the observed distribution in the free-living Caenorhabditis briggsae. Also, the proportion of significant positive values of Tajima's D was elevated in genes that were differentially expressed after induction with insect tissues as compared to both non-differentially expressed genes and the global scan.
Our study provides a first portrait of the effects that lifestyle might have in shaping the patterns of selection at the genomic level. An arms-race between hosts and pathogens seems to be affecting specific genetic functions but not necessarily increasing the number of positively selected genes. Trench warfare dynamics seem to be acting more generally in the genome, likely focusing on genes responding to the interaction, rather than targeting specific genetic functions.
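Tajima's D, used above as the signature of balancing selection, contrasts the mean number of pairwise differences with the number of segregating sites. The sketch below applies the standard Tajima (1989) formula to a set of aligned sequences. It is only an illustration, not the authors' pipeline, and the function name and toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (needs at least 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least four sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: intermediate-frequency variants versus singletons only.
balanced = ["AAAA"] * 3 + ["TTTT"] * 3
singletons = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
print(tajimas_d(balanced) > 0, tajimas_d(singletons) < 0)
```

Intermediate-frequency variants push D upward, the direction associated with balancing selection. An excess of rare variants pushes it downward.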

Entropic Profiler – detection of conservation in genomes using information theory

PubMed Central

Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana

2009-01-01

Background: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings: The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP very efficiently by resorting to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion: EP are directly related to the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
Your DNA, Your Say.

PubMed

Middleton, Anna

2017-04-01

Genomic and medical data sharing is pivotal if the promise of genomic medicine is to be fully realised. Social scientists working in the genomics arena ask the public 'how is the technology working for you?' Empirical studies on attitudes, values and beliefs are incredibly valuable; they offer a voice from those who are, or will be, directly affected. This is paramount if personalised medicine is to be truly personal. An International attitude study, Your DNA, Your Say, uses film to provide background information and an online survey to gather public views on donating one's own personal DNA and medical data for use by others. In this paper the rationale to the project is introduced together with an overview of the survey and film design. The project has been translated into multiple languages and the results will be used in policy for the Global Alliance for Genomics and Health.
Your DNA, Your Say

PubMed Central

Middleton, Anna

2017-01-01

Genomic and medical data sharing is pivotal if the promise of genomic medicine is to be fully realised. Social scientists working in the genomics arena ask the public ‘how is the technology working for you?’ Empirical studies on attitudes, values and beliefs are incredibly valuable; they offer a voice from those who are, or will be, directly affected. This is paramount if personalised medicine is to be truly personal. An International attitude study, Your DNA, Your Say, uses film to provide background information and an online survey to gather public views on donating one's own personal DNA and medical data for use by others. In this paper the rationale to the project is introduced together with an overview of the survey and film design. The project has been translated into multiple languages and the results will be used in policy for the Global Alliance for Genomics and Health. PMID:28517993
Comparison of dimensionality reduction methods to predict genomic breeding values for carcass traits in pigs.

PubMed

Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S

2015-10-09

A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals, and genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed in the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. Thus, the aim of this study was to propose an application of the methods of dimensionality reduction to GWS of carcass traits in an F2 (Piau x commercial line) pig population. The results show that the principal and independent component methods performed similarly and provided the most accurate genomic breeding value estimates for most carcass traits in pigs.
The Pathologist Workforce in the United States: II. An Interactive Modeling Tool for Analyzing Future Qualitative and Quantitative Staffing Demands for Services.

PubMed

Robboy, Stanley J; Gupta, Saurabh; Crawford, James M; Cohen, Michael B; Karcher, Donald S; Leonard, Debra G B; Magnani, Barbarajean; Novis, David A; Prystowsky, Michael B; Powell, Suzanne Z; Gross, David J; Black-Schaffer, W Stephen

2015-11-01

Pathologists are physicians who make diagnoses based on interpretation of tissue and cellular specimens (surgical/cytopathology, molecular/genomic pathology, autopsy), provide medical leadership and consultation for laboratory medicine, and are integral members of their institutions' interdisciplinary patient care teams. The objective was to develop a dynamic modeling tool to examine how individual factors and practice variables can forecast demand for pathologist services. A computer-based software model was built and tested, populated with data from surveys and best estimates about current and new pathologist efforts. Most pathologists' efforts focus on anatomic (52%), laboratory (14%), and other direct services (8%) for individual patients. Population-focused services (12%) (eg, laboratory medical direction) and other professional responsibilities (14%) (eg, teaching, research, and hospital committees) consume the rest of their time. Modeling scenarios were used to assess the need to increase or decrease efforts related globally to the Affordable Care Act, and specifically, to genomic medicine, laboratory consolidation, laboratory medical direction, and new areas where pathologists' expertise can add value. Our modeling tool allows pathologists, educators, and policy experts to assess how various factors may affect demand for pathologists' services. These factors include an aging population, advances in biomedical technology, and changing roles in capitated, value-based, and team-based medical care systems. In the future, pathologists will likely have to assume new roles, develop new expertise, and become more efficient in practicing medicine to accommodate new value-based delivery models.
Variable selection models for genomic selection using whole-genome sequence data and singular value decomposition.

PubMed

Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen

2017-12-27

Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability π and no effect with probability (1 - π). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction.
For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
Evaluation of approaches for estimating the accuracy of genomic prediction in plant breeding

PubMed Central

2013-01-01

Background: In genomic prediction, an important measure of accuracy is the correlation between the predicted and the true breeding values. Direct computation of this quantity for real datasets is not possible, because the true breeding value is unknown. Instead, the correlation between the predicted breeding values and the observed phenotypic values, called predictive ability, is often computed. In order to indirectly estimate predictive accuracy, this latter correlation is usually divided by an estimate of the square root of heritability. In this study we use simulation to evaluate estimates of predictive accuracy for seven methods, four (1 to 4) of which use an estimate of heritability to divide predictive ability computed by cross-validation. Between them the seven methods cover balanced and unbalanced datasets as well as correlated and uncorrelated genotypes. We propose one new indirect method (4) and two direct methods (5 and 6) for estimating predictive accuracy and compare their performances and those of four other existing approaches (three indirect (1 to 3) and one direct (7)) with simulated true predictive accuracy as the benchmark and with each other. Results: The size of the estimated genetic variance and hence heritability exerted the strongest influence on the variation in the estimated predictive accuracy. Increasing the number of genotypes considerably increases the time required to compute predictive accuracy by all the seven methods, most notably for the five methods that require cross-validation (Methods 1, 2, 3, 4 and 6). A new method that we propose (Method 5) and an existing method (Method 7) used in animal breeding programs were the fastest and gave the least biased, most precise and stable estimates of predictive accuracy. Of the methods that use cross-validation Methods 4 and 6 were often the best. Conclusions: The estimated genetic variance and the number of genotypes had the greatest influence on predictive accuracy.
Methods 5 and 7 were the fastest and produced the least biased, the most precise, robust and stable estimates of predictive accuracy. These properties argue for routinely using Methods 5 and 7 to assess predictive accuracy in genomic selection studies. PMID:24314298
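The indirect estimate described above, predictive ability divided by the square root of heritability, can be written out as a short sketch. The function names and toy values are hypothetical, and the heritability estimate is assumed to be supplied by a separate analysis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def predictive_accuracy(predicted, observed, heritability):
    """Indirect accuracy estimate: predictive ability / sqrt(h^2)."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    predictive_ability = pearson(predicted, observed)
    return predictive_ability / sqrt(heritability)


# Toy values: predicted breeding values and observed phenotypes.
predicted = [1, 2, 3, 4, 5]
observed = [1, 3, 2, 5, 4]
print(predictive_accuracy(predicted, observed, heritability=0.81))
```

Because the divisor is itself an estimate, a low heritability estimate can push the result above 1. This is one way the estimated genetic variance can strongly affect the estimated accuracy.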
Evaluation of approaches for estimating the accuracy of genomic prediction in plant breeding.

PubMed

Ould Estaghvirou, Sidi Boubacar; Ogutu, Joseph O; Schulz-Streeck, Torben; Knaak, Carsten; Ouzunova, Milena; Gordillo, Andres; Piepho, Hans-Peter

2013-12-06

In genomic prediction, an important measure of accuracy is the correlation between the predicted and the true breeding values. Direct computation of this quantity for real datasets is not possible, because the true breeding value is unknown. Instead, the correlation between the predicted breeding values and the observed phenotypic values, called predictive ability, is often computed. In order to indirectly estimate predictive accuracy, this latter correlation is usually divided by an estimate of the square root of heritability. In this study we use simulation to evaluate estimates of predictive accuracy for seven methods, four (1 to 4) of which use an estimate of heritability to divide predictive ability computed by cross-validation. Between them the seven methods cover balanced and unbalanced datasets as well as correlated and uncorrelated genotypes. We propose one new indirect method (4) and two direct methods (5 and 6) for estimating predictive accuracy and compare their performances and those of four other existing approaches (three indirect (1 to 3) and one direct (7)) with simulated true predictive accuracy as the benchmark and with each other. The size of the estimated genetic variance and hence heritability exerted the strongest influence on the variation in the estimated predictive accuracy. Increasing the number of genotypes considerably increases the time required to compute predictive accuracy by all the seven methods, most notably for the five methods that require cross-validation (Methods 1, 2, 3, 4 and 6). A new method that we propose (Method 5) and an existing method (Method 7) used in animal breeding programs were the fastest and gave the least biased, most precise and stable estimates of predictive accuracy. Of the methods that use cross-validation Methods 4 and 6 were often the best. The estimated genetic variance and the number of genotypes had the greatest influence on predictive accuracy. 
Methods 5 and 7 were the fastest and produced the least biased, the most precise, robust and stable estimates of predictive accuracy. These properties argue for routinely using Methods 5 and 7 to assess predictive accuracy in genomic selection studies.
Research 2.0: social networking and direct-to-consumer (DTC) genomics.

PubMed

Lee, Sandra Soo-Jin; Crawley, LaVera

2009-01-01

The convergence of increasingly efficient high throughput sequencing technology and ubiquitous Internet use by the public has fueled the proliferation of companies that provide personal genetic information (PGI) directly to consumers. Companies such as 23andMe (Mountain View, CA) and Navigenics (Foster City, CA) are emblematic of a growing market for PGI that some argue represents a paradigm shift in how the public values this information and incorporates it into how they behave and plan for their futures. This new class of social networking business ventures that market the science of the personal genome illustrates the new trend in collaborative science. In addition to fostering a consumer empowerment movement, it promotes the trend of democratizing information (openly sharing data with all interested parties, not just the biomedical researcher) for the purposes of pooling data (increasing statistical power) and escalating the innovation process. This target article discusses the need for new approaches to studying DTC genomics using social network analysis to identify the impact of obtaining, sharing, and using PGI. As a locus of biosociality, DTC personal genomics forges social relationships based on beliefs of common genetic susceptibility that links risk, disease, and group identity. Ethical issues related to the reframing of DTC personal genomic consumers as advocates and research subjects and the creation of new social formations around health research may be identified through social network analysis.
Facts, values, and journalism.

PubMed

Gilbert, Susan

2017-03-01

At a time of fake news, hacks, leaks, and unverified reports, many people are unsure whom to believe. How can we communicate in ways that make individuals question their assumptions and learn? My colleagues at The Hastings Center and many journalists and scientists are grappling with this question and have, independently, reached the same first step: recognize that facts can't be fully understood without probing their connection to values. "Explaining the basics is important, of course, but we also need to diversify our approach to the coverage of science-particularly as it intersects with the matrix of cultural, religious, social, and political values of our readers," said an article in Undark, an online magazine of science journalism. An editorial in Nature called for scientists to engage directly with citizens in debates over climate change and genome editing, noting that "the ethical issues can be critically dependent on the science, for example, in understanding where the boundaries between non-heritable and heritable genome modifications might be." We're here to help. © 2017 The Hastings Center.
Approaches to advancing quantitative human health risk assessment of environmental chemicals in the post-genomic era.

PubMed

Chiu, Weihsueh A; Euling, Susan Y; Scott, Cheryl Siegel; Subramaniam, Ravi P

2013-09-15

The contribution of genomics and associated technologies to human health risk assessment for environmental chemicals has focused largely on elucidating mechanisms of toxicity, as discussed in other articles in this issue. However, there is interest in moving beyond hazard characterization to making more direct impacts on quantitative risk assessment (QRA)--i.e., the determination of toxicity values for setting exposure standards and cleanup values. We propose that the evolution of QRA of environmental chemicals in the post-genomic era will involve three, somewhat overlapping phases in which different types of approaches begin to mature. The initial focus (in Phase I) has been and continues to be on "augmentation" of weight of evidence--using genomic and related technologies qualitatively to increase the confidence in and scientific basis of the results of QRA. Efforts aimed towards "integration" of these data with traditional animal-based approaches, in particular quantitative predictors, or surrogates, for the in vivo toxicity data to which they have been anchored are just beginning to be explored now (in Phase II). In parallel, there is a recognized need for "expansion" of the use of established biomarkers of susceptibility or risk of human diseases and disorders for QRA, particularly for addressing the issues of cumulative assessment and population risk. Ultimately (in Phase III), substantial further advances could be realized by the development of novel molecular and pathway-based biomarkers and statistical and in silico models that build on anticipated progress in understanding the pathways of human diseases and disorders. Such efforts would facilitate a gradual "reorientation" of QRA towards approaches that more directly link environmental exposures to human outcomes. Published by Elsevier Inc.
Genome-Based Taxonomic Classification of Bacteroidetes

DOE PAGES

Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; ...

2016-12-20

The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
Genome-Based Taxonomic Classification of Bacteroidetes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina

The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
Genome-Based Taxonomic Classification of Bacteroidetes

PubMed Central

Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N.; Woyke, Tanja; Kyrpides, Nikos C.; Klenk, Hans-Peter; Göker, Markus

2016-01-01

The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved. PMID:28066339
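The abstract above contrasts G+C content values calculated directly from genome sequences with the less precise values in many species descriptions. A minimal sketch of the direct calculation follows. The function name and example sequence are hypothetical, and leaving ambiguous symbols such as N out of the denominator is an assumption, not something stated in the source.

```python
def gc_content(sequence):
    """G+C content in percent, computed directly from a nucleotide sequence.

    Assumption: only unambiguous A, C, G and T are counted; other symbols
    (for example N) are left out of both numerator and denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Hypothetical fragment with two ambiguous positions.
print(round(gc_content("ATGCGCNNAT"), 1))
```

For a whole genome the same counts would be accumulated over all contigs or replicons before dividing.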
Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

PubMed

Lee, Hayan; Schatz, Michael C

2012-08-15

Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found that discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
Multi-allelic haplotype model based on genetic partition for genomic prediction and variance component estimation using SNP markers.

PubMed

Da, Yang

2015-12-18

The amount of functional genomic information has been growing rapidly but remains largely unused in genomic selection. Genomic prediction and estimation using haplotypes in genome regions with functional elements such as all genes of the genome can be an approach to integrate functional and structural genomic information for genomic selection. Towards this goal, this article develops a new haplotype approach for genomic prediction and estimation. A multi-allelic haplotype model treating each haplotype as an 'allele' was developed for genomic prediction and estimation based on the partition of a multi-allelic genotypic value into additive and dominance values. Each additive value is expressed as a function of h - 1 additive effects, where h = number of alleles or haplotypes, and each dominance value is expressed as a function of h(h - 1)/2 dominance effects. For a sample of q individuals, the limit number of effects is 2q - 1 for additive effects and is the number of heterozygous genotypes for dominance effects. Additive values are factorized as a product between the additive model matrix and the h - 1 additive effects, and dominance values are factorized as a product between the dominance model matrix and the h(h - 1)/2 dominance effects. Genomic additive relationship matrix is defined as a function of the haplotype model matrix for additive effects, and genomic dominance relationship matrix is defined as a function of the haplotype model matrix for dominance effects. Based on these results, a mixed model implementation for genomic prediction and variance component estimation that jointly use haplotypes and single markers is established, including two computing strategies for genomic prediction and variance component estimation with identical results. 
The multi-allelic genetic partition fills a theoretical gap in genetic partition by providing general formulations for partitioning multi-allelic genotypic values and provides a haplotype method based on the quantitative genetics model towards the utilization of functional and structural genomic information for genomic prediction and estimation.
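The effect counts stated in the abstract, h - 1 additive effects and h(h - 1)/2 dominance effects for h alleles or haplotypes, can be tabulated with a short sketch. The function name is hypothetical, and the code only counts effects; it does not build the model matrices described in the article.

```python
def effect_counts(h: int) -> tuple:
    """Return (additive, dominance) effect counts for h alleles or haplotypes.

    Uses the counts stated in the abstract: h - 1 additive effects and
    h * (h - 1) / 2 dominance effects.
    """
    if h < 1:
        raise ValueError("h must be at least 1")
    return h - 1, h * (h - 1) // 2


# A biallelic SNP (h = 2) has one additive and one dominance effect;
# the number of dominance effects grows quadratically with h.
for h in (2, 4, 10):
    additive, dominance = effect_counts(h)
    print(h, additive, dominance)
```

The quadratic growth of the dominance term is one reason the abstract notes sample-dependent limits on the number of effects.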
Economic evaluation of genomic test-directed chemotherapy for early-stage lymph node-positive breast cancer.

PubMed

Hall, Peter S; McCabe, Christopher; Stein, Robert C; Cameron, David

2012-01-04

Multi-parameter genomic tests identify patients with early-stage breast cancer who are likely to derive little benefit from adjuvant chemotherapy. These tests can potentially spare patients the morbidity from unnecessary chemotherapy and reduce costs. However, the costs of the test must be balanced against the health benefits and cost savings produced. This economic evaluation compared genomic test-directed chemotherapy using the Oncotype DX 21-gene assay with chemotherapy for all eligible patients with lymph node-positive, estrogen receptor-positive early-stage breast cancer. We performed a cost-utility analysis using a state transition model to calculate expected costs and benefits over the lifetime of a cohort of women with estrogen receptor-positive lymph node-positive breast cancer from a UK perspective. Recurrence rates for Oncotype DX-selected risk groups were derived from parametric survival models fitted to data from the Southwest Oncology Group 8814 trial. The primary outcome was the incremental cost-effectiveness ratio, expressed as the cost (in 2011 GBP) per quality-adjusted life-year (QALY). Confidence in the incremental cost-effectiveness ratio was expressed as a probability of cost-effectiveness and was calculated using Monte Carlo simulation. Model parameters were varied deterministically and probabilistically in sensitivity analysis. Value of information analysis was used to rank priorities for further research. The incremental cost-effectiveness ratio for Oncotype DX-directed chemotherapy using a recurrence score cutoff of 18 was £5529 (US $8852) per QALY. The probability that test-directed chemotherapy is cost-effective was 0.61 at a willingness-to-pay threshold of £30 000 per QALY. Results were sensitive to the recurrence rate, long-term anthracycline-related cardiac toxicity, quality of life, test cost, and the time horizon. 
The highest priority for further research identified by value of information analysis is the recurrence rate in test-selected subgroups. There is substantial uncertainty regarding the cost-effectiveness of Oncotype DX-directed chemotherapy. It is particularly important that future research studies to inform cost-effectiveness-based decisions collect long-term outcome data.
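The two summary quantities reported above, the incremental cost-effectiveness ratio and the probability of cost-effectiveness at a willingness-to-pay threshold, can be illustrated with a small sketch. All function names and input numbers below are hypothetical and are not taken from the study. The probability is computed from the sign of the net monetary benefit, a common convention that is assumed here rather than confirmed by the abstract.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return (cost_new - cost_old) / delta_qaly


def prob_cost_effective(draws, threshold):
    """Fraction of simulated (delta_cost, delta_qaly) draws with positive
    net monetary benefit, i.e. threshold * delta_qaly - delta_cost > 0."""
    if not draws:
        raise ValueError("at least one simulated draw is required")
    favourable = sum(1 for d_cost, d_qaly in draws
                     if threshold * d_qaly - d_cost > 0)
    return favourable / len(draws)


# Hypothetical inputs: the new strategy costs 1000 more and adds 0.25 QALYs.
print(icer(12000, 11000, 10.5, 10.25))  # 4000.0

# Hypothetical Monte Carlo draws of (delta_cost, delta_qaly).
draws = [(1000, 0.25), (1000, 0.01), (-500, 0.0), (2000, 0.05)]
print(prob_cost_effective(draws, threshold=30000))  # 0.5
```

In a real analysis the draws would come from sampling the model parameters many times; four draws are used here only to keep the arithmetic easy to check by hand.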
Genomic selection in plant breeding

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor ...
Genome Editing with Engineered Nucleases in Economically Important Animals and Plants: State of the Art in the Research Pipeline.

PubMed

Sovová, Tereza; Kerins, Gerard; Demnerová, Kateřina; Ovesná, Jaroslava

2017-01-01

After induced mutagenesis and transgenesis, genome editing is the next step in the development of breeding techniques. Genome editing using site-directed nucleases - including meganucleases, zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas9 system - is based on the mechanism of double strand breaks. The nuclease is directed to cleave the DNA at a specific place of the genome which is then repaired by natural repair mechanisms. Changes are introduced during the repair that are either accidental or can be targeted if a DNA template with the desirable sequence is provided. These techniques allow making virtually any change to the genome including specific DNA sequence changes, gene insertion, replacements or deletions with unprecedented precision and specificity while being less laborious and more straightforward compared to traditional breeding techniques or transgenesis. Therefore, the research in this field is developing quickly and, apart from model species, multiple studies have focused on economically important species and agronomically important traits that were the key subjects of this review. In plants, studies have been undertaken on disease resistance, herbicide tolerance, nutrient metabolism and nutritional value. In animals, the studies have mainly focused on disease resistance, meat production and allergenicity of milk. However, none of the promising studies has led to commercialization despite several patent applications. The uncertain legal status of genome-editing methods is one of the reasons for poor commercial development, as it is not clear whether the products would fall under the GMO regulation. We believe this issue should be clarified soon in order to allow promising methods to reach their full potential.
Uncovering the genetic signature of quantitative trait evolution with replicated time series data.

PubMed

Franssen, S U; Kofler, R; Schlötterer, C

2017-01-01

The genetic architecture of adaptation in natural populations has not yet been resolved: it is not clear to what extent the spread of beneficial mutations (selective sweeps) or the response of many quantitative trait loci drive adaptation to environmental changes. Although much attention has been given to the genomic footprint of selective sweeps, the importance of selection on quantitative traits is still not well studied, as the associated genomic signature is extremely difficult to detect. We propose 'Evolve and Resequence' as a promising tool, to study polygenic adaptation of quantitative traits in evolving populations. Simulating replicated time series data we show that adaptation to a new intermediate trait optimum has three characteristic phases that are reflected on the genomic level: (1) directional frequency changes towards the new trait optimum, (2) plateauing of allele frequencies when the new trait optimum has been reached and (3) subsequent divergence between replicated trajectories ultimately leading to the loss or fixation of alleles while the trait value does not change. We explore these 3 phase characteristics for relevant population genetic parameters to provide expectations for various experimental evolution designs. Remarkably, over a broad range of parameters the trajectories of selected alleles display a pattern across replicates, which differs both from neutrality and directional selection. We conclude that replicated time series data from experimental evolution studies provide a promising framework to study polygenic adaptation from whole-genome population genetics data.

Mutation rates among RNA viruses

PubMed Central

Drake, John W.; Holland, John J.

1999-01-01

The rate of spontaneous mutation is a key parameter in modeling the genetic structure and evolution of populations. The impact of the accumulated load of mutations and the consequences of increasing the mutation rate are important in assessing the genetic health of populations. Mutation frequencies are among the more directly measurable population parameters, although the information needed to convert them into mutation rates is often lacking. A previous analysis of mutation rates in RNA viruses (specifically in riboviruses rather than retroviruses) was constrained by the quality and quantity of available measurements and by the lack of a specific theoretical framework for converting mutation frequencies into mutation rates in this group of organisms. Here, we describe a simple relation between ribovirus mutation frequencies and mutation rates, apply it to the best (albeit far from satisfactory) available data, and observe a central value for the mutation rate per genome per replication of μg ≈ 0.76. (The rate per round of cell infection is twice this value or about 1.5.) This value is so large, and ribovirus genomes are so informationally dense, that even a modest increase extinguishes the population. PMID:10570172
The New World of Human Genetics: A dialogue between Practitioners & the General Public on Ethical, Legal & Social Implications of the Human Genome Project

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schofield, Amy

The history and reasons for launching the Human Genome Project and the current uses of human genetic material; identifying and discussing the major issues stemming directly from genetic research and therapy, including genetic discrimination, medical/personal privacy, allocation of government resources and individual finances, and the effect on the way in which we perceive the value of human life; discussing the sometimes hidden ethical, social and legislative implications of genetic research and therapy such as informed consent, screening and preservation of genetic materials, efficacy of medical procedures, the role of the government, and equal access to medical coverage.
Genomic Epidemiology of Vibrio cholerae O1 Associated with Floods, Pakistan, 2010

PubMed Central

Shah, Muhammad Ali; Mutreja, Ankur; Thomson, Nicholas; Baker, Stephen; Parkhill, Julian; Dougan, Gordon; Bokhari, Habib

2014-01-01

In August 2010, Pakistan experienced major floods and a subsequent cholera epidemic. To clarify the population dynamics and transmission of Vibrio cholerae in Pakistan, we sequenced the genomes of all V. cholerae O1 El Tor isolates and compared the sequences to a global collection of 146 V. cholerae strains. Within the global phylogeny, all isolates from Pakistan formed 2 new subclades (PSC-1 and PSC-2), lying in the third transmission wave of the seventh-pandemic lineage that could be distinguished by signature deletions and their antimicrobial susceptibilities. Geographically, PSC-1 isolates originated from the coast, whereas PSC-2 isolates originated from inland areas flooded by the Indus River. Single-nucleotide polymorphism accumulation analysis correlated river flow direction with the spread of PSC-2. We found at least 2 sources of cholera in Pakistan during the 2010 epidemic and illustrate the value of a global genomic data bank in contextualizing cholera outbreaks. PMID:24378019
Genomic epidemiology of Vibrio cholerae O1 associated with floods, Pakistan, 2010.

PubMed

Shah, Muhammad Ali; Mutreja, Ankur; Thomson, Nicholas; Baker, Stephen; Parkhill, Julian; Dougan, Gordon; Bokhari, Habib; Wren, Brendan W

2014-01-01

In August 2010, Pakistan experienced major floods and a subsequent cholera epidemic. To clarify the population dynamics and transmission of Vibrio cholerae in Pakistan, we sequenced the genomes of all V. cholerae O1 El Tor isolates and compared the sequences to a global collection of 146 V. cholerae strains. Within the global phylogeny, all isolates from Pakistan formed 2 new subclades (PSC-1 and PSC-2), lying in the third transmission wave of the seventh-pandemic lineage that could be distinguished by signature deletions and their antimicrobial susceptibilities. Geographically, PSC-1 isolates originated from the coast, whereas PSC-2 isolates originated from inland areas flooded by the Indus River. Single-nucleotide polymorphism accumulation analysis correlated river flow direction with the spread of PSC-2. We found at least 2 sources of cholera in Pakistan during the 2010 epidemic and illustrate the value of a global genomic data bank in contextualizing cholera outbreaks.
Future Health Applications of Genomics

PubMed Central

McBride, Colleen M.; Bowen, Deborah; Brody, Lawrence C.; Condit, Celeste M.; Croyle, Robert T.; Gwinn, Marta; Khoury, Muin J.; Koehly, Laura M.; Korf, Bruce R.; Marteau, Theresa M.; McLeroy, Kenneth; Patrick, Kevin; Valente, Thomas W.

2014-01-01

Despite the quickening momentum of genomic discovery, the communication, behavioral, and social sciences research needed for translating this discovery into public health applications has lagged behind. The National Human Genome Research Institute held a 2-day workshop in October 2008 convening an interdisciplinary group of scientists to recommend forward-looking priorities for translational research. This research agenda would be designed to redress the top three risk factors (tobacco use, poor diet, and physical inactivity) that contribute to the four major chronic diseases (heart disease, type 2 diabetes, lung disease, and many cancers) and account for half of all deaths worldwide. Three priority research areas were identified: (1) improving the public’s genetic literacy in order to enhance consumer skills; (2) gauging whether genomic information improves risk communication and adoption of healthier behaviors more than current approaches; and (3) exploring whether genomic discovery in concert with emerging technologies can elucidate new behavioral intervention targets. Important crosscutting themes also were identified, including the need to: (1) anticipate directions of genomic discovery; (2) take an agnostic scientific perspective in framing research questions asking whether genomic discovery adds value to other health promotion efforts; and (3) consider multiple levels of influence and systems that contribute to important public health problems. The priorities and themes offer a framework for a variety of stakeholders, including those who develop priorities for research funding, interdisciplinary teams engaged in genomics research, and policymakers grappling with how to use the products born of genomics research to address public health challenges. PMID:20409503
Direct-to-consumer genomics on the scales of autonomy

PubMed Central

Vayena, Effy

2015-01-01

Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the ‘harm’ arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers’ independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610
A Ranking Approach to Genomic Selection.

PubMed

Blondel, Mathieu; Onogi, Akio; Iwata, Hiroyoshi; Ueda, Naonori

2015-01-01

Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual's breeding value for a particular trait of interest, i.e., as a regression problem. To assess predictive accuracy of the model, the Pearson correlation between observed and predicted trait values was used. In this paper, we propose to formulate GS as the problem of ranking individuals according to their breeding value. Our proposed framework allows us to employ machine learning methods for ranking which had previously not been considered in the GS literature. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. Therefore, NDCG reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value. We conducted a comparison of 10 existing regression methods and 3 new ranking methods on 6 datasets, consisting of 4 plant species and 25 traits. Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy. RKHS regression and RankSVM also achieve good accuracy when used with an RBF kernel. Traditional regression methods such as Bayesian lasso, wBSR and BayesC were found less suitable for ranking. Pearson correlation was found to correlate poorly with NDCG. Our study suggests two important messages. First, ranking methods are a promising research direction in GS. Second, NDCG can be a useful evaluation measure for GS.
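A minimal sketch of NDCG as a ranking measure for breeding values is given below. It assumes a linear gain (the true breeding value itself), a logarithmic position discount, and non-negative true values. The exact gain and discount used by the authors are not stated in the abstract, so these choices and the function names are assumptions.

```python
import math


def dcg_at_k(values_in_ranked_order, k):
    """Discounted cumulative gain of the first k items (linear gain,
    1 / log2(position + 1) discount with positions starting at 1)."""
    return sum(v / math.log2(i + 2)
               for i, v in enumerate(values_in_ranked_order[:k]))


def ndcg_at_k(true_values, predicted_scores, k):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    order = sorted(range(len(true_values)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_values[i] for i in order]
    ideal = sorted(true_values, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        raise ValueError("ideal DCG is zero; NDCG is undefined")
    return dcg_at_k(ranked, k) / ideal_dcg


# Hypothetical breeding values for three individuals.
true_values = [3.0, 1.0, 2.0]
print(ndcg_at_k(true_values, [0.9, 0.1, 0.5], k=3))  # perfect ranking: 1.0
print(ndcg_at_k(true_values, [0.1, 0.9, 0.5], k=3))  # reversed ranking: below 1
```

Because the discount shrinks with rank position, placing the best individual last is penalized more heavily than swapping two individuals near the bottom, which matches the abstract's point that NDCG rewards accurate selection of the top individuals.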
Use of Contemporary Genetics in Cardiovascular Diagnosis

PubMed Central

George, Alfred L.

2015-01-01

An explosion of knowledge regarding the genetic and genomic basis for rare and common diseases has provided a framework for revolutionizing the practice of medicine. Achieving the reality of a genomic medicine era requires that basic discoveries are effectively translated into clinical practice through implementation of genetic and genomic testing. Clinical genetic tests have become routine for many inherited disorders and can be regarded as the standard-of-care in many circumstances including disorders affecting the cardiovascular system. New, high-throughput methods for determining the DNA sequence of all coding exons or complete genomes are being adopted for clinical use to expand the speed and breadth of genetic testing. Along with these extraordinary advances have emerged new challenges to practicing physicians for understanding when and how to use genetic testing along with how to appropriately interpret test results. This review will acquaint readers with general principles of genetic testing including newer technologies, test interpretation and pitfalls. The focus will be on testing genes responsible for monogenic disorders and on other emerging applications such as pharmacogenomic profiling. The discussion will be extended to the new paradigm of direct-to-consumer genetic testing and the value of assessing genomic risk for common diseases. PMID:25421045
Nuclear DNA content and base composition in 28 taxa of Musa.

PubMed

Kamaté, K; Brown, S; Durand, P; Bureau, J M; De Nay, D; Trinh, T H

2001-08-01

The nuclear DNA content of 28 taxa of Musa was assessed by flow cytometry, using line PxPC6 of Petunia hybrida as an internal standard. The 2C DNA value of Musa balbisiana (BB genome) was 1.16 pg, whereas Musa acuminata (AA genome) had an average 2C DNA value of 1.27 pg, with a difference of 11% between its subspecies. The two haploid (1C) genomes, A and B, comprising most of the edible bananas, are therefore of similar size, 0.63 pg (610 million bp) and 0.58 pg (560 million bp), respectively. The genome of diploid Musa is thus threefold that of Arabidopsis thaliana. The genome sizes in a set of triploid Musa cultivars or clones were quite different, with 2C DNA values ranging from 1.61 to 2.23 pg. Likewise, the genome sizes of tetraploid cultivars ranged from 1.94 to 2.37 pg (2C). Apparently, tetraploids (for instance, accession I.C.2) can have a genome size that falls within the range of triploid genome sizes, and vice versa (as in the case of accession Simili Radjah). The 2C values estimated for organs such as leaf, leaf sheath, rhizome, and flower were consistent, whereas root material gave atypical results, owing to browning. The genomic base composition of these Musa taxa had a median value of 40.8% GC (SD = 0.43%).
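The conversion from 2C DNA content in picograms to haploid genome size in base pairs can be sketched as follows. The conversion factor of 965 million bp per pg is an assumption chosen because it reproduces the rounded figures in the abstract; the abstract does not state which factor the authors used, and other factors are in use.

```python
# Assumed conversion factor (million bp per picogram); not stated in the abstract.
MBP_PER_PG = 965.0


def haploid_pg(two_c_pg):
    """Haploid DNA content in pg, taken as half the 2C value of a diploid."""
    return two_c_pg / 2.0


def pg_to_mbp(pg, mbp_per_pg=MBP_PER_PG):
    """Convert a DNA mass in pg to millions of base pairs."""
    return pg * mbp_per_pg


for name, two_c in (("Musa acuminata (AA)", 1.27), ("Musa balbisiana (BB)", 1.16)):
    mbp = pg_to_mbp(haploid_pg(two_c))
    # Round to the nearest 10 million bp, as the abstract's figures appear to be.
    print(name, round(mbp, -1))
```

Under this assumed factor the two haploid genomes come out at roughly 610 and 560 million bp, in line with the values quoted in the abstract.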
Directional dominance on stature and cognition in diverse human populations.

PubMed

Joshi, Peter K; Esko, Tonu; Mattsson, Hannele; Eklund, Niina; Gandin, Ilaria; Nutile, Teresa; Jackson, Anne U; Schurmann, Claudia; Smith, Albert V; Zhang, Weihua; Okada, Yukinori; Stančáková, Alena; Faul, Jessica D; Zhao, Wei; Bartz, Traci M; Concas, Maria Pina; Franceschini, Nora; Enroth, Stefan; Vitart, Veronique; Trompet, Stella; Guo, Xiuqing; Chasman, Daniel I; O'Connel, Jeffery R; Corre, Tanguy; Nongmaithem, Suraj S; Chen, Yuning; Mangino, Massimo; Ruggiero, Daniela; Traglia, Michela; Farmaki, Aliki-Eleni; Kacprowski, Tim; Bjonnes, Andrew; van der Spek, Ashley; Wu, Ying; Giri, Anil K; Yanek, Lisa R; Wang, Lihua; Hofer, Edith; Rietveld, Cornelius A; McLeod, Olga; Cornelis, Marilyn C; Pattaro, Cristian; Verweij, Niek; Baumbach, Clemens; Abdellaoui, Abdel; Warren, Helen R; Vuckovic, Dragana; Mei, Hao; Bouchard, Claude; Perry, John R B; Cappellani, Stefania; Mirza, Saira S; Benton, Miles C; Broeckel, Ulrich; Medland, Sarah E; Lind, Penelope A; Malerba, Giovanni; Drong, Alexander; Yengo, Loic; Bielak, Lawrence F; Zhi, Degui; van der Most, Peter J; Shriner, Daniel; Mägi, Reedik; Hemani, Gibran; Karaderi, Tugce; Wang, Zhaoming; Liu, Tian; Demuth, Ilja; Zhao, Jing Hua; Meng, Weihua; Lataniotis, Lazaros; van der Laan, Sander W; Bradfield, Jonathan P; Wood, Andrew R; Bonnefond, Amelie; Ahluwalia, Tarunveer S; Hall, Leanne M; Salvi, Erika; Yazar, Seyhan; Carstensen, Lisbeth; de Haan, Hugoline G; Abney, Mark; Afzal, Uzma; Allison, Matthew A; Amin, Najaf; Asselbergs, Folkert W; Bakker, Stephan J L; Barr, R Graham; Baumeister, Sebastian E; Benjamin, Daniel J; Bergmann, Sven; Boerwinkle, Eric; Bottinger, Erwin P; Campbell, Archie; Chakravarti, Aravinda; Chan, Yingleong; Chanock, Stephen J; Chen, Constance; Chen, Y-D Ida; Collins, Francis S; Connell, John; Correa, Adolfo; Cupples, L Adrienne; Smith, George Davey; Davies, Gail; Dörr, Marcus; Ehret, Georg; Ellis, Stephen B; Feenstra, Bjarke; Feitosa, Mary F; Ford, Ian; Fox, Caroline S; Frayling, Timothy M; Friedrich, Nele; 
Geller, Frank; Scotland, Generation; Gillham-Nasenya, Irina; Gottesman, Omri; Graff, Misa; Grodstein, Francine; Gu, Charles; Haley, Chris; Hammond, Christopher J; Harris, Sarah E; Harris, Tamara B; Hastie, Nicholas D; Heard-Costa, Nancy L; Heikkilä, Kauko; Hocking, Lynne J; Homuth, Georg; Hottenga, Jouke-Jan; Huang, Jinyan; Huffman, Jennifer E; Hysi, Pirro G; Ikram, M Arfan; Ingelsson, Erik; Joensuu, Anni; Johansson, Åsa; Jousilahti, Pekka; Jukema, J Wouter; Kähönen, Mika; Kamatani, Yoichiro; Kanoni, Stavroula; Kerr, Shona M; Khan, Nazir M; Koellinger, Philipp; Koistinen, Heikki A; Kooner, Manraj K; Kubo, Michiaki; Kuusisto, Johanna; Lahti, Jari; Launer, Lenore J; Lea, Rodney A; Lehne, Benjamin; Lehtimäki, Terho; Liewald, David C M; Lind, Lars; Loh, Marie; Lokki, Marja-Liisa; London, Stephanie J; Loomis, Stephanie J; Loukola, Anu; Lu, Yingchang; Lumley, Thomas; Lundqvist, Annamari; Männistö, Satu; Marques-Vidal, Pedro; Masciullo, Corrado; Matchan, Angela; Mathias, Rasika A; Matsuda, Koichi; Meigs, James B; Meisinger, Christa; Meitinger, Thomas; Menni, Cristina; Mentch, Frank D; Mihailov, Evelin; Milani, Lili; Montasser, May E; Montgomery, Grant W; Morrison, Alanna; Myers, Richard H; Nadukuru, Rajiv; Navarro, Pau; Nelis, Mari; Nieminen, Markku S; Nolte, Ilja M; O'Connor, George T; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R; Pankow, James S; Patarcic, Inga; Pavani, Francesca; Peyser, Patricia A; Pietilainen, Kirsi; Poulter, Neil; Prokopenko, Inga; Ralhan, Sarju; Redmond, Paul; Rich, Stephen S; Rissanen, Harri; Robino, Antonietta; Rose, Lynda M; Rose, Richard; Sala, Cinzia; Salako, Babatunde; Salomaa, Veikko; Sarin, Antti-Pekka; Saxena, Richa; Schmidt, Helena; Scott, Laura J; Scott, William R; Sennblad, Bengt; Seshadri, Sudha; Sever, Peter; Shrestha, Smeeta; Smith, Blair H; Smith, Jennifer A; Soranzo, Nicole; Sotoodehnia, Nona; Southam, Lorraine; Stanton, Alice V; Stathopoulou, Maria G; Strauch, Konstantin; Strawbridge, Rona J; Suderman, Matthew J; 
Tandon, Nikhil; Tang, Sian-Tsun; Taylor, Kent D; Tayo, Bamidele O; Töglhofer, Anna Maria; Tomaszewski, Maciej; Tšernikova, Natalia; Tuomilehto, Jaakko; Uitterlinden, Andre G; Vaidya, Dhananjay; van Hylckama Vlieg, Astrid; van Setten, Jessica; Vasankari, Tuula; Vedantam, Sailaja; Vlachopoulou, Efthymia; Vozzi, Diego; Vuoksimaa, Eero; Waldenberger, Melanie; Ware, Erin B; Wentworth-Shields, William; Whitfield, John B; Wild, Sarah; Willemsen, Gonneke; Yajnik, Chittaranjan S; Yao, Jie; Zaza, Gianluigi; Zhu, Xiaofeng; Project, The BioBank Japan; Salem, Rany M; Melbye, Mads; Bisgaard, Hans; Samani, Nilesh J; Cusi, Daniele; Mackey, David A; Cooper, Richard S; Froguel, Philippe; Pasterkamp, Gerard; Grant, Struan F A; Hakonarson, Hakon; Ferrucci, Luigi; Scott, Robert A; Morris, Andrew D; Palmer, Colin N A; Dedoussis, George; Deloukas, Panos; Bertram, Lars; Lindenberger, Ulman; Berndt, Sonja I; Lindgren, Cecilia M; Timpson, Nicholas J; Tönjes, Anke; Munroe, Patricia B; Sørensen, Thorkild I A; Rotimi, Charles N; Arnett, Donna K; Oldehinkel, Albertine J; Kardia, Sharon L R; Balkau, Beverley; Gambaro, Giovanni; Morris, Andrew P; Eriksson, Johan G; Wright, Margie J; Martin, Nicholas G; Hunt, Steven C; Starr, John M; Deary, Ian J; Griffiths, Lyn R; Tiemeier, Henning; Pirastu, Nicola; Kaprio, Jaakko; Wareham, Nicholas J; Pérusse, Louis; Wilson, James G; Girotto, Giorgia; Caulfield, Mark J; Raitakari, Olli; Boomsma, Dorret I; Gieger, Christian; van der Harst, Pim; Hicks, Andrew A; Kraft, Peter; Sinisalo, Juha; Knekt, Paul; Johannesson, Magnus; Magnusson, Patrik K E; Hamsten, Anders; Schmidt, Reinhold; Borecki, Ingrid B; Vartiainen, Erkki; Becker, Diane M; Bharadwaj, Dwaipayan; Mohlke, Karen L; Boehnke, Michael; van Duijn, Cornelia M; Sanghera, Dharambir K; Teumer, Alexander; Zeggini, Eleftheria; Metspalu, Andres; Gasparini, Paolo; Ulivi, Sheila; Ober, Carole; Toniolo, Daniela; Rudan, Igor; Porteous, David J; Ciullo, Marina; Spector, Tim D; Hayward, Caroline; Dupuis, Josée; Loos, 
Ruth J F; Wright, Alan F; Chandak, Giriraj R; Vollenweider, Peter; Shuldiner, Alan; Ridker, Paul M; Rotter, Jerome I; Sattar, Naveed; Gyllensten, Ulf; North, Kari E; Pirastu, Mario; Psaty, Bruce M; Weir, David R; Laakso, Markku; Gudnason, Vilmundur; Takahashi, Atsushi; Chambers, John C; Kooner, Jaspal S; Strachan, David P; Campbell, Harry; Hirschhorn, Joel N; Perola, Markus; Polašek, Ozren; Wilson, James F

2015-07-23

Homozygosity has long been associated with rare, often devastating, Mendelian disorders, and Darwin was one of the first to recognize that inbreeding reduces evolutionary fitness. However, the effect of the more distant parental relatedness that is common in modern human populations is less well understood. Genomic data now allow us to investigate the effects of homozygosity on traits of public health importance by observing contiguous homozygous segments (runs of homozygosity), which are inferred to be homozygous along their complete length. Given the low levels of genome-wide homozygosity prevalent in most human populations, information is required on very large numbers of people to provide sufficient power. Here we use runs of homozygosity to study 16 health-related quantitative traits in 354,224 individuals from 102 cohorts, and find statistically significant associations between summed runs of homozygosity and four complex traits: height, forced expiratory lung volume in one second, general cognitive ability and educational attainment (P < 1 × 10^-300, 2.1 × 10^-6, 2.5 × 10^-10 and 1.8 × 10^-10, respectively). In each case, increased homozygosity was associated with decreased trait value, equivalent to the offspring of first cousins being 1.2 cm shorter and having 10 months' less education. Similar effect sizes were found across four continental groups and populations with different degrees of genome-wide homozygosity, providing evidence that homozygosity, rather than confounding, directly contributes to phenotypic variance. Contrary to earlier reports in substantially smaller samples, no evidence was seen of an influence of genome-wide homozygosity on blood pressure and low density lipoprotein cholesterol, or ten other cardio-metabolic traits. 
Since directional dominance is predicted for traits under directional evolutionary selection, this study provides evidence that increased stature and cognitive function have been positively selected in human evolution, whereas many important risk factors for late-onset complex diseases may not have been.
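The idea of summing runs of homozygosity can be illustrated with a simplified sketch. It scans ordered marker genotypes for uninterrupted homozygous stretches, keeps those above a minimum length, and expresses their total length as a fraction of the genome. The function names, the genotype coding (1 = heterozygous), and the single length threshold are assumptions for illustration; the study's actual calling criteria are not given in the abstract and are likely more elaborate.

```python
def find_roh(positions, genotypes, min_length_bp):
    """Return (start, end) spans of consecutive homozygous markers.

    positions: marker coordinates in ascending order (bp).
    genotypes: 0 or 2 for homozygous calls, 1 for heterozygous calls.
    Only spans of at least min_length_bp are kept.
    """
    runs, start, prev = [], None, None
    for pos, g in zip(positions, genotypes):
        if g != 1:                      # homozygous marker extends the run
            if start is None:
                start = pos
            prev = pos
        else:                           # heterozygous marker ends the run
            if start is not None and prev - start >= min_length_bp:
                runs.append((start, prev))
            start = None
    if start is not None and prev - start >= min_length_bp:
        runs.append((start, prev))
    return runs


def f_roh(runs, genome_length_bp):
    """Fraction of the genome covered by the given runs."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return sum(end - start for start, end in runs) / genome_length_bp


# Toy data (hypothetical): six markers, one heterozygous call at 300 bp.
positions = [0, 100, 200, 300, 400, 500]
genotypes = [0, 0, 2, 1, 2, 2]
runs = find_roh(positions, genotypes, min_length_bp=150)
print(runs)               # [(0, 200)]
print(f_roh(runs, 1000))  # 0.2
```

For scale, the offspring of first cousins are expected to be homozygous by descent over 1/16 of the genome, which is the reference point the abstract uses when expressing effect sizes.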
Directional dominance on stature and cognition in diverse human populations

PubMed Central

Mattsson, Hannele; Eklund, Niina; Gandin, Ilaria; Nutile, Teresa; Jackson, Anne U.; Schurmann, Claudia; Smith, Albert V.; Zhang, Weihua; Okada, Yukinori; Stančáková, Alena; Faul, Jessica D.; Zhao, Wei; Bartz, Traci M.; Concas, Maria Pina; Franceschini, Nora; Enroth, Stefan; Vitart, Veronique; Trompet, Stella; Guo, Xiuqing; Chasman, Daniel I.; O’Connel, Jeffery R.; Corre, Tanguy; Nongmaithem, Suraj S.; Chen, Yuning; Mangino, Massimo; Ruggiero, Daniela; Traglia, Michela; Farmaki, Aliki-Eleni; Kacprowski, Tim; Bjonnes, Andrew; van der Spek, Ashley; Wu, Ying; Giri, Anil K.; Yanek, Lisa R.; Wang, Lihua; Hofer, Edith; Rietveld, Cornelius A.; McLeod, Olga; Cornelis, Marilyn C.; Pattaro, Cristian; Verweij, Niek; Baumbach, Clemens; Abdellaoui, Abdel; Warren, Helen R.; Vuckovic, Dragana; Mei, Hao; Bouchard, Claude; Perry, John R.B.; Cappellani, Stefania; Mirza, Saira S.; Benton, Miles C.; Broeckel, Ulrich; Medland, Sarah E.; Lind, Penelope A.; Malerba, Giovanni; Drong, Alexander; Yengo, Loic; Bielak, Lawrence F.; Zhi, Degui; van der Most, Peter J.; Shriner, Daniel; Mägi, Reedik; Hemani, Gibran; Karaderi, Tugce; Wang, Zhaoming; Liu, Tian; Demuth, Ilja; Zhao, Jing Hua; Meng, Weihua; Lataniotis, Lazaros; van der Laan, Sander W.; Bradfield, Jonathan P.; Wood, Andrew R.; Bonnefond, Amelie; Ahluwalia, Tarunveer S.; Hall, Leanne M.; Salvi, Erika; Yazar, Seyhan; Carstensen, Lisbeth; de Haan, Hugoline G.; Abney, Mark; Afzal, Uzma; Allison, Matthew A.; Amin, Najaf; Asselbergs, Folkert W.; Bakker, Stephan J.L.; Barr, R. Graham; Baumeister, Sebastian E.; Benjamin, Daniel J.; Bergmann, Sven; Boerwinkle, Eric; Bottinger, Erwin P.; Campbell, Archie; Chakravarti, Aravinda; Chan, Yingleong; Chanock, Stephen J.; Chen, Constance; Chen, Y.-D. Ida; Collins, Francis S.; Connell, John; Correa, Adolfo; Cupples, L. 
Adrienne; Smith, George Davey; Davies, Gail; Dörr, Marcus; Ehret, Georg; Ellis, Stephen B.; Feenstra, Bjarke; Feitosa, Mary F.; Ford, Ian; Fox, Caroline S.; Frayling, Timothy M.; Friedrich, Nele; Geller, Frank; Scotland, Generation; Gillham-Nasenya, Irina; Gottesman, Omri; Graff, Misa; Grodstein, Francine; Gu, Charles; Haley, Chris; Hammond, Christopher J.; Harris, Sarah E.; Harris, Tamara B.; Hastie, Nicholas D.; Heard-Costa, Nancy L.; Heikkilä, Kauko; Hocking, Lynne J.; Homuth, Georg; Hottenga, Jouke-Jan; Huang, Jinyan; Huffman, Jennifer E.; Hysi, Pirro G.; Ikram, M. Arfan; Ingelsson, Erik; Joensuu, Anni; Johansson, Åsa; Jousilahti, Pekka; Jukema, J. Wouter; Kähönen, Mika; Kamatani, Yoichiro; Kanoni, Stavroula; Kerr, Shona M.; Khan, Nazir M.; Koellinger, Philipp; Koistinen, Heikki A.; Kooner, Manraj K.; Kubo, Michiaki; Kuusisto, Johanna; Lahti, Jari; Launer, Lenore J.; Lea, Rodney A.; Lehne, Benjamin; Lehtimäki, Terho; Liewald, David C.M.; Lind, Lars; Loh, Marie; Lokki, Marja-Liisa; London, Stephanie J.; Loomis, Stephanie J.; Loukola, Anu; Lu, Yingchang; Lumley, Thomas; Lundqvist, Annamari; Männistö, Satu; Marques-Vidal, Pedro; Masciullo, Corrado; Matchan, Angela; Mathias, Rasika A.; Matsuda, Koichi; Meigs, James B.; Meisinger, Christa; Meitinger, Thomas; Menni, Cristina; Mentch, Frank D.; Mihailov, Evelin; Milani, Lili; Montasser, May E.; Montgomery, Grant W.; Morrison, Alanna; Myers, Richard H.; Nadukuru, Rajiv; Navarro, Pau; Nelis, Mari; Nieminen, Markku S.; Nolte, Ilja M.; O’Connor, George T.; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R.; Pankow, James S.; Patarcic, Inga; Pavani, Francesca; Peyser, Patricia A.; Pietilainen, Kirsi; Poulter, Neil; Prokopenko, Inga; Ralhan, Sarju; Redmond, Paul; Rich, Stephen S.; Rissanen, Harri; Robino, Antonietta; Rose, Lynda M.; Rose, Richard; Sala, Cinzia; Salako, Babatunde; Salomaa, Veikko; Sarin, Antti-Pekka; Saxena, Richa; Schmidt, Helena; Scott, Laura J.; Scott, William R.; Sennblad, Bengt; Seshadri, Sudha; 
Sever, Peter; Shrestha, Smeeta; Smith, Blair H.; Smith, Jennifer A.; Soranzo, Nicole; Sotoodehnia, Nona; Southam, Lorraine; Stanton, Alice V.; Stathopoulou, Maria G.; Strauch, Konstantin; Strawbridge, Rona J.; Suderman, Matthew J.; Tandon, Nikhil; Tang, Sian-Tsun; Taylor, Kent D.; Tayo, Bamidele O.; Töglhofer, Anna Maria; Tomaszewski, Maciej; Tšernikova, Natalia; Tuomilehto, Jaakko; Uitterlinden, Andre G.; Vaidya, Dhananjay; van Hylckama Vlieg, Astrid; van Setten, Jessica; Vasankari, Tuula; Vedantam, Sailaja; Vlachopoulou, Efthymia; Vozzi, Diego; Vuoksimaa, Eero; Waldenberger, Melanie; Ware, Erin B.; Wentworth-Shields, William; Whitfield, John B.; Wild, Sarah; Willemsen, Gonneke; Yajnik, Chittaranjan S.; Yao, Jie; Zaza, Gianluigi; Zhu, Xiaofeng; Project, The BioBank Japan; Salem, Rany M.; Melbye, Mads; Bisgaard, Hans; Samani, Nilesh J.; Cusi, Daniele; Mackey, David A.; Cooper, Richard S.; Froguel, Philippe; Pasterkamp, Gerard; Grant, Struan F.A.; Hakonarson, Hakon; Ferrucci, Luigi; Scott, Robert A.; Morris, Andrew D.; Palmer, Colin N.A.; Dedoussis, George; Deloukas, Panos; Bertram, Lars; Lindenberger, Ulman; Berndt, Sonja I.; Lindgren, Cecilia M.; Timpson, Nicholas J.; Tönjes, Anke; Munroe, Patricia B.; Sørensen, Thorkild I.A.; Rotimi, Charles N.; Arnett, Donna K.; Oldehinkel, Albertine J.; Kardia, Sharon L.R.; Balkau, Beverley; Gambaro, Giovanni; Morris, Andrew P.; Eriksson, Johan G.; Wright, Margie J.; Martin, Nicholas G.; Hunt, Steven C.; Starr, John M.; Deary, Ian J.; Griffiths, Lyn R.; Tiemeier, Henning; Pirastu, Nicola; Kaprio, Jaakko; Wareham, Nicholas J.; Pérusse, Louis; Wilson, James G.; Girotto, Giorgia; Caulfield, Mark J.; Raitakari, Olli; Boomsma, Dorret I.; Gieger, Christian; van der Harst, Pim; Hicks, Andrew A.; Kraft, Peter; Sinisalo, Juha; Knekt, Paul; Johannesson, Magnus; Magnusson, Patrik K.E.; Hamsten, Anders; Schmidt, Reinhold; Borecki, Ingrid B.; Vartiainen, Erkki; Becker, Diane M.; Bharadwaj, Dwaipayan; Mohlke, Karen L.; Boehnke, Michael; van 
Duijn, Cornelia M.; Sanghera, Dharambir K.; Teumer, Alexander; Zeggini, Eleftheria; Metspalu, Andres; Gasparini, Paolo; Ulivi, Sheila; Ober, Carole; Toniolo, Daniela; Rudan, Igor; Porteous, David J.; Ciullo, Marina; Spector, Tim D.; Hayward, Caroline; Dupuis, Josée; Loos, Ruth J.F.; Wright, Alan F.; Chandak, Giriraj R.; Vollenweider, Peter; Shuldiner, Alan; Ridker, Paul M.; Rotter, Jerome I.; Sattar, Naveed; Gyllensten, Ulf; North, Kari E.; Pirastu, Mario; Psaty, Bruce M.; Weir, David R.; Laakso, Markku; Gudnason, Vilmundur; Takahashi, Atsushi; Chambers, John C.; Kooner, Jaspal S.; Strachan, David P.; Campbell, Harry; Hirschhorn, Joel N.; Perola, Markus

2015-01-01

Homozygosity has long been associated with rare, often devastating, Mendelian disorders [1], and Darwin was one of the first to recognise that inbreeding reduces evolutionary fitness [2]. However, the effect of the more distant parental relatedness common in modern human populations is less well understood. Genomic data now allow us to investigate the effects of homozygosity on traits of public health importance by observing contiguous homozygous segments (runs of homozygosity, ROH), which are inferred to be homozygous along their complete length. Given the low levels of genome-wide homozygosity prevalent in most human populations, information is required on very large numbers of people to provide sufficient power [3,4]. Here we use ROH to study 16 health-related quantitative traits in 354,224 individuals from 102 cohorts and find statistically significant associations between summed runs of homozygosity (SROH) and four complex traits: height, forced expiratory lung volume in 1 second (FEV1), general cognitive ability (g) and educational attainment (nominal p < 1 × 10^−300, 2.1 × 10^−6, 2.5 × 10^−10 and 1.8 × 10^−10, respectively). In each case increased homozygosity was associated with decreased trait value, equivalent to the offspring of first cousins being 1.2 cm shorter and having 10 months less education. Similar effect sizes were found across four continental groups and populations with different degrees of genome-wide homozygosity, providing convincing evidence for the first time that homozygosity, rather than confounding, directly contributes to phenotypic variance. Contrary to earlier reports in substantially smaller samples [5,6], no evidence was seen of an influence of genome-wide homozygosity on blood pressure and low density lipoprotein (LDL) cholesterol, or ten other cardio-metabolic traits. Since directional dominance is predicted for traits under directional evolutionary selection [7], this study provides evidence that increased stature and cognitive function have been positively selected in human evolution, whereas many important risk factors for late-onset complex diseases may not have been. PMID:26131930
Inducible CRISPR genome-editing tool: classifications and future trends.

PubMed

Dai, Xiaofeng; Chen, Xiao; Fang, Qiuwu; Li, Jia; Bai, Zhonghu

2018-06-01

The discovery of the CRISPR-Cas9/dCas9 system has strengthened our capabilities in genome engineering and transformed the field. While Cas9 and dCas9 are programmed to modulate gene expression by introducing DNA breaks, blocking transcription factor recruitment or dragging functional groups towards the targeted sites, sgRNAs determine the genomic loci where the modulation occurs. The off-target problem, due to limited sgRNA specificity and the genome complexity of many species, has raised concerns about the wide application of this revolutionary technique. To solve this problem and, more importantly, to gain control over gene functionality and cell fate, inducible strategies have been continuously developed to offer tailored solutions to specific biological questions. By reviewing recent advances in inducible CRISPR system design and critical elements potentially adding value to such systems, we classify current approaches in this domain into four mechanistically distinct categories, namely, "split system", "allosteric system", "combinatorial system", and "transient delivery system", discuss the pros and cons of each system, and point out the under-explored areas and future directions, with the aim of enriching our toolbox for delicate life engineering.
Health System Implications of Direct-to-Consumer Personal Genome Testing

PubMed Central

McGuire, Amy L.; Burke, Wylie

2010-01-01

Direct-to-consumer personal genome testing is now widely available to consumers. Proponents argue that knowledge is power but critics worry about consumer safety and potential harms resulting from misinterpretation of test information. In this article, we consider the health system implications of direct-to-consumer personal genome testing, focusing on issues of accountability, both corporate and professional. PMID:21071927
An Evaluation Framework for Lossy Compression of Genome Sequencing Quality Values.

PubMed

Alberti, Claudio; Daniels, Noah; Hernaez, Mikel; Voges, Jan; Goldfeder, Rachel L; Hernandez-Lopez, Ana A; Mattavelli, Marco; Berger, Bonnie

2016-01-01

This paper provides the specification and an initial validation of an evaluation framework for the comparison of lossy compressors of genome sequencing quality values. The goal is to define reference data, test sets, tools and metrics that shall be used to evaluate the impact of lossy compression of quality values on human genome variant calling. The functionality of the framework is validated referring to two state-of-the-art genomic compressors. This work has been spurred by the current activity within the ISO/IEC SC29/WG11 technical committee (a.k.a. MPEG), which is investigating the possibility of starting a standardization activity for genomic information representation.
Future health applications of genomics: priorities for communication, behavioral, and social sciences research.

PubMed

McBride, Colleen M; Bowen, Deborah; Brody, Lawrence C; Condit, Celeste M; Croyle, Robert T; Gwinn, Marta; Khoury, Muin J; Koehly, Laura M; Korf, Bruce R; Marteau, Theresa M; McLeroy, Kenneth; Patrick, Kevin; Valente, Thomas W

2010-05-01

Despite the quickening momentum of genomic discovery, the communication, behavioral, and social sciences research needed for translating this discovery into public health applications has lagged behind. The National Human Genome Research Institute held a 2-day workshop in October 2008 convening an interdisciplinary group of scientists to recommend forward-looking priorities for translational research. This research agenda would be designed to redress the top three risk factors (tobacco use, poor diet, and physical inactivity) that contribute to the four major chronic diseases (heart disease, type 2 diabetes, lung disease, and many cancers) and account for half of all deaths worldwide. Three priority research areas were identified: (1) improving the public's genetic literacy in order to enhance consumer skills; (2) gauging whether genomic information improves risk communication and adoption of healthier behaviors more than current approaches; and (3) exploring whether genomic discovery in concert with emerging technologies can elucidate new behavioral intervention targets. Important crosscutting themes also were identified, including the need to: (1) anticipate directions of genomic discovery; (2) take an agnostic scientific perspective in framing research questions asking whether genomic discovery adds value to other health promotion efforts; and (3) consider multiple levels of influence and systems that contribute to important public health problems. The priorities and themes offer a framework for a variety of stakeholders, including those who develop priorities for research funding, interdisciplinary teams engaged in genomics research, and policymakers grappling with how to use the products born of genomics research to address public health challenges. 2010. Published by Elsevier Inc.
Evaluation of non-additive genetic variation in feed-related traits of broiler chickens.

PubMed

Li, Y; Hawken, R; Sapp, R; George, A; Lehnert, S A; Henshall, J M; Reverter, A

2017-03-01

Genome-wide association mapping and genomic predictions of phenotype of individuals in livestock are predominately based on the detection and estimation of additive genetic effects. Non-additive genetic effects are largely ignored. Studies in animals, plants, and humans to assess the impact of non-additive genetic effects in genetic analyses have led to differing conclusions. In this paper, we examined the consequences of including non-additive genetic effects in genome-wide association mapping and genomic prediction of total genetic values in a commercial population of 5,658 broiler chickens genotyped for 45,176 single nucleotide polymorphism (SNP) markers. We employed mixed-model equations and restricted maximum likelihood to analyze 7 feed related traits (TRT1 - TRT7). Dominance variance accounted for a significant proportion of the total genetic variance in all 7 traits, ranging from 29.5% for TRT1 to 58.4% for TRT7. Using a 5-fold cross-validation schema, we found that in spite of the large dominance component, including the estimated dominance effects in the prediction of total genetic values did not improve the accuracy of the predictions for any of the phenotypes. We offer some possible explanations for this counter-intuitive result including the possible confounding of dominance deviations with common environmental effects such as hatch, different directional effects of SNP additive and dominance variations, and the gene-gene interactions' failure to contribute to the level of variance. © 2016 Poultry Science Association Inc.
A prospective study of a quantitative PCR ELISA assay for the diagnosis of CMV pneumonia in lung and heart-transplant recipients.

PubMed

Barber, L; Egan, J J; Lomax, J; Haider, Y; Yonan, N; Woodcock, A A; Turner, A J; Fox, A J

2000-08-01

Qualitative polymerase chain reaction (PCR) for the identification of cytomegalovirus (CMV) infection has a low predictive value for the identification of CMV pneumonia. This study prospectively evaluated the application of a quantitative PCR enzyme-linked immunosorbent assay (ELISA) in 9 lung- and 18 heart-transplant recipients who did not receive ganciclovir prophylaxis. DNA was collected from peripheral blood polymorphonuclear leucocytes (PMNL) posttransplantation. Oligonucleotide primers for the glycoprotein B gene (149 bp) were used in a PCR ELISA with an internal standard for quantitation. CMV disease was defined as histological evidence of end organ damage. The median level of CMV genome equivalents in patients with CMV disease was 2,665/2 × 10^5 PMNL (range 1,200 to 61,606) compared to 100/2 × 10^5 PMNL (range 20 to 855) in patients with infection but no CMV disease (p = 0.036). All patients with CMV disease had genome equivalent levels of >1,200/2 × 10^5 PMNL. A cut-off level of 1,200/2 × 10^5 PMNL had a positive predictive value for CMV disease of 100% and a negative predictive value of 100%. The first detection of CMV genome equivalents above 1,200/2 × 10^5 PMNL was at a median of 58 days (range 47 to 147) posttransplant. Quantitative PCR assays for the diagnosis of CMV infection may predict patients at risk of CMV disease and thereby direct preemptive treatment to high-risk patients.
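The predictive values quoted for the 1,200 cut-off can be illustrated with a minimal sketch. The function name `predictive_values` and the viral loads below are hypothetical and chosen only for illustration; they are not the study data. A load strictly above the cut-off is treated as a positive test, which is an assumption about how the threshold was applied.

```python
def predictive_values(loads, has_disease, cutoff):
    """Return (PPV, NPV) when 'load > cutoff' counts as a positive test."""
    tp = fp = tn = fn = 0
    for load, diseased in zip(loads, has_disease):
        positive = load > cutoff
        if positive and diseased:
            tp += 1          # true positive
        elif positive:
            fp += 1          # false positive
        elif diseased:
            fn += 1          # false negative
        else:
            tn += 1          # true negative
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical loads (genome equivalents per 2 x 10^5 PMNL), NOT study data.
loads = [2665, 61606, 1500, 100, 20, 855, 400]
has_disease = [True, True, True, False, False, False, False]

ppv, npv = predictive_values(loads, has_disease, cutoff=1200)
ppv_low, npv_low = predictive_values(loads, has_disease, cutoff=500)
print(ppv, npv, ppv_low, npv_low)
```

In this toy data the 1,200 cut-off separates the two groups perfectly, while lowering the cut-off to 500 admits a false positive and reduces the positive predictive value.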
Fish genome manipulation and directional breeding.

PubMed

Ye, Ding; Zhu, ZuoYan; Sun, YongHua

2015-02-01

Aquaculture is one of the fastest developing agricultural industries worldwide. One of the most important factors for sustainable aquaculture is the development of high performing culture strains. Genome manipulation offers a powerful method to achieve rapid and directional breeding in fish. We review the history of fish breeding methods based on classical genome manipulation, including polyploidy breeding and nuclear transfer. Then, we discuss the advances and applications of fish directional breeding based on transgenic technology and recently developed genome editing technologies. These methods offer increased efficiency, precision and predictability in genetic improvement over traditional methods.
Chromium and Genomic Stability

PubMed Central

Wise, Sandra S.; Wise, John Pierce

2014-01-01

Many metals serve as micronutrients which protect against genomic instability. Chromium is most abundant in its trivalent and hexavalent forms. Trivalent chromium has historically been considered an essential element, though recent data indicate that while it can have pharmacological effects and value, it is not essential. There are no data indicating that trivalent chromium promotes genomic stability; instead, it may promote genomic instability. Hexavalent chromium is widely accepted as highly toxic and carcinogenic with no nutritional value. Recent data indicate that it causes genomic instability and also has no role in promoting genomic stability. PMID:22192535
Genomic selection in plant breeding.

PubMed

Newell, Mark A; Jannink, Jean-Luc

2014-01-01

Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor and major marker effects. Thus, the GEBV may capture more of the genetic variation for the particular trait under selection.
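As a minimal sketch of the idea that a GEBV sums the contributions of all markers, large and small, the snippet below scores two candidates from genotype codes and marker effects. The 0/1/2 allele-count coding, the function name `gebv`, and every number are illustrative assumptions; in practice the marker effects would be estimated from a training population.

```python
def gebv(genotypes, marker_effects):
    """Sum of allele counts (0, 1 or 2) times the estimated marker effects."""
    if len(genotypes) != len(marker_effects):
        raise ValueError("one effect is needed per marker")
    return sum(g * a for g, a in zip(genotypes, marker_effects))


# Hypothetical effects: one major marker (0.5) and several minor ones.
marker_effects = [0.5, -0.2, 0.1, 0.05]

# Hypothetical selection candidates, coded as counts of the reference allele.
candidates = {
    "A": [2, 0, 1, 1],
    "B": [0, 2, 2, 0],
}

scores = {name: gebv(geno, marker_effects) for name, geno in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Unlike marker-assisted selection on the major marker alone, the minor markers also shift each candidate's score here.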

Bayes Factor based on the Trend Test Incorporating Hardy-Weinberg Disequilibrium: More Powerful to Detect Genetic Association

PubMed Central

Xu, Jinfeng; Yuan, Ao; Zheng, Gang

2012-01-01

In the analysis of case-control genetic association, the trend test and Pearson’s test are the two most commonly used tests. In genome-wide association studies (GWAS), the Bayes factor is a useful tool to support significant p-values, and a better measure than the p-value when results are compared across studies with different sample sizes. When reporting the p-value of the trend test, we propose a Bayes factor directly based on the trend test. To improve the power to detect association under recessive or dominant genetic models, we propose a Bayes factor based on the trend test and incorporating Hardy-Weinberg disequilibrium in cases. When the true model is unknown, or both the trend test and Pearson’s test or other robust tests are applied in genome-wide scans, we propose a joint Bayes factor, combining the previous two Bayes factors. All three Bayes factors studied in this paper have closed forms and are easy to compute without integrations, so they can be reported along with p-values, especially in GWAS. We discuss how to use each of them and how to specify priors. Simulation studies and applications to three GWAS are provided to illustrate their usefulness to detect non-additive gene susceptibility in practice. PMID:22607017
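The abstract does not give the Bayes factor formulas, so the sketch below covers only the underlying trend test, here taken to be the Cochran-Armitage trend test with additive scores (0, 1, 2), which is the usual choice but an assumption about this paper. The function name and the genotype counts are hypothetical.

```python
import math


def trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    Returns the 1-df chi-square statistic and its p-value.
    """
    n = [r + s for r, s in zip(cases, controls)]   # genotype column totals
    R, S = sum(cases), sum(controls)               # row totals
    N = R + S
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * c for x, c in zip(scores, n))
    sum_x2n = sum(x * x * c for x, c in zip(scores, n))
    t = N * sum_xr - R * sum_xn                    # trend numerator
    var_scale = R * S * (N * sum_x2n - sum_xn ** 2)
    chi2 = N * t * t / var_scale
    p_value = math.erfc(math.sqrt(chi2 / 2.0))     # upper tail, 1 df
    return chi2, p_value


# Hypothetical genotype counts (aa, Aa, AA); not taken from any GWAS.
chi2_null, p_null = trend_test([10, 20, 10], [10, 20, 10])
chi2_trend, p_trend = trend_test([10, 20, 30], [30, 20, 10])
print(chi2_null, p_null, chi2_trend, p_trend)
```

With identical case and control counts the statistic is zero, while the second table, in which the risk allele is enriched among cases, gives a chi-square of 20.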
Improving Genomic Prediction in Cassava Field Experiments Using Spatial Analysis.

PubMed

Elias, Ani A; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc

2018-01-04

Cassava (Manihot esculenta Crantz) is an important staple food in sub-Saharan Africa. Breeding experiments were conducted at the International Institute of Tropical Agriculture in cassava to select elite parents. Taking into account the heterogeneity in the field while evaluating these trials can increase the accuracy in estimation of breeding values. We used an exploratory approach using the parametric spatial kernels Power, Spherical, and Gaussian to determine the best kernel for a given scenario. The spatial kernel was fit simultaneously with a genomic kernel in a genomic selection model. Predictability of these models was tested through a 10-fold cross-validation method repeated five times. The best model was chosen as the one with the lowest prediction root mean squared error compared to that of the base model having no spatial kernel. Results from our real and simulated data studies indicated that predictability can be increased by accounting for spatial variation irrespective of the heritability of the trait. In real data scenarios we observed that the accuracy can be increased by a median value of 3.4%. Through simulations, we showed that a 21% increase in accuracy can be achieved. We also found that Range (row) directional spatial kernels, mostly Gaussian, explained the spatial variance in 71% of the scenarios when spatial correlation was significant. Copyright © 2018 Elias et al.
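To make the three kernel families concrete, here is a sketch using common textbook parametrisations. These forms, the parameter names `phi` and `rho`, and the plot layout are assumptions; the paper's exact definitions may differ. Distance is measured along the row (range) direction only, mirroring the directional kernels mentioned in the abstract.

```python
import math


def gaussian_kernel(d, phi):
    """Gaussian correlation: exp(-(d / phi)^2)."""
    return math.exp(-((d / phi) ** 2))


def spherical_kernel(d, phi):
    """Spherical correlation, exactly zero at and beyond the range phi."""
    if d >= phi:
        return 0.0
    r = d / phi
    return 1.0 - 1.5 * r + 0.5 * r ** 3


def power_kernel(d, rho):
    """Power correlation: rho^d with 0 < rho < 1."""
    return rho ** d


def row_correlation_matrix(row_positions, kernel, param):
    """Plot-by-plot correlation using distance along the row direction only."""
    return [[kernel(abs(a - b), param) for b in row_positions]
            for a in row_positions]


rows = [1, 1, 2, 3]  # hypothetical row index of four field plots
K = row_correlation_matrix(rows, gaussian_kernel, 2.0)
for line in K:
    print([round(v, 3) for v in line])
```

Plots sharing a row get correlation 1 under a purely row-directional kernel, so a real analysis would normally add a residual (nugget) term alongside it.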
Genetic gatekeepers: regulating direct-to-consumer genomic services in an era of participatory medicine.

PubMed

Palmer, Jessica Elizabeth

2012-01-01

Should consumers be able to obtain information about their own bodies, even if it has no proven medical value? Direct-to-consumer ("DTC") genomic companies offer consumers two services: generation of the consumer's personal genetic sequence, and interpretation of that sequence in light of current research. Concerned that consumers will misunderstand genomic information and make ill-advised health decisions, regulators, legislators and scholars have advocated restricted access to DTC genomic services. The Food and Drug Administration, which has historically refrained from regulating most genetic tests, has announced its intent to treat DTC genomic services as medical devices because they make "medical claims." This Article argues that FDA regulation of genomic services as medical devices would be counterproductive. Clinical laboratories conducting genetic tests are already overseen by a federal regime administered by the Centers for Medicare and Medicaid Services. While consumers and clinicians would benefit from clearer communication of test results and their health implications, FDA's gatekeeping framework is ill-suited to weigh the safety and efficacy of genomic information that is not medically actionable in traditional ways. Playing gatekeeper would burden FDA's resources, conflict with the patient-empowering policies promoted by personalized medicine initiatives, impair individuals' access to information in which they have powerful autonomy interests, weaken novel participatory research infrastructures, and set a poor precedent for the future regulation of medical information. 
Rather than applying its risk-based regulatory framework to genetic information, FDA should ameliorate regulatory uncertainty by working with the Federal Trade Commission and Centers for Medicare and Medicaid Services to ensure that DTC genomic services deliver analytically valid data, market and implement their services in a truthful manner, and fully disclose the limitations of their services. Federal agencies with relevant expertise should collaborate on standards and best practices for interpreting genetic information in light of scientific uncertainty, and an adverse event reporting system should be established to collect empirical data verifying or disproving the speculative harms resulting from individual access to genetic information. Most of all, FDA should take advantage of this opportunity to adapt its regulatory process to an increasingly informational health ecosystem.
Direct-to-consumer genomics on the scales of autonomy.

PubMed

Vayena, Effy

2015-04-01

Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
A score-statistic approach for determining threshold values in QTL mapping.

PubMed

Kao, Chen-Hung; Ho, Hsiang-An

2012-06-01

So far, the determination of threshold values in QTL mapping has mostly been investigated for backcross and F2 populations, which have relatively simple genome structures. These issues have not been adequately investigated for progeny populations after F2 (advanced populations), which have more complicated genomes. As advanced populations are now widely used in QTL mapping, it is important to address these issues for them in more detail. Because of the increasing number of meiosis cycles, the genomes of advanced populations can be very different from the backcross and F2 genomes. Therefore, methods that account for the specific genome structures present in advanced populations are required to resolve these issues. By considering the differences in genome structure between populations, we formulate more general score test statistics and Gaussian processes to evaluate their threshold values. In general, we found that, given a significance level and a genome size, threshold values for QTL detection are higher for denser marker maps and for more advanced populations. Simulations were performed to validate our approach.
Independent test assessment using the extreme value distribution theory.

PubMed

Almeida, Marcio; Blondell, Lucy; Peralta, Juan M; Kent, Jack W; Jun, Goo; Teslovich, Tanya M; Fuchsberger, Christian; Wood, Andrew R; Manning, Alisa K; Frayling, Timothy M; Cingolani, Pablo E; Sladek, Robert; Dyer, Thomas D; Abecasis, Goncalo; Duggirala, Ravindranath; Blangero, John

2016-01-01

The new generation of whole genome sequencing platforms offers great possibilities and challenges for dissecting the genetic basis of complex traits. With a very high number of sequence variants, a naïve multiple hypothesis threshold correction hinders the identification of reliable associations by excessively reducing statistical power. In this report, we examine 2 alternative approaches to improve the statistical power of a whole genome association study to detect reliable genetic associations. The approaches were tested using the Genetic Analysis Workshop 19 (GAW19) whole genome sequencing data. The first tested method estimates the real number of effective independent tests actually being performed in a whole genome association project by the use of an extreme value distribution and a set of phenotype simulations. Given the familial nature of the GAW19 data and the finite number of pedigree founders in the sample, the number of correlations between genotypes is greater than in a set of unrelated samples. Using our procedure, we estimate that the effective number represents only 15% of the total number of independent tests performed. However, even using this corrected significance threshold, no genome-wide significant association could be detected for systolic and diastolic blood pressure traits. The second approach implements biological relevance-driven hypothesis testing by exploiting prior computational predictions on the effect of nonsynonymous genetic variants detected in a whole genome sequencing association study. This guided testing approach was able to identify 2 promising single-nucleotide polymorphisms (SNPs), 1 for each trait, targeting biologically relevant genes that could help shed light on the genesis of human hypertension. The first gene, PFH14, associated with systolic blood pressure, interacts directly with genes involved in calcium-channel formation, and the second gene, MAP4, encodes a microtubule-associated protein and had already been detected by previous genome-wide association study experiments conducted in an Asian population. Our results highlight the necessity of developing alternative approaches to improve the efficiency of detecting reasonable candidate associations in whole genome sequencing studies.
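The effect of the 15% estimate on the significance threshold can be shown with simple arithmetic. The sketch assumes a Bonferroni-style correction applied to the effective number of tests, and the variant count is a hypothetical round number, not the GAW19 figure. It does not reproduce the extreme value distribution procedure used to obtain the 15%.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


alpha = 0.05
n_variants = 1_000_000        # hypothetical number of tested variants
effective_fraction = 0.15     # effective share of tests, from the abstract

n_effective = effective_fraction * n_variants
naive = bonferroni_threshold(alpha, n_variants)
corrected = bonferroni_threshold(alpha, n_effective)
print(naive, corrected, corrected / naive)
```

Under these assumptions the corrected threshold is about 6.7 times less stringent than the naive one.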
Can multi-subpopulation reference sets improve the genomic predictive ability for pigs?

PubMed

Fangmann, A; Bergfelder-Drüing, S; Tholen, E; Simianer, H; Erbe, M

2015-12-01

In most countries and for most livestock species, genomic evaluations are obtained from within-breed analyses. To achieve reliable breeding values, however, a sufficient reference sample size is essential. To increase this size, the use of multibreed reference populations for small populations is considered a suitable option in other species. Over decades, the separate breeding work of different pig breeding organizations in Germany has led to stratified subpopulations in the breed German Large White. Due to this fact and the limited number of Large White animals available in each organization, there was a pressing need for ascertaining if multi-subpopulation genomic prediction is superior compared with within-subpopulation prediction in pigs. Direct genomic breeding values were estimated with genomic BLUP for the trait "number of piglets born alive" using genotype data (Illumina Porcine 60K SNP BeadChip) from 2,053 German Large White animals from five different commercial pig breeding companies. To assess the prediction accuracy of within- and multi-subpopulation reference sets, a random 5-fold cross-validation with 20 replications was performed. The five subpopulations considered were only slightly differentiated from each other. However, the prediction accuracy of the multi-subpopulations approach was not better than that of the within-subpopulation evaluation, for which the predictive ability was already high. Reference sets composed of closely related multi-subpopulation sets performed better than sets of distantly related subpopulations but not better than the within-subpopulation approach. Despite the low differentiation of the five subpopulations, the genetic connectedness between these different subpopulations seems to be too small to improve the prediction accuracy by applying multi-subpopulation reference sets. Consequently, resources should be used for enlarging the reference population within subpopulation, for example, by adding genotyped females.
Comparison of Marker-Based Genomic Estimated Breeding Values and Phenotypic Evaluation for Selection of Bacterial Spot Resistance in Tomato.

PubMed

Liabeuf, Debora; Sim, Sung-Chur; Francis, David M

2018-03-01

Bacterial spot affects tomato crops (Solanum lycopersicum) grown under humid conditions. Major genes and quantitative trait loci (QTL) for resistance have been described, and multiple loci from diverse sources need to be combined to improve disease control. We investigated genomic selection (GS) prediction models for resistance to Xanthomonas euvesicatoria and experimentally evaluated the accuracy of these models. The training population consisted of 109 families combining resistance from four sources and directionally selected from a population of 1,100 individuals. The families were evaluated on a plot basis in replicated inoculated trials and genotyped with single nucleotide polymorphisms (SNP). We compared the prediction ability of models developed with 14 to 387 SNP. Genomic estimated breeding values (GEBV) were derived using Bayesian least absolute shrinkage and selection operator regression (BL) and ridge regression (RR). Evaluations were based on leave-one-out cross validation and on empirical observations in replicated field trials using the next generation of inbred progeny and a hybrid population resulting from selections in the training population. Prediction ability was evaluated based on correlations between GEBV and phenotypes (r_g), percentage of coselection between genomic and phenotypic selection, and relative efficiency of selection (r_g/r_p). Results were similar with BL and RR models. Models using only markers previously identified as significantly associated with resistance but weighted based on GEBV and mixed models with markers associated with resistance treated as fixed effects and markers distributed in the genome treated as random effects offered greater accuracy and a high percentage of coselection. The accuracy of these models to predict the performance of progeny and hybrids exceeded the accuracy of phenotypic selection.
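Two of the evaluation criteria named above, the correlation between GEBV and phenotype and the percentage of coselection, can be sketched as follows. All values are hypothetical, the helper names are illustrative, and the sketch assumes that higher scores are better and that the same fraction of families is selected by both methods. For a disease score where lower is better, the ranking would be reversed.

```python
def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def top_indices(scores, fraction):
    """Indices of the best-scoring fraction (higher assumed better)."""
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def coselection_percent(gebv, phenotype, fraction):
    """Share of genomically selected families also selected on phenotype."""
    by_gebv = top_indices(gebv, fraction)
    by_pheno = top_indices(phenotype, fraction)
    return 100.0 * len(by_gebv & by_pheno) / len(by_gebv)


# Hypothetical values for six families; not data from the study.
gebv = [0.9, 0.1, 0.5, 0.7, 0.3, 0.8]
phenotype = [0.8, 0.2, 0.9, 0.6, 0.1, 0.7]

r_g = pearson(gebv, phenotype)
cosel = coselection_percent(gebv, phenotype, fraction=0.5)
print(round(r_g, 3), round(cosel, 1))
```

Here two of the three families chosen on GEBV are also chosen on phenotype, giving a coselection of about 67%.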
Empirical comparison between different methods for genomic prediction of number of piglets born alive in moderate sized breeding populations.

PubMed

Fangmann, A; Sharifi, R A; Heinkel, J; Danowski, K; Schrade, H; Erbe, M; Simianer, H

2017-04-01

Currently used multi-step methods to incorporate genomic information in the prediction of breeding values (BV) implicitly involve many assumptions which, if violated, may result in loss of information, inaccuracies and bias. To overcome this, single-step genomic best linear unbiased prediction (ssGBLUP) was proposed combining pedigree, phenotype and genotype of all individuals for genetic evaluation. Our objective was to implement ssGBLUP for genomic predictions in pigs and to compare the accuracy of ssGBLUP with that of multi-step methods with empirical data of moderately sized pig breeding populations. Different predictions were performed: conventional parent average (PA), direct genomic value (DGV) calculated with genomic BLUP (GBLUP), a GEBV obtained by blending the DGV with PA, and ssGBLUP. Data comprised individuals from a German Landrace (LR) and Large White (LW) population. The trait 'number of piglets born alive' (NBA) was available for 182,054 litters of 41,090 LR sows and 15,750 litters from 4534 LW sows. The pedigree contained 174,021 animals, of which 147,461 (26,560) animals were LR (LW) animals. In total, 526 LR and 455 LW animals were genotyped with the Illumina PorcineSNP60 BeadChip. After quality control and imputation, 495 LR (424 LW) animals with 44,368 (43,678) SNP on 18 autosomes remained for the analysis. Predictive abilities, i.e., correlations between de-regressed proofs and genomic BV, were calculated with a five-fold cross validation and with a forward prediction for young genotyped validation animals born after 2011. Generally, predictive abilities for LR were rather small (0.08 for GBLUP, 0.19 for GEBV and 0.18 for ssGBLUP). For LW, ssGBLUP had the greatest predictive ability (0.45). For both breeds, assessment of reliabilities for young genotyped animals indicated that genomic prediction outperforms PA with ssGBLUP providing greater reliabilities (0.40 for LR and 0.32 for LW) than GEBV (0.35 for LR and 0.29 for LW). 
Grouping of animals according to information sources revealed that genomic prediction had the highest potential benefit for genotyped animals without their own phenotype. Although ssGBLUP did not generally outperform GBLUP or GEBV, the results suggest that ssGBLUP can be a useful and conceptually convincing approach for practical genomic prediction of NBA in moderately sized LR and LW populations.
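Predictive ability as described here, the correlation between de-regressed proofs and predicted values in animals held out of training, can be sketched as a five-fold loop. This is an illustration only: the predictor is a stand-in single-covariate least-squares line rather than GBLUP or ssGBLUP, the fold assignment and seed are arbitrary, and averaging the per-fold correlations is an assumption (the abstract does not say whether folds were averaged or pooled).

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares intercept and slope (stand-in for a genomic model)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def predictive_ability(covariate, drp, n_folds=5, seed=1):
    """Mean over folds of cor(prediction, de-regressed proof) in held-out animals."""
    ids = list(range(len(drp)))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    abilities = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in ids if i not in held_out]
        intercept, slope = fit_line([covariate[i] for i in train],
                                    [drp[i] for i in train])
        predicted = [intercept + slope * covariate[i] for i in fold]
        observed = [drp[i] for i in fold]
        abilities.append(pearson(predicted, observed))
    return sum(abilities) / n_folds


# Noise-free hypothetical data: 20 animals whose proof is linear in the covariate.
covariate = [float(i) for i in range(20)]
drp = [2.0 * c + 1.0 for c in covariate]
print(predictive_ability(covariate, drp))
```

Swapping `fit_line` for a real genomic predictor leaves the validation loop unchanged, which is the point of the sketch.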
The business value and cost-effectiveness of genomic medicine.

PubMed

Crawford, James M; Aspinall, Mara G

2012-05-01

Genomic medicine offers the promise of more effective diagnosis and treatment of human diseases. Genome sequencing early in the course of disease may enable more timely and informed intervention, with reduced healthcare costs and improved long-term outcomes. However, genomic medicine strains current models for demonstrating value, challenging efforts to achieve fair payment for services delivered, both for laboratory diagnostics and for use of molecular information in clinical management. Current models of healthcare reform stipulate that care must be delivered at equal or lower cost, with better patient and population outcomes. To achieve demonstrated value, genomic medicine must overcome many uncertainties: the clinical relevance of genomic variation; potential variation in technical performance and/or computational analysis; management of massive information sets; and must have available clinical interventions that can be informed by genomic analysis, so as to attain more favorable cost management of healthcare delivery and demonstrate improvements in cost-effectiveness.
Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

PubMed Central

2010-01-01

Background At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subset of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls. 
Conclusions Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~ 3,000 to 5,000 evenly spaced SNP. PMID:20950478
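Two of the subset strategies described above, taking the highest-ranked SNP by absolute regression coefficient and taking the best-scoring SNP within intervals of roughly equal length, can be sketched as follows. The function names, toy coefficients, positions, and interval length are hypothetical, and the coefficients are assumed to come from an already fitted RR or PLSR model.

```python
def top_ranked(coefficients, k):
    """Indices of the k SNP with the largest absolute regression coefficient."""
    order = sorted(range(len(coefficients)),
                   key=lambda i: abs(coefficients[i]), reverse=True)
    return sorted(order[:k])


def best_per_interval(positions, scores, interval_length):
    """In each interval of `interval_length`, keep the SNP with the highest score."""
    best = {}
    for i, (position, score) in enumerate(zip(positions, scores)):
        interval = int(position // interval_length)
        if interval not in best or score > scores[best[interval]]:
            best[interval] = i
    return [best[interval] for interval in sorted(best)]


# Hypothetical toy example: five SNP on one chromosome.
coefficients = [0.1, -0.5, 0.3, -0.05, 0.4]
positions = [10, 20, 110, 120, 210]
scores = [abs(c) for c in coefficients]

print(top_ranked(coefficients, 2))
print(best_per_interval(positions, scores, 100))
```

Passing minor allele frequency instead of `scores` would give an evenly spaced subset chosen on allele frequency rather than on rank.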
DOE Office of Scientific and Technical Information (OSTI.GOV)

Allen, J; Velsko, S

This report explores the question of whether meaningful conclusions can be drawn regarding the transmission relationship between two microbial samples on the basis of differences observed between the two samples' respective genomes. Unlike similar forensic applications using human DNA, the rapid rate of microbial genome evolution combined with the dynamics of infectious disease require a shift in thinking on what it means for two samples to 'match' in support of a forensic hypothesis. Previous outbreaks for SARS-CoV, FMDV and HIV were examined to investigate the question of how microbial sequence data can be used to draw inferences that link two infected individuals by direct transmission. The results are counterintuitive with respect to human DNA forensic applications in that some genetic change, rather than exact matching, improves confidence in inferring direct transmission links; however, too much genetic change poses challenges, which can weaken confidence in inferred links. High rates of infection coupled with relatively weak selective pressure observed in the SARS-CoV and FMDV data lead to fairly low confidence for direct transmission links. Confidence values for forensic hypotheses increased when testing for the possibility that samples are separated by at most a few intermediate hosts. Moreover, the observed outbreak conditions support the potential to provide high confidence values for hypotheses that exclude direct transmission links. Transmission inferences are based on the total number of observed or inferred genetic changes separating two sequences rather than uniquely weighing the importance of any one genetic mismatch. Thus, inferences are surprisingly robust in the presence of sequencing errors provided the error rates are randomly distributed across all samples in the reference outbreak database and the novel sequence samples in question. 
When the number of observed nucleotide mutations is limited due to characteristics of the outbreak or the availability of only partial rather than whole genome sequencing, indel information was shown to have the potential to improve performance but only for select outbreak conditions. In examined HIV transmission cases, extended evolution proved to be the limiting factor in assigning high confidence to transmission links; however, the potential to correct for extended evolution not associated with transmission events is demonstrated. Outbreak specific conditions such as selective pressure (in the form of varying mutation rate) are shown to impact the strength of inference made, and a Monte Carlo simulation tool is introduced, which is used to provide upper and lower bounds on the confidence values associated with a forensic hypothesis.
Nuclear DNA Amounts in Angiosperms: Progress, Problems and Prospects

PubMed Central

BENNETT, M. D.; LEITCH, I. J.

2005-01-01

• Background The nuclear DNA amount in an unreplicated haploid chromosome complement (1C-value) is a key diversity character with many uses. Angiosperm C-values have been listed for reference purposes since 1976, and pooled in an electronic database since 1997 (http://www.kew.org/cval/homepage). Such lists are cited frequently and provide data for many comparative studies. The last compilation was published in 2000, so a further supplementary list is timely to monitor progress against targets set at the first plant genome size workshop in 1997 and to facilitate new goal setting. • Scope The present work lists DNA C-values for 804 species including first values for 628 species from 88 original sources, not included in any previous compilation, plus additional values for 176 species included in a previous compilation. • Conclusions 1998–2002 saw striking progress in our knowledge of angiosperm C-values. At least 1700 first values for species were measured (the most in any five-year period) and familial representation rose from 30 % to 50 %. The loss of many densitometers used to measure DNA C-values proved less serious than feared, owing to the development of relatively inexpensive flow cytometers and computer-based image analysis systems. 
New uses of the term genome (e.g. in ‘complete’ genome sequencing) can cause confusion. The Arabidopsis Genome Initiative C-value for Arabidopsis thaliana (125 Mb) was a gross underestimate, and an exact C-value based on genome sequencing alone is unlikely to be obtained soon for any angiosperm. Lack of this expected benchmark poses a quandary as to what to use as the basal calibration standard for angiosperms. The next decade offers exciting prospects for angiosperm genome size research. The database (http://www.kew.org/cval/homepage) should become sufficiently representative of the global flora to answer most questions without needing new estimations. DNA amount variation will remain a key interest as an integrated strand of holistic genomics. PMID:15596457
Genomic Model with Correlation Between Additive and Dominance Effects.

PubMed

Xiang, Tao; Christensen, Ole Fredslund; Vitezica, Zulma Gladis; Legarra, Andres

2018-05-09

Dominance genetic effects are rarely included in pedigree-based genetic evaluation. With the availability of single nucleotide polymorphism markers and the development of genomic evaluation, estimates of dominance genetic effects have become feasible using genomic best linear unbiased prediction (GBLUP). Usually, studies involving additive and dominance genetic effects ignore possible relationships between them. It has been often suggested that the magnitude of functional additive and dominance effects at the quantitative trait loci are related, but there is no existing GBLUP-like approach accounting for such correlation. Wellmann and Bennewitz showed two ways of considering directional relationships between additive and dominance effects, which they estimated in a Bayesian framework. However, these relationships cannot be fitted at the level of individuals instead of loci in a mixed model and are not compatible with standard animal or plant breeding software. This comes from a fundamental ambiguity in assigning the reference allele at a given locus. We show that, if there has been selection, assigning the most frequent as the reference allele orients the correlation between functional additive and dominance effects. As a consequence, the most frequent reference allele is expected to have a positive value. We also demonstrate that selection creates negative covariance between genotypic additive and dominance genetic values. For parameter estimation, it is possible to use a combined additive and dominance relationship matrix computed from marker genotypes, and to use standard restricted maximum likelihood (REML) algorithms based on an equivalent model. Through a simulation study, we show that such correlations can easily be estimated by mixed model software and accuracy of prediction for genetic values is slightly improved if such correlations are used in GBLUP. 
However, a model assuming uncorrelated effects and fitting orthogonal breeding values and dominant deviations performed similarly for prediction. Copyright © 2018, Genetics.
Information management systems for pharmacogenomics.

PubMed

Thallinger, Gerhard G; Trajanoski, Slave; Stocker, Gernot; Trajanoski, Zlatko

2002-09-01

The value of high-throughput genomic research is dramatically enhanced by association with key patient data. These data are generally available but of disparate quality and not typically directly associated. A system that could bring these disparate data sources into a common resource connected with functional genomic data would be tremendously advantageous. However, the integration of clinical and accurate interpretation of the generated functional genomic data requires the development of information management systems capable of effectively capturing the data as well as tools to make that data accessible to the laboratory scientist or to the clinician. In this review these challenges and current information technology solutions associated with the management, storage and analysis of high-throughput data are highlighted. It is suggested that the development of a pharmacogenomic data management system which integrates public and proprietary databases, clinical datasets, and data mining tools embedded in a high-performance computing environment should include the following components: parallel processing systems, storage technologies, network technologies, databases and database management systems (DBMS), and application services.
Origin of amphibian and avian chromosomes by fission, fusion, and retention of ancestral chromosomes

PubMed Central

Voss, Stephen R.; Kump, D. Kevin; Putta, Srikrishna; Pauly, Nathan; Reynolds, Anna; Henry, Rema J.; Basa, Saritha; Walker, John A.; Smith, Jeramiah J.

2011-01-01

Amphibian genomes differ greatly in DNA content and chromosome size, morphology, and number. Investigations of this diversity are needed to identify mechanisms that have shaped the evolution of vertebrate genomes. We used comparative mapping to investigate the organization of genes in the Mexican axolotl (Ambystoma mexicanum), a species that presents relatively few chromosomes (n = 14) and a gigantic genome (>20 pg/N). We show extensive conservation of synteny between Ambystoma, chicken, and human, and a positive correlation between the length of conserved segments and genome size. Ambystoma segments are estimated to be four to 51 times longer than homologous human and chicken segments. Strikingly, genes demarking the structures of 28 chicken chromosomes are ordered among linkage groups defining the Ambystoma genome, and we show that these same chromosomal segments are also conserved in a distantly related anuran amphibian (Xenopus tropicalis). Using linkage relationships from the amphibian maps, we predict that three chicken chromosomes originated by fusion, nine to 14 originated by fission, and 12–17 evolved directly from ancestral tetrapod chromosomes. We further show that some ancestral segments were fused prior to the divergence of salamanders and anurans, while others fused independently and randomly as chromosome numbers were reduced in lineages leading to Ambystoma and Xenopus. The maintenance of gene order relationships between chromosomal segments that have greatly expanded and contracted in salamander and chicken genomes, respectively, suggests selection to maintain synteny relationships and/or extremely low rates of chromosomal rearrangement. Overall, the results demonstrate the value of data from diverse, amphibian genomes in studies of vertebrate genome evolution. PMID:21482624
Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies.

PubMed

Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine

2010-03-01

Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis, show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform studies of natural variation, adaptation and evolution of wild populations.
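As an illustration of what a pairwise F(ST) value summarises, the sketch below computes a simple heterozygosity-based F(ST) for two populations from per-SNP allele frequencies. The estimator is an assumption: the abstract does not state which one the authors used, and this version weights both populations equally and applies no sample-size correction. The function name and frequencies are hypothetical.

```python
def pairwise_fst(freqs_a, freqs_b):
    """F_ST = sum(H_T - H_S) / sum(H_T) over biallelic SNP, two equally weighted populations."""
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        # Mean expected heterozygosity within the two populations.
        h_s = (2.0 * p_a * (1.0 - p_a) + 2.0 * p_b * (1.0 - p_b)) / 2.0
        # Expected heterozygosity of the pooled population.
        p_bar = (p_a + p_b) / 2.0
        h_t = 2.0 * p_bar * (1.0 - p_bar)
        numerator += h_t - h_s
        denominator += h_t
    if denominator == 0.0:
        raise ValueError("no polymorphic SNP in the pooled sample")
    return numerator / denominator


# Hypothetical allele frequencies at three SNP in two populations.
upland = [0.2, 0.5, 0.9]
lowland = [0.8, 0.5, 0.1]
print(pairwise_fst(upland, lowland))
```

Identical frequencies give 0 and fixed differences give 1, which is the range against which "marked differences" between populations can be read.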
DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.

PubMed

Goris, Johan; Konstantinidis, Konstantinos T; Klappenbach, Joel A; Coenye, Tom; Vandamme, Peter; Tiedje, James M

2007-01-01

DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
The importance of information on relatives for the prediction of genomic breeding values and the implications for the makeup of reference data sets in livestock breeding schemes.

PubMed

Clark, Samuel A; Hickey, John M; Daetwyler, Hans D; van der Werf, Julius H J

2012-02-09

The theory of genomic selection is based on the prediction of the effects of genetic markers in linkage disequilibrium with quantitative trait loci. However, genomic selection also relies on relationships between individuals to accurately predict genetic value. This study aimed to examine the importance of information on relatives versus that of unrelated or more distantly related individuals on the estimation of genomic breeding values. Simulated and real data were used to examine the effects of various degrees of relationship on the accuracy of genomic selection. Genomic Best Linear Unbiased Prediction (gBLUP) was compared to two pedigree based BLUP methods, one with a shallow one generation pedigree and the other with a deep ten generation pedigree. The accuracy of estimated breeding values for different groups of selection candidates that had varying degrees of relationships to a reference data set of 1750 animals was investigated. The gBLUP method predicted breeding values more accurately than BLUP. The most accurate breeding values were estimated using gBLUP for closely related animals. Similarly, the pedigree based BLUP methods were also accurate for closely related animals, however when the pedigree based BLUP methods were used to predict unrelated animals, the accuracy was close to zero. In contrast, gBLUP breeding values, for animals that had no pedigree relationship with animals in the reference data set, allowed substantial accuracy. An animal's relationship to the reference data set is an important factor for the accuracy of genomic predictions. Animals that share a close relationship to the reference data set had the highest accuracy from genomic predictions. However a baseline accuracy that is driven by the reference data set size and the overall population effective population size enables gBLUP to estimate a breeding value for unrelated animals within a population (breed), using information previously ignored by pedigree based BLUP methods.
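The genomic relationships that gBLUP relies on can be computed directly from marker genotypes. The abstract does not say how the relationship matrix was built, so the sketch below assumes one commonly used construction (centring 0/1/2 genotype codes by twice the allele frequency and scaling by 2Σp(1−p), often attributed to VanRaden) with allele frequencies estimated from the sample itself. The function name and toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """G = Z Z' / (2 * sum p(1-p)), with Z the genotypes centred by 2p.

    `genotypes` is a list of rows, one per animal, coded 0/1/2 per SNP.
    Assumption: allele frequencies are estimated from these same animals.
    """
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_animals)
             for j in range(n_snp)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNP are monomorphic")
    centred = [[row[j] - 2.0 * freqs[j] for j in range(n_snp)]
               for row in genotypes]
    return [[sum(a * b for a, b in zip(centred[i], centred[k])) / scale
             for k in range(n_animals)]
            for i in range(n_animals)]


# Hypothetical genotypes: three animals, two SNP.
genotypes = [[0, 2],
             [1, 1],
             [2, 0]]
for row in genomic_relationship_matrix(genotypes):
    print(row)
```

Off-diagonal entries between a selection candidate and the reference animals are one way to express the "relationship to the reference data set" that the abstract identifies as driving accuracy.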
Combining genomic selection and gene identification for crop improvement

USDA-ARS's Scientific Manuscript database

The use of genetic information to predict the value of individuals in plant breeding populations began about 40 years ago. The original paradigm was to identify genomic regions with outsize influence on a trait of economic value, then to use markers in that genomic region to select individuals carry...

Nuclear DNA amounts in angiosperms: progress, problems and prospects.

PubMed

Bennett, M D; Leitch, I J

2005-01-01

The nuclear DNA amount in an unreplicated haploid chromosome complement (1C-value) is a key diversity character with many uses. Angiosperm C-values have been listed for reference purposes since 1976, and pooled in an electronic database since 1997 (http://www.kew.org/cval/homepage). Such lists are cited frequently and provide data for many comparative studies. The last compilation was published in 2000, so a further supplementary list is timely to monitor progress against targets set at the first plant genome size workshop in 1997 and to facilitate new goal setting. The present work lists DNA C-values for 804 species including first values for 628 species from 88 original sources, not included in any previous compilation, plus additional values for 176 species included in a previous compilation. 1998-2002 saw striking progress in our knowledge of angiosperm C-values. At least 1700 first values for species were measured (the most in any five-year period) and familial representation rose from 30 % to 50 %. The loss of many densitometers used to measure DNA C-values proved less serious than feared, owing to the development of relatively inexpensive flow cytometers and computer-based image analysis systems. New uses of the term genome (e.g. in 'complete' genome sequencing) can cause confusion. The Arabidopsis Genome Initiative C-value for Arabidopsis thaliana (125 Mb) was a gross underestimate, and an exact C-value based on genome sequencing alone is unlikely to be obtained soon for any angiosperm. Lack of this expected benchmark poses a quandary as to what to use as the basal calibration standard for angiosperms. The next decade offers exciting prospects for angiosperm genome size research. The database (http://www.kew.org/cval/homepage) should become sufficiently representative of the global flora to answer most questions without needing new estimations. DNA amount variation will remain a key interest as an integrated strand of holistic genomics.
Comparisons with Caenorhabditis (approximately 100 Mb) and Drosophila (approximately 175 Mb) using flow cytometry show genome size in Arabidopsis to be approximately 157 Mb and thus approximately 25% larger than the Arabidopsis genome initiative estimate of approximately 125 Mb.

PubMed

Bennett, Michael D; Leitch, Ilia J; Price, H James; Johnston, J Spencer

2003-04-01

Recent genome sequencing papers have given genome sizes of 180 Mb for Drosophila melanogaster Iso-1 and 125 Mb for Arabidopsis thaliana Columbia. The former agrees with early cytochemical estimates, but numerous cytometric estimates of around 170 Mb imply that a genome size of 125 Mb for arabidopsis is an underestimate. In this study, nuclei of species pairs were compared directly using flow cytometry. Co-run Columbia and Iso-1 female gave a 2C peak for arabidopsis only approx. 15 % below that for drosophila, and 16C endopolyploid Columbia nuclei had approx. 15 % more DNA than 2C chicken nuclei (with >2280 Mb). Caenorhabditis elegans Bristol N2 (genome size approx. 100 Mb) co-run with Columbia or Iso-1 gave a 2C peak for drosophila approx. 75 % above that for 2C C. elegans, and a 2C peak for arabidopsis approx. 57 % above that for C. elegans. This confirms that 1C in drosophila is approx. 175 Mb and, combined with other evidence, leads us to conclude that the genome size of arabidopsis is not approx. 125 Mb, but probably approx. 157 Mb. It is likely that the discrepancy represents extra repeated sequences in unsequenced gaps in heterochromatic regions. Complete sequencing of the arabidopsis genome until no gaps remain at telomeres, nucleolar organizing regions or centromeres is still needed to provide the first precise angiosperm C-value as a benchmark calibration standard for plant genomes, and to ensure that no genes have been missed in arabidopsis, especially in centromeric regions, which are clearly larger than once imagined.
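The arithmetic behind these estimates can be reproduced from the figures quoted in the abstract. The sketch assumes that the 2C peak position is proportional to DNA content, so a peak a given percentage above the C. elegans peak implies a genome that much larger than the approximately 100 Mb reference; the function and variable names are hypothetical.

```python
C_ELEGANS_MB = 100.0      # approximate 1C size quoted in the abstract
AGI_ESTIMATE_MB = 125.0   # Arabidopsis Genome Initiative estimate


def size_from_peak(reference_mb, percent_above_reference):
    """Genome size implied by a peak lying a given percentage above the reference peak."""
    return reference_mb * (1.0 + percent_above_reference / 100.0)


drosophila_mb = size_from_peak(C_ELEGANS_MB, 75.0)
arabidopsis_mb = size_from_peak(C_ELEGANS_MB, 57.0)
excess_percent = 100.0 * (arabidopsis_mb - AGI_ESTIMATE_MB) / AGI_ESTIMATE_MB

print("Drosophila 1C (Mb):", drosophila_mb)
print("Arabidopsis 1C (Mb):", arabidopsis_mb)
print("Excess over 125 Mb (%):", excess_percent)
```

The implied sizes are about 175 Mb and 157 Mb, and the excess of roughly 25.6% matches the "approximately 25% larger" stated in the title.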
Efficient, footprint-free human iPSC genome editing by consolidation of Cas9/CRISPR and piggyBac technologies.

PubMed

Wang, Gang; Yang, Luhan; Grishin, Dennis; Rios, Xavier; Ye, Lillian Y; Hu, Yong; Li, Kai; Zhang, Donghui; Church, George M; Pu, William T

2017-01-01

Genome editing of human induced pluripotent stem cells (hiPSCs) offers unprecedented opportunities for in vitro disease modeling and personalized cell replacement therapy. The introduction of Cas9-directed genome editing has expanded adoption of this approach. However, marker-free genome editing using standard protocols remains inefficient, yielding desired targeted alleles at a rate of ∼1-5%. We developed a protocol based on a doxycycline-inducible Cas9 transgene carried on a piggyBac transposon to enable robust and highly efficient Cas9-directed genome editing, so that a parental line can be expeditiously engineered to harbor many separate mutations. Treatment with doxycycline and transfection with guide RNA (gRNA), donor DNA and piggyBac transposase resulted in efficient, targeted genome editing and concurrent scarless transgene excision. Using this approach, in 7 weeks it is possible to efficiently obtain genome-edited clones with minimal off-target mutagenesis and with indel mutation frequencies of 40-50% and homology-directed repair (HDR) frequencies of 10-20%.
Expanding the metabolic engineering toolbox with directed evolution.

PubMed

Abatemarco, Joseph; Hill, Andrew; Alper, Hal S

2013-12-01

Cellular systems can be engineered into factories that produce high-value chemicals from renewable feedstock. Such an approach requires an expanded toolbox for metabolic engineering. Recently, protein engineering and directed evolution strategies have started to play a growing and critical role within metabolic engineering. This review focuses on the various ways in which directed evolution can be applied in conjunction with metabolic engineering to improve product yields. Specifically, we discuss the application of directed evolution on both catalytic and non-catalytic traits of enzymes, on regulatory elements, and on whole genomes in a metabolic engineering context. We demonstrate how the goals of metabolic pathway engineering can be achieved in part through evolving cellular parts as opposed to traditional approaches that rely on gene overexpression and deletion. Finally, we discuss the current limitations in screening technology that hinder the full implementation of a metabolic pathway-directed evolution approach. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mixed Model Methods for Genomic Prediction and Variance Component Estimation of Additive and Dominance Effects Using SNP Markers

PubMed Central

Da, Yang; Wang, Chunkao; Wang, Shengwen; Hu, Guo

2014-01-01

We established a genomic model of quantitative trait with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value as breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. Simulation study showed that GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005–0.0003 of the phenotypic variance and GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP, causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level. PMID:24498162
Mixed model methods for genomic prediction and variance component estimation of additive and dominance effects using SNP markers.

PubMed

Da, Yang; Wang, Chunkao; Wang, Shengwen; Hu, Guo

2014-01-01

We established a genomic model of quantitative trait with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value as breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. Simulation study showed that GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005-0.0003 of the phenotypic variance and GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP, causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level.
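The abstract above does not give the CE or QM formulations, so the following is only a minimal sketch of the general idea, under stated assumptions: an additive relationship matrix built from centered allele counts, a dominance relationship matrix built from a centered heterozygote indicator, and GBLUP with variance components taken as known. The function names, the genotype coding (0/1/2) and the scaling constants are assumptions for illustration and may differ from the definitions used by the authors.

```python
import numpy as np

def genomic_relationships(M):
    """Build additive and dominance genomic relationship matrices.

    M: (n_individuals, n_markers) array of allele counts coded 0/1/2.
    The codings below are one common choice, not necessarily the paper's.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    q = 1.0 - p
    W_a = M - 2.0 * p                        # centered allele counts
    A_g = W_a @ W_a.T / np.sum(2.0 * p * q)
    H = (M == 1).astype(float)               # heterozygote indicator
    W_d = H - 2.0 * p * q                    # centered by expected heterozygosity
    D_g = W_d @ W_d.T / np.sum(2.0 * p * q * (1.0 - 2.0 * p * q))
    return A_g, D_g

def gblup(y, A_g, D_g, var_a, var_d, var_e):
    """GBLUP of additive and dominance effects for y = mu + a + d + e.

    Variance components are assumed known here; in practice they would be
    estimated first, for example by GREML.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    V = var_a * A_g + var_d * D_g + var_e * np.eye(n)
    ones = np.ones(n)
    # Generalized least squares estimate of the overall mean
    mu = ones @ np.linalg.solve(V, y) / (ones @ np.linalg.solve(V, ones))
    r = np.linalg.solve(V, y - mu)
    a_hat = var_a * A_g @ r                  # predicted breeding values
    d_hat = var_d * D_g @ r                  # predicted dominance deviations
    return mu, a_hat, d_hat
```

Under this model the predicted total genetic value of each individual would be `a_hat + d_hat`, which corresponds to the summation of additive and dominance effects described in the abstract.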
Genome size of termites (Insecta, Dictyoptera, Isoptera) and wood roaches (Insecta, Dictyoptera, Cryptocercidae)

NASA Astrophysics Data System (ADS)

Koshikawa, Shigeyuki; Miyazaki, Satoshi; Cornette, Richard; Matsumoto, Tadao; Miura, Toru

2008-09-01

The evolution of genome size has been discussed in relation to the evolution of various biological traits. In the present study, the genome sizes of 22 dictyopteran species were estimated by Feulgen image analysis densitometry and 4',6-diamidino-2-phenylindole (DAPI)-based flow cytometry. The haploid genome sizes (C-values) of termites (Isoptera) ranged from 0.58 to 1.90 pg, and those of Cryptocercus wood roaches (Cryptocercidae) were 1.16 to 1.32 pg. Compared to known values of other cockroaches (Blattaria) and mantids (Mantodea), these values are low. A relatively small genome size appears to be a (syn)apomorphy of Isoptera + Cryptocercus, together with their sociality. In some phylogenetic groups, genome size evolution is thought to be influenced by selective pressure on a particular trait, such as cell size or rate of development. The present results raise the possibility that genome size is influenced by selective pressures on traits associated with the evolution of sociality.
Genome size of termites (Insecta, Dictyoptera, Isoptera) and wood roaches (Insecta, Dictyoptera, Cryptocercidae).

PubMed

Koshikawa, Shigeyuki; Miyazaki, Satoshi; Cornette, Richard; Matsumoto, Tadao; Miura, Toru

2008-09-01

The evolution of genome size has been discussed in relation to the evolution of various biological traits. In the present study, the genome sizes of 22 dictyopteran species were estimated by Feulgen image analysis densitometry and 4',6-diamidino-2-phenylindole (DAPI)-based flow cytometry. The haploid genome sizes (C-values) of termites (Isoptera) ranged from 0.58 to 1.90 pg, and those of Cryptocercus wood roaches (Cryptocercidae) were 1.16 to 1.32 pg. Compared to known values of other cockroaches (Blattaria) and mantids (Mantodea), these values are low. A relatively small genome size appears to be a (syn)apomorphy of Isoptera + Cryptocercus, together with their sociality. In some phylogenetic groups, genome size evolution is thought to be influenced by selective pressure on a particular trait, such as cell size or rate of development. The present results raise the possibility that genome size is influenced by selective pressures on traits associated with the evolution of sociality.
Informational laws of genome structures

PubMed Central

Bonnici, Vincenzo; Manca, Vincenzo

2016-01-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
Informational laws of genome structures

NASA Astrophysics Data System (ADS)

Bonnici, Vincenzo; Manca, Vincenzo

2016-06-01

In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
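As a rough illustration of the k-mer view described above, the sketch below computes the empirical Shannon entropy of the k-mers of a sequence, with k chosen from the sequence length. It assumes that lg2(n) can be read as the base-2 logarithm of n and that rounding to the nearest integer is acceptable; the function name `kmer_entropy` and the rounding rule are assumptions for illustration, and the abstract does not specify the exact indexes the authors compute.

```python
import math
from collections import Counter

def kmer_entropy(genome, k=None):
    """Return (k, empirical Shannon entropy in bits) of the k-mers of `genome`.

    If k is not given, it is set to round(log2(n)), where n = len(genome).
    The rounding rule is an assumption made for this sketch.
    """
    n = len(genome)
    if k is None:
        k = max(1, round(math.log2(n)))
    total = n - k + 1                         # number of overlapping k-mers
    counts = Counter(genome[i:i + k] for i in range(total))
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return k, entropy

# Toy usage on a short repetitive string (not a real genome).
k, h = kmer_entropy("ACGT" * 4)
```

The entropy computed this way can never exceed log2(n - k + 1), the value reached when every k-mer in the sequence is distinct, which is one reason a comparison with random sequences of the same length is informative.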
Quantifying whole transcriptome size, a prerequisite for understanding transcriptome evolution across species: an example from a plant allopolyploid.

PubMed

Coate, Jeremy E; Doyle, Jeff J

2010-01-01

Evolutionary biologists are increasingly comparing gene expression patterns across species. Due to the way in which expression assays are normalized, such studies provide no direct information about expression per gene copy (dosage responses) or per cell and can give a misleading picture of genes that are differentially expressed. We describe an assay for estimating relative expression per cell. When used in conjunction with transcript profiling data, it is possible to compare the sizes of whole transcriptomes, which in turn makes it possible to compare expression per cell for each gene in the transcript profiling data set. We applied this approach, using quantitative reverse transcriptase-polymerase chain reaction and high throughput RNA sequencing, to a recently formed allopolyploid and showed that its leaf transcriptome was approximately 1.4-fold larger than either progenitor transcriptome (70% of the sum of the progenitor transcriptomes). In contrast, the allopolyploid genome is 94.3% as large as the sum of its progenitor genomes and retains ≥93.5% of the sum of its progenitor gene complements. Thus, "transcriptome downsizing" is greater than genome downsizing. Using this transcriptome size estimate, we inferred dosage responses for several thousand genes and showed that the majority exhibit partial dosage compensation. Homoeologue silencing is nonrandomly distributed across dosage responses, with genes showing extreme responses in either direction significantly more likely to have a silent homoeologue. This experimental approach will add value to transcript profiling experiments involving interspecies and interploidy comparisons by converting expression per transcriptome to expression per genome, eliminating the need for assumptions about transcriptome size.
Novel genes identified in a high-density genome wide association study for nicotine dependence.

PubMed

Bierut, Laura Jean; Madden, Pamela A F; Breslau, Naomi; Johnson, Eric O; Hatsukami, Dorothy; Pomerleau, Ovide F; Swan, Gary E; Rutter, Joni; Bertelsen, Sarah; Fox, Louis; Fugman, Douglas; Goate, Alison M; Hinrichs, Anthony L; Konvicka, Karel; Martin, Nicholas G; Montgomery, Grant W; Saccone, Nancy L; Saccone, Scott F; Wang, Jen C; Chase, Gary A; Rice, John P; Ballinger, Dennis G

2007-01-01

Tobacco use is a leading contributor to disability and death worldwide, and genetic factors contribute in part to the development of nicotine dependence. To identify novel genes for which natural variation contributes to the development of nicotine dependence, we performed a comprehensive genome wide association study using nicotine dependent smokers as cases and non-dependent smokers as controls. To allow the efficient, rapid, and cost effective screen of the genome, the study was carried out using a two-stage design. In the first stage, genotyping of over 2.4 million single nucleotide polymorphisms (SNPs) was completed in case and control pools. In the second stage, we selected SNPs for individual genotyping based on the most significant allele frequency differences between cases and controls from the pooled results. Individual genotyping was performed in 1050 cases and 879 controls using 31 960 selected SNPs. The primary analysis, a logistic regression model with covariates of age, gender, genotype and gender by genotype interaction, identified 35 SNPs with P-values less than 10^-4 (minimum P-value 1.53 × 10^-6). Although none of the individual findings is statistically significant after correcting for multiple tests, additional statistical analyses support the existence of true findings in this group. Our study nominates several novel genes, such as Neurexin 1 (NRXN1), in the development of nicotine dependence while also identifying a known candidate gene, the beta3 nicotinic cholinergic receptor. This work anticipates the future directions of large-scale genome wide association studies with state-of-the-art methodological approaches and sharing of data with the scientific community.
Natural parameter values for generalized gene adjacency.

PubMed

Yang, Zhenyu; Sankoff, David

2010-09-01

Given the gene orders in two modern genomes, it may be difficult to decide if some genes are close enough in both genomes to infer some ancestral proximity or some functional relationship. Current methods all depend on arbitrary parameters. We explore a class of gene proximity criteria and find two kinds of natural values for their parameters. One kind has to do with the parameter value where the expected information contained in two genomes about each other is maximized. The other kind of natural value has to do with parameter values beyond which all genes are clustered. We analyze these using combinatorial and probabilistic arguments as well as simulations.
Gatekeepers or intermediaries? The role of clinicians in commercial genomic testing.

PubMed

McGowan, Michelle L; Fishman, Jennifer R; Settersten, Richard A; Lambrix, Marcie A; Juengst, Eric T

2014-01-01

Many commentators on "direct-to-consumer" genetic risk information have raised concerns that giving results to individuals with insufficient knowledge and training in genomics may harm consumers, the health care system, and society. In response, several commercial laboratories offering genomic risk profiling have shifted to more traditional "direct-to-provider" (DTP) marketing strategies, repositioning clinicians as the intended recipients of advertising of laboratory services and as gatekeepers to personal genomic information. Increasing popularity of next generation sequencing puts a premium on ensuring that those who are charged with interpreting, translating, communicating and managing commercial genomic risk information are appropriately equipped for the job. To shed light on their gatekeeping role, we conducted a study to assess how and why early clinical users of genomic risk assessment incorporate these tools in their clinical practices and how they interpret genomic information for their patients. We conducted qualitative in-depth interviews with 18 clinicians providing genomic risk assessment services to their patients in partnership with DNA Direct and Navigenics. Our findings suggest that clinicians learned most of what they knew about genomics directly from the commercial laboratories. Clinicians rely on the expertise of the commercial laboratories without the ability to critically evaluate the knowledge or assess risks. The DTP service delivery model cannot guarantee that providers will have adequate expertise or sound clinical judgment. Even if clinicians want greater genomic knowledge, the current market structure is unlikely to build the independent substantive expertise of clinicians, but rather promote its continued outsourcing. Because commercial laboratories have the most "skin in the game" financially, genetics professionals and policymakers should scrutinize the scientific validity and clinical soundness of the process by which these laboratories interpret their findings to assess whether self-interested commercial sources are the most appropriate entities for gate-keeping genomic interpretation.
Value-Based Medicine and Integration of Tumor Biology.

PubMed

Brooks, Gabriel A; Bosserman, Linda D; Mambetsariev, Isa; Salgia, Ravi

2017-01-01

Clinical oncology is in the midst of a genomic revolution, as molecular insights redefine our understanding of cancer biology. Greater awareness of the distinct aberrations that drive carcinogenesis is also contributing to a growing armamentarium of genomically targeted therapies. Although much work remains to better understand how to combine and sequence these therapies, improved outcomes for patients are becoming manifest. As we welcome this genomic revolution in cancer care, oncologists also must grapple with a number of practical problems. Costs of cancer care continue to grow, with targeted therapies responsible for an increasing proportion of spending. Rising costs are bringing the concept of value into sharper focus and challenging the oncology community with implementation of value-based cancer care. This article explores the ways that the genomic revolution is transforming cancer care, describes various frameworks for considering the value of genomically targeted therapies, and outlines key challenges for delivering on the promise of personalized cancer care. It highlights practical solutions for the implementation of value-based care, including investment in biomarker development and clinical trials to improve the efficacy of targeted therapy, the use of evidence-based clinical pathways, team-based care, computerized clinical decision support, and value-based payment approaches.
Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies

PubMed Central

Zhang, Yu; Liu, Jun S.

2011-01-01

Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNP) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in their scopes. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adapted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288
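The claim that Bonferroni correction is too conservative under LD can be illustrated with a small simulation. The sketch below is not the Poisson de-clumping method of the abstract; it only estimates, under the null hypothesis, how often any test exceeds the Bonferroni threshold when test statistics are independent versus correlated within blocks. The block structure, the correlation value and all parameter names are assumptions chosen for illustration.

```python
import numpy as np
from statistics import NormalDist

def bonferroni_fwer(n_tests=200, block=20, rho=0.0, alpha=0.05,
                    n_sim=10000, seed=0):
    """Estimate the family-wise error rate of Bonferroni under the null.

    Tests are grouped in blocks; statistics within a block share a common
    component, giving pairwise correlation `rho` (a crude stand-in for LD).
    """
    rng = np.random.default_rng(seed)
    # Two-sided critical value for a per-test level of alpha / n_tests
    z_crit = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_tests))
    n_blocks = n_tests // block
    shared = rng.standard_normal((n_sim, n_blocks, 1))
    own = rng.standard_normal((n_sim, n_blocks, block))
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own
    # A simulated study counts as a false positive if any test is rejected
    any_hit = (np.abs(z) > z_crit).any(axis=(1, 2))
    return float(any_hit.mean())

fwer_independent = bonferroni_fwer(rho=0.0)
fwer_correlated = bonferroni_fwer(rho=0.9)
```

With independent tests the estimated error rate should sit close to the nominal level, whereas with strong within-block correlation it should fall well below it, which is the sense in which the correction wastes power on correlated SNPs.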
Direct-to-consumer genetic testing: good, bad or benign?

PubMed

Caulfield, T; Ries, N M; Ray, P N; Shuman, C; Wilson, B

2010-02-01

A wide variety of genetic tests are now being marketed and sold in direct-to-consumer (DTC) commercial transactions. However, risk information revealed through many DTC testing services, especially those based on emerging genome wide-association studies, has limited predictive value for consumers. Some commentators contend that tests are being marketed prematurely, while others support rapid translation of genetic research findings to the marketplace. The potential harms and benefits of DTC access to genetic testing are not yet well understood, but some large-scale studies have recently been launched to examine how consumers understand and use genetic risk information. Greater consumer access to genetic tests creates a need for continuing education for health care professionals so they can respond to patients' inquiries about the benefits, risks and limitations of DTC services. Governmental bodies in many jurisdictions are considering options for regulating practices of DTC genetic testing companies, particularly to govern quality of commercial genetic tests and ensure fair and truthful advertising. Intersectoral initiatives involving government regulators, professional bodies and industry are important to facilitate development of standards to govern this rapidly developing area of personalized genomic commerce.
The CRISPR-Cas system - from bacterial immunity to genome engineering.

PubMed

Czarnek, Maria; Bereta, Joanna

2016-09-01

Precise and efficient genome modifications present a great value in attempts to comprehend the roles of particular genes and other genetic elements in biological processes as well as in various pathologies. In recent years novel methods of genome modification known as genome editing, which utilize so-called "programmable" nucleases, came into use. A true revolution in genome editing has been brought about by the introduction of the CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) system, in which one such nuclease, i.e. Cas9, plays a major role. This system is based on the elements of the bacterial and archaeal mechanism responsible for acquired immunity against phage infections and transfer of foreign genetic material. Microorganisms incorporate fragments of foreign DNA into CRISPR loci present in their genomes, which enables fast recognition and elimination of future infections. There are several types of CRISPR-Cas systems among prokaryotes but only elements of CRISPR type II are employed in genome engineering. CRISPR-Cas type II utilizes small RNA molecules (crRNA and tracrRNA) to precisely direct the effector nuclease - Cas9 - to a specific site in the genome, i.e. to the sequence complementary to crRNA. Cas9 may be used to: (i) introduce stable changes into genomes e.g. in the process of generation of knock-out and knock-in animals and cell lines, (ii) activate or silence the expression of a gene of interest, and (iii) visualize specific sites in genomes of living cells. The CRISPR-Cas-based tools have been successfully employed for generation of animal and cell models of a number of diseases, e.g. specific types of cancer. In the future, the genome editing by programmable nucleases may find wide application in medicine e.g. in the therapies of certain diseases of genetic origin and in the therapy of HIV-infected patients.
[Direct-to-consumer genetic testing through Internet: marketing, ethical and social issues].

PubMed

Ducournau, Pascal; Gourraud, Pierre-Antoine; Rial-Sebbag, Emmanuelle; Bulle, Alexandre; Cambon-Thomsen, Anne

2011-01-01

We probably did not anticipate all the consequences of direct-to-consumer genetic tests on the Internet, which result from the combination of communication skills and genomic advances. What commercial strategies do the companies offering direct-to-consumer genetic tests on the Internet use, and which social expectations do they target? A quantitative and qualitative analysis of the web sites offering such tests suggests that these companies target a triple market based on: "healthism", which raises health and hygiene to the top of social values; the contemporary demand of users to become active participants in health decisions; and the need for bio-social relationships. These three commercial strategies raise various ethical and societal issues that justify a general analysis.
Whole-genome typing and characterization of blaVIM19-harbouring ST383 Klebsiella pneumoniae by PFGE, whole-genome mapping and WGS.

PubMed

Sabirova, Julia S; Xavier, Basil Britto; Coppens, Jasmine; Zarkotou, Olympia; Lammens, Christine; Janssens, Lore; Burggrave, Ronald; Wagner, Trevor; Goossens, Herman; Malhotra-Kumar, Surbhi

2016-06-01

We utilized whole-genome mapping (WGM) and WGS to characterize 12 clinical carbapenem-resistant Klebsiella pneumoniae strains (TGH1-TGH12). All strains were screened for carbapenemase genes by PCR, and typed by MLST, PFGE (XbaI) and WGM (AflII) (OpGen, USA). WGS (Illumina) was performed on TGH8 and TGH10. Reads were de novo assembled and annotated [SPAdes, Rapid Annotation Subsystem Technology (RAST)]. Contigs were aligned directly, and after in silico AflII restriction, with corresponding WGMs (MapSolver, OpGen; BioNumerics, Applied Maths). All 12 strains were ST383. Of the 12 strains, 11 were carbapenem resistant, 7 harboured blaKPC-2 and 11 harboured blaVIM-19. Varying the parameters for assigning WGM clusters showed that these were comparable to STs and to the eight PFGE types or subtypes (difference of three or more bands). A 95% similarity coefficient assigned all 12 WGMs to a single cluster, whereas a 99% similarity coefficient (or ≥10 unmatched-fragment difference) assigned the 12 WGMs to eight (sub)clusters. Based on a difference of three or more bands between PFGE profiles, the Simpson's diversity indices (SDIs) of WGM (0.94, Jackknife pseudo-values CI: 0.883-0.996) and PFGE (0.93, Jackknife pseudo-values CI: 0.828-1.000) were similar (P = 0.649). However, the discriminatory power of WGM was significantly higher (SDI: 0.94, Jackknife pseudo-values CI: 0.883-0.996) than that of PFGE profiles typed on a difference of seven or more bands (SDI: 0.53, Jackknife pseudo-values CI: 0.212-0.849) (P = 0.007). This study demonstrates the application of WGM to understanding the epidemiology of hospital-associated K. pneumoniae. Utilizing a combination of WGM and WGS, we also present here the first longitudinal genomic characterization of the highly dynamic carbapenem-resistant ST383 K. pneumoniae clone that is rapidly gaining importance in Europe. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. 
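The Simpson's diversity indices reported above compare the discriminatory power of typing methods. The abstract does not state the formula, so the sketch below assumes the form commonly used for typing schemes, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)), where n_j is the number of isolates of type j and N is the total. The function name and the type labels in the usage line are hypothetical and are not the study's data.

```python
from collections import Counter

def simpson_diversity(type_labels):
    """Simpson's index of diversity for a list of type assignments.

    Returns the probability that two isolates drawn at random without
    replacement belong to different types (assumed formula, see lead-in).
    """
    counts = Counter(type_labels).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two isolates")
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

# Hypothetical assignment of 12 isolates to 8 types (illustration only).
example_index = simpson_diversity(list("AAABBCCDEFGH"))
```

An index of 0 means every isolate falls in one type and an index of 1 means every isolate is its own type, so a method that splits the same 12 strains into more, smaller groups scores higher.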

Ensembl variation resources

PubMed Central

2010-01-01

Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org. PMID:20459805
Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.

PubMed

van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F

2015-09-17

In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. 
To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.
Genome Editing Tools in Plants

PubMed Central

Mohanta, Tapan Kumar; Bashir, Tufail; Hashem, Abeer; Bae, Hanhong

2017-01-01

Genome editing tools have the potential to change the genomic architecture of a genome at precise locations, with desired accuracy. These tools have been efficiently used for trait discovery and for the generation of plants with high crop yields and resistance to biotic and abiotic stresses. Due to complex genomic architecture, it is challenging to edit all of the genes/genomes using a particular genome editing tool. Therefore, to overcome this challenging task, several genome editing tools have been developed to facilitate efficient genome editing. Some of the major genome editing tools used to edit plant genomes are: Homologous recombination (HR), zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), pentatricopeptide repeat proteins (PPRs), the CRISPR/Cas9 system, RNA interference (RNAi), cisgenesis, and intragenesis. In addition, site-directed sequence editing and oligonucleotide-directed mutagenesis have the potential to edit the genome at the single-nucleotide level. Recently, adenine base editors (ABEs) have been developed to mutate A-T base pairs to G-C base pairs. ABEs use deoxyadeninedeaminase (TadA) with catalytically impaired Cas9 nickase to mutate A-T base pairs to G-C base pairs. PMID:29257124
Drosophila Females Undergo Genome Expansion after Interspecific Hybridization

PubMed Central

Romero-Soriano, Valèria; Burlet, Nelly; Vela, Doris; Fontdevila, Antonio; Vieira, Cristina; García Guerreiro, María Pilar

2016-01-01

Genome size (or C-value) can present a wide range of values among eukaryotes. This variation has been attributed to differences in the amplification and deletion of different noncoding repetitive sequences, particularly transposable elements (TEs). TEs can be activated under different stress conditions such as interspecific hybridization events, as described for several species of animals and plants. These massive transposition episodes can lead to considerable genome expansions that could ultimately be involved in hybrid speciation processes. Here, we describe the effects of hybridization and introgression on genome size of Drosophila hybrids. We measured the genome size of two close Drosophila species, Drosophila buzzatii and Drosophila koepferae, their F1 offspring and the offspring from three generations of backcrossed hybrids, in which mobilization of up to 28 different TEs was previously detected. We show that hybrid females indeed present a genome expansion, especially in the first backcross, which could likely be explained by transposition events. Hybrid males, which exhibit more variable C-values among individuals of the same generation, do not present an increased genome size. Thus, we demonstrate that the impact of hybridization on genome size can be detected through flow cytometry and is sex-dependent. PMID:26872773
[The application of metabonomics in modern studies of Chinese materia medica].

PubMed

Chen, Hai-Bin; Zhou, Hong-Guang; Yu, Xiao-Yi

2012-06-01

Metabonomics, a newly developing discipline that follows genomics, transcriptomics, and proteomics, is an important component of systems biology and is believed to be its ultimate direction. It can be applied directly to understand the physiological and biochemical state of an organism as a whole through its "metabolome profile". It can therefore provide a large amount of information different from that originating from other "omics". In the modernization of Chinese materia medica research, the application of metabonomics methods and technologies has broad potential for future development. In particular, it has important theoretical significance and application value in holistic efficacy evaluation, active ingredient studies, and safety research of Chinese materia medica.
Accuracy of estimation of genomic breeding values in pigs using low-density genotypes and imputation.

PubMed

Badke, Yvonne M; Bates, Ronald O; Ernst, Catherine W; Fix, Justin; Steibel, Juan P

2014-04-16

Genomic selection has the potential to increase genetic progress. Genotype imputation of high-density single-nucleotide polymorphism (SNP) genotypes can improve the cost efficiency of genomic breeding value (GEBV) prediction for pig breeding. Consequently, the objectives of this work were to: (1) estimate accuracy of genomic evaluation and GEBV for three traits in a Yorkshire population and (2) quantify the loss of accuracy of genomic evaluation and GEBV when genotypes were imputed under two scenarios: a high-cost, high-accuracy scenario in which only selection candidates were imputed from a low-density platform and a low-cost, low-accuracy scenario in which all animals were imputed using a small reference panel of haplotypes. Phenotypes and genotypes obtained with the PorcineSNP60 BeadChip were available for 983 Yorkshire boars. Genotypes of selection candidates were masked and imputed using tagSNP in the GeneSeek Genomic Profiler (10K). Imputation was performed with BEAGLE using 128 or 1800 haplotypes as reference panels. GEBV were obtained through an animal-centric ridge regression model using de-regressed breeding values as response variables. Accuracy of genomic evaluation was estimated as the correlation between estimated breeding values and GEBV in a 10-fold cross validation design. Accuracy of genomic evaluation using observed genotypes was high for all traits (0.65-0.68). Using genotypes imputed from a large reference panel (accuracy: R² = 0.95) for genomic evaluation did not significantly decrease accuracy, whereas a scenario with genotypes imputed from a small reference panel (R² = 0.88) did show a significant decrease in accuracy. Genomic evaluation based on imputed genotypes in selection candidates can be implemented at a fraction of the cost of a genomic evaluation using observed genotypes and still yield virtually the same accuracy. In contrast, using a very small reference panel of haplotypes to impute training animals and candidates for selection results in lower accuracy of genomic evaluation.
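The evaluation design in this abstract (animal-centric ridge regression scored by correlation in 10-fold cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are simulated, the shrinkage parameter `lam` is arbitrary, and the function names `kernel_ridge_predict` and `cv_accuracy` are hypothetical. It is not the authors' implementation, and it correlates predictions with simulated pseudo-phenotypes rather than with estimated breeding values.

```python
import numpy as np

def kernel_ridge_predict(X_train, y_train, X_test, lam):
    """Animal-centric ridge regression: one weight per training animal."""
    col_means = X_train.mean(axis=0)
    Zt = X_train - col_means            # centred training genotypes
    Zv = X_test - col_means             # test genotypes centred with training means
    mu = y_train.mean()
    K = Zt @ Zt.T                       # animal-by-animal similarity matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    return mu + (Zv @ Zt.T) @ alpha

def cv_accuracy(X, y, lam=5.0, k=10, seed=0):
    """Mean correlation between predictions and held-out values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        pred = kernel_ridge_predict(X[train], y[train], X[test], lam)
        cors.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(cors))

# Simulated stand-in data (hypothetical sizes, not the study's 983 boars).
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
true_b = rng.normal(0, 0.5, size=p)                  # simulated marker effects
y = X @ true_b + rng.normal(0, 1.0, size=n)          # simulated pseudo-phenotypes
acc = cv_accuracy(X, y, lam=5.0, k=10)
print(round(acc, 2))
```

Solving for one weight per animal, rather than one effect per marker, keeps the linear system at the size of the training set, which is the usual reason for preferring the animal-centric form when markers far outnumber animals.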
Direct Capture Technologies for Genomics-Guided Discovery of Natural Products.

PubMed

Chan, Andrew N; Santa Maria, Kevin C; Li, Bo

2016-01-01

Microbes are important producers of natural products, which have played key roles in understanding biology and treating disease. However, the full potential of microbes to produce natural products has yet to be realized; the overwhelming majority of natural product gene clusters encoded in microbial genomes remain "cryptic", and have not been expressed or characterized. In contrast to the fast-growing number of genomic sequences and bioinformatic tools, methods to connect these genes to natural product molecules are still limited, creating a bottleneck in genome-mining efforts to discover novel natural products. Here we review developing technologies that leverage the power of homologous recombination to directly capture natural product gene clusters and express them in model hosts for isolation and structural characterization. Although direct capture is still in its early stages of development, it has been successfully utilized in several different classes of natural products. These early successes will be reviewed, and the methods will be compared and contrasted with existing traditional technologies. Lastly, we will discuss the opportunities for the development of direct capture in other organisms, and possibilities to integrate direct capture with emerging genome-editing techniques to accelerate future study of natural products.
Impact of reduced marker set estimation of genomic relationship matrices on genomic selection for feed efficiency in Angus cattle.

PubMed

Rolf, Megan M; Taylor, Jeremy F; Schnabel, Robert D; McKay, Stephanie D; McClure, Matthew C; Northcutt, Sally L; Kerley, Monty S; Weaber, Robert L

2010-04-19

Molecular estimates of breeding value are expected to increase selection response due to improvements in the accuracy of selection and a reduction in generation interval, particularly for traits that are difficult or expensive to record or are measured late in life. Several statistical methods for incorporating molecular data into breeding value estimation have been proposed; however, most studies have utilized simulated data in which the generated linkage disequilibrium may not represent the targeted livestock population. A genomic relationship matrix was developed for 698 Angus steers and 1,707 Angus sires using 41,028 single nucleotide polymorphisms, and breeding values were estimated using feed efficiency phenotypes (average daily feed intake [AFI], residual feed intake [RFI], and average daily gain) recorded on the steers. The number of SNPs needed to accurately estimate a genomic relationship matrix was evaluated in this population. Results were compared to estimates produced from pedigree-based mixed model analysis of 862 Angus steers with 34,864 identified paternal relatives but no female ancestors. Estimates of additive genetic variance and breeding value accuracies were similar for AFI and RFI using the numerator and genomic relationship matrices despite fewer animals in the genomic analysis. Bootstrap analyses indicated that 2,500-10,000 markers are required for robust estimation of genomic relationship matrices in cattle. This research shows that breeding values and their accuracies may be estimated for commercially important sires for traits recorded in experimental populations without the need for pedigree data to establish identity by descent between members of the commercial and experimental populations when at least 2,500 SNPs are available for the generation of a genomic relationship matrix.
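As an illustration of how a genomic relationship matrix can be built from SNP genotypes, the sketch below uses one widely used construction: allele-frequency centring, scaled by the sum of 2p(1-p) over markers. The abstract does not state which formula the authors used, so this choice, the simulated genotypes, and the function name `genomic_relationship_matrix` are all assumptions.

```python
import numpy as np

def genomic_relationship_matrix(M):
    """M: animals x SNPs array, genotypes coded 0/1/2 (copies of one allele)."""
    p = M.mean(axis=0) / 2.0                 # sample allele frequency per SNP
    keep = (p > 0) & (p < 1)                 # drop monomorphic SNPs
    M, p = M[:, keep], p[keep]
    Z = M - 2.0 * p                          # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulated, unrelated animals (hypothetical data; 2,500 markers chosen only
# because the abstract names it as the lower bound for robust estimation).
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=2500)
M = rng.binomial(2, freqs, size=(30, 2500)).astype(float)
G = genomic_relationship_matrix(M)
print(G.shape, round(float(np.mean(np.diag(G))), 2))
```

For unrelated animals the diagonal of this matrix averages close to 1 and the off-diagonals close to 0; real populations depart from that pattern according to their shared ancestry.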
Cow genotyping strategies for genomic selection in a small dairy cattle population.

PubMed

Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A

2017-01-01

This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets including 197 de-regressed proofs of genotyped bulls, with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlations between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better compared with the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Efficient Breeding by Genomic Mating.

PubMed

Akdemir, Deniz; Sánchez, Julio I

2016-01-01

Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we propose an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem, and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that the proposed approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.
Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.

PubMed

Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi

2018-01-01

Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.
Genotype imputation from various low-density SNP panels and its impact on accuracy of genomic breeding values in pigs.

PubMed

Grossi, D A; Brito, L F; Jafarikia, M; Schenkel, F S; Feng, Z

2018-04-30

The uptake of genomic selection (GS) by the swine industry is still limited by the costs of genotyping. A feasible alternative to overcome this challenge is to genotype animals using an affordable low-density (LD) single nucleotide polymorphism (SNP) chip panel followed by accurate imputation to a high-density panel. Therefore, the main objective of this study was to screen incremental densities of LD panels in order to systematically identify one that balances the tradeoffs among imputation accuracy, prediction accuracy of genomic estimated breeding values (GEBVs), and genotype density (directly associated with genotyping costs). Genotypes using the Illumina Porcine60K BeadChip were available for 1378 Duroc (DU), 2361 Landrace (LA) and 3192 Yorkshire (YO) pigs. In addition, pseudo-phenotypes (de-regressed estimated breeding values) for five economically important traits were provided for the analysis. The reference population for genotype imputation consisted of 931 DU, 1631 LA and 2103 YO animals, and the remaining individuals were included in the validation population of each breed. A LD panel of 3000 evenly spaced SNPs (LD3K) yielded high imputation accuracy rates: 93.78% (DU), 97.07% (LA) and 97.00% (YO) and high correlations (>0.97) between the predicted GEBVs using the actual 60 K SNP genotypes and the imputed 60 K SNP genotypes for all traits and breeds. The imputation accuracy was influenced by the reference population size as well as the amount of parental genotype information available in the reference population. However, parental genotype information became less important when the LD panel had at least 3000 SNPs. The correlation of the GEBVs directly increased with an increase in imputation accuracy. When genotype information for both parents was available, a panel of 300 SNPs (imputed to 60 K) yielded GEBV predictions highly correlated (⩾0.90) with genomic predictions obtained based on the true 60 K panel, for all traits and breeds. For a small reference population with no parents in the reference population, the use of a panel at least as dense as the LD3K is recommended; when both parents are in the reference population, a panel as small as the LD300 might be a feasible option. These findings are of great importance for the development of LD panels for swine in order to reduce genotyping costs, increase the uptake of GS and, therefore, optimize the profitability of the swine industry.
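The two quantities reported in this abstract, an imputation accuracy rate and the correlation between GEBVs from true and imputed genotypes, can be sketched as follows. The sketch assumes that accuracy is the percentage of exactly matching genotype calls, which the abstract does not confirm. The genotypes, the error rate, the marker effects, and the function names `concordance_rate` and `gebv_correlation` are hypothetical.

```python
import numpy as np

def concordance_rate(true_geno, imputed_geno):
    """Percentage of genotype calls that match exactly."""
    true_geno = np.asarray(true_geno)
    imputed_geno = np.asarray(imputed_geno)
    return 100.0 * float(np.mean(true_geno == imputed_geno))

def gebv_correlation(true_geno, imputed_geno, marker_effects):
    """Correlation between GEBVs computed from true and from imputed genotypes."""
    g_true = np.asarray(true_geno) @ marker_effects
    g_imp = np.asarray(imputed_geno) @ marker_effects
    return float(np.corrcoef(g_true, g_imp)[0, 1])

# Simulated stand-in data: 100 animals, 1000 SNPs coded 0/1/2.
rng = np.random.default_rng(42)
true_geno = rng.integers(0, 3, size=(100, 1000))
imputed = true_geno.copy()
# Redraw about 5% of calls at random; some redraws match the truth by chance,
# so the realised error rate is lower than 5%.
errors = rng.random(true_geno.shape) < 0.05
imputed[errors] = rng.integers(0, 3, size=int(errors.sum()))
effects = rng.normal(size=1000)              # hypothetical marker effects

rate = concordance_rate(true_geno, imputed)
corr = gebv_correlation(true_geno, imputed, effects)
print(round(rate, 1), round(corr, 2))
```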
Using flow cytometry to estimate pollen DNA content: improved methodology and applications

PubMed Central

Kron, Paul; Husband, Brian C.

2012-01-01

Background and Aims Flow cytometry has been used to measure nuclear DNA content in pollen, mostly to understand pollen development and detect unreduced gametes. Published data have not always met the high-quality standards required for some applications, in part due to difficulties inherent in the extraction of nuclei. Here we describe a simple and relatively novel method for extracting pollen nuclei, involving the bursting of pollen through a nylon mesh, compare it with other methods and demonstrate its broad applicability and utility. Methods The method was tested across 80 species, 64 genera and 33 families, and the data were evaluated using established criteria for estimating genome size and analysing cell cycle. Filter bursting was directly compared with chopping in five species, yields were compared with published values for sonicated samples, and the method was applied by comparing genome size estimates for leaf and pollen nuclei in six species. Key Results Data quality met generally applied standards for estimating genome size in 81 % of species and the higher best practice standards for cell cycle analysis in 51 %. In 41 % of species we met the most stringent criterion of screening 10 000 pollen grains per sample. In direct comparison with two chopping techniques, our method produced better quality histograms with consistently higher nuclei yields, and yields were higher than previously published results for sonication. In three binucleate and three trinucleate species we found that pollen-based genome size estimates differed from leaf tissue estimates by 1·5 % or less when 1C pollen nuclei were used, while estimates from 2C generative nuclei differed from leaf estimates by up to 2·5 %. Conclusions The high success rate, ease of use and wide applicability of the filter bursting method show that this method can facilitate the use of pollen for estimating genome size and dramatically improve unreduced pollen production estimation with flow cytometry. 
PMID:22875815
Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species.

PubMed

Chen, Ze-Hui; Zhang, Min; Lv, Feng-Hua; Ren, Xue; Li, Wen-Rong; Liu, Ming-Jun; Nam, Kiwoong; Bruford, Michael W; Li, Meng-Hua

2018-04-01

Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X-chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world's sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05-79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X: autosome (X/A) diversity ratio and contrasting to the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep's recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. 
In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on the sheep X-chromosome.
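The neutral expectation of 0.75 quoted in this abstract follows from the different numbers of X chromosomes and autosomes carried by the two sexes. The short derivation below is a standard population-genetic result rather than something taken from the paper. The symbols N_f and N_m (numbers of breeding females and males) and the assumption of equal mutation rates on the X and the autosomes are introduced here for illustration.

```latex
% Neutral expectation for the X-to-autosome diversity ratio.
% Assumptions (not stated in the abstract): N_f breeding females, N_m breeding
% males, equal mutation rate \mu on X and autosomes, diversity \pi = 4 N_e \mu.
\begin{align}
N_{e,A} &= \frac{4 N_m N_f}{N_m + N_f}, &
N_{e,X} &= \frac{9 N_m N_f}{4 N_m + 2 N_f}, \\
\frac{\pi_X}{\pi_A} &= \frac{N_{e,X}}{N_{e,A}}
  = \frac{9\,(N_m + N_f)}{8\,(2 N_m + N_f)}
  \;\xrightarrow{\;N_m = N_f\;}\; \frac{18}{24} = 0.75 .
\end{align}
```

The observed ratio of about 0.6 falls below this value, which is the gap the authors attribute to drift, selection, and sex-biased demography.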
Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species

PubMed Central

Chen, Ze-Hui; Zhang, Min; Lv, Feng-Hua; Ren, Xue; Li, Wen-Rong; Liu, Ming-Jun; Nam, Kiwoong; Bruford, Michael W; Li, Meng-Hua

2018-01-01

Abstract Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X-chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world’s sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05–79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X: autosome (X/A) diversity ratio and contrasting to the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep’s recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. 
In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on the sheep X-chromosome. PMID:29790980
Estimation of total genetic effects for survival time in crossbred laying hens showing cannibalism, using pedigree or genomic information.

PubMed

Brinker, T; Raymond, B; Bijma, P; Vereijken, A; Ellen, E D

2017-02-01

Mortality of laying hens due to cannibalism is a major problem in the egg-laying industry. Survival depends on two genetic effects: the direct genetic effect of the individual itself (DGE) and the indirect genetic effects of its group mates (IGE). For hens housed in sire-family groups, DGE and IGE cannot be estimated using pedigree information, but the combined effect of DGE and IGE is estimated in the total breeding value (TBV). Genomic information provides information on actual genetic relationships between individuals and might be a tool to improve TBV accuracy. We investigated whether genomic information of the sire increased TBV accuracy compared with pedigree information, and we estimated genetic parameters for survival time. A sire model with pedigree information (BLUP) and a sire model with genomic information (ssGBLUP) were used. We used survival time records of 7290 crossbred offspring with intact beaks from four crosses. Cross-validation was used to compare the models. Using ssGBLUP did not improve TBV accuracy compared with BLUP which is probably due to the limited number of sires available per cross (~50). Genetic parameter estimates were similar for BLUP and ssGBLUP. For both BLUP and ssGBLUP, total heritable variance (T²), expressed as a proportion of phenotypic variance, ranged from 0.03 ± 0.04 to 0.25 ± 0.09. Further research is needed on breeding value estimation for socially affected traits measured on individuals kept in single-family groups. © 2016 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH.
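For readers unfamiliar with the quantities named in this abstract, the relationship between DGE, IGE, TBV and T² is usually written as below. This is the common textbook formulation for indirect genetic effect models and is offered only as a sketch. The abstract gives none of these expressions, and the group size n and the subscripts D and I are symbols assumed here.

```latex
% n = number of hens per group (assumption: symbol not given in the abstract).
% A_{D,i}: direct genetic effect of hen i; A_{I,i}: its indirect genetic effect
% on each of its n-1 group mates.
\begin{align}
\mathrm{TBV}_i &= A_{D,i} + (n-1)\,A_{I,i}, \\
\sigma^2_{\mathrm{TBV}} &= \sigma^2_{A_D} + 2\,(n-1)\,\sigma_{A_{DI}} + (n-1)^2\,\sigma^2_{A_I}, \\
T^2 &= \frac{\sigma^2_{\mathrm{TBV}}}{\sigma^2_P}.
\end{align}
```

Because the indirect term is multiplied by the number of group mates, even a small indirect genetic variance can contribute noticeably to the total heritable variance.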
Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis.

PubMed

Jakupciak, John P; Wells, Jeffrey M; Karalus, Richard J; Pawlowski, David R; Lin, Jeffrey S; Feldman, Andrew B

2013-01-01

Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.
Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

PubMed Central

Jakupciak, John P.; Wells, Jeffrey M.; Karalus, Richard J.; Pawlowski, David R.; Lin, Jeffrey S.; Feldman, Andrew B.

2013-01-01

Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations. PMID:24455204
Genome-enabled Modeling of Microbial Biogeochemistry using a Trait-based Approach. Does Increasing Metabolic Complexity Increase Predictive Capabilities?

NASA Astrophysics Data System (ADS)

King, E.; Karaoz, U.; Molins, S.; Bouskill, N.; Anantharaman, K.; Beller, H. R.; Banfield, J. F.; Steefel, C. I.; Brodie, E.

2015-12-01

The biogeochemical functioning of ecosystems is shaped in part by genomic information stored in the subsurface microbiome. Cultivation-independent approaches allow us to extract this information through reconstruction of thousands of genomes from a microbial community. Analysis of these genomes, in turn, gives an indication of the organisms present and their functional roles. However, metagenomic analyses can currently deliver thousands of different genomes that range in abundance/importance, requiring the identification and assimilation of key physiologies and metabolisms to be represented as traits for successful simulation of subsurface processes. Here we focus on incorporating -omics information into BioCrunch, a genome-informed trait-based model that represents the diversity of microbial functional processes within a reactive transport framework. This approach models the rate of nutrient uptake and the thermodynamics of coupled electron donors and acceptors for a range of microbial metabolisms including heterotrophs and chemolithotrophs. Metabolism of exogenous substrates fuels catabolic and anabolic processes, with the proportion of energy used for cellular maintenance, respiration, biomass development, and enzyme production based upon dynamic intracellular and environmental conditions. This internal resource partitioning represents a trade-off against biomass formation and results in microbial community emergence across a fitness landscape. BioCrunch was used here in simulations that included organisms and metabolic pathways derived from a dataset of ~1200 non-redundant genomes reflecting a microbial community in a floodplain aquifer. Metagenomic data were directly used to parameterize trait values related to growth and to identify trait linkages associated with respiration, fermentation, and key enzymatic functions such as plant polymer degradation. 
Simulations spanned a range of metabolic complexities and highlight benefits originating from simulations including a larger number of organisms that more appropriately reflect the in situ microbial community.
A strategy for implementing genomics into nursing practice informed by three behaviour change theories.

PubMed

Leach, Verity; Tonkin, Emma; Lancastle, Deborah; Kirk, Maggie

2016-06-01

Genomics is an ever increasing aspect of nursing practice, with focus being directed towards improving health. The authors present an implementation strategy for the incorporation of genomics into nursing practice within the UK, based on three behaviour change theories and the identification of individuals who are likely to provide support for change. Individuals identified as Opinion Leaders and Adopters of genomics illustrate how changes in behaviour might occur among the nursing profession. The core philosophy of the strategy is that genomic nurse Adopters and Opinion Leaders who have direct interaction with their peers in practice will be best placed to highlight the importance of genomics within the nursing role. The strategy discussed in this paper provides scope for continued nursing education and development of genomics within nursing practice on a larger scale. The recommendations might be of particular relevance for senior staff and management. © 2016 John Wiley & Sons Australia, Ltd.

Factors affecting reproducibility between genome-scale siRNA-based screens

PubMed Central

Barrows, Nicholas J.; Le Sommer, Caroline; Garcia-Blanco, Mariano A.; Pearson, James L.

2011-01-01

RNA interference-based screening is a powerful new genomic technology which addresses gene function en masse. To evaluate factors influencing hit list composition and reproducibility, we performed two identically designed small interfering RNA (siRNA)-based, whole genome screens for host factors supporting yellow fever virus infection. These screens represent two separate experiments completed five months apart and allow the direct assessment of the reproducibility of a given siRNA technology when performed in the same environment. Candidate hit lists generated by sum rank, median absolute deviation, z-score, and strictly standardized mean difference were compared within and between whole genome screens. Application of these analysis methodologies within a single screening dataset using a fixed threshold equivalent to a p-value ≤ 0.001 resulted in hit lists ranging from 82 to 1,140 members and highlighted the tremendous impact analysis methodology has on hit list composition. Intra- and inter-screen reproducibility was significantly influenced by the analysis methodology and ranged from 32% to 99%. This study also highlighted the power of testing at least two independent siRNAs for each gene product in primary screens. To facilitate validation we conclude by suggesting methods to reduce false discovery at the primary screening stage. In this study we present the first comprehensive comparison of multiple analysis strategies, and demonstrate the impact of the analysis methodology on the composition of the “hit list”. Therefore, we propose that the entire dataset derived from functional genome-scale screens, especially if publicly funded, should be made available as is done with data derived from gene expression and genome-wide association studies. PMID:20625183
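The dependence of a hit list on the scoring method, which is the central point of this abstract, can be illustrated with two of the statistics it names: the z-score and a median-absolute-deviation (MAD) based robust score. The sketch below uses simulated readouts. The cut-off of 3.29, which corresponds to a two-sided normal p-value of about 0.001, is an assumption about how such a threshold might be set, and the function names are hypothetical.

```python
import numpy as np

def z_scores(x):
    """Classical z-score: centre on the mean, scale by the standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)

def robust_z_scores(x):
    """Robust score: centre on the median, scale by 1.4826 * MAD."""
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (1.4826 * mad)

def hits(scores, threshold=3.29):
    """Indices whose absolute score exceeds the threshold."""
    return {int(i) for i in np.flatnonzero(np.abs(scores) > threshold)}

def overlap(a, b):
    """Shared hits as a fraction of the union of both hit lists."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Simulated screen: 1000 readouts, the first 10 are strong true hits.
rng = np.random.default_rng(7)
x = rng.normal(size=1000)
x[:10] = -8.0

hits_z = hits(z_scores(x))
hits_mad = hits(robust_z_scores(x))
print(len(hits_z), len(hits_mad), round(overlap(hits_z, hits_mad), 2))
```

Strong hits inflate the standard deviation, so the classical z-score tends to be more conservative than the MAD-based score on the same data. That is one simple way in which the choice of method changes hit list size and composition.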
Searching for statistically significant regulatory modules.

PubMed

Bailey, Timothy L; Noble, William Stafford

2003-10-01

The regulatory machinery controlling gene expression is complex, frequently requiring multiple, simultaneous DNA-protein interactions. The rate at which a gene is transcribed may depend upon the presence or absence of a collection of transcription factors bound to the DNA near the gene. Locating transcription factor binding sites in genomic DNA is difficult because the individual sites are small and tend to occur frequently by chance. True binding sites may be identified by their tendency to occur in clusters, sometimes known as regulatory modules. We describe an algorithm for detecting occurrences of regulatory modules in genomic DNA. The algorithm, called mcast, takes as input a DNA database and a collection of binding site motifs that are known to operate in concert. mcast uses a motif-based hidden Markov model with several novel features. The model incorporates motif-specific p-values, thereby allowing scores from motifs of different widths and specificities to be compared directly. The p-value scoring also allows mcast to only accept motif occurrences with significance below a user-specified threshold, while still assigning better scores to motif occurrences with lower p-values. mcast can search long DNA sequences, modeling length distributions between motifs within a regulatory module, but ignoring length distributions between modules. The algorithm produces a list of predicted regulatory modules, ranked by E-value. We validate the algorithm using simulated data as well as real data sets from fruitfly and human. http://meme.sdsc.edu/MCAST/paper
Big Data Analytics in Medicine and Healthcare.

PubMed

Ristevski, Blagoj; Chen, Ming

2018-05-10

This paper surveys big data, highlighting big data analytics in medicine and healthcare. The big data characteristics of value, volume, velocity, variety, veracity and variability are described. Big data analytics in medicine and healthcare covers the integration and analysis of large amounts of complex heterogeneous data, such as various -omics data (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenomics, diseasomics), biomedical data and electronic health records data. We underline the challenging issues of big data privacy and security. With regard to these big data characteristics, some directions for using suitable and promising open-source distributed data processing software platforms are given.
A human genome-wide loss-of-function screen identifies effective chikungunya antiviral drugs

PubMed Central

Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F.; Lecuit, Marc

2016-01-01

Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents. PMID:27177310
A human genome-wide loss-of-function screen identifies effective chikungunya antiviral drugs.

PubMed

Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F; Lecuit, Marc

2016-05-12

Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents.
Isolation and characterization of a virus-specific ribonucleoprotein complex from reticuloendotheliosis virus-transformed chicken bone marrow cells.

PubMed Central

Wong, T C; Kang, C Y

1978-01-01

Chicken bone marrow cells transformed by reticuloendotheliosis virus (REV) produce in the cytoplasm a ribonucleoprotein (RNP) complex which has a sedimentation value of approximately 80 to 100S and a density of 1.23 g/cm3. This RNP complex is not derived from the mature virion. An endogenous RNA-directed DNA polymerase activity is associated with the RNP complex. The enzyme activity was completely neutralized by anti-REV DNA polymerase antibody but not by anti-avian myeloblastosis virus DNA polymerase antibody. The DNA product from the endogenous RNA-directed DNA polymerase reaction of the RNP complex hybridized to REV RNA but not to avian leukosis virus RNA. The RNA extracted from the RNP hybridized only to REV-specific complementary DNA synthesized from an endogenous DNA polymerase reaction of purified REV. The size of the RNA in the RNP is 30 to 35S, which represents the subunit size of the genomic RNA. No 60S mature genomic RNA was found within the RNP complex. The significance of finding the endogenous DNA polymerase activity in the viral RNP in infected cells and the maturation process of 60S virion RNA of REV are discussed. PMID:81319
Conservation of mRNA secondary structures may filter out mutations in Escherichia coli evolution

PubMed Central

Chursov, Andrey; Frishman, Dmitrij; Shneider, Alexander

2013-01-01

Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins. PMID:23783573
Impact of fitting dominance and additive effects on accuracy of genomic prediction of breeding values in layers.

PubMed

Heidaritabar, M; Wolc, A; Arango, J; Zeng, J; Settar, P; Fulton, J E; O'Sullivan, N P; Bastiaansen, J W M; Fernando, R L; Garrick, D J; Dekkers, J C M

2016-10-01

Most genomic prediction studies fit only additive effects in models to estimate genomic breeding values (GEBV). However, if dominance genetic effects are an important source of variation for complex traits, accounting for them may improve the accuracy of GEBV. We investigated the effect of fitting dominance and additive effects on the accuracy of GEBV for eight egg production and quality traits in a purebred line of brown layers using pedigree or genomic information (42K single-nucleotide polymorphism (SNP) panel). Phenotypes were corrected for the effect of hatch date. Additive and dominance genetic variances were estimated using genomic-based [genomic best linear unbiased prediction (GBLUP)-REML and BayesC] and pedigree-based (PBLUP-REML) methods. Breeding values were predicted using a model that included both additive and dominance effects and a model that included only additive effects. The reference population consisted of approximately 1800 animals hatched between 2004 and 2009, while approximately 300 young animals hatched in 2010 were used for validation. Accuracy of prediction was computed as the correlation between phenotypes and estimated breeding values of the validation animals divided by the square root of the estimate of heritability in the whole population. The proportion of dominance variance to total phenotypic variance ranged from 0.03 to 0.22 with PBLUP-REML across traits, from 0 to 0.03 with GBLUP-REML and from 0.01 to 0.05 with BayesC. Accuracies of GEBV ranged from 0.28 to 0.60 across traits. Inclusion of dominance effects did not improve the accuracy of GEBV, and differences in their accuracies between genomic-based methods were small (0.01-0.05), with GBLUP-REML yielding higher prediction accuracies than BayesC for egg production, egg colour and yolk weight, while BayesC yielded higher accuracies than GBLUP-REML for the other traits. In conclusion, fitting dominance effects did not impact accuracy of genomic prediction of breeding values in this population. © 2016 Blackwell Verlag GmbH.
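The accuracy measure described in this abstract (correlation between phenotypes and estimated breeding values in the validation animals, divided by the square root of heritability) can be written out as a short sketch. The function names and the five toy records are hypothetical and only show the arithmetic.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prediction_accuracy(phenotypes, ebv, heritability):
    """Correlation(phenotype, EBV) scaled by sqrt(h^2), as described in the abstract."""
    return pearson(phenotypes, ebv) / math.sqrt(heritability)

# Hypothetical validation records: corrected phenotypes and predicted breeding values.
phenotypes = [1.0, 2.0, 3.0, 4.0, 5.0]
ebv = [2.0, 4.0, 1.0, 5.0, 3.0]

accuracy = prediction_accuracy(phenotypes, ebv, heritability=0.36)
print(round(accuracy, 2))  # 0.5
```

Dividing by the square root of heritability corrects for the fact that a phenotype is itself only a noisy measure of the true breeding value, so the raw correlation understates how well the prediction tracks the genetic merit.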
Whole-genome sequence-based genomic prediction in laying chickens with different genomic relationship matrices to account for genetic architecture.

PubMed

Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner

2017-01-16

With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336 K segregating SNPs (array data) that included 157 K genic SNPs (i.e. SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, $-\log_{10}(P)$ from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable selection based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively. Prediction with $-\log_{10}(P)$ or squares of SNP effects as weighting factors for building a genomic relationship matrix or BLUP|GA did not increase accuracy, compared to that with identical weights, regardless of the SNP set used. Our results show that little or no benefit was gained when using all imputed WGS data to perform genomic prediction compared to using HD array data regardless of the weighting factors tested. However, using only genic SNPs from WGS data had a positive effect on prediction ability.
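A trait-specific genomic relationship matrix with per-SNP weights can be sketched as follows. The construction shown (centred genotypes Z, diagonal weights D, G = Z D Z' scaled by the weighted sum of 2p(1-p)) is one common VanRaden-style form and is an assumption here; the abstract does not state the exact formula the authors used. The genotype matrix and function name are hypothetical.

```python
def weighted_grm(genotypes, weights):
    """Weighted genomic relationship matrix, G = Z D Z' / sum_j d_j * 2 p_j (1 - p_j).

    genotypes: list of rows (one per animal), each a list of 0/1/2 allele counts.
    weights:   one non-negative weight per SNP (identical weights give the
               unweighted VanRaden-style matrix).
    """
    n = len(genotypes)
    m = len(weights)
    # Allele frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Centre each genotype by twice the allele frequency.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    denom = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return [
        [sum(w * a * b for w, a, b in zip(weights, z[i], z[k])) / denom
         for k in range(n)]
        for i in range(n)
    ]

# Hypothetical genotypes: 4 animals x 3 SNPs.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]

g_identical = weighted_grm(genotypes, [1.0, 1.0, 1.0])
g_weighted = weighted_grm(genotypes, [0.5, 3.0, 1.0])  # e.g. weights from GWAS results
```

Because the weights appear in both numerator and denominator, only their relative sizes matter: multiplying every weight by the same constant leaves the matrix unchanged.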
Directional genomic hybridization for chromosomal inversion discovery and detection.

PubMed

Ray, F Andrew; Zimmerman, Erin; Robinson, Bruce; Cornforth, Michael N; Bedford, Joel S; Goodwin, Edwin H; Bailey, Susan M

2013-04-01

Chromosomal rearrangements are a source of structural variation within the genome that figure prominently in human disease, where the importance of translocations and deletions is well recognized. In principle, inversions (reversals in the orientation of DNA sequences within a chromosome) should have similar detrimental potential. However, the study of inversions has been hampered by traditional approaches used for their detection, which are not particularly robust. Even with significant advances in whole genome approaches, changes in the absolute orientation of DNA remain difficult to detect routinely. Consequently, our understanding of inversions is still surprisingly limited, as is our appreciation for their frequency and involvement in human disease. Here, we introduce the directional genomic hybridization methodology of chromatid painting (a whole new way of looking at structural features of the genome) that can be employed with high resolution on a cell-by-cell basis, and demonstrate its basic capabilities for genome-wide discovery and targeted detection of inversions. Bioinformatics enabled development of sequence- and strand-specific directional probe sets, which when coupled with single-stranded hybridization, greatly improved the resolution and ease of inversion detection. We highlight examples of the far-ranging applicability of this cytogenomics-based approach, which include confirmation of the alignment of the human genome database and evidence that individuals themselves share similar sequence directionality, as well as use in comparative and evolutionary studies for any species whose genome has been sequenced. In addition to applications related to basic mechanistic studies, the information obtainable with strand-specific hybridization strategies may ultimately enable novel gene discovery, thereby benefitting the diagnosis and treatment of a variety of human disease states and disorders including cancer, autism, and idiopathic infertility.
Identification of Thiotetronic Acid Antibiotic Biosynthetic Pathways by Target-directed Genome Mining.

PubMed

Tang, Xiaoyu; Li, Jie; Millán-Aguiñaga, Natalie; Zhang, Jia Jia; O'Neill, Ellis C; Ugalde, Juan A; Jensen, Paul R; Mantovani, Simone M; Moore, Bradley S

2015-12-18

Recent genome sequencing efforts have led to the rapid accumulation of uncharacterized or "orphaned" secondary metabolic biosynthesis gene clusters (BGCs) in public databases. This increase in DNA-sequenced big data has given rise to significant challenges in the applied field of natural product genome mining, including (i) how to prioritize the characterization of orphan BGCs and (ii) how to rapidly connect genes to biosynthesized small molecules. Here, we show that by correlating putative antibiotic resistance genes that encode target-modified proteins with orphan BGCs, we predict the biological function of pathway specific small molecules before they have been revealed in a process we call target-directed genome mining. By querying the pan-genome of 86 Salinispora bacterial genomes for duplicated house-keeping genes colocalized with natural product BGCs, we prioritized an orphan polyketide synthase-nonribosomal peptide synthetase hybrid BGC (tlm) with a putative fatty acid synthase resistance gene. We employed a new synthetic double-stranded DNA-mediated cloning strategy based on transformation-associated recombination to efficiently capture tlm and the related ttm BGCs directly from genomic DNA and to heterologously express them in Streptomyces hosts. We show the production of a group of unusual thiotetronic acid natural products, including the well-known fatty acid synthase inhibitor thiolactomycin that was first described over 30 years ago, yet never at the genetic level in regards to biosynthesis and autoresistance. This finding not only validates the target-directed genome mining strategy for the discovery of antibiotic producing gene clusters without a priori knowledge of the molecule synthesized but also paves the way for the investigation of novel enzymology involved in thiotetronic acid natural product biosynthesis.
Genomic-based multiple-trait evaluation in Eucalyptus grandis using dominant DArT markers.

PubMed

Cappa, Eduardo P; El-Kassaby, Yousry A; Muñoz, Facundo; Garcia, Martín N; Villalba, Pamela V; Klápště, Jaroslav; Marcucci Poltri, Susana N

2018-06-01

We investigated the impact of combining the pedigree- and genomic-based relationship matrices in a multiple-trait individual-tree mixed model (a.k.a., multiple-trait combined approach) on the estimates of heritability and on the genomic correlations between growth and stem straightness in an open-pollinated Eucalyptus grandis population. Additionally, the added advantage of incorporating genomic information on the theoretical accuracies of parents and offspring breeding values was evaluated. Our results suggested that the use of the combined approach for estimating heritabilities and additive genetic correlations in multiple-trait evaluations is advantageous and including genomic information increases the expected accuracy of breeding values. Furthermore, the multiple-trait combined approach was proven to be superior to the single-trait combined approach in predicting breeding values, in particular for low-heritability traits. Finally, our results advocate the use of the combined approach in forest tree progeny testing trials, specifically when a multiple-trait individual-tree mixed model is considered. Copyright © 2018 Elsevier B.V. All rights reserved.
[Preface for genome editing special issue].

PubMed

Gu, Feng; Gao, Caixia

2017-10-25

Genome editing technology, as an innovative biotechnology, has been widely used for editing the genomes of model organisms, animals, plants and microbes. CRISPR/Cas9-based genome editing technology shows great value and potential in the dissection of functional genomics, improved breeding and genetic disease treatment. In the present special issue, the principles and applications of genome editing techniques are summarized. The advantages and disadvantages of current genome editing technology and its future prospects are also highlighted.
A nine-scaffold genome assembly of the nine chromosome sugar beet

USDA-ARS's Scientific Manuscript database

A sugar beet genome sequence is required to take full advantage of the increasingly powerful approaches directed at single nucleotide resolution across the whole genome. A high quality reference genome serves as a benchmark from which other genotypes might be compared and exploited for sugar beet imp...
The development of genomics applied to dairy breeding

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) has profoundly changed dairy cattle breeding in the last decade and can be defined as the use of genomic breeding values (GEBV) in selection programs. The GEBV is the sum of the effects of dense DNA markers across the whole genome, capturing all the quantitative trait loci (QT...
Integration of genomic information into sport horse breeding programs for optimization of accuracy of selection.

PubMed

Haberland, A M; König von Borstel, U; Simianer, H; König, S

2012-09-01

Reliable selection criteria are required for young riding horses to increase genetic gain by increasing accuracy of selection and decreasing generation intervals. In this study, selection strategies incorporating genomic breeding values (GEBVs) were evaluated. Relevant stages of selection in sport horse breeding programs were analyzed by applying selection index theory. Results in terms of accuracies of indices ($r_{TI}$) and relative selection response indicated that information on single nucleotide polymorphism (SNP) genotypes considerably increases the accuracy of breeding values estimated for young horses without own or progeny performance. In a first scenario, the correlation between the breeding value estimated from the SNP genotype and the true breeding value (= accuracy of GEBV) was fixed to a relatively low value of $r_{mg}$ = 0.5. For a low heritability trait ($h^2$ = 0.15), and an index for a young horse based only on information from both parents, additional genomic information doubles $r_{TI}$ from 0.27 to 0.54. Including the conventional information source 'own performance' in the aforementioned index, additional SNP information increases $r_{TI}$ by 40%. Thus, particularly with regard to traits of low heritability, genomic information can provide a tool for well-founded selection decisions early in life. In a further approach, different sources of breeding values (e.g. GEBV and estimated breeding values (EBVs) from different countries) were combined into an overall index when altering accuracies of EBVs and correlations between traits. In summary, we showed that genomic selection strategies have the potential to contribute to a substantial reduction in generation intervals in horse breeding programs.
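The two index accuracies quoted above (0.27 and 0.54) can be reproduced with a short selection-index calculation. This is a sketch under stated assumptions, not the authors' derivation: the parental information is taken to be one phenotypic record on each of two unrelated, non-inbred parents, and the parental index and the GEBV are assumed to be correlated only through the true breeding value. The function names are hypothetical.

```python
import math

def accuracy_parent_index(h2):
    """Accuracy of an index on one record per parent: sqrt(h^2 / 2)."""
    return math.sqrt(h2 / 2.0)

def accuracy_two_sources(r1, r2):
    """Accuracy of the optimal index combining two information sources.

    r1, r2 are the accuracies of each source on its own. The sources are
    assumed to be correlated only through the true breeding value, so their
    mutual correlation is r1 * r2 (an assumption of this sketch).
    """
    r12 = r1 * r2
    r_squared = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    return math.sqrt(r_squared)

r_parents = accuracy_parent_index(0.15)             # h^2 = 0.15 from the abstract
r_combined = accuracy_two_sources(r_parents, 0.5)   # GEBV accuracy 0.5 from the abstract
print(round(r_parents, 2), round(r_combined, 2))  # 0.27 0.54
```

Under these assumptions the parental index alone explains only 7.5% of the variance in true breeding value, so a GEBV of even modest accuracy dominates the combined index, which is why the gain is largest for low-heritability traits.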
Genomics in childhood acute myeloid leukemia comes of age | Office of Cancer Genomics

Cancer.gov

TARGET investigator’s study of nearly 1,000 pediatric acute myeloid leukemia (AML) cases reveals marked differences between the genomic landscapes of pediatric and adult AML and offers directions for future work.
Genomic selection for crossbred performance accounting for breed-specific effects.

PubMed

Lopes, Marcos S; Bovenhuis, Henk; Hidalgo, André M; van Arendonk, Johan A M; Knol, Egbert F; Bastiaansen, John W M

2017-06-26

Breed-specific effects are observed when the same allele of a given genetic marker has a different effect depending on its breed origin, which results in different allele substitution effects across breeds. In such a case, single-breed breeding values may not be the most accurate predictors of crossbred performance. Our aim was to estimate the contribution of alleles from each parental breed to the genetic variance of traits that are measured in crossbred offspring, and to compare the prediction accuracies of estimated direct genomic values (DGV) from a traditional genomic selection model (GS) that are trained on purebred or crossbred data, with accuracies of DGV from a model that accounts for breed-specific effects (BS), trained on purebred or crossbred data. The final dataset was composed of 924 Large White, 924 Landrace and 924 two-way cross (F1) genotyped and phenotyped animals. The traits evaluated were litter size (LS) and gestation length (GL) in pigs. The genetic correlation between purebred and crossbred performance was higher than 0.88 for both LS and GL. For both traits, the additive genetic variance was larger for alleles inherited from the Large White breed compared to alleles inherited from the Landrace breed (0.74 and 0.56 for LS, and 0.42 and 0.40 for GL, respectively). The highest prediction accuracies of crossbred performance were obtained when training was done on crossbred data. For LS, prediction accuracies were the same for GS and BS DGV (0.23), while for GL, prediction accuracy for BS DGV was similar to the accuracy of GS DGV (0.53 and 0.52, respectively). In this study, training on crossbred data resulted in higher prediction accuracy than training on purebred data and evidence of breed-specific effects for LS and GL was demonstrated. However, when training was done on crossbred data, both GS and BS models resulted in similar prediction accuracies. In future studies, traits with a lower genetic correlation between purebred and crossbred performance should be included to further assess the value of the BS model in genomic predictions.
Small genomes and large seeds: chromosome numbers, genome size and seed mass in diploid Aesculus species (Sapindaceae)

PubMed Central

Krahulcová, Anna; Trávníček, Pavel; Rejmánek, Marcel

2017-01-01

Background and Aims: Aesculus L. (horse chestnut, buckeye) is a genus of 12–19 extant woody species native to the temperate Northern Hemisphere. This genus is known for unusually large seeds among angiosperms. While chromosome counts are available for many Aesculus species, only one has had its genome size measured. The aim of this study is to provide more genome size data and analyse the relationship between genome size and seed mass in this genus. Methods: Chromosome numbers in root tip cuttings were confirmed for four species and reported for the first time for three additional species. Flow cytometric measurements of 2C nuclear DNA values were conducted on eight species, and mean seed mass values were estimated for the same taxa. Key Results: The same chromosome number, 2n = 40, was determined in all investigated taxa. Original measurements of 2C values for seven Aesculus species (eight taxa), added to just one reliable datum for A. hippocastanum, confirmed the notion that the genome size in this genus with relatively large seeds is surprisingly low, ranging from 0.955 pg 2C$^{-1}$ in A. parviflora to 1.275 pg 2C$^{-1}$ in A. glabra var. glabra. Conclusions: The chromosome number of 2n = 40 seems to be conclusively the universal 2n number for non-hybrid species in this genus. Aesculus genome sizes are relatively small, not only within its own family, Sapindaceae, but also within woody angiosperms. The genome sizes seem to be distinct and non-overlapping among the four major Aesculus clades. These results provide additional support for the most recent reconstruction of Aesculus phylogeny. The correlation between the 2C values and seed masses in examined Aesculus species is slightly negative and not significant. However, when the four major clades are treated separately, there is consistent positive association between larger genome size and larger seed mass within individual lineages. PMID:28065925
Design, methods, and participant characteristics of the Impact of Personal Genomics (PGen) Study, a prospective cohort study of direct-to-consumer personal genomic testing customers.

PubMed

Carere, Deanna Alexis; Couper, Mick P; Crawford, Scott D; Kalia, Sarah S; Duggan, Jake R; Moreno, Tanya A; Mountain, Joanna L; Roberts, J Scott; Green, Robert C

2014-01-01

Designed in collaboration with 23andMe and Pathway Genomics, the Impact of Personal Genomics (PGen) Study serves as a model for academic-industry partnership and provides a longitudinal dataset for studying psychosocial, behavioral, and health outcomes related to direct-to-consumer personal genomic testing (PGT). Web-based surveys administered at three time points, and linked to individual-level PGT results, provide data on 1,464 PGT customers, of which 71% completed each follow-up survey and 64% completed all three surveys. The cohort includes 15.7% individuals of non-white ethnicity, and encompasses a range of income, education, and health levels. Over 90% of participants agreed to re-contact for future research.

Deleterious Mutations, Apparent Stabilizing Selection and the Maintenance of Quantitative Variation

PubMed Central

Kondrashov, A. S.; Turelli, M.

1992-01-01

Apparent stabilizing selection on a quantitative trait that is not causally connected to fitness can result from the pleiotropic effects of unconditionally deleterious mutations, because, as N. Barton noted, "... individuals with extreme values of the trait will tend to carry more deleterious alleles ...." We use a simple model to investigate the dependence of this apparent selection on the genomic deleterious mutation rate, U; the equilibrium distribution of K, the number of deleterious mutations per genome; and the parameters describing directional selection against deleterious mutations. Unlike previous analyses, we allow for epistatic selection against deleterious alleles. For various selection functions and realistic parameter values, the distribution of K, the distribution of breeding values for a pleiotropically affected trait, and the apparent stabilizing selection function are all nearly Gaussian. The additive genetic variance for the quantitative trait is $kQa^2$, where k is the average number of deleterious mutations per genome, Q is the proportion of deleterious mutations that affect the trait, and $a^2$ is the variance of pleiotropic effects for individual mutations that do affect the trait. In contrast, when the trait is measured in units of its additive standard deviation, the apparent fitness function is essentially independent of Q and $a^2$; and β, the intensity of selection, measured as the ratio of additive genetic variance to the "variance" of the fitness curve, is very close to s = U/k, the selection coefficient against individual deleterious mutations at equilibrium. Therefore, this model predicts appreciable apparent stabilizing selection if s exceeds about 0.03, which is consistent with various data. However, the model also predicts that β must equal $V_m/V_G$, the ratio of new additive variance for the trait introduced each generation by mutation to the standing additive variance. Most, although not all, estimates of this ratio imply apparent stabilizing selection weaker than generally observed. A qualitative argument suggests that even when direct selection is responsible for most of the selection observed on a character, it may be essentially irrelevant to the maintenance of variation for the character by mutation-selection balance. Simple experiments can indicate the fraction of observed stabilizing selection attributable to the pleiotropic effects of deleterious mutations. PMID:1427047
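The quantities in this abstract fit together in a way that a few lines of arithmetic make explicit. The sketch below uses the abstract's expressions for the additive variance (k times Q times a squared) and the selection coefficient (s = U/k). The expression used for the mutational variance (U times Q times a squared) is an assumption of this sketch, chosen as the natural per-generation analogue; with it, the ratio of mutational to standing variance equals U/k, matching the stated prediction. All parameter values are hypothetical.

```python
def additive_variance(k, q, a2):
    """Standing additive variance: k * Q * a^2 (from the abstract)."""
    return k * q * a2

def mutational_variance(u, q, a2):
    """New variance per generation, assumed here to be U * Q * a^2."""
    return u * q * a2

def selection_coefficient(u, k):
    """Selection coefficient against individual mutations at equilibrium: s = U / k."""
    return u / k

# Hypothetical parameter values for illustration only.
U = 1.0    # genomic deleterious mutation rate
k = 20.0   # mean number of deleterious mutations per genome
Q = 0.1    # proportion of mutations affecting the trait
a2 = 0.01  # variance of pleiotropic effects

s = selection_coefficient(U, k)
v_g = additive_variance(k, Q, a2)
v_m = mutational_variance(U, Q, a2)
print(s, v_m / v_g)
```

Here s is 0.05, above the threshold of about 0.03 quoted in the abstract, and the ratio of mutational to standing variance is the same value because Q and the effect variance cancel.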
Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

PubMed

Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

2016-01-01

Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Psychological and behavioural impact of returning personal results from whole-genome sequencing: the HealthSeq project.

PubMed

Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Zinberg, Randi; Wasserstein, Melissa; Kasarskis, Andrew; Diaz, George A; Schadt, Eric E

2017-02-01

Providing ostensibly healthy individuals with personal results from whole-genome sequencing could lead to improved health and well-being via enhanced disease risk prediction, prevention, and diagnosis, but also poses practical and ethical challenges. Understanding how individuals react psychologically and behaviourally will be key in assessing the potential utility of personal whole-genome sequencing. We conducted an exploratory longitudinal cohort study in which quantitative surveys and in-depth qualitative interviews were conducted before and after personal results were returned to individuals who underwent whole-genome sequencing. The participants were offered a range of interpreted results, including Alzheimer's disease, type 2 diabetes, pharmacogenomics, rare disease-associated variants, and ancestry. They were also offered their raw data. Of the 35 participants at baseline, 29 (82.9%) completed the 6-month follow-up. In the quantitative surveys, test-related distress was low, although it was higher at 1-week than 6-month follow-up (Z=2.68, P=0.007). In the 6-month qualitative interviews, most participants felt happy or relieved about their results. A few were concerned, particularly about rare disease-associated variants and Alzheimer's disease results. Two of the 29 participants had sought clinical follow-up as a direct or indirect consequence of rare disease-associated variants results. Several had mentioned their results to their doctors. Some participants felt having their raw data might be medically useful to them in the future. The majority reported positive reactions to having their genomes sequenced, but there were notable exceptions to this. The impact and value of returning personal results from whole-genome sequencing when implemented on a larger scale remains to be seen.
Psychological and behavioural impact of returning personal results from whole-genome sequencing: the HealthSeq project

PubMed Central

Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Zinberg, Randi; Wasserstein, Melissa; Kasarskis, Andrew; Diaz, George A; Schadt, Eric E

2017-01-01

Providing ostensibly healthy individuals with personal results from whole-genome sequencing could lead to improved health and well-being via enhanced disease risk prediction, prevention, and diagnosis, but also poses practical and ethical challenges. Understanding how individuals react psychologically and behaviourally will be key in assessing the potential utility of personal whole-genome sequencing. We conducted an exploratory longitudinal cohort study in which quantitative surveys and in-depth qualitative interviews were conducted before and after personal results were returned to individuals who underwent whole-genome sequencing. The participants were offered a range of interpreted results, including Alzheimer's disease, type 2 diabetes, pharmacogenomics, rare disease-associated variants, and ancestry. They were also offered their raw data. Of the 35 participants at baseline, 29 (82.9%) completed the 6-month follow-up. In the quantitative surveys, test-related distress was low, although it was higher at 1-week than 6-month follow-up (Z=2.68, P=0.007). In the 6-month qualitative interviews, most participants felt happy or relieved about their results. A few were concerned, particularly about rare disease-associated variants and Alzheimer's disease results. Two of the 29 participants had sought clinical follow-up as a direct or indirect consequence of rare disease-associated variants results. Several had mentioned their results to their doctors. Some participants felt having their raw data might be medically useful to them in the future. The majority reported positive reactions to having their genomes sequenced, but there were notable exceptions to this. The impact and value of returning personal results from whole-genome sequencing when implemented on a larger scale remains to be seen. PMID:28051073
Predictive performance of genomic selection methods for carcass traits in Hanwoo beef cattle: impacts of the genetic architecture.

PubMed

Mehrban, Hossein; Lee, Deuk Hwan; Moradi, Mohammad Hossein; IlCho, Chung; Naserkheil, Masoumeh; Ibáñez-Escriche, Noelia

2017-01-04

Hanwoo beef is known for its marbled fat, tenderness, juiciness and characteristic flavor, as well as for its low cholesterol and high omega 3 fatty acid contents. As yet, there has been no comprehensive investigation to estimate genomic selection accuracy for carcass traits in Hanwoo cattle using dense markers. This study aimed at evaluating the accuracy of alternative statistical methods that differed in assumptions about the underlying genetic model for various carcass traits: backfat thickness (BT), carcass weight (CW), eye muscle area (EMA), and marbling score (MS). Accuracies of direct genomic breeding values (DGV) for carcass traits were estimated by applying fivefold cross-validation to a dataset including 1183 animals and approximately 34,000 single nucleotide polymorphisms (SNPs). Accuracies of BayesC, Bayesian LASSO (BayesL) and genomic best linear unbiased prediction (GBLUP) methods were similar for BT, EMA and MS. However, for CW, DGV accuracy was 7% higher with BayesC than with BayesL and GBLUP. The increased accuracy of BayesC, compared to GBLUP and BayesL, was maintained for CW, regardless of the training sample size, but not for BT, EMA, and MS. Genome-wide association studies detected consistent large effects for SNPs on chromosomes 6 and 14 for CW. The predictive performance of the models depended on the trait analyzed. For CW, the results showed a clear superiority of BayesC compared to GBLUP and BayesL. These findings indicate the importance of using a proper variable selection method for genomic selection of traits and also suggest that the genetic architecture that underlies CW differs from that of the other carcass traits analyzed. Thus, our study provides significant new insights into the carcass traits of Hanwoo cattle.
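The fivefold cross-validation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. It assumes a ridge-regression (SNP-BLUP style) predictor as a stand-in for GBLUP, simulated genotypes rather than Hanwoo data, and an arbitrary shrinkage parameter `lam`. It reports the raw correlation between predicted and observed values, whereas published accuracies are often rescaled (for example by the square root of heritability). The function names are hypothetical.

```python
import numpy as np

def ridge_snp_effects(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for marker effects (ridge / SNP-BLUP form)."""
    n_markers = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_markers), X.T @ y)

def kfold_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        # Centre with training-set means only, so the test fold stays unseen.
        mu_x = X[train_mask].mean(axis=0)
        mu_y = y[train_mask].mean()
        b = ridge_snp_effects(X[train_mask] - mu_x, y[train_mask] - mu_y, lam)
        pred = (X[test_idx] - mu_x) @ b + mu_y
        cors.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(cors))

# Simulated toy data (assumption: 200 animals, 50 markers coded 0/1/2).
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(200, 50)).astype(float)
true_b = rng.normal(size=50)
y = X @ true_b + rng.normal(scale=2.0, size=200)

acc = kfold_accuracy(X, y, k=5, lam=10.0)
print(f"mean fivefold correlation: {acc:.2f}")
```

Each animal is predicted exactly once, from a model trained without it, so the averaged correlation reflects out-of-sample performance.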
Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits.

PubMed

Randall, Joshua C; Winkler, Thomas W; Kutalik, Zoltán; Berndt, Sonja I; Jackson, Anne U; Monda, Keri L; Kilpeläinen, Tuomas O; Esko, Tõnu; Mägi, Reedik; Li, Shengxu; Workalemahu, Tsegaselassie; Feitosa, Mary F; Croteau-Chonka, Damien C; Day, Felix R; Fall, Tove; Ferreira, Teresa; Gustafsson, Stefan; Locke, Adam E; Mathieson, Iain; Scherag, Andre; Vedantam, Sailaja; Wood, Andrew R; Liang, Liming; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Dermitzakis, Emmanouil T; Dimas, Antigone S; Karpe, Fredrik; Min, Josine L; Nicholson, George; Clegg, Deborah J; Person, Thomas; Krohn, Jon P; Bauer, Sabrina; Buechler, Christa; Eisinger, Kristina; Bonnefond, Amélie; Froguel, Philippe; Hottenga, Jouke-Jan; Prokopenko, Inga; Waite, Lindsay L; Harris, Tamara B; Smith, Albert Vernon; Shuldiner, Alan R; McArdle, Wendy L; Caulfield, Mark J; Munroe, Patricia B; Grönberg, Henrik; Chen, Yii-Der Ida; Li, Guo; Beckmann, Jacques S; Johnson, Toby; Thorsteinsdottir, Unnur; Teder-Laving, Maris; Khaw, Kay-Tee; Wareham, Nicholas J; Zhao, Jing Hua; Amin, Najaf; Oostra, Ben A; Kraja, Aldi T; Province, Michael A; Cupples, L Adrienne; Heard-Costa, Nancy L; Kaprio, Jaakko; Ripatti, Samuli; Surakka, Ida; Collins, Francis S; Saramies, Jouko; Tuomilehto, Jaakko; Jula, Antti; Salomaa, Veikko; Erdmann, Jeanette; Hengstenberg, Christian; Loley, Christina; Schunkert, Heribert; Lamina, Claudia; Wichmann, H Erich; Albrecht, Eva; Gieger, Christian; Hicks, Andrew A; Johansson, Asa; Pramstaller, Peter P; Kathiresan, Sekar; Speliotes, Elizabeth K; Penninx, Brenda; Hartikainen, Anna-Liisa; Jarvelin, Marjo-Riitta; Gyllensten, Ulf; Boomsma, Dorret I; Campbell, Harry; Wilson, James F; Chanock, Stephen J; Farrall, Martin; Goel, Anuj; Medina-Gomez, Carolina; Rivadeneira, Fernando; Estrada, Karol; Uitterlinden, André G; Hofman, Albert; Zillikens, M Carola; den Heijer, Martin; Kiemeney, Lambertus A; Maschio, Andrea; Hall, Per; Tyrer, Jonathan; Teumer, Alexander; Völzke, Henry; Kovacs, Peter; Tönjes, Anke; Mangino, 
Massimo; Spector, Tim D; Hayward, Caroline; Rudan, Igor; Hall, Alistair S; Samani, Nilesh J; Attwood, Antony Paul; Sambrook, Jennifer G; Hung, Joseph; Palmer, Lyle J; Lokki, Marja-Liisa; Sinisalo, Juha; Boucher, Gabrielle; Huikuri, Heikki; Lorentzon, Mattias; Ohlsson, Claes; Eklund, Niina; Eriksson, Johan G; Barlassina, Cristina; Rivolta, Carlo; Nolte, Ilja M; Snieder, Harold; Van der Klauw, Melanie M; Van Vliet-Ostaptchouk, Jana V; Gejman, Pablo V; Shi, Jianxin; Jacobs, Kevin B; Wang, Zhaoming; Bakker, Stephan J L; Mateo Leach, Irene; Navis, Gerjan; van der Harst, Pim; Martin, Nicholas G; Medland, Sarah E; Montgomery, Grant W; Yang, Jian; Chasman, Daniel I; Ridker, Paul M; Rose, Lynda M; Lehtimäki, Terho; Raitakari, Olli; Absher, Devin; Iribarren, Carlos; Basart, Hanneke; Hovingh, Kees G; Hyppönen, Elina; Power, Chris; Anderson, Denise; Beilby, John P; Hui, Jennie; Jolley, Jennifer; Sager, Hendrik; Bornstein, Stefan R; Schwarz, Peter E H; Kristiansson, Kati; Perola, Markus; Lindström, Jaana; Swift, Amy J; Uusitupa, Matti; Atalay, Mustafa; Lakka, Timo A; Rauramaa, Rainer; Bolton, Jennifer L; Fowkes, Gerry; Fraser, Ross M; Price, Jackie F; Fischer, Krista; Krjutškov, Kaarel; Metspalu, Andres; Mihailov, Evelin; Langenberg, Claudia; Luan, Jian'an; Ong, Ken K; Chines, Peter S; Keinanen-Kiukaanniemi, Sirkka M; Saaristo, Timo E; Edkins, Sarah; Franks, Paul W; Hallmans, Göran; Shungin, Dmitry; Morris, Andrew David; Palmer, Colin N A; Erbel, Raimund; Moebus, Susanne; Nöthen, Markus M; Pechlivanis, Sonali; Hveem, Kristian; Narisu, Narisu; Hamsten, Anders; Humphries, Steve E; Strawbridge, Rona J; Tremoli, Elena; Grallert, Harald; Thorand, Barbara; Illig, Thomas; Koenig, Wolfgang; Müller-Nurasyid, Martina; Peters, Annette; Boehm, Bernhard O; Kleber, Marcus E; März, Winfried; Winkelmann, Bernhard R; Kuusisto, Johanna; Laakso, Markku; Arveiler, Dominique; Cesana, Giancarlo; Kuulasmaa, Kari; Virtamo, Jarmo; Yarnell, John W G; Kuh, Diana; Wong, Andrew; Lind, Lars; de Faire, Ulf; 
Gigante, Bruna; Magnusson, Patrik K E; Pedersen, Nancy L; Dedoussis, George; Dimitriou, Maria; Kolovou, Genovefa; Kanoni, Stavroula; Stirrups, Kathleen; Bonnycastle, Lori L; Njølstad, Inger; Wilsgaard, Tom; Ganna, Andrea; Rehnberg, Emil; Hingorani, Aroon; Kivimaki, Mika; Kumari, Meena; Assimes, Themistocles L; Barroso, Inês; Boehnke, Michael; Borecki, Ingrid B; Deloukas, Panos; Fox, Caroline S; Frayling, Timothy; Groop, Leif C; Haritunians, Talin; Hunter, David; Ingelsson, Erik; Kaplan, Robert; Mohlke, Karen L; O'Connell, Jeffrey R; Schlessinger, David; Strachan, David P; Stefansson, Kari; van Duijn, Cornelia M; Abecasis, Gonçalo R; McCarthy, Mark I; Hirschhorn, Joel N; Qi, Lu; Loos, Ruth J F; Lindgren, Cecilia M; North, Kari E; Heid, Iris M

2013-06-01

Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10^-8), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits.
Clinical Application of Genome and Exome Sequencing as a Diagnostic Tool for Pediatric Patients: a Scoping Review of the Literature.

PubMed

Smith, Hadley Stevens; Swint, J Michael; Lalani, Seema R; Yamal, Jose-Miguel; de Oliveira Otto, Marcia C; Castellanos, Stephan; Taylor, Amy; Lee, Brendan H; Russell, Heidi V

2018-05-14

Availability of clinical genomic sequencing (CGS) has generated questions about the value of genome and exome sequencing as a diagnostic tool. Analysis of reported CGS application can inform uptake and direct further research. This scoping literature review aims to synthesize evidence on the clinical and economic impact of CGS. PubMed, Embase, and Cochrane were searched for peer-reviewed articles published between 2009 and 2017 on diagnostic CGS for infant and pediatric patients. Articles were classified according to sample size and whether economic evaluation was a primary research objective. Data on patient characteristics, clinical setting, and outcomes were extracted and narratively synthesized. Of 171 included articles, 131 were case reports, 40 were aggregate analyses, and 4 had a primary economic evaluation aim. Diagnostic yield was the only consistently reported outcome. Median diagnostic yield in aggregate analyses was 33.2% but varied by broad clinical categories and test type. Reported CGS use has rapidly increased and spans diverse clinical settings and patient phenotypes. Economic evaluations support the cost-saving potential of diagnostic CGS. Multidisciplinary implementation research, including more robust outcome measurement and economic evaluation, is needed to demonstrate clinical utility and cost-effectiveness of CGS.
Sex-stratified Genome-wide Association Studies Including 270,000 Individuals Show Sexual Dimorphism in Genetic Loci for Anthropometric Traits

PubMed Central

Jackson, Anne U.; Monda, Keri L.; Kilpeläinen, Tuomas O.; Esko, Tõnu; Mägi, Reedik; Li, Shengxu; Workalemahu, Tsegaselassie; Feitosa, Mary F.; Croteau-Chonka, Damien C.; Day, Felix R.; Fall, Tove; Ferreira, Teresa; Gustafsson, Stefan; Locke, Adam E.; Mathieson, Iain; Scherag, Andre; Vedantam, Sailaja; Wood, Andrew R.; Liang, Liming; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Dermitzakis, Emmanouil T.; Dimas, Antigone S.; Karpe, Fredrik; Min, Josine L.; Nicholson, George; Clegg, Deborah J.; Person, Thomas; Krohn, Jon P.; Bauer, Sabrina; Buechler, Christa; Eisinger, Kristina; Bonnefond, Amélie; Froguel, Philippe; Hottenga, Jouke-Jan; Prokopenko, Inga; Waite, Lindsay L.; Harris, Tamara B.; Smith, Albert Vernon; Shuldiner, Alan R.; McArdle, Wendy L.; Caulfield, Mark J.; Munroe, Patricia B.; Grönberg, Henrik; Chen, Yii-Der Ida; Li, Guo; Beckmann, Jacques S.; Johnson, Toby; Thorsteinsdottir, Unnur; Teder-Laving, Maris; Khaw, Kay-Tee; Wareham, Nicholas J.; Zhao, Jing Hua; Amin, Najaf; Oostra, Ben A.; Kraja, Aldi T.; Province, Michael A.; Cupples, L. Adrienne; Heard-Costa, Nancy L.; Kaprio, Jaakko; Ripatti, Samuli; Surakka, Ida; Collins, Francis S.; Saramies, Jouko; Tuomilehto, Jaakko; Jula, Antti; Salomaa, Veikko; Erdmann, Jeanette; Hengstenberg, Christian; Loley, Christina; Schunkert, Heribert; Lamina, Claudia; Wichmann, H. Erich; Albrecht, Eva; Gieger, Christian; Hicks, Andrew A.; Johansson, Åsa; Pramstaller, Peter P.; Kathiresan, Sekar; Speliotes, Elizabeth K.; Penninx, Brenda; Hartikainen, Anna-Liisa; Jarvelin, Marjo-Riitta; Gyllensten, Ulf; Boomsma, Dorret I.; Campbell, Harry; Wilson, James F.; Chanock, Stephen J.; Farrall, Martin; Goel, Anuj; Medina-Gomez, Carolina; Rivadeneira, Fernando; Estrada, Karol; Uitterlinden, André G.; Hofman, Albert; Zillikens, M. 
Carola; den Heijer, Martin; Kiemeney, Lambertus A.; Maschio, Andrea; Hall, Per; Tyrer, Jonathan; Teumer, Alexander; Völzke, Henry; Kovacs, Peter; Tönjes, Anke; Mangino, Massimo; Spector, Tim D.; Hayward, Caroline; Rudan, Igor; Hall, Alistair S.; Samani, Nilesh J.; Attwood, Antony Paul; Sambrook, Jennifer G.; Hung, Joseph; Palmer, Lyle J.; Lokki, Marja-Liisa; Sinisalo, Juha; Boucher, Gabrielle; Huikuri, Heikki; Lorentzon, Mattias; Ohlsson, Claes; Eklund, Niina; Eriksson, Johan G.; Barlassina, Cristina; Rivolta, Carlo; Nolte, Ilja M.; Snieder, Harold; Van der Klauw, Melanie M.; Van Vliet-Ostaptchouk, Jana V.; Gejman, Pablo V.; Shi, Jianxin; Jacobs, Kevin B.; Wang, Zhaoming; Bakker, Stephan J. L.; Mateo Leach, Irene; Navis, Gerjan; van der Harst, Pim; Martin, Nicholas G.; Medland, Sarah E.; Montgomery, Grant W.; Yang, Jian; Chasman, Daniel I.; Ridker, Paul M.; Rose, Lynda M.; Lehtimäki, Terho; Raitakari, Olli; Absher, Devin; Iribarren, Carlos; Basart, Hanneke; Hovingh, Kees G.; Hyppönen, Elina; Power, Chris; Anderson, Denise; Beilby, John P.; Hui, Jennie; Jolley, Jennifer; Sager, Hendrik; Bornstein, Stefan R.; Schwarz, Peter E. H.; Kristiansson, Kati; Perola, Markus; Lindström, Jaana; Swift, Amy J.; Uusitupa, Matti; Atalay, Mustafa; Lakka, Timo A.; Rauramaa, Rainer; Bolton, Jennifer L.; Fowkes, Gerry; Fraser, Ross M.; Price, Jackie F.; Fischer, Krista; Krjutškov, Kaarel; Metspalu, Andres; Mihailov, Evelin; Langenberg, Claudia; Luan, Jian'an; Ong, Ken K.; Chines, Peter S.; Keinanen-Kiukaanniemi, Sirkka M.; Saaristo, Timo E.; Edkins, Sarah; Franks, Paul W.; Hallmans, Göran; Shungin, Dmitry; Morris, Andrew David; Palmer, Colin N. 
A.; Erbel, Raimund; Moebus, Susanne; Nöthen, Markus M.; Pechlivanis, Sonali; Hveem, Kristian; Narisu, Narisu; Hamsten, Anders; Humphries, Steve E.; Strawbridge, Rona J.; Tremoli, Elena; Grallert, Harald; Thorand, Barbara; Illig, Thomas; Koenig, Wolfgang; Müller-Nurasyid, Martina; Peters, Annette; Boehm, Bernhard O.; Kleber, Marcus E.; März, Winfried; Winkelmann, Bernhard R.; Kuusisto, Johanna; Laakso, Markku; Arveiler, Dominique; Cesana, Giancarlo; Kuulasmaa, Kari; Virtamo, Jarmo; Yarnell, John W. G.; Kuh, Diana; Wong, Andrew; Lind, Lars; de Faire, Ulf; Gigante, Bruna; Magnusson, Patrik K. E.; Pedersen, Nancy L.; Dedoussis, George; Dimitriou, Maria; Kolovou, Genovefa; Kanoni, Stavroula; Stirrups, Kathleen; Bonnycastle, Lori L.; Njølstad, Inger; Wilsgaard, Tom; Ganna, Andrea; Rehnberg, Emil; Hingorani, Aroon; Kivimaki, Mika; Kumari, Meena; Assimes, Themistocles L.; Barroso, Inês; Boehnke, Michael; Borecki, Ingrid B.; Deloukas, Panos; Fox, Caroline S.; Frayling, Timothy; Groop, Leif C.; Haritunians, Talin; Hunter, David; Ingelsson, Erik; Kaplan, Robert; Mohlke, Karen L.; O'Connell, Jeffrey R.; Schlessinger, David; Strachan, David P.; Stefansson, Kari; van Duijn, Cornelia M.; Abecasis, Gonçalo R.; McCarthy, Mark I.; Hirschhorn, Joel N.; Qi, Lu; Loos, Ruth J. F.; Lindgren, Cecilia M.; North, Kari E.; Heid, Iris M.

2013-01-01

Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10^-8), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits. PMID:23754948
Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR).

PubMed

Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J; Laclette, Juan P; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

2015-05-19

Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest.
Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR)

PubMed Central

Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J.; Laclette, Juan P.; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

2015-01-01

Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest. PMID:25989346
Genetic Diversity, Population Structure, and Linkage Disequilibrium in Bread Wheat (Triticum aestivum L.).

PubMed

Tascioglu, Tulin; Metin, Ozge Karakas; Aydin, Yildiz; Sakiroglu, Muhammet; Akan, Kadir; Uncuoglu, Ahu Altinkut

2016-08-01

The bread wheat (Triticum aestivum L.) gene pool was analyzed with 117 microsatellite markers scattered throughout the A, B, and D genomes. Ninety microsatellite markers gave 1620 polymorphic alleles in 55 different bread wheat genotypes. These genotypes were found to be divided into three subgroups based on a Bayesian model and principal component analysis. Among markers residing on the A genome, the highest polymorphism information content (PIC) value (0.960) was estimated for marker wmc262, located on chromosome 4A. Among markers known to be located on the B genome, the highest PIC value (0.954) was found for marker wmc44, located on chromosome 1B. Among markers specific to the D genome, the highest PIC value (0.948) was found for marker gwm174, located on chromosome 5D. The presence of linkage disequilibrium between 81 pairs of SSR markers residing on the same chromosome was tested, and very limited linkage disequilibrium was observed. The results confirmed that the most distant genotype pairs were Ceyhan-99 and Behoth 6, Gerek 79 and Douma 40989, and Karahan-99 and Douma 48114.
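The abstract does not state how polymorphism information content was computed. A common definition for a marker with allele frequencies p_i is PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. The sketch below assumes that definition, and the function name `pic` is hypothetical.

```python
def pic(freqs):
    """Polymorphism information content for one marker.

    Assumed definition: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2,
    where freqs are the allele frequencies and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in freqs]
    homozygosity = sum(squares)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - homozygosity - cross

print(pic([0.5, 0.5]))  # two equally frequent alleles
print(pic([1.0]))       # monomorphic marker carries no information
```

Under this definition PIC approaches 1 only when a marker has many alleles at similar frequencies, which is how values above 0.9 typically arise for highly multi-allelic microsatellites.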
Mutational Dynamics of Aroid Chloroplast Genomes

PubMed Central

Ahmed, Ibrar; Biggs, Patrick J.; Matthews, Peter J.; Collins, Lesley J.; Hendy, Michael D.; Lockhart, Peter J.

2012-01-01

A characteristic feature of eukaryote and prokaryote genomes is the co-occurrence of nucleotide substitution and insertion/deletion (indel) mutations. Although similar observations have also been made for chloroplast DNA, genome-wide associations have not been reported. We determined the chloroplast genome sequences for two morphotypes of taro (Colocasia esculenta; family Araceae) and compared these with four publicly available aroid chloroplast genomes. Here, we report the extent of genome-wide association between direct and inverted repeats, indels, and substitutions in these aroid chloroplast genomes. We suggest that alternative but not mutually exclusive hypotheses explain the mutational dynamics of chloroplast genome evolution. PMID:23204304
Genome Size Variation in the Genus Carthamus (Asteraceae, Cardueae): Systematic Implications and Additive Changes During Allopolyploidization

PubMed Central

GARNATJE, TERESA; GARCIA, SÒNIA; VILATERSANA, ROSER; VALLÈS, JOAN

2006-01-01

• Background and Aims Plant genome size is an important biological characteristic, with relationships to systematics, ecology and distribution. Currently, there is no information regarding nuclear DNA content for any Carthamus species. In addition to improving the knowledge base, this research focuses on interspecific variation and its implications for the infrageneric classification of this genus. Genome size variation in the process of allopolyploid formation is also addressed. • Methods Nuclear DNA samples from 34 populations of 16 species of the genus Carthamus were assessed by flow cytometry using propidium iodide. • Key Results The 2C values ranged from 2·26 pg for C. leucocaulos to 7·46 pg for C. turkestanicus, and monoploid genome size (1Cx-value) ranged from 1·13 pg in C. leucocaulos to 1·53 pg in C. alexandrinus. Mean genome sizes differed significantly, based on sectional classification. Both allopolyploid species (C. creticus and C. turkestanicus) exhibited nuclear DNA contents in accordance with the sum of the putative parental C-values (in one case with a slight reduction, frequent in polyploids), supporting their hybrid origin. • Conclusions Genome size represents a useful tool in elucidating systematic relationships between closely related species. A considerable reduction in monoploid genome size, possibly due to the hybrid formation, is also reported within these taxa. PMID:16390843
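The relation between the reported 2C values and monoploid genome sizes can be illustrated with a little arithmetic. The sketch assumes that the 1Cx-value is the 2C value divided by the ploidy level. It also reads "in accordance with the sum of the putative parental C-values" as an expected allopolyploid 2C value equal to the sum of the parental 2C values. The ploidy level of 2 for C. leucocaulos is inferred from the two values given in the abstract, and the function names are hypothetical.

```python
def monoploid_genome_size(two_c_pg, ploidy):
    """1Cx-value (pg): DNA content of one basic chromosome set."""
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    return two_c_pg / ploidy

def expected_allopolyploid_2c(parental_2c_pg):
    """Additive expectation: sum of the putative parents' 2C values (pg)."""
    return sum(parental_2c_pg)

def deviation_from_additivity(observed_2c_pg, parental_2c_pg):
    """Relative difference between observed and additive expected 2C value."""
    expected = expected_allopolyploid_2c(parental_2c_pg)
    return (observed_2c_pg - expected) / expected

# C. leucocaulos: 2C = 2.26 pg and 1Cx = 1.13 pg are both given in the abstract.
print(monoploid_genome_size(2.26, 2))
```

A negative result from `deviation_from_additivity` would correspond to the slight genome downsizing that the abstract describes as frequent in polyploids.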
On the additive and dominant variance and covariance of individuals within the genomic selection scope.

PubMed

Vitezica, Zulma G; Varona, Luis; Legarra, Andres

2013-12-01

Genomic evaluation models can fit additive and dominant SNP effects. Under quantitative genetics theory, additive or "breeding" values of individuals are generated by substitution effects, which involve both "biological" additive and dominant effects of the markers. Dominance deviations include only a portion of the biological dominant effects of the markers. Additive variance includes variation due to the additive and dominant effects of the markers. We describe a matrix of dominant genomic relationships across individuals, D, which is similar to the G matrix used in genomic best linear unbiased prediction. This matrix can be used in a mixed-model context for genomic evaluations or to estimate dominant and additive variances in the population. From the "genotypic" value of individuals, an alternative parameterization defines additive and dominance as the parts attributable to the additive and dominant effect of the markers. This approach underestimates the additive genetic variance and overestimates the dominance variance. Transforming the variances from one model into the other is trivial if the distribution of allelic frequencies is known. We illustrate these results with mouse data (four traits, 1884 mice, and 10,946 markers) and simulated data (2100 individuals and 10,000 markers). Variance components were estimated correctly in the model, considering breeding values and dominance deviations. For the model considering genotypic values, the inclusion of dominant effects biased the estimate of additive variance. Genomic models were more accurate for the estimation of variance components than their pedigree-based counterparts.
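The additive (G) and dominance (D) genomic relationship matrices mentioned in this abstract can be sketched as below. The abstract itself gives no formulas, so the coding is an assumption based on the commonly used breeding-value and dominance-deviation parameterization. Additive covariates are 2-2p, 1-2p and -2p, and dominance covariates are -2q^2, 2pq and -2p^2, for 2, 1 and 0 copies of the reference allele. The matrices are scaled by sum(2pq) and sum((2pq)^2) respectively. Allele frequencies are estimated from the sample, and the function name is hypothetical.

```python
import numpy as np

def genomic_relationships(M):
    """Return (G, D) from genotypes M (individuals x markers, coded 0/1/2).

    Assumes at least one marker is polymorphic, otherwise the scaling
    denominators are zero.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0          # reference-allele frequency per marker
    q = 1.0 - p

    # Additive covariates: 2-2p, 1-2p, -2p for 2, 1, 0 copies.
    Z = M - 2.0 * p
    G = Z @ Z.T / np.sum(2.0 * p * q)

    # Dominance-deviation covariates: -2q^2, 2pq, -2p^2 for 2, 1, 0 copies.
    W = np.where(M == 2, -2.0 * q**2,
                 np.where(M == 1, 2.0 * p * q, -2.0 * p**2))
    D = W @ W.T / np.sum((2.0 * p * q) ** 2)
    return G, D

# One marker in exact Hardy-Weinberg proportions (1:2:1) at p = 0.5.
G, D = genomic_relationships([[2], [1], [1], [0]])
print(np.trace(G) / 4, np.trace(D) / 4)
```

With genotypes in Hardy-Weinberg proportions, both matrices have an average diagonal of 1, which is what lets them enter a mixed model in the same way as a pedigree relationship matrix.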
Genome-wide association analysis for feed efficiency in Angus cattle.

PubMed

Rolf, M M; Taylor, J F; Schnabel, R D; McKay, S D; McClure, M C; Northcutt, S L; Kerley, M S; Weaber, R L

2012-08-01

Estimated breeding values for average daily feed intake (AFI; kg/day), residual feed intake (RFI; kg/day) and average daily gain (ADG; kg/day) were generated using a mixed linear model incorporating genomic relationships for 698 Angus steers genotyped with the Illumina BovineSNP50 assay. Association analyses of estimated breeding values (EBVs) were performed for 41,028 single nucleotide polymorphisms (SNPs), and permutation analysis was used to empirically establish the genome-wide significance threshold (P < 0.05) for each trait. SNPs significantly associated with each trait were used in a forward selection algorithm to identify genomic regions putatively harbouring genes with effects on each trait. A total of 53, 66 and 68 SNPs explained 54.12% (24.10%), 62.69% (29.85%) and 55.13% (26.54%) of the additive genetic variation (when accounting for the genomic relationships) in steer breeding values for AFI, RFI and ADG, respectively, within this population. Evaluation by pathway analysis revealed that many of these SNPs are in genomic regions that harbour genes with metabolic functions. The presence of genetic correlations between traits resulted in 13.2% of SNPs selected for AFI and 4.5% of SNPs selected for RFI also being selected for ADG in the analysis of breeding values. While our study identifies panels of SNPs significant for efficiency traits in our population, validation of all SNPs in independent populations will be necessary before commercialization. © 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics.
First Nuclear DNA Amounts in more than 300 Angiosperms

PubMed Central

ZONNEVELD, B. J. M.; LEITCH, I. J.; BENNETT, M. D.

2005-01-01

• Background and Aims Genome size (DNA C-value) data are key biodiversity characters of fundamental significance used in a wide variety of biological fields. Since 1976, Bennett and colleagues have made scattered published and unpublished genome size data more widely accessible by assembling them into user-friendly compilations. Initially these were published as hard copy lists, but since 1997 they have also been made available electronically (see the Plant DNA C-values database www.kew.org/cval/homepage.html). Nevertheless, at the Second Plant Genome Size Meeting in 2003, Bennett noted that as many as 1000 DNA C-value estimates were still unpublished and hence unavailable. Scientists were strongly encouraged to communicate such unpublished data. The present work combines the databasing experience of the Kew-based authors with the unpublished C-values produced by Zonneveld to make a large body of valuable genome size data available to the scientific community. • Methods C-values for angiosperm species, selected primarily for their horticultural interest, were estimated by flow cytometry using the fluorochrome propidium iodide. The data were compiled into a table whose form is similar to previously published lists of DNA amounts by Bennett and colleagues. • Key Results and Conclusions The present work contains C-values for 411 taxa including first values for 308 species not listed previously by Bennett and colleagues. Based on a recent estimate of the global published output of angiosperm DNA C-value data (i.e. 200 first C-value estimates per annum) the present work equals 1·5 years of average global published output; and constitutes over 12 % of the latest 5-year global target set by the Second Plant Genome Size Workshop (see www.kew.org/cval/workshopreport.html). Hopefully, the present example will encourage others to unveil further valuable data which otherwise may lie forever unpublished and unavailable for comparative analyses. PMID:15905300
Genome-wide association analysis of bacterial cold water disease resistance in rainbow trout reveals the potential of a hybrid approach between genomic selection and marker assisted selection

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) simultaneously incorporates dense SNP marker genotypes with phenotypic data from related animals to predict animal-specific genomic breeding value (GEBV), which circumvents the need to measure the disease phenotype in potential breeders. Marker assisted selection (MAS) involv...
Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

USDA-ARS's Scientific Manuscript database

A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
Comparative genomic analysis of Mycobacterium tuberculosis clinical isolates.

PubMed

Liu, Fei; Hu, Yongfei; Wang, Qi; Li, Hong Min; Gao, George F; Liu, Cui Hua; Zhu, Baoli

2014-06-13

Due to excessive antibiotic use, drug-resistant Mycobacterium tuberculosis has become a serious public health threat and a major obstacle to disease control in many countries. To better understand the evolution of drug-resistant M. tuberculosis strains, we performed whole genome sequencing for 7 M. tuberculosis clinical isolates with different antibiotic resistance profiles and conducted comparative genomic analysis of gene variations among them. We observed that all 7 M. tuberculosis clinical isolates with different levels of drug resistance harbored similar numbers of SNPs, ranging from 1409 to 1464. The numbers of insertion/deletions (Indels) identified in the 7 isolates were also similar, ranging from 56 to 101. A total of 39 types of mutations were identified in drug resistance-associated loci, including 14 previously reported ones and 25 newly identified ones. Sixteen of the identified large Indels spanned PE-PPE-PGRS genes, which represent a major source of antigenic variability. Aside from SNPs and Indels, a CRISPR locus with varied spacers was observed in all 7 clinical isolates, suggesting that such loci might play an important role in plasticity of the M. tuberculosis genome. The nucleotide diversity (π value) and selection intensity (dN/dS value) of the whole genome sequences of the 7 isolates were similar. The dN/dS values were less than 1 for all 7 isolates (ranging from 0.608885 to 0.637365), supporting the notion that M. tuberculosis genomes undergo purifying selection. The π values and dN/dS values were comparable between drug-susceptible and drug-resistant strains. In this study, we show that clinical M. tuberculosis isolates exhibit distinct variations in terms of the distribution of SNPs, Indels, CRISPR-cas locus, as well as the nucleotide diversity and selection intensity, but there are no generalizable differences between drug-susceptible and drug-resistant isolates on the genomic scale.
Our study provides evidence strengthening the notion that the evolution of drug resistance among clinical M. tuberculosis isolates is clearly a complex and diversified process.
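The two summary statistics in this abstract, nucleotide diversity and dN/dS, can be illustrated with a short sketch. It assumes the textbook definition of nucleotide diversity (average pairwise differences per aligned site) and the usual reading of dN/dS relative to 1. The toy sequences and function names are hypothetical, and this is not the authors' pipeline.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def selection_regime(dn_ds):
    """Conventional interpretation of a dN/dS ratio."""
    if dn_ds < 1:
        return "purifying"
    if dn_ds > 1:
        return "positive"
    return "neutral"

# Toy alignment (hypothetical data): 2 differences over 3 pairs x 4 sites.
toy = ["ACGT", "ACGA", "ACGT"]
print(nucleotide_diversity(toy))

# The range reported in the abstract falls below 1 at both ends.
print(selection_regime(0.608885), selection_regime(0.637365))
```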
Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

PubMed

Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

2014-11-01

Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

Estimating variance components and breeding values for number of oocytes and number of embryos in dairy cattle using a single-step genomic evaluation.

PubMed

Cornelissen, M A M C; Mullaart, E; Van der Linde, C; Mulder, H A

2017-06-01

Reproductive technologies such as multiple ovulation and embryo transfer (MOET) and ovum pick-up (OPU) accelerate genetic improvement in dairy breeding schemes. To enhance the efficiency of embryo production, breeding values for traits such as number of oocytes (NoO) and number of MOET embryos (NoM) can help in selection of donors with high MOET or OPU efficiency. The aim of this study was therefore to estimate variance components and (genomic) breeding values for NoO and NoM based on Dutch Holstein data. Furthermore, a 10-fold cross-validation was carried out to assess the accuracy of pedigree and genomic breeding values for NoO and NoM. For NoO, 40,734 OPU sessions between 1993 and 2015 were analyzed. These OPU sessions originated from 2,543 donors, from which 1,144 were genotyped. For NoM, 35,695 sessions between 1994 and 2015 were analyzed. These MOET sessions originated from 13,868 donors, from which 3,716 were genotyped. Analyses were done using only pedigree information and using a single-step genomic BLUP (ssGBLUP) approach combining genomic information and pedigree information. Heritabilities were very similar based on pedigree information or based on ssGBLUP [i.e., 0.32 (standard error = 0.03) for NoO and 0.21 (standard error = 0.01) for NoM with pedigree, 0.31 (standard error = 0.03) for NoO, and 0.22 (standard error = 0.01) for NoM with ssGBLUP]. For animals without their own information as mimicked in the cross-validation, the accuracy of pedigree-based breeding values was 0.46 for NoO and NoM. The accuracies of genomic breeding values from ssGBLUP were 0.54 for NoO and 0.52 for NoM. These results show that including genomic information increases the accuracies. These moderate accuracies in combination with a large genetic variance show good opportunities for selection of potential bull dams. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
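The cross-validation design can be sketched as follows. The abstract does not say how donors were assigned to folds, so the round-robin assignment below is an assumption, and `k_fold_splits` and `relative_gain` are hypothetical helper names. Only the accuracies (0.46, 0.54, 0.52) come from the abstract.

```python
def k_fold_splits(ids, k=10):
    """Yield (training_ids, held_out_ids) so that each id is held out exactly once."""
    folds = [ids[i::k] for i in range(k)]  # round-robin assignment (assumed)
    for held_out in folds:
        held = set(held_out)
        training = [x for x in ids if x not in held]
        yield training, held_out

def relative_gain(genomic_acc, pedigree_acc):
    """Relative increase in accuracy from adding genomic information."""
    return (genomic_acc - pedigree_acc) / pedigree_acc

# Accuracies reported in the abstract for animals without their own records.
print(f"NoO: {relative_gain(0.54, 0.46):.1%}")
print(f"NoM: {relative_gain(0.52, 0.46):.1%}")
```

In each fold the held-out animals' phenotypes would be masked, their breeding values predicted from the training animals, and the predictions compared with the masked records.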
Accounting for Genotype-by-Environment Interactions and Residual Genetic Variation in Genomic Selection for Water-Soluble Carbohydrate Concentration in Wheat.

PubMed

Ovenden, Ben; Milgate, Andrew; Wade, Len J; Rebetzke, Greg J; Holland, James B

2018-05-31

Abiotic stress tolerance traits are often complex and recalcitrant targets for conventional breeding improvement in many crop species. This study evaluated the potential of genomic selection to predict water-soluble carbohydrate concentration (WSCC), an important drought tolerance trait, in wheat under field conditions. A panel of 358 varieties and breeding lines constrained for maturity was evaluated under rainfed and irrigated treatments across two locations and two years. Whole-genome marker profiles and factor analytic mixed models were used to generate genomic estimated breeding values (GEBVs) for specific environments and environment groups. Additive genetic variance was smaller than residual genetic variance for WSCC, such that genotypic values were dominated by residual genetic effects rather than additive breeding values. As a result, GEBVs were not accurate predictors of genotypic values of the extant lines, but GEBVs should be reliable selection criteria to choose parents for intermating to produce new populations. The accuracy of GEBVs for untested lines was sufficient to increase predicted genetic gain from genomic selection per unit time compared to phenotypic selection if the breeding cycle is reduced by half by the use of GEBVs in off-season generations. Further, genomic prediction accuracy depended on having phenotypic data from environments with strong correlations with target production environments to build prediction models. By combining high-density marker genotypes, stress-managed field evaluations, and mixed models that model simultaneously covariances among genotypes and covariances of complex trait performance between pairs of environments, we were able to train models with good accuracy to facilitate genetic gain from genomic selection. Copyright © 2018 Ovenden et al.
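The argument about gain per unit time can be made concrete with the breeder's equation in its per-year form, gain = i × r × σ_A / L. This is a sketch under stated assumptions: the same selection intensity and additive genetic standard deviation for both schemes, and made-up accuracies and cycle lengths, since the abstract reports none.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation."""
    return intensity * accuracy * sigma_a / cycle_years

# Hypothetical values for illustration only.
intensity, sigma_a = 1.4, 1.0
phenotypic = gain_per_year(intensity, accuracy=0.5, sigma_a=sigma_a, cycle_years=4)
genomic = gain_per_year(intensity, accuracy=0.3, sigma_a=sigma_a, cycle_years=2)

# With the cycle halved, genomic selection wins whenever its accuracy
# exceeds half the accuracy of phenotypic selection.
print(genomic > phenotypic)
```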
Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine.

PubMed

Green, Robert C; Goddard, Katrina A B; Jarvik, Gail P; Amendola, Laura M; Appelbaum, Paul S; Berg, Jonathan S; Bernhardt, Barbara A; Biesecker, Leslie G; Biswas, Sawona; Blout, Carrie L; Bowling, Kevin M; Brothers, Kyle B; Burke, Wylie; Caga-Anan, Charlisse F; Chinnaiyan, Arul M; Chung, Wendy K; Clayton, Ellen W; Cooper, Gregory M; East, Kelly; Evans, James P; Fullerton, Stephanie M; Garraway, Levi A; Garrett, Jeremy R; Gray, Stacy W; Henderson, Gail E; Hindorff, Lucia A; Holm, Ingrid A; Lewis, Michelle Huckaby; Hutter, Carolyn M; Janne, Pasi A; Joffe, Steven; Kaufman, David; Knoppers, Bartha M; Koenig, Barbara A; Krantz, Ian D; Manolio, Teri A; McCullough, Laurence; McEwen, Jean; McGuire, Amy; Muzny, Donna; Myers, Richard M; Nickerson, Deborah A; Ou, Jeffrey; Parsons, Donald W; Petersen, Gloria M; Plon, Sharon E; Rehm, Heidi L; Roberts, J Scott; Robinson, Dan; Salama, Joseph S; Scollon, Sarah; Sharp, Richard R; Shirts, Brian; Spinner, Nancy B; Tabor, Holly K; Tarczy-Hornoch, Peter; Veenstra, David L; Wagle, Nikhil; Weck, Karen; Wilfond, Benjamin S; Wilhelmsen, Kirk; Wolf, Susan M; Wynn, Julia; Yu, Joon-Ho

2016-06-02

Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine. Copyright © 2016 American Society of Human Genetics. All rights reserved.
Stratification of clear cell renal cell carcinoma (ccRCC) genomes by gene-directed copy number alteration (CNA) analysis

PubMed Central

Thiesen, H.-J.; Steinbeck, F.; Maruschke, M.; Koczan, D.; Ziems, B.; Hakenberg, O. W.

2017-01-01

Tumorigenic processes are understood to be driven by epi-/genetic and genomic alterations from single point mutations to chromosomal alterations such as insertions and deletions of nucleotides up to gains and losses of large chromosomal fragments including products of chromosomal rearrangements e.g. fusion genes and proteins. Overall comparisons of copy number alterations (CNAs) presented in 48 clear cell renal cell carcinoma (ccRCC) genomes resulted in ratios of gene losses versus gene gains between 26 ccRCC Fuhrman malignancy grades G1 (ratio 1.25) and 20 G3 (ratio 0.58). Gene losses and gains of 15762 CNA genes were mapped to 795 chromosomal cytoband loci including 280 KEGG pathways. CNAs were classified according to their contribution to Fuhrman tumour gradings G1 and G3. Gene gains and losses turned out to be highly structured processes in ccRCC genomes enabling the subclassification and stratification of ccRCC tumours in a genome-wide manner. CNAs of ccRCC seem to start with common tumour related gene losses flanked by CNAs specifying Fuhrman grade G1 losses and CNA gains favouring grade G3 tumours. The appearance of recurrent CNA signatures implies the presence of causal mechanisms most likely implicated in the pathogenesis and disease-outcome of ccRCC tumours distinguishing lower from higher malignant tumours. The diagnostic quality of initial 201 genes (108 genes supporting G1 and 93 genes G3 phenotypes) has been successfully validated on published Swiss data (GSE19949) leading to a restricted CNA gene set of 171 CNA genes of which 85 genes favour Fuhrman grade G1 and 86 genes Fuhrman grade G3. Regarding these gene sets overall survival decreased with the number of G3 related gene losses plus G3 related gene gains. CNA gene sets presented define an entry to a gene-directed and pathway-related functional understanding of ongoing copy number alterations within and between individual ccRCC tumours leading to CNA genes of prognostic and predictive value. 
PMID:28486536
Stratification of clear cell renal cell carcinoma (ccRCC) genomes by gene-directed copy number alteration (CNA) analysis.

PubMed

Thiesen, H-J; Steinbeck, F; Maruschke, M; Koczan, D; Ziems, B; Hakenberg, O W

2017-01-01

Tumorigenic processes are understood to be driven by epi-/genetic and genomic alterations from single point mutations to chromosomal alterations such as insertions and deletions of nucleotides up to gains and losses of large chromosomal fragments including products of chromosomal rearrangements e.g. fusion genes and proteins. Overall comparisons of copy number alterations (CNAs) presented in 48 clear cell renal cell carcinoma (ccRCC) genomes resulted in ratios of gene losses versus gene gains between 26 ccRCC Fuhrman malignancy grades G1 (ratio 1.25) and 20 G3 (ratio 0.58). Gene losses and gains of 15762 CNA genes were mapped to 795 chromosomal cytoband loci including 280 KEGG pathways. CNAs were classified according to their contribution to Fuhrman tumour gradings G1 and G3. Gene gains and losses turned out to be highly structured processes in ccRCC genomes enabling the subclassification and stratification of ccRCC tumours in a genome-wide manner. CNAs of ccRCC seem to start with common tumour related gene losses flanked by CNAs specifying Fuhrman grade G1 losses and CNA gains favouring grade G3 tumours. The appearance of recurrent CNA signatures implies the presence of causal mechanisms most likely implicated in the pathogenesis and disease-outcome of ccRCC tumours distinguishing lower from higher malignant tumours. The diagnostic quality of initial 201 genes (108 genes supporting G1 and 93 genes G3 phenotypes) has been successfully validated on published Swiss data (GSE19949) leading to a restricted CNA gene set of 171 CNA genes of which 85 genes favour Fuhrman grade G1 and 86 genes Fuhrman grade G3. Regarding these gene sets overall survival decreased with the number of G3 related gene losses plus G3 related gene gains. CNA gene sets presented define an entry to a gene-directed and pathway-related functional understanding of ongoing copy number alterations within and between individual ccRCC tumours leading to CNA genes of prognostic and predictive value.
The Neandertal genome and ancient DNA authenticity

PubMed Central

Green, Richard E; Briggs, Adrian W; Krause, Johannes; Prüfer, Kay; Burbano, Hernán A; Siebauer, Michael; Lachmann, Michael; Pääbo, Svante

2009-01-01

Recent advances in high-throughput DNA sequencing have made genome-scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large-scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar ‘boot-strap’ approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired. PMID:19661919
Statistical Methods in Integrative Genomics

PubMed Central

Richardson, Sylvia; Tseng, George C.; Sun, Wei

2016-01-01

Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531
solGS: a web-based tool for genomic selection

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and statistical complexity, however, is a serious challenge in data management, ana...
Gene conversion events and variable degree of homogenization of rDNA loci in cultivars of Brassica napus

PubMed Central

Sochorová, Jana; Coriton, Olivier; Kuderová, Alena; Lunerová, Jana; Chèvre, Anne-Marie; Kovařík, Aleš

2017-01-01

Background and aims Brassica napus (AACC, 2n = 38, oilseed rape) is a relatively recent allotetraploid species derived from the putative progenitor diploid species Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). To determine the influence of intensive breeding conditions on the evolution of its genome, we analysed structure and copy number of rDNA in 21 cultivars of B. napus, representative of genetic diversity. Methods We used next-generation sequencing genomic approaches, Southern blot hybridization, expression analysis and fluorescence in situ hybridization (FISH). Subgenome-specific sequences derived from rDNA intergenic spacers (IGS) were used as probes for identification of loci composition on chromosomes. Key Results Most B. napus cultivars (18/21, 86 %) had more A-genome than C-genome rDNA copies. Three cultivars analysed by FISH (‘Darmor’, ‘Yudal’ and ‘Asparagus kale’) harboured the same number (12 per diploid set) of loci. In B. napus ‘Darmor’, the A-genome-specific rDNA probe hybridized to all 12 rDNA loci (eight on the A-genome and four on the C-genome) while the C-genome-specific probe showed weak signals on the C-genome loci only. Deep sequencing revealed high homogeneity of arrays suggesting that the C-genome genes were largely overwritten by the A-genome variants in B. napus ‘Darmor’. In contrast, B. napus ‘Yudal’ showed a lack of gene conversion evidenced by additive inheritance of progenitor rDNA variants and highly localized hybridization signals of subgenome-specific probes on chromosomes. Brassica napus ‘Asparagus kale’ showed an intermediate pattern to ‘Darmor’ and ‘Yudal’. At the expression level, most cultivars (95 %) exhibited stable A-genome nucleolar dominance while one cultivar (‘Norin 9’) showed co-dominance. Conclusions The B. napus cultivars differ in the degree and direction of rDNA homogenization. 
The prevalent direction of gene conversion (towards the A-genome) correlates with the direction of expression dominance indicating that gene activity may be needed for interlocus gene conversion. PMID:27707747
Family-based Association Analyses of Imputed Genotypes Reveal Genome-Wide Significant Association of Alzheimer’s disease with OSBPL6, PTPRG and PDCL3

PubMed Central

Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.

2015-01-01

The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated with the ε4 allele in the gene APOE, and more recently with variants in more than two dozen additional genes identified in large-scale genome-wide association studies (GWAS) and meta-analysis reports. Taken together, however, although the heritability of AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98 × 10⁻⁸), rs1347297 in the gene OSBPL6 (P-value = 4.53 × 10⁻⁸), and rs1513625 near PDCL3 (P-value = 4.28 × 10⁻⁸). In addition, rs72953347 in OSBPL6 (P-value = 6.36 × 10⁻⁷) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value = 4.76 × 10⁻⁷; rs62400067, P-value = 3.54 × 10⁻⁷). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138
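The split between genome-wide significant and marginally significant SNPs is consistent with the conventional threshold of 5 × 10⁻⁸. The abstract does not state which threshold the authors used, so the cut-off below is an assumption. The P-values are copied from the abstract.

```python
# Conventional genome-wide significance threshold (assumed, not stated in the abstract).
GENOME_WIDE = 5e-8

# P-values as reported in the abstract.
reported = {
    "rs7609954": 3.98e-8,   # PTPRG
    "rs1347297": 4.53e-8,   # OSBPL6
    "rs1513625": 4.28e-8,   # near PDCL3
    "rs72953347": 6.36e-7,  # OSBPL6
    "rs10456232": 4.76e-7,  # CDKAL1
    "rs62400067": 3.54e-7,  # CDKAL1
}

genome_wide_hits = sorted(snp for snp, p in reported.items() if p < GENOME_WIDE)
marginal = sorted(snp for snp, p in reported.items() if p >= GENOME_WIDE)
print("genome-wide:", genome_wide_hits)
print("marginal:", marginal)
```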
A draft annotation and overview of the human genome

PubMed Central

Wright, Fred A; Lemon, William J; Zhao, Wei D; Sears, Russell; Zhuo, Degen; Wang, Jian-Ping; Yang, Hee-Yung; Baer, Troy; Stredney, Don; Spitzner, Joe; Stutz, Al; Krahe, Ralf; Yuan, Bo

2001-01-01

Background The recent draft assembly of the human genome provides a unified basis for describing genomic structure and function. The draft is sufficiently accurate to provide useful annotation, enabling direct observations of previously inferred biological phenomena. Results We report here a functionally annotated human gene index placed directly on the genome. The index is based on the integration of public transcript, protein, and mapping information, supplemented with computational prediction. We describe numerous global features of the genome and examine the relationship of various genetic maps with the assembly. In addition, initial sequence analysis reveals highly ordered chromosomal landscapes associated with paralogous gene clusters and distinct functional compartments. Finally, these annotation data were synthesized to produce observations of gene density and number that accord well with historical estimates. Such a global approach had previously been described only for chromosomes 21 and 22, which together account for 2.2% of the genome. Conclusions We estimate that the genome contains 65,000-75,000 transcriptional units, with exon sequences comprising 4%. The creation of a comprehensive gene index requires the synthesis of all available computational and experimental evidence. PMID:11516338
Accurate computation of survival statistics in genome-wide studies.

PubMed

Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli

2015-05-01

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival times following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.
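For orientation, the asymptotic approximation that the abstract criticises can be sketched as below. This is the standard two-group log-rank statistic with a normal approximation, not ExaLT, whose algorithm the abstract does not describe. The function name and the toy data are hypothetical.

```python
import math

def logrank_asymptotic(times, events, groups):
    """Two-group log-rank test using the normal (asymptotic) approximation.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    groups : 1 for the group of interest (e.g. mutated), 0 otherwise
    Returns (z, two_sided_p).
    """
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [(g, tt, e) for tt, e, g in zip(times, events, groups) if tt >= t]
        n = len(at_risk)                                  # patients still at risk
        n1 = sum(1 for g, _, _ in at_risk if g == 1)      # of which in group 1
        d = sum(1 for _, tt, e in at_risk if tt == t and e)            # events at t
        d1 = sum(1 for g, tt, e in at_risk if tt == t and e and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count at time t.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    z = (observed - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Toy data: group 1 fails early, group 0 fails late, no censoring.
z, p = logrank_asymptotic([1, 2, 3, 4, 5, 6], [1] * 6, [1, 1, 1, 0, 0, 0])
print(z, p)
```

With groups this small, or with very unbalanced group sizes, the normal approximation is the weak point the abstract describes.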
Accurate Computation of Survival Statistics in Genome-Wide Studies

PubMed Central

Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli

2015-01-01

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival times following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations. PMID:25950620
Challenges of flow-cytometric estimation of nuclear genome size in orchids, a plant group with both whole-genome and progressively partial endoreplication.

PubMed

Trávníček, Pavel; Ponert, Jan; Urfus, Tomáš; Jersáková, Jana; Vrána, Jan; Hřibová, Eva; Doležel, Jaroslav; Suda, Jan

2015-10-01

Nuclear genome size is an inherited quantitative trait of eukaryotic organisms with both practical and biological consequences. A detailed analysis of major families is a promising approach to fully understand the biological meaning of the extensive variation in genome size in plants. Although Orchidaceae accounts for ∼10% of the angiosperm diversity, the knowledge of patterns and dynamics of their genome size is limited, in part due to difficulties in flow cytometric analyses. Cells in various somatic tissues of orchids undergo extensive endoreplication, either whole-genome or partial, and the G1-phase nuclei with 2C DNA amounts may be lacking, resulting in overestimated genome size values. Interpretation of DNA content histograms is particularly challenging in species with progressively partial endoreplication, in which the ratios between the positions of two neighboring DNA peaks are lower than two. In order to assess distributions of nuclear DNA amounts and identify tissue suitable for reliable estimation of nuclear DNA content, we analyzed six different tissue types in 48 orchid species belonging to all recognized subfamilies. Although traditionally used leaves may provide incorrect C-values, particularly in species with progressively partial endoreplication, young ovaries and pollinaria consistently yield 2C and 1C peaks of their G1-phase nuclei, respectively, and are, therefore, the most suitable parts for genome size studies in orchids. We also provide new DNA C-values for 22 orchid genera and 42 species. Adhering to the proposed methodology would allow for reliable genome size estimates in this largest plant family. Although our research was limited to orchids, the need to find a suitable tissue with dominant 2C peak of G1-phase nuclei applies to all endopolyploid species. © 2015 International Society for Advancement of Cytometry.
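The peak-ratio criterion in this abstract (neighbouring DNA peaks whose ratio falls below two under progressively partial endoreplication) can be sketched as a simple check. The peak positions, the 5 % tolerance, and the function names are illustrative assumptions, not values from the study.

```python
def neighbouring_peak_ratios(peak_positions):
    """Ratios between consecutive peak positions on a DNA content histogram."""
    peaks = sorted(peak_positions)
    return [hi / lo for lo, hi in zip(peaks, peaks[1:])]

def suggests_partial_endoreplication(peak_positions, tolerance=0.05):
    """Flag histograms where any neighbouring-peak ratio is clearly below 2."""
    threshold = 2 * (1 - tolerance)  # allow for measurement noise (assumed 5 %)
    return any(r < threshold for r in neighbouring_peak_ratios(peak_positions))

# Hypothetical fluorescence channel positions.
whole_genome = [100, 200, 400]   # each peak doubles the previous one
partial = [100, 160, 280]        # ratios of 1.6 and 1.75

print(neighbouring_peak_ratios(whole_genome), suggests_partial_endoreplication(whole_genome))
print(neighbouring_peak_ratios(partial), suggests_partial_endoreplication(partial))
```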
Arthropod genomic resources for the 21st century

USDA-ARS's Scientific Manuscript database

Genome references are foundational for high quality entomological research today. Species, sub populations and taxonomy are defined by gene flow and genome sequences. Gene content in arthropods is often directly reflective of life history, for example, diet and symbiont related gene loss is observed...
Generation of a conditional analog-sensitive kinase in human cells using CRISPR/Cas9-mediated genome engineering.

PubMed

Moyer, Tyler C; Holland, Andrew J

2015-01-01

The ability to rapidly and specifically modify the genome of mammalian cells has been a long-term goal of biomedical researchers. Recently, the clustered, regularly interspaced, short palindromic repeats (CRISPR)/Cas9 system from bacteria has been exploited for genome engineering in human cells. The CRISPR system directs the RNA-guided Cas9 nuclease to a specific genomic locus to induce a DNA double-strand break that may be subsequently repaired by homology-directed repair using an exogenous DNA repair template. Here we describe a protocol using CRISPR/Cas9 to achieve bi-allelic insertion of a point mutation in human cells. Using this method, homozygous clonal cell lines can be constructed in 5-6 weeks. This method can also be adapted to insert larger DNA elements, such as fluorescent proteins and degrons, at defined genomic locations. CRISPR/Cas9 genome engineering offers exciting applications in both basic science and translational research. Copyright © 2015 Elsevier Inc. All rights reserved.
Accuracy of genomic prediction for BCWD resistance in rainbow trout using different genotyping platforms and genomic selection models

USDA-ARS's Scientific Manuscript database

In this study, we aimed to (1) predict genomic estimated breeding value (GEBV) for bacterial cold water disease (BCWD) resistance by genotyping training (n=583) and validation samples (n=53) with two genotyping platforms (24K RAD-SNP and 49K SNP) and using different genomic selection (GS) models (Ba...
Forging New Cocoa Keys: The Impact of Unlocking the Cocoa Bean’s Genome on Pre-harvest Food Safety

USDA-ARS's Scientific Manuscript database

David N. Kuhn, USDA ARS SHRS, Miami FL. Sometimes it's hard to see the value and application of genomics to real world problems. How will sequencing the cacao genome affect West African farmers? Thi...
Assessing genomic selection prediction accuracy in a dynamic barley breeding

USDA-ARS's Scientific Manuscript database

Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation; however, validation based on prog...
Genome-enabled prediction models for yield related traits in chickpea

USDA-ARS's Scientific Manuscript database

Genomic selection (GS), unlike marker-assisted backcrossing (MABC), predicts breeding values of lines using genome-wide marker profiling and allows selection of lines prior to field-phenotyping, thereby shortening the breeding cycle. A collection of 320 elite breeding lines was selected and phenotyped...

Keep your Sox on: Community genomics-directed isolation and microscopic characterization of the dominant subsurface sulfur-oxidizing bacterium in a sediment aquifer

NASA Astrophysics Data System (ADS)

Mullin, S. W.; Wrighton, K. C.; Luef, B.; Wilkins, M. J.; Handley, K. M.; Williams, K. H.; Banfield, J. F.

2012-12-01

Community genomics and proteomics (proteogenomics) can be used to predict the metabolic potential of complex microbial communities and provide insight into microbial activity and nutrient cycling in situ. Inferences regarding the physiology of specific organisms then can guide isolation efforts, which, if successful, can yield strains that can be metabolically and structurally characterized to further test metagenomic predictions. Here we used proteogenomic data from an acetate-stimulated, sulfidic sediment column deployed in a groundwater well in Rifle, CO to direct laboratory amendment experiments to isolate a bacterial strain potentially involved in sulfur oxidation for physiological and microscopic characterization (Handley et al, submitted 2012). Field strains of Sulfurovum (genome r9c2) were predicted to be capable of CO2 fixation via the reverse TCA cycle and sulfur oxidation (Sox and SQR) coupled to either nitrate reduction (Nap, Nir, Nos) in anaerobic environments or oxygen reduction in microaerobic (cbb3 and bd oxidases) environments; however, key genes for sulfur oxidation (soxXAB) were not identified. Sulfidic groundwater and sediment from the Rifle site were used to inoculate cultures that contained various sulfur species, with and without nitrate and oxygen. We isolated a bacterium, Sulfurovum sp. OBA, whose 16S rRNA gene shares 99.8 % identity to the gene of the dominant genomically characterized strain (genome r9c2) in the Rifle sediment column. The 16S rRNA gene of the isolate most closely matches (95 % sequence identity) the gene of Sulfurovum sp. NBC37-1, a genome-sequenced deep-sea sulfur oxidizer. Strain OBA grew via polysulfide, colloidal sulfur, and tetrathionate oxidation coupled to nitrate reduction under autotrophic and mixotrophic conditions. Strain OBA also grew heterotrophically, oxidizing glucose, fructose, mannose, and maltose with nitrate as an electron acceptor. 
Over the range of oxygen concentrations tested, strain OBA was not capable of aerobic growth, but it could tolerate low oxygen conditions in the polysulfide/nitrate growth medium, suggesting that oxidases identified by genomics may play a role in detoxification rather than energy generation. Cryo-TEM imaging showed that strain OBA cells are rod-shaped, ~0.4 μm wide and 1.0 μm long, and confirmed metagenomics-based predictions of a Gram-negative cell envelope, pili and polyphosphate body production. Our results show the value of integrating metagenomics, culturing, and microscopic imaging to discern the physiology of bacteria involved in biogeochemical transformations in the subsurface.
De Novo Transcriptome Sequence Assembly from Coconut Leaves and Seeds with a Focus on Factors Involved in RNA-Directed DNA Methylation

PubMed Central

Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L.; Chang, Bill Chia-Han; Matzke, Antonius J. M.; Matzke, Marjori

2014-01-01

Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. PMID:25193496
De novo transcriptome sequence assembly from coconut leaves and seeds with a focus on factors involved in RNA-directed DNA methylation.

PubMed

Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L; Chang, Bill Chia-Han; Matzke, Antonius J M; Matzke, Marjori

2014-09-04

Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. Copyright © 2014 Huang et al.
Detection of measles, mumps, and rubella viruses.

PubMed

Tipples, Graham; Hiebert, Joanne

2011-01-01

Measles, mumps, and rubella are infections caused by RNA viruses of the same name and are vaccine preventable. The vaccines are frequently administered in a trivalent form. Laboratory diagnostic methods can include indirect detection via antibody (IgM and IgG) detection methods and direct detection by viral culture or viral genome detection. There are challenges for the laboratory in areas with low prevalence due to high vaccine uptake. In those areas, routine serological methods such as IgM detection may have a reduced positive predictive value and thus require confirmation by other methods. Direct detection of viral genomic material using reverse transcription polymerase chain reaction (RT-PCR) methodologies can play an important role for laboratory confirmation of acute infections. Furthermore, genotyping of these three viruses provides useful molecular epidemiological data for differentiating vaccine from wild-type strains, linking cases and outbreaks, and tracking geographic spread and elimination. The purpose of this chapter is to provide guidance for the laboratory diagnosis of measles, mumps, and rubella virus infections. Where assays are commercially available or previously published, the appropriate references are provided as well as brief comments on the interpretation of results. Detailed protocols are provided for the molecular assays which have been developed and more commonly applied in recent years.
Evolution of genome size and chromosome number in the carnivorous plant genus Genlisea (Lentibulariaceae), with a new estimate of the minimum genome size in angiosperms

PubMed Central

Fleischmann, Andreas; Michael, Todd P.; Rivadavia, Fernando; Sousa, Aretuza; Wang, Wenqin; Temsch, Eva M.; Greilhuber, Johann; Müller, Kai F.; Heubl, Günther

2014-01-01

Background and Aims Some species of Genlisea possess ultrasmall nuclear genomes, the smallest known among angiosperms, and some have been found to have chromosomes of diminutive size, which may explain why chromosome numbers and karyotypes are not known for the majority of species of the genus. However, other members of the genus do not possess ultrasmall genomes, nor do most taxa studied in related genera of the family or order. This study therefore examined the evolution of genome sizes and chromosome numbers in Genlisea in a phylogenetic context. The correlations of genome size with chromosome number and size, with the phylogeny of the group and with growth forms and habitats were also examined. Methods Nuclear genome sizes were measured from cultivated plant material for a comprehensive sampling of taxa, including nearly half of all species of Genlisea and representing all major lineages. Flow cytometric measurements were conducted in parallel in two laboratories in order to compare the consistency of different methods and controls. Chromosome counts were performed for the majority of taxa, comparing different staining techniques for the ultrasmall chromosomes. Key Results Genome sizes of 15 taxa of Genlisea are presented and interpreted in a phylogenetic context. A high degree of congruence was found between genome size distribution and the major phylogenetic lineages. Ultrasmall genomes with 1C values of <100 Mbp were almost exclusively found in a derived lineage of South American species. The ancestral haploid chromosome number was inferred to be n = 8. Chromosome numbers in Genlisea ranged from 2n = 2x = 16 to 2n = 4x = 32. Ascendant dysploid series (2n = 36, 38) are documented for three derived taxa. The different ploidy levels corresponded to the two subgenera, but were not directly correlated to differences in genome size; the three different karyotype ranges mirrored the different sections of the genus. 
The smallest known plant genomes were not found in G. margaretae, as previously reported, but in G. tuberosa (1C ≈ 61 Mbp) and some strains of G. aurea (1C ≈ 64 Mbp). Conclusions Genlisea is an ideal candidate model organism for the understanding of genome reduction as the genus includes species with both relatively large (∼1700 Mbp) and ultrasmall (∼61 Mbp) genomes. This comparative, phylogeny-based analysis of genome sizes and karyotypes in Genlisea provides essential data for selection of suitable species for comparative whole-genome analyses, as well as for further studies on both the molecular and cytogenetic basis of genome reduction in plants. PMID:25274549
Recent updates and developments to plant genome size databases

PubMed Central

Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.

2014-01-01

Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

PubMed

Kumar, Rakshak; Acharya, Vishal; Singh, Dharam; Kumar, Sanjay

2018-01-01

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed for the strain to confirm its taxonomy, as it lacked the typical violet pigmentation of the genus, and to decipher its survival strategy in a high-elevation aquatic ecosystem. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1%, and an ANI value of 99.25% established the identity of strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. Genome comparisons across the genus Janthinobacterium revealed an open pan-genome with scope for the addition of new orthologous clusters to complete the genomic inventory. The genomic insight provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.
A Proposed Genus Boundary for the Prokaryotes Based on Genomic Insights

PubMed Central

Qin, Qi-Long; Xie, Bin-Bin; Zhang, Xi-Ying; Chen, Xiu-Lan; Zhou, Bai-Cheng; Zhou, Jizhong; Oren, Aharon

2014-01-01

Genomic information has already been applied to prokaryotic species definition and classification. However, the contribution of the genome sequence to prokaryotic genus delimitation has been less studied. To gain insights into genus definition for the prokaryotes, we attempted to reveal the genus-level genomic differences in the current prokaryotic classification system and to delineate the boundary of a genus on the basis of genomic information. The average nucleotide sequence identity between two genomes can be used for prokaryotic species delineation, but it is not suitable for genus demarcation. We used the percentage of conserved proteins (POCP) between two strains to estimate their evolutionary and phenotypic distance. A comprehensive genomic survey indicated that the POCP can serve as a robust genomic index for establishing the genus boundary for prokaryotic groups. Basically, two species belonging to the same genus would share at least half of their proteins. In a specific lineage, the genus and family/order ranks showed slight or no overlap in terms of POCP values. A prokaryotic genus can be defined as a group of species with all pairwise POCP values higher than 50%. Integration of whole-genome data into the current taxonomy system can provide comprehensive information for prokaryotic genus definition and delimitation. PMID:24706738
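The genus criterion in the abstract above can be sketched in a few lines. The abstract does not spell out how POCP is computed; the sketch assumes the commonly cited form POCP = (C1 + C2) / (T1 + T2) × 100, where C1 and C2 are the numbers of conserved proteins found in each genome and T1 and T2 are the total protein counts. The function names and all example counts are hypothetical.

```python
def pocp(conserved_1, conserved_2, total_1, total_2):
    """Percentage of conserved proteins between two genomes.

    Assumed form: (C1 + C2) / (T1 + T2) * 100. The conserved counts would
    come from a separate homology search, which is not shown here.
    """
    if total_1 <= 0 or total_2 <= 0:
        raise ValueError("total protein counts must be positive")
    return 100.0 * (conserved_1 + conserved_2) / (total_1 + total_2)


def forms_one_genus(pairwise_pocp_values, threshold=50.0):
    """Apply the criterion from the abstract: all pairwise POCP values above 50%."""
    return all(value > threshold for value in pairwise_pocp_values)


# Hypothetical protein counts for three strains, compared pairwise.
pairs = [
    pocp(2100, 2050, 3500, 3300),
    pocp(1900, 1950, 3500, 3100),
    pocp(2000, 1980, 3300, 3100),
]
print([round(p, 1) for p in pairs])
print(forms_one_genus(pairs))
```

Because the criterion requires every pair to exceed the threshold, a single divergent strain is enough to break a proposed genus apart.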
The three-dimensional genome organization of Drosophila melanogaster through data integration.

PubMed

Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

2017-07-31

Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.
Assessing the impact of natural service bulls and genotype by environment interactions on genetic gain and inbreeding in organic dairy cattle genomic breeding programs.

PubMed

Yin, T; Wensch-Dorendorf, M; Simianer, H; Swalve, H H; König, S

2014-06-01

The objective of the present study was to compare genetic gain and inbreeding coefficients of dairy cattle in organic breeding program designs by applying stochastic simulations. Evaluated breeding strategies were: (i) selecting bulls from conventional breeding programs, and taking into account genotype by environment (G×E) interactions, (ii) selecting genotyped bulls within the organic environment for artificial insemination (AI) programs and (iii) selecting genotyped natural service bulls within organic herds. The simulated conventional population comprised 148 800 cows from 2976 herds with an average herd size of 50 cows per herd, and 1200 cows were assigned to 60 organic herds. In a young bull program, selection criteria of young bulls in both production systems (conventional and organic) were either 'conventional' estimated breeding values (EBV) or genomic estimated breeding values (GEBV) for two traits with low (h² = 0.05) and moderate heritability (h² = 0.30). GEBV were calculated for different accuracies (r_mg), and G×E interactions were considered by modifying originally simulated true breeding values in the range from r_g = 0.5 to 1.0. For both traits (h² = 0.05 and 0.30) and r_mg ≥ 0.8, genomic selection of bulls directly in the organic population and using selected bulls via AI revealed higher genetic gain than selecting young bulls in the larger conventional population based on EBV, even without the existence of G×E interactions. Only for pronounced G×E interactions (r_g = 0.5), and for highly accurate GEBV for natural service bulls (r_mg > 0.9), do the results suggest the use of genotyped organic natural service bulls instead of implementing an AI program. Inbreeding coefficients of selected bulls and their offspring were generally lower when basing selection decisions for young bulls on GEBV compared with selection strategies based on pedigree indices.
Targeted gene knock-in by homology-directed genome editing using Cas9 ribonucleoprotein and AAV donor delivery.

PubMed

Gaj, Thomas; Staahl, Brett T; Rodrigues, Gonçalo M C; Limsirichai, Prajit; Ekman, Freja K; Doudna, Jennifer A; Schaffer, David V

2017-06-20

Realizing the full potential of genome editing requires the development of efficient and broadly applicable methods for delivering programmable nucleases and donor templates for homology-directed repair (HDR). The RNA-guided Cas9 endonuclease can be introduced into cells as a purified protein in complex with a single guide RNA (sgRNA). Such ribonucleoproteins (RNPs) can facilitate the high-fidelity introduction of single-base substitutions via HDR following co-delivery with a single-stranded DNA oligonucleotide. However, combining RNPs with transgene-containing donor templates for targeted gene addition has proven challenging, which in turn has limited the capabilities of the RNP-mediated genome editing toolbox. Here, we demonstrate that combining RNP delivery with naturally recombinogenic adeno-associated virus (AAV) donor vectors enables site-specific gene insertion by homology-directed genome editing. Compared to conventional plasmid-based expression vectors and donor templates, we show that combining RNP and AAV donor delivery increases the efficiency of gene addition by up to 12-fold, enabling the creation of lineage reporters that can be used to track the conversion of striatal neurons from human fibroblasts in real time. These results thus illustrate the potential for unifying nuclease protein delivery with AAV donor vectors for homology-directed genome editing. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Basics and applications of genome editing technology.

PubMed

Yamamoto, Takashi; Sakamoto, Naoaki

2016-01-01

Genome editing with programmable site-specific nucleases is an emerging technology that enables the manipulation of targeted genes in many organisms and cell lines. Since the development of the CRISPR-Cas9 system in 2012, genome editing has rapidly become an indispensable technology for all life science researchers, applicable in various fields. In this seminar, we will introduce the basics of genome editing and focus on the recent development of genome editing tools and technologies for the modification of various organisms and discuss future directions of the genome editing research field, from basic to medical applications.
Small genomes and large seeds: chromosome numbers, genome size and seed mass in diploid Aesculus species (Sapindaceae).

PubMed

Krahulcová, Anna; Trávnícek, Pavel; Krahulec, František; Rejmánek, Marcel

2017-04-01

Aesculus L. (horse chestnut, buckeye) is a genus of 12-19 extant woody species native to the temperate Northern Hemisphere. This genus is known for unusually large seeds among angiosperms. While chromosome counts are available for many Aesculus species, only one has had its genome size measured. The aim of this study is to provide more genome size data and analyse the relationship between genome size and seed mass in this genus. Chromosome numbers in root tip cuttings were confirmed for four species and reported for the first time for three additional species. Flow cytometric measurements of 2C nuclear DNA values were conducted on eight species, and mean seed mass values were estimated for the same taxa. The same chromosome number, 2n = 40, was determined in all investigated taxa. Original measurements of 2C values for seven Aesculus species (eight taxa), added to just one reliable datum for A. hippocastanum, confirmed the notion that the genome size in this genus with relatively large seeds is surprisingly low, ranging from 0.955 pg 2C⁻¹ in A. parviflora to 1.275 pg 2C⁻¹ in A. glabra var. glabra. The chromosome number of 2n = 40 seems to be conclusively the universal 2n number for non-hybrid species in this genus. Aesculus genome sizes are relatively small, not only within its own family, Sapindaceae, but also within woody angiosperms. The genome sizes seem to be distinct and non-overlapping among the four major Aesculus clades. These results provide extra support for the most recent reconstruction of Aesculus phylogeny. The correlation between the 2C values and seed masses in examined Aesculus species is slightly negative and not significant. However, when the four major clades are treated separately, there is consistent positive association between larger genome size and larger seed mass within individual lineages. © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.
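Genome sizes in these records are given both in picograms (2C values) and in megabase pairs (1C values), so a small conversion sketch may help when comparing them. It assumes the widely used factor of 1 pg = 978 Mbp, which is not stated in the abstracts, and it treats the 1C value as half of the 2C value; the function names are hypothetical.

```python
MBP_PER_PG = 978.0  # assumed conversion factor: 1 pg of DNA = 978 Mbp


def pg_to_mbp(picograms):
    """Convert a DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


def one_c_mbp_from_two_c_pg(two_c_pg):
    """Estimate the 1C genome size in Mbp from a 2C value in pg."""
    return pg_to_mbp(two_c_pg / 2.0)


# 2C values reported for Aesculus in the abstract above.
for taxon, two_c in [("A. parviflora", 0.955), ("A. glabra var. glabra", 1.275)]:
    print(f"{taxon}: 1C = {one_c_mbp_from_two_c_pg(two_c):.0f} Mbp (approx.)")
```

On the same scale, the ultrasmall Genlisea genomes listed earlier (1C ≈ 61 Mbp) are several times smaller than the smallest Aesculus value.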
Datasets for evolutionary comparative genomics

PubMed Central

Liberles, David A

2005-01-01

Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856
Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports

PubMed Central

Shaer, Orit; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

2015-01-01

Background In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. Objective We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users’ needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. Methods The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. Results The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users’ understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. 
These scores were significantly higher than scores received using the industry standard tabular reports currently used for communicating personal genomic information. Conclusions Drawing on multiple research methods and populations, the findings of the studies reported in this paper offer deep understanding of users’ needs and practices, and demonstrate that interactive online design interventions can improve the understandability of personal genomic reports for consumers. We discuss implications for designers and researchers. PMID:26070951
Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports.

PubMed

Shaer, Orit; Nov, Oded; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

2015-06-12

In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users' needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users' understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. 
These scores were significantly higher than scores received using the industry standard tabular reports currently used for communicating personal genomic information. Drawing on multiple research methods and populations, the findings of the studies reported in this paper offer deep understanding of users' needs and practices, and demonstrate that interactive online design interventions can improve the understandability of personal genomic reports for consumers. We discuss implications for designers and researchers.
Rejection of reclassification of Lactobacillus kimchii and Lactobacillus bobalius as later subjective synonyms of Lactobacillus paralimentarius using comparative genomics.

PubMed

Yang, Seung-Jo; Kim, Byung-Yong; Chun, Jongsik

2017-11-01

Lactobacillus bobalius, Lactobacillus kimchii and Lactobacillus paralimentarius belong to the genus Lactobacillus and show close phylogenetic relationships. In a previous study, L. bobalius and L. kimchii were proposed to be reclassified as later heterotypic synonyms of L. paralimentarius using high 16S rRNA gene sequence similarities (≥99.5 %) and DNA-DNA hybridization values (≥82 %). We determined high quality whole genome assemblies of the type strains of L. bobalius and L. kimchii, which were then compared with that of L. paralimentarius. Average nucleotide identity values among three genomes ranged from 91.4 to 92.3 % which are clearly below 95~96 %, the generally recognized cutoff value for bacterial species boundaries. On the basis of comparative genomic evidence, L. bobalius, L. kimchii, and L. paralimentarius should stand as separate species in the genus Lactobacillus. We therefore suggest rejecting the previous proposal to combine these three species into a single species.
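The species-boundary argument in this abstract reduces to a threshold comparison. The following is a minimal Python sketch of that comparison, assuming the lower end (95 %) of the quoted 95~96 % cutoff and using only the range endpoints reported above. The function and constant names are illustrative and do not come from the paper.

```python
# Illustrative sketch: compare ANI values against a species-level cutoff.
# SPECIES_CUTOFF takes the lower end of the 95~96 % range (an assumption).
SPECIES_CUTOFF = 95.0


def same_species(ani_percent, cutoff=SPECIES_CUTOFF):
    """Return True when an ANI value (in percent) reaches the cutoff."""
    return ani_percent >= cutoff


# Endpoints of the ANI range reported for the three type-strain genomes.
reported_range = (91.4, 92.3)

for ani in reported_range:
    print(f"ANI {ani:.1f} % -> same species: {same_species(ani)}")
```

Both endpoints fall below the cutoff, which is the basis for keeping the three species separate.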
Microbial minimalism: genome reduction in bacterial pathogens.

PubMed

Moran, Nancy A

2002-03-08

When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts.
Genome editing: progress and challenges for medical applications.

PubMed

Carroll, Dana

2016-11-15

The development of the CRISPR-Cas platform for genome editing has greatly simplified the process of making targeted genetic modifications. Applications of genome editing are expected to have a substantial impact on human therapies through the development of better animal models, new target discovery, and direct therapeutic intervention.
Provision of personalized genomic diagnostic technologies for breast and colorectal cancer: an analysis of patient needs, expectations and priorities.

PubMed

Issa, Amalia M; Hutchinson, Janis F; Tufail, Waqas; Fletcher, Erica; Ajike, Roseline; Tenorio, Jose

2011-07-01

Several novel pharmacogenomic diagnostic tests are commercially available for breast and colorectal cancer, and are increasingly being used in clinical practice for improving treatment decisions. However, there is little evidence evaluating the value of these new genomic technologies from the perspective of patients. As part of an ongoing effort to understand the continuum of the process of adoption of genomic diagnostics, our aim in this study was to examine the value of genomic diagnostics to breast and colorectal cancer patients, and their willingness to adopt and use genomic diagnostics. We conducted six focus groups of breast and colorectal cancer patients from the oncology clinics at The Methodist Hospital, Houston, TX, USA. An adapted Q-sort instrument was also administered to focus group participants. The majority of breast and colorectal cancer patients are interested in using novel genomic diagnostics for deciding about treatment options. Most participants in our study expressed a willingness to pay out-of-pocket for genomic testing (z = 0.736). Reliability and validity of genomic testing were of significant concern (z = 1.32) for the majority of breast and colorectal cancer patients. Participants identified several facilitators and barriers within health systems that might either facilitate or impede the widespread adoption and use of genomic diagnostics in healthcare delivery. This study demonstrates breast and colorectal cancer patients' willingness to adopt and pay for novel genomic diagnostics, as well as identifies several salient factors associated with patient preferences for genomic diagnostics.

Was it worth it? Patients' perspectives on the perceived value of genomic-based individualized medicine.

PubMed

Halverson, Colin Me; Clift, Kristin E; McCormick, Jennifer B

2016-04-01

The value of genomic sequencing is often understood in terms of its ability to affect diagnosis or treatment. In these terms, successes occur only in a minority of cases. This paper presents views from patients who had exome sequencing done clinically to explore how they perceive the utility of genomic medicine. The authors used semi-structured, qualitative interviews to study patients' attitudes toward genomic sequencing in oncology and rare-disease settings. Participants from 37 cases were interviewed. In terms of the testing's key values, and regardless of having received what clinicians described as meaningful results, participants expressed four qualities that are separate from traditional views of clinical utility. Participants felt they had been empowered over their own health. They felt they had contributed altruistically to the progress of genomic technology in medicine. They felt their suffering had been legitimated. They also felt a sense of closure, having done everything they could. Patients expressed overwhelmingly positive attitudes toward sequencing. Their rationale was not solely based on the results' clinical utility. It is important for clinicians to understand this non-medical reasoning as it pertains to patient decision-making and informed consent.
OrthoANI: An improved algorithm and software for calculating average nucleotide identity.

PubMed

Lee, Imchang; Ouk Kim, Yeong; Park, Sang-Cheol; Chun, Jongsik

2016-02-01

Species demarcation in Bacteria and Archaea is mainly based on overall genome relatedness, which serves as a framework for modern microbiology. Current practice for obtaining these measures between two strains is shifting from experimentally determined similarity obtained by DNA-DNA hybridization (DDH) to genome-sequence-based similarity. Average nucleotide identity (ANI) is a simple algorithm that mimics DDH. Like DDH, ANI values between two genome sequences may be different from each other when reciprocal calculations are compared. We compared 63 690 pairs of genome sequences and found that the differences in reciprocal ANI values are significantly high, exceeding 1 % in some cases. To resolve this lack of symmetry, a new algorithm, named OrthoANI, was developed to accommodate the concept of orthology, in which both genome sequences are fragmented and only orthologous fragment pairs are taken into consideration for calculating nucleotide identities. OrthoANI is highly correlated with ANI (using BLASTn), and the former showed approximately 0.1 % higher values than the latter. In conclusion, OrthoANI provides a more robust and faster means of calculating average nucleotide identity for taxonomic purposes. The standalone software tools are freely available at http://www.ezbiocloud.net/sw/oat.
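The idea of restricting the average to orthologous (reciprocal best hit) fragment pairs can be sketched as follows. This is a toy illustration, not the OrthoANI implementation: the fragment names and identity values are hypothetical stand-ins for BLASTn results, and averaging the two directional identities of each pair is an assumption made here to keep the result symmetric.

```python
# Toy sketch of a reciprocal-best-hit identity average.
# All fragment names and identity values below are hypothetical.

def best_hits(identity):
    """Map each query fragment to its highest-identity subject fragment."""
    best = {}
    for (query, subject), pct in identity.items():
        if query not in best or pct > identity[(query, best[query])]:
            best[query] = subject
    return best


def reciprocal_ani(a_vs_b, b_vs_a):
    """Average identity over fragment pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = [(a, b) for a, b in best_ab.items() if best_ba.get(b) == a]
    if not pairs:
        raise ValueError("no reciprocal best hits found")
    # Assumption: use the mean of the two directional identities per pair.
    values = [(a_vs_b[(a, b)] + b_vs_a[(b, a)]) / 2.0 for a, b in pairs]
    return sum(values) / len(values)


# Hypothetical percent identities, keyed by (query fragment, subject fragment).
a_vs_b = {("a1", "b1"): 98.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 96.0, ("a3", "b2"): 97.0}
b_vs_a = {("b1", "a1"): 97.0, ("b2", "a3"): 97.0, ("b2", "a2"): 95.0}

print(reciprocal_ani(a_vs_b, b_vs_a))
print(reciprocal_ani(b_vs_a, a_vs_b))
```

In this toy data, fragment a2 is dropped because its best hit (b2) prefers a3, so only two pairs contribute, and swapping the argument order gives the same value.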
Cas9-Guide RNA Directed Genome Editing in Soybean

PubMed Central

Li, Zhongsen; Liu, Zhan-Bin; Xing, Aiqiu; Moon, Bryan P.; Koellhoffer, Jessica P.; Huang, Lingxia; Ward, R. Timothy; Clifton, Elizabeth; Falco, S. Carl; Cigan, A. Mark

2015-01-01

The recently discovered adaptive immune system of bacteria and archaea, consisting of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) endonuclease, has been explored for targeted genome editing in different species. Streptococcus pyogenes Cas9-guide RNA (gRNA) was successfully applied to generate targeted mutagenesis, gene integration, and gene editing in soybean (Glycine max). Two genomic sites, DD20 and DD43 on chromosome 4, were mutagenized with frequencies of 59% and 76%, respectively. Sequencing randomly selected transgenic events confirmed that the genome modifications were specific to the Cas9-gRNA cleavage sites and consisted of small deletions or insertions. Targeted gene integrations through homology-directed recombination were detected by border-specific polymerase chain reaction analysis for both sites at the callus stage, and one DD43 homology-directed recombination event was transmitted to the T1 generation. T1 progenies of the integration event segregated according to Mendelian laws, and clean homozygous T1 plants with the donor gene precisely inserted at the DD43 target site were obtained. The Cas9-gRNA system was also successfully applied to make a directed P178S mutation of the acetolactate synthase1 gene through in planta gene editing. PMID:26294043
Privacy Challenges of Genomic Big Data.

PubMed

Shen, Hong; Ma, Jian

2017-01-01

With the rapid advancement of high-throughput DNA sequencing technologies, genomics has become a big data discipline where large-scale genetic information of human individuals can be obtained efficiently at low cost. However, such a massive amount of personal genomic data creates a tremendous challenge for privacy, especially given the emergence of the direct-to-consumer (DTC) industry that provides genetic testing services. Here we review recent developments in genomic big data and their implications for privacy. We also discuss the current dilemmas and future challenges of genomic privacy.
Increased prediction accuracy in wheat breeding trials using a marker x environment interaction genomic selection model

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) models use genome-wide genetic information to predict genetic values of candidates for selection. Originally these models were developed without considering genotype × environment interaction (GE). Several authors have proposed extensions of the canonical GS model that accomm...
Cow genotyping strategies for genomic selection in small dairy cattle population

USDA-ARS's Scientific Manuscript database

This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds there are few sires with progeny records and genotyping cows can improve the accuracy of genomic EBV. The Guernsey bre...
Personalized medicine, genomics, and pharmacogenomics: a primer for nurses.

PubMed

Blix, Andrew

2014-08-01

Personalized medicine is the study of patients' unique environmental influences as well as the totality of their genetic code (their genome) to tailor personalized risk assessments, diagnoses, prognoses, and treatments. The study of how patients' genomes affect responses to medications, or pharmacogenomics, is a related field. Personalized medicine and genomics are particularly relevant in oncology because of the genetic basis of cancer. Nurses need to understand related issues such as the role of genetic and genomic counseling, the ethical and legal questions surrounding genomics, and the growing direct-to-consumer genomics industry. As genomics research is incorporated into health care, nurses need to understand the technology to provide advocacy and education for patients and their families.
Aye-aye population genomic analyses highlight an important center of endemism in northern Madagascar

PubMed Central

Perry, George H.; Louis, Edward E.; Ratan, Aakrosh; Bedoya-Reina, Oscar C.; Burhans, Richard C.; Lei, Runhua; Johnson, Steig E.; Schuster, Stephan C.; Miller, Webb

2013-01-01

We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations is critical for conservation planning. Such information may also advance our general understanding of Malagasy biogeography, as aye-ayes have the largest species distribution of any lemur. We generated and analyzed whole-genome sequence data for 12 aye-ayes from three regions of Madagascar (North, West, and East). We found that the North population is genetically distinct, with strong differentiation from other aye-ayes over relatively short geographic distances. For comparison, the average FST value between the North and East aye-aye populations—separated by only 248 km—is over 2.1-times greater than that observed between human Africans and Europeans. This finding is consistent with prior watershed- and climate-based hypotheses of a center of endemism in northern Madagascar. Taken together, these results suggest a strong and long-term biogeographical barrier to gene flow. Thus, the specific attention that should be directed toward preserving large, contiguous aye-aye habitats in northern Madagascar may also benefit the conservation of other distinct taxonomic units. To help facilitate future ecological- and conservation-motivated population genomic analyses by noncomputational biologists, the analytical toolkit used in this study is available on the Galaxy Web site. PMID:23530231
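For readers unfamiliar with FST, the statistic compares heterozygosity within populations to heterozygosity in the pooled population. The sketch below uses the basic Wright definition for one biallelic locus in two equally weighted populations. This is an assumption for illustration: the abstract does not state which estimator the study used, and the allele frequencies here are hypothetical.

```python
# Illustrative sketch: Wright's FST = (HT - HS) / HT for one biallelic locus.
# Equal population weights are assumed; all frequencies are hypothetical.

def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p1, p2):
    """FST between two populations from their allele frequencies."""
    hs = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-pop
    ht = heterozygosity((p1 + p2) / 2.0)                  # pooled population
    if ht == 0.0:
        return 0.0  # both populations fixed for the same allele
    return (ht - hs) / ht


print(fst_two_pops(0.9, 0.1))  # strongly differentiated populations
print(fst_two_pops(0.5, 0.5))  # identical frequencies, no differentiation
```

Values near 0 indicate little differentiation and values near 1 indicate populations fixed for different alleles. Genome-wide comparisons such as the one in the abstract average this kind of quantity over many loci.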
Aye-aye population genomic analyses highlight an important center of endemism in northern Madagascar.

PubMed

Perry, George H; Louis, Edward E; Ratan, Aakrosh; Bedoya-Reina, Oscar C; Burhans, Richard C; Lei, Runhua; Johnson, Steig E; Schuster, Stephan C; Miller, Webb

2013-04-09

We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations is critical for conservation planning. Such information may also advance our general understanding of Malagasy biogeography, as aye-ayes have the largest species distribution of any lemur. We generated and analyzed whole-genome sequence data for 12 aye-ayes from three regions of Madagascar (North, West, and East). We found that the North population is genetically distinct, with strong differentiation from other aye-ayes over relatively short geographic distances. For comparison, the average FST value between the North and East aye-aye populations--separated by only 248 km--is over 2.1-times greater than that observed between human Africans and Europeans. This finding is consistent with prior watershed- and climate-based hypotheses of a center of endemism in northern Madagascar. Taken together, these results suggest a strong and long-term biogeographical barrier to gene flow. Thus, the specific attention that should be directed toward preserving large, contiguous aye-aye habitats in northern Madagascar may also benefit the conservation of other distinct taxonomic units. To help facilitate future ecological- and conservation-motivated population genomic analyses by noncomputational biologists, the analytical toolkit used in this study is available on the Galaxy Web site.
Trends in genome-wide and region-specific genetic diversity in the Dutch-Flemish Holstein-Friesian breeding program from 1986 to 2015.

PubMed

Doekes, Harmen P; Veerkamp, Roel F; Bijma, Piter; Hiemstra, Sipke J; Windig, Jack J

2018-04-11

In recent decades, Holstein-Friesian (HF) selection schemes have undergone profound changes, including the introduction of optimal contribution selection (OCS; around 2000), a major shift in breeding goal composition (around 2000) and the implementation of genomic selection (GS; around 2010). These changes are expected to have influenced genetic diversity trends. Our aim was to evaluate genome-wide and region-specific diversity in HF artificial insemination (AI) bulls in the Dutch-Flemish breeding program from 1986 to 2015. Pedigree and genotype data (~ 75.5 k) of 6280 AI-bulls were used to estimate rates of genome-wide inbreeding and kinship and corresponding effective population sizes. Region-specific inbreeding trends were evaluated using regions of homozygosity (ROH). Changes in observed allele frequencies were compared to those expected under pure drift to identify putative regions under selection. We also investigated the direction of changes in allele frequency over time. Effective population size estimates for the 1986-2015 period ranged from 69 to 102. Two major breakpoints were observed in genome-wide inbreeding and kinship trends. Around 2000, inbreeding and kinship levels temporarily dropped. From 2010 onwards, they steeply increased, with pedigree-based, ROH-based and marker-based inbreeding rates as high as 1.8, 2.1 and 2.8% per generation, respectively. Accumulation of inbreeding varied substantially across the genome. A considerable fraction of markers showed changes in allele frequency that were greater than expected under pure drift. Putative selected regions harboured many quantitative trait loci (QTL) associated with a wide range of traits. In consecutive 5-year periods, allele frequencies changed more often in the same direction than in opposite directions, except when comparing the 1996-2000 and 2001-2005 periods. Genome-wide and region-specific diversity trends reflect major changes in the Dutch-Flemish HF breeding program. 
Introduction of OCS and the shift in breeding goal were followed by a drop in inbreeding and kinship and a shift in the direction of changes in allele frequency. After introduction of GS, rates of inbreeding and kinship increased substantially while allele frequencies continued to change in the same direction as before GS. These results provide insight into the effect of breeding practices on genomic diversity and emphasize the need for efficient management of genetic diversity in GS schemes.
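The link between a rate of inbreeding and an effective population size can be made concrete with the standard relation Ne = 1 / (2 ΔF), where ΔF is the rate of inbreeding per generation. The sketch below applies that relation to the three post-2010 rates quoted in the abstract. The resulting numbers are arithmetic under this formula, shown for orientation only, and are not figures reported in the abstract.

```python
# Illustrative sketch: effective population size from the inbreeding rate,
# using the standard relation Ne = 1 / (2 * delta_F).

def effective_population_size(delta_f):
    """Return Ne for a per-generation inbreeding rate given as a fraction."""
    if delta_f <= 0:
        raise ValueError("inbreeding rate must be positive")
    return 1.0 / (2.0 * delta_f)


# Post-2010 rates per generation quoted in the abstract (1.8, 2.1, 2.8 %).
rates = {"pedigree-based": 0.018, "ROH-based": 0.021, "marker-based": 0.028}

for label, rate in rates.items():
    print(f"{label}: Ne = {effective_population_size(rate):.1f}")
```

Under this relation the three rates correspond to Ne values of about 27.8, 23.8 and 17.9, well below the 69 to 102 range estimated for the whole 1986-2015 period, which illustrates why the authors stress diversity management under GS.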
Direct-to-consumer personalized genomic testing

PubMed Central

Bloss, Cinnamon S.; Darst, Burcu F.; Topol, Eric J.; Schork, Nicholas J.

2011-01-01

Over the past 18 months, there have been notable developments in the direct-to-consumer (DTC) genomic testing arena, in particular with regard to issues surrounding governmental regulation in the USA. While commentaries continue to proliferate on this topic, actual empirical research remains relatively scant. In terms of DTC genomic testing for disease susceptibility, most of the research has centered on uptake, perceptions and attitudes toward testing among health care professionals and consumers. Only a few available studies have examined actual behavioral response among consumers, and we are not aware of any studies that have examined response to DTC genetic testing for ancestry or for drug response. We propose that further research in this area is desperately needed, despite challenges in designing appropriate studies given the rapid pace at which the field is evolving. Ultimately, DTC genomic testing for common markers and conditions is only a precursor to the eventual cost-effectiveness and wide availability of whole genome sequencing of individuals, although it remains unclear whether DTC genomic information will still be attainable. Either way, however, current knowledge needs to be extended and enhanced with respect to the delivery, impact and use of increasingly accurate and comprehensive individualized genomic data. PMID:21828075
Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

PubMed

Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

2014-01-01

Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.
The Molecular Genetics of Autism Spectrum Disorders: Genomic Mechanisms, Neuroimmunopathology, and Clinical Implications

PubMed Central

Guerra, Daniel J.

2011-01-01

Autism spectrum disorders (ASDs) have become increasingly common in recent years. The discovery of single-nucleotide polymorphisms and accompanying copy number variations within the genome has increased our understanding of the architecture of the disease. These genetic and genomic alterations coupled with epigenetic phenomena have pointed to a neuroimmunopathological mechanism for ASD. Model animal studies, developmental biology, and affective neuroscience laid a foundation for dissecting the neural pathways impacted by these disease-generating mechanisms. The goal of current autism research is directed toward a systems biological approach to find the most basic genetic and environmental causes of this severe developmental disease. It is hoped that future genomic and neuroimmunological research will be directed toward finding the road toward prevention, treatment, and cure of ASD. PMID:22937247
Ocean biogeochemistry modeled with emergent trait-based genomics

NASA Astrophysics Data System (ADS)

Coles, V. J.; Stukel, M. R.; Brooks, M. T.; Burd, A.; Crump, B. C.; Moran, M. A.; Paul, J. H.; Satinsky, B. M.; Yager, P. L.; Zielinski, B. L.; Hood, R. R.

2017-12-01

Marine ecosystem models have advanced to incorporate metabolic pathways discovered with genomic sequencing, but direct comparisons between models and “omics” data are lacking. We developed a model that directly simulates metagenomes and metatranscriptomes for comparison with observations. Model microbes were randomly assigned genes for specialized functions, and communities of 68 species were simulated in the Atlantic Ocean. Unfit organisms were replaced, and the model self-organized to develop community genomes and transcriptomes. Emergent communities from simulations that were initialized with different cohorts of randomly generated microbes all produced realistic vertical and horizontal ocean nutrient, genome, and transcriptome gradients. Thus, the library of gene functions available to the community, rather than the distribution of functions among specific organisms, drove community assembly and biogeochemical gradients in the model ocean.
A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

PubMed

Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

2018-05-09

Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm for read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.
Genome-wide Association Analysis Identifies PDE4D as an Asthma-Susceptibility Gene

PubMed Central

Himes, Blanca E.; Hunninghake, Gary M.; Baurley, James W.; Rafaels, Nicholas M.; Sleiman, Patrick; Strachan, David P.; Wilk, Jemma B.; Willis-Owen, Saffron A.G.; Klanderman, Barbara; Lasky-Su, Jessica; Lazarus, Ross; Murphy, Amy J.; Soto-Quiros, Manuel E.; Avila, Lydiana; Beaty, Terri; Mathias, Rasika A.; Ruczinski, Ingo; Barnes, Kathleen C.; Celedón, Juan C.; Cookson, William O.C.; Gauderman, W. James; Gilliland, Frank D.; Hakonarson, Hakon; Lange, Christoph; Moffatt, Miriam F.; O'Connor, George T.; Raby, Benjamin A.; Silverman, Edwin K.; Weiss, Scott T.

2009-01-01

Asthma, a chronic airway disease with known heritability, affects more than 300 million people around the world. A genome-wide association (GWA) study of asthma with 359 cases from the Childhood Asthma Management Program (CAMP) and 846 genetically matched controls from the Illumina ICONdb public resource was performed. The strongest region of association seen was on chromosome 5q12 in PDE4D. The phosphodiesterase 4D, cAMP-specific (phosphodiesterase E3 dunce homolog, Drosophila) gene (PDE4D) is a regulator of airway smooth-muscle contractility, and PDE4 inhibitors have been developed as medications for asthma. Allelic p values for top SNPs in this region were 4.3 × 10^-7 for rs1588265 and 9.7 × 10^-7 for rs1544791. Replications were investigated in ten independent populations with different ethnicities, study designs, and definitions of asthma. In seven white and Hispanic replication populations, two PDE4D SNPs had significant results with p values less than 0.05, and five had results in the same direction as the original population but had p values greater than 0.05. Combined p values for 18,891 white and Hispanic individuals (4,342 cases) in our replication populations were 4.1 × 10^-4 for rs1588265 and 9.2 × 10^-4 for rs1544791. In three black replication populations, which had different linkage disequilibrium patterns than the other populations, original findings were not replicated. Further study of PDE4D variants might lead to improved understanding of the role of PDE4D in asthma pathophysiology and the efficacy of PDE4 inhibitor medications. PMID:19426955
Comparative Genomics of the Balsaminaceae Sister Genera Hydrocera triflora and Impatiens pinfanensis

PubMed Central

Li, Zhi-Zhong; Saina, Josphat K.; Gichira, Andrew W.; Kyalo, Cornelius M.; Wang, Qing-Feng

2018-01-01

The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a reported or published complete chloroplast genome sequence. Therefore, chloroplast genome sequences of the two sister genera are important for gaining insight into the phylogenetic position of the Balsaminaceae family among the Ericales and for understanding its evolution. In this study, complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were characterized and assembled using a high-throughput sequencing method. The complete cp genomes were found to possess the typical quadripartite structure of land-plant chloroplast genomes, with double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in both genomes, of which 80 are protein-coding genes, 31 are distinct transfer RNA (tRNA) genes and four are distinct ribosomal RNA (rRNA) genes. Thirty codons, of which 29 had A/T ending codons, revealed relative synonymous codon usage values of >1, whereas those with G/C ending codons displayed values of <1. The simple sequence repeats comprise mostly the mononucleotide repeats A/T in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales. PMID:29360746
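Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 mark preferred codons. The sketch below computes RSCU for a single amino acid. The alanine codon counts are hypothetical and were chosen only to show an A/T-ending bias of the kind described; they are not data from the study.

```python
# Illustrative sketch: RSCU for one family of synonymous codons.
# RSCU = observed count / (total count for the amino acid / number of codons).

def rscu(codon_counts):
    """Return a dict of RSCU values for one amino acid's synonymous codons."""
    total = sum(codon_counts.values())
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    expected = total / len(codon_counts)  # count under equal usage
    return {codon: count / expected for codon, count in codon_counts.items()}


# Hypothetical counts for the four alanine codons (DNA alphabet).
alanine_counts = {"GCT": 40, "GCC": 10, "GCA": 30, "GCG": 20}

for codon, value in rscu(alanine_counts).items():
    print(f"{codon}: RSCU = {value:.2f}")
```

With these made-up counts the A/T-ending codons (GCT, GCA) have RSCU above 1 and the G/C-ending codons (GCC, GCG) fall below 1, the same qualitative pattern the abstract reports.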
Identifying footprints of directional and balancing selection in marine and freshwater three-spined stickleback (Gasterosteus aculeatus) populations.

PubMed

Mäkinen, H S; Cano, J M; Merilä, J

2008-08-01

Natural selection is expected to leave an imprint on the neutral polymorphisms at the adjacent genomic regions of a selected gene. While directional selection tends to reduce within-population genetic diversity and increase among-population differentiation, the reverse is expected under balancing selection. To identify targets of natural selection in the three-spined stickleback (Gasterosteus aculeatus) genome, 103 microsatellite and two indel markers including expressed sequence tags (EST) and quantitative trait loci (QTL)-associated loci, were genotyped in four freshwater and three marine populations. The results indicated that a high proportion of loci (14.7%) might be affected by balancing selection and a lower proportion (2.8%) by directional selection. The strongest signatures of directional selection were detected in a microsatellite locus and two indel markers located in the intronic regions of the Eda-gene coding for the number of lateral plates. Yet, other microsatellite loci previously found to be informative in QTL-mapping studies revealed no signatures of selection. Two novel microsatellite loci (Stn12 and Stn90) located in chromosomes I and VIII, respectively, showed signals of directional selection and might be linked to genomic regions containing gene(s) important for adaptive divergence. Although the coverage of the total genomic content was relatively low, the predominance of balancing selection signals is in agreement with the contention that balancing, rather than directional selection is the predominant mode of selection in the wild.
Application of the stepwise focusing method to optimize the cost-effectiveness of genome-wide association studies with limited research budgets for genotyping and phenotyping.

PubMed

Ohashi, J; Clark, A G

2005-05-01

The recent cataloguing of a large number of SNPs enables us to perform genome-wide association studies for detecting common genetic variants associated with disease. Such studies, however, generally have limited research budgets for genotyping and phenotyping. It is therefore necessary to optimize the study design by determining the most cost-effective numbers of SNPs and individuals to analyze. In this report we applied the stepwise focusing method, with two-stage design, developed by Satagopan et al. (2002) and Saito & Kamatani (2002), to optimize the cost-effectiveness of a genome-wide direct association study using a transmission/disequilibrium test (TDT). The stepwise focusing method consists of two steps: a large number of SNPs are examined in the first focusing step, and then all the SNPs showing a significant P-value are tested again using a larger set of individuals in the second focusing step. In the framework of optimization, the numbers of SNPs and families and the significance levels in the first and second steps were regarded as variables to be considered. Our results showed that the stepwise focusing method achieves a distinct gain of power compared to a conventional method with the same research budget.
Beef cattle body temperature during climatic stress: a genome-wide association study.

PubMed

Howard, Jeremy T; Kachman, Stephen D; Snelling, Warren M; Pollak, E John; Ciobanu, Daniel C; Kuehn, Larry A; Spangler, Matthew L

2014-09-01

Cattle are reared in diverse environments and collecting phenotypic body temperature (BT) measurements to characterize BT variation across diverse environments is difficult and expensive. To better understand the genetic basis of BT regulation, a genome-wide association study was conducted utilizing crossbred steers and heifers totaling 239 animals of unknown pedigree and breed fraction. During predicted extreme heat and cold stress events, hourly tympanic and vaginal BT devices were placed in steers and heifers, respectively. Individuals were genotyped with the BovineSNP50K_v2 assay and data analyzed using Bayesian models for area under the curve (AUC), a measure of BT over time, using hourly BT observations summed across 5-days (AUC summer 5-day (AUCS5D) and AUC winter 5-day (AUCW5D)). Posterior heritability estimates were moderate to high and were estimated to be 0.68 and 0.21 for AUCS5D and AUCW5D, respectively. Moderately positive correlations between direct genomic values for AUCS5D and AUCW5D (0.40) were found, although a small percentage of the top 5% 1-Mb windows were in common. Different sets of genes were associated with BT during winter and summer, thus simultaneous selection for animals tolerant to both heat and cold appears possible.

Characterization of infectious Murray Valley encephalitis virus derived from a stably cloned genome-length cDNA.

PubMed

Hurrelbrink, R J; Nestorowicz, A; McMinn, P C

1999-12-01

An infectious cDNA clone of Murray Valley encephalitis virus prototype strain 1-51 (MVE-1-51) was constructed by stably inserting genome-length cDNA into the low-copy-number plasmid vector pMC18. Designated pMVE-1-51, the clone consisted of genome-length cDNA of MVE-1-51 under the control of a T7 RNA polymerase promoter. The clone was constructed by using existing components of a cDNA library, in addition to cDNA of the 3' terminus derived by RT-PCR of poly(A)-tailed viral RNA. Upon comparison with other flavivirus sequences, the previously undetermined sequence of the 3' UTR was found to contain elements conserved throughout the genus Flavivirus. RNA transcribed from pMVE-1-51 and subsequently transfected into BHK-21 cells generated infectious virus. The plaque morphology, replication kinetics and antigenic profile of clone-derived virus (CDV-1-51) were similar to those of the parental virus in vitro. Furthermore, the virulence properties of CDV-1-51 and MVE-1-51 (LD50 values and mortality profiles) were found to be identical in vivo in the mouse model. Through site-directed mutagenesis, the infectious clone should serve as a valuable tool for investigating the molecular determinants of virulence in MVE virus.
Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery.

PubMed

Zhao, Yongan; Wang, Xiaofeng; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu

2015-01-01

To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients' privacy. Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence. We evaluated our approach on real human genomic data using four popular association tests. Our study shows that the proposed approach can help data users make the right choices in most cases. Even though the pilot data cannot be directly used for scientific discovery, it provides a useful indication of which datasets are more likely to be useful to data users, who can therefore approach the appropriate data owners to gain access to the data. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

PubMed

Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

2012-01-01

Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but ensures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
Meta-Analysis of Genome-Wide Association Studies for Abdominal Aortic Aneurysm Identifies Four New Disease-Specific Risk Loci

PubMed Central

Tromp, Gerard; Kuivaniemi, Helena; Gretarsdottir, Solveig; Baas, Annette F.; Giusti, Betti; Strauss, Ewa; van 't Hof, Femke N.G.; Webb, Thomas R.; Erdman, Robert; Ritchie, Marylyn D.; Elmore, James R.; Verma, Anurag; Pendergrass, Sarah; Kullo, Iftikhar J.; Ye, Zi; Peissig, Peggy L.; Gottesman, Omri; Verma, Shefali S.; Malinowski, Jennifer; Rasmussen-Torvik, Laura J.; Borthwick, Kenneth M.; Smelser, Diane T.; Crosslin, David R.; de Andrade, Mariza; Ryer, Evan J.; McCarty, Catherine A.; Böttinger, Erwin P.; Pacheco, Jennifer A.; Crawford, Dana C.; Carrell, David S.; Gerhard, Glenn S.; Franklin, David P.; Carey, David J.; Phillips, Victoria L.; Williams, Michael J.A.; Wei, Wenhua; Blair, Ross; Hill, Andrew A.; Vasudevan, Thodor M.; Lewis, David R.; Thomson, Ian A.; Krysa, Jo; Hill, Geraldine B.; Roake, Justin; Merriman, Tony R.; Oszkinis, Grzegorz; Galora, Silvia; Saracini, Claudia; Abbate, Rosanna; Pulli, Raffaele; Pratesi, Carlo; Saratzis, Athanasios; Verissimo, Ana R.; Bumpstead, Suzannah; Badger, Stephen A.; Clough, Rachel E.; Cockerill, Gillian; Hafez, Hany; Scott, D. Julian A.; Futers, T. Simon; Romaine, Simon P.R.; Bridge, Katherine; Griffin, Kathryn J.; Bailey, Marc A.; Smith, Alberto; Thompson, Matthew M.; van Bockxmeer, Frank M.; Matthiasson, Stefan E.; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Blankensteijn, Jan D.; Teijink, Joep A.W.; Wijmenga, Cisca; de Graaf, Jacqueline; Kiemeney, Lambertus A.; Lindholt, Jes S.; Hughes, Anne; Bradley, Declan T.; Stirrups, Kathleen; Golledge, Jonathan; Norman, Paul E.; Powell, Janet T.; Humphries, Steve E.; Hamby, Stephen E.; Goodall, Alison H.; Nelson, Christopher P.; Sakalihasan, Natzi; Courtois, Audrey; Ferrell, Robert E.; Eriksson, Per; Folkersen, Lasse; Franco-Cereceda, Anders; Eicher, John D.; Johnson, Andrew D.; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Schadt, Eric E.; Björkegren, Johan L.M.; Lipovich, Leonard; Drolet, Anne M.; Verhoeven, Eric L.; Zeebregts, Clark J.; Geelkerken, Robert H.; van Sambeek, Marc R.; van Sterkenburg, Steven M.; de Vries, Jean-Paul; Stefansson, Kari; Thompson, John R.; de Bakker, Paul I.W.; Deloukas, Panos; Sayers, Robert D.; Harrison, Seamus C.; van Rij, Andre M.; Samani, Nilesh J.

2017-01-01

Rationale: Abdominal aortic aneurysm (AAA) is a complex disease with both genetic and environmental risk factors. Together, 6 previously identified risk loci only explain a small proportion of the heritability of AAA. Objective: To identify additional AAA risk loci using data from all available genome-wide association studies. Methods and Results: Through a meta-analysis of 6 genome-wide association study data sets and a validation study totaling 10 204 cases and 107 766 controls, we identified 4 new AAA risk loci: 1q32.3 (SMYD2), 13q12.11 (LINC00540), 20q13.12 (near PCIF1/MMP9/ZNF335), and 21q22.2 (ERG). In various database searches, we observed no new associations between the lead AAA single nucleotide polymorphisms and coronary artery disease, blood pressure, lipids, or diabetes mellitus. Network analyses identified ERG, IL6R, and LDLR as modifiers of MMP9, with a direct interaction between ERG and MMP9. Conclusions: The 4 new risk loci for AAA seem to be specific for AAA compared with other cardiovascular diseases and related traits suggesting that traditional cardiovascular risk factor management may only have limited value in preventing the progression of aneurysmal disease. PMID:27899403
Dynamics and control of state-dependent networks for probing genomic organization

PubMed Central

Rajapakse, Indika; Groudine, Mark; Mesbahi, Mehran

2011-01-01

A state-dependent dynamic network is a collection of elements that interact through a network, whose geometry evolves as the state of the elements changes over time. The genome is an intriguing example of a state-dependent network, where chromosomal geometry directly relates to genomic activity, which in turn strongly correlates with geometry. Here we examine various aspects of a genomic state-dependent dynamic network. In particular, we elaborate on one of the important ramifications of viewing genomic networks as being state-dependent, namely, their controllability during processes of genomic reorganization such as in cell differentiation. PMID:21911407
Beef cattle body temperature during climatic stress: a genome-wide association study

NASA Astrophysics Data System (ADS)

Howard, Jeremy T.; Kachman, Stephen D.; Snelling, Warren M.; Pollak, E. John; Ciobanu, Daniel C.; Kuehn, Larry A.; Spangler, Matthew L.

2014-09-01

Cattle are reared in diverse environments and collecting phenotypic body temperature (BT) measurements to characterize BT variation across diverse environments is difficult and expensive. To better understand the genetic basis of BT regulation, a genome-wide association study was conducted utilizing crossbred steers and heifers totaling 239 animals of unknown pedigree and breed fraction. During predicted extreme heat and cold stress events, hourly tympanic and vaginal BT devices were placed in steers and heifers, respectively. Individuals were genotyped with the BovineSNP50K_v2 assay and data analyzed using Bayesian models for area under the curve (AUC), a measure of BT over time, using hourly BT observations summed across 5-days (AUC summer 5-day (AUCS5D) and AUC winter 5-day (AUCW5D)). Posterior heritability estimates were moderate to high and were estimated to be 0.68 and 0.21 for AUCS5D and AUCW5D, respectively. Moderately positive correlations between direct genomic values for AUCS5D and AUCW5D (0.40) were found, although a small percentage of the top 5 % 1-Mb windows were in common. Different sets of genes were associated with BT during winter and summer, thus simultaneous selection for animals tolerant to both heat and cold appears possible.
Draft sequencing and analysis of the genome of pufferfish Takifugu flavidus.

PubMed

Gao, Yang; Gao, Qiang; Zhang, Huan; Wang, Lingling; Zhang, Fuchong; Yang, Chuanyan; Song, Linsheng

2014-12-01

The pufferfish Takifugu flavidus is an economically important species due to its outstanding flavour and high market value. It has also been regarded as an excellent model for genetic study for decades. In the present study, three mate-pair libraries of the T. flavidus genome were sequenced on the SOLiD 4 next-generation sequencing platform, and the draft genome was constructed from the short reads using an assisted assembly strategy. The draft consists of 50,947 scaffolds with an N50 value of 305.7 kb, and the average GC content was 45.2%. The combined length of repetitive sequences was 26.5 Mb, which accounted for 6.87% of the genome, indicating that the compactness of the T. flavidus genome is comparable to that of the T. rubripes genome. A total of 1,253 non-coding RNA genes and 30,285 protein-encoding genes were assigned to the genome. There were 132, 775 and 394 presumptive genes playing roles in colour pattern variation, relatively slow growth and lipid metabolism, respectively. Among them, genes involved in the microtubule-dependent transport system, angiogenesis, decapentaplegic pathway and lipid mobilization were significantly expanded in the T. flavidus genome. This draft genome provides a valuable resource for understanding and improving both fundamental and applied research with pufferfish in the future. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
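
For readers unfamiliar with the N50 statistic quoted above, the sketch below shows the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name and the toy lengths are assumptions for illustration; the abstract does not say which tool the authors used to compute it.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings us to half the assembly.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (arbitrary units), not data from the study.
toy_n50 = n50([2, 3, 4, 5, 6])
```
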
The genome sequence of Condylorrhiza vestigialis NPV, a novel baculovirus for the control of the Alamo moth on Populus spp. in Brazil.

PubMed

Castro, Maria Elita B; Melo, Fernando L; Tagliari, Marina; Inglis, Peter W; Craveiro, Saluana R; Ribeiro, Zilda Maria A; Ribeiro, Bergmann M; Báo, Sônia N

2017-09-01

Condylorrhiza vestigialis (Lepidoptera: Crambidae), commonly known as the Brazilian poplar moth or Alamo moth, is a serious defoliating pest of poplar, a crop of great economic importance for the production of wood, fiber, biofuel and other biomaterials, as well as of significant ecological and environmental value. The complete genome sequence of a new alphabaculovirus isolated from C. vestigialis was determined and analyzed. Condylorrhiza vestigialis nucleopolyhedrovirus (CoveNPV) has a circular double-stranded DNA genome of 125,767 bp with a GC content of 42.9%. One hundred and thirty-eight putative open reading frames were identified and annotated in the CoveNPV genome, including 38 core genes and 9 bros. Four homologous regions (hrs), a feature common to most baculoviruses, and 19 perfect and imperfect direct repeats (drs) were found. Phylogenetic analysis confirmed that CoveNPV is a Group I Alphabaculovirus and is most closely related to Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) and Choristoneura fumiferana DEF multiple nucleopolyhedrovirus (CfDEFMNPV). The gp37 gene was not detected in the CoveNPV genome, although this gene is found in many NPVs. Two other common NPV genes, chitinase (v-chiA) and cathepsin (v-cath), which are responsible for host insect liquefaction and melanization, were also absent; phylogenetic analysis suggests that the loss of these genes occurred in the common ancestor of AgMNPV, CfDEFMNPV and CoveNPV, with subsequent reacquisition of these genes by CfDEFMNPV. Very little was formerly known about the molecular biology and genetics of CoveNPV, and our expectation is that the findings presented here should accelerate research on this baculovirus, which will facilitate the use of CoveNPV in integrated pest management programs in poplar crops. Copyright © 2017 Elsevier Inc. All rights reserved.
Accuracy of genomic selection for BCWD resistance in rainbow trout

USDA-ARS's Scientific Manuscript database

Bacterial cold water disease (BCWD) causes significant economic losses in salmonids. In this study, we aimed to (1) predict genomic breeding values (GEBV) by genotyping training (n=583) and validation samples (n=53) with a SNP50K chip; and (2) assess the accuracy of genomic selection (GS) for BCWD r...
Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

1993-04-01

A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods as well as by screening of enriched marker-selected libraries led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
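
The abstract reports PIC (polymorphism information content) values without giving the formula. A minimal sketch follows, assuming the standard definition of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i are the allele frequencies at a marker. The function name and the example frequencies are illustrative assumptions, not data from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one marker.

    freqs: allele frequencies at the marker; they must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Two equally frequent alleles (hypothetical marker).
two_allele_pic = pic([0.5, 0.5])
```

Under this definition a marker with only two alleles cannot exceed a PIC of 0.375, so the values above 0.70 reported for a third of the markers imply several alleles per locus.
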
Chromosome Numbers and Genome Size Variation in Indian Species of Curcuma (Zingiberaceae)

PubMed Central

Leong-Škorničková, Jana; Šída, Otakar; Jarolímová, Vlasta; Sabu, Mamyil; Fér, Tomáš; Trávníček, Pavel; Suda, Jan

2007-01-01

Background and Aims: Genome size and chromosome numbers are important cytological characters that significantly influence various organismal traits. However, geographical representation of these data is seriously unbalanced, with tropical and subtropical regions being largely neglected. In the present study, an investigation was made of chromosomal and genome size variation in the majority of Curcuma species from the Indian subcontinent, and an assessment was made of the value of these data for taxonomic purposes. Methods: Genome size of 161 homogeneously cultivated plant samples classified into 51 taxonomic entities was determined by propidium iodide flow cytometry. Chromosome numbers were counted in actively growing root tips using conventional rapid squash techniques. Key Results: Six different chromosome counts (2n = 22, 42, 63, >70, 77 and 105) were found, the last two representing new generic records. The 2C-values varied from 1·66 pg in C. vamana to 4·76 pg in C. oligantha, representing a 2·87-fold range. Three groups of taxa with significantly different homoploid genome sizes (Cx-values) and distinct geographical distribution were identified. Five species exhibited intraspecific variation in nuclear DNA content, reaching up to 15·1 % in cultivated C. longa. Chromosome counts and genome sizes of three Curcuma-like species (Hitchenia caulina, Kaempferia scaposa and Paracautleya bhatii) corresponded well with typical hexaploid (2n = 6x = 42) Curcuma spp. Conclusions: The basic chromosome number in the majority of Indian taxa (belonging to subgenus Curcuma) is x = 7; published counts correspond to 6x, 9x, 11x, 12x and 15x ploidy levels. Only a few species-specific C-values were found, but karyological and/or flow cytometric data may support taxonomic decisions in some species alliances with morphological similarities. Close evolutionary relationships among some cytotypes are suggested based on the similarity in homoploid genome sizes and geographical grouping. A new species combination, Curcuma scaposa (Nimmo) Škorničk. & M. Sabu, comb. nov., is proposed. PMID:17686760
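
The arithmetic behind the ploidy levels and the fold range in this abstract can be checked with a short sketch. The helper name is an assumption for illustration. The basic number x = 7 and the counts and 2C-values are taken from the abstract; the count 2n = 22 is left out because it is not a multiple of 7.

```python
def ploidy_level(somatic_count, base_number=7):
    """Return the ploidy level implied by a somatic chromosome count (2n)."""
    level, remainder = divmod(somatic_count, base_number)
    if remainder:
        raise ValueError("count is not a multiple of the basic number")
    return level


# Somatic counts reported in the abstract that are multiples of x = 7.
counts = [42, 63, 77, 105]
levels = [ploidy_level(c) for c in counts]

# Ratio of the largest to the smallest 2C-value (pg) reported above.
fold_range = round(4.76 / 1.66, 2)
```
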
Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections.

PubMed

Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe; Avarre, Jean-Christophe

2016-01-01

Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.
Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

PubMed Central

Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe

2016-01-01

Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3. PMID:27703859
OPATs: Omnibus P-value association tests.

PubMed

Chen, Chia-Wei; Yang, Hsin-Chou

2017-07-10

Combining statistical significances (P-values) from a set of single-locus association tests in genome-wide association studies is a proof-of-principle method for identifying disease-associated genomic segments, functional genes and biological pathways. We review P-value combinations for genome-wide association studies and introduce an integrated analysis tool, Omnibus P-value Association Tests (OPATs), which provides popular analysis methods of P-value combinations. The OPATs software is programmed in R and features a user-friendly R graphical user interface. In addition to analysis modules for data quality control and single-locus association tests, OPATs provides three types of set-based association test: window-, gene- and biopathway-based association tests. P-value combinations with or without threshold and rank truncation are provided. The significance of a set-based association test is evaluated by using resampling procedures. Performance of the set-based association tests in OPATs has been evaluated by simulation studies and real data analyses. These set-based association tests help boost the statistical power, alleviate the multiple-testing problem, reduce the impact of genetic heterogeneity, increase the replication efficiency of association tests and facilitate the interpretation of association signals by streamlining the testing procedures and integrating the genetic effects of multiple variants in genomic regions of biological relevance. In summary, P-value combinations facilitate the identification of marker sets associated with disease susceptibility and uncover missing heritability in association studies, thereby establishing a foundation for the genetic dissection of complex diseases and traits. OPATs provides an easy-to-use and statistically powerful analysis tool for P-value combinations. OPATs, examples, and user guide can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/genetics/association/OPATs.htm. © The Author 2017. Published by Oxford University Press.
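
As an illustration of what a P-value combination looks like, the sketch below implements Fisher's classic method, in which X = −2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom when the k tests are independent. This is a generic textbook example, not OPATs code: the function name is an assumption, the abstract does not list which combinations OPATs implements, and OPATs evaluates significance by resampling rather than by this closed-form tail.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    Uses the closed-form chi-square upper tail for even degrees of
    freedom: P(X >= x) = exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if not p_values or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("P-values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j  # builds (X/2)^j / j! incrementally
        total += term
    return math.exp(-half_stat) * total


# Two hypothetical single-locus P-values from the same gene.
combined = fisher_combined_p([0.01, 0.01])
```

SNPs within a window or gene are usually in linkage disequilibrium, so the independence assumption rarely holds in practice, which is one reason resampling is used to assess set-based tests.
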
Human Genome Project discoveries: Dialectics and rhetoric in the science of genetics

NASA Astrophysics Data System (ADS)

Robidoux, Charlotte A.

The Human Genome Project (HGP), a $437 million effort that began in 1990 to chart the chemical sequence of our three billion base pairs of DNA, was completed in 2003, marking the 50th anniversary of the work that established the definitive structure of the molecule. This study considered how dialectical and rhetorical arguments functioned in the science, political, and public forums over a 20-year period, from 1980 to 2000, to advance human genome research and to establish the official project. I argue that Aristotle's continuum of knowledge, which ranges from the probable on one end to certified or demonstrated knowledge on the other, provides useful distinctions for analyzing scientific reasoning. While contemporary scientific research seeks to discover certified knowledge, investigators generally employ the hypothetico-deductive or scientific method, which often yields probable rather than certain findings, making these dialectical in nature. Analysis of the discourse describing human genome research revealed the use of numerous rhetorical figures and topics. Persuasive and probable reasoning were necessary for scientists to characterize unknown genetic phenomena, to secure interest in and funding for large-scale human genome research, to solve scientific problems, to issue probable findings, to convince colleagues and government officials that the findings were sound and to disseminate information to the public. Both government and private venture scientists drew on these tools of reasoning to promote their methods of mapping and sequencing the genome. The debate over how to carry out sequencing was rooted in conflicting values. Scientists representing the academic tradition valued a more conservative method that would establish high quality results, and those supporting private industry valued an unconventional approach that would yield products and profits more quickly. Values in turn influenced political and public forum arguments. Agency representatives and investors sided with the approach that reflected values they supported. Fascinated with this controversy and the convincing comparisons, the media often endorsed Celera's work for its efficiency. The analysis of discourse from the science, political, and public forums revealed that value systems influenced the accuracy and quality of the arguments more than the type or number of figures used to describe the research to various audiences.
Multimedia Presentations on the Human Genome: Implementation and Assessment of a Teaching Program for the Introduction to Genome Science Using a Poster and Animations

ERIC Educational Resources Information Center

Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto

2008-01-01

Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from…
The first genome sequences of human bocaviruses from Vietnam

PubMed Central

Thanh, Tran Tan; Van, Hoang Minh Tu; Hong, Nguyen Thi Thu; Nhu, Le Nguyen Truc; Anh, Nguyen To; Tuan, Ha Manh; Hien, Ho Van; Tuong, Nguyen Manh; Kien, Trinh Trung; Khanh, Truong Huu; Nhan, Le Nguyen Thanh; Hung, Nguyen Thanh; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H. Rogier; Tan, Le Van

2017-01-01

As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future study aiming at understanding the evolution of the virus. PMID:28090592
Portero versus portador: Spanish interpretation of genomic terminology during whole exome sequencing results disclosure.

PubMed

Gutierrez, Amanda M; Robinson, Jill O; Statham, Emily E; Scollon, Sarah; Bergstrom, Katie L; Slashinski, Melody J; Parsons, Donald W; Plon, Sharon E; McGuire, Amy L; Street, Richard L

2017-11-01

Describe modifications to technical genomic terminology made by interpreters during disclosure of whole exome sequencing (WES) results. Using discourse analysis, we identified and categorized interpretations of genomic terminology in 42 disclosure sessions where Spanish-speaking parents received their child's WES results either from a clinician using a medical interpreter, or directly from a bilingual physician. Overall, 76% of genomic terms were interpreted accordantly, 11% were misinterpreted and 13% were omitted. Misinterpretations made by interpreters and bilingual physicians included using literal and nonmedical terminology to interpret genomic concepts. Modifications to genomic terminology made during interpretation highlight the need to standardize bilingual genomic lexicons. We recommend Spanish terms that can be used to refer to genomic concepts.
Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

PubMed

Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

2018-04-24

Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. 
Lactobacillus' evolution is directed by the environment and HGT.
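The core- and pan-genome counts in the abstract above can be illustrated with plain set operations. This is a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers; the strain names and gene labels are hypothetical, and the ortholog clustering that a real analysis needs first is not shown.

```python
def core_and_pan_genome(gene_sets):
    """Return (core, pan) for a mapping of genome name -> set of gene-family IDs.

    core = gene families present in every genome (set intersection)
    pan  = gene families present in at least one genome (set union)
    """
    sets = list(gene_sets.values())
    if not sets:
        return set(), set()
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan


# Hypothetical toy input: three strains with made-up gene-family labels.
genomes = {
    "strain_A": {"dnaA", "rpoB", "lacZ"},
    "strain_B": {"dnaA", "rpoB", "bsh"},
    "strain_C": {"dnaA", "rpoB", "lacZ", "mub"},
}
core, pan = core_and_pan_genome(genomes)
print(sorted(core), len(pan))
```

Adding a genome can only shrink the core set or leave it unchanged, while the pan set can only grow, which is why the core-genome is far smaller than the pan-genome at genus level.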
Genomic selection & association mapping in rice: effect of trait genetic architecture, training population composition, marker number & statistical model on accuracy of rice genomic selection in elite, tropical rice breeding

USDA-ARS's Scientific Manuscript database

Genomic Selection (GS) is a new breeding method in which genome-wide markers are used to predict the breeding value of individuals in a breeding population. GS has been shown to improve breeding efficiency in dairy cattle and several crop plant species, and here we evaluate for the first time its ef...

The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects

PubMed Central

Papanicolaou, Alexie

2016-01-01

Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called “genome projects”. The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure. PMID:27006757
The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

PubMed

Papanicolaou, Alexie

2016-01-01

Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.
Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

PubMed Central

Xu, Jian-zhong; Zhang, Wei-guo

2016-01-01

With the availability of the whole genome sequences of Escherichia coli and Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineered bacteria according to requirements. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation of linear DNA or a recombinant suicide plasmid, allow a wide variety of DNA manipulations. However, the over-expression of the desired gene is generally carried out via plasmids. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integration into the genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010
Evaluating droplet digital PCR for the quantification of human genomic DNA: converting copies per nanoliter to nanograms nuclear DNA per microliter.

PubMed

Duewer, David L; Kline, Margaret C; Romsos, Erica L; Toman, Blaza

2018-05-01

The highly multiplexed polymerase chain reaction (PCR) assays used for forensic human identification perform best when used with an accurately determined quantity of input DNA. To help ensure the reliable performance of these assays, we are developing a certified reference material (CRM) for calibrating human genomic DNA working standards. To enable sharing information over time and place, CRMs must provide accurate and stable values that are metrologically traceable to a common reference. We have shown that droplet digital PCR (ddPCR) limiting dilution end-point measurements of the concentration of DNA copies per volume of sample can be traceably linked to the International System of Units (SI). Unlike values assigned using conventional relationships between ultraviolet absorbance and DNA mass concentration, entity-based ddPCR measurements are expected to be stable over time. However, the forensic community expects DNA quantity to be stated in terms of mass concentration rather than entity concentration. The transformation can be accomplished given SI-traceable values and uncertainties for the number of nucleotide bases per human haploid genome equivalent (HHGE) and the average molar mass of a nucleotide monomer in the DNA polymer. This report presents the considerations required to establish the metrological traceability of ddPCR-based mass concentration estimates of human nuclear DNA. [Graphical abstract: The roots of metrological traceability for human nuclear DNA mass concentration results. In the figure, factors shown in blue must be established experimentally; factors shown in red have been established from authoritative source materials.] HHGE stands for "haploid human genome equivalent"; there are two HHGE per diploid human genome.
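The conversion named in the title of the record above (copies per nanoliter to nanograms per microliter) follows from the two factors the abstract lists: nucleotides per HHGE and the average molar mass of a nucleotide. The sketch below is illustrative only. The default values (6.2e9 nucleotides per HHGE, i.e. about 3.1e9 base pairs on two strands, and 309 g/mol per nucleotide) are rough placeholder assumptions, not the SI-traceable values or uncertainties established in the report.

```python
AVOGADRO = 6.02214076e23  # entities per mole (exact SI definition)


def copies_per_nl_to_ng_per_ul(copies_per_nl,
                               nucleotides_per_hhge=6.2e9,       # assumed placeholder
                               g_per_mol_per_nucleotide=309.0):  # assumed placeholder
    """Convert an entity concentration (HHGE copies/nL) to a mass concentration (ng/uL)."""
    copies_per_ul = copies_per_nl * 1000.0  # 1 uL = 1000 nL
    # Mass of one haploid genome equivalent, in grams.
    grams_per_copy = nucleotides_per_hhge * g_per_mol_per_nucleotide / AVOGADRO
    return copies_per_ul * grams_per_copy * 1e9  # grams -> nanograms


print(round(copies_per_nl_to_ng_per_ul(10.0), 2))
```

With these placeholder inputs one HHGE weighs a little over 3 pg, so the conversion is a simple linear scaling. The metrological work described in the abstract lies in assigning traceable values and uncertainties to the two factors.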
An integrated semiconductor device enabling non-optical genome sequencing.

PubMed

Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

2011-07-20

The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
"Is It Worth Knowing?" Focus Group Participants' Perceived Utility of Genomic Preconception Carrier Screening.

PubMed

Schneider, Jennifer L; Goddard, Katrina A B; Davis, James; Wilfond, Benjamin; Kauffman, Tia L; Reiss, Jacob A; Gilmore, Marian; Himes, Patricia; Lynch, Frances L; Leo, Michael C; McMullen, Carmit

2016-02-01

As genome sequencing technology advances, research is needed to guide decision-making about what results can or should be offered to patients in different clinical settings. We conducted three focus groups with individuals who had prior preconception genetic testing experience to explore perceived advantages and disadvantages of genome sequencing for preconception carrier screening, compared to usual care. Using a discussion guide, a trained qualitative moderator facilitated the audio-recorded focus groups. Sixteen individuals participated. Thematic analysis of transcripts started with a grounded approach and subsequently focused on participants' perceptions of the value of genetic information. Analysis uncovered two orientations toward genomic preconception carrier screening: "certain" individuals desiring all possible screening information; and "hesitant" individuals who were more cautious about its value. Participants revealed valuable information about barriers to screening: fear/anxiety about results; concerns about the method of returning results; concerns about screening necessity; and concerns about partner participation. All participants recommended offering choice to patients to enhance the value of screening and reduce barriers. Overall, two groups of likely users of genome sequencing for preconception carrier screening demonstrated different perceptions of the advantages or disadvantages of screening, suggesting tailored approaches to education, consent, and counseling may be warranted with each group.
Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value.

PubMed

Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-Hyeon

2017-03-01

Holsteins are known as the world's highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values for each individual based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported as genetically associated with milk traits in Holsteins. This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.
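Ridge regression, mentioned in the abstract above as one route to genomic estimated breeding values (GEBV), shrinks SNP effects toward zero by solving (Z'Z + lambda I) b = Z'y. The following is a minimal stdlib-only sketch of that idea, not the authors' BLUPF90/R pipeline. The 0/1/2 genotype coding, the toy data, and the function names are assumptions made for illustration.

```python
def ridge_snp_effects(genotypes, phenotypes, lam):
    """Estimate SNP effects b from (Z'Z + lam*I) b = Z'y, with Z and y mean-centred."""
    n, m = len(genotypes), len(genotypes[0])
    # Centre each SNP column and the phenotype so no intercept term is needed.
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    z = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y = [v - y_mean for v in phenotypes]
    # Augmented system [Z'Z + lam*I | Z'y].
    a = [[sum(z[i][r] * z[i][c] for i in range(n)) + (lam if r == c else 0.0)
          for c in range(m)] + [sum(z[i][r] * y[i] for i in range(n))]
         for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(m):
        p = max(range(k, m), key=lambda r: abs(a[r][k]))
        a[k], a[p] = a[p], a[k]
        for r in range(m):
            if r != k:
                f = a[r][k] / a[k][k]
                a[r] = [x - f * w for x, w in zip(a[r], a[k])]
    effects = [a[k][m] / a[k][k] for k in range(m)]
    return effects, col_means, y_mean


def gebv(genotype, effects, col_means):
    """GEBV of one animal: sum of centred genotype codes times SNP effects."""
    return sum((g - mu) * b for g, mu, b in zip(genotype, col_means, effects))


# Hypothetical toy data: 6 animals, 2 SNPs coded 0/1/2; phenotype = 2 * SNP 1.
genotypes = [[0, 1], [1, 0], [2, 1], [1, 2], [0, 2], [2, 0]]
phenotypes = [0.0, 2.0, 4.0, 2.0, 0.0, 4.0]
effects, col_means, y_mean = ridge_snp_effects(genotypes, phenotypes, lam=1.0)
print([round(b, 3) for b in effects])
print(round(gebv([2, 1], effects, col_means), 3))
```

A larger `lam` shrinks the estimated effects more strongly. In real evaluations the number of SNPs far exceeds the number of animals, so an equivalent animal-level (GBLUP-style) formulation or a dedicated solver would normally be used rather than this dense solve.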
Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value

PubMed Central

Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-hyeon

2017-01-01

Objective: Holsteins are known as the world's highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods: This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values for each individual based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results: We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported as genetically associated with milk traits in Holsteins. Conclusion: This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins. PMID:26954162
First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card

PubMed Central

Goharriz, H.; Marston, D. A.; Sharifzoda, F.; Ellis, R. J.; Horton, D. L.; Khakimov, T.; Whatmore, A.; Khamroev, K.; Makhmadshoev, A. N.; Bazarov, M.; Fooks, A. R.

2017-01-01

A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data. PMID:28684566
First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card.

PubMed

Goharriz, H; Marston, D A; Sharifzoda, F; Ellis, R J; Horton, D L; Khakimov, T; Whatmore, A; Khamroev, K; Makhmadshoev, A N; Bazarov, M; Fooks, A R; Banyard, A C

2017-07-06

A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data. © Crown copyright 2017.
Directional Selection from Host Plants Is a Major Force Driving Host Specificity in Magnaporthe Species.

PubMed

Zhong, Zhenhui; Norvienyeku, Justice; Chen, Meilian; Bao, Jiandong; Lin, Lianyu; Chen, Liqiong; Lin, Yahong; Wu, Xiaoxian; Cai, Zena; Zhang, Qi; Lin, Xiaoye; Hong, Yonghe; Huang, Jun; Xu, Linghong; Zhang, Honghong; Chen, Long; Tang, Wei; Zheng, Huakun; Chen, Xiaofeng; Wang, Yanli; Lian, Bi; Zhang, Liangsheng; Tang, Haibao; Lu, Guodong; Ebbole, Daniel J; Wang, Baohua; Wang, Zonghua

2016-05-06

One major threat to global food security that requires immediate attention is the increasing incidence of host shift and host expansion in a growing number of pathogenic fungi and the emergence of new pathogens. The threat is more alarming because yield quality and quantity improvement efforts are encouraging the cultivation of uniform plants with low genetic diversity that are increasingly susceptible to emerging pathogens. However, the influence of host genome differentiation on pathogen genome differentiation and its contribution to emergence and adaptability is still obscure. Here, we compared the genome sequences of 6 isolates of Magnaporthe species obtained from three different host plants. We demonstrated the evolutionary relationship between Magnaporthe species and the influence of host differentiation on pathogens. Phylogenetic analysis showed that evolution of the pathogen directly corresponds with host divergence, suggesting that host-pathogen interaction has led to co-evolution. Furthermore, we identified an asymmetric selection pressure on Magnaporthe species. Oryza sativa-infecting isolates showed higher directional selection from the host, which in turn tends to lower the genetic diversity in their genomes. We concluded that frequent gene loss or gain, new transposon acquisition and sequence divergence are host adaptability mechanisms for Magnaporthe species, and that this coevolutionary process is greatly driven by directional selection from host plants.
Directional Selection from Host Plants Is a Major Force Driving Host Specificity in Magnaporthe Species

PubMed Central

Zhong, Zhenhui; Norvienyeku, Justice; Chen, Meilian; Bao, Jiandong; Lin, Lianyu; Chen, Liqiong; Lin, Yahong; Wu, Xiaoxian; Cai, Zena; Zhang, Qi; Lin, Xiaoye; Hong, Yonghe; Huang, Jun; Xu, Linghong; Zhang, Honghong; Chen, Long; Tang, Wei; Zheng, Huakun; Chen, Xiaofeng; Wang, Yanli; Lian, Bi; Zhang, Liangsheng; Tang, Haibao; Lu, Guodong; Ebbole, Daniel J.; Wang, Baohua; Wang, Zonghua

2016-01-01

One major threat to global food security that requires immediate attention is the increasing incidence of host shift and host expansion in a growing number of pathogenic fungi and the emergence of new pathogens. The threat is more alarming because yield quality and quantity improvement efforts are encouraging the cultivation of uniform plants with low genetic diversity that are increasingly susceptible to emerging pathogens. However, the influence of host genome differentiation on pathogen genome differentiation and its contribution to emergence and adaptability is still obscure. Here, we compared the genome sequences of 6 isolates of Magnaporthe species obtained from three different host plants. We demonstrated the evolutionary relationship between Magnaporthe species and the influence of host differentiation on pathogens. Phylogenetic analysis showed that evolution of the pathogen directly corresponds with host divergence, suggesting that host-pathogen interaction has led to co-evolution. Furthermore, we identified an asymmetric selection pressure on Magnaporthe species. Oryza sativa-infecting isolates showed higher directional selection from the host, which in turn tends to lower the genetic diversity in their genomes. We concluded that frequent gene loss or gain, new transposon acquisition and sequence divergence are host adaptability mechanisms for Magnaporthe species, and that this coevolutionary process is greatly driven by directional selection from host plants. PMID:27151494
Accuracy and training population design for genomic selection in elite North American oats

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) is a method to estimate the breeding values of individuals by using markers throughout the genome. We evaluated the accuracies of GS using data from five traits on 446 oat lines genotyped with 1005 Diversity Array Technology (DArT) markers and two GS methods (RR-BLUP and Bayes...
Genome sequence of an aflatoxigenic pathogen of Argentinian peanut, Aspergillus arachidicola

USDA-ARS's Scientific Manuscript database

In this study we sequenced the genome of the A. arachidicola Type strain (CBS 117610) and found its genome size to be 38.9 Mb, and its number of predicted genes to be 12,091, which are values comparable to those in other sequenced Aspergilli. Of its predicted genes, 691 were identified as unique to ...
CpG Distribution and Methylation Pattern in Porcine Parvovirus

PubMed Central

Tóth, Renáta; Mészáros, István; Stefancsik, Rajmund; Bartha, Dániel; Bálint, Ádám; Zádori, Zoltán

2013-01-01

Based on GC content and the observed/expected CpG ratio (oCpGr), we found three major groups among the members of subfamily Parvovirinae: Group I parvoviruses with low GC content and low oCpGr values, Group II with low GC content and high oCpGr values and Group III with high GC content and high oCpGr values. Porcine parvovirus (PPV) belongs to Group I and, like the majority of the parvoviruses, features an ascendant CpG distribution by position in its coding regions. The entire PPV genome remains hypomethylated during the viral lifecycle, independently of the tissue of origin. In vitro CpG methylation of the genome has a modest inhibitory effect on PPV replication. The in vitro hypermethylation disappears from the replicating PPV genome, suggesting that besides the maintenance methyltransferase DNMT1, the de novo DNA methyltransferases DNMT3a and DNMT3b also cannot methylate replicating PPV DNA effectively, even though PPV infection does not seem to influence the expression, translation or localization of the DNA methylases. SNP analysis revealed high mutability of the CpG sites in the PPV genome, while introduction of 29 extra CpG sites into the genome has no significant biological effects on PPV replication in vitro. These experiments raise the possibility that, beyond natural selection, mutational pressure may also significantly contribute to the low level of CpG sites in the PPV genome. PMID:24392033
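The two quantities used for grouping in the abstract above, GC content and the observed/expected CpG ratio, can be computed directly from a sequence. This sketch assumes the commonly used definition oCpGr = (number of CpG x sequence length) / (number of C x number of G). The abstract does not state the exact formula the authors used, and the example sequence is made up.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def observed_expected_cpg(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Assumed definition; returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg * len(seq) / (c * g)


example = "ACGTCGAT"  # hypothetical toy sequence
print(gc_content(example), observed_expected_cpg(example))
```

A ratio well below 1 indicates CpG depletion relative to what the base composition alone would predict, which is the pattern described for the Group I parvoviruses.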
Genome-scale engineering of Saccharomyces cerevisiae with single-nucleotide precision.

PubMed

Bao, Zehua; HamediRad, Mohammad; Xue, Pu; Xiao, Han; Tasan, Ipek; Chao, Ran; Liang, Jing; Zhao, Huimin

2018-07-01

We developed a CRISPR-Cas9- and homology-directed-repair-assisted genome-scale engineering method named CHAnGE that can rapidly output tens of thousands of specific genetic variants in yeast. More than 98% of target sequences were efficiently edited with an average frequency of 82%. We validate the single-nucleotide resolution genome-editing capability of this technology by creating a genome-wide gene disruption collection and apply our method to improve tolerance to growth inhibitors.
Genomic profiling of multiple sequentially acquired tumor metastatic sites from an “exceptional responder” lung adenocarcinoma patient reveals extensive genomic heterogeneity and novel somatic variants driving treatment response.

Cancer.gov

Biswas et al. describe an “exceptional responder” lung adenocarcinoma patient who survived with metastatic lung adenocarcinoma for 7 years while undergoing single or combination ERBB2-directed therapies. Whole-genome, whole-exome, and high-coverage ion-torrent targeted sequencing were used to demonstrate extreme genomic heterogeneity between the lung and lymph node metastatic
NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.

PubMed

Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit

2018-01-01

Recent advancements in sequencing technologies have decreased both the time and the cost of sequencing a whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous amounts of data on microbial populations, publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into deciphering microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods to develop a pan-genome do not support direct input of raw reads from the sequencer but require preprocessing of reads into an assembled protein/gene sequence file or a binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline against other methods for developing a pan-genome, i.e. reference-based assembly and de novo assembly, using simulated reads of Mycobacterium tuberculosis. The single script pipeline (pipeline.pl) is applicable for all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe.
The genome of Eimeria spp., with special reference to Eimeria tenella--a coccidium from the chicken.

PubMed

Shirley, M W

2000-04-10

Eimeria spp. contain at least four genomes. The nuclear genome is best studied in the avian species Eimeria tenella and comprises about 60 Mbp DNA contained within ca. 14 chromosomes; other avian and lupine species appear to possess a nuclear genome of similar size. In addition, sequence data and hybridisation studies have provided direct evidence for extrachromosomal mitochondrial and plastid DNA genomes, and double-stranded RNA segments have also been described. The unique phenotype of "precocious" development that characterises some selected lines of Eimeria spp. not only provides the basis for the first generation of live attenuated vaccines, but offers a significant entrée into studies on the regulation of an apicomplexan life-cycle. With a view to identifying loci implicated in the trait of precocious development, a genetic linkage map of the genome of E. tenella is being constructed in this laboratory from analyses of the inheritance of over 400 polymorphic DNA markers in the progeny of a cross between complementary drug-resistant and precocious parents. Other projects that impinge directly or indirectly on the genome and/or genetics of Eimeria spp. are currently in progress in several laboratories, and include the derivation of expressed sequence tag data and the development of ancillary technologies such as transfection techniques. No large-scale genomic DNA sequencing projects have been reported.
Seeking Optimal Region-Of-Interest (ROI) Single-Value Summary Measures for fMRI Studies in Imaging Genetics

PubMed Central

Tong, Yunxia; Chen, Qiang; Nichols, Thomas E.; Rasetti, Roberta; Callicott, Joseph H.; Berman, Karen F.; Weinberger, Daniel R.; Mattay, Venkata S.

2016-01-01

A data-driven hypothesis-free genome-wide association (GWA) approach in imaging genetics studies allows screening the entire genome to discover novel genes that modulate brain structure, chemistry, and function. However, a whole brain voxel-wise analysis approach in such genome-wide based imaging genetic studies can be computationally intense and also likely has low statistical power since a stringent multiple comparisons correction is needed for searching over the entire genome and brain. In imaging genetics with functional magnetic resonance imaging (fMRI) phenotypes, since many experimental paradigms activate focal regions that can be pre-specified based on a priori knowledge, reducing the voxel-wise search to single-value summary measures within a priori ROIs could prove efficient and promising. The goal of this investigation is to evaluate the sensitivity and reliability of different single-value ROI summary measures and provide guidance in future work. Four different fMRI databases were tested and comparisons across different groups (patients with schizophrenia, their siblings, vs. normal control subjects; across genotype groups) were conducted. Our results show that four of these measures, particularly those that represent values from the top most-activated voxels within an ROI are more powerful at reliably detecting group differences and generating greater effect sizes than the others. PMID:26974435
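The contrast drawn in the abstract above, between an ROI-wide summary and one based on the most-activated voxels, can be sketched as follows. The abstract does not list the specific measures tested, so the "mean of the top fraction" measure, the 20% cut-off, and the voxel values here are hypothetical illustrations only.

```python
def roi_mean(values):
    """Plain average over all voxels in the ROI."""
    return sum(values) / len(values)


def roi_top_fraction_mean(values, fraction=0.1):
    """Average of the most-activated voxels (top `fraction`, at least one voxel)."""
    k = max(1, round(len(values) * fraction))
    top = sorted(values, reverse=True)[:k]
    return sum(top) / k


# Hypothetical activation statistics for a 10-voxel ROI with a small focal peak.
voxels = [0.1, 0.2, 0.3, 0.4, 5.0, 4.0, 0.0, 0.2, 0.1, 0.3]
print(roi_mean(voxels), roi_top_fraction_mean(voxels, fraction=0.2))
```

When activation is focal, the ROI-wide mean is diluted by inactive voxels while the top-fraction mean is not, which is one plausible reason why peak-oriented summaries could yield larger effect sizes.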

The SIDER2 elements, interspersed repeated sequences that populate the Leishmania genomes, constitute subfamilies showing chromosomal proximity relationship.

PubMed

Requena, Jose M; Folgueira, Cristina; López, Manuel C; Thomas, M Carmen

2008-06-02

Protozoan parasites of the genus Leishmania are causative agents of a diverse spectrum of human diseases collectively known as leishmaniasis. These eukaryotic pathogens that diverged early from the main eukaryotic lineage possess a number of unusual genomic, molecular and biochemical features. The completion of the genome projects for three Leishmania species has generated invaluable information enabling a direct analysis of genome structure and organization. By using DNA macroarrays, made with Leishmania infantum genomic clones and hybridized with total DNA from the parasite, we identified a clone containing a repeated sequence. An analysis of the recently completed genome sequence of L. infantum, using this repeated sequence as bait, led to the identification of a new class of repeated elements that are interspersed along the different L. infantum chromosomes. These elements turned out to be homologues of SIDER2 sequences, which were recently identified in the Leishmania major genome; thus, we adopted this nomenclature for the Leishmania elements described herein. Since SIDER2 elements are very heterogeneous in sequence, their precise identification is rather laborious. We have characterized 54 LiSIDER2 elements in chromosome 32 and 27 in chromosome 20. The mean size of these elements is 550 bp and their sequence is G+C rich (mean value of 66.5%). On the basis of sequence similarity, these elements can be grouped into subfamilies that show a remarkable relationship of proximity, i.e. SIDER2s of a given subfamily are located close together in a chromosomal region without intercalating elements. For comparative purposes, we have identified the SIDER2 elements existing in chromosome 32 of L. major and Leishmania braziliensis. While SIDER2 elements are highly conserved both in number and location between L. infantum and L. major, no such conservation exists when comparing with SIDER2s in L. braziliensis chromosome 32.
SIDER2 elements constitute a relevant piece in the Leishmania genome organization. Sequence characteristics, genomic distribution and evolutionary conservation of SIDER2s are suggestive of relevant functions for these elements in Leishmania. Apart from a proven involvement in post-transcriptional mechanisms of gene regulation, SIDER2 elements could be involved in DNA amplification processes and, perhaps, in chromosome segregation as centromeric sequences.
Practical implementation of cost-effective genomic selection in commercial pig breeding using imputation.

PubMed

Cleveland, M A; Hickey, J M

2013-08-01

Genomic selection can be implemented in pig breeding at a reduced cost using genotype imputation. Accuracy of imputation and the impact on resulting genomic breeding values (gEBV) was investigated. High-density genotype data was available for 4,763 animals from a single pig line. Three low-density genotype panels were constructed with SNP densities of 450 (L450), 3,071 (L3k) and 5,963 (L6k). Accuracy of imputation was determined using 184 test individuals with no genotyped descendants in the data but with parents and grandparents genotyped using the Illumina PorcineSNP60 Beadchip. Alternative genotyping scenarios were created in which parents, grandparents, and individuals that were not direct ancestors of test animals (Other) were genotyped at high density (S1), grandparents were not genotyped (S2), dams and granddams were not genotyped (S3), and dams and granddams were genotyped at low density (S4). Four additional scenarios were created by excluding Other animal genotypes. Test individuals were always genotyped at low density. Imputation was performed with AlphaImpute. Genomic breeding values were calculated using the single-step genomic evaluation. Test animals were evaluated for the information retained in the gEBV, calculated as the correlation between gEBV using imputed genotypes and gEBV using true genotypes. Accuracy of imputation was high for all scenarios but decreased with fewer SNP on the low-density panel (0.995 to 0.965 for S1) and with reduced genotyping of ancestors, where the largest changes were for L450 (0.965 in S1 to 0.914 in S3). Exclusion of genotypes for Other animals resulted in only small accuracy decreases. Imputation accuracy was not consistent across the genome. Information retained in the gEBV was related to genotyping scenario and thus to imputation accuracy. Reducing the number of SNP on the low-density panel reduced the information retained in the gEBV, with the largest decrease observed from L3k to L450. 
Excluding Other animal genotypes had little impact on imputation accuracy but caused large decreases in the information retained in the gEBV. These results indicate that accuracy of gEBV from imputed genotypes depends on the level of genotyping in close relatives and the size of the genotyped dataset. Fewer high-density genotyped individuals are needed to obtain accurate imputation than are needed to obtain accurate gEBV. Strategies to optimize development of low-density panels can improve both imputation and gEBV accuracy.
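The "information retained in the gEBV" described in the abstract above is a correlation between breeding values computed from imputed genotypes and from true genotypes. A minimal sketch of that calculation, using a hand-written Pearson correlation and made-up gEBV values for five test animals, is:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical gEBV for five test animals (illustrative numbers only).
gebv_true_genotypes = [0.42, -0.10, 0.75, 0.03, -0.51]
gebv_imputed_genotypes = [0.40, -0.05, 0.70, 0.10, -0.48]
print(round(pearson(gebv_imputed_genotypes, gebv_true_genotypes), 3))
```

A value near 1 means that ranking animals on gEBV from the low-density panel plus imputation would give nearly the same selection decisions as genotyping every candidate at high density.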
Performance comparison of two efficient genomic selection methods (gsbay & MixP) applied in aquacultural organisms

NASA Astrophysics Data System (ADS)

Su, Hailin; Li, Hengde; Wang, Shi; Wang, Yangfan; Bao, Zhenmin

2017-02-01

Genomic selection is increasingly popular in animal and plant breeding industries around the world, as it can be applied early in life without impacting selection candidates. The objective of this study was to bring the advantages of genomic selection to scallop breeding. Two genomic selection tools, MixP and gsbay, were applied to the genomic evaluation of simulated data and Zhikong scallop (Chlamys farreri) field data. The results were compared with those of the widely applied genomic best linear unbiased prediction (GBLUP) method. Our results showed that both MixP and gsbay could accurately estimate single-nucleotide polymorphism (SNP) marker effects, and could therefore be applied to the analysis of genomic estimated breeding values (GEBV). In simulated data from different scenarios, the accuracy of GEBV ranged from 0.20 to 0.78 with MixP, from 0.21 to 0.67 with gsbay, and from 0.21 to 0.61 with GBLUP. Estimates made by MixP and gsbay were expected to be more reliable than those made by GBLUP. Predictions made by gsbay were more robust, while computation with MixP was much faster, especially for large-scale data. These results suggest that the algorithms implemented in MixP and gsbay are both feasible for genomic selection in scallop breeding, and that more genotype data will be necessary to produce genomic estimated breeding values with higher accuracy for the industry.
Novel Virulent and Broad-Host-Range Erwinia amylovora Bacteriophages Reveal a High Degree of Mosaicism and a Relationship to Enterobacteriaceae Phages

PubMed Central

Born, Yannick; Fieseler, Lars; Marazzi, Janine; Lurz, Rudi; Duffy, Brion; Loessner, Martin J.

2011-01-01

A diverse set of 24 novel phages infecting the fire blight pathogen Erwinia amylovora was isolated from fruit production environments in Switzerland. Based on initial screening, four phages (L1, M7, S6, and Y2) with broad host ranges were selected for detailed characterization and genome sequencing. Phage L1 is a member of the Podoviridae, with a 39.3-kbp genome featuring invariable genome ends with direct terminal repeats. Phage S6, another podovirus, was also found to possess direct terminal repeats but has a larger genome (74.7 kbp), and the virus particle exhibits a complex tail fiber structure. Phages M7 and Y2 both belong to the Myoviridae family and feature long, contractile tails and genomes of 84.7 kbp (M7) and 56.6 kbp (Y2), respectively, with direct terminal repeats. The architecture of all four phage genomes is typical for tailed phages, i.e., organized into function-specific gene clusters. All four phages completely lack genes or functions associated with lysogeny control, which correlates well with their broad host ranges and indicates strictly lytic (virulent) lifestyles without the possibility for host lysogenization. Comparative genomics revealed that M7 is similar to E. amylovora virus ΦEa21-4, whereas L1, S6, and Y2 are unrelated to any other E. amylovora phage. Instead, they feature similarities to enterobacterial viruses T7, N4, and ΦEcoM-GJ1. In a series of laboratory experiments, we provide proof of concept that specific two-phage cocktails offer the potential for biocontrol of the pathogen. PMID:21764969
Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

PubMed

Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

2017-01-01

Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable.
Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.
Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

PubMed Central

Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

2017-01-01

Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. 
Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096
Novel virulent and broad-host-range Erwinia amylovora bacteriophages reveal a high degree of mosaicism and a relationship to Enterobacteriaceae phages.

PubMed

Born, Yannick; Fieseler, Lars; Marazzi, Janine; Lurz, Rudi; Duffy, Brion; Loessner, Martin J

2011-09-01

A diverse set of 24 novel phages infecting the fire blight pathogen Erwinia amylovora was isolated from fruit production environments in Switzerland. Based on initial screening, four phages (L1, M7, S6, and Y2) with broad host ranges were selected for detailed characterization and genome sequencing. Phage L1 is a member of the Podoviridae, with a 39.3-kbp genome featuring invariable genome ends with direct terminal repeats. Phage S6, another podovirus, was also found to possess direct terminal repeats but has a larger genome (74.7 kbp), and the virus particle exhibits a complex tail fiber structure. Phages M7 and Y2 both belong to the Myoviridae family and feature long, contractile tails and genomes of 84.7 kbp (M7) and 56.6 kbp (Y2), respectively, with direct terminal repeats. The architecture of all four phage genomes is typical for tailed phages, i.e., organized into function-specific gene clusters. All four phages completely lack genes or functions associated with lysogeny control, which correlates well with their broad host ranges and indicates strictly lytic (virulent) lifestyles without the possibility for host lysogenization. Comparative genomics revealed that M7 is similar to E. amylovora virus ΦEa21-4, whereas L1, S6, and Y2 are unrelated to any other E. amylovora phage. Instead, they feature similarities to enterobacterial viruses T7, N4, and ΦEcoM-GJ1. In a series of laboratory experiments, we provide proof of concept that specific two-phage cocktails offer the potential for biocontrol of the pathogen.
Whole genome amplification of DNA extracted from FFPE tissues.

PubMed

Bosso, Mira; Al-Mulla, Fahd

2011-01-01

Whole genome amplification systems were developed to meet the increasing research demands on DNA resources and to avoid DNA shortage. The technology enables amplification of nanogram amounts of DNA into microgram quantities and is increasingly used in the amplification of DNA from multiple origins such as blood, fresh frozen tissue, formalin-fixed paraffin-embedded tissues, saliva, buccal swabs, bacteria, and plant and animal sources. This chapter focuses on the use of the GenomePlex® Tissue Whole Genome Amplification Kit to amplify DNA directly from archived tissue. In addition, this chapter documents our unique experience with the utilization of GenomePlex® amplified DNA using several molecular techniques including metaphase Comparative Genomic Hybridization, array Comparative Genomic Hybridization, and real-time quantitative polymerase chain reaction assays. GenomePlex® is a registered trademark of Rubicon Genomics Incorporation.
Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

PubMed

Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

2016-01-01

One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. Copyright © 2016 Elsevier Ltd. All rights reserved.
Single-cell PCR of genomic DNA enabled by automated single-cell printing for cell isolation.

PubMed

Stumpf, F; Schoendube, J; Gross, A; Rath, C; Niekrawietz, S; Koltay, P; Roth, G

2015-07-15

Single-cell analysis has developed into a key topic in cell biology with future applications in personalized medicine, tumor identification as well as tumor discovery (Editorial, 2013). Here we employ inkjet-like printing to isolate individual living human B cells (Raji cell line) and load them directly into standard PCR tubes. Single cells are optically detected in the nozzle of the microfluidic piezoelectric dispenser chip to ensure that only droplets containing a single cell are printed. The printing process was characterized using microbeads (10 µm diameter), resulting in single-bead delivery in 27 out of 28 cases and a relative positional precision of ±350 µm at a printing distance of 6 mm between nozzle and tube lid. Process-integrated optical imaging made it possible to identify the printing failure as a void droplet and to exclude it from downstream processing. PCR of truly single-cell DNA was performed without pre-amplification directly from single Raji cells with a 33% success rate (N=197) and Cq values of 36.3±2.5. Additionally, single-cell whole genome amplification (WGA) was employed to pre-amplify the single-cell DNA by a factor of >1000. This facilitated subsequent PCR for the same gene, yielding a success rate of 64% (N=33), which will allow more sophisticated downstream analysis such as sequencing, electrophoresis or multiplexing. Copyright © 2015 Elsevier B.V. All rights reserved.
Phylogenomic relationship of feijoa (Acca sellowiana (O.Berg) Burret) with other Myrtaceae based on complete chloroplast genome sequences.

PubMed

Machado, Lilian de Oliveira; Vieira, Leila do Nascimento; Stefenon, Valdir Marcos; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Guerra, Miguel Pedro; Nodari, Rubens Onofre

2017-04-01

Given their distribution, importance, and richness, Myrtaceae species comprise a model system for studying the evolution of tropical plant diversity. In addition, chloroplast (cp) genome sequencing is an efficient tool for phylogenetic relationship studies. Feijoa [Acca sellowiana (O. Berg) Burret; CN: pineapple-guava] is a Myrtaceae species that occurs naturally in southern Brazil and northern Uruguay. Feijoa is known for its exquisite perfume and flavorful fruits, pharmacological properties, ornamental value and increasing economic relevance. In the present work, we reported the complete cp genome of feijoa. The feijoa cp genome is a circular molecule of 159,370 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC 88,028 bp) and a Small Single Copy region (SSC 18,598 bp) separated by Inverted Repeat regions (IRs 26,372 bp). The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. When compared to other cp genome sequences of Myrtaceae, feijoa showed closest relationship with pitanga (Eugenia uniflora L.). Furthermore, a comparison of pitanga synonymous (Ks) and nonsynonymous (Ka) substitution rates revealed extremely low values. Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of three Myrtoideae clades.
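The region lengths reported in the abstract above can be checked against the stated genome size. The sketch below assumes that the IR figure (26,372 bp) is the length of a single inverted repeat copy and that the quadripartite genome contains two such copies. The variable names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 88_028       # Large Single Copy region
SSC = 18_598       # Small Single Copy region
IR = 26_372        # one Inverted Repeat copy (assumption: figure is per copy)
TOTAL = 159_370    # reported size of the circular cp genome

# Quadripartite structure: LSC + SSC + two IR copies.
computed = LSC + SSC + 2 * IR
print(computed, computed == TOTAL)
```

Under that assumption the four regions sum exactly to the reported 159,370 bp.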
Genome Wide Association Study of Sepsis in Extremely Premature Infants

PubMed Central

Srinivasan, Lakshmi; Page, Grier; Kirpalani, Haresh; Murray, Jeffrey C.; Das, Abhik; Higgins, Rosemary D.; Carlo, Waldemar A.; Bell, Edward F.; Goldberg, Ronald N.; Schibler, Kurt; Sood, Beena G.; Stevenson, David K.; Stoll, Barbara J.; Van Meurs, Krisa P.; Johnson, Karen J.; Levy, Joshua; McDonald, Scott A.; Zaterka-Baxter, Kristin M.; Kennedy, Kathleen A.; Sánchez, Pablo J.; Duara, Shahnaz; Walsh, Michele C.; Shankaran, Seetha; Wynn, James L.; Cotten, C. Michael

2017-01-01

Objective: To identify genetic variants associated with sepsis (early and late-onset) using a genome wide association (GWA) analysis in a cohort of extremely premature infants. Study Design: Previously generated GWA data from the Neonatal Research Network’s anonymized genomic database biorepository of extremely premature infants were used for this study. Sepsis was defined as culture-positive early-onset or late-onset sepsis or culture-proven meningitis. Genomic and whole genome amplified DNA was genotyped for 1.2 million single nucleotide polymorphisms (SNPs); 91% of SNPs were successfully genotyped. We imputed 7.2 million additional SNPs. P values and false discovery rates were calculated from multivariate logistic regression analysis adjusting for gender, gestational age and ancestry. Target statistical value was p<10^-5. Secondary analyses assessed associations of SNPs with pathogen type. Pathway analyses were also run on primary and secondary end points. Results: Data from 757 extremely premature infants were included: 351 infants with sepsis and 406 infants without sepsis. No SNPs reached genome-wide significance levels (5×10^-8); two SNPs in proximity to FOXC2 and FOXL1 genes achieved target levels of significance. In secondary analyses, SNPs for ELMO1, IRAK2 (Gram positive sepsis), RALA, IMMP2L (Gram negative sepsis) and PIEZO2 (fungal sepsis) met target significance levels. Pathways associated with sepsis and Gram negative sepsis included gap junctions, fibroblast growth factor receptors, regulators of cell division and Interleukin-1 associated receptor kinase 2 (p values<0.001 and FDR<20%). Conclusions: No SNPs met genome-wide significance in this cohort of ELBW infants; however, areas of potential association and pathways meriting further study were identified. PMID:28283553
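The abstract above uses two significance levels: the genome-wide level and a less strict study target. The sketch below shows how a p value would be sorted against those two cut-offs, and confirms that the case and control counts add up to the cohort size. The function name `classify` and the example p values are hypothetical and are not results from the study.

```python
GENOME_WIDE = 5e-8   # genome-wide significance level cited in the abstract
TARGET = 1e-5        # study's target statistical value cited in the abstract


def classify(p_value):
    """Label a p value by the strictest threshold it passes."""
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < TARGET:
        return "target"
    return "not significant"


# Cohort sizes reported in the abstract.
with_sepsis, without_sepsis = 351, 406
cohort = with_sepsis + without_sepsis

# Hypothetical p values, for illustration only.
for p in (1e-9, 3e-6, 0.01):
    print(p, classify(p))
print("cohort size:", cohort)
```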
Genome Dynamics and Molecular Infection Epidemiology of Multidrug-Resistant Helicobacter pullorum Isolates Obtained from Broiler and Free-Range Chickens in India.

PubMed

Qumar, Shamsul; Majid, Mohammad; Kumar, Narender; Tiwari, Sumeet K; Semmler, Torsten; Devi, Savita; Baddam, Ramani; Hussain, Arif; Shaik, Sabiha; Ahmed, Niyaz

2017-01-01

Some life-threatening, foodborne, and zoonotic infections are transmitted through poultry birds. Inappropriate and indiscriminate use of antimicrobials in the livestock industry has led to an increased prevalence of multidrug-resistant bacteria with epidemic potential. Here, we present a functional molecular epidemiological analysis entailing the phenotypic and whole-genome sequence-based characterization of 11 H. pullorum isolates from broiler and free-range chickens sampled from retail wet markets in Hyderabad City, India. Antimicrobial susceptibility tests revealed all of the isolates to be resistant to multiple antibiotic classes such as fluoroquinolones, cephalosporins, sulfonamides, and macrolides. The isolates were also found to be extended-spectrum β-lactamase producers and were even resistant to clavulanic acid. Whole-genome sequencing and comparative genomic analysis of these isolates revealed the presence of five or six well-characterized antimicrobial resistance genes, including those encoding a resistance-nodulation-division efflux pump(s). Phylogenetic analysis combined with pan-genome analysis revealed a remarkable degree of genetic diversity among the isolates from free-range chickens; in contrast, a high degree of genetic similarity was observed among broiler chicken isolates. Comparative genomic analysis of all publicly available H. pullorum genomes, including our isolates (n = 16), together with the genomes of 17 other Helicobacter species, revealed a high number (8,560) of H. pullorum-specific protein-encoding genes, with an average of 535 such genes per isolate. In silico virulence screening identified 182 important virulence genes and also revealed high strain-specific gene content in isolates from free-range chickens (average, 34) compared to broiler chicken isolates. A significant prevalence of prophages (ranging from 1 to 9) and a significant presence of genomic islands (0 to 4) were observed in free-range and broiler chicken isolates. 
Taken together, these observations provide significant baseline data for functional molecular infection epidemiology of nonpyloric Helicobacter species such as H. pullorum by unraveling their evolution in chickens and their possible zoonotic transmission to humans. Globally, the poultry industry is expanding with an ever-growing consumer base for chicken meat. Given this, food-associated transmission of multidrug-resistant bacteria represents an important health care issue. Our study involves a critical baseline approach directed at genome sequence-based epidemiology and transmission dynamics of H. pullorum, a poultry pathogen having established zoonotic potential. We believe our studies would facilitate the development of surveillance systems that ensure the safety of food for humans and guide public health policies related to the use of antibiotics in animal feed in countries such as India. We sequenced 11 new genomes of H. pullorum as a part of this study. These genomes would provide much value in addition to the ongoing comparative genomic studies of helicobacters. Copyright © 2016 American Society for Microbiology.
IMA Genome-F 3: Draft genomes of Amanita jacksonii, Ceratocystis albifundus, Fusarium circinatum, Huntiella omanensis, Leptographium procerum, Rutstroemia sydowiana, and Sclerotinia echinophila.

PubMed

van der Nest, Magriet A; Beirn, Lisa A; Crouch, Jo Anne; Demers, Jill E; de Beer, Z Wilhelm; De Vos, Lieschen; Gordon, Thomas R; Moncalvo, Jean-Marc; Naidoo, Kershney; Sanchez-Ramirez, Santiago; Roodt, Danielle; Santana, Quentin C; Slinski, Stephanie L; Stata, Matt; Taerum, Stephen J; Wilken, P Markus; Wilson, Andrea M; Wingfield, Michael J; Wingfield, Brenda D

2014-12-01

The genomes of fungi provide an important resource to resolve issues pertaining to their taxonomy, biology, and evolution. The genomes of Amanita jacksonii, Ceratocystis albifundus, a Fusarium circinatum variant, Huntiella omanensis, Leptographium procerum, Sclerotinia echinophila, and Rutstroemia sydowiana are presented in this genome announcement. These seven genomes are from a number of fungal pathogens and economically important species. The genome sizes range from 27 Mb for Ceratocystis albifundus to 51.9 Mb for Rutstroemia sydowiana. The latter also encodes a predicted 17,350 genes, more than double the number in Ceratocystis albifundus. These genomes will add to the growing body of knowledge of these fungi and provide a valuable resource to researchers studying them.
Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

PubMed

Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

2014-01-01

Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare results across studies. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one at the 3' terminus of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed that these regions performed identically to the full-length genome in genotyping, dividing the HEV isolates involved into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.
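The comparison described above can be sketched by computing pairwise distances twice, once on the full alignment and once on a partial region, and then correlating the two sets. The abstract does not name the distance measure or the statistic used. As stand-ins, this sketch assumes the proportion of differing sites (p-distance) and a Pearson correlation. The toy sequences and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise(seqs, start=0, end=None):
    """p-distances for all sequence pairs, restricted to [start, end)."""
    return [p_distance(s[start:end], t[start:end])
            for s, t in combinations(seqs, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy aligned "genomes" (illustrative only, not HEV data).
toy = ["ACGTACGTAC", "ACGTACGAAC", "TCGTTCGAAC", "TCGATCGAAG"]

full = pairwise(toy)            # distances over the whole alignment
region = pairwise(toy, 4, 10)   # distances over one partial region
r = pearson(full, region)
print(f"correlation with full-length distances: {r:.3f}")
```

A region whose distances track the full-length distances closely would be a candidate for genotyping in place of the complete genome.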
Ten Years of Landscape Genomics: Challenges and Opportunities.

PubMed

Li, Yong; Zhang, Xue-Xia; Mao, Run-Li; Yang, Jie; Miao, Cai-Yun; Li, Zhuo; Qiu, Ying-Xiong

2017-01-01

Landscape genomics is a relatively new discipline that aims to reveal the relationship between adaptive genetic imprints in genomes and environmental heterogeneity among natural populations. Although the interest in landscape genomics has increased since this term was coined, studies on this topic remain scarce. Landscape genomics has become a powerful method to scan and determine the genes responsible for the complex adaptive evolution of species at population (mostly) and individual (more rarely) level. This review outlines the sampling strategies, molecular marker types and research categories in 37 articles published during the first 10 years of this field (i.e., 2007-2016). We also address major challenges and future directions for landscape genomics. This review aims to promote interest in conducting additional studies in landscape genomics.
Direct detection of methylation in genomic DNA

PubMed Central

Bart, A.; van Passel, M. W. J.; van Amsterdam, K.; van der Ende, A.

2005-01-01

The identification of methylated sites on bacterial genomic DNA would be a useful tool to study the major roles of DNA methylation in prokaryotes: distinction of self and nonself DNA, direction of post-replicative mismatch repair, control of DNA replication and cell cycle, and regulation of gene expression. Three types of methylated nucleobases are known: N6-methyladenine, 5-methylcytosine and N4-methylcytosine. The aim of this study was to develop a method to detect all three types of DNA methylation in complete genomic DNA. It was previously shown that N6-methyladenine and 5-methylcytosine in plasmid and viral DNA can be detected by intersequence trace comparison of methylated and unmethylated DNA. We extended this method to include N4-methylcytosine detection in both in vitro and in vivo methylated DNA. Furthermore, application of intersequence trace comparison was extended to bacterial genomic DNA. Finally, we present evidence that intrasequence comparison suffices to detect methylated sites in genomic DNA. In conclusion, we present a method to detect all three natural types of DNA methylation in bacterial genomic DNA. This provides the possibility to define the complete methylome of any prokaryote. PMID:16091626
Chromosomes in a genome-wise order: evidence for metaphase architecture.

PubMed

Weise, Anja; Bhatt, Samarth; Piaszinski, Katja; Kosyakova, Nadezda; Fan, Xiaobo; Altendorf-Hofmann, Annelore; Tanomtong, Alongklod; Chaveerach, Arunrat; de Cioffi, Marcelo Bello; de Oliveira, Edivaldo; Walther, Joachim-U; Liehr, Thomas; Chaudhuri, Jyoti P

2016-01-01

One fundamental finding of the last decade is that, besides the primary DNA sequence information, there are several epigenetic "information layers" such as DNA and histone modifications, chromatin packaging and, last but not least, the position of genes in the nucleus. We postulate that the functional genomic architecture is not restricted to the interphase of the cell cycle but can also be observed in the metaphase stage, when chromosomes are most condensed and microscopically visible. If so, it offers the unique opportunity to directly analyze the functional aspects of genomic architecture in different cells, species and diseases. Another aspect not directly accessible by molecular techniques is the genome merged from two different haploid parental genomes represented by the homologous chromosome sets. Our results show that a defined nuclear architecture exists not only in interphase, where it is well known, but also in metaphase, leading to a bilateral organization of the two haploid sets of chromosomes. Moreover, evidence is provided for the parental origin of the haploid grouping. From our findings we postulate an additional epigenetic information layer within the genome, including the organization of homologous chromosomes and their parental origin, which may now substantially change the landscape of genetics.
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples

PubMed Central

Quick, Josh; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2018-01-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. PMID:28538739
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

PubMed

Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

2016-09-01

Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.

Site-Specific Editing of the Plasmodium falciparum Genome Using Engineered Zinc-Finger Nucleases

PubMed Central

Straimer, Judith; Lee, Marcus CS; Lee, Andrew H; Zeitler, Bryan; Williams, April E; Pearl, Jocelynn R; Zhang, Lei; Rebar, Edward J; Gregory, Philip D; Llinás, Manuel; Urnov, Fyodor D; Fidock, David A

2013-01-01

Malaria afflicts over 200 million people worldwide and its most lethal etiologic agent, Plasmodium falciparum, is evolving to resist even the latest-generation therapeutics. Efficient tools for genome-directed investigations of P. falciparum pathogenesis, including drug resistance mechanisms, are clearly required. Here we report rapid and targeted genetic engineering of this parasite, using zinc-finger nucleases (ZFNs) that produce a double-strand break in a user-defined locus and trigger homology-directed repair. Targeting an integrated egfp locus, we obtained gene deletion parasites with unprecedented speed (two weeks), both with and without direct selection. ZFNs engineered against the endogenous parasite gene pfcrt, responsible for chloroquine treatment escape, rapidly produced parasites that carried either an allelic replacement or a panel of specified point mutations. The efficiency, versatility and precision of this method will enable a diverse array of genome editing approaches to interrogate this human pathogen. PMID:22922501
Systems Biology Approaches for Understanding Genome Architecture.

PubMed

Sewitz, Sven; Lipkow, Karen

2016-01-01

The linear and three-dimensional arrangement and composition of chromatin in eukaryotic genomes underlies the mechanisms directing gene regulation. Understanding this organization requires the integration of many data types and experimental results. Here we describe the approach of integrating genome-wide protein-DNA binding data to determine chromatin states. To investigate spatial aspects of genome organization, we present a detailed description of how to run stochastic simulations of protein movements within a simulated nucleus in 3D. This systems level approach enables the development of novel questions aimed at understanding the basic mechanisms that regulate genome dynamics.
Systematic assignment of thermodynamic constraints in metabolic network models

PubMed Central

Kümmel, Anne; Panke, Sven; Heinemann, Matthias

2006-01-01

Background: The availability of genome sequences for many organisms enabled the reconstruction of several genome-scale metabolic network models. Currently, significant efforts are put into the automated reconstruction of such models. For this, several computational tools have been developed that particularly assist in identifying and compiling the organism-specific lists of metabolic reactions. In contrast, the last step of the model reconstruction process, which is the definition of the thermodynamic constraints in terms of reaction directionalities, still needs to be done manually. No computational method exists that allows for an automated and systematic assignment of reaction directions in genome-scale models. Results: We present an algorithm that – based on thermodynamics, network topology and heuristic rules – automatically assigns reaction directions in metabolic models such that the reaction network is thermodynamically feasible with respect to the production of energy equivalents. It first exploits all available experimentally derived Gibbs energies of formation to identify irreversible reactions. As these thermodynamic data are not available for all metabolites, in a next step, further reaction directions are assigned on the basis of network topology considerations and thermodynamics-based heuristic rules. Briefly, the algorithm identifies reaction subsets from the metabolic network that are able to convert low-energy co-substrates into their high-energy counterparts and thus net produce energy. Our algorithm aims at disabling such thermodynamically infeasible cyclic operation of reaction subnetworks by assigning reaction directions based on a set of thermodynamics-derived heuristic rules. We demonstrate our algorithm on a genome-scale metabolic model of E. coli. The introduced systematic direction assignment yielded 130 irreversible reactions (out of 920 total reactions), which corresponds to about 70% of all irreversible reactions that are required to disable thermodynamically infeasible energy production. Conclusion: Although not being fully comprehensive, our algorithm for systematic reaction direction assignment could define a significant number of irreversible reactions automatically with low computational effort. We envision that the presented algorithm is a valuable part of a computational framework that assists the automated reconstruction of genome-scale metabolic models. PMID:17123434
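The first step described in this abstract, identifying irreversible reactions from Gibbs energies of formation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the metabolite names, formation energies, temperature and concentration bounds below are hypothetical values chosen for illustration, and the topology-based and heuristic steps of the algorithm are not covered.

```python
import math

R = 8.314e-3   # gas constant in kJ/(mol*K)
T = 298.15     # assumed temperature in K

def gibbs_range(stoich, dfg0, c_min=1e-6, c_max=1e-2):
    """Return (lowest, highest) reaction Gibbs energy in kJ/mol.

    stoich maps metabolite -> stoichiometric coefficient
    (negative for substrates, positive for products).
    dfg0 maps metabolite -> standard Gibbs energy of formation.
    c_min and c_max are assumed concentration bounds in mol/L.
    """
    dg0 = sum(n * dfg0[m] for m, n in stoich.items())
    lowest = highest = dg0
    for n in stoich.values():
        if n > 0:   # products: low concentration lowers the reaction energy
            lowest += n * R * T * math.log(c_min)
            highest += n * R * T * math.log(c_max)
        else:       # substrates: high concentration lowers the reaction energy
            lowest += n * R * T * math.log(c_max)
            highest += n * R * T * math.log(c_min)
    return lowest, highest

def classify(stoich, dfg0):
    """Label a reaction by the sign of its feasible Gibbs energy range."""
    lowest, highest = gibbs_range(stoich, dfg0)
    if highest < 0:
        return "forward"      # always exergonic: irreversible as written
    if lowest > 0:
        return "backward"     # always endergonic: runs only in reverse
    return "reversible"       # sign depends on concentrations

# Hypothetical reaction A -> B with made-up formation energies.
reaction = {"A": -1, "B": 1}
print(classify(reaction, {"A": 0.0, "B": -50.0}))  # forward
print(classify(reaction, {"A": 0.0, "B": -10.0}))  # reversible
```

A reaction is fixed in one direction only when its Gibbs energy keeps the same sign over the whole assumed concentration range; everything else stays reversible and would be left to the later steps of the method.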
Genomics and the Public Health Code of Ethics

PubMed Central

Thomas, James C.; Irwin, Debra E.; Zuiker, Erin Shaugnessy; Millikan, Robert C.

2005-01-01

We consider the public health applications of genomic technologies as viewed through the lens of the public health code of ethics. We note, for example, the potential for genomics to increase our appreciation for the public health value of interdependence, the potential for some genomic tools to exacerbate health disparities because of their inaccessibility by the poor and the way in which genomics forces public health to refine its notions of prevention. The public health code of ethics sheds light on concerns raised by commercial genomic products that are not discussed in detail by more clinically oriented perspectives. In addition, the concerns raised by genomics highlight areas of our understanding of the ethical principles of public health in which further refinement may be necessary. PMID:16257942
Do Public Involvement Activities in Biomedical Research and Innovation Recruit Representatively? A Systematic Qualitative Review.

PubMed

Lander, Jonas; Hainz, Tobias; Hirschberg, Irene; Bossert, Sabine; Strech, Daniel

2016-01-01

Public involvement activities (PIAs) may contribute to the governance of ethically challenging biomedical research and innovation by informing, consulting with and engaging the public in developments and decision-making processes. For PIAs to capture a population's preferences (e.g. on issues in whole genome sequencing, biobanks or genome editing), a central methodological requirement is to involve a sufficiently representative subgroup of the general public. While the existing literature focusses on theoretical and normative aspects of 'representation', this study assesses empirically how such considerations are implemented in practice. It evaluates how PIA reports describe representation objectives, the recruitment process and levels of representation achieved. PIA reports were included from a systematic literature search if they directly reported a PIA conducted in a relevant discipline such as genomics, biobanks, biotechnology or others. PIA reports were analyzed with thematic text analysis. The text analysis was guided by an assessment matrix based on PIA-specific guidelines and frameworks. We included 46 relevant reports, most focusing on issues in genomics. 27 reports (59%) explicitly described representation objectives, though mostly without adjusting eligibility criteria and recruiting methods to the specific objective. 11 reports (24%) explicitly reported to have achieved the intended representation; the rest either reported failure or were silent on this issue. Representation of study samples in PIAs in biomedical research and innovation is currently not reported systematically. Improved reporting on representation would not only improve the validity and value of PIAs, but could also contribute to PIA results being used more often in relevant policy and decision-making processes. © 2016 S. Karger AG, Basel.
Ocean biogeochemistry modeled with emergent trait-based genomics.

PubMed

Coles, V J; Stukel, M R; Brooks, M T; Burd, A; Crump, B C; Moran, M A; Paul, J H; Satinsky, B M; Yager, P L; Zielinski, B L; Hood, R R

2017-12-01

Marine ecosystem models have advanced to incorporate metabolic pathways discovered with genomic sequencing, but direct comparisons between models and "omics" data are lacking. We developed a model that directly simulates metagenomes and metatranscriptomes for comparison with observations. Model microbes were randomly assigned genes for specialized functions, and communities of 68 species were simulated in the Atlantic Ocean. Unfit organisms were replaced, and the model self-organized to develop community genomes and transcriptomes. Emergent communities from simulations that were initialized with different cohorts of randomly generated microbes all produced realistic vertical and horizontal ocean nutrient, genome, and transcriptome gradients. Thus, the library of gene functions available to the community, rather than the distribution of functions among specific organisms, drove community assembly and biogeochemical gradients in the model ocean. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
UCbase 2.0: ultraconserved sequences database (2014 update)

PubMed Central

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it PMID:24951797
Single molecule sequencing of the M13 virus genome without amplification

PubMed Central

Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X.; Yan, Qin; Deem, Michael W.; He, Jiankui

2017-01-01

Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias. PMID:29253901
Single molecule sequencing of the M13 virus genome without amplification.

PubMed

Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X; Yan, Qin; Deem, Michael W; He, Jiankui

2017-01-01

Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias.
Exploring the Potential of Direct-To-Consumer Genomic Test Data for Predicting Adverse Drug Events.

PubMed

Zhang, Patrick M; Sarkar, Indra Neil

2018-01-01

Recent technological advancements in genetic testing and the growing accessibility of public genomic data provide researchers with a unique avenue to approach personalized medicine. This feasibility study examined the potential of direct-to-consumer (DTC) genomic tests (focusing on 23andMe) in research and clinical applications. In particular, we combined population genetics information from the Personal Genome Project with adverse event reports from AEOLUS and pharmacogenetic information from PharmGKB. Primarily, associations between drugs based on co-occurring genetic variations and associations between variants and adverse events were used to assess the potential for leveraging single nucleotide polymorphism information from 23andMe. The results of this study suggest potential clinical uses of DTC tests in light of potential drug interactions. Furthermore, the results suggest great potential for analyzing associations at a population level to facilitate knowledge discovery in the realm of predicting adverse drug events.
Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits

PubMed Central

Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

2017-01-01

Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits (by genomic selection or MAS) and forage yield. PMID:28068350
Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

PubMed

Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

2017-01-01

Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits (by genomic selection or MAS) and forage yield.
Evolutionary and Taxonomic Implications of Variation in Nuclear Genome Size: Lesson from the Grass Genus Anthoxanthum (Poaceae)

PubMed Central

Chumová, Zuzana; Krejčíková, Jana; Mandáková, Terezie; Suda, Jan; Trávníček, Pavel

2015-01-01

The genus Anthoxanthum (sweet vernal grass, Poaceae) represents a taxonomically intricate polyploid complex with large phenotypic variation and its evolutionary relationships still poorly resolved. In order to get insight into the geographic distribution of ploidy levels and assess the taxonomic value of genome size data, we determined C- and Cx-values in 628 plants representing all currently recognized European species collected from 197 populations in 29 European countries. The flow cytometric estimates were supplemented by conventional chromosome counts. In addition to diploids, we found two low (rare 3x and common 4x) and one high (~16x–18x) polyploid levels. Mean holoploid genome sizes ranged from 5.52 pg in diploid A. alpinum to 44.75 pg in highly polyploid A. amarum, while the size of monoploid genomes ranged from 2.75 pg in tetraploid A. alpinum to 9.19 pg in diploid A. gracile. In contrast to Central and Northern Europe, which harboured only limited cytological variation, a much more complex pattern of genome sizes was revealed in the Mediterranean, particularly in Corsica. Eight taxonomic groups that partly corresponded to traditionally recognized species were delimited based on genome size values and phenotypic variation. Whereas our data supported the merger of A. aristatum and A. ovatum, eastern Mediterranean populations traditionally referred to as diploid A. odoratum were shown to be cytologically distinct, and may represent a new taxon. Autopolyploid origin was suggested for 4x A. alpinum. In contrast, 4x A. odoratum seems to be an allopolyploid, based on the amounts of nuclear DNA. Intraspecific variation in genome size was observed in all recognized species, the most striking example being the A. aristatum/ovatum complex. Altogether, our study showed that genome size can be a useful taxonomic marker in Anthoxanthum to not only guide taxonomic decisions but also help resolve evolutionary relationships in this challenging grass genus. PMID:26207824
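The relationship between holoploid and monoploid genome size mentioned in this abstract can be made concrete with a short Python sketch. It assumes that the reported holoploid values are 2C amounts and that the monoploid size (1Cx) is the 2C value divided by the ploidy level; the abstract does not spell out its convention, so both points are assumptions, and the tetraploid input is a hypothetical figure.

```python
def monoploid_size(two_c_pg, ploidy):
    """Monoploid genome size (1Cx, in pg) from a 2C value and ploidy level.

    Assumes 1Cx = 2C / ploidy, where ploidy is the number of
    chromosome sets in the somatic nucleus (2 for a diploid).
    """
    if ploidy <= 0:
        raise ValueError("ploidy must be a positive number")
    return two_c_pg / ploidy

# Diploid A. alpinum: 5.52 pg is the holoploid value quoted in the abstract.
print(round(monoploid_size(5.52, 2), 2))   # 2.76

# Hypothetical tetraploid with a 2C value of 11.00 pg (illustrative only).
print(round(monoploid_size(11.00, 4), 2))  # 2.75
```

Dividing by ploidy is what makes genome sizes comparable across cytotypes: a tetraploid with roughly twice the nuclear DNA of a diploid relative ends up with a similar 1Cx value.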
Evolutionary and Taxonomic Implications of Variation in Nuclear Genome Size: Lesson from the Grass Genus Anthoxanthum (Poaceae).

PubMed

Chumová, Zuzana; Krejčíková, Jana; Mandáková, Terezie; Suda, Jan; Trávníček, Pavel

2015-01-01

The genus Anthoxanthum (sweet vernal grass, Poaceae) represents a taxonomically intricate polyploid complex with large phenotypic variation and its evolutionary relationships still poorly resolved. In order to get insight into the geographic distribution of ploidy levels and assess the taxonomic value of genome size data, we determined C- and Cx-values in 628 plants representing all currently recognized European species collected from 197 populations in 29 European countries. The flow cytometric estimates were supplemented by conventional chromosome counts. In addition to diploids, we found two low (rare 3x and common 4x) and one high (~16x-18x) polyploid levels. Mean holoploid genome sizes ranged from 5.52 pg in diploid A. alpinum to 44.75 pg in highly polyploid A. amarum, while the size of monoploid genomes ranged from 2.75 pg in tetraploid A. alpinum to 9.19 pg in diploid A. gracile. In contrast to Central and Northern Europe, which harboured only limited cytological variation, a much more complex pattern of genome sizes was revealed in the Mediterranean, particularly in Corsica. Eight taxonomic groups that partly corresponded to traditionally recognized species were delimited based on genome size values and phenotypic variation. Whereas our data supported the merger of A. aristatum and A. ovatum, eastern Mediterranean populations traditionally referred to as diploid A. odoratum were shown to be cytologically distinct, and may represent a new taxon. Autopolyploid origin was suggested for 4x A. alpinum. In contrast, 4x A. odoratum seems to be an allopolyploid, based on the amounts of nuclear DNA. Intraspecific variation in genome size was observed in all recognized species, the most striking example being the A. aristatum/ovatum complex. Altogether, our study showed that genome size can be a useful taxonomic marker in Anthoxanthum to not only guide taxonomic decisions but also help resolve evolutionary relationships in this challenging grass genus.
'Mind genomics': the experimental, inductive science of the ordinary, and its application to aspects of food and feeding.

PubMed

Moskowitz, Howard R

2012-11-05

The paper introduces the empirical science of 'mind genomics', whose objective is to understand the dimensions of ordinary, everyday experience, identify mind-set segments of people who value different aspects of that everyday experience, and then assign a new person to a mind-set by a statistically appropriate procedure. By studying different experiences using experimental design of ideas, 'mind genomics' constructs an empirical, inductive science of perception and experience, layer by layer. The ultimate objective of 'mind genomics' is a large-scale science of experience created using induction, with the science based upon emergent commonalities across many different types of daily experience. The particular topic investigated in the paper is the experience of healthful snacks, what makes a person 'want' them, and the dollar value of different sensory aspects of the healthful snack. Copyright © 2012 Elsevier Inc. All rights reserved.
Development and Molecular Characterization of Novel Polymorphic Genomic DNA SSR Markers in Lentinula edodes.

PubMed

Moon, Suyun; Lee, Hwa-Yong; Shim, Donghwan; Kim, Myungkil; Ka, Kang-Hyeon; Ryoo, Rhim; Ko, Han-Gyu; Koo, Chang-Duck; Chung, Jong-Wook; Ryu, Hojin

2017-06-01

Sixteen genomic DNA simple sequence repeat (SSR) markers of Lentinula edodes were developed from 205 SSR motifs present in 46.1-Mb long L. edodes genome sequences. The number of alleles ranged from 3 to 14 and the major allele frequency ranged from 0.17 to 0.96. The values of observed and expected heterozygosity ranged from 0.00 to 0.76 and 0.07 to 0.90, respectively. The polymorphic information content value ranged from 0.07 to 0.89. A dendrogram, based on 16 SSR markers clustered by the paired hierarchical clustering method, showed that 33 shiitake cultivars could be divided into three major groups and successfully identified. These SSR markers will contribute to the efficient breeding of this species by providing diversity in shiitake varieties. Furthermore, the genomic information covered by the markers can provide a valuable resource for genetic linkage map construction, molecular mapping, and marker-assisted selection in the shiitake mushroom.
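The marker statistics reported in this abstract (major allele frequency, expected heterozygosity, polymorphic information content) can be computed from allele frequencies, as in the Python sketch below. The abstract does not say which formulas or software were used, so the definitions here are the commonly used ones and are an assumption; the allele frequencies are hypothetical.

```python
from itertools import combinations

def major_allele_frequency(freqs):
    """Frequency of the most common allele at a locus."""
    return max(freqs)

def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def polymorphic_information_content(freqs):
    """PIC: expected heterozygosity minus the term for uninformative
    matings, summed over every pair of distinct alleles."""
    pair_term = sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - pair_term

# Hypothetical SSR locus with three alleles (frequencies sum to 1).
locus = [0.6, 0.3, 0.1]
print(major_allele_frequency(locus))
print(round(expected_heterozygosity(locus), 3))
print(round(polymorphic_information_content(locus), 3))
```

For a locus with two equally frequent alleles these definitions give an expected heterozygosity of 0.5 and a PIC of 0.375, which is why PIC never exceeds expected heterozygosity.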
Chromatin Dynamics in Genome Stability: Roles in Suppressing Endogenous DNA Damage and Facilitating DNA Repair

PubMed Central

Nair, Nidhi; Shoaib, Muhammad

2017-01-01

Genomic DNA is compacted into chromatin through packaging with histone and non-histone proteins. Importantly, DNA accessibility is dynamically regulated to ensure genome stability. This is exemplified in the response to DNA damage where chromatin relaxation near genomic lesions serves to promote access of relevant enzymes to specific DNA regions for signaling and repair. Furthermore, recent data highlight genome maintenance roles of chromatin through the regulation of endogenous DNA-templated processes including transcription and replication. Here, we review research that shows the importance of chromatin structure regulation in maintaining genome integrity by multiple mechanisms including facilitating DNA repair and directly suppressing endogenous DNA damage. PMID:28698521
The 3D genome in transcriptional regulation and pluripotency.

PubMed

Gorkin, David U; Leung, Danny; Ren, Bing

2014-06-05

It can be convenient to think of the genome as simply a string of nucleotides, the linear order of which encodes an organism's genetic blueprint. However, the genome does not exist as a linear entity within cells where this blueprint is actually utilized. Inside the nucleus, the genome is organized in three-dimensional (3D) space, and lineage-specific transcriptional programs that direct stem cell fate are implemented in this native 3D context. Here, we review principles of 3D genome organization in mammalian cells. We focus on the emerging relationship between genome organization and lineage-specific transcriptional regulation, which we argue are inextricably linked. Copyright © 2014 Elsevier Inc. All rights reserved.
Non-viral delivery of genome-editing nucleases for gene therapy.

PubMed

Wang, M; Glass, Z A; Xu, Q

2017-03-01

Manipulating the genetic makeup of mammalian cells using programmable nuclease-based genome-editing technology has recently evolved into a powerful avenue that holds great potential for treating genetic disorders. There are four types of genome-editing nucleases, including meganucleases, zinc finger nucleases, transcription activator-like effector nucleases and clustered, regularly interspaced, short palindromic repeat-associated nucleases such as Cas9. These nucleases have been harnessed to introduce precise and specific changes of the genome sequence at virtually any genome locus of interest. The therapeutic relevance of these genome-editing technologies, however, is challenged by the safe and efficient delivery of nuclease into targeted cells. Herein, we summarize recent advances that have been made on non-viral delivery of genome-editing nucleases. In particular, we focus on non-viral delivery of Cas9/sgRNA ribonucleoproteins for genome editing. In addition, the future direction for developing non-viral delivery of programmable nucleases for genome editing is discussed.
Analysis of repeat-mediated deletions in the mitochondrial genome of Saccharomyces cerevisiae.

PubMed

Phadnis, Naina; Sia, Rey A; Sia, Elaine A

2005-12-01

Mitochondrial DNA deletions and point mutations accumulate in an age-dependent manner in mammals. The mitochondrial genome in aging humans often displays a 4977-bp deletion flanked by short direct repeats. Additionally, direct repeats flank two-thirds of the reported mitochondrial DNA deletions. The mechanism by which these deletions arise is unknown, but direct-repeat-mediated deletions involving polymerase slippage, homologous recombination, and nonhomologous end joining have been proposed. We have developed a genetic reporter to measure the rate at which direct-repeat-mediated deletions arise in the mitochondrial genome of Saccharomyces cerevisiae. Here we analyze the effect of repeat size and heterology between repeats on the rate of deletions. We find that the dependence on homology for repeat-mediated deletions is linear down to 33 bp. Heterology between repeats does not affect the deletion rate substantially. Analysis of recombination products suggests that the deletions are produced by at least two different pathways, one that generates only deletions and one that appears to generate both deletions and reciprocal products of recombination. We discuss how this reporter may be used to identify the proteins in yeast that have an impact on the generation of direct-repeat-mediated deletions.

Analysis of Repeat-Mediated Deletions in the Mitochondrial Genome of Saccharomyces cerevisiae

PubMed Central

Phadnis, Naina; Sia, Rey A.; Sia, Elaine A.

2005-01-01

Mitochondrial DNA deletions and point mutations accumulate in an age-dependent manner in mammals. The mitochondrial genome in aging humans often displays a 4977-bp deletion flanked by short direct repeats. Additionally, direct repeats flank two-thirds of the reported mitochondrial DNA deletions. The mechanism by which these deletions arise is unknown, but direct-repeat-mediated deletions involving polymerase slippage, homologous recombination, and nonhomologous end joining have been proposed. We have developed a genetic reporter to measure the rate at which direct-repeat-mediated deletions arise in the mitochondrial genome of Saccharomyces cerevisiae. Here we analyze the effect of repeat size and heterology between repeats on the rate of deletions. We find that the dependence on homology for repeat-mediated deletions is linear down to 33 bp. Heterology between repeats does not affect the deletion rate substantially. Analysis of recombination products suggests that the deletions are produced by at least two different pathways, one that generates only deletions and one that appears to generate both deletions and reciprocal products of recombination. We discuss how this reporter may be used to identify the proteins in yeast that have an impact on the generation of direct-repeat-mediated deletions. PMID:16157666
Leadership, Literacy, and Translational Expertise in Genomics: Challenges and Opportunities for Social Work.

PubMed

Werner-Lin, Allison; McCoyd, Judith L M; Doyle, Maya H; Gehlert, Sarah J

2016-08-01

The transdisciplinary field of genomics is revolutionizing conceptualizations of health, mental health, family formation, and public policy. Many professions must rapidly acquire genomic expertise to maintain state-of-the-art knowledge in their practice. Calls for social workers to build genomic capacity come regularly, yet social work education has not prepared practitioners to join the genomics workforce in providing socially just, ethically informed care to all clients, particularly those from vulnerable and marginalized groups. The authors suggest a set of action steps for bringing social work skills and practice into the 21st century. They propose that good genomic practice entails bringing social work values, skills, and behaviors to genomics. With education and training, social workers may facilitate socially just dissemination of genomic knowledge and services across practice domains. Increased genomic literacy will support the profession's mission to address disparities in health, health care access, and mortality. © 2016 National Association of Social Workers.
Leadership, Literacy, and Translational Expertise in Genomics: Challenges and Opportunities for Social Work

PubMed Central

Werner-Lin, Allison; McCoyd, Judith L. M.; Doyle, Maya H.; Gehlert, Sarah J.

2016-01-01

The transdisciplinary field of genomics is revolutionizing conceptualizations of health, mental health, family formation, and public policy. Many professions must rapidly acquire genomic expertise to maintain state-of-the-art knowledge in their practice. Calls for social workers to build genomic capacity come regularly, yet social work education has not prepared practitioners to join the genomics workforce in providing socially just, ethically informed care to all clients, particularly those from vulnerable and marginalized groups. The authors suggest a set of action steps for bringing social work skills and practice into the 21st century. They propose that good genomic practice entails bringing social work values, skills, and behaviors to genomics. With education and training, social workers may facilitate socially just dissemination of genomic knowledge and services across practice domains. Increased genomic literacy will support the profession’s mission to address disparities in health, health care access, and mortality. PMID:29206948
Multiplex Allele-Specific Amplification from Whole Blood for Detecting Multiple Polymorphisms Simultaneously

PubMed Central

Zhu, Jianjie; Chen, Lanxin; Mao, Yong; Zhou, Huan

2013-01-01

Allele-specific amplification on the basis of polymerase chain reaction (PCR) has been widely used for single-nucleotide polymorphism (SNP) genotyping. However, the extraction of PCR-compatible genomic DNA from whole blood is usually required. This process is complicated and tedious, and is prone to cause cross-contamination between samples. To facilitate direct PCR amplification from whole blood without the extraction of genomic DNA, we optimized the pH value of PCR solution and the concentrations of magnesium ions and facilitator glycerol. Then, we developed multiplex allele-specific amplifications from whole blood and applied them to a case–control study. In this study, we successfully established triplex, five-plex, and eight-plex allele-specific amplifications from whole blood for determining the distribution of genotypes and alleles of 14 polymorphisms in 97 gastric cancer patients and 141 healthy controls. Statistical analysis results showed significant association of SNPs rs9344, rs1799931, and rs1800629 with the risk of gastric cancer. This method is accurate, time-saving, cost-effective, and easy-to-do, especially suitable for clinical prediction of disease susceptibility. PMID:23072573
The use of genomic information increases the accuracy of breeding value predictions for sea louse (Caligus rogercresseyi) resistance in Atlantic salmon (Salmo salar).

PubMed

Correa, Katharina; Bangera, Rama; Figueroa, René; Lhorente, Jean P; Yáñez, José M

2017-01-31

Sea lice infestations caused by Caligus rogercresseyi are a main concern to the salmon farming industry due to associated economic losses. Resistance to this parasite was shown to have low to moderate genetic variation and its genetic architecture was suggested to be polygenic. The aim of this study was to compare accuracies of breeding value predictions obtained with pedigree-based best linear unbiased prediction (P-BLUP) methodology against different genomic prediction approaches: genomic BLUP (G-BLUP), Bayesian Lasso, and Bayes C. To achieve this, 2404 individuals from 118 families were measured for C. rogercresseyi count after a challenge and genotyped using 37 K single nucleotide polymorphisms. Accuracies were assessed using fivefold cross-validation and SNP densities of 0.5, 1, 5, 10, 25 and 37 K. Accuracy of genomic predictions increased with increasing SNP density and was higher than pedigree-based BLUP predictions by up to 22%. Both Bayesian and G-BLUP methods can predict breeding values with higher accuracies than pedigree-based BLUP, however, G-BLUP may be the preferred method because of reduced computation time and ease of implementation. A relatively low marker density (i.e. 10 K) is sufficient for maximal increase in accuracy when using G-BLUP or Bayesian methods for genomic prediction of C. rogercresseyi resistance in Atlantic salmon.
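G-BLUP replaces the pedigree relationship matrix with one computed from marker genotypes. The abstract does not say how that matrix was built, so the sketch below assumes one common construction (VanRaden's first method) with genotypes coded 0/1/2. The function name and the toy data are illustrative only and are not taken from the study.

```python
def genomic_relationship_matrix(genotypes):
    """Return a VanRaden-style genomic relationship matrix (assumed method).

    genotypes: list of rows, one per individual, one column per SNP,
    each entry the count (0, 1 or 2) of a reference allele.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Centre every genotype by twice the allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    # Scale by the expected genotype variance summed over SNPs.
    # Assumes at least one SNP is polymorphic, otherwise scale is zero.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]


# Toy data: three individuals, three SNPs (hypothetical values).
toy_genotypes = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
G = genomic_relationship_matrix(toy_genotypes)
```

Lowering the SNP density, as in the comparison from 0.5 K to 37 K, corresponds to passing fewer columns to the same function.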
Informed consent in direct-to-consumer personal genome testing: the outline of a model between specific and generic consent.

PubMed

Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

2014-09-01

Broad genome-wide testing is increasingly finding its way to the public through the online direct-to-consumer marketing of so-called personal genome tests. Personal genome tests estimate genetic susceptibilities to multiple diseases and other phenotypic traits simultaneously. Providers commonly make use of Terms of Service agreements rather than informed consent procedures. However, to protect consumers from the potential physical, psychological and social harms associated with personal genome testing and to promote autonomous decision-making with regard to the testing offer, we argue that current practices of information provision are insufficient and that there is a place--and a need--for informed consent in personal genome testing, also when it is offered commercially. The increasing quantity, complexity and diversity of most testing offers, however, pose challenges for information provision and informed consent. Both specific and generic models for informed consent fail to meet its moral aims when applied to personal genome testing. Consumers should be enabled to know the limitations, risks and implications of personal genome testing and should be given control over the genetic information they do or do not wish to obtain. We present the outline of a new model for informed consent which can meet both the norm of providing sufficient information and the norm of providing understandable information. The model can be used for personal genome testing, but will also be applicable to other, future forms of broad genetic testing or screening in commercial and clinical settings. © 2012 John Wiley & Sons Ltd.
Direct Formalin Fixation Induces Widespread Genomic Effects in Archival Tissues

EPA Science Inventory

Recent advances in next generation sequencing have dramatically improved transcriptional analysis of degraded RNA from formalin-fixed paraffin-embedded (FFPE) samples. However, little is known about potential genomic artifacts induced by formalin fixation, which could affect toxi...
Approximating genomic reliabilities for national genomic evaluation

USDA-ARS's Scientific Manuscript database

With the introduction of standard methods for approximating effective daughter/data contribution by Interbull in 2001, conventional EDC or reliabilities contributed by daughter phenotypes are directly comparable across countries and used in routine conventional evaluations. In order to make publishe...
CRISPR/Cas9-Based Multiplex Genome Editing in Monocot and Dicot Plants.

PubMed

Ma, Xingliang; Liu, Yao-Guang

2016-07-01

The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated genome targeting system has been applied to a variety of organisms, including plants. Compared to other genome-targeting technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), the CRISPR/Cas9 system is easier to use and has much higher editing efficiency. In addition, multiple "single guide RNAs" (sgRNAs) with different target sequences can be designed to direct the Cas9 protein to multiple genomic sites for simultaneous multiplex editing. Here, we present a procedure for highly efficient multiplex genome targeting in monocot and dicot plants using a versatile and robust CRISPR/Cas9 vector system, emphasizing the construction of binary constructs with multiple sgRNA expression cassettes in one round of cloning using Golden Gate ligation. We also describe the genotyping of targeted mutations in transgenic plants by direct Sanger sequencing followed by decoding of superimposed sequencing chromatograms containing biallelic or heterozygous mutations using the Web-based tool DSDecode. © 2016 by John Wiley & Sons, Inc.
Single-Cell Microfluidics to Study the Effects of Genome Deletion on Bacterial Growth Behavior.

PubMed

Yuan, Xiaofei; Couto, Jillian M; Glidle, Andrew; Song, Yanqing; Sloan, William; Yin, Huabing

2017-12-15

By directly monitoring single cell growth in a microfluidic platform, we interrogated genome-deletion effects in Escherichia coli strains. We compared the growth dynamics of a wild type strain with a clean genome strain, and their derived mutants at the single-cell level. A decreased average growth rate and extended average lag time were found for the clean genome strain, compared to those of the wild type strain. Direct correlation between the growth rate and lag time of individual cells showed that the clean genome population was more heterogeneous. Cell culturability (the ratio of growing cells to the sum of growing and nongrowing cells) of the clean genome population was also lower. Interestingly, after the random mutations induced by a glucose starvation treatment, for the clean genome population mutants that had survived the competition of chemostat culture, each parameter markedly improved (i.e., the average growth rate and cell culturability increased, and the lag time and heterogeneity decreased). However, this effect was not seen in the wild type strain; the wild type mutants cultured in a chemostat retained a high diversity of growth phenotypes. These results suggest that quasi-essential genes that were deleted in the clean genome might be required to retain a diversity of growth characteristics at the individual cell level under environmental stress. These observations highlight that single-cell microfluidics can reveal subtle individual cellular responses, enabling in-depth understanding of the population.
Consumer Health Informatics Aspects of Direct-to-Consumer Personal Genomic Testing.

PubMed

Gray, Kathleen; Stephen, Remya; Terrill, Bronwyn; Wilson, Brenda; Middleton, Anna; Tytherleigh, Rigan; Turbitt, Erin; Gaff, Clara; Savard, Jacqueline; Hickerton, Chriselle; Newson, Ainsley; Metcalfe, Sylvia

2017-01-01

This paper uses consumer health informatics as a framework to explore whether and how direct-to-consumer personal genomic testing can be regarded as a form of information which assists consumers to manage their health. It presents findings from qualitative content analysis of web sites that offer testing services, and of transcripts from focus groups conducted as part of a study of the Australian public's expectations of personal genomics. Content analysis showed that service offerings have some features of consumer health information but lack consistency. Focus group participants were mostly unfamiliar with the specifics of test reports and related information services. Some of their ideas about aids to knowledge were in line with the benefits described on provider web sites, but some expectations were inflated. People were ambivalent about whether these services would address consumers' health needs, interests and contexts and whether they would support consumers' health self-management decisions and outcomes. There is scope for consumer health informatics approaches to refine the usage and the utility of direct-to-consumer personal genomic testing. Further research may focus on how uptake is affected by consumers' health literacy or by services' engagement with consumers about what they really want.
Simulating a base population in honey bee for molecular genetic studies

PubMed Central

2012-01-01

Background Over the past years, reports have indicated that honey bee populations are declining and that infestation by an ecto-parasitic mite (Varroa destructor) is one of the main causes. Selective breeding of resistant bees can help to prevent losses due to the parasite, but it requires that a robust breeding program and genetic evaluation are implemented. Genomic selection has emerged as an important tool in animal breeding programs and simulation studies have shown that it yields more accurate breeding value estimates, higher genetic gain and low rates of inbreeding. Since genomic selection relies on marker data, simulations conducted on a genomic dataset are a pre-requisite before selection can be implemented. Although genomic datasets have been simulated in other species undergoing genetic evaluation, simulation of a genomic dataset specific to the honey bee is required since this species has a distinct genetic and reproductive biology. Our software program was aimed at constructing a base population by simulating a random mating honey bee population. A forward-time population simulation approach was applied since it allows modeling of genetic characteristics and reproductive behavior specific to the honey bee. Results Our software program yielded a genomic dataset for a base population in linkage disequilibrium. In addition, information was obtained on (1) the position of markers on each chromosome, (2) allele frequency, (3) χ² statistics for Hardy-Weinberg equilibrium, (4) a sorted list of markers with a minor allele frequency less than or equal to the input value, (5) average r² values of linkage disequilibrium between all simulated marker loci pair for all generations and (6) average r² value of linkage disequilibrium in the last generation for selected markers with the highest minor allele frequency.
Conclusion We developed a software program that takes into account the genetic and reproductive biology specific to the honey bee and that can be used to constitute a genomic dataset compatible with the simulation studies necessary to optimize breeding programs. The source code together with an instruction file is freely accessible at http://msproteomics.org/Research/Misc/honeybeepopulationsimulator.html PMID:22520469
Simulating a base population in honey bee for molecular genetic studies.

PubMed

Gupta, Pooja; Conrad, Tim; Spötter, Andreas; Reinsch, Norbert; Bienefeld, Kaspar

2012-06-27

Over the past years, reports have indicated that honey bee populations are declining and that infestation by an ecto-parasitic mite (Varroa destructor) is one of the main causes. Selective breeding of resistant bees can help to prevent losses due to the parasite, but it requires that a robust breeding program and genetic evaluation are implemented. Genomic selection has emerged as an important tool in animal breeding programs and simulation studies have shown that it yields more accurate breeding value estimates, higher genetic gain and low rates of inbreeding. Since genomic selection relies on marker data, simulations conducted on a genomic dataset are a pre-requisite before selection can be implemented. Although genomic datasets have been simulated in other species undergoing genetic evaluation, simulation of a genomic dataset specific to the honey bee is required since this species has a distinct genetic and reproductive biology. Our software program was aimed at constructing a base population by simulating a random mating honey bee population. A forward-time population simulation approach was applied since it allows modeling of genetic characteristics and reproductive behavior specific to the honey bee. Our software program yielded a genomic dataset for a base population in linkage disequilibrium. In addition, information was obtained on (1) the position of markers on each chromosome, (2) allele frequency, (3) χ² statistics for Hardy-Weinberg equilibrium, (4) a sorted list of markers with a minor allele frequency less than or equal to the input value, (5) average r² values of linkage disequilibrium between all simulated marker loci pair for all generations and (6) average r² value of linkage disequilibrium in the last generation for selected markers with the highest minor allele frequency.
We developed a software program that takes into account the genetic and reproductive biology specific to the honey bee and that can be used to constitute a genomic dataset compatible with the simulation studies necessary to optimize breeding programs. The source code together with an instruction file is freely accessible at http://msproteomics.org/Research/Misc/honeybeepopulationsimulator.html.
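Three of the summary statistics listed in the abstract above (allele frequency, the χ² test for Hardy-Weinberg equilibrium, and r² as a measure of linkage disequilibrium) have standard textbook definitions. The sketch below follows those definitions for a diploid, biallelic marker. It is not the authors' program, the function names are hypothetical, and it ignores the haploid drones that a honey-bee-specific simulator would have to handle.

```python
def allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of allele A from diploid genotype counts AA, AB, BB."""
    total = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * total)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations p^2, 2pq, q^2."""
    total = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic markers).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two loci from the haplotype frequency p_ab and the
    allele frequencies p_a and p_b (both loci must be polymorphic)."""
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
```

Averaging `ld_r_squared` over all marker pairs in a generation would give a quantity comparable to item (5), under the assumptions stated above.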
Consumers report lower confidence in their genetics knowledge following direct-to-consumer personal genomic testing

PubMed Central

Carere, Deanna Alexis; Kraft, Peter; Kaphingst, Kimberly A.; Roberts, J. Scott; Green, Robert C.

2015-01-01

Purpose To measure changes to genetics knowledge and self-efficacy following personal genomic testing (PGT). Methods New customers of 23andMe and Pathway Genomics completed a series of online surveys. Prior to receipt of results, and 6 months post-results, we measured genetics knowledge (9 true/false items) and genetics self-efficacy (5 Likert-scale items) and used paired methods to evaluate change over time. Correlates of change (e.g., decision regret) were identified using linear regression. Results 998 PGT customers (59.9% female; 85.8% White; mean age 46.9±15.5 years) were included in our analyses. Mean genetics knowledge score out of 9 was 8.15±0.95 at baseline and 8.25±0.92 at 6 months (p = .0024). Mean self-efficacy score out of 35 was 29.06±5.59 at baseline and 27.7±5.46 at 6 months (p < .0001); on each item, 30–45% of participants reported lower self-efficacy following PGT. Change in self-efficacy was positively associated with health care provider consultation (p = .0042), impact of PGT on perceived control over one’s health (p < .0001), and perceived value of PGT (p < .0001), and negatively associated with decision regret (p < .0001). Conclusion Lowered genetics self-efficacy following PGT may reflect an appropriate reevaluation by consumers in response to receiving complex genetic information. PMID:25812042
The Statistical Segment Length of DNA: Opportunities for Biomechanical Modeling in Polymer Physics and Next-Generation Genomics.

PubMed

Dorfman, Kevin D

2018-02-01

The development of bright bisintercalating dyes for deoxyribonucleic acid (DNA) in the 1990s, most notably YOYO-1, revolutionized the field of polymer physics in the ensuing years. These dyes, in conjunction with modern molecular biology techniques, permit the facile observation of polymer dynamics via fluorescence microscopy and thus direct tests of different theories of polymer dynamics. At the same time, they have played a key role in advancing an emerging next-generation method known as genome mapping in nanochannels. The effect of intercalation on the bending energy of DNA as embodied by a change in its statistical segment length (or, alternatively, its persistence length) has been the subject of significant controversy. The precise value of the statistical segment length is critical for the proper interpretation of polymer physics experiments and controls the phenomena underlying the aforementioned genomics technology. In this perspective, we briefly review the model of DNA as a wormlike chain and a trio of methods (light scattering, optical or magnetic tweezers, and atomic force microscopy (AFM)) that have been used to determine the statistical segment length of DNA. We then outline the disagreement in the literature over the role of bisintercalation on the bending energy of DNA, and how a multiscale biomechanical approach could provide an important model for this scientifically and technologically relevant problem.
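For orientation, the standard wormlike-chain relations that link the statistical segment (Kuhn) length $b$ to the persistence length $l_p$ are written out below. Here $L$ denotes the contour length and $\langle R^2 \rangle$ the mean-square end-to-end distance. These symbols are introduced for illustration and do not appear in the abstract, and the expressions describe an ideal chain without excluded volume.

```latex
\langle R^2 \rangle
  = 2\, l_p L \left[ 1 - \frac{l_p}{L}\left( 1 - e^{-L/l_p} \right) \right],
\qquad
b = 2\, l_p,
\qquad
\langle R^2 \rangle \;\to\; b\,L \quad (L \gg l_p).
```

Because the long-chain limit depends on the product $bL$, any change in $b$ caused by intercalation feeds directly into the predicted coil size, which is why the precise value matters for interpreting the experiments mentioned above.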
Genetic parameter estimates for carcass traits and visual scores including or not genomic information.

PubMed

Gordo, D G M; Espigolan, R; Tonussi, R L; Júnior, G A F; Bresolin, T; Magalhães, A F Braga; Feitosa, F L; Baldi, F; Carvalheiro, R; Tonhati, H; de Oliveira, H N; Chardulo, L A L; de Albuquerque, L G

2016-05-01

The objective of this study was to determine whether visual scores used as selection criteria in Nellore breeding programs are effective indicators of carcass traits measured after slaughter. Additionally, this study evaluated the effect of different structures of the relationship matrix (A and H) on the estimation of genetic parameters and on the prediction accuracy of breeding values. There were 13,524 animals for visual scores of conformation (CS), finishing precocity (FP), and muscling (MS) and 1,753, 1,747, and 1,564 for LM area (LMA), backfat thickness (BF), and HCW, respectively. Of these, 1,566 animals were genotyped using a high-density panel containing 777,962 SNP. Six analyses were performed using multitrait animal models, each including the 3 visual scores and 1 carcass trait. For the visual scores, the model included direct additive genetic and residual random effects and the fixed effects of contemporary group (defined by year of birth, management group at yearling, and farm) and the linear effect of age of animal at yearling. The same model was used for the carcass traits, replacing the effect of age of animal at yearling with the linear effect of age of animal at slaughter. The variance and covariance components were estimated by the REML method in analyses using the numerator relationship matrix (A) or combining the genomic and the numerator relationship matrices (H). The heritability estimates for the visual scores obtained with the 2 methods were similar and of moderate magnitude (0.23-0.34), indicating that these traits should respond to direct selection. The heritabilities for LMA, BF, and HCW were 0.13, 0.07, and 0.17, respectively, using matrix A and 0.29, 0.16, and 0.23, respectively, using matrix H. The genetic correlations between the visual scores and carcass traits were positive, and higher correlations were generally obtained when matrix H was used.
Considering the difficulties and cost of measuring carcass traits postmortem, visual scores of CS, FP, and MS could be used as selection criteria to improve HCW, BF, and LMA. The use of genomic information permitted the detection of greater additive genetic variability for LMA and BF. For HCW, the high magnitude of the genetic correlations with visual scores was probably sufficient to recover genetic variability. The methods provided similar breeding value accuracies, especially for the visual scores.
Generation of Tandem Direct Duplications by Reversed-Ends Transposition of Maize Ac Elements

PubMed Central

Peterson, Thomas

2013-01-01

Tandem direct duplications are a common feature of the genomes of eukaryotes ranging from yeast to human, where they comprise a significant fraction of copy number variations. The prevailing model for the formation of tandem direct duplications is non-allelic homologous recombination (NAHR). Here we report the isolation of a series of duplications and reciprocal deletions isolated de novo from a maize allele containing two Class II Ac/Ds transposons. The duplication/deletion structures suggest that they were generated by alternative transposition reactions involving the termini of two nearby transposable elements. The deletion/duplication breakpoint junctions contain 8 bp target site duplications characteristic of Ac/Ds transposition events, confirming their formation directly by an alternative transposition mechanism. Tandem direct duplications and reciprocal deletions were generated at a relatively high frequency (∼0.5 to 1%) in the materials examined here in which transposons are positioned nearby each other in appropriate orientation; frequencies would likely be much lower in other genotypes. To test whether this mechanism may have contributed to maize genome evolution, we analyzed sequences flanking Ac/Ds and other hAT family transposons and identified three small tandem direct duplications with the structural features predicted by the alternative transposition mechanism. Together these results show that some class II transposons are capable of directly inducing tandem sequence duplications, and that this activity has contributed to the evolution of the maize genome. PMID:23966872
[Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].

PubMed

Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong

2015-11-01

The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations to gene coding region; to knock in specific sequences (e.g. FLAG tag DNA sequence) to targeted genomic locus via homology directed repair; to induce large genomic deletion through dual-guide multiplex. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.
Non-additive Effects in Genomic Selection

PubMed Central

Varona, Luis; Legarra, Andres; Toro, Miguel A.; Vitezica, Zulma G.

2018-01-01

In the last decade, genomic selection has become a standard in the genetic evaluation of livestock populations. However, most procedures for the implementation of genomic selection only consider the additive effects associated with SNP (Single Nucleotide Polymorphism) markers used to calculate the prediction of the breeding values of candidates for selection. Nevertheless, the availability of estimates of non-additive effects is of interest because: (i) they contribute to an increase in the accuracy of the prediction of breeding values and the genetic response; (ii) they allow the definition of mate allocation procedures between candidates for selection; and (iii) they can be used to enhance non-additive genetic variation through the definition of appropriate crossbreeding or purebred breeding schemes. This study presents a review of methods for the incorporation of non-additive genetic effects into genomic selection procedures and their potential applications in the prediction of future performance, mate allocation, crossbreeding, and purebred selection. The work concludes with a brief outline of some ideas for future lines of research that may help the standard inclusion of non-additive effects in genomic selection. PMID:29559995
Non-additive Effects in Genomic Selection.

PubMed

Varona, Luis; Legarra, Andres; Toro, Miguel A; Vitezica, Zulma G

2018-01-01

In the last decade, genomic selection has become a standard in the genetic evaluation of livestock populations. However, most procedures for the implementation of genomic selection only consider the additive effects associated with SNP (Single Nucleotide Polymorphism) markers used to calculate the prediction of the breeding values of candidates for selection. Nevertheless, the availability of estimates of non-additive effects is of interest because: (i) they contribute to an increase in the accuracy of the prediction of breeding values and the genetic response; (ii) they allow the definition of mate allocation procedures between candidates for selection; and (iii) they can be used to enhance non-additive genetic variation through the definition of appropriate crossbreeding or purebred breeding schemes. This study presents a review of methods for the incorporation of non-additive genetic effects into genomic selection procedures and their potential applications in the prediction of future performance, mate allocation, crossbreeding, and purebred selection. The work concludes with a brief outline of some ideas for future lines of research that may help the standard inclusion of non-additive effects in genomic selection.

Reference-free comparative genomics of 174 chloroplasts.

PubMed

Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R; Yu, Jun; Cannon, Charles H

2012-01-01

Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ~18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication.
Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and rapid discovery of informative candidate regions.
Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Onda, M.; Kudo, S.; Fukuda, M.

Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.
Reference-Free Comparative Genomics of 174 Chloroplasts

PubMed Central

Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R.; Yu, Jun; Cannon, Charles H.

2012-01-01

Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ∼18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication.
Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and rapid discovery of informative candidate regions. PMID:23185288
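The idea of sorting shared sequence by how many genomes carry it can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the abstract applies the "tip" and "group" classes to assembled contigs, whereas the sketch applies them to raw kmers, adds a hypothetical "core" label for kmers present in every genome, ignores reverse complements, and uses invented toy sequences.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_kmers(genomes, k):
    """Label each kmer by how many genomes contain it.

    'tip'   = found in exactly one genome
    'group' = found in more than one genome but not all
    'core'  = found in every genome (label is an assumption of this sketch)
    """
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    n_genomes = len(genomes)
    labels = {}
    for kmer, names in owners.items():
        if len(names) == 1:
            labels[kmer] = "tip"
        elif len(names) < n_genomes:
            labels[kmer] = "group"
        else:
            labels[kmer] = "core"
    return labels


# Toy sequences, invented for illustration only.
toy_genomes = {"A": "ACGTAC", "B": "ACGTTT", "C": "ACGGGA"}
labels = classify_kmers(toy_genomes, k=3)
for kmer in sorted(labels):
    print(kmer, labels[kmer])
```

Restricting later assembly to the kmers labelled "tip" or "group" is what keeps the working set small, since sequence shared by every genome carries no comparative signal.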
Accuracy of genomic prediction using deregressed breeding values estimated from purebred and crossbred offspring phenotypes in pigs.

PubMed

Hidalgo, A M; Bastiaansen, J W M; Lopes, M S; Veroneze, R; Groenen, M A M; de Koning, D-J

2015-07-01

Genomic selection is applied to dairy cattle breeding to improve the genetic progress of purebred (PB) animals, whereas in pigs and poultry the target is a crossbred (CB) animal for which a different strategy appears to be needed. The source of information used to estimate the breeding values, i.e., using phenotypes of CB or PB animals, may affect the accuracy of prediction. The objective of our study was to assess the direct genomic value (DGV) accuracy of CB and PB pigs using different sources of phenotypic information. Data used were from 3 populations: 2,078 Dutch Landrace-based, 2,301 Large White-based, and 497 crossbreds from an F1 cross between the 2 lines. Two female reproduction traits were analyzed: gestation length (GLE) and total number of piglets born (TNB). Phenotypes used in the analyses originated from offspring of genotyped individuals. Phenotypes collected on CB and PB animals were analyzed as separate traits using a single-trait model. Breeding values were estimated separately for each trait in a pedigree BLUP analysis and subsequently deregressed. Deregressed EBV for each trait originating from different sources (CB or PB offspring) were used to study the accuracy of genomic prediction. Accuracy of prediction was computed as the correlation between DGV and the DEBV of the validation population. Accuracy of prediction within PB populations ranged from 0.43 to 0.62 across GLE and TNB. Accuracies to predict genetic merit of CB animals with one PB population in the training set ranged from 0.12 to 0.28, with the exception of using the CB offspring phenotype of the Dutch Landrace that resulted in an accuracy estimate around 0 for both traits. Accuracies to predict genetic merit of CB animals with both parental PB populations in the training set ranged from 0.17 to 0.30. 
We conclude that prediction within population and trait had good predictive ability regardless of the trait being the PB or CB performance, whereas using PB population(s) to predict genetic merit of CB animals had zero to moderate predictive ability. We observed that the DGV accuracy of CB animals when training on PB data was greater than or equal to training on CB data. However, when results are corrected for the different levels of reliabilities in the PB and CB training data, we showed that training on CB data does outperform PB data for the prediction of CB genetic merit, indicating that more CB animals should be phenotyped to increase the reliability and, consequently, accuracy of DGV for CB genetic merit.
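The accuracy measure used in this abstract, the correlation between DGV and DEBV in the validation population, can be sketched as a plain Pearson correlation. The values below are invented placeholders, and the function name is an assumption of this sketch; the weighting and reliability corrections mentioned in the abstract are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical validation animals: predicted DGV and deregressed EBV.
dgv = [0.2, -0.1, 0.4, 0.0, -0.3]
debv = [0.5, -0.4, 0.3, 0.1, -0.6]

accuracy = pearson(dgv, debv)
print(f"prediction accuracy (r) = {accuracy:.2f}")
```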
Genetic-molecular characterization of backcross generations for sexual conversion in papaya (Carica papaya L.).

PubMed

Ramos, H C C; Pereira, M G; Pereira, T N S; Barros, G B A; Ferreguetti, G A

2014-12-04

The low number of improved cultivars limits the expansion of the papaya crop, particularly because of the time required for the development of new varieties using classical procedures. Molecular techniques associated with conventional procedures accelerate this process and allow targeted improvements. Thus, we used microsatellite markers to perform genetic-molecular characterization of papaya genotypes obtained from 3 backcross generations to monitor the inbreeding level and parental genome proportion in the evaluated genotypes. Based on the analysis of 20 microsatellite loci, 77 genotypes were evaluated, 25 of each generation of the backcross program as well as the parental genotypes. The markers analyzed were identified in 11 of the 12 linkage groups established for papaya, ranging from 1 to 4 per linkage group. The average values for the inbreeding coefficient were 0.88 (BC1S4), 0.47 (BC2S3), and 0.63 (BC3S2). Genomic analysis revealed average values of the recurrent parent genome of 82.7% in BC3S2, 64.4% in BC1S4, and 63.9% in BC2S3. Neither the inbreeding level nor the genomic proportions completely followed the expected average values. This demonstrates the significance of molecular analysis when examining different genotype values, given the importance of such information for selection processes in breeding programs.
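The "expected average values" that the observations are compared against are not given in the abstract. A common textbook approximation, assumed here, is that the recurrent parent fraction after n backcrosses is 1 - (1/2)^(n+1), and that the inbreeding coefficient after t generations of selfing from a non-inbred starting point is 1 - (1/2)^t. The sketch below sets those expectations beside the observed averages quoted above; it ignores selection and linkage.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent-parent genome fraction after n backcrosses."""
    return 1 - 0.5 ** (n_backcrosses + 1)


def expected_inbreeding(n_selfings):
    """Expected inbreeding coefficient after n selfings (start assumed F = 0)."""
    return 1 - 0.5 ** n_selfings


# (backcrosses, selfings, observed recurrent fraction, observed inbreeding),
# observed values as quoted in the abstract.
generations = {
    "BC1S4": (1, 4, 0.644, 0.88),
    "BC2S3": (2, 3, 0.639, 0.47),
    "BC3S2": (3, 2, 0.827, 0.63),
}

for name, (n_bc, n_self, obs_rp, obs_f) in generations.items():
    exp_rp = expected_recurrent_fraction(n_bc)
    exp_f = expected_inbreeding(n_self)
    print(f"{name}: recurrent parent expected {exp_rp:.3f}, observed {obs_rp:.3f}; "
          f"inbreeding expected {exp_f:.3f}, observed {obs_f:.2f}")
```

Selfing does not change the expected recurrent parent fraction, which is why only the number of backcrosses enters the first formula.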
GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis.

PubMed

Gonzalez, Michael A; Lebrigio, Rafael F Acosta; Van Booven, Derek; Ulloa, Rick H; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schüle, Rebecca; Züchner, Stephan

2013-06-01

Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. © 2013 Wiley Periodicals, Inc.
Systematic quantification of HDR and NHEJ reveals effects of locus, nuclease, and cell type on genome-editing.

PubMed

Miyaoka, Yuichiro; Berman, Jennifer R; Cooper, Samantha B; Mayerl, Steven J; Chan, Amanda H; Zhang, Bin; Karlin-Neumann, George A; Conklin, Bruce R

2016-03-31

Precise genome-editing relies on the repair of sequence-specific nuclease-induced DNA nicking or double-strand breaks (DSBs) by homology-directed repair (HDR). However, nonhomologous end-joining (NHEJ), an error-prone repair, acts concurrently, reducing the rate of high-fidelity edits. The identification of genome-editing conditions that favor HDR over NHEJ has been hindered by the lack of a simple method to measure HDR and NHEJ directly and simultaneously at endogenous loci. To overcome this challenge, we developed a novel, rapid, digital PCR-based assay that can simultaneously detect one HDR or NHEJ event out of 1,000 copies of the genome. Using this assay, we systematically monitored genome-editing outcomes of CRISPR-associated protein 9 (Cas9), Cas9 nickases, catalytically dead Cas9 fused to FokI, and transcription activator-like effector nuclease at three disease-associated endogenous gene loci in HEK293T cells, HeLa cells, and human induced pluripotent stem cells. Although it is widely thought that NHEJ generally occurs more often than HDR, we found that more HDR than NHEJ was induced under multiple conditions. Surprisingly, the HDR/NHEJ ratios were highly dependent on gene locus, nuclease platform, and cell type. The new assay system, and our findings based on it, will enable mechanistic studies of genome-editing and help improve genome-editing technology.
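How a digital PCR readout turns into an HDR/NHEJ ratio can be sketched with the standard Poisson correction for partitioned assays. This is an assumption about the general technique, not the authors' published analysis: the function names and all partition counts are hypothetical, and a separate reference probe is assumed to count total genome copies.

```python
import math


def copies_per_partition(positive, total):
    """Poisson-corrected mean number of target copies per partition."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1 - positive / total)


def allele_frequency(event_positive, reference_positive, total):
    """Fraction of genome copies carrying the event (HDR or NHEJ)."""
    event = copies_per_partition(event_positive, total)
    reference = copies_per_partition(reference_positive, total)
    return event / reference


# Hypothetical partition counts, invented for illustration.
total_partitions = 20000
reference_positive = 8000
hdr_positive = 400
nhej_positive = 200

hdr_freq = allele_frequency(hdr_positive, reference_positive, total_partitions)
nhej_freq = allele_frequency(nhej_positive, reference_positive, total_partitions)
print(f"HDR {hdr_freq:.4f}, NHEJ {nhej_freq:.4f}, ratio {hdr_freq / nhej_freq:.2f}")
```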
Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

PubMed

Schaid, Daniel J

2010-01-01

Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
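The positive semidefinite requirement can be illustrated with a small linear kernel built from centered genotypes. This sketch is not taken from the review: the genotype matrix is invented, and the random quadratic-form check is only a spot check (a linear kernel is positive semidefinite by construction, since x'Kx is a scaled sum of squares).

```python
import random


def linear_kernel(genotypes):
    """K[i][j] = average over markers of the product of centered genotypes."""
    n_subjects = len(genotypes)
    n_markers = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n_subjects for j in range(n_markers)]
    centered = [[row[j] - means[j] for j in range(n_markers)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n_markers for cj in centered]
        for ci in centered
    ]


def quadratic_form(K, x):
    """Compute x' K x for a square matrix K given as nested lists."""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))


# Toy genotypes (subjects x markers), coded as 0/1/2 copies of an allele.
genotypes = [[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1]]
K = linear_kernel(genotypes)

# Spot check: x' K x should never be meaningfully negative.
rng = random.Random(0)
looks_psd = all(
    quadratic_form(K, [rng.uniform(-1, 1) for _ in K]) >= -1e-12
    for _ in range(1000)
)
print("symmetric:", all(K[i][j] == K[j][i] for i in range(3) for j in range(3)))
print("passes PSD spot check:", looks_psd)
```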
The (in)famous GWAS P-value threshold revisited and updated for low-frequency variants.

PubMed

Fadista, João; Manning, Alisa K; Florez, Jose C; Groop, Leif

2016-08-01

Genome-wide association studies (GWAS) have long relied on proposed statistical significance thresholds to be able to differentiate true positives from false positives. Although the genome-wide significance P-value threshold of 5 × 10⁻⁸ has become a standard for common-variant GWAS, it has not been updated to cope with the lower allele frequency spectrum used in many recent array-based GWAS studies and sequencing studies. Using a whole-genome- and -exome-sequencing data set of 2875 individuals of European ancestry from the Genetics of Type 2 Diabetes (GoT2D) project and a whole-exome-sequencing data set of 13 000 individuals from five ancestries from the GoT2D and T2D-GENES (Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples) projects, we describe guidelines for genome- and exome-wide association P-value thresholds needed to correct for multiple testing, explaining the impact of linkage disequilibrium thresholds for distinguishing independent variants, minor allele frequency and ancestry characteristics. We emphasize the advantage of studying recent genetic isolate populations when performing rare and low-frequency genetic association analyses, as the multiple testing burden is diminished due to higher genetic homogeneity.
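The conventional threshold is usually read as a Bonferroni correction of a 0.05 family-wise error rate for roughly one million independent common variants. The sketch below shows that arithmetic and how the threshold tightens as the number of independent tests grows; the larger test counts are hypothetical and are not the values recommended in the paper.

```python
def bonferroni_threshold(alpha, n_independent_tests):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_independent_tests


# About one million independent tests gives the conventional threshold.
conventional = bonferroni_threshold(0.05, 1_000_000)
print(f"{conventional:.1e}")

# Hypothetical counts of independent variants when lower-frequency
# variants are included (illustrative numbers only).
for n_tests in (2_000_000, 5_000_000, 10_000_000):
    print(n_tests, f"{bonferroni_threshold(0.05, n_tests):.1e}")
```

A more homogeneous population has more linkage disequilibrium and therefore fewer independent tests, which is the sense in which the multiple testing burden is diminished in isolates.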
Accuracy of genomic breeding values for meat tenderness in Polled Nellore cattle.

PubMed

Magnabosco, C U; Lopes, F B; Fragoso, R C; Eifert, E C; Valente, B D; Rosa, G J M; Sainz, R D

2016-07-01

Zebu (Bos indicus) cattle, mostly of the Nellore breed, comprise more than 80% of the beef cattle in Brazil, given their tolerance of the tropical climate and high resistance to ectoparasites. Despite their advantages for production in tropical environments, zebu cattle tend to produce tougher meat than Bos taurus breeds. Traditional genetic selection to improve meat tenderness is constrained by the difficulty and cost of phenotypic evaluation for meat quality. Therefore, genomic selection may be the best strategy to improve meat quality traits. This study was performed to compare the accuracies of different Bayesian regression models in predicting molecular breeding values for meat tenderness in Polled Nellore cattle. The data set was composed of Warner-Bratzler shear force (WBSF) of longissimus muscle from 205, 141, and 81 animals slaughtered in 2005, 2010, and 2012, respectively, which were selected and mated so as to create extreme segregation for WBSF. The animals were genotyped with either the Illumina BovineHD (HD; 777,000 from 90 samples) chip or the GeneSeek Genomic Profiler (GGP Indicus HD; 77,000 from 337 samples). The quality controls of SNP were Hardy-Weinberg proportion P-value ≥ 0.1%, minor allele frequency > 1%, and call rate > 90%. The FImpute program was used for imputation from the GGP Indicus HD chip to the HD chip. The effect of each SNP was estimated using ridge regression, least absolute shrinkage and selection operator (LASSO), Bayes A, Bayes B, and Bayes Cπ methods. Different numbers of SNP were used, with 1, 2, 3, 4, 5, 7, 10, 20, 40, 60, 80, or 100% of the markers preselected based on their significance test (P-value from genomewide association studies [GWAS]) or randomly sampled. The prediction accuracy was assessed by the correlation between genomic breeding value and the observed WBSF phenotype, using a leave-one-out cross-validation methodology.
The prediction accuracies using all markers were very similar across models, ranging from 0.22 (Bayes Cπ) to 0.25 (Bayes B). When preselecting SNP based on GWAS results, the highest correlation (0.27) between WBSF and the genomic breeding value was achieved using the Bayesian LASSO model with 15,030 (3%) markers. Although this study used relatively few animals, the design of the segregating population ensured wide genetic variability for meat tenderness, which was important to achieve acceptable accuracy of genomic prediction. Although all models showed similar levels of prediction accuracy, some small advantages were observed with the Bayes B approach when higher numbers of markers were preselected based on their P-values resulting from a GWAS analysis.
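Two of the SNP quality-control filters named in this abstract, minor allele frequency > 1% and call rate > 90%, are simple enough to sketch. The code below is an illustration under assumptions: genotypes are coded 0/1/2 with None for a missing call, the function names are hypothetical, the toy data are invented, and the Hardy-Weinberg filter is left out.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes (0/1/2 coding)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_qc(genotypes, min_maf=0.01, min_call_rate=0.90):
    """Apply the MAF > 1% and call rate > 90% filters from the abstract."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)


# Toy SNPs, invented for illustration only.
snps = {
    "snp_ok": [0, 1, 2, 1, 0, 1, 2, 1, 0, 1],
    "snp_monomorphic": [0] * 10,
    "snp_low_call_rate": [0, 1, 2, None],
}
kept = [name for name, calls in snps.items() if passes_qc(calls)]
print(kept)
```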
A high-density SNP genetic linkage map for the silver-lipped pearl oyster, Pinctada maxima: a valuable resource for gene localisation and marker-assisted selection.

PubMed

Jones, David B; Jerry, Dean R; Khatkar, Mehar S; Raadsma, Herman W; Zenger, Kyall R

2013-11-20

The silver-lipped pearl oyster, Pinctada maxima, is an important tropical aquaculture species extensively farmed for the highly sought "South Sea" pearls. Traditional breeding programs have been initiated for this species in order to select for improved pearl quality, but many economic traits under selection are complex, polygenic and confounded with environmental factors, limiting the accuracy of selection. The incorporation of a marker-assisted selection (MAS) breeding approach would greatly benefit pearl breeding programs by allowing the direct selection of genes responsible for pearl quality. However, before MAS can be incorporated, substantial genomic resources such as genetic linkage maps need to be generated. The construction of a high-density genetic linkage map for P. maxima is not only essential for unravelling the genomic architecture of complex pearl quality traits, but also provides indispensable information on the genome structure of pearl oysters. A total of 1,189 informative genome-wide single nucleotide polymorphisms (SNPs) were incorporated into linkage map construction. The final linkage map consisted of 887 SNPs in 14 linkage groups, spans a total genetic distance of 831.7 centimorgans (cM), and covers an estimated 96% of the P. maxima genome. Assessment of sex-specific recombination across all linkage groups revealed limited overall heterochiasmy between the sexes (i.e. 1.15:1 F/M map length ratio). However, there were pronounced localised differences throughout the linkage groups, whereby male recombination was suppressed near the centromeres compared to female recombination, but inflated towards telomeric regions. Mean values of LD for adjacent SNP pairs suggest that a higher density of markers will be required for powerful genome-wide association studies. Finally, numerous nacre biomineralization genes were localised providing novel positional information for these genes. 
This high-density SNP genetic map is the first comprehensive linkage map for any pearl oyster species. It provides an essential genomic tool facilitating studies investigating the genomic architecture of complex trait variation and identifying quantitative trait loci for economically important traits useful in genetic selection programs within the P. maxima pearling industry. Furthermore, this map provides a foundation for further research aiming to improve our understanding of the dynamic process of biomineralization, and pearl oyster evolution and synteny.
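The marker density implied by the reported map can be worked out from the figures in the abstract (887 SNPs, 14 linkage groups, 831.7 cM). The sketch below assumes every marker sits at a distinct position, so a linkage group with k markers contributes k - 1 intervals; that assumption is not stated in the source and makes the result a lower bound on the true mean interval.

```python
def mean_interval_cm(map_length_cm, n_markers, n_linkage_groups):
    """Average distance between adjacent markers across all linkage groups."""
    n_intervals = n_markers - n_linkage_groups
    return map_length_cm / n_intervals


spacing = mean_interval_cm(831.7, 887, 14)
print(f"mean interval between adjacent SNPs: {spacing:.2f} cM")
```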
Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies

PubMed Central

Kuo, Chia-Ling; Vsevolozhskaya, Olga A.; Zaykin, Dmitri V.

2015-01-01

Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller P-value is required to be deemed significant. However, a small P-value is not equivalent to small chances of a spurious finding and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity of P-values and the automated way they are, as a rule, produced by software packages. Attempts to design simple ways to convert an association P-value into the probability that a finding is spurious have been met with difficulties. The False Positive Report Probability (FPRP) method has gained increasing popularity. However, FPRP is not designed to estimate the probability for a particular finding, because it is defined for an entire region of hypothetical findings with P-values at least as small as the one observed for that finding. Here we propose a method that lets researchers extract probability that a finding is spurious directly from a P-value. Considering the counterpart of that probability, we term this method POFIG: the Probability that a Finding is Genuine. Our approach shares FPRP's simplicity, but gives a valid probability that a finding is spurious given a P-value. In addition to straightforward interpretation, POFIG has desirable statistical properties. The POFIG average across a set of tentative associations provides an estimated proportion of false discoveries in that set. POFIGs are easily combined across studies and are immune to multiple testing and selection bias. We illustrate an application of POFIG method via analysis of GWAS associations with Crohn's disease. PMID:25955023
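The abstract does not give the POFIG formula, so it is not reproduced here. For context, the FPRP that it is contrasted with is commonly written as α(1 − π) / (α(1 − π) + (1 − β)π), where α is the significance level, 1 − β the power, and π the prior probability that the association is real. The sketch below implements that expression with hypothetical inputs.

```python
def fprp(alpha, power, prior):
    """False Positive Report Probability for findings with P <= alpha.

    alpha : significance level (the observed P-value is often plugged in)
    power : probability of reaching P <= alpha when the association is real
    prior : prior probability that the association is real
    """
    false_positives = alpha * (1 - prior)
    true_positives = power * prior
    return false_positives / (false_positives + true_positives)


# Hypothetical inputs: the same significance level under two priors.
print(f"{fprp(0.05, 0.8, 0.5):.3f}")
print(f"{fprp(0.05, 0.8, 0.001):.3f}")
```

Because alpha enters as a cutoff rather than as the exact observed P-value, the result describes the whole region of findings at least that extreme, which is the limitation the abstract points out.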
Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition.

PubMed

Rangannan, Vetriselvi; Bansal, Manju

2009-12-01

The rapid increase in genome sequence information has necessitated the annotation of their functional elements, particularly those occurring in the non-coding regions, in the genomic context. The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed, but it is difficult to determine experimentally. Hence an in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. In this analysis, we demonstrate that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. We have therefore obtained a set of free energy threshold values, for genomic DNA with varying GC content and used them as generic criteria for predicting promoter regions in several microbial genomes, using an in-house developed tool PromPredict. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (50.8% GC) and B. subtilis (43.5% GC), sensitivity of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC) a sensitivity of 100% and precision of 49% were obtained.
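The quantities reported here, GC content, sensitivity, and precision, follow standard definitions that can be sketched directly. The counts below are hypothetical round numbers chosen to reproduce a 99% sensitivity and 58% precision; they are not the study's actual counts, and the free energy calculation used by PromPredict is not attempted.

```python
def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def sensitivity(true_positives, false_negatives):
    """Share of real TSSs that have a predicted promoter."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Share of predicted promoters that correspond to a real TSS."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, invented for illustration only.
print(gc_content("GGCCAATT"))
print(sensitivity(true_positives=99, false_negatives=1))
print(precision(true_positives=58, false_positives=42))
```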
Strand-specific transcriptome profiling with directly labeled RNA on genomic tiling microarrays

PubMed Central

2011-01-01

Background: With lower manufacturing cost, high spot density, and flexible probe design, genomic tiling microarrays are ideal for comprehensive transcriptome studies. Typically, transcriptome profiling using microarrays involves reverse transcription, which converts RNA to cDNA. The cDNA is then labeled and hybridized to the probes on the arrays, thus the RNA signals are detected indirectly. Reverse transcription is known to generate artifactual cDNA, in particular the synthesis of second-strand cDNA, leading to false discovery of antisense RNA. To address this issue, we have developed an effective method using RNA that is directly labeled, thus by-passing the cDNA generation. This paper describes this method and its application to the mapping of transcriptome profiles. Results: RNA extracted from laboratory cultures of Porphyromonas gingivalis was fluorescently labeled with an alkylation reagent and hybridized directly to probes on genomic tiling microarrays specifically designed for this periodontal pathogen. The generated transcriptome profile was strand-specific and produced signals close to background level in most antisense regions of the genome. In contrast, high levels of signal were detected in the antisense regions when the hybridization was done with cDNA. Five antisense areas were tested with independent strand-specific RT-PCR and none to negligible amplification was detected, indicating that the strong antisense cDNA signals were experimental artifacts. Conclusions: An efficient method was developed for mapping transcriptome profiles specific to both coding strands of a bacterial genome. This method chemically labels and uses extracted RNA directly in microarray hybridization. The generated transcriptome profile was free of cDNA artifactual signals.
In addition, this method requires fewer processing steps and is potentially more sensitive in detecting small amounts of RNA compared to conventional end-labeling methods due to the incorporation of more fluorescent molecules per RNA fragment. PMID:21235785
[The application of CRISPR/Cas9 genome editing technology in cancer research].

PubMed

Wang, Da-yong; Ma, Ning; Hui, Yang; Gao, Xu

2016-01-01

The CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein-9 nuclease) genome editing technology has become more and more popular in gene editing because of its simple design and easy operation. Using the CRISPR/Cas9 system, researchers can perform site-directed genome modification at the base level. Moreover, it has been widely used in genome editing in multiple species and related cancer research. In this review, we summarize the application of the CRISPR/Cas9 system in cancer research based on the latest research progresses as well as our understanding of cancer research and genome editing techniques.
Replicative Intermediates of Human Papillomavirus Type 11 in Laryngeal Papillomas: Site of Replication Initiation and Direction of Replication

NASA Astrophysics Data System (ADS)

Auborn, K. J.; Little, R. D.; Platt, T. H. K.; Vaccariello, M. A.; Schildkraut, C. L.

1994-07-01

We have examined the structures of replication intermediates from the human papillomavirus type 11 genome in DNA extracted from papilloma lesions (laryngeal papillomas). The sites of replication initiation and termination utilized in vivo were mapped by using neutral/neutral and neutral/alkaline two-dimensional agarose gel electrophoresis methods. Initiation of replication was detected in or very close to the upstream regulatory region (URR; the noncoding, regulatory sequences upstream of the open reading frames in the papillomavirus genome). We also show that replication forks proceed bidirectionally from the origin and converge 180° opposite the URR. These results demonstrate the feasibility of analysis of replication of viral genomes directly from infected tissue.
Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications.

PubMed

Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang

2012-06-15

Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication. 
This is particularly true for the zebrafish genome. Further analysis of the duplicated gene sets indicated that a significant portion of duplicated genes in the zebrafish genome were of recent, lineage-specific duplication events. Most strikingly, a subset of duplicated genes is enriched among the recently duplicated genes involved in immune or sensory response pathways. Such findings demonstrated the significance of continuous gene duplication as well as that of whole genome duplication in the course of genome evolution.
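The distinction between tandem, intra-chromosomal, and other duplicates can be sketched as a rule on gene positions. The abstract does not state the criteria used, so everything here is an assumption: genes are represented as (chromosome, gene order index) pairs, the `max_intervening` cutoff is a hypothetical parameter, and "inter-chromosomal" is a label added for completeness.

```python
def classify_pair(gene_a, gene_b, max_intervening=10):
    """Classify a duplicated gene pair by relative chromosomal position.

    Each gene is (chromosome, index), where index is the gene's rank along
    the chromosome. max_intervening is a hypothetical cutoff on the number
    of genes allowed between two tandem duplicates.
    """
    chrom_a, index_a = gene_a
    chrom_b, index_b = gene_b
    if chrom_a != chrom_b:
        return "inter-chromosomal"
    if abs(index_a - index_b) - 1 <= max_intervening:
        return "tandem"
    return "intra-chromosomal"


# Toy gene pairs, invented for illustration only.
pairs = [
    (("chr1", 5), ("chr1", 6)),
    (("chr1", 5), ("chr1", 500)),
    (("chr1", 5), ("chr2", 6)),
]
for a, b in pairs:
    print(a, b, classify_pair(a, b))
```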
Ancient bacterial endosymbionts of insects: Genomes as sources of insight and springboards for inquiry.

PubMed

Wernegreen, Jennifer J

2017-09-15

Ancient associations between insects and bacteria provide models to study intimate host-microbe interactions. Currently, a wealth of genome sequence data for long-term, obligately intracellular (primary) endosymbionts of insects reveals profound genomic consequences of this specialized bacterial lifestyle. Those consequences include severe genome reduction and extreme base compositions. This minireview highlights the utility of genome sequence data to understand how, and why, endosymbionts have been pushed to such extremes, and to illuminate the functional consequences of such extensive genome change. While the static snapshots provided by individual endosymbiont genomes are valuable, comparative analyses of multiple genomes have shed light on evolutionary mechanisms. Namely, genome comparisons have told us that selection is important in fine-tuning gene content, but at the same time, mutational pressure and genetic drift contribute to genome degradation. Examples from Blochmannia, the primary endosymbiont of the ant tribe Camponotini, illustrate the value and constraints of genome sequence data, and exemplify how genomes can serve as a springboard for further comparative and experimental inquiry. Copyright © 2017. Published by Elsevier Inc.
Detection of a divergent variant of grapevine virus F by next-generation sequencing.

PubMed

Molenaar, Nicholas; Burger, Johan T; Maree, Hans J

2015-08-01

The complete genome sequence of a South African isolate of grapevine virus F (GVF) is presented. It was first detected by metagenomic next-generation sequencing of field samples and validated through direct Sanger sequencing. The genome sequence of GVF isolate V5 consists of 7539 nucleotides and contains a poly(A) tail. It has a typical vitivirus genome arrangement that comprises five open reading frames (ORFs), which share only 88.96 % nucleotide sequence identity with the existing complete GVF genome sequence (JX105428).

Non-Homologous End Joining and Homology Directed DNA Repair Frequency of Double-Stranded Breaks Introduced by Genome Editing Reagents.

PubMed

Zaboikin, Michail; Zaboikina, Tatiana; Freter, Carl; Srinivasakumar, Narasimhachar

2017-01-01

Genome editing using transcription-activator like effector nucleases or RNA guided nucleases allows one to precisely engineer desired changes within a given target sequence. The genome editing reagents introduce double stranded breaks (DSBs) at the target site which can then undergo DNA repair by non-homologous end joining (NHEJ) or homology directed recombination (HDR) when a template DNA molecule is available. NHEJ repair results in indel mutations at the target site. As PCR amplified products from mutant target regions are likely to exhibit different melting profiles than PCR products amplified from wild type target region, we designed a high resolution melting analysis (HRMA) for rapid identification of efficient genome editing reagents. We also designed TaqMan assays using probes situated across the cut site to discriminate wild type from mutant sequences present after genome editing. The experiments revealed that the sensitivity of the assays to detect NHEJ-mediated DNA repair could be enhanced by selection of transfected cells to reduce the contribution of unmodified genomic DNA from untransfected cells to the DNA melting profile. The presence of donor template DNA lacking the target sequence at the time of genome editing further enhanced the sensitivity of the assays for detection of mutant DNA molecules by excluding the wild-type sequences modified by HDR. A second TaqMan probe that bound to an adjacent site, outside of the primary target cut site, was used to directly determine the contribution of HDR to DNA repair in the presence of the donor template sequence. The TaqMan qPCR assay, designed to measure the contribution of NHEJ and HDR in DNA repair, corroborated the results from HRMA. The data indicated that genome editing reagents can produce DSBs at high efficiency in HEK293T cells but a significant proportion of these are likely masked by reversion to wild type as a result of HDR. 
Supplying a donor plasmid to provide a template for HDR (that eliminates a PCR amplifiable target) revealed these cryptic DSBs and facilitated the determination of the true efficacy of genome editing reagents. The results indicated that in HEK293T cells, approximately 40% of the DSBs introduced by genome editing were available for participation in HDR.
Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

PubMed Central

2013-01-01

Background Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable.
The high marker density allowed identification of genomic regions with segregation distortion in R. idaeus, which may help to identify deleterious alleles that are the basis of inbreeding depression in the species. PMID:23324311
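As a rough illustration of what "highly saturated" means for the two parental maps, the marker counts and map lengths quoted in the abstract can be turned into an average marker spacing. This is only a back-of-the-envelope sketch: the labels parent_1 and parent_2 are hypothetical names, and the calculation assumes markers are spread evenly along the map, which the abstract does not claim.

```python
# Marker counts and map lengths (cM) as quoted in the abstract.
# The keys "parent_1" and "parent_2" are hypothetical labels for the two parental maps.
parental_maps = {
    "parent_1": {"markers": 4521, "length_cm": 462.7},
    "parent_2": {"markers": 2391, "length_cm": 376.6},
}


def mean_spacing_cm(markers, length_cm):
    """Average map distance per marker, assuming evenly spaced markers (a simplification)."""
    return length_cm / markers


spacings = {name: mean_spacing_cm(**m) for name, m in parental_maps.items()}

for name, spacing in spacings.items():
    print(f"{name}: about {spacing:.3f} cM per marker")
```

Both averages come out well below 1 cM per marker. In practice many markers are likely to co-segregate into the same map position, so the number of distinct positions would be smaller than these averages suggest.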
Human genetics and genomics a decade after the release of the draft sequence of the human genome.

PubMed

Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

2011-10-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.
Human genetics and genomics a decade after the release of the draft sequence of the human genome

PubMed Central

2011-01-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605
Genome research elucidating environmental adaptation: Dark-fly project as a case study.

PubMed

Fuse, Naoyuki

2017-08-01

Organisms have the capacity to adapt to diverse environments, and environmental adaptation is a substantial driving force of evolution. Recent progress of genome science has addressed the genetic mechanisms underlying environmental adaptation. Whole genome sequencing has identified adaptive genes selected under particular environments. Genome editing technology enables us to directly test the role(s) of a gene in environmental adaptation. Genome science has also shed light on a unique organism, Dark-fly, which has been reared long-term in the dark. We determined the whole genome sequence of Dark-fly and reenacted environmental selections of the Dark-fly genome to identify the genes related to dark-adaptation. Here I will give an overview of current progress in genome science and summarize our study using Dark-fly, as a case study for environmental adaptation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Microbial genome mining for accelerated natural products discovery: is a renaissance in the making?

PubMed

Bachmann, Brian O; Van Lanen, Steven G; Baltz, Richard H

2014-02-01

Microbial genome mining is a rapidly developing approach to discover new and novel secondary metabolites for drug discovery. Many advances have been made in the past decade to facilitate genome mining, and these are reviewed in this Special Issue of the Journal of Industrial Microbiology and Biotechnology. In this Introductory Review, we discuss the concept of genome mining and why it is important for the revitalization of natural product discovery; what microbes show the most promise for focused genome mining; how microbial genomes can be mined; how genome mining can be leveraged with other technologies; how progress on genome mining can be accelerated; and who should fund future progress in this promising field. We direct interested readers to more focused reviews on the individual topics in this Special Issue for more detailed summaries on the current state-of-the-art.
Drug target inference through pathway analysis of genomics data

PubMed Central

Ma, Haisu; Zhao, Hongyu

2013-01-01

Statistical modeling coupled with bioinformatics is commonly used for drug discovery. Although there exist many approaches for single target based drug design and target inference, recent years have seen a paradigm shift to system-level pharmacological research. Pathway analysis of genomics data represents one promising direction for computational inference of drug targets. This article aims at providing a comprehensive review of the evolving issues in this field, covering methodological developments, their pros and cons, as well as future research directions. PMID:23369829
Toxicological effects of benzo[a]pyrene on DNA methylation of whole genome in ICR mice.

PubMed

Zhao, L; Zhang, S; An, X; Tan, W; Pang, D; Ouyang, H

2015-10-30

It is well known that alterations in DNA methylation, an important regulator of gene transcription, lead to cancer. Therefore, a change in the level of whole-genome DNA methylation has been considered a biomarker of carcinogenesis. Previously, a large number of experimental results in genetic toxicology have shown that benzo[a]pyrene could cause DNA mutation and fragmentation. However, there have been few or no studies on alterations in genomic DNA methylation that result directly from exposure to benzo[a]pyrene. In this paper, possible mechanisms of alterations in whole genomic DNA methylation by benzo[a]pyrene were investigated using ICR mice after benzo[a]pyrene exposure. The blood, liver, pancreas, skin, lung and bladder of ICR mice were removed and examined after a fixed time interval (6 hours) of benzo[a]pyrene exposure, and the whole genomic DNA methylation level was determined by high performance liquid chromatography (HPLC). The results exhibited tissue specificity: the level of whole genomic DNA methylation decreased significantly in blood and liver, but not in pancreas, lung, skin and bladder of ICR mice. This study investigated the direct relationship between aberrant DNA methylation level and benzo[a]pyrene exposure, which might help to clarify the toxicological mechanism of benzo[a]pyrene from an epigenetic perspective.
Comparative Genomics of Completely Sequenced Lactobacillus helveticus Genomes Provides Insights into Strain-Specific Genes and Resolves Metagenomics Data Down to the Strain Level.

PubMed

Schmid, Michael; Muri, Jonathan; Melidis, Damianos; Varadarajan, Adithi R; Somerville, Vincent; Wicki, Adrian; Moser, Aline; Bourqui, Marc; Wenzel, Claudia; Eugster-Meier, Elisabeth; Frey, Juerg E; Irmler, Stefan; Ahrens, Christian H

2018-01-01

Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences' long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus, to our knowledge, identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories "cell wall/membrane biogenesis" and "defense mechanisms" were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes.
Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level.
Comparative Genomics of Completely Sequenced Lactobacillus helveticus Genomes Provides Insights into Strain-Specific Genes and Resolves Metagenomics Data Down to the Strain Level

PubMed Central

Schmid, Michael; Muri, Jonathan; Melidis, Damianos; Varadarajan, Adithi R.; Somerville, Vincent; Wicki, Adrian; Moser, Aline; Bourqui, Marc; Wenzel, Claudia; Eugster-Meier, Elisabeth; Frey, Juerg E.; Irmler, Stefan; Ahrens, Christian H.

2018-01-01

Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences' long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus—to our knowledge—identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories “cell wall/membrane biogenesis” and “defense mechanisms” were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes. 
Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level. PMID:29441050
Academic-industrial partnerships in drug discovery in the age of genomics.

PubMed

Harris, Tim; Papadopoulos, Stelios; Goldstein, David B

2015-06-01

Many US FDA-approved drugs have been developed through productive interactions between the biotechnology industry and academia. Technological breakthroughs in genomics, in particular large-scale sequencing of human genomes, are creating new opportunities to understand the biology of disease and to identify high-value targets relevant to a broad range of disorders. However, the scale of the work required to appropriately analyze large genomic and clinical data sets is challenging industry to develop a broader view of what areas of work constitute precompetitive research. Copyright © 2015 Elsevier Ltd. All rights reserved.
Genomes: At the edge of chaos with maximum information capacity

NASA Astrophysics Data System (ADS)

Kong, Sing-Guan; Chen, Hong-Da; Torda, Andrew; Lee, H. C.

2016-12-01

We propose an order index, ϕ, which quantifies the notion of “life at the edge of chaos” when applied to genome sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length and base composition. The 786 complete genomic sequences in GenBank were found to have ϕ values in a very narrow range, 0.037 ± 0.027. We show this implies that genomes are halfway towards being completely random, namely, at the edge of chaos. We argue that this narrow range represents the neighborhood of a fixed-point in the space of sequences, and genomes are driven there by the dynamics of a robust, predominantly neutral evolution process.
Evidence for contemporary plant mitoviruses

USDA-ARS's Scientific Manuscript database

Mitoviruses have small RNA(+) genomes, replicate in mitochondria, and have to date been directly shown to infect only fungi. For this report, sequences that appear to represent approximately complete mitovirus genomes were discovered in plant transcriptome data at GenBank. At least 17 of the refined...
Thermodynamic Basis for the Emergence of Genomes during Prebiotic Evolution

DTIC Science & Technology

2012-05-01

Thermodynamic Basis for the Emergence of Genomes during Prebiotic Evolution Hyung-June Woo, Ravi Vijaya Satya, Jaques Reifman* DoD Biotechnology High...polymerases are above, near, and below a critical point, respectively. The prebiotic evolution therefore must have crossed this critical region. Over...among many potential oligomers capable of templated replication, RNAs may have evolved to form prebiotic genomes due to the value of their nonenzymatic
Is junk DNA bunk? A critique of ENCODE.

PubMed

Doolittle, W Ford

2013-04-02

Do data from the Encyclopedia Of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE's ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value then, (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE's definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed.
Is junk DNA bunk? A critique of ENCODE

PubMed Central

Doolittle, W. Ford

2013-01-01

Do data from the Encyclopedia Of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE’s ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value then, (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE’s definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed. PMID:23479647
The role of parasite-driven selection in shaping landscape genomic structure in red grouse (Lagopus lagopus scotica).

PubMed

Wenzel, Marius A; Douglas, Alex; James, Marianne C; Redpath, Steve M; Piertney, Stuart B

2016-01-01

Landscape genomics promises to provide novel insights into how neutral and adaptive processes shape genome-wide variation within and among populations. However, there has been little emphasis on examining whether individual-based phenotype-genotype relationships derived from approaches such as genome-wide association (GWAS) manifest themselves as a population-level signature of selection in a landscape context. The two may prove irreconcilable as individual-level patterns become diluted by high levels of gene flow and complex phenotypic or environmental heterogeneity. We illustrate this issue with a case study that examines the role of the highly prevalent gastrointestinal nematode Trichostrongylus tenuis in shaping genomic signatures of selection in red grouse (Lagopus lagopus scotica). Individual-level GWAS involving 384 SNPs has previously identified five SNPs that explain variation in T. tenuis burden. Here, we examine whether these same SNPs display population-level relationships between T. tenuis burden and genetic structure across a small-scale landscape of 21 sites with heterogeneous parasite pressure. Moreover, we identify adaptive SNPs showing signatures of directional selection using F(ST) outlier analysis and relate population- and individual-level patterns of multilocus neutral and adaptive genetic structure to T. tenuis burden. The five candidate SNPs for parasite-driven selection were neither associated with T. tenuis burden on a population level, nor under directional selection. Similarly, there was no evidence of parasite-driven selection in SNPs identified as candidates for directional selection. We discuss these results in the context of red grouse ecology and highlight the broader consequences for the utility of landscape genomics approaches for identifying signatures of selection. © 2015 John Wiley & Sons Ltd.
Immortalization of normal human mammary epithelial cells in two steps by direct targeting of senescence barriers does not require gross genomic alterations

DOE PAGES

Garbe, James C.; Vrba, Lukas; Sputova, Klara; ...

2014-10-29

Telomerase reactivation and immortalization are critical for human carcinoma progression. However, little is known about the mechanisms controlling this crucial step, due in part to the paucity of experimentally tractable model systems that can examine human epithelial cell immortalization as it might occur in vivo. We achieved efficient non-clonal immortalization of normal human mammary epithelial cells (HMEC) by directly targeting the 2 main senescence barriers encountered by cultured HMEC. The stress-associated stasis barrier was bypassed using shRNA to p16INK4; replicative senescence due to critically shortened telomeres was bypassed in post-stasis HMEC by c-MYC transduction. Thus, 2 pathologically relevant oncogenic agents are sufficient to immortally transform normal HMEC. The resultant non-clonal immortalized lines exhibited normal karyotypes. Most human carcinomas contain genomically unstable cells, with widespread instability first observed in vivo in pre-malignant stages; in vitro, instability is seen as finite cells with critically shortened telomeres approach replicative senescence. Our results support our hypotheses that: (1) telomere-dysfunction induced genomic instability in pre-malignant finite cells may generate the errors required for telomerase reactivation and immortalization, as well as many additional “passenger” errors carried forward into resulting carcinomas; (2) genomic instability during cancer progression is needed to generate errors that overcome tumor suppressive barriers, but not required per se; bypassing the senescence barriers by direct targeting eliminated a need for genomic errors to generate immortalization. Achieving efficient HMEC immortalization, in the absence of “passenger” genomic errors, should facilitate examination of telomerase regulation during human carcinoma progression, and exploration of agents that could prevent immortalization.
Scanning the human genome at kilobase resolution.

PubMed

Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming

2008-05-01

Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
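The computational side of the DGS steps listed above (fragmenting a reference sequence at restriction sites, taking a tag from each end of every fragment, and using the resulting reference ditags as a lookup table) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the toy reference sequence, the recognition site and the tag length are all assumptions, and the cut is placed at the start of the recognition site, whereas a real enzyme cuts at an enzyme-specific offset.

```python
import re


def digest(sequence, site):
    """Split `sequence` at the start of each occurrence of `site`.

    Returns (start, fragment) pairs. Simplification: the cut is placed at the
    first base of the recognition site rather than at the enzyme's true offset.
    """
    inner = [m.start() for m in re.finditer(re.escape(site), sequence) if m.start() > 0]
    cuts = [0] + inner + [len(sequence)]
    return [(a, sequence[a:b]) for a, b in zip(cuts, cuts[1:]) if b > a]


def make_ditag(fragment, tag_len):
    """Join the first and last `tag_len` bases of a fragment; None if the fragment is too short."""
    if len(fragment) < 2 * tag_len:
        return None
    return fragment[:tag_len] + fragment[-tag_len:]


def reference_ditags(sequence, site, tag_len):
    """Index every reference ditag by the (start, fragment length) it comes from."""
    index = {}
    for start, fragment in digest(sequence, site):
        ditag = make_ditag(fragment, tag_len)
        if ditag is not None:
            index.setdefault(ditag, []).append((start, len(fragment)))
    return index


# Toy reference with two GATC sites (hypothetical data, for illustration only).
reference = "AAAACCGATCTTTTGGGGCCCCGATCAAAATTTTCC"
index = reference_ditags(reference, site="GATC", tag_len=4)

# An experimentally observed ditag is mapped by simple lookup.
print(index.get("GATCCCCC", []))
print(index.get("GATCAAAA", []))
```

In this toy case the first lookup returns the origin of the fragment and the second returns an empty list. An observed ditag with no match in the reference index is the kind of signal that could point to an insertion, deletion or rearrangement between the two tags, though a real analysis would also have to allow for sequencing errors and repeated sequences.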
The cost-benefit of genomic testing of heifers and using sexed semen in pasture-based dairy herds.

PubMed

Newton, J E; Hayes, B J; Pryce, J E

2018-07-01

Recent improvements in dairy cow fertility and female reproductive technologies offer an opportunity to apply greater selection pressure to females. This means there may be greater incentive to obtain genomic breeding values for females. We modeled the impact of changes to key parameters on the net benefit from genomic testing of heifer calves with and without usage of sexed semen. This paper builds on earlier cost-benefit studies but uses parameters relevant to pasture-based systems. A deterministic model was used to evaluate the effect on net benefit due to changes in (1) reproduction rate, (2) genomic test costs, (3) availability of parent-derived breeding values (EBV_PA), and (4) replacement rate. When the use of sexed semen was included, we also considered (1) the proportion of heifers and cows mated to sexed semen, (2) decreases in conception rate in inseminations with sexed semen, and (3) the marginal return for surplus heifers. Scenarios with lower replacement rates and no availability of EBV_PA had the largest net benefits. Under current Australian parameters, the net benefit of genomic testing realized over the lifetime of genotyped heifers is expected to range from A$204 to A$1,124 per 100 cows for a herd with median reproductive performance. The cost of a genomic test, a perceived barrier to many farmers, had only a small effect on net benefit. Genomic testing alone was always more profitable than using sexed semen and genomic testing together if the only benefit considered was increased genetic gain in heifer replacements. When other benefits (i.e., the higher sale price of a surplus heifer compared with a male calf) were considered, there were combinations of parameters where net benefit from using sexed semen and genomic testing was higher than the equivalent scenario with genomic testing only.
Using sexed semen alongside genomic testing is most likely to be profitable when (1) used in heifers, (2) the marginal return for selling surplus heifers (sale price minus rearing costs) is greater than A$400, and (3) conception rates of no more than 10 percentage points lower than those achieved using conventional semen can be realized. Net benefit was highly dependent on the marginal return. Demonstrating that the initial investment in genomic testing can be recouped within the lifetime of the heifers tested may assist in the development of extension messages to explain the value of genomic testing females at the herd level. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
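The three conditions under which the abstract finds sexed semen plus genomic testing most likely to be profitable can be written as a simple screening rule. The sketch below encodes only those stated thresholds; it is not the authors' deterministic model, the function and parameter names are hypothetical, and the example prices and conception rates are invented for illustration.

```python
def marginal_return(sale_price, rearing_cost):
    """Marginal return for a surplus heifer, defined in the abstract as sale price minus rearing costs (A$)."""
    return sale_price - rearing_cost


def sexed_semen_likely_profitable(used_in_heifers, sale_price, rearing_cost,
                                  conception_conventional_pct, conception_sexed_pct):
    """Check the three conditions quoted in the abstract.

    (1) sexed semen is used in heifers,
    (2) marginal return for surplus heifers is greater than A$400,
    (3) conception rate with sexed semen is no more than 10 percentage
        points below that achieved with conventional semen.
    """
    conception_drop = conception_conventional_pct - conception_sexed_pct
    return (used_in_heifers
            and marginal_return(sale_price, rearing_cost) > 400
            and conception_drop <= 10)


# Invented example inputs: A$1,500 sale price, A$1,000 rearing cost,
# 60% conception with conventional semen, 52% with sexed semen.
favourable = sexed_semen_likely_profitable(
    used_in_heifers=True, sale_price=1500, rearing_cost=1000,
    conception_conventional_pct=60, conception_sexed_pct=52)
print(favourable)
```

Because the abstract notes that net benefit was highly dependent on the marginal return, the second condition is the one most worth varying when exploring scenarios with a rule like this.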

Recommendations from the EGAPP Working Group: genomic profiling to assess cardiovascular risk to improve cardiovascular health.

PubMed

2010-12-01

The Evaluation of Genomic Applications in Practice and Prevention Working Group (EWG) found insufficient evidence to recommend testing for the 9p21 genetic variant or 57 other variants in 28 genes to assess risk for cardiovascular disease (CVD) in the general population, specifically heart disease and stroke. The EWG found that the magnitude of net health benefit from use of any of these tests alone or in combination is negligible. The EWG discourages clinical use unless further evidence supports improved clinical outcomes. Based on the available evidence, the overall certainty of net health benefit is deemed "Low." It has been suggested that an improvement in CVD risk classification (adjusting intermediate risk of CVD into high- or low-risk categories) might lead to management changes (e.g., earlier initiation or higher rates of medical interventions, or targeted recommendations for behavioral change) that improve CVD outcomes. In the absence of direct evidence to support this possibility, this review sought indirect evidence aimed at documenting the extent to which genomic profiling alters CVD risk estimation, alone and in combination with traditional risk factors, and the extent to which risk reclassification improves health outcomes. Assay-related evidence on available genomic profiling tests was deemed inadequate. However, based on existing technologies that have been or may be used and on data from two of the companies performing such testing, the analytic sensitivity and specificity of tests for individual gene variants might be at least satisfactory. Twenty-nine gene candidates were evaluated, with 58 different gene variant/disease associations. Evidence on clinical validity was rated inadequate for 34 of these associations (59%) and adequate for 23 (40%). Inadequate grades were based on limited evidence, poor replication, existence of possible biases, or combinations of these factors. 
For heart disease (25 combined associations) and stroke (13 combined associations), profiling provided areas under the receiver operator characteristics curve of 66% and 57%, respectively. Only the association of 9p21 variants with heart disease had convincing evidence of a per-allele odds ratio of between 1.2 and 1.3; this was the highest effect size for any variant/disease combination with at least adequate evidence. Although the 9p21 association seems to be independent of traditional risk factors, there is adequate evidence that the improvement in risk prediction is, at best, small. Clinical utility was not formally evaluated in any of the studies reported to date, including for 9p21. As a result, no evidence was available on the balance of benefits and harms. Also, there was no direct evidence available to assess the health benefits and harms of adding these markers to traditional risk factors (e.g., Framingham Risk Score). However, the estimated additional benefit from adding genomic markers to traditional risk factors was found to be negligible. Prevention of CVD is a public health priority. Improvements in outcomes associated with genomic profiling could have important impacts. Traditional risk factors such as those used in the Framingham Risk Scores have an advantage in clinical screening and risk assessment strategies because they measure the actual targets for therapy (e.g., lipid levels and blood pressure). To add value, genomic testing should lead to better outcomes than those achievable by assessment and treatment of traditional risk factors alone. 
Some issues important for clinical utility remain unknown, such as the biological mechanism underlying the most convincing marker's (9p21) association with CVD; the level of risk that changes intervention; whether long-term disease outcomes will improve; how individuals ordering direct to consumer tests will understand/respond to test results and interact with the health care system; and whether direct to consumer testing will motivate behavior change or amplify potential harms.
Impact of direct-to-consumer predictive genomic testing on risk perception and worry among patients receiving routine care in a preventive health clinic.

PubMed

James, Katherine M; Cowl, Clayton T; Tilburt, Jon C; Sinicrope, Pamela S; Robinson, Marguerite E; Frimannsdottir, Katrin R; Tiedje, Kristina; Koenig, Barbara A

2011-10-01

To assess the impact of direct-to-consumer (DTC) predictive genomic risk information on perceived risk and worry in the context of routine clinical care. Patients attending a preventive medicine clinic between June 1 and December 18, 2009, were randomly assigned to receive either genomic risk information from a DTC product plus usual care (n=74) or usual care alone (n=76). At intervals of 1 week and 1 year after their clinic visit, participants completed surveys containing validated measures of risk perception and levels of worry associated with the 12 conditions assessed by the DTC product. Of 345 patients approached, 150 (43%) agreed to participate, 64 (19%) refused, and 131 (38%) did not respond. Compared with those receiving usual care, participants who received genomic risk information initially rated their risk as higher for 4 conditions (abdominal aneurysm [P=.001], Graves disease [P=.04], obesity [P=.01], and osteoarthritis [P=.04]) and lower for one (prostate cancer [P=.02]). Although differences were not significant, they also reported higher levels of worry for 7 conditions and lower levels for 5 others. At 1 year, there were no significant differences between groups. Predictive genomic risk information modestly influences risk perception and worry. The extent and direction of this influence may depend on the condition being tested and its baseline prominence in preventive health care and may attenuate with time.
Improving accuracy of genomic prediction in Brangus cattle by adding animals with imputed low-density SNP genotypes.

PubMed

Lopes, F B; Wu, X-L; Li, H; Xu, J; Perkins, T; Genho, J; Ferretti, R; Tait, R G; Bauck, S; Rosa, G J M

2018-02-01

Reliable genomic prediction of breeding values for quantitative traits requires a sufficient number of animals with genotypes and phenotypes in the training set. As of 31 October 2016, there were 3,797 Brangus animals with genotypes and phenotypes. These Brangus animals were genotyped using different commercial SNP chips. Of them, the largest group consisted of 1,535 animals genotyped by the GGP-LDV4 SNP chip. The remaining 2,262 genotypes were imputed to the SNP content of the GGP-LDV4 chip, so that the number of animals available for training the genomic prediction models was more than doubled. The present study showed that pooling animals with either original or imputed 40K SNP genotypes substantially increased genomic prediction accuracies on the ten traits. By supplementing imputed genotypes, the relative gains in genomic prediction accuracies on estimated breeding values (EBV) were from 12.60% to 31.27%, and the relative gains in genomic prediction accuracies on de-regressed EBV were slightly smaller (0.87% to 18.75%). The present study also compared the performance of five genomic prediction models and two cross-validation methods. The five genomic models predicted EBV and de-regressed EBV of the ten traits similarly well. Of the two cross-validation methods, leave-one-out cross-validation maximized the number of animals at the stage of training for genomic prediction. Genomic prediction accuracy (GPA) on the ten quantitative traits was validated in 1,106 newly genotyped Brangus animals based on the SNP effects estimated in the previous set of 3,797 Brangus animals, and it was slightly lower than the GPA in the original data. The present study was the first to leverage currently available genotype and phenotype resources in order to harness genomic prediction in Brangus beef cattle. © 2018 Blackwell Verlag GmbH.
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads

PubMed Central

Li, Pinghao; Jiang, Xiaoqian; Wang, Shuang; Kim, Jihoon; Xiong, Hongkai; Ohno-Machado, Lucila

2014-01-01

Background and objective: Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. Methods: We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), a novel compression algorithm for aligned reads in the sorted Sequence Alignment/Map (SAM) format. We first aligned short reads against a reference genome and stored exactly mapped reads for compression. For the inexact mapped or unmapped reads, we realigned them against different reference genomes using an adaptive scheme by gradually shortening the read length. Regarding the base quality value, we offer lossy and lossless compression mechanisms. The lossy compression mechanism for the base quality values uses k-means clustering, where a user can adjust the balance between decompression quality and compression rate. The lossless compression can be produced by setting k (the number of clusters) to the number of different quality values. Results: The proposed method produced a compression ratio in the range 0.5–0.65, which corresponds to 35–50% storage savings based on experimental datasets. The proposed approach achieved 15% more storage savings over CRAM and comparable compression ratio with Samcomp (CRAM and Samcomp are two of the state-of-the-art genome compression algorithms). The software is freely available at https://sourceforge.net/projects/hierachicaldnac/ under the General Public License (GPL). Limitation: Our method requires having different reference genomes and prolongs the execution time for additional alignments. Conclusions: The proposed multi-reference-based compression algorithm for aligned reads outperforms existing single-reference based algorithms. PMID:24368726
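The quality-value scheme described in the abstract above can be illustrated with a minimal sketch. This is not HUGO's implementation: the function names (`kmeans_1d`, `quantize`, `storage_savings`) and the toy quality values are assumptions made for illustration, and the clustering is a plain one-dimensional Lloyd's k-means. It shows only the two properties the abstract states, namely that a small k gives a lossy representation and that setting k to the number of distinct quality values reproduces the input exactly.

```python
from collections import Counter


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalar values; returns the list of centroids."""
    distinct = sorted(set(values))
    k = min(k, len(distinct))
    if k == 1:
        centroids = [sum(values) / len(values)]
    else:
        # Spread the initial centroids evenly over the sorted distinct values.
        centroids = [distinct[round(i * (len(distinct) - 1) / (k - 1))]
                     for i in range(k)]
    counts = Counter(values)
    for _ in range(iters):
        sums = [0.0] * k
        weights = [0] * k
        for v, c in counts.items():
            # Assign each distinct value to its nearest centroid.
            j = min(range(k), key=lambda i: abs(v - centroids[i]))
            sums[j] += v * c
            weights[j] += c
        new = [sums[j] / weights[j] if weights[j] else centroids[j]
               for j in range(k)]
        if new == centroids:  # converged
            break
        centroids = new
    return centroids


def quantize(values, centroids):
    """Replace every value by its nearest centroid (the lossy step)."""
    return [min(centroids, key=lambda c: abs(v - c)) for v in values]


def storage_savings(compression_ratio):
    """Savings implied by a ratio defined as compressed size / original size."""
    return 1.0 - compression_ratio


if __name__ == "__main__":
    quals = [2, 2, 3, 30, 31, 31, 40]  # hypothetical Phred-like scores
    print(quantize(quals, kmeans_1d(quals, 2)))                # lossy
    print(quantize(quals, kmeans_1d(quals, len(set(quals)))))  # lossless
    print(storage_savings(0.5), storage_savings(0.65))
```

Under the assumption that the reported ratio is compressed size divided by original size, `storage_savings` gives the stated correspondence between ratios of 0.5 to 0.65 and savings of 50% down to 35%.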
Canine hip dysplasia is predictable by genotyping.

PubMed

Guo, G; Zhou, Z; Wang, Y; Zhao, K; Zhu, L; Lust, G; Hunter, L; Friedenberg, S; Li, J; Zhang, Y; Harris, S; Jones, P; Sandler, J; Krotscheck, U; Todhunter, R; Zhang, Z

2011-04-01

To establish a predictive method using whole genome genotyping for early intervention in canine hip dysplasia (CHD) risk management, for the prevention of the progression of secondary osteoarthritis (OA), and for selective breeding. Two sets of dogs (six breeds) were genotyped with dense SNPs covering the entire canine genome. The first set contained 359 dogs upon which a predictive formula for genomic breeding value (GBV) was derived by using their estimated breeding value (EBV) of the Norberg angle (a measure of CHD) and their genotypes. To investigate how well the formula would work for an individual dog with genotype only (without using EBV), a cross validation was performed by masking the EBV of one dog at a time. The genomic data and the EBV of the remaining dogs were used to predict the GBV for the single dog that was left out. The second set of dogs included 38 new Labrador retriever dogs, which had no pedigree relationship to the dogs in the first set. The cross validation showed a strong correlation (R>0.7) between the EBV and the GBV. The independent validation showed a moderate correlation (R=0.5) between GBV for the Norberg angle and the observed Norberg angle (no EBV was available for the new 38 dogs). Sensitivity, specificity, positive and negative predictive values of the genomic data were all above 70%. Prediction of CHD from genomic data is feasible, and can be applied for risk management of CHD and early selection for genetic improvement to reduce the prevalence of CHD in breeding programs. The prediction can be implemented before maturity, at which age current radiographic screening programs are traditionally applied, and as soon as DNA is available. Copyright © 2010 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
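The leave-one-out cross-validation described in the abstract above (mask one animal's EBV, fit on the rest, predict the masked animal from genotype alone) can be sketched as follows. The abstract does not state which statistical model produced the predictive formula, so ridge regression on centred SNP genotypes is used here purely as a stand-in. The data are simulated, and every name (`simulate`, `ridge_fit`, `loocv_predictions`, the penalty `lam`) is hypothetical.

```python
import math
import random


def simulate(n_animals, n_snps, seed=0):
    """Synthetic 0/1/2 genotypes and a trait with additive SNP effects."""
    rng = random.Random(seed)
    effects = [rng.uniform(-1.0, 1.0) for _ in range(n_snps)]
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
         for _ in range(n_animals)]
    y = [sum(g * e for g, e in zip(row, effects)) + rng.gauss(0.0, 0.3)
         for row in X]
    return X, y


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge regression on centred genotypes: (Z'Z + lam I) beta = Z'(y - mu)."""
    n, p = len(X), len(X[0])
    mu = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Z = [[row[j] - means[j] for j in range(p)] for row in X]
    lhs = [[sum(z[a] * z[b] for z in Z) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    rhs = [sum(z[a] * (yi - mu) for z, yi in zip(Z, y)) for a in range(p)]
    return mu, means, solve(lhs, rhs)


def loocv_predictions(X, y, lam=1.0):
    """Predict each animal from a model trained on all the other animals."""
    preds = []
    for i in range(len(y)):
        mu, means, beta = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        preds.append(mu + sum((g - m) * b
                              for g, m, b in zip(X[i], means, beta)))
    return preds


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


if __name__ == "__main__":
    X, y = simulate(40, 10, seed=1)
    print(pearson(y, loocv_predictions(X, y)))
```

The correlation between the masked values and their leave-one-out predictions plays the role of the R reported for the cross-validation; the value obtained from this simulated data says nothing about the study's own results.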
Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

PubMed

Palaiokostas, Christos; Cariou, Sophie; Bestin, Anastasia; Bruant, Jean-Sebastien; Haffray, Pierrick; Morin, Thierry; Cabon, Joëlle; Allal, François; Vandeputte, Marc; Houston, Ross D

2018-06-08

European sea bass (Dicentrarchus labrax) is one of the most important species for European aquaculture. Viral nervous necrosis (VNN), commonly caused by the redspotted grouper nervous necrosis virus (RGNNV), can result in high levels of morbidity and mortality, mainly during the larval and juvenile stages of cultured sea bass. In the absence of efficient therapeutic treatments, selective breeding for host resistance offers a promising strategy to control this disease. Our study aimed at investigating genetic resistance to VNN and genomic-based approaches to improve disease resistance by selective breeding. A population of 1538 sea bass juveniles from a factorial cross between 48 sires and 17 dams was challenged with RGNNV with mortalities and survivors being recorded and sampled for genotyping by the RAD sequencing approach. We used genome-wide genotype data from 9195 single nucleotide polymorphisms (SNPs) for downstream analysis. Estimates of heritability of survival on the underlying scale for the pedigree and genomic relationship matrices were 0.27 (HPD interval 95%: 0.14-0.40) and 0.43 (0.29-0.57), respectively. Classical genome-wide association analysis detected genome-wide significant quantitative trait loci (QTL) for resistance to VNN on chromosomes (unassigned scaffolds in the case of 'chromosome' 25) 3, 20 and 25 (P < 1e-06). Weighted genomic best linear unbiased predictor provided additional support for the QTL on chromosome 3 and suggested that it explained 4% of the additive genetic variation. Genomic prediction approaches were tested to investigate the potential of using genome-wide SNP data to estimate breeding values for resistance to VNN and showed that genomic prediction resulted in a 13% increase in successful classification of resistant and susceptible animals compared to pedigree-based methods, with Bayes A and Bayes B giving the highest predictive ability. 
Genome-wide significant QTL were identified but each with relatively small effects on the trait. Tests of genomic prediction suggested that incorporating genome-wide SNP data is likely to result in higher accuracy of estimated breeding values for resistance to VNN. RAD sequencing is an effective method for generating such genome-wide SNPs, and our findings highlight the potential of genomic selection to breed farmed European sea bass with improved resistance to VNN.
Detecting Single-Nucleotide Substitutions Induced by Genome Editing.

PubMed

Miyaoka, Yuichiro; Chan, Amanda H; Conklin, Bruce R

2016-08-01

The detection of genome editing is critical in evaluating genome-editing tools or conditions, but it is not an easy task to detect genome-editing events, especially single-nucleotide substitutions, without a surrogate marker. Here we introduce a procedure that significantly contributes to the advancement of genome-editing technologies. It uses droplet digital polymerase chain reaction (ddPCR) and allele-specific hydrolysis probes to detect single-nucleotide substitutions generated by genome editing (via homology-directed repair, or HDR). HDR events that introduce substitutions using donor DNA are generally infrequent, even with genome-editing tools, and the outcome is only one base pair difference in 3 billion base pairs of the human genome. This task is particularly difficult in induced pluripotent stem (iPS) cells, in which editing events can be very rare. Therefore, the technological advances described here have implications for therapeutic genome editing and experimental approaches to disease modeling with iPS cells. © 2016 Cold Spring Harbor Laboratory Press.
Choosing a genome browser for a Model Organism Database: surveying the Maize community

PubMed Central

Sen, Taner Z.; Harper, Lisa C.; Schaeffer, Mary L.; Andorf, Carson M.; Seigfried, Trent E.; Campbell, Darwin A.; Lawrence, Carolyn J.

2010-01-01

As the B73 maize genome sequencing project neared completion, MaizeGDB began to integrate a graphical genome browser with its existing web interface and database. To ensure that maize researchers would optimally benefit from the potential addition of a genome browser to the existing MaizeGDB resource, personnel at MaizeGDB surveyed researchers’ needs. Collected data indicate that existing genome browsers for maize were inadequate and suggest implementation of a browser with quick interface and intuitive tools would meet most researchers’ needs. Here, we document the survey’s outcomes, review functionalities of available genome browser software platforms and offer our rationale for choosing the GBrowse software suite for MaizeGDB. Because the genome as represented within the MaizeGDB Genome Browser is tied to detailed phenotypic data, molecular marker information, available stocks, etc., the MaizeGDB Genome Browser represents a novel mechanism by which the researchers can leverage maize sequence information toward crop improvement directly. Database URL: http://gbrowse.maizegdb.org/ PMID:20627860
The societal opportunities and challenges of genome editing.

PubMed

Carroll, Dana; Charo, R Alta

2015-11-05

The genome editing platforms currently in use have revolutionized the field of genetics. At an accelerating rate, these tools are entering areas with direct impact on human well being. Here, we discuss applications in agriculture and in medicine, and examine some associated societal issues.
Placental transcriptome co-expression analysis reveals conserved regulatory program across gestation

USDA-ARS's Scientific Manuscript database

Mammalian development in utero is absolutely dependent on proper placental development, which is ultimately regulated by the placental genome. The regulation of the placental genome can be directly studied by exploring the underlying organization of the placental transcriptome through a systematic a...
The relationship between runs of homozygosity and inbreeding in Jersey cattle under selection

USDA-ARS's Scientific Manuscript database

Inbreeding is often an inevitable outcome of strong directional artificial selection but it reduces fitness in a population with increased frequency of recessive deleterious alleles. Runs of homozygosity (ROH) representing genomic autozygosity that occur from mating between selected and genomically ...
UCbase 2.0: ultraconserved sequences database (2014 update).

PubMed

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it. © The Author(s) 2014. Published by Oxford University Press.
Genome assembly from synthetic long read clouds

PubMed Central

Kuleshov, Volodymyr; Snyder, Michael P.; Batzoglou, Serafim

2016-01-01

Motivation: Despite rapid progress in sequencing technology, assembling de novo the genomes of new species as well as reconstructing complex metagenomes remain major technological challenges. New synthetic long read (SLR) technologies promise significant advances towards these goals; however, their applicability is limited by high sequencing requirements and the inability of current assembly paradigms to cope with combinations of short and long reads. Results: Here, we introduce Architect, a new de novo scaffolder aimed at SLR technologies. Unlike previous assembly strategies, Architect does not require a costly subassembly step; instead it assembles genomes directly from the SLR’s underlying short reads, which we refer to as read clouds. This enables a 4- to 20-fold reduction in sequencing requirements and a 5-fold increase in assembly contiguity on both genomic and metagenomic datasets relative to state-of-the-art assembly strategies aimed directly at fully subassembled long reads. Availability and Implementation: Our source code is freely available at https://github.com/kuleshov/architect. Contact: kuleshov@stanford.edu PMID:27307620
The Human Genome Initiative of the Department of Energy

DOE R&D Accomplishments Database

1988-01-01

The structural characterization of genes and elucidation of their encoded functions have become a cornerstone of modern health research, biology and biotechnology. A genome program is an organized effort to locate and identify the functions of all the genes of an organism. Beginning with the DOE-sponsored, 1986 human genome workshop at Santa Fe, the value of broadly organized efforts supporting total genome characterization became a subject of intensive study. There is now national recognition that benefits will rapidly accrue from an effective scientific infrastructure for total genome research. In the US, genome research is now receiving dedicated funds. Several other nations are implementing genome programs. Supportive infrastructure is being improved through both national and international cooperation. The Human Genome Initiative of the Department of Energy (DOE) is a focused program of Resource and Technology Development, with objectives of speeding and bringing economies to the national human genome effort. This report relates the origins and progress of the Initiative.
Pangenome evidence for extensive interdomain horizontal transfer affecting lineage core and shell genes in uncultured planktonic thaumarchaeota and euryarchaeota.

PubMed

Deschamps, Philippe; Zivanovic, Yvan; Moreira, David; Rodriguez-Valera, Francisco; López-García, Purificación

2014-06-12

Horizontal gene transfer (HGT) is an important force in evolution, which may lead, among other things, to the adaptation to new environments by the import of new metabolic functions. Recent studies based on phylogenetic analyses of a few genome fragments containing archaeal 16S rRNA genes and fosmid-end sequences from deep-sea metagenomic libraries have suggested that marine planktonic archaea could be affected by high HGT frequency. Likewise, a composite genome of an uncultured marine euryarchaeote showed high levels of gene sequence similarity to bacterial genes. In this work, we ask whether HGT is frequent and widespread in genomes of these marine archaea, and whether HGT is an ancient and/or recurrent phenomenon. To answer these questions, we sequenced 997 fosmid archaeal clones from metagenomic libraries of deep-Mediterranean waters (1,000 and 3,000 m depth) and built comprehensive pangenomes for planktonic Thaumarchaeota (Group I archaea) and Euryarchaeota belonging to the uncultured Groups II and III Euryarchaeota (GII/III-Euryarchaeota). Comparison with available reference genomes of Thaumarchaeota and a composite marine surface euryarchaeote genome allowed us to define sets of core, lineage-specific core, and shell gene ortholog clusters for the two archaeal lineages. Molecular phylogenetic analyses of all gene clusters showed that 23.9% of marine Thaumarchaeota genes and 29.7% of GII/III-Euryarchaeota genes had been horizontally acquired from bacteria. HGT is not only extensive and directional but also ongoing, with high HGT levels in lineage-specific core (ancient transfers) and shell (recent transfers) genes. Many of the acquired genes are related to metabolism and membrane biogenesis, suggesting an adaptive value for life in cold, oligotrophic oceans. We hypothesize that the acquisition of an important amount of foreign genes by the ancestors of these archaeal groups significantly contributed to their divergence and ecological success. 
© The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease.

PubMed

Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M

2016-05-01

To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Case series. A total of 562 patients diagnosed with IRD. We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Diagnostic yield of genomic testing. Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
The Genomes of Two Bat Species with Long Constant Frequency Echolocation Calls.

PubMed

Dong, Dong; Lei, Ming; Hua, Panyu; Pan, Yi-Hsuan; Mu, Shuo; Zheng, Guantao; Pang, Erli; Lin, Kui; Zhang, Shuyi

2017-01-01

Bats can perceive the world by using a wide range of sensory systems, and some of the systems have become highly specialized, such as auditory sensory perception. Among bat species, the Old World leaf-nosed bats and horseshoe bats (rhinolophoid bats) possess the most sophisticated echolocation systems. Here, we reported the whole-genome sequencing and de novo assemblies of two rhinolophoid bats: the great leaf-nosed bat (Hipposideros armiger) and the Chinese rufous horseshoe bat (Rhinolophus sinicus). Comparative genomic analyses revealed the adaptation of auditory sensory perception in the rhinolophoid bat lineages, probably resulting from the extreme selectivity used in the auditory processing by these bats. Pseudogenization of some vision-related genes in rhinolophoid bats was observed, suggesting that these genes have undergone relaxed natural selection. An extensive contraction of olfactory receptor gene repertoires was observed in the lineage leading to the common ancestor of bats. Further extensive gene contractions can be observed in the branch leading to the rhinolophoid bats. Such concordance suggested that molecular changes at one sensory gene might have direct consequences for genes controlling for other sensory modalities. To characterize the population genetic structure and patterns of evolution, we re-sequenced the genome of 20 great leaf-nosed bats from four different geographical locations of China. The result showed similar sequence diversity values and little differentiation among populations. Moreover, evidence of genetic adaptations to high altitudes in the great leaf-nosed bats was observed. Taken together, our work provided a useful resource for future research on the evolution of bats. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies

PubMed Central

Abnet, Christian C.; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D.; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M.; Liao, Linda M.; Lee, Maxwell P.; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C.; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B.; Giffen, Carol A.; Burdett, Laurie; Fraumeni, Joseph F.; Tucker, Margaret A.; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, 
Tai-Jing; Wei, Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M.; Yuan, Zhi-Qing; Chanock, Stephen J.; Zhang, Xue-Jun; Taylor, Philip R.; Wang, Li-Dong

2012-01-01

Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P < 5 × 10^−8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19–1.40) and P = 7.63 × 10^−10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants. PMID:22323360
Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies.

PubMed

Abnet, Christian C; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M; Liao, Linda M; Lee, Maxwell P; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B; Giffen, Carol A; Burdett, Laurie; Fraumeni, Joseph F; Tucker, Margaret A; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, Tai-Jing; Wei, 
Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M; Yuan, Zhi-Qing; Chanock, Stephen J; Zhang, Xue-Jun; Taylor, Philip R; Wang, Li-Dong

2012-05-01

Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P< 5 × 10(-8), and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19-1.40) and P= 7.63 × 10(-10). An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants.
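
As a rough consistency check on the figures in this abstract, the reported odds ratio and its 95% confidence interval can be turned back into an approximate standard error, z-score and two-sided P-value. The sketch below assumes a normal approximation on the log odds-ratio scale, which the abstract does not state; the function name is hypothetical. Because the published interval is rounded to two decimals, the result should only match the reported P-value in order of magnitude.

```python
import math


def p_from_or_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z and two-sided P from an odds ratio and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(or_point)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract for rs13016963.
se, z, p = p_from_or_ci(1.29, 1.19, 1.40)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
print("genome-wide significant:", p < 5e-8)
```

The recovered P-value falls below the 5 × 10^−8 genome-wide threshold used in the abstract, which is the only conclusion this back-of-the-envelope calculation is meant to support.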
Domestic estimated breeding values and genomic enhanced breeding values of bulls in comparison with their foreign genomic enhanced breeding values.

PubMed

Přibyl, J; Bauer, J; Čermák, V; Pešek, P; Přibylová, J; Šplíchal, J; Vostrá-Vydrová, H; Vostrý, L; Zavadilová, L

2015-10-01

Estimated breeding values (EBVs) and genomic enhanced breeding values (GEBVs) for milk production of young genotyped Holstein bulls were predicted using a conventional BLUP - Animal Model, a method fitting regression coefficients for loci (RRBLUP), a method utilizing the realized genomic relationship matrix (GBLUP), by a single-step procedure (ssGBLUP) and by a one-step blending procedure. Information sources for prediction were the nation-wide database of domestic Czech production records in the first lactation combined with deregressed proofs (DRP) from Interbull files (August 2013) and domestic test-day (TD) records for the first three lactations. Data from 2627 genotyped bulls were used, of which 2189 were already proven under domestic conditions. Analyses were run that used Interbull values for genotyped bulls only or that used Interbull values for all available sires. Resultant predictions were compared with GEBV of 96 young foreign bulls evaluated abroad and whose proofs were from Interbull method GMACE (August 2013) on the Czech scale. Correlations of predictions with GMACE values of foreign bulls ranged from 0.33 to 0.75. Combining domestic data with Interbull EBVs improved prediction of both EBV and GEBV. Predictions by Animal Model (traditional EBV) using only domestic first lactation records and GMACE values were correlated by only 0.33. Combining the nation-wide domestic database with all available DRP for genotyped and un-genotyped sires from Interbull resulted in an EBV correlation of 0.60, compared with 0.47 when only Interbull data were used. In all cases, GEBVs had higher correlations than traditional EBVs, and the highest correlations were for predictions from the ssGBLUP procedure using combined data (0.75), or with all available DRP from Interbull records only (one-step blending approach, 0.69). 
The ssGBLUP predictions using the first three domestic lactation records in the TD model were correlated with GMACE predictions by 0.69, 0.64 and 0.61 for milk yield, protein yield and fat yield, respectively.
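
The "realized genomic relationship matrix" used by GBLUP can be illustrated with a small sketch. The abstract does not say which formula was used; the code below assumes the widely used VanRaden construction, G = ZZ' / (2 Σ p(1 − p)), with genotypes coded 0/1/2 and allele frequencies estimated from the same animals. The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build a VanRaden-style G matrix from 0/1/2 genotype codes.

    genotypes: list of animals, each a list of allele counts per marker.
    """
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_markers)
    ]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each marker by twice its allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(n_markers)] for row in genotypes]
    return [
        [
            sum(z[a][j] * z[b][j] for j in range(n_markers)) / scale
            for b in range(n_animals)
        ]
        for a in range(n_animals)
    ]


# Hypothetical genotypes: 4 animals x 4 markers.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
G = genomic_relationship_matrix(toy_genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Because the markers are centered with frequencies from the same sample, each row of G sums to zero; real evaluations typically use base-population frequencies and many thousands of markers.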

Genome-scale engineering for systems and synthetic biology

PubMed Central

Esvelt, Kevin M; Wang, Harris H

2013-01-01

Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering. PMID:23340847
Draft genome sequence of Trametes villosa (Sw.) Kreisel CCMB561, a tropical white-rot Basidiomycota from the semiarid region of Brazil.

PubMed

Ferreira, Dalila Souza Santos; Kato, Rodrigo Bentes; Miranda, Fábio Malcher; da Costa Pinheiro, Kenny; Fonseca, Paula Luize Camargos; Tomé, Luiz Marcelo Ribeiro; Vaz, Aline Bruna Martins; Badotti, Fernanda; Ramos, Rommel Thiago Jucá; Brenig, Bertram; Azevedo, Vasco Ariston de Carvalho; Benevides, Raquel Guimarães; Góes-Neto, Aristóteles

2018-06-01

Herein, we present the draft genome of Trametes villosa isolate CCMB561, a wood-decaying Basidiomycota commonly found in tropical semiarid climate. The genome assembly was 57.98 Mb in size with an L50 of 691. A total of 16,711 putative protein-encoding genes was predicted, including 590 genes coding for carbohydrate-active enzymes (CAZy), directly involved in the decomposition of lignocellulosic materials. This is the first genome of this species of high interest in bioenergy research. The draft genome of Trametes villosa isolate CCMB561 will provide an important resource for future investigations in biofuel production, bioremediation and other green technologies.
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples.

PubMed

Quick, Joshua; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah C; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno R; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2017-06-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
Programming cells by multiplex genome engineering and accelerated evolution.

PubMed

Wang, Harris H; Isaacs, Farren J; Carr, Peter A; Sun, Zachary Z; Xu, George; Forest, Craig R; Church, George M

2009-08-13

The breadth of genomic diversity found among organisms in nature allows populations to adapt to diverse environments. However, genomic diversity is difficult to generate in the laboratory and new phenotypes do not easily arise on practical timescales. Although in vitro and directed evolution methods have created genetic variants with usefully altered phenotypes, these methods are limited to laborious and serial manipulation of single genes and are not used for parallel and continuous directed evolution of gene networks or genomes. Here, we describe multiplex automated genome engineering (MAGE) for large-scale programming and evolution of cells. MAGE simultaneously targets many locations on the chromosome for modification in a single cell or across a population of cells, thus producing combinatorial genomic diversity. Because the process is cyclical and scalable, we constructed prototype devices that automate the MAGE technology to facilitate rapid and continuous generation of a diverse set of genetic changes (mismatches, insertions, deletions). We applied MAGE to optimize the 1-deoxy-D-xylulose-5-phosphate (DXP) biosynthesis pathway in Escherichia coli to overproduce the industrially important isoprenoid lycopene. Twenty-four genetic components in the DXP pathway were modified simultaneously using a complex pool of synthetic DNA, creating over 4.3 billion combinatorial genomic variants per day. We isolated variants with more than fivefold increase in lycopene production within 3 days, a significant improvement over existing metabolic engineering techniques. Our multiplex approach embraces engineering in the context of evolution by expediting the design and evolution of organisms with new and improved properties.
Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs.

PubMed

Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M

2017-06-01

The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange events and exploring cellular heterogeneity. Strand-seq is a single-cell sequencing technology that resolves the individual homologs within a cell by restricting sequence analysis to the DNA template strands used during DNA replication. This protocol, which takes up to 4 d to complete, relies on the directionality of DNA, in which each single strand of a DNA molecule is distinguished based on its 5'-3' orientation. Culturing cells in a thymidine analog for one round of cell division labels nascent DNA strands, allowing for their selective removal during genomic library construction. To preserve directionality of template strands, genomic preamplification is bypassed and labeled nascent strands are nicked and not amplified during library preparation. Each single-cell library is multiplexed for pooling and sequencing, and the resulting sequence data are aligned, mapping to either the minus or plus strand of the reference genome, to assign template strand states for each chromosome in the cell. The major adaptations to conventional single-cell sequencing protocols include harvesting of daughter cells after a single round of BrdU incorporation, bypassing of whole-genome amplification, and removal of the BrdU + strand during Strand-seq library preparation. By sequencing just template strands, the structure and identity of each homolog are preserved.
Identified OAS3 gene variants associated with coexistence of HBsAg and anti-HBs in chronic HBV infection.

PubMed

Wang, S; Wang, J; Fan, M-J; Li, T-Y; Pan, H; Wang, X; Liu, H-K; Lin, Q-F; Zhang, J-G; Guan, L-P; Zhernakova, D V; O'Brien, S J; Feng, Z-R; Chang, L; Dai, E-H; Lu, J-H; Xi, H-L; Zeng, Z; Yu, Y-Y; Wang, B-B

2018-03-27

The underlying mechanism of coexistence of hepatitis B surface antigen (HBsAg) and hepatitis B surface antigen antibody (anti-HBs) is still controversial. To identify the host genetic factors related to this unusual clinical phenomenon, a two-stage study was conducted in the Chinese Han population. In the first stage, we performed a case-control (1:1) age- and gender-matched study of 101 cases with concurrent HBsAg and anti-HBs and 102 controls with negative HBsAg and positive anti-HBs using whole exome sequencing. In the second validation stage, we directly sequenced the 16 exons of the OAS3 gene in two dependent cohorts of 48 cases and 200 controls. In the first stage, a genome-wide association study of 58,563 polymorphic variants in 101 cases and 102 controls found no significant loci (P-value ≤ 0.05/58563), and no locus reached the conservative genome-wide significance threshold (P-value ≤ 5e-08); however, gene-based burden analysis showed that rare variants in the OAS3 gene were associated with the coexistence of HBsAg and anti-HBs (P-value = 4.127e-06 ≤ 0.05/6994). A total of 16 rare variants were screened out from 21 cases and 3 controls. In the second validation stage, one case with a stop-gained rare variant was identified. Fisher's exact test of all 149 cases and 302 controls showed that the rare coding sequence mutations were more frequent in cases vs controls (P-value = 7.299e-09, OR = 17.27, 95% CI [5.01-58.72]). Protein-coding rare variations in the OAS3 gene are associated with the coexistence of HBsAg and anti-HBs in patients with chronic HBV infection in the Chinese Han population. © 2018 John Wiley & Sons Ltd.
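
The combined-sample odds ratio in this abstract can be reproduced with a short sketch. The 2×2 table is an inference, not something the abstract states outright: it assumes 22 carriers among 149 cases (21 from the first stage plus 1 from validation) and 3 carriers among 302 controls. The function names are hypothetical, and the two-sided Fisher P-value here uses the common "sum of tables no more probable than the observed one" rule, which may differ from the software the authors used.

```python
import math


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value via the hypergeometric distribution."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability of x carriers among cases with all margins fixed.
        return (
            math.comb(row1, x) * math.comb(row2, col1 - x)
            / math.comb(total, col1)
        )

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme as the observed one.
    return sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Assumed table: carriers / non-carriers in cases, then in controls.
cases_with, cases_without = 22, 149 - 22
controls_with, controls_without = 3, 302 - 3

or_estimate = odds_ratio(cases_with, cases_without, controls_with, controls_without)
p_value = fisher_exact_two_sided(cases_with, cases_without, controls_with, controls_without)
bonferroni_threshold = 0.05 / 58563  # single-variant threshold quoted above

print(f"OR = {or_estimate:.2f}, Fisher P = {p_value:.3e}")
print(f"Bonferroni threshold = {bonferroni_threshold:.2e}")
```

Under these assumed counts the sample odds ratio rounds to the reported 17.27, which suggests the inferred table is at least consistent with the abstract.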
Genome Sequence of the Necrotrophic Plant Pathogen Alternaria brassicicola Abra43

PubMed Central

Belmas, Elodie; Briand, Martial; Kwasiborski, Anthony; Colou, Justine; N’Guyen, Guillaume; Iacomi, Béatrice; Grappin, Philippe; Campion, Claire; Simoneau, Philippe; Barret, Matthieu

2018-01-01

Alternaria brassicicola causes dark spot (or black spot) disease, which is one of the most common and destructive fungal diseases of Brassicaceae spp. worldwide. Here, we report the draft genome sequence of strain Abra43. The assembly comprises 29 scaffolds, with an N50 value of 2.1 Mb. The assembled genome was 31,036,461 bp in length, with a G+C content of 50.85%. PMID:29439047
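
For readers unfamiliar with the assembly statistic quoted here, N50 is the scaffold length at which the scaffolds of that length or longer account for at least half of the total assembly, and L50 is the number of scaffolds needed to get there. The sketch below uses made-up scaffold lengths, not the Abra43 assembly, and the function name is hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:
            return length, count
    raise ValueError("no lengths given")


# Hypothetical scaffold lengths (arbitrary units), total 290.
toy_lengths = [80, 70, 50, 40, 30, 20]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Here the two longest scaffolds (80 + 70 = 150) already cover more than half of the total, so N50 is 70 and L50 is 2.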
Genome Sequence of an Endophytic Fungus, Fusarium solani JS-169, Which Has Antifungal Activity.

PubMed

Kim, Jung A; Jeon, Jongbum; Park, Sook-Young; Kim, Ki-Tae; Choi, Gobong; Lee, Hyun-Jung; Kim, Yangsun; Yang, Hee-Sun; Yeo, Joo-Hong; Lee, Yong-Hwan; Kim, Soonok

2017-10-19

An endophytic fungus, Fusarium solani strain JS-169, isolated from a mulberry twig, showed considerable antifungal activity. Here, we report the draft genome sequence of this strain. The assembly comprises 17 scaffolds, with an N50 value of 4.93 Mb. The assembled genome was 45,813,297 bp in length, with a G+C content of 49.91%. Copyright © 2017 Kim et al.
[Direct genetic manipulation and criminal code in Venezuela: absolute criminal law void?].

PubMed

Cermeño Zambrano, Fernando G De J

2002-01-01

The judicial regulation of genetic biotechnology applied to the human genome is of big relevance currently in Venezuela due to the drafting of an innovative bioethical law in the country's parliament. This article will highlight the constitutional normative of Venezuela's 1999 Constitution regarding this subject, as it establishes the framework from which this matter will be legally regulated. The approach this article makes towards the genetic biotechnology applied to the human genome is made taking into account the Venezuelan penal law and by highlighting the violent genetic manipulations that have criminal relevance. The genetic biotechnology applied to the human genome has another important relevance as a consequence of the reformulation of the Venezuelan Penal Code discussed by the country's National Assembly. Therefore, a concise study of the country's penal code will be made in this article to better understand what judicial-penal properties have been protected by the Venezuelan penal legislation. This last step will enable us to identify the penal tools Venezuela counts on to face direct genetic manipulations. We will equally indicate the existing punitive loophole and that should be covered by the penal legislator. In conclusion, this essay concerns criminal policy, referred to the direct genetic manipulations on the human genome that haven't been typified in Venezuelan law, thus discovering a genetic biotechnology paradise.
DNA replication stress: from molecular mechanisms to human disease.

PubMed

Muñoz, Sergio; Méndez, Juan

2017-02-01

The genome of proliferating cells must be precisely duplicated in each cell division cycle. Chromosomal replication entails risks such as the possibility of introducing breaks and/or mutations in the genome. Hence, DNA replication requires the coordinated action of multiple proteins and regulatory factors, whose deregulation causes severe developmental diseases and predisposes to cancer. In recent years, the concept of "replicative stress" (RS) has attracted much attention as it impinges directly on genomic stability and offers a promising new avenue to design anticancer therapies. In this review, we summarize recent progress in three areas: (1) endogenous and exogenous factors that contribute to RS, (2) molecular mechanisms that mediate the cellular responses to RS, and (3) the large list of diseases that are directly or indirectly linked to RS.
Induction of infectious petunia vein clearing (pararetro) virus from endogenous provirus in petunia

PubMed Central

Richert-Pöggeler, Katja R.; Noreen, Faiza; Schwarzacher, Trude; Harper, Glyn; Hohn, Thomas

2003-01-01

Infection by an endogenous pararetrovirus using forms of both episomal and chromosomal origin has been demonstrated and characterized, together with evidence that petunia vein clearing virus (PVCV) is a constituent of the Petunia hybrida genome. Our findings allow comparative and direct analysis of horizontally and vertically transmitted virus forms and demonstrate their infectivity using biolistic transformation of a provirus-free petunia species. Some integrants within the genome of P. hybrida are arranged in tandem, allowing direct release of virus by transcription. In addition to known inducers of endogenous pararetroviruses, such as genome hybridization, tissue culture and abiotic stresses, we observed activation of PVCV after wounding. Our data also support the hypothesis that the host plant uses DNA methylation to control the endogenous pararetrovirus. PMID:12970195
Genomic selection for slaughter age in pigs using the Cox frailty model.

PubMed

Santos, V S; Martins Filho, S; Resende, M D V; Azevedo, C F; Lopes, P S; Guimarães, S E F; Glória, L S; Silva, F F

2015-10-19

The aim of this study was to compare genomic selection methodologies using a linear mixed model and the Cox survival model. We used data from an F2 population of pigs, in which the response variable was the time in days from birth to the culling of the animal and the covariates were 238 markers [237 single nucleotide polymorphism (SNP) plus the halothane gene]. The data were corrected for fixed effects, and the accuracy of the method was determined based on the correlation of the ranks of predicted genomic breeding values (GBVs) in both models with the corrected phenotypic values. The analysis was repeated with a subset of SNP markers with largest absolute effects. The results were in agreement with the GBV prediction and the estimation of marker effects for both models for uncensored data and for normality. However, when considering censored data, the Cox model with a normal random effect (S1) was more appropriate. Since there was no agreement between the linear mixed model and the imputed data (L2) for the prediction of genomic values and the estimation of marker effects, the model S1 was considered superior as it took into account the latent variable and the censored data. Marker selection increased correlations between the ranks of predicted GBVs by the linear and Cox frailty models and the corrected phenotypic values, and 120 markers were required to increase the predictive ability for the characteristic analyzed.
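
The accuracy measure described in this abstract, a correlation between the ranks of predicted genomic breeding values and corrected phenotypes, corresponds to a Spearman-type rank correlation. The sketch below assumes that interpretation; the data are invented and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted GBVs and corrected phenotypes (days to culling).
predicted_gbv = [0.8, -0.2, 1.5, 0.1, -1.0]
corrected_phenotype = [160, 158, 170, 150, 149]
rho = spearman(predicted_gbv, corrected_phenotype)
print(f"rank correlation = {rho:.2f}")
```

In this toy example only two animals swap places between the two rankings, giving a rank correlation of 0.9.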
Application of genomic selection in farm animal breeding.

PubMed

Tan, Cheng; Bian, Cheng; Yang, Da; Li, Ning; Wu, Zhen-Fang; Hu, Xiao-Xiang

2017-11-20

Genomic selection (GS) has become a widely accepted method in animal breeding to genetically improve economic traits. With the declining costs of high-density SNP chips and next-generation sequencing, GS has been applied in dairy cattle, swine, poultry and other animals and gained varying degrees of success. Currently, major challenges in GS studies include further reducing the cost of genome-wide SNP genotyping and improving the predictive accuracy of genomic estimated breeding value (GEBV). In this review, we summarize various methods for genome-wide SNP genotyping and GEBV prediction, and give a brief introduction of GS in livestock and poultry breeding. This review will provide a reference for further implementation of GS in farm animal breeding.
Genomic Prediction of Resistance to Pasteurellosis in Gilthead Sea Bream (Sparus aurata) Using 2b-RAD Sequencing

PubMed Central

Palaiokostas, Christos; Ferraresso, Serena; Franch, Rafaella; Houston, Ross D.; Bargelloni, Luca

2016-01-01

Gilthead sea bream (Sparus aurata) is a species of paramount importance to the Mediterranean aquaculture industry, with an annual production exceeding 140,000 metric tons. Pasteurellosis due to the Gram-negative bacterium Photobacterium damselae subsp. piscicida (Phdp) causes significant mortality, especially during larval and juvenile stages, and poses a serious threat to bream production. Selective breeding for improved resistance to pasteurellosis is a promising avenue for disease control, and the use of genetic markers to predict breeding values can improve the accuracy of selection, and allow accurate calculation of estimated breeding values of nonchallenged animals. In the current study, a population of 825 sea bream juveniles, originating from a factorial cross between 67 broodfish (32 sires, 35 dams), were challenged by 30 min immersion with 1 × 10^5 CFU virulent Phdp. Mortalities and survivors were recorded and sampled for genotyping by sequencing. The restriction-site associated DNA sequencing approach, 2b-RAD, was used to generate genome-wide single nucleotide polymorphism (SNP) genotypes for all samples. A high-density linkage map containing 12,085 SNPs grouped into 24 linkage groups (consistent with the karyotype) was constructed. The heritability of surviving days (censored data) was 0.22 (95% highest density interval: 0.11–0.36) and 0.28 (95% highest density interval: 0.17–0.4) using the pedigree and the genomic relationship matrix respectively. A genome-wide association study did not reveal individual SNPs significantly associated with resistance at a genome-wide significance level. Genomic prediction approaches were tested to investigate the potential of the SNPs obtained by 2b-RAD for estimating breeding values for resistance. The accuracy of the genomic prediction models (r = 0.38–0.46) outperformed the traditional BLUP approach based on pedigree records (r = 0.30). 
Overall results suggest that major quantitative trait loci affecting resistance to pasteurellosis were not present in this population, but highlight the effectiveness of 2b-RAD genotyping by sequencing for genomic selection in a mass spawning fish species. PMID:27652890
ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.

PubMed

Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D

2017-08-10

Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user-interface for configuring the data import and for querying the database. Queries can also be run from the command-line and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. 
ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries.
Complete genome sequence of Planctomyces brasiliensis type strain (DSM 5305 T), phylogenomic analysis and reclassification of Planctomycetes including the descriptions of Gimesia gen. nov., Planctopirus gen. nov. and Rubinisphaera gen. nov. and emended descriptions of the order Planctomycetales and the family Planctomycetaceae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Scheuner, Carmen; Tindall, Brian J.; Lu, Megan

Planctomyces brasiliensis Schlesner 1990 belongs to the order Planctomycetales, which differs from other bacterial taxa by several distinctive features such as internal cell compartmentalization, multiplication by forming buds directly from the spherical, ovoid or pear-shaped mother cell and a cell wall consisting of a proteinaceous layer rather than a peptidoglycan layer. The first strains of P. brasiliensis, including the type strain IFAM 1448 T, were isolated from a water sample of Lagoa Vermelha, a salt pit near Rio de Janeiro, Brasil. This is the second completed genome sequence of a type strain of the genus Planctomyces to be published and the sixth type strain genome sequence from the family Planctomycetaceae. The 6,006,602 bp long genome with its 4,811 protein-coding and 54 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. We study phylogenomic analyses that indicate that the classification within the Planctomycetaceae is partially in conflict with its evolutionary history, as the positioning of Schlesneria renders the genus Planctomyces paraphyletic. A re-analysis of published fatty-acid measurements also does not support the current arrangement of the two genera. A quantitative comparison of phylogenetic and phenotypic aspects indicates that the three Planctomyces species with type strains available in public culture collections should be placed in separate genera. Thus the genera Gimesia, Planctopirus and Rubinisphaera are proposed to accommodate P. maris, P. limnophilus and P. brasiliensis, respectively. Pronounced differences between the reported G + C content of Gemmata obscuriglobus, Singulisphaera acidiphila and Zavarzinella formosa and G + C content calculated from their genome sequences call for emendation of their species descriptions. 
Lastly, in addition to other features, the range of G + C values reported for the genera within the Planctomycetaceae indicates that the descriptions of the family and the order should be emended.« less
Complete genome sequence of Planctomyces brasiliensis type strain (DSM 5305 T), phylogenomic analysis and reclassification of Planctomycetes including the descriptions of Gimesia gen. nov., Planctopirus gen. nov. and Rubinisphaera gen. nov. and emended descriptions of the order Planctomycetales and the family Planctomycetaceae

DOE PAGES

Scheuner, Carmen; Tindall, Brian J.; Lu, Megan; ...

2014-12-08

Planctomyces brasiliensis Schlesner 1990 belongs to the order Planctomycetales, which differs from other bacterial taxa by several distinctive features such as internal cell compartmentalization, multiplication by forming buds directly from the spherical, ovoid or pear-shaped mother cell and a cell wall consisting of a proteinaceous layer rather than a peptidoglycan layer. The first strains of P. brasiliensis, including the type strain IFAM 1448 T, were isolated from a water sample of Lagoa Vermelha, a salt pit near Rio de Janeiro, Brazil. This is the second completed genome sequence of a type strain of the genus Planctomyces to be published and the sixth type strain genome sequence from the family Planctomycetaceae. The 6,006,602 bp long genome with its 4,811 protein-coding and 54 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. Phylogenomic analyses indicate that the classification within the Planctomycetaceae is partially in conflict with its evolutionary history, as the positioning of Schlesneria renders the genus Planctomyces paraphyletic. A re-analysis of published fatty-acid measurements also does not support the current arrangement of the two genera. A quantitative comparison of phylogenetic and phenotypic aspects indicates that the three Planctomyces species with type strains available in public culture collections should be placed in separate genera. Thus the genera Gimesia, Planctopirus and Rubinisphaera are proposed to accommodate P. maris, P. limnophilus and P. brasiliensis, respectively. Pronounced differences between the reported G + C content of Gemmata obscuriglobus, Singulisphaera acidiphila and Zavarzinella formosa and G + C content calculated from their genome sequences call for emendation of their species descriptions. 
Lastly, in addition to other features, the range of G + C values reported for the genera within the Planctomycetaceae indicates that the descriptions of the family and the order should be emended.
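The abstract above contrasts reported G + C values with values calculated from genome sequences. As a minimal sketch of the calculated side only, the following Python function computes G + C content as a percentage of unambiguous bases; the function name and the choice to ignore ambiguity codes such as N are assumptions for illustration, not the authors' procedure.

```python
def gc_content(sequence: str) -> float:
    """Return G+C content in percent, counting only A, C, G and T.

    Ambiguity codes (e.g. N) are ignored; this is an illustrative
    assumption, not a convention taken from the abstract.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence, not real genome data.
    print(gc_content("GGCCAATT"))
```

For a complete genome, the same function could be applied to the concatenated replicon sequences read from a FASTA file.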
Complementary Information Derived from CRISPR Cas9 Mediated Gene Deletion and Suppression. | Office of Cancer Genomics

Cancer.gov

CRISPR-Cas9 provides the means to perform genome editing and facilitates loss-of-function screens. However, we and others demonstrated that expression of the Cas9 endonuclease induces a gene-independent response that correlates with the number of target sequences in the genome. An alternative approach to suppressing gene expression is to block transcription using a catalytically inactive Cas9 (dCas9). Here we directly compare genome editing by CRISPR-Cas9 (cutting, CRISPRc) and gene suppression using KRAB-dCas9 (CRISPRi) in loss-of-function screens to identify cell essential genes.
Omics and Environmental Science Genomic Approaches With Natural Fish Populations From Polluted Environments

PubMed Central

Bozinovic, Goran; Oleksiak, Marjorie F.

2010-01-01

Transcriptomics and population genomics are two complementary genomic approaches that can be used to gain insight into pollutant effects in natural populations. Transcriptomics identifies altered gene expression pathways while population genomics approaches more directly target the causative genomic polymorphisms. Neither approach is restricted to a pre-determined set of genes or loci. Instead, both approaches allow a broad overview of genomic processes. Transcriptomics and population genomic approaches have been used to explore genomic responses in populations of fish from polluted environments and have identified sets of candidate genes and loci that appear biologically important in response to pollution. Often differences in gene expression or loci between polluted and reference populations are not conserved among polluted populations, suggesting a biological complexity that we do not yet fully understand. As genomic approaches become less expensive with the advent of new sequencing and genotyping technologies, they will be more widely used in complementary studies. However, while these genomic approaches are immensely powerful for identifying candidate genes and loci, the challenge of determining biological mechanisms that link genotypes and phenotypes remains. PMID:21072843
[Personal genomics: are we debating the right Issues?].

PubMed

Vayena, E; Mauch, F

2012-07-25

The debate about personal genomics and its role in personalized medicine has been, to some extent, hijacked by the controversy about commercially available genomic tests sold directly to consumers (DTC). The clinical validity and utility of such tests are currently limited and most medical associations recommend that consumers refrain from testing. Conversely, DTC genomics proponents and particularly the DTC industry argue that there is personal utility in acquiring genomic information. While it is necessary to debate risks and benefits of DTC genomics, we should not lose sight of the increasingly important role that genomics will play in medical practice and public health. Therefore, and in anticipation of this shift, we also need to focus on important implications of the use of genomic information such as genetic discrimination, privacy protection and equitable access to health care. Undoubtedly, personal genomics will challenge our social norms, perhaps more than our medicine. Sticking to the polarization of «to have or not to have DTC genomics» risks taking us away from the critical issues we need to be debating.

Systematic bias of correlation coefficient may explain negative accuracy of genomic prediction.

PubMed

Zhou, Yao; Vales, M Isabel; Wang, Aoxue; Zhang, Zhiwu

2017-09-01

Accuracy of genomic prediction is commonly calculated as the Pearson correlation coefficient between the predicted and observed phenotypes in the inference population by using cross-validation analysis. More frequently than expected, significant negative accuracies of genomic prediction have been reported in genomic selection studies. These negative values are surprising, given that the minimum value for prediction accuracy should hover around zero when randomly permuted data sets are analyzed. We reviewed the two common approaches for calculating the Pearson correlation and hypothesized that these negative accuracy values reflect potential bias owing to artifacts caused by the mathematical formulas used to calculate prediction accuracy. The first approach, Instant accuracy, calculates correlations for each fold and reports prediction accuracy as the mean of correlations across folds. The other approach, Hold accuracy, predicts all phenotypes in all folds and calculates the correlation between the observed and predicted phenotypes at the end of the cross-validation process. Using simulated and real data, we demonstrated that our hypothesis is true. Both approaches are biased downward under certain conditions. The biases become larger when more folds are employed and when the expected accuracy is low. The bias of Instant accuracy can be corrected using a modified formula. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
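The two accuracy definitions described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and toy data are assumptions, and the code shows how the two summaries are computed, not the bias analysis or the corrected formula from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def instant_accuracy(folds):
    """Mean of the correlations computed separately within each fold."""
    return sum(pearson(obs, pred) for obs, pred in folds) / len(folds)


def hold_accuracy(folds):
    """Single correlation over the pooled predictions of all folds."""
    observed = [v for obs, _ in folds for v in obs]
    predicted = [v for _, pred in folds for v in pred]
    return pearson(observed, predicted)


# Hypothetical toy data: each fold is (observed, predicted).
# The second fold's predictions are on a different scale, so pooling
# changes the correlation even though each fold is predicted well.
folds = [
    ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),
    ([2.0, 4.0, 6.0, 8.0], [0.5, 1.0, 1.4, 2.1]),
]

if __name__ == "__main__":
    print("Instant accuracy:", instant_accuracy(folds))
    print("Hold accuracy:", hold_accuracy(folds))
```

On this toy input the two summaries differ widely, which illustrates why the choice of formula matters when reporting cross-validation accuracy.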
Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

PubMed

Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

2014-01-01

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

PubMed Central

Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

2014-01-01

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
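To make the idea of a doublet-based sequence-to-signal mapping concrete, here is a minimal Python sketch. It is not GAFD: the amplitude table (the 16 doublets numbered in lexicographic order), the fixed-frequency magnitude spectrum, and the Euclidean distance are all assumptions chosen for illustration, since the abstract does not specify the mapping or the three distance metrics.

```python
import cmath
import math
from itertools import product

# Hypothetical amplitude table: AA=0, AC=1, ..., TT=15.
DOUBLET_VALUE = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=2))}


def doublet_signal(seq):
    """Map overlapping doublets of a DNA string (A/C/G/T only) to amplitudes."""
    seq = seq.upper()
    return [DOUBLET_VALUE[seq[i:i + 2]] for i in range(len(seq) - 1)]


def magnitude_spectrum(signal, bins=16):
    """Length-normalised DFT magnitudes at `bins` fixed frequencies.

    Sampling fixed frequencies lets sequences of different lengths be
    compared; the signal must contain at least one value.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    spectrum = []
    for k in range(1, bins + 1):
        freq = 0.5 * k / bins  # cycles per sample, up to the Nyquist limit
        acc = sum(x * cmath.exp(-2j * math.pi * freq * t)
                  for t, x in enumerate(centred))
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_distance(seq_a, seq_b, bins=16):
    """Euclidean distance between the two magnitude spectra."""
    spec_a = magnitude_spectrum(doublet_signal(seq_a), bins)
    spec_b = magnitude_spectrum(doublet_signal(seq_b), bins)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))


if __name__ == "__main__":
    print(doublet_signal("ACGT"))
    print(spectral_distance("ACGTACGTAC", "AAAAAAAAAA"))
```

Using doublets gives 16 possible amplitude values instead of the 4 available from a single-nucleotide mapping, which is the property the abstract highlights.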
Primer in Genetics and Genomics, Article 2-Advancing Nursing Research With Genomic Approaches.

PubMed

Lee, Hyunhwa; Gill, Jessica; Barr, Taura; Yun, Sijung; Kim, Hyungsuk

2017-03-01

Nurses investigate reasons for variable patient symptoms and responses to treatments to inform how best to improve outcomes. Genomics has the potential to guide nursing research exploring contributions to individual variability. This article is meant to serve as an introduction to the novel methods available through genomics for addressing this critical issue and includes a review of methodological considerations for selected genomic approaches. This review presents essential concepts in genetics and genomics that will allow readers to identify upcoming trends in genomics nursing research and improve research practice. It introduces general principles of genomic research and provides an overview of the research process. It also highlights selected nursing studies that serve as clinical examples of the use of genomic technologies. Finally, the authors provide suggestions about how to apply genomic technology in nursing research along with directions for future research. Using genomic approaches in nursing research can advance the understanding of the complex pathophysiology of disease susceptibility and different patient responses to interventions. Nurses should be incorporating genomics into education, clinical practice, and research as the influence of genomics in health-care research and practice continues to grow. Nurses are also well placed to translate genomic discoveries into improved methods for patient assessment and intervention.
Seamless editing of the chloroplast genome in plants.

PubMed

Martin Avila, Elena; Gisby, Martin F; Day, Anil

2016-07-29

Gene editing technologies enable the precise insertion of favourable mutations and performance enhancing trait genes into chromosomes whilst excluding all excess DNA from modified genomes. The technology gives rise to a new class of biotech crops which is likely to have widespread applications in agriculture. Despite progress in the nucleus, the seamless insertions of point mutations and non-selectable foreign genes into the organelle genomes of crops have not been described. The chloroplast genome is an attractive target to improve photosynthesis and crop performance. Current chloroplast genome engineering technologies for introducing point mutations into native chloroplast genes leave DNA scars, such as the target sites for recombination enzymes. Seamless editing methods to modify chloroplast genes need to address reversal of site-directed point mutations by template mediated repair with the vast excess of wild type chloroplast genomes that are present early in the transformation process. Using tobacco, we developed an efficient two-step method to edit a chloroplast gene by replacing the wild type sequence with a transient intermediate. This was resolved to the final edited gene by recombination between imperfect direct repeats. Six out of 11 transplastomic plants isolated contained the desired intermediate and at the second step this was resolved to the edited chloroplast gene in five of six plants tested. Maintenance of a single base deletion mutation in an imperfect direct repeat of the native chloroplast rbcL gene showed the limited influence of biased repair back to the wild type sequence. The deletion caused a frameshift, which replaced the five C-terminal amino acids of the Rubisco large subunit with 16 alternative residues resulting in a ~30-fold reduction in its accumulation. We monitored the process in vivo by engineering an overlapping gusA gene downstream of the edited rbcL gene. 
Translational coupling between the overlapping rbcL and gusA genes resulted in relatively high GUS accumulation (~0.5 % of leaf protein). Editing chloroplast genomes using transient imperfect direct repeats provides an efficient method for introducing point mutations into chloroplast genes. Moreover, we describe the first synthetic operon allowing expression of a downstream overlapping gene by translational coupling in chloroplasts. Overlapping genes provide a new mechanism for co-ordinating the translation of foreign proteins in chloroplasts.
Single-Cell Genomic Analysis in Plants

PubMed Central

Hu, Haifei; Scheben, Armin; Edwards, David

2018-01-01

Individual cells in an organism are variable, which strongly impacts cellular processes. Advances in sequencing technologies have enabled single-cell genomic analysis to become widespread, addressing shortcomings of analyses conducted on populations of bulk cells. While the field of single-cell plant genomics is in its infancy, there is great potential to gain insights into cell lineage and functional cell types to help understand complex cellular interactions in plants. In this review, we discuss current approaches for single-cell plant genomic analysis, with a focus on single-cell isolation, DNA amplification, next-generation sequencing, and bioinformatics analysis. We outline the technical challenges of analysing material from a single plant cell, and then examine applications of single-cell genomics and the integration of this approach with genome editing. Finally, we indicate future directions we expect in the rapidly developing field of plant single-cell genomic analysis. PMID:29361790
Chromatin Insulators and Topological Domains: Adding New Dimensions to 3D Genome Architecture

PubMed Central

Matharu, Navneet K.; Ahanger, Sajad H.

2015-01-01

The spatial organization of metazoan genomes has a direct influence on fundamental nuclear processes that include transcription, replication, and DNA repair. It is imperative to understand the mechanisms that shape the 3D organization of the eukaryotic genomes. Chromatin insulators have emerged as one of the central components of the genome organization tool-kit across species. Recent advancements in chromatin conformation capture technologies have provided important insights into the architectural role of insulators in genomic structuring. Insulators are involved in 3D genome organization at multiple spatial scales and are important for dynamic reorganization of chromatin structure during reprogramming and differentiation. In this review, we will discuss the classical view and our renewed understanding of insulators as global genome organizers. We will also discuss the plasticity of chromatin structure and its re-organization during pluripotency and differentiation and in situations of cellular stress. PMID:26340639
The genetic overlap between schizophrenia and height.

PubMed

Bacanu, Silviu-Alin; Chen, Xianging; Kendler, Kenneth S

2013-12-01

Epidemiological studies suggest that height and schizophrenia risk are inversely correlated. These findings might arise because i) height and schizophrenia share genetic variants and ii) the effects of these shared variants are in opposite directions for the two traits. We use genome-wide association data to empirically evaluate these hypotheses. We find that variants which affect height and risk for schizophrenia are distributed across several genomic regions and that the directions of effect vary, some consistent and others inconsistent with the direction expected from the phenotypic data. Moreover, signals that were in accord with the phenotypic data and signals that were not aggregated in distinct biological pathways. © 2013 Elsevier B.V. All rights reserved.
Confidentiality and data sharing: vulnerabilities of the Mexican Genomics Sovereignty Act.

PubMed

Rojas-Martínez, Augusto

2015-07-01

A law known as the "Genomic Sovereignty Act", instituted in 2011, regulates research on the human genome in Mexico. This law establishes Government regulations for the exportation of DNA samples from Mexican nationals for population genetics studies. The Genomic Sovereignty Act protects fundamental human values, such as confidentiality and non-discrimination based on personal genetic information. It also supports the development of genome-based medical biotechnology and the bio-economy. Current laws for the protection of genomic confidentiality, however, are inexplicit and insufficient, and the legal and technological instruments are too primitive to safeguard this bioethical principle. In addition, this law may undermine efforts of the national and international scientific communities to cooperate in big-data analysis for the development of the genome-based biomedical sciences. The argument of this article is that deficiencies in the protection of the confidentiality of genomic information and limitations in data sharing severely weaken the objectives and scope of the Genomic Sovereignty Act. In addition, the Act may compromise national biomedical development and international cooperation for research and development in the field of human genomics.
The 1000 bull genome project

USDA-ARS's Scientific Manuscript database

To meet growing global demands for high value protein from milk and meat, rates of genetic gain in domestic cattle must be accelerated. At the same time, animal health and welfare must be considered. The 1000 bull genomes project supports these goals by providing annotated sequence variants and ge...
Comparison of genomic-enhanced EPD systems using an external phenotypic database

USDA-ARS's Scientific Manuscript database

The American Angus Association (AAA) is currently evaluating two methods to incorporate genomic information into their genetic evaluation program: 1) multi-trait incorporation of an externally produced molecular breeding value as an indicator trait (MT) and 2) single-step evaluation with an unweight...
Validating genomic reliabilities and gains from phenotypic updates

USDA-ARS's Scientific Manuscript database

Reliability can be validated from the variance of the difference of earlier and later estimated breeding values as a fraction of the genetic variance. This new method avoids using squared correlations that can be biased downward by selection. Published genomic reliabilities of U.S. young bulls agree...
Chloroplast genomes: diversity, evolution, and applications in genetic engineering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daniell, Henry; Lin, Choun-Sea; Yu, Ming

Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.
Chloroplast genomes: diversity, evolution, and applications in genetic engineering

DOE PAGES

Daniell, Henry; Lin, Choun-Sea; Yu, Ming; ...

2016-06-23

Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.
Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels.

PubMed

Erbe, M; Hayes, B J; Matukumalli, L K; Goswami, S; Bowman, P J; Reich, C M; Mason, B A; Goddard, M E

2012-07-01

Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. 
The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
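The abstract describes the BayesR prior as a mixture of normal distributions for SNP effects, with one component that sets the effect to zero. The Python sketch below only draws effects from such a mixture to show its shape. The function name, the example variances and the mixing proportions are hypothetical, and the sketch performs no inference, so it should not be read as an implementation of BayesR.

```python
import random


def sample_snp_effects(n_snps, variances, proportions, seed=0):
    """Draw SNP effects from a mixture of zero-mean normal distributions.

    A variance of 0 stands for the component that sets the effect to
    exactly zero. `variances` and `proportions` are illustrative inputs.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_snps):
        # Pick a mixture component according to the mixing proportions.
        var = rng.choices(variances, weights=proportions, k=1)[0]
        effects.append(0.0 if var == 0 else rng.gauss(0.0, var ** 0.5))
    return effects


if __name__ == "__main__":
    # Hypothetical settings: most SNPs have no effect, a few have large ones.
    effects = sample_snp_effects(
        n_snps=1000,
        variances=[0.0, 1e-4, 1e-3, 1e-2],
        proportions=[0.90, 0.07, 0.02, 0.01],
    )
    print(sum(e == 0.0 for e in effects), "of", len(effects), "effects are zero")
```

A prior of this shape lets a model shrink most markers to zero while leaving room for a small number of larger effects, in contrast to the single normal distribution implied by GBLUP.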
Joint genomic evaluation of French dairy cattle breeds using multiple-trait models.

PubMed

Karoui, Sofiene; Carabaño, María Jesús; Díaz, Clara; Legarra, Andrés

2012-12-07

Using a multi-breed reference population might be a way of increasing the accuracy of genomic breeding values in small breeds. Models involving mixed-breed data do not take into account the fact that marker effects may differ among breeds. This study was aimed at investigating the impact on accuracy of increasing the number of genotyped candidates in the training set by using a multi-breed reference population, in contrast to single-breed genomic evaluations. Three traits (milk production, fat content and female fertility) were analyzed by genomic mixed linear models and Bayesian methodology. Three breeds of French dairy cattle were used: Holstein, Montbéliarde and Normande with 2976, 950 and 970 bulls in the training population, respectively and 964, 222 and 248 bulls in the validation population, respectively. All animals were genotyped with the Illumina Bovine SNP50 array. Accuracy of genomic breeding values was evaluated under three scenarios for the correlation of genomic breeding values between breeds (r(g)): uncorrelated (1), r(g) = 0; estimated r(g) (2); high, r(g) = 0.95 (3). Accuracy and bias of predictions obtained in the validation population with the multi-breed training set were assessed by the coefficient of determination (R(2)) and by the regression coefficient of daughter yield deviations of validation bulls on their predicted genomic breeding values, respectively. The genetic variation captured by the markers for each trait was similar to that estimated for routine pedigree-based genetic evaluation. Posterior means for rg ranged from -0.01 for fertility between Montbéliarde and Normande to 0.79 for milk yield between Montbéliarde and Holstein. Differences in R(2) between the three scenarios were notable only for fat content in the Montbéliarde breed: from 0.27 in scenario (1) to 0.33 in scenarios (2) and (3). Accuracies for fertility were lower than for other traits. 
Using a multi-breed reference population resulted in small or no increases in accuracy. Only the breed with a small data set and large genetic correlation with the breed with a large data set showed increased accuracy for the traits with moderate (milk) to high (fat content) heritability. No benefit was observed for fertility, a lowly heritable trait.
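The validation statistics named in this abstract, the coefficient of determination and the regression coefficient of daughter yield deviations (DYD) on predicted genomic breeding values, can be computed as in the following Python sketch. The function name and the toy numbers are assumptions; the study's weighting of records, if any, is not reproduced here.

```python
def validation_stats(dyd, gebv):
    """Return (r_squared, slope) for the simple regression of DYD on GEBV.

    r_squared serves as the accuracy measure; the slope serves as the
    bias measure, with a value near 1 commonly read as little inflation
    or deflation of the predictions.
    """
    n = len(dyd)
    mean_y = sum(dyd) / n
    mean_x = sum(gebv) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gebv, dyd))
    sxx = sum((x - mean_x) ** 2 for x in gebv)
    syy = sum((y - mean_y) ** 2 for y in dyd)
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return r_squared, slope


if __name__ == "__main__":
    # Hypothetical validation bulls: predicted values and observed DYD.
    gebv = [0.2, -0.1, 0.5, 0.0, -0.4]
    dyd = [0.3, -0.2, 0.4, 0.1, -0.5]
    r2, b = validation_stats(dyd, gebv)
    print("R^2:", r2, "slope:", b)
```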
Joint genomic evaluation of French dairy cattle breeds using multiple-trait models

PubMed Central

2012-01-01

Background: Using a multi-breed reference population might be a way of increasing the accuracy of genomic breeding values in small breeds. Models involving mixed-breed data do not take into account the fact that marker effects may differ among breeds. This study was aimed at investigating the impact on accuracy of increasing the number of genotyped candidates in the training set by using a multi-breed reference population, in contrast to single-breed genomic evaluations. Methods: Three traits (milk production, fat content and female fertility) were analyzed by genomic mixed linear models and Bayesian methodology. Three breeds of French dairy cattle were used: Holstein, Montbéliarde and Normande with 2976, 950 and 970 bulls in the training population, respectively and 964, 222 and 248 bulls in the validation population, respectively. All animals were genotyped with the Illumina Bovine SNP50 array. Accuracy of genomic breeding values was evaluated under three scenarios for the correlation of genomic breeding values between breeds (rg): uncorrelated (1), rg = 0; estimated rg (2); high, rg = 0.95 (3). Accuracy and bias of predictions obtained in the validation population with the multi-breed training set were assessed by the coefficient of determination (R2) and by the regression coefficient of daughter yield deviations of validation bulls on their predicted genomic breeding values, respectively. Results: The genetic variation captured by the markers for each trait was similar to that estimated for routine pedigree-based genetic evaluation. Posterior means for rg ranged from −0.01 for fertility between Montbéliarde and Normande to 0.79 for milk yield between Montbéliarde and Holstein. Differences in R2 between the three scenarios were notable only for fat content in the Montbéliarde breed: from 0.27 in scenario (1) to 0.33 in scenarios (2) and (3). Accuracies for fertility were lower than for other traits. 
Conclusions: Using a multi-breed reference population resulted in small or no increases in accuracy. Only the breed with a small data set and large genetic correlation with the breed with a large data set showed increased accuracy for the traits with moderate (milk) to high (fat content) heritability. No benefit was observed for fertility, a lowly heritable trait. PMID:23216664
Single-Step BLUP with Varying Genotyping Effort in Open-Pollinated Picea glauca.

PubMed

Ratcliffe, Blaise; El-Dien, Omnia Gamal; Cappa, Eduardo P; Porth, Ilga; Klápště, Jaroslav; Chen, Charles; El-Kassaby, Yousry A

2017-03-10

Maximization of genetic gain in forest tree breeding programs is contingent on the accuracy of the predicted breeding values and precision of the estimated genetic parameters. We investigated the effect of the combined use of contemporary pedigree information and genomic relatedness estimates on the accuracy of predicted breeding values and precision of estimated genetic parameters, as well as rankings of selection candidates, using single-step genomic evaluation (HBLUP). In this study, two traits with diverse heritabilities [tree height (HT) and wood density (WD)] were assessed at various levels of family genotyping efforts (0, 25, 50, 75, and 100%) from a population of white spruce (Picea glauca) consisting of 1694 trees from 214 open-pollinated families, representing 43 provenances in Québec, Canada. The results revealed that HBLUP bivariate analysis is effective in reducing the known bias in heritability estimates of open-pollinated populations, as it exposes hidden relatedness, potential pedigree errors, and inbreeding. The addition of genomic information in the analysis considerably improved the accuracy in breeding value estimates by accounting for both Mendelian sampling and historical coancestry that were not captured by the contemporary pedigree alone. Increasing family genotyping efforts were associated with continuous improvement in model fit, precision of genetic parameters, and breeding value accuracy. Yet, improvements were observed even at minimal genotyping effort, indicating that even modest genotyping effort is effective in improving genetic evaluation. The combined utilization of both pedigree and genomic information may be a cost-effective approach to increase the accuracy of breeding values in forest tree breeding programs where shallow pedigrees and large testing populations are the norm. Copyright © 2017 Ratcliffe et al.
Effects of dairy slurry on silage fermentation characteristics and nutritive value of alfalfa.

PubMed

Coblentz, W K; Muck, R E; Borchardt, M A; Spencer, S K; Jokela, W E; Bertram, M G; Coffey, K P

2014-11-01

Dairy producers frequently ask questions about the risks associated with applying dairy slurry to growing alfalfa (Medicago sativa L.). Our objectives were to determine the effects of applying dairy slurry on the subsequent nutritive value and fermentation characteristics of alfalfa balage. Dairy slurry was applied to 0.17-ha plots of alfalfa; applications were made to the second (HARV1) and third (HARV2) cuttings during June and July of 2012, respectively, at mean rates of 42,400 ± 5271 and 41,700 ± 2397 L/ha, respectively. Application strategies included (1) no slurry, (2) slurry applied directly to stubble immediately after the preceding harvest, (3) slurry applied after 1 wk of post-ensiled regrowth, or (4) slurry applied after 2 wk of regrowth. All harvested forage was packaged in large, rectangular bales that were ensiled as wrapped balage. Yields of DM harvested from HARV1 (2,477 kg/ha) and HARV2 (781 kg/ha) were not affected by slurry application treatment. By May 2013, all silages appeared to be well preserved, with no indication of undesirable odors characteristic of clostridial fermentations. Clostridium tyrobutyricum, which is known to negatively affect cheese production, was not detected in any forage on either a pre- or post-ensiled basis. On a pre-ensiled basis, counts for Clostridium cluster 1 were greater for slurry-applied plots than for those receiving no slurry, and this response was consistent for HARV1 (4.44 vs. 3.29 log10 genomic copies/g) and HARV2 (4.99 vs. 3.88 log10 genomic copies/g). Similar responses were observed on a post-ensiled basis; however, post-ensiled counts also were greater for HARV1 (5.51 vs. 5.17 log10 genomic copies/g) and HARV2 (5.84 vs. 5.28 log10 genomic copies/g) when slurry was applied to regrowth compared with stubble. For HARV2, counts also were greater following a 2-wk application delay compared with a 1-wk delay (6.23 vs. 5.45 log10 genomic copies/g). 
These results suggest that the risk of clostridial fermentations in alfalfa silages is greater following applications of slurry. Based on pre- and post-ensiled clostridial counts, applications of dairy slurry on stubble are preferred (and less risky) compared with delayed applications on growing alfalfa. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Short communication: Improving the accuracy of genomic prediction of body conformation traits in Chinese Holsteins using markers derived from high-density marker panels.

PubMed

Song, H; Li, L; Ma, P; Zhang, S; Su, G; Lund, M S; Zhang, Q; Ding, X

2018-06-01

This study investigated the efficiency of genomic prediction when markers identified by a genome-wide association study (GWAS) were added, using a data set of high-density (HD) markers imputed from 54K markers in Chinese Holsteins. Among 3,056 Chinese Holsteins with imputed HD data, 2,401 individuals born before October 1, 2009, were used for GWAS and as a reference population for genomic prediction, and the 220 younger cows were used as a validation population. In total, 1,403, 1,536, and 1,383 significant single nucleotide polymorphisms (SNP; false discovery rate at 0.05) associated with conformation final score, mammary system, and feet and legs were identified, respectively. About 2 to 3% of the genetic variance of the 3 traits was explained by these significant SNP. Only a very small proportion of significant SNP identified by GWAS was included in the 54K marker panel. Three new marker sets (54K+) were herein produced by adding significant SNP obtained by linear mixed model for each trait into the 54K marker panel. Genomic breeding values were predicted using a Bayesian variable selection (BVS) model. The accuracies of genomic breeding value by BVS based on the 54K+ data were 2.0 to 5.2% higher than those based on the 54K data. The imputed HD markers yielded 1.4% higher accuracy on average (BVS) than the 54K data. Both the 54K+ and HD data generated lower bias of genomic prediction, and the 54K+ data yielded the lowest bias in all situations. Our results show that the imputed HD data were not very useful for improving the accuracy of genomic prediction and that adding the significant markers derived from the imputed HD marker panel could improve the accuracy of genomic prediction and decrease the bias of genomic prediction. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
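
The marker-set construction described in this abstract (select SNP at a false discovery rate of 0.05, then add them to the 54K panel) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step-up rule is assumed here, and the function names `benjamini_hochberg` and `augment_panel` as well as the SNP identifiers are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return sorted indices of tests declared significant at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure (not stated in the abstract).
    """
    m = len(pvalues)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value passes its threshold.
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


def augment_panel(base_panel, snp_ids, pvalues, fdr=0.05):
    """Union of a base marker panel with the GWAS-significant SNP (a '54K+'-style set)."""
    significant = {snp_ids[i] for i in benjamini_hochberg(pvalues, fdr)}
    return set(base_panel) | significant


# Hypothetical toy data: four HD SNP tested, two markers already on the base panel.
hd_snps = ["snp_a", "snp_b", "snp_c", "snp_d"]
pvals = [0.001, 0.03, 0.035, 0.5]
panel_plus = augment_panel({"chip_1", "chip_2"}, hd_snps, pvals)
```

In this toy case the second p-value fails its own threshold, but the third passes, so the step-up rule keeps the first three SNP and the augmented panel holds five markers.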

Re-Examining the Gene in Personalized Genomics

ERIC Educational Resources Information Center

Bartol, Jordan

2013-01-01

Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept…
The genome of Diuraphis noxia, a global pest of small grains

USDA-ARS's Scientific Manuscript database

The Russian wheat aphid (Diuraphis noxia) is the world's most destructive grain aphid, producing unique phytotoxic damage symptoms that result directly from salivary proteins injected into the host plant while feeding. We sequenced and assembled the genome of D. noxia biotype 2, the most widely des...
Construction of high-quality recombination maps with low-coverage genomic sequencing for joint linkage analysis in maize

USDA-ARS's Scientific Manuscript database

A genome-wide association study (GWAS) is the foremost strategy used for finding genes that control human diseases and agriculturally important traits, but it often reports false positives. In contrast, its complementary method, linkage analysis, provides direct genetic confirmation, but with limite...
Genome-wide association identifies genomic regions associated with entropion in domestic sheep

USDA-ARS's Scientific Manuscript database

Entropion is an inversion of the eyelid allowing direct contact between the eyelashes and cornea, potentially causing blindness if not treated. In domestic sheep, entropion has a variable frequency (0-80%) worldwide and is heritable (heritability 0.08-0.21). Identification of genes associated with e...
The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

PubMed

Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

2016-04-01

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
Phenotypic diversification by enhanced genome restructuring after induction of multiple DNA double-strand breaks.

PubMed

Muramoto, Nobuhiko; Oda, Arisa; Tanaka, Hidenori; Nakamura, Takahiro; Kugou, Kazuto; Suda, Kazuki; Kobayashi, Aki; Yoneda, Shiori; Ikeuchi, Akinori; Sugimoto, Hiroki; Kondo, Satoshi; Ohto, Chikara; Shibata, Takehiko; Mitsukawa, Norihiro; Ohta, Kunihiro

2018-05-18

DNA double-strand break (DSB)-mediated genome rearrangements are assumed to provide diverse raw genetic materials enabling accelerated adaptive evolution; however, the consequences of massive simultaneous DSB formation in cells and the resulting phenotypic impact remain unclear. Here, we establish an artificial genome-restructuring technology by conditionally introducing multiple genomic DSBs in vivo using a temperature-dependent endonuclease TaqI. Application in yeast and Arabidopsis thaliana generates strains with phenotypes, including improved ethanol production from xylose at higher temperature and increased plant biomass, that are stably inherited by offspring after multiple passages. High-throughput genome resequencing revealed that these strains harbor diverse rearrangements, including copy number variations, translocations in retrotransposons, and direct end-joinings at TaqI-cleavage sites. Furthermore, large-scale rearrangements occur frequently in diploid yeasts (28.1%) and tetraploid plants (46.3%), whereas haploid yeasts and diploid plants undergo minimal rearrangement. This genome-restructuring system (TAQing system) will enable rapid genome breeding and aid genome-evolution studies.
APPLaUD: access for patients and participants to individual level uninterpreted genomic data.

PubMed

Thorogood, Adrian; Bobe, Jason; Prainsack, Barbara; Middleton, Anna; Scott, Erick; Nelson, Sarah; Corpas, Manuel; Bonhomme, Natasha; Rodriguez, Laura Lyman; Murtagh, Madeleine; Kleiderman, Erika

2018-02-17

There is growing support for the stance that patients and research participants should have better and easier access to their raw (uninterpreted) genomic sequence data in both clinical and research contexts. We review legal frameworks and literature on the benefits, risks, and practical barriers of providing individuals access to their data. We also survey genomic sequencing initiatives that provide or plan to provide individual access. Many patients and research participants expect to be able to access their health and genomic data. Individuals have a legal right to access their genomic data in some countries and contexts. Moreover, increasing numbers of participatory research projects, direct-to-consumer genetic testing companies, and now major national sequencing initiatives grant individuals access to their genomic sequence data upon request. Drawing on current practice and regulatory analysis, we outline legal, ethical, and practical guidance for genomic sequencing initiatives seeking to offer interested patients and participants access to their raw genomic data.
The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

PubMed Central

Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

2016-01-01

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095
Temperature regulates splicing efficiency of the cold-inducible RNA-binding protein gene Cirbp

PubMed Central

Gotic, Ivana; Omidi, Saeed; Fleury-Olela, Fabienne; Molina, Nacho; Naef, Felix; Schibler, Ueli

2016-01-01

In mammals, body temperature fluctuates diurnally around a mean value of 36°C–37°C. Despite the small differences between minimal and maximal values, body temperature rhythms can drive robust cycles in gene expression in cultured cells and, likely, animals. Here we studied the mechanisms responsible for the temperature-dependent expression of cold-inducible RNA-binding protein (CIRBP). In NIH3T3 fibroblasts exposed to simulated mouse body temperature cycles, Cirbp mRNA oscillates about threefold in abundance, as it does in mouse livers. This daily mRNA accumulation cycle is directly controlled by temperature oscillations and does not depend on the cells’ circadian clocks. Here we show that the temperature-dependent accumulation of Cirbp mRNA is controlled primarily by the regulation of splicing efficiency, defined as the fraction of Cirbp pre-mRNA processed into mature mRNA. As revealed by genome-wide “approach to steady-state” kinetics, this post-transcriptional mechanism is widespread in the temperature-dependent control of gene expression. PMID:27633015
Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025

PubMed Central

Bruford, Michael W.; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J.; Amaral, Andreia J.; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F.; Hall, Stephen J. G.; Hanotte, Olivier; Hassan, Faiz-ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A.; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L.; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

2015-01-01

Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that “…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity.” However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. 
It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are societal in origin and are predicated on the value (e.g., socio-economic and cultural) of these resources to farmers, rural communities and society as a whole. The overall conclusion is that despite the fact that the livestock sector has been relatively well-organized in the application of genetic methodologies to date, there is still a large gap between the current state-of-the-art in the use of tools to characterize genomic resources and its application to many non-commercial and local breeds, hampering the consistent utilization of genetic and genomic data as indicators of genetic erosion and diversity. The livestock genomic sector therefore needs to make a concerted effort in the coming decade to enable the democratization of the powerful tools that are now at its disposal, and to ensure that they are applied in the context of breed conservation as well as development. PMID:26539210
Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025.

PubMed

Bruford, Michael W; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J; Amaral, Andreia J; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F; Hall, Stephen J G; Hanotte, Olivier; Hassan, Faiz-Ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

2015-01-01

Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that "…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity." However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. 
It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are societal in origin and are predicated on the value (e.g., socio-economic and cultural) of these resources to farmers, rural communities and society as a whole. The overall conclusion is that despite the fact that the livestock sector has been relatively well-organized in the application of genetic methodologies to date, there is still a large gap between the current state-of-the-art in the use of tools to characterize genomic resources and its application to many non-commercial and local breeds, hampering the consistent utilization of genetic and genomic data as indicators of genetic erosion and diversity. The livestock genomic sector therefore needs to make a concerted effort in the coming decade to enable the democratization of the powerful tools that are now at its disposal, and to ensure that they are applied in the context of breed conservation as well as development.
Genomics education for medical professionals - the current UK landscape.

PubMed

Slade, Ingrid; Subramanian, Deepak N; Burton, Hilary

2016-08-01

Genomics education in the UK is at an early stage of development, and its pace of evolution has lagged behind that of the genomics research upon which it is based. As a result, knowledge of genomics and its applications remains limited among non-specialist clinicians. In this review article, we describe the complex landscape for genomics education within the UK, and highlight the large number and variety of organisations that can influence, direct and provide genomics training to medical professionals. Postgraduate genomics education is being shaped by the work of the Health Education England (HEE) Genomics Education Programme, working in conjunction with the Joint Committee on Genomics in Medicine. The success of their work will be greatly enhanced by the full cooperation and engagement of the many groups, societies and organisations involved with medical education and training (such as the royal colleges). Without this cooperation, there is a risk of poor coordination and unnecessary duplication of work. Leadership from an organisation such as the HEE Genomics Education Programme will have a key role in guiding the formulation and delivery of genomics education policy by various stakeholders among the different disciplines in medicine. © 2016 Royal College of Physicians.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

PubMed Central

Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

2016-01-01

Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes. PMID:27437173
Reconstructing rare soil microbial genomes using in situ enrichments and metagenomics

PubMed Central

Delmont, Tom O.; Eren, A. Murat; Maccario, Lorrie; Prestat, Emmanuel; Esen, Özcan C.; Pelletier, Eric; Le Paslier, Denis; Simonet, Pascal; Vogel, Timothy M.

2015-01-01

Despite extensive direct sequencing efforts and advanced analytical tools, reconstructing microbial genomes from soil using metagenomics has been challenging due to the tremendous diversity and relatively uniform distribution of genomes found in this system. Here we used enrichment techniques in an attempt to decrease the complexity of a soil microbiome prior to sequencing by submitting it to a range of physical and chemical stresses in 23 separate microcosms for 4 months. The metagenomic analysis of these microcosms at the end of the treatment yielded 540 Mb of assembly using standard de novo assembly techniques (a total of 559,555 genes and 29,176 functions), from which we could recover novel bacterial genomes, plasmids and phages. The recovered genomes belonged to Leifsonia (n = 2), Rhodanobacter (n = 5), Acidobacteria (n = 2), Sporolactobacillus (n = 2, novel nitrogen fixing taxon), Ktedonobacter (n = 1, second representative of the family Ktedonobacteraceae), Streptomyces (n = 3, novel polyketide synthase modules), and Burkholderia (n = 2, includes mega-plasmids conferring mercury resistance). Assembled genomes averaged 5.9 Mb, with relative abundances ranging from rare (<0.0001%) to relatively abundant (>0.01%) in the original soil microbiome. Furthermore, we detected them in samples collected from geographically distant locations, more often in temperate soils than in samples originating from high-latitude soils and deserts. To the best of our knowledge, this study is the first successful attempt to assemble multiple bacterial genomes directly from a soil sample. Our findings demonstrate that developing pertinent enrichment conditions can stimulate environmental genomic discoveries that would have been impossible to achieve with canonical approaches that focus solely upon post-sequencing data treatment. PMID:25983722
Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake

USGS Publications Warehouse

Castoe, Todd A.; Poole, Alexander W.; de Koning, A. P. Jason; Jones, Kenneth L.; Tomback, Diana F.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Lance, Stacey L.; Streicher, Jeffrey W.; Smith, Eric N.; Pollock, David D.

2012-01-01

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct "Seq-to-SSR" approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample – a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable.
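
The core step of a direct "Seq-to-SSR" scan, finding perfect tandem repeats in a read without pre-selecting motifs, can be illustrated with a short sketch. It is not the authors' pipeline: the function name `find_ssrs`, the default unit lengths (2 to 6 bp) and the minimum of four repeats are assumptions chosen for illustration, and the sketch does not attempt the flanking-primer design that makes a locus "potentially amplifiable".

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    Illustrative thresholds only; they are assumptions, not values from the abstract.
    """
    seq = seq.upper()
    # A unit of min_unit..max_unit bases (shortest tried first), followed by
    # at least (min_repeats - 1) further copies of that same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical reads: a dinucleotide and a tetranucleotide repeat.
dinucleotide_hits = find_ssrs("TTGACACACACACGGT")
tetranucleotide_hits = find_ssrs("GATAGATAGATAGATA")
```

Because no motif list is supplied, any repeat unit within the length range is reported, which mirrors the advantage the abstract attributes to the direct approach over enrichment-based strategies.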
Rapid microsatellite identification from illumina paired-end genomic sequencing in two birds and a snake

USGS Publications Warehouse

Castoe, T.A.; Poole, A.W.; de Koning, A. P. J.; Jones, K.L.; Tomback, D.F.; Oyler-McCance, S.J.; Fike, J.A.; Lance, S.L.; Streicher, J.W.; Smith, E.N.; Pollock, D.D.

2012-01-01

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct "Seq-to-SSR" approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample - a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable. © 2012 Castoe et al.
Impact of Direct-to-Consumer Predictive Genomic Testing on Risk Perception and Worry Among Patients Receiving Routine Care in a Preventive Health Clinic

PubMed Central

James, Katherine M.; Cowl, Clayton T.; Tilburt, Jon C.; Sinicrope, Pamela S.; Robinson, Marguerite E.; Frimannsdottir, Katrin R.; Tiedje, Kristina; Koenig, Barbara A.

2011-01-01

OBJECTIVE: To assess the impact of direct-to-consumer (DTC) predictive genomic risk information on perceived risk and worry in the context of routine clinical care. PATIENTS AND METHODS: Patients attending a preventive medicine clinic between June 1 and December 18, 2009, were randomly assigned to receive either genomic risk information from a DTC product plus usual care (n=74) or usual care alone (n=76). At intervals of 1 week and 1 year after their clinic visit, participants completed surveys containing validated measures of risk perception and levels of worry associated with the 12 conditions assessed by the DTC product. RESULTS: Of 345 patients approached, 150 (43%) agreed to participate, 64 (19%) refused, and 131 (38%) did not respond. Compared with those receiving usual care, participants who received genomic risk information initially rated their risk as higher for 4 conditions (abdominal aneurysm [P=.001], Graves disease [P=.04], obesity [P=.01], and osteoarthritis [P=.04]) and lower for one (prostate cancer [P=.02]). Although differences were not significant, they also reported higher levels of worry for 7 conditions and lower levels for 5 others. At 1 year, there were no significant differences between groups. CONCLUSION: Predictive genomic risk information modestly influences risk perception and worry. The extent and direction of this influence may depend on the condition being tested and its baseline prominence in preventive health care and may attenuate with time. Trial Registration: clinicaltrials.gov identifier: NCT00782366 PMID:21964170
Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake.

PubMed

Castoe, Todd A; Poole, Alexander W; de Koning, A P Jason; Jones, Kenneth L; Tomback, Diana F; Oyler-McCance, Sara J; Fike, Jennifer A; Lance, Stacey L; Streicher, Jeffrey W; Smith, Eric N; Pollock, David D

2012-01-01

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct "Seq-to-SSR" approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample--a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable.
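The core step of any Seq-to-SSR approach, scanning reads for tandem repeats, can be illustrated with a short sketch. This is not the authors' pipeline: the motif lengths (2-6 bp), the minimum of four copies, and the function name `find_ssrs` are illustrative assumptions, and the sketch ignores the flanking sequence that a potentially amplifiable locus needs for primer design.

```python
import re

# Perfect tandem repeats: a 2-6 bp motif followed by at least 3 more copies.
# Motif lengths and the copy-number threshold are assumptions for illustration.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(read):
    """Return (start, motif, copies) for each perfect SSR found in a read."""
    hits = []
    for match in SSR_PATTERN.finditer(read.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAAAA, which are not 2-6 bp SSRs.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTGACACACACACAGGT"))
```

The lazy quantifier makes the scan try the shortest motif first, so a run of AC is reported as a dinucleotide repeat rather than as a repeat of ACAC.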
WEGO 2.0: a web tool for analyzing and plotting GO annotations, 2018 update.

PubMed

Ye, Jia; Zhang, Yong; Cui, Huihai; Liu, Jiawei; Wu, Yuqing; Cheng, Yun; Xu, Huixing; Huang, Xingxin; Li, Shengting; Zhou, An; Zhang, Xiuqing; Bolund, Lars; Chen, Qiang; Wang, Jian; Yang, Huanming; Fang, Lin; Shi, Chunmei

2018-05-18

WEGO (Web Gene Ontology Annotation Plot), created in 2006, is a simple but useful tool for visualizing, comparing and plotting GO (Gene Ontology) annotation results. Owing largely to the rapid development of high-throughput sequencing and the increasing acceptance of GO, WEGO has attracted a large number of users and citations in recent years, which motivated us to update to version 2.0. WEGO uses the GO annotation results as input. Based on GO's standardized DAG (Directed Acyclic Graph) structured vocabulary system, the number of genes corresponding to each GO ID is calculated and shown in a graphical format. WEGO 2.0 updates have targeted four aspects, aiming to provide a more efficient and up-to-date approach for comparative genomic analyses. First, the number of input files, previously limited to three, is now unlimited, allowing WEGO to analyze multiple datasets. Also added in this version are the reference datasets of nine model species that can be adopted as baselines in genomic comparative analyses. Furthermore, during analysis each chi-square test is carried out across multiple datasets instead of between every two samples. Finally, WEGO 2.0 provides an additional output graph along with the traditional WEGO histogram, displaying the sorted P-values of GO terms and indicating their significant differences. At the same time, WEGO 2.0 features an entirely new user interface. WEGO is available for free at http://wego.genomics.org.cn.
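A chi-square test across several datasets for a single GO term can be sketched as below. This is only an illustration of a Pearson chi-square statistic on a 2 x k table (genes annotated vs. not annotated with the term, in k datasets); the abstract does not describe WEGO's actual implementation, and the function name and the counts are hypothetical.

```python
def chi_square_statistic(annotated, totals):
    """Pearson chi-square statistic and degrees of freedom for one GO term.

    annotated[i] is the number of genes carrying the term in dataset i and
    totals[i] is the number of genes in dataset i (hypothetical inputs).
    """
    grand_total = sum(totals)
    grand_annotated = sum(annotated)
    if grand_annotated in (0, grand_total):
        raise ValueError("term is present in no genes or in all genes")
    statistic = 0.0
    for count, size in zip(annotated, totals):
        rows = (
            (count, grand_annotated),                       # annotated genes
            (size - count, grand_total - grand_annotated),  # remaining genes
        )
        for observed, row_total in rows:
            expected = row_total * size / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(totals) - 1


if __name__ == "__main__":
    # Three hypothetical datasets of 100 genes each.
    print(chi_square_statistic([10, 20, 30], [100, 100, 100]))
```

Turning the statistic into a P-value needs the chi-square distribution with the returned degrees of freedom, which is left out here to keep the sketch dependency-free.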

Consumers report lower confidence in their genetics knowledge following direct-to-consumer personal genomic testing.

PubMed

Carere, Deanna Alexis; Kraft, Peter; Kaphingst, Kimberly A; Roberts, J Scott; Green, Robert C

2016-01-01

The aim of this study was to measure changes to genetics knowledge and self-efficacy following personal genomic testing (PGT). New customers of 23andMe and Pathway Genomics completed a series of online surveys. We measured genetics knowledge (nine true/false items) and genetics self-efficacy (five Likert-scale items) before receipt of results and 6 months after results and used paired methods to evaluate change over time. Correlates of change (e.g., decision regret) were identified using linear regression. 998 PGT customers (59.9% female; 85.8% White; mean age 46.9 ± 15.5 years) were included in our analyses. Mean genetics knowledge score was 8.15 ± 0.95 (out of 9) at baseline and 8.25 ± 0.92 at 6 months (P = 0.0024). Mean self-efficacy score was 29.06 ± 5.59 (out of 35) at baseline and 27.7 ± 5.46 at 6 months (P < 0.0001); on each item, 30-45% of participants reported lower self-efficacy following PGT. Change in self-efficacy was positively associated with health-care provider consultation (P = 0.0042), impact of PGT on perceived control over one's health (P < 0.0001), and perceived value of PGT (P < 0.0001) and was negatively associated with decision regret (P < 0.0001). Lowered genetics self-efficacy following PGT may reflect an appropriate reevaluation by consumers in response to receiving complex genetic information. Genet Med 18(1), 65-72.
A universe of dwarfs and giants: genome size and chromosome evolution in the monocot family Melanthiaceae.

PubMed

Pellicer, Jaume; Kelly, Laura J; Leitch, Ilia J; Zomlefer, Wendy B; Fay, Michael F

2014-03-01

• Since the occurrence of giant genomes in angiosperms is restricted to just a few lineages, identifying where shifts towards genome obesity have occurred is essential for understanding the evolutionary mechanisms triggering this process. • Genome sizes were assessed using flow cytometry in 79 species and new chromosome numbers were obtained. Phylogenetically based statistical methods were applied to infer ancestral character reconstructions of chromosome numbers and nuclear DNA contents. • Melanthiaceae are the most diverse family in terms of genome size, with C-values ranging more than 230-fold. Our data confirmed that giant genomes are restricted to tribe Parideae, with most extant species in the family characterized by small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor (MRCA) for the family had a relatively small genome (1C = 5.37 pg). Chromosome losses and polyploidy are recovered as the main evolutionary mechanisms generating chromosome number change. • Genome evolution in Melanthiaceae has been characterized by a trend towards genome size reduction, with just one episode of dramatic DNA accumulation in Parideae. Such extreme contrasting profiles of genome size evolution illustrate the key role of transposable elements and chromosome rearrangements in driving the evolution of plant genomes. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Normalization of Complete Genome Characteristics: Application to Evolution from Primitive Organisms to Homo sapiens.

PubMed

Sorimachi, Kenji; Okayasu, Teiji; Ohhira, Shuji

2015-04-01

Normalized nucleotide and amino acid contents of complete genome sequences can be visualized as radar charts. The shapes of these charts depict the characteristics of an organism's genome. The normalized values calculated from the genome sequence theoretically exclude experimental errors. Further, because normalization is independent of both target size and kind, this procedure is applicable not only to single genes but also to whole genomes, which consist of a huge number of different genes. In this review, we discuss the applications of the normalization of the nucleotide and predicted amino acid contents of complete genomes to the investigation of genome structure and to evolutionary research from primitive organisms to Homo sapiens. Some of the results could never have been obtained from the analysis of individual nucleotide or amino acid sequences but were revealed only after the normalization of nucleotide and amino acid contents was applied to genome research. The discovery that genome structure was homogeneous was obtained only after normalization methods were applied to the nucleotide or predicted amino acid contents of genome sequences. Normalization procedures are also applicable to evolutionary research. Thus, normalization of the contents of whole genomes is a useful procedure that can help to characterize organisms.
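As a rough illustration of what normalizing nucleotide content involves, the sketch below divides each base count by the total number of unambiguous bases, which makes genomes of different sizes directly comparable. That this matches the review's exact procedure is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def normalized_nucleotide_content(sequence):
    """Return the fraction of A, C, G and T among the unambiguous bases."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    # Each value lies in [0, 1] and the four values sum to 1.
    return {base: counts[base] / total for base in "ACGT"}


if __name__ == "__main__":
    print(normalized_nucleotide_content("AACGnn"))
```

The four resulting fractions are the kind of values that could be drawn as the axes of a radar chart.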
Learning about human population history from ancient and modern genomes.

PubMed

Stoneking, Mark; Krause, Johannes

2011-08-18

Genome-wide data, both from SNP arrays and from complete genome sequencing, are becoming increasingly abundant and are now even available from extinct hominins. These data are providing new insights into population history; in particular, when combined with model-based analytical approaches, genome-wide data allow direct testing of hypotheses about population history. For example, genome-wide data from both contemporary populations and extinct hominins strongly support a single dispersal of modern humans from Africa, followed by two archaic admixture events: one with Neanderthals somewhere outside Africa and a second with Denisovans that (so far) has only been detected in New Guinea. These new developments promise to reveal new stories about human population history, without having to resort to storytelling.
Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Gary; Detter, John C; Bruce, David C

We present here the complete 2.4 Mb genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNA than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.
Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Gary; Detter, Chris; Bruce, David

We present here the complete 2.4 Mb genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNA than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.
Genomic Variation of Inbreeding and Ancestry in the Remaining Two Isle Royale Wolves.

PubMed

Hedrick, Philip W; Kardos, Marty; Peterson, Rolf O; Vucetich, John A

2017-03-01

Inbreeding, relatedness, and ancestry have traditionally been estimated with pedigree information; however, molecular genomic data can provide a more detailed examination of these properties. For example, pedigree information provides estimation of the expected value of these measures, but molecular genomic data can estimate the realized values of these measures in individuals. Here, we generate the theoretical distribution of inbreeding, relatedness, and ancestry for the individuals in the pedigree of the Isle Royale wolves, the first examination of such variation in a wild population with a known pedigree. We use the 38 autosomes of the dog genome and their estimated map lengths in our genomic analysis. Although it is known that the remaining wolves are highly inbred, closely related, and descend from only 3 ancestors, our analyses suggest that there is significant variation in the realized inbreeding and relatedness around pedigree expectations. For example, the expected inbreeding in a hypothetical offspring from the 2 remaining wolves is 0.438 but the realized 95% genomic confidence interval is from 0.311 to 0.565. For individual chromosomes, a substantial proportion of the whole chromosomes are completely identical by descent. This examination provides a background to use when analyzing molecular genomic data for individual levels of inbreeding, relatedness, and ancestry. The level of variation in these measures is a function of the time to the common ancestor(s), the number of chromosomes, and the rate of recombination. In the Isle Royale wolf population, the few generations to a common ancestor result in the high variance in genomic inbreeding. © The American Genetic Association 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Stomatal vs. genome size in angiosperms: the somatic tail wagging the genomic dog?

PubMed Central

Hodgson, J. G.; Sharafi, M.; Jalili, A.; Díaz, S.; Montserrat-Martí, G.; Palmer, C.; Cerabolini, B.; Pierce, S.; Hamzehee, B.; Asri, Y.; Jamzad, Z.; Wilson, P.; Raven, J. A.; Band, S. R.; Basconcelo, S.; Bogard, A.; Carter, G.; Charles, M.; Castro-Díez, P.; Cornelissen, J. H. C.; Funes, G.; Jones, G.; Khoshnevis, M.; Pérez-Harguindeguy, N.; Pérez-Rontomé, M. C.; Shirvany, F. A.; Vendramini, F.; Yazdani, S.; Abbas-Azimi, R.; Boustani, S.; Dehghan, M.; Guerrero-Campo, J.; Hynd, A.; Kowsary, E.; Kazemi-Saeed, F.; Siavash, B.; Villar-Salvador, P.; Craigie, R.; Naqinezhad, A.; Romo-Díez, A.; de Torres Espuny, L.; Simmons, E.

2010-01-01

Background and Aims: Genome size is a function, and the product, of cell volume. As such it is contingent on ecological circumstance. The nature of ‘this ecological circumstance’ is, however, hotly debated. Here, we investigate for angiosperms whether stomatal size may be this ‘missing link’: the primary determinant of genome size. Stomata are crucial for photosynthesis and their size affects functional efficiency. Methods: Stomatal and leaf characteristics were measured for 1442 species from Argentina, Iran, Spain and the UK and, using PCA, some emergent ecological and taxonomic patterns identified. Subsequently, an assessment of the relationship between genome-size values obtained from the Plant DNA C-values database and measurements of stomatal size was carried out. Key Results: Stomatal size is an ecologically important attribute. It varies with life-history (woody species < herbaceous species < vernal geophytes) and contributes to ecologically and physiologically important axes of leaf specialization. Moreover, it is positively correlated with genome size across a wide range of major taxa. Conclusions: Stomatal size predicts genome size within angiosperms. Correlation is not, however, proof of causality and here our interpretation is hampered by unexpected deficiencies in the scientific literature. Firstly, there are discrepancies between our own observations and established ideas about the ecological significance of stomatal size; very large stomata, theoretically facilitating photosynthesis in deep shade, were, in this study (and in other studies), primarily associated with vernal geophytes of unshaded habitats. Secondly, the lower size limit at which stomata can function efficiently, and the ecological circumstances under which these minute stomata might occur, have not been satisfactorily resolved. Thus, our hypothesis, that the optimization of stomatal size for functional efficiency is a major ecological determinant of genome size, remains unproven.
PMID:20375204
Reclassification of Paenibacillus riograndensis as a Genomovar of Paenibacillus sonchi: Genome-Based Metrics Improve Bacterial Taxonomic Classification

PubMed Central

Sant’Anna, Fernando H.; Ambrosini, Adriana; de Souza, Rocheli; de Carvalho Fernandes, Gabriela; Bach, Evelise; Balsanelli, Eduardo; Baura, Valter; Brito, Luciana F.; Wendisch, Volker F.; de Oliveira Pedrosa, Fábio; de Souza, Emanuel M.; Passaglia, Luciane M. P.

2017-01-01

Species from the genus Paenibacillus are widely studied due to their biotechnological relevance. Dozens of novel species descriptions of this genus were published in the last couple of years, but few utilized genomic data as classification criteria. Here, we demonstrate the importance of using genome-based metrics and phylogenetic analyses to identify and classify Paenibacillus strains. For this purpose, Paenibacillus riograndensis SBR5T, Paenibacillus sonchi X19-5T, and their close relatives were compared through phenotypic, genotypic, and genomic approaches. With respect to P. sonchi X19-5T, P. riograndensis SBR5T, Paenibacillus sp. CAR114, and Paenibacillus sp. CAS34 presented ANI (average nucleotide identity) values ranging from 95.61 to 96.32%, gANI (whole-genome average nucleotide identity) values ranging from 96.78 to 97.31%, and dDDH (digital DNA–DNA hybridization) values ranging from 68.2 to 73.2%. Phylogenetic analyses of 16S rRNA, gyrB, recA, recN, and rpoB genes and concatenated proteins supported the monophyletic origin of these Paenibacillus strains. Therefore, we propose to assign Paenibacillus sp. CAR114 and Paenibacillus sp. CAS34 to P. sonchi species, and reclassify P. riograndensis SBR5T as a later heterotypic synonym of P. sonchi (type strain X19-5T), with the creation of three novel genomovars, P. sonchi genomovar Sonchi (type strain X19-5T), P. sonchi genomovar Riograndensis (type strain SBR5T), P. sonchi genomovar Oryzarum (type strain CAS34T = DSM 102041T; = BR10511T). PMID:29046663
Precision genome engineering and agriculture: opportunities and regulatory challenges.

PubMed

Voytas, Daniel F; Gao, Caixia

2014-06-01

Plant agriculture is poised at a technological inflection point. Recent advances in genome engineering make it possible to precisely alter DNA sequences in living cells, providing unprecedented control over a plant's genetic material. Potential future crops derived through genome engineering include those that better withstand pests, that have enhanced nutritional value, and that are able to grow on marginal lands. In many instances, crops with such traits will be created by altering only a few nucleotides among the billions that comprise plant genomes. As such, and with the appropriate regulatory structures in place, crops created through genome engineering might prove to be more acceptable to the public than plants that carry foreign DNA in their genomes. Public perception and the performance of the engineered crop varieties will determine the extent to which this powerful technology contributes towards securing the world's food supply.
Chironomid midges (Diptera, Chironomidae) show extremely small genome sizes.

PubMed

Cornette, Richard; Gusev, Oleg; Nakahara, Yuichi; Shimura, Sachiko; Kikawada, Takahiro; Okuda, Takashi

2015-06-01

Chironomid midges (Diptera; Chironomidae) are found in various environments from the high Arctic to the Antarctic, including temperate and tropical regions. In many freshwater habitats, members of this family are among the most abundant invertebrates. In the present study, the genome sizes of 25 chironomid species were determined by flow cytometry and the resulting C-values ranged from 0.07 to 0.20 pg DNA (i.e. from about 68 to 195 Mbp). These genome sizes were uniformly very small and included, to our knowledge, the smallest genome sizes recorded to date among insects. A small proportion of transposable elements and short intron sizes were suggested to contribute to the reduction of genome sizes in chironomids. We discuss the possible developmental and physiological advantages of having a small genome size and the putative implications for the ecological success of the family Chironomidae.
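The conversion from C-values in picograms to megabase pairs quoted above can be reproduced with the widely used factor of about 978 Mbp per pg of DNA. The abstract does not state which factor the authors applied, so the constant and the function name below are assumptions.

```python
# Commonly used conversion: 1 pg of DNA corresponds to about 978 Mbp.
# Assumed here; the abstract does not state the factor the authors used.
MBP_PER_PG = 978.0


def c_value_to_mbp(c_value_pg):
    """Convert a haploid DNA content (C-value, in pg) to megabase pairs."""
    return c_value_pg * MBP_PER_PG


if __name__ == "__main__":
    for c_value in (0.07, 0.20):
        print(f"{c_value:.2f} pg is about {c_value_to_mbp(c_value):.1f} Mbp")
```

With this factor, 0.07 pg and 0.20 pg come out near 68 and 196 Mbp, in line with the rounded range given in the abstract.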
Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database

PubMed Central

Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M.; Brown, Eric W.; Timme, Ruth

2016-01-01

The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. PMID:27008877
Overview of the creative genome: effects of genome structure and sequence on the generation of variation and evolution.

PubMed

Caporale, Lynn Helena

2012-09-01

This overview of a special issue of Annals of the New York Academy of Sciences discusses uneven distribution of distinct types of variation across the genome, the dependence of specific types of variation upon distinct classes of DNA sequences and/or the induction of specific proteins, the circumstances in which distinct variation-generating systems are activated, and the implications of this work for our understanding of evolution and of cancer. Also discussed is the value of non text-based computational methods for analyzing information carried by DNA, early insights into organizational frameworks that affect genome behavior, and implications of this work for comparative genomics. © 2012 New York Academy of Sciences.
Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

PubMed

de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

2016-08-01

Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment-wide significance (corrected p<0.05), highly ranked gene-sets reaching suggestive significance included the dopamine receptor antagonists metoclopramide and trifluoperazine and the tyrosine kinase inhibitor neratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.
A genome-wide association meta-analysis identifies new childhood obesity loci

PubMed Central

Bradfield, Jonathan P.; Taal, H. Rob; Timpson, Nicholas J.; Scherag, André; Lecoeur, Cecile; Warrington, Nicole M.; Hypponen, Elina; Holst, Claus; Valcarcel, Beatriz; Thiering, Elisabeth; Salem, Rany M.; Schumacher, Fredrick R.; Cousminer, Diana L.; Sleiman, Patrick M.A.; Zhao, Jianhua; Berkowitz, Robert I.; Vimaleswaran, Karani S.; Jarick, Ivonne; Pennell, Craig E.; Evans, David M.; St. Pourcain, Beate; Berry, Diane J.; Mook-Kanamori, Dennis O; Hofman, Albert; Rivadeinera, Fernando; Uitterlinden, André G.; van Duijn, Cornelia M.; van der Valk, Ralf J.P.; de Jongste, Johan C.; Postma, Dirkje S.; Boomsma, Dorret I.; Gauderman, William J.; Hassanein, Mohamed T.; Lindgren, Cecilia M.; Mägi, Reedik; Boreham, Colin A.G.; Neville, Charlotte E.; Moreno, Luis A.; Elliott, Paul; Pouta, Anneli; Hartikainen, Anna-Liisa; Li, Mingyao; Raitakari, Olli; Lehtimäki, Terho; Eriksson, Johan G.; Palotie, Aarno; Dallongeville, Jean; Das, Shikta; Deloukas, Panos; McMahon, George; Ring, Susan M.; Kemp, John P.; Buxton, Jessica L.; Blakemore, Alexandra I.F.; Bustamante, Mariona; Guxens, Mònica; Hirschhorn, Joel N.; Gillman, Matthew W.; Kreiner-Møller, Eskil; Bisgaard, Hans; Gilliland, Frank D.; Heinrich, Joachim; Wheeler, Eleanor; Barroso, Inês; O'Rahilly, Stephen; Meirhaeghe, Aline; Sørensen, Thorkild I.A.; Power, Chris; Palmer, Lyle J.; Hinney, Anke; Widen, Elisabeth; Farooqi, I. Sadaf; McCarthy, Mark I.; Froguel, Philippe; Meyre, David; Hebebrand, Johannes; Jarvelin, Marjo-Riitta; Jaddoe, Vincent W.V.; Smith, George Davey; Hakonarson, Hakon; Grant, Struan F.A.

2012-01-01

Multiple genetic variants have been associated with adult obesity and a few with severe obesity in childhood; however, less progress has been made to establish genetic influences on common early-onset obesity. We performed a North American-Australian-European collaborative meta-analysis of fourteen studies consisting of 5,530 cases (≥95th percentile of body mass index (BMI)) and 8,318 controls (<50th percentile of BMI) of European ancestry. Taking forward the eight novel signals yielding association with P < 5×10^−6 into nine independent datasets (n = 2,818 cases and 4,083 controls) we observed two loci that yielded a genome-wide significant combined P-value, namely near OLFM4 on 13q14 (rs9568856; P=1.82×10^−9; OR=1.22) and within HOXB5 on 17q21 (rs9299; P=3.54×10^−9; OR=1.14). Both loci continued to show association when including two extreme childhood obesity cohorts (n = 2,214 cases and 2,674 controls). Finally, these two loci yielded directionally consistent associations in the GIANT meta-analysis of adult BMI. PMID:22484627
Engineering and Evolution of Saccharomyces cerevisiae to Produce Biofuels and Chemicals.

PubMed

Turner, Timothy L; Kim, Heejin; Kong, In Iok; Liu, Jing-Jing; Zhang, Guo-Chang; Jin, Yong-Su

To mitigate global climate change caused partly by the use of fossil fuels, the production of fuels and chemicals from renewable biomass has been attempted. The conversion of various sugars from renewable biomass into biofuels by engineered baker's yeast (Saccharomyces cerevisiae) is one major direction which has grown dramatically in recent years. As well as shifting away from fossil fuels, the production of commodity chemicals by engineered S. cerevisiae has also increased significantly. The traditional approaches of biochemical and metabolic engineering to develop economic bioconversion processes in laboratory and industrial settings have been accelerated by rapid advancements in the areas of yeast genomics, synthetic biology, and systems biology. Together, these innovations have resulted in rapid and efficient manipulation of S. cerevisiae to expand fermentable substrates and diversify value-added products. Here, we discuss recent and major advances in rational (relying on prior experimentally-derived knowledge) and combinatorial (relying on high-throughput screening and genomics) approaches to engineer S. cerevisiae for producing ethanol, butanol, 2,3-butanediol, fatty acid ethyl esters, isoprenoids, organic acids, rare sugars, antioxidants, and sugar alcohols from glucose, xylose, cellobiose, galactose, acetate, alginate, mannitol, arabinose, and lactose.
Genome scan of hybridizing sunflowers from Texas (Helianthus annuus and H. debilis) reveals asymmetric patterns of introgression and small islands of genomic differentiation.

PubMed

Scascitelli, M; Whitney, K D; Randell, R A; King, Matthew; Buerkle, C A; Rieseberg, L H

2010-02-01

Although the sexual transfer of genetic material between species (i.e. introgression) has been documented in many groups of plants and animals, genome-wide patterns of introgression are poorly understood. Is most of the genome permeable to interspecific gene flow, or is introgression typically restricted to a handful of genomic regions? Here, we assess the genomic extent and direction of introgression between three sunflowers from the south-central USA: the common sunflower, Helianthus annuus ssp. annuus; a near-endemic to Texas, Helianthus debilis ssp. cucumerifolius; and their putative hybrid derivative, thought to have recently colonized Texas, H. annuus ssp. texanus. Analyses of variation at 88 genetically mapped microsatellite loci revealed that long-term migration rates were high, genome-wide and asymmetric, with higher migration rates from H. annuus texanus into the two parental taxa than vice versa. These results imply a longer history of intermittent contact between H. debilis and H. annuus than previously believed, and that H. annuus texanus may serve as a bridge for the transfer of alleles between its parental taxa. They also contradict recent theory suggesting that introgression should predominantly be in the direction of the colonizing species. As in previous studies of hybridizing sunflower species, regions of genetic differentiation appear small, whether estimated in terms of FST or unidirectional migration rates. Estimates of recent immigration and admixture were inconsistent, depending on the type of analysis. At the individual locus level, one marker showed striking asymmetry in migration rates, a pattern consistent with tight linkage to a Bateson-Dobzhansky-Muller incompatibility.
Newborn Screening in the Era of Precision Medicine.

PubMed

Yang, Lan; Chen, Jiajia; Shen, Bairong

2017-01-01

As newborn screening success stories gained general confirmation during the past 50 years, scientists quickly discovered diagnostic tests for a host of genetic disorders that could be treated at birth. Outstanding progress in sequencing technologies over the last two decades has made it possible to comprehensively profile newborn screening (NBS) and identify clinically relevant genomic alterations. With the recent rapid developments in whole-genome sequencing (WGS) and whole-exome sequencing (WES), we can examine newborns at the genomic level and direct the appropriate diagnosis to the different individuals at the appropriate time, which is also encompassed in the concept of precision medicine. In addition, we can develop novel interventions directed at the molecular characteristics of genetic diseases in newborns. The implementation of genomics in NBS programs would provide an effective premise for the identification of the majority of genetic aberrations and primarily help in accurate guidance in treatment and better prediction. However, there is some debate about the widespread application of genome sequencing in NBS due to major concerns such as clinical analysis, result interpretation, storage of sequencing data, and communication of clinically relevant mutations to pediatricians and parents, along with the ethical, legal, and social implications (so-called ELSI). This review is focused on these critical issues and concerns about the expanding role of genomics in NBS for precision medicine. If WGS or WES is to be incorporated into NBS practice, these challenges should be carefully considered and properly tackled to meet the requirements of genome sequencing in the era of precision medicine.
Adaptation to Low Salinity Promotes Genomic Divergence in Atlantic Cod (Gadus morhua L.)

PubMed Central

Berg, Paul R.; Jentoft, Sissel; Star, Bastiaan; Ring, Kristoffer H.; Knutsen, Halvor; Lien, Sigbjørn; Jakobsen, Kjetill S.; André, Carl

2015-01-01

How genomic selection enables species to adapt to divergent environments is a fundamental question in ecology and evolution. We investigated the genomic signatures of local adaptation in Atlantic cod (Gadus morhua L.) along a natural salinity gradient, ranging from 35‰ in the North Sea to 7‰ within the Baltic Sea. By utilizing a 12 K SNPchip, we simultaneously assessed neutral and adaptive genetic divergence across the Atlantic cod genome. Combining outlier analyses with a landscape genomic approach, we identified a set of directionally selected loci that are strongly correlated with habitat differences in salinity, oxygen, and temperature. Our results show that discrete regions within the Atlantic cod genome are subject to directional selection and associated with adaptation to the local environmental conditions in the Baltic- and the North Sea, indicating divergence hitchhiking and the presence of genomic islands of divergence. We report a suite of outlier single nucleotide polymorphisms within or closely located to genes associated with osmoregulation, as well as genes known to play important roles in the hydration and development of oocytes. These genes are likely to have key functions within a general osmoregulatory framework and are important for the survival of eggs and larvae, contributing to the buildup of reproductive isolation between the low-salinity adapted Baltic cod and the adjacent cod populations. Hence, our data suggest that adaptive responses to the environmental conditions in the Baltic Sea may contribute to a strong and effective reproductive barrier, and that Baltic cod can be viewed as an example of ongoing speciation. PMID:25994933
Exact Solution of Mutator Model with Linear Fitness and Finite Genome Length

NASA Astrophysics Data System (ADS)

Saakian, David B.

2017-08-01

We considered the infinite population version of the mutator phenomenon in evolutionary dynamics, looking at the uni-directional mutations in the mutator-specific genes and linear selection. We solved exactly the model for the finite genome length case, looking at the quasispecies version of the phenomenon. We calculated the mutator probability both in the statics and dynamics. The exact solution is important for us because the mutator probability depends on the genome length in a highly non-trivial way.

Integration of Genomic, Biologic, and Chemical Approaches to Target p53 Loss and Gain-of-Function in Triple Negative Breast Cancer

DTIC Science & Technology

2016-09-01

in this progress report: p53 triple-negative breast cancer subtypes gene expression somatic cell genetics CRISPR/Cas 3. ACCOMPLISHMENTS Major...report, we described the creation of an isogenic p53 mutant TNBC cell line panel using CRISPR/Cas-mediated genome editing8 and the resultant...LOF null state. To validate that mutant p53 is directly responsible for this altered transcription, we will use the same CRISPR-mediated genome
Genomic control of patterning

PubMed Central

Peter, Isabelle S.; Davidson, Eric H.

2014-01-01

The development of multicellular organisms involves the partitioning of the organism into territories of cells of specific structure and function. The information for spatial patterning processes is directly encoded in the genome. The genome determines its own usage depending on stage and position, by means of interactions that constitute gene regulatory networks (GRNs). The GRN driving endomesoderm development in sea urchin embryos illustrates different regulatory strategies by which developmental programs are initiated, orchestrated, stabilized or excluded to define the pattern of specified territories in the developing embryo. PMID:19378258
Genome Improvement at JGI-HAGSC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Multiplexed genome engineering and genotyping methods applications for synthetic biology and metabolic engineering.

PubMed

Wang, Harris H; Church, George M

2011-01-01

Engineering at the scale of whole genomes requires fundamentally new molecular biology tools. Recent advances in recombineering using synthetic oligonucleotides enable the rapid generation of mutants at high efficiency and specificity and can be implemented at the genome scale. With these techniques, libraries of mutants can be generated, from which individuals with functionally useful phenotypes can be isolated. Furthermore, populations of cells can be evolved in situ by directed evolution using complex pools of oligonucleotides. Here, we discuss ways to utilize these multiplexed genome engineering methods, with special emphasis on experimental design and implementation. Copyright © 2011 Elsevier Inc. All rights reserved.
A HIGH COVERAGE GENOME SEQUENCE FROM AN ARCHAIC DENISOVAN INDIVIDUAL

PubMed Central

Meyer, Matthias; Kircher, Martin; Gansauge, Marie-Theres; Li, Heng; Racimo, Fernando; Mallick, Swapan; Schraiber, Joshua G.; Jay, Flora; Prüfer, Kay; de Filippo, Cesare; Sudmant, Peter H.; Alkan, Can; Fu, Qiaomei; Do, Ron; Rohland, Nadin; Tandon, Arti; Siebauer, Michael; Green, Richard E.; Bryc, Katarzyna; Briggs, Adrian W.; Stenzel, Udo; Dabney, Jesse; Shendure, Jay; Kitzman, Jacob; Hammer, Michael F.; Shunkov, Michael V.; Derevianko, Anatoli P.; Patterson, Nick; Andrés, Aida M.; Eichler, Evan E.; Slatkin, Montgomery; Reich, David; Kelso, Janet; Pääbo, Svante

2013-01-01

We present a DNA library preparation method that has allowed us to reconstruct a high coverage (30X) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of “missing evolution” in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans. PMID:22936568
Therapeutic Genome Editing: Prospects and Challenges

PubMed Central

Cox, David Benjamin Turitz; Platt, Randall Jeffrey; Zhang, Feng

2015-01-01

Recent advances in the development of genome editing technologies based on programmable nucleases have significantly improved our ability to make precise changes in the genomes of eukaryotic cells. Genome editing is already broadening our ability to elucidate the contribution of genetics to disease by facilitating the creation of more accurate cellular and animal models of pathological processes. A particularly tantalizing application of programmable nucleases is the potential to directly correct genetic mutations in affected tissues and cells to treat diseases that are refractory to traditional therapies. Here we discuss current progress towards developing programmable nuclease-based therapies as well as future prospects and challenges. PMID:25654603
Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

PubMed

Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

2017-08-01

Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10−10), MGC57346 (p value=6.92×10−7), BLK (p value=1.01×10−6), XKR6 (p value=1.11×10−6), C17ORF69 (p value=1.12×10−6) and KIAA1267 (p value=4.00×10−6). Gene set enrichment analysis observed a significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for studies of the genetic mechanism of neuroticism. Copyright © 2017. Published by Elsevier Inc.
On the molecular mechanism of GC content variation among eubacterial genomes.

PubMed

Wu, Hao; Zhang, Zhang; Hu, Songnian; Yu, Jun

2012-01-10

As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes. Previously, we classified eubacteria into dnaE-based groups (the dimeric combination of DNA polymerase III alpha subunits), according to a hypothesis where GC content variation is essentially governed by genome replication and DNA repair mechanisms. Further investigation led to the discovery that two major mutator genes, polC and dnaE2, may be responsible for genomic GC content variation. Consequently, an in-depth analysis was conducted to evaluate various potential intrinsic and extrinsic factors in association with GC content variation among eubacterial genomes. Mutator genes, especially those with dominant effects on the mutation spectra, are biased towards either GC or AT richness, and they alter genomic GC content in the two opposite directions. Increased bacterial genome size (or gene number) appears to rely on increased genomic GC content; however, it is unclear whether the changes are directly related to certain environmental pressures. Certain environmental and bacteriological features are related to GC content variation, but their trends are more obvious when analyzed under the dnaE-based grouping scheme. Most terrestrial, plant-associated, and nitrogen-fixing bacteria are members of the dnaE1|dnaE2 group, whereas most pathogenic or symbiotic bacteria in insects, and those dwelling in aquatic environments, are largely members of the dnaE1|polV group. Our studies provide several lines of evidence indicating that DNA polymerase III α subunit and its isoforms participating in either replication (such as polC) or SOS mutagenesis/translesion synthesis (such as dnaE2), play dominant roles in determining GC variability. 
Other environmental or bacteriological factors, such as genome size, temperature, oxygen requirement, and habitat, either play subsidiary roles or rely indirectly on different mutator genes to fine-tune the GC content. These results provide a comprehensive insight into mechanisms of GC content variation and the robustness of eubacterial genomes in adapting to their ever-changing environments over billions of years.
A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy

PubMed Central

Brenton, Zachary W.; Cooper, Elizabeth A.; Myers, Mathew T.; Boyles, Richard E.; Shakoor, Nadia; Zielinski, Kelsey J.; Rauh, Bradley L.; Bridges, William C.; Morris, Geoffrey P.; Kresovich, Stephen

2016-01-01

With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production. 
PMID:27356613
Genomic testing interacts with reproductive surplus in reducing genetic lag and increasing economic net return.

PubMed

Hjortø, L; Ettema, J F; Kargo, M; Sørensen, A C

2015-01-01

Until now, genomic information has mainly been used to improve the accuracy of genomic breeding values for breeding animals at a population level. However, we hypothesize that the use of information from genotyped females also opens up the possibility of reducing genetic lag in a dairy herd, especially if genomic tests are used in combination with sexed semen or a high management level for reproductive performance, because both factors provide the opportunity for generating a reproductive surplus in the herd. In this study, sexed semen is used in combination with beef semen to produce high-value crossbred beef calves. Thus, on average there is no surplus of and selection among replacement heifers whether to go into the herd or to be sold. In this situation, the selection opportunities arise when deciding which cows to inseminate with sexed semen, conventional semen, or beef semen. We tested the hypothesis by combining the results of 2 stochastic simulation programs, SimHerd and ADAM. SimHerd estimates the economic effect of different strategies for use of sexed semen and beef semen at 3 levels of reproductive performance in a dairy herd. Besides simulating the operational return, SimHerd also simulates the parity distribution of the dams of heifer calves. The ADAM program estimates genetic merit per year in a herd under different strategies for use of sexed semen and genomic tests. The annual net return per slot was calculated as the sum of operational return and value of genetic lag minus costs of genomic tests divided by the total number of slots. Our results showed that the use of genomic tests for decision making decreases genetic lag by as much as 0.14 genetic standard deviation units of the breeding goal and that genetic lag decreases even more (up to 0.30 genetic standard deviation units) when genomic tests are used in combination with strategies for increasing and using a reproductive surplus. Thus, our hypothesis was supported. 
We also observed that genomic tests are used most efficiently to decrease genetic lag when the genomic information is used more than once in the lifetime of an animal and when as many selection decisions as possible are based on genomic information. However, all breakeven prices were lower than or equal to €50, which is the current price of low-density chip genotyping in Denmark, Finland, and Sweden, so in the vast majority of cases, it is not profitable to genotype routinely for management purposes under the present price assumptions. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
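The net-return definition given in this abstract can be written out as a small function. This is a minimal sketch under one reading of the sentence, namely that the cost of genomic tests is subtracted from the sum of operational return and value of genetic lag before dividing by the number of slots. The function name, parameter names, and input figures are hypothetical and are not taken from SimHerd or ADAM.

```python
def annual_net_return_per_slot(operational_return: float,
                               genetic_lag_value: float,
                               genomic_test_costs: float,
                               n_slots: int) -> float:
    """(operational return + value of genetic lag - test costs) / slots.

    Assumed reading of the abstract's definition; all inputs are yearly
    herd totals in the same currency.
    """
    if n_slots <= 0:
        raise ValueError("n_slots must be positive")
    return (operational_return + genetic_lag_value - genomic_test_costs) / n_slots


# Hypothetical herd: the figures below are illustrative only.
example = annual_net_return_per_slot(
    operational_return=200_000.0,
    genetic_lag_value=5_000.0,
    genomic_test_costs=3_000.0,
    n_slots=100,
)
print(f"annual net return per slot: {example:.2f}")
```

Under this reading, genotyping pays off only when the change in the value of genetic lag exceeds the added test costs, which is the comparison behind the breakeven prices the abstract reports.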
Genomics of the hop pseudo-autosomal regions

USDA-ARS's Scientific Manuscript database

Hop is one of the few crop species with separate female and male plants, with sex determined by either XX or XY chromosomes. Hop cones are produced only by female plants, with or without fertilization. This has led to most genomic research being directed toward female plants. Very little work has been don...
Cracking the Genetic Code | NIH MedlinePlus the Magazine

MedlinePlus

... how do you approach that? Now, with sequencing technologies that allow you to sequence an entire genome for $10,000 in less than a week, you can really begin to see what's there. JEFFREY BROWN: But you've said that the Human Genome Project has not yet directly affected the health care ...
Characterization of the genomic organization of the region bordering the centromere of chromosome V of Podospora anserina by direct sequencing.

PubMed

Silar, Philippe; Barreau, Christian; Debuchy, Robert; Kicka, Sébastien; Turcq, Béatrice; Sainsard-Chanet, Annie; Sellem, Carole H; Billault, Alain; Cattolico, Laurence; Duprat, Simone; Weissenbach, Jean

2003-08-01

A Podospora anserina BAC library of 4800 clones has been constructed in the vector pBHYG allowing direct selection in fungi. Screening of the BAC collection for centromeric sequences of chromosome V allowed the recovery of clones localized on either sides of the centromere, but no BAC clone was found to contain the centromere. Seven BAC clones containing 322,195 and 156,244bp from either sides of the centromeric region were sequenced and annotated. One 5S rRNA gene, 5 tRNA genes, and 163 putative coding sequences (CDS) were identified. Among these, only six CDS seem specific to P. anserina. The gene density in the centromeric region is approximately one gene every 2.8kb. Extrapolation of this gene density to the whole genome of P. anserina suggests that the genome contains about 11,000 genes. Synteny analyses between P. anserina and Neurospora crassa show that co-linearity extends at the most to a few genes, suggesting rapid genome rearrangements between these two species.
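The gene-density figure in this abstract can be reproduced from the numbers it reports. The sketch below assumes that the density counts all annotated genes (the 5S rRNA gene, the 5 tRNA genes, and the 163 putative CDS); the abstract does not say which features were counted. The implied genome size at the end is derived from the abstract's own figures and is not a value the abstract reports.

```python
# Sequenced bases on either side of the centromere of chromosome V.
SEQUENCED_BP = 322_195 + 156_244

# Assumption: density counts rRNA + tRNA + putative CDS together.
ANNOTATED_GENES = 1 + 5 + 163

bp_per_gene = SEQUENCED_BP / ANNOTATED_GENES
print(f"{SEQUENCED_BP} bp / {ANNOTATED_GENES} genes = {bp_per_gene / 1000:.1f} kb per gene")

# The abstract extrapolates to about 11,000 genes genome-wide. At the same
# density, that gene count corresponds to the genome size computed here.
EXTRAPOLATED_GENES = 11_000
implied_genome_bp = EXTRAPOLATED_GENES * bp_per_gene
print(f"implied genome size: {implied_genome_bp / 1e6:.1f} Mb")
```

Counting only the 163 CDS would give a slightly lower density (closer to one gene every 2.9 kb), so the reported 2.8 kb figure is consistent with all 169 annotated genes being included.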
Genetic signatures of natural selection in a model invasive ascidian

NASA Astrophysics Data System (ADS)

Lin, Yaping; Chen, Yiyong; Yi, Changho; Fong, Jonathan J.; Kim, Won; Rius, Marc; Zhan, Aibin

2017-03-01

Invasive species represent promising models to study species’ responses to rapidly changing environments. Although local adaptation frequently occurs during contemporary range expansion, the associated genetic signatures at both population and genomic levels remain largely unknown. Here, we use genome-wide gene-associated microsatellites to investigate genetic signatures of natural selection in a model invasive ascidian, Ciona robusta. Population genetic analyses of 150 individuals sampled in Korea, New Zealand, South Africa and Spain showed significant genetic differentiation among populations. Based on outlier tests, we found high incidence of signatures of directional selection at 19 loci. Hitchhiking mapping analyses identified 12 directional selective sweep regions, and all selective sweep windows on chromosomes were narrow (~8.9 kb). Further analyses identified 132 candidate genes under selection. When we compared our genetic data and six crucial environmental variables, 16 putatively selected loci showed significant correlation with these environmental variables. This suggests that the local environmental conditions have left significant signatures of selection at both population and genomic levels. Finally, we identified “plastic” genomic regions and genes that are promising regions to investigate evolutionary responses to rapid environmental change in C. robusta.
Methods and Applications of CRISPR-Mediated Base Editing in Eukaryotic Genomes.

PubMed

Hess, Gaelen T; Tycko, Josh; Yao, David; Bassik, Michael C

2017-10-05

The past several years have seen an explosion in development of applications for the CRISPR-Cas9 system, from efficient genome editing, to high-throughput screening, to recruitment of a range of DNA and chromatin-modifying enzymes. While homology-directed repair (HDR) coupled with Cas9 nuclease cleavage has been used with great success to repair and re-write genomes, recently developed base-editing systems present a useful orthogonal strategy to engineer nucleotide substitutions. Base editing relies on recruitment of cytidine deaminases to introduce changes (rather than double-stranded breaks and donor templates) and offers potential improvements in efficiency while limiting damage and simplifying the delivery of editing machinery. At the same time, these systems enable novel mutagenesis strategies to introduce sequence diversity for engineering and discovery. Here, we review the different base-editing platforms, including their deaminase recruitment strategies and editing outcomes, and compare them to other CRISPR genome-editing technologies. Additionally, we discuss how these systems have been applied in therapeutic, engineering, and research settings. Lastly, we explore future directions of this emerging technology. Copyright © 2017 Elsevier Inc. All rights reserved.
Data compression and genomes: a two-dimensional life domain map.

PubMed

Menconi, Giulia; Benci, Vieri; Buiatti, Marcello

2008-07-21

We define the complexity of DNA sequences as the information content per nucleotide, calculated by means of some Lempel-Ziv data compression algorithm. It is possible to use the statistics of the complexity values of the functional regions of different complete genomes to distinguish among genomes of different domains of life (Archaea, Bacteria and Eukarya). We shall focus on the distribution function of the complexity of non-coding regions. We show that the three domains may be plotted in separate regions within the two-dimensional space where the axes are the skewness coefficient and the kurtosis coefficient of the aforementioned distribution. Preliminary results on 15 genomes are introduced.
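The idea of a compression-based complexity and a (skewness, kurtosis) coordinate can be sketched as follows. This is an illustration only: the abstract does not name the Lempel-Ziv variant used, so zlib's DEFLATE (LZ77 plus Huffman coding) stands in for it here, and the toy sequences are randomly generated rather than real non-coding regions. All function names are hypothetical.

```python
import random
import zlib
from statistics import mean, pstdev


def complexity_per_nucleotide(seq: str) -> float:
    """Compressed size in bits divided by sequence length.

    Assumption: zlib is a stand-in for the unspecified Lempel-Ziv
    compressor; its fixed header inflates values for short sequences.
    """
    compressed = zlib.compress(seq.encode("ascii"), 9)
    return 8 * len(compressed) / len(seq)


def skewness(values):
    """Third standardized moment (population form)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def kurtosis(values):
    """Fourth standardized moment (population form, not excess)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values)


# Toy "regions": five random sequences plus two highly repetitive ones.
rng = random.Random(0)
regions = ["".join(rng.choice("ACGT") for _ in range(2000)) for _ in range(5)]
regions += ["ACGT" * 500, "A" * 2000]

scores = [complexity_per_nucleotide(r) for r in regions]
point = (skewness(scores), kurtosis(scores))  # one genome = one 2-D point
print(point)
```

In the setting the abstract describes, each genome would contribute one such point computed over its own non-coding regions, and the three domains would be compared by where their points fall in that plane.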
The complete chloroplast genome sequence of the medicinal plant Andrographis paniculata.

PubMed

Ding, Ping; Shao, Yanhua; Li, Qian; Gao, Junli; Zhang, Runjing; Lai, Xiaoping; Wang, Deqin; Zhang, Huiye

2016-07-01

The complete chloroplast genome of Andrographis paniculata, an important medicinal plant with great economic value, has been studied in this article. The genome is 150,249 bp in length, with 38.3% GC content. A pair of inverted repeats (IRs, 25,300 bp) are separated by a large single-copy region (LSC, 82,459 bp) and a small single-copy region (SSC, 17,190 bp). The chloroplast genome contains 114 unique genes: 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Among these genes, 15 contained 1 intron and 3 contained 2 introns.
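The region sizes in this abstract can be checked against the reported total, since a quadripartite chloroplast genome is the sum of the LSC, the SSC, and two copies of the IR. The short check below uses only the figures quoted in the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as reported for Andrographis paniculata.
LSC, SSC, IR = 82_459, 17_190, 25_300

# Quadripartite structure: LSC + SSC + two inverted-repeat copies.
total_bp = LSC + SSC + 2 * IR
print(total_bp)

# Gene categories reported in the abstract.
gene_total = 80 + 30 + 4  # protein-coding + tRNA + rRNA
print(gene_total)
```

Both sums agree with the abstract's stated genome length and unique gene count.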
The complete chloroplast genome sequence of Dendrobium nobile.

PubMed

Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu

2016-11-01

The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered species used in traditional Chinese medicine and of important economic value, is presented in this article. The total genome size is 150,793 bp, containing a large single-copy (LSC) region (84,939 bp) and a small single-copy (SSC) region (13,310 bp), which are separated by two inverted repeat (IR) regions (26,272 bp each). The overall GC content of the plastid genome was 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.
Multimedia presentations on the human genome: Implementation and assessment of a teaching program for the introduction to genome science using a poster and animations.

PubMed

Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto

2008-11-01

Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from Mendel's peas not only because many students do not understand that plants are organisms, but also because human biology contains important social and health issues. Therefore, we have developed a teaching program for the introduction to genome science, whose subjects are focused on the human genome. This program comprises mixed multimedia presentations: a large poster with illustrations and text on the human genome (a human genome map for every home), and animations on the basics of genome science. We implemented and assessed this program at four high schools. Our results indicate that students felt that they learned about the human genome from the program and some increases in students' understanding were observed with longer exposure to the mixed multimedia presentations. Copyright © 2008 International Union of Biochemistry and Molecular Biology, Inc.
Cell-Penetrating Peptide-Mediated Delivery of Cas9 Protein and Guide RNA for Genome Editing.

PubMed

Suresh, Bharathi; Ramakrishna, Suresh; Kim, Hyongbum

2017-01-01

The clustered, regularly interspaced, short palindromic repeat (CRISPR)-associated (Cas) system represents an efficient tool for genome editing. It consists of two components: the Cas9 protein and a guide RNA. To date, delivery of these two components has been achieved using either plasmid or viral vectors or direct delivery of protein and RNA. Plasmid- and virus-free direct delivery of Cas9 protein and guide RNA has several advantages over the conventional plasmid-mediated approach. Direct delivery results in shorter exposure time at the cellular level, which in turn leads to lower toxicity and fewer off-target mutations with reduced host immune responses, whereas plasmid- or viral vector-mediated delivery can result in uncontrolled integration of the vector sequence into the host genome and unwanted immune responses. Cell-penetrating peptide (CPP), a peptide that has an intrinsic ability to translocate across cell membranes, has been adopted as a means of achieving efficient Cas9 protein and guide RNA delivery. We developed a method for treating human cell lines with CPP-conjugated recombinant Cas9 protein and CPP-complexed guide RNAs that leads to endogenous gene disruption. Here we describe a protocol for preparing an efficient CPP-conjugated recombinant Cas9 protein and CPP-complexed guide RNAs, as well as treatment methods to achieve safe genome editing in human cell lines.
