Sample records for implementing genomic selection

  1. Goals and hurdles for a successful implementation of genomic selection in breeding programme for selected annual and perennial crops.

    PubMed

    Jonas, Elisabeth; de Koning, Dirk-Jan

    Genomic Selection is an important topic in quantitative genetics and breeding. Not only does it allow the full use of current molecular genetic technologies, it stimulates also the development of new methods and models. Genomic selection, if fully implemented in commercial farming, should have a major impact on the productivity of various agricultural systems. But suggested approaches need to be applicable in commercial breeding populations. Many of the published research studies focus on methodologies. We conclude from the reviewed publications, that a stronger focus on strategies for the implementation of genomic selection in advanced breeding lines, introduction of new varieties, hybrids or multi-line crosses is needed. Efforts to find solutions for a better prediction and integration of environmental influences need to continue within applied breeding schemes. Goals of the implementation of genomic selection into crop breeding should be carefully defined and crop breeders in the private sector will play a substantial part in the decision-making process. However, the lack of published results from studies within, or in collaboration with, private companies diminishes the knowledge on the status of genomic selection within applied breeding programmes. Studies on the implementation of genomic selection in plant breeding need to evaluate models and methods with an enhanced emphasis on population-specific requirements and production environments. Adaptation of methods to breeding schemes or changes to breeding programmes for a better integration of genomic selection strategies are needed across species. More openness with a continuous exchange will contribute to successes.

  2. Advances and Challenges in Genomic Selection for Disease Resistance.

    PubMed

    Poland, Jesse; Rutkoski, Jessica

    2016-08-04

    Breeding for disease resistance is a central focus of plant breeding programs, as any successful variety must have the complete package of high yield, disease resistance, agronomic performance, and end-use quality. With the need to accelerate the development of improved varieties, genomics-assisted breeding is becoming an important tool in breeding programs. With marker-assisted selection, there has been success in breeding for disease resistance; however, much of this work and research has focused on identifying, mapping, and selecting for major resistance genes that tend to be highly effective but vulnerable to breakdown with rapid changes in pathogen races. In contrast, breeding for minor-gene quantitative resistance tends to produce more durable varieties but is a more challenging breeding objective. As the genetic architecture of resistance shifts from single major R genes to a diffused architecture of many minor genes, the best approach for molecular breeding will shift from marker-assisted selection to genomic selection. Genomics-assisted breeding for quantitative resistance will therefore necessitate whole-genome prediction models and selection methodology as implemented for classical complex traits such as yield. Here, we examine multiple case studies testing whole-genome prediction models and genomic selection for disease resistance. In general, whole-genome models for disease resistance can produce prediction accuracy suitable for application in breeding. These models also largely outperform multiple linear regression as would be applied in marker-assisted selection. With the implementation of genomic selection for yield and other agronomic traits, whole-genome marker profiles will be available for the entire set of breeding lines, enabling genomic selection for disease at no additional direct cost. 
    In this context, the scope of implementing genomic selection for disease resistance, and specifically for quantitative resistance and quarantined pathogens, becomes a tractable and powerful approach in breeding programs.

  3. Whole-genome regression and prediction methods applied to plant and animal breeding.

    PubMed

    de los Campos, Gustavo; Hickey, John M; Pong-Wong, Ricardo; Daetwyler, Hans D; Calus, Mario P L

    2013-02-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade.
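    The "large-p with small-n" regression mentioned in the abstract above can be illustrated with ridge regression, one of the simplest parametric WGR models. The sketch below is not taken from the article: the function names, the simulated 0/1/2 marker matrix, and the penalty value lam are all assumptions chosen for illustration. It solves the n x n dual system rather than the p x p system, which is the cheaper route when markers far outnumber individuals.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects for an (n x p) marker matrix with p >> n.

    Returns the phenotype mean, the marker column means and the effect vector.
    """
    col_means = X.mean(axis=0)
    Xc = X - col_means                 # centre each marker
    mu = y.mean()
    n = Xc.shape[0]
    # Dual form: (Xc Xc' + lam I) alpha = y - mu, then beta = Xc' alpha.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), y - mu)
    beta = Xc.T @ alpha
    return mu, col_means, beta


def predict(X_new, mu, col_means, beta):
    """Predicted phenotypes for new genotypes."""
    return mu + (X_new - col_means) @ beta


# Simulated data (hypothetical): 100 individuals, 2000 markers coded 0/1/2.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_beta = np.zeros(p)
true_beta[:20] = rng.normal(size=20)   # only 20 markers have an effect
y = X @ true_beta + rng.normal(size=n)

# Train on the first 80 individuals, predict the remaining 20.
mu, col_means, beta = ridge_marker_effects(X[:80], y[:80], lam=100.0)
y_hat = predict(X[80:], mu, col_means, beta)
accuracy = np.corrcoef(y_hat, y[80:])[0, 1]
print(f"correlation between predicted and observed: {accuracy:.2f}")
```

    In practice the penalty would be tuned or derived from variance components rather than fixed by hand as it is here.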

  4. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding

    PubMed Central

    de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.

    2013-01-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228

  5. Efficient use of historical data for genomic selection: a case study of rust resistance in wheat

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) is a new methodology that can improve wheat breeding efficiency. To implement GS, a training population (TP) with both phenotypic and genotypic data is required to train a statistical model used to predict genotyped selection candidates (SCs). Several factors impact prediction...

  6. Prospects for genomic selection in cassava breeding

    USDA-ARS's Scientific Manuscript database

    Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa in order to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on p...

  7. The current state of implementation science in genomic medicine: opportunities for improvement.

    PubMed

    Roberts, Megan C; Kennedy, Amy E; Chambers, David A; Khoury, Muin J

    2017-08-01

    The objective of this study was to identify trends and gaps in the field of implementation science in genomic medicine. We conducted a literature review using the Centers for Disease Control and Prevention's Public Health Genomics Knowledge Base to examine the current literature in the field of implementation science in genomic medicine. We selected original research articles based on specific inclusion criteria and then abstracted information about study design, genomic medicine, and implementation outcomes. Data were aggregated, and trends and gaps in the literature were discussed. Our final review encompassed 283 articles published in 2014, the majority of which described uptake (35.7%, n = 101) and preferences (36.4%, n = 103) regarding genomic technologies, particularly oncology (35%, n = 99). Key study design elements, such as racial/ethnic composition of study populations, were underreported in studies. Few studies incorporated implementation science theoretical frameworks, sustainability measures, or capacity building. Although genomic discovery provides the potential for population health benefit, the current knowledge base around implementation to turn this promise into a reality is severely limited. Current gaps in the literature demonstrate a need to apply implementation science principles to genomic medicine in order to deliver on the promise of precision medicine. Genet Med advance online publication 12 January 2017.

  8. Genome-wide selection components analysis in a fish with male pregnancy.

    PubMed

    Flanagan, Sarah P; Jones, Adam G

    2017-04-01

    A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An F_ST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the F_ST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
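    To make the idea of an F_ST comparison of allele frequencies concrete, here is a minimal sketch that computes a per-locus F_ST between two groups as (H_T - H_S) / H_T. It is a textbook two-group estimator with equal group weights, not the authors' pipeline; the function name and the example frequencies are hypothetical, and no correction for sample size is applied.

```python
import numpy as np


def fst_two_groups(p1, p2):
    """Per-locus F_ST = (H_T - H_S) / H_T for two equally weighted groups.

    p1, p2: allele frequencies of the same allele at each locus in each group.
    Loci that are monomorphic overall (H_T == 0) are assigned F_ST = 0.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                        # pooled heterozygosity
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-group
    fst = np.zeros_like(h_t)
    np.divide(h_t - h_s, h_t, out=fst, where=h_t > 0)
    return fst


# Hypothetical frequencies in two groups (e.g. mated vs. unmated males) at four loci.
group_a = [0.5, 1.0, 0.2, 0.0]
group_b = [0.5, 0.0, 0.8, 0.0]
print(fst_two_groups(group_a, group_b))
```

    Loci with unusually high values relative to the genome-wide distribution would be the candidates for selection between the two groups.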

  9. swga: a primer design toolkit for selective whole genome amplification.

    PubMed

    Clarke, Erik L; Sundararaman, Sesh A; Seifert, Stephanie N; Bushman, Frederic D; Hahn, Beatrice H; Brisson, Dustin

    2017-07-15

    Population genomic analyses are often hindered by difficulties in obtaining sufficient numbers of genomes for analysis by DNA sequencing. Selective whole-genome amplification (SWGA) provides an efficient approach to amplify microbial genomes from complex backgrounds for sequence acquisition. However, the process of designing sets of primers for this method has many degrees of freedom and would benefit from an automated process to evaluate the vast number of potential primer sets. Here, we present swga, a program that identifies primer sets for SWGA and evaluates them for efficiency and selectivity. We used swga to design and test primer sets for the selective amplification of Wolbachia pipientis genomic DNA from infected Drosophila melanogaster and Mycobacterium tuberculosis from human blood. We identify primer sets that successfully amplify each against their backgrounds and describe a general method for using swga for arbitrary targets. In addition, we describe characteristics of primer sets that correlate with successful amplification, and present guidelines for implementation of SWGA to detect new targets. Source code and documentation are freely available on https://www.github.com/eclarke/swga. The program is implemented in Python and C and licensed under the GNU Public License. Contact: ecl@mail.med.upenn.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  10. Does genomic selection have a future in plant breeding?

    PubMed

    Jonas, Elisabeth; de Koning, Dirk-Jan

    2013-09-01

    Plant breeding largely depends on phenotypic selection in plots and only for some, often disease-resistance-related traits, uses genetic markers. The more recently developed concept of genomic selection, using a black box approach with no need of prior knowledge about the effect or function of individual markers, has also been proposed as a great opportunity for plant breeding. Several empirical and theoretical studies have focused on the possibility to implement this as a novel molecular method across various species. Although we do not question the potential of genomic selection in general, in this Opinion, we emphasize that genomic selection approaches from dairy cattle breeding cannot be easily applied to complex plant breeding. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Genome-wide regression and prediction with the BGLR statistical package.

    PubMed

    Pérez, Paulino; de los Campos, Gustavo

    2014-10-01

    Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.

  12. Mitigation of inbreeding while preserving genetic gain in genomic breeding programs for outbred plants.

    PubMed

    Lin, Zibei; Shi, Fan; Hayes, Ben J; Daetwyler, Hans D

    2017-05-01

    Heuristic genomic inbreeding controls reduce inbreeding in genomic breeding schemes without reducing genetic gain. Genomic selection is increasingly being implemented in plant breeding programs to accelerate genetic gain of economically important traits. However, it may cause significant loss of genetic diversity when compared with traditional schemes using phenotypic selection. We propose heuristic strategies to control the rate of inbreeding in outbred plants, which can be categorised into three types: controls during mate allocation, during selection, and simultaneous selection and mate allocation. The proposed mate allocation measure GminF allocates two or more parents for mating in mating groups that minimise coancestry using a genomic relationship matrix. Two types of relationship-adjusted genomic breeding values for parent selection candidates ([Formula: see text]) and potential offspring ([Formula: see text]) are devised to control inbreeding during selection and even enabling simultaneous selection and mate allocation. These strategies were tested in a case study using a simulated perennial ryegrass breeding scheme. As compared to the genomic selection scheme without controls, all proposed strategies could significantly decrease inbreeding while achieving comparable genetic gain. In particular, the scenario using [Formula: see text] in simultaneous selection and mate allocation reduced inbreeding to one-third of the original genomic selection scheme. The proposed strategies are readily applicable in any outbred plant breeding program.
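    The abstract above describes allocating mates so as to minimise coancestry computed from a genomic relationship matrix. The sketch below illustrates that general idea only and is not the GminF measure itself: it builds a relationship matrix with the widely used VanRaden (method 1) formula and then pairs parents greedily, least-related pair first. The function names and the greedy rule are assumptions made for illustration.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix G = ZZ' / (2 * sum p(1-p)), markers coded 0/1/2."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0            # allele frequency per marker
    Z = M - 2.0 * p                     # centre by twice the allele frequency
    denom = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / denom


def greedy_min_relationship_pairs(G):
    """Greedy heuristic (an assumption, not GminF): pair the least-related
    individuals first, using each individual at most once."""
    n = G.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(G[rows, cols], kind="stable")
    used, pairs = set(), []
    for k in order:
        i, j = int(rows[k]), int(cols[k])
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs


# Hypothetical genotypes: 6 candidate parents, 500 markers.
rng = np.random.default_rng(1)
M = rng.integers(0, 3, size=(6, 500))
G = vanraden_grm(M)
print(greedy_min_relationship_pairs(G))
```

    A greedy pass is not guaranteed to minimise the total relationship across all pairs; an exact solution would treat this as a minimum-weight matching problem.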

  13. Use of qualitative environmental and phenotypic variables in the context of allele distribution models: detecting signatures of selection in the genome of Lake Victoria cichlids.

    PubMed

    Joost, Stéphane; Kalbermatten, Michael; Bezault, Etienne; Seehausen, Ole

    2012-01-01

    When searching for loci possibly under selection in the genome, an alternative to population genetics theoretical models is to establish allele distribution models (ADM) for each locus to directly correlate allelic frequencies and environmental variables such as precipitation, temperature, or sun radiation. Such an approach implementing multiple logistic regression models in parallel was implemented within a computing program named MATSAM. Recently, this application was improved in order to support qualitative environmental predictors as well as to permit the identification of associations between genomic variation and individual phenotypes, allowing the detection of loci involved in the genetic architecture of polymorphic characters. Here, we present the corresponding methodological developments and compare the results produced by software implementing population genetics theoretical models (DFDIST and BAYESCAN) and ADM (MATSAM) in an empirical context to detect signatures of genomic divergence associated with speciation in Lake Victoria cichlid fishes.

  14. Exploitation of data from breeding programs supports rapid implementation of genomic selection for key agronomic traits in perennial ryegrass.

    PubMed

    Pembleton, Luke W; Inch, Courtney; Baillie, Rebecca C; Drayton, Michelle C; Thakur, Preeti; Ogaji, Yvonne O; Spangenberg, German C; Forster, John W; Daetwyler, Hans D; Cogan, Noel O I

    2018-06-02

    Exploitation of data from a ryegrass breeding program has enabled rapid development and implementation of genomic selection for sward-based biomass yield with a twofold-to-threefold increase in genetic gain. Genomic selection, which uses genome-wide sequence polymorphism data and quantitative genetics techniques to predict plant performance, has large potential for the improvement in pasture plants. Major factors influencing the accuracy of genomic selection include the size of reference populations, trait heritability values and the genetic diversity of breeding populations. Global diversity of the important forage species perennial ryegrass is high and so would require a large reference population in order to achieve moderate accuracies of genomic selection. However, diversity of germplasm within a breeding program is likely to be lower. In addition, de novo construction and characterisation of reference populations are a logistically complex process. Consequently, historical phenotypic records for seasonal biomass yield and heading date over an 18-year period within a commercial perennial ryegrass breeding program have been accessed, and target populations have been characterised with a high-density transcriptome-based genotyping-by-sequencing assay. Ability to predict observed phenotypic performance in each successive year was assessed by using all synthetic populations from previous years as a reference population. Moderate and high accuracies were achieved for the two traits, respectively, consistent with broad-sense heritability values. The present study represents the first demonstration and validation of genomic selection for seasonal biomass yield within a diverse commercial breeding program across multiple years. These results, supported by previous simulation studies, demonstrate the ability to predict sward-based phenotypic performance early in the process of individual plant selection, so shortening the breeding cycle, increasing the rate of genetic gain and allowing rapid adoption in ryegrass improvement programs.

  15. Genomic assisted selection for enhancing line breeding: merging genomic and phenotypic selection in winter wheat breeding programs with preliminary yield trials.

    PubMed

    Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Akgöl, Batuhan; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann

    2017-02-01

    Early generation genomic selection is superior to conventional phenotypic selection in line breeding and can be strongly improved by including additional information from preliminary yield trials. The selection of lines that enter resource-demanding multi-environment trials is a crucial decision in every line breeding program as a large amount of resources are allocated for thoroughly testing these potential varietal candidates. We compared conventional phenotypic selection with various genomic selection approaches across multiple years as well as the merit of integrating phenotypic information from preliminary yield trials into the genomic selection framework. The prediction accuracy using only phenotypic data was rather low (r = 0.21) for grain yield but could be improved by modeling genetic relationships in unreplicated preliminary yield trials (r = 0.33). Genomic selection models were nevertheless found to be superior to conventional phenotypic selection for predicting grain yield performance of lines across years (r = 0.39). We subsequently simplified the problem of predicting untested lines in untested years to predicting tested lines in untested years by combining breeding values from preliminary yield trials and predictions from genomic selection models by a heritability index. This genomic assisted selection led to a 20% increase in prediction accuracy, which could be further enhanced by an appropriate marker selection for both grain yield (r = 0.48) and protein content (r = 0.63). The easy to implement and robust genomic assisted selection gave thus a higher prediction accuracy than either conventional phenotypic or genomic selection alone. The proposed method took the complex inheritance of both low and high heritable traits into account and appears capable to support breeders in their selection decisions to develop enhanced varieties more efficiently.

  16. Non-additive Effects in Genomic Selection

    PubMed Central

    Varona, Luis; Legarra, Andres; Toro, Miguel A.; Vitezica, Zulma G.

    2018-01-01

    In the last decade, genomic selection has become a standard in the genetic evaluation of livestock populations. However, most procedures for the implementation of genomic selection only consider the additive effects associated with SNP (Single Nucleotide Polymorphism) markers used to calculate the prediction of the breeding values of candidates for selection. Nevertheless, the availability of estimates of non-additive effects is of interest because: (i) they contribute to an increase in the accuracy of the prediction of breeding values and the genetic response; (ii) they allow the definition of mate allocation procedures between candidates for selection; and (iii) they can be used to enhance non-additive genetic variation through the definition of appropriate crossbreeding or purebred breeding schemes. This study presents a review of methods for the incorporation of non-additive genetic effects into genomic selection procedures and their potential applications in the prediction of future performance, mate allocation, crossbreeding, and purebred selection. The work concludes with a brief outline of some ideas for future lines of research that may help the standard inclusion of non-additive effects in genomic selection. PMID:29559995
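    As a small illustration of what adding non-additive effects means at the marker level, the sketch below builds additive and dominance covariates from SNP genotypes coded 0/1/2. The heterozygote-indicator coding shown is one classical choice and is an assumption here; the review covers several parameterisations, and the function name is hypothetical.

```python
import numpy as np


def additive_dominance_covariates(M):
    """Split 0/1/2 genotype codes into additive and dominance covariates.

    Additive covariate: allele count (0, 1, 2).
    Dominance covariate: 1 for heterozygotes, 0 for either homozygote.
    """
    M = np.asarray(M)
    A = M.astype(float)
    D = (M == 1).astype(float)
    return A, D


# Two individuals, three SNPs.
M = np.array([[0, 1, 2],
              [1, 1, 0]])
A, D = additive_dominance_covariates(M)
W = np.hstack([A, D])   # combined design matrix: additive columns, then dominance
print(D)                # [[0. 1. 0.]
                        #  [1. 1. 0.]]
print(W.shape)          # (2, 6)
```

    The combined matrix W could then be passed to any whole-genome regression model so that each marker receives both an additive and a dominance effect.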

  17. Non-additive Effects in Genomic Selection.

    PubMed

    Varona, Luis; Legarra, Andres; Toro, Miguel A; Vitezica, Zulma G

    2018-01-01

    In the last decade, genomic selection has become a standard in the genetic evaluation of livestock populations. However, most procedures for the implementation of genomic selection only consider the additive effects associated with SNP (Single Nucleotide Polymorphism) markers used to calculate the prediction of the breeding values of candidates for selection. Nevertheless, the availability of estimates of non-additive effects is of interest because: (i) they contribute to an increase in the accuracy of the prediction of breeding values and the genetic response; (ii) they allow the definition of mate allocation procedures between candidates for selection; and (iii) they can be used to enhance non-additive genetic variation through the definition of appropriate crossbreeding or purebred breeding schemes. This study presents a review of methods for the incorporation of non-additive genetic effects into genomic selection procedures and their potential applications in the prediction of future performance, mate allocation, crossbreeding, and purebred selection. The work concludes with a brief outline of some ideas for future lines of research that may help the standard inclusion of non-additive effects in genomic selection.

  18. Assessing Predictive Properties of Genome-Wide Selection in Soybeans

    PubMed Central

    Xavier, Alencar; Muir, William M.; Rainey, Katy Martin

    2016-01-01

    Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max (L.) Merr.). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set. PMID:27317786
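    The cross-validation design described above, in which prediction accuracy is measured as training population size changes, can be sketched as follows. This is a simplified stand-in, not the study's procedure: it uses repeated random hold-out splits rather than the authors' scheme, ridge regression rather than RKHS or BayesB, and simulated data. All names, sizes and the penalty lam are assumptions.

```python
import numpy as np


def ridge_predict(X_tr, y_tr, X_te, lam):
    """Fit ridge regression on the training set and predict the test set."""
    mu, means = y_tr.mean(), X_tr.mean(axis=0)
    Xc = X_tr - means
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y_tr)), y_tr - mu)
    return mu + (X_te - means) @ (Xc.T @ alpha)


def accuracy_by_training_size(X, y, sizes, test_size, lam=1.0, n_rep=10, seed=0):
    """Mean predicted-vs-observed correlation for each training set size,
    averaged over n_rep random splits with a held-out test set of fixed size."""
    if max(sizes) + test_size > len(y):
        raise ValueError("training size plus test size exceeds the sample size")
    rng = np.random.default_rng(seed)
    result = {}
    for size in sizes:
        accs = []
        for _ in range(n_rep):
            perm = rng.permutation(len(y))
            test, train = perm[:test_size], perm[test_size:test_size + size]
            y_hat = ridge_predict(X[train], y[train], X[test], lam)
            accs.append(np.corrcoef(y_hat, y[test])[0, 1])
        result[size] = float(np.mean(accs))
    return result


# Simulated panel (hypothetical): 300 lines, 500 markers, 30 causal markers.
rng = np.random.default_rng(2)
X = rng.integers(0, 3, size=(300, 500)).astype(float)
effects = np.zeros(500)
effects[:30] = rng.normal(size=30)
y = X @ effects + rng.normal(scale=3.0, size=300)

acc = accuracy_by_training_size(X, y, sizes=(50, 100, 200), test_size=50, lam=50.0)
for size, r in acc.items():
    print(size, round(r, 2))
```

    Holding the test set size fixed keeps the accuracies comparable across training sizes, so any trend reflects the amount of training data rather than the size of the validation sample.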

  19. Genomic selection needs to be carefully assessed to meet specific requirements in livestock breeding programs

    PubMed Central

    Jonas, Elisabeth; de Koning, Dirk-Jan

    2015-01-01

    Genomic selection is a promising development in agriculture, aiming improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new markers-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection (GS) in breeding programs is highly desirable not only for the common good, but also the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating GS into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken, and fish. It outlines tasks to help understanding possible consequences when applying genomic information in breeding scenarios. 
    PMID:25750652

  20. Genomic selection needs to be carefully assessed to meet specific requirements in livestock breeding programs.

    PubMed

    Jonas, Elisabeth; de Koning, Dirk-Jan

    2015-01-01

    Genomic selection is a promising development in agriculture, aiming improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new markers-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection (GS) in breeding programs is highly desirable not only for the common good, but also the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating GS into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken, and fish. It outlines tasks to help understanding possible consequences when applying genomic information in breeding scenarios.

  1. Dairy cattle genomics evaluation program update

    USDA-ARS's Scientific Manuscript database

    Implementation of genomic evaluation has caused profound changes in dairy cattle breeding. All young bulls bought by major artificial-insemination organizations now are selected based on these evaluations. Evaluation reliability can reach ~75% for yield traits, which is adequate for marketing semen o...

  2. GAPIT: genome association and prediction integrated tool.

    PubMed

    Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu

    2012-09-15

    Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. Availability: http://www.maizegenetics.net/GAPIT. Contact: zhiwu.zhang@cornell.edu. Supplementary data are available at Bioinformatics online.

  3. Application of genomic selection in farm animal breeding.

    PubMed

    Tan, Cheng; Bian, Cheng; Yang, Da; Li, Ning; Wu, Zhen-Fang; Hu, Xiao-Xiang

    2017-11-20

    Genomic selection (GS) has become a widely accepted method in animal breeding to genetically improve economic traits. With the declining costs of high-density SNP chips and next-generation sequencing, GS has been applied in dairy cattle, swine, poultry and other animals and gained varying degrees of success. Currently, major challenges in GS studies include further reducing the cost of genome-wide SNP genotyping and improving the predictive accuracy of genomic estimated breeding value (GEBV). In this review, we summarize various methods for genome-wide SNP genotyping and GEBV prediction, and give a brief introduction of GS in livestock and poultry breeding. This review will provide a reference for further implementation of GS in farm animal breeding.

  4. Closing the gap between knowledge and clinical application: challenges for genomic translation.

    PubMed

    Burke, Wylie; Korngiebel, Diane M

    2015-01-01

    Despite early predictions and rapid progress in research, the introduction of personal genomics into clinical practice has been slow. Several factors contribute to this translational gap between knowledge and clinical application. The evidence available to support genetic test use is often limited, and implementation of new testing programs can be challenging. In addition, the heterogeneity of genomic risk information points to the need for strategies to select and deliver the information most appropriate for particular clinical needs. Accomplishing these tasks also requires recognition that some expectations for personal genomics are unrealistic, notably expectations concerning the clinical utility of genomic risk assessment for common complex diseases. Efforts are needed to improve the body of evidence addressing clinical outcomes for genomics, apply implementation science to personal genomics, and develop realistic goals for genomic risk assessment. In addition, translational research should emphasize the broader benefits of genomic knowledge, including applications of genomic research that provide clinical benefit outside the context of personal genomic risk.

  5. Stakeholder engagement: a key component of integrating genomic information into electronic health records

    PubMed Central

    Hartzler, Andrea; McCarty, Catherine A.; Rasmussen, Luke V.; Williams, Marc S.; Brilliant, Murray; Bowton, Erica A.; Clayton, Ellen Wright; Faucett, William A.; Ferryman, Kadija; Field, Julie R.; Fullerton, Stephanie M.; Horowitz, Carol R.; Koenig, Barbara A.; McCormick, Jennifer B.; Ralston, James D.; Sanderson, Saskia C.; Smith, Maureen E.; Trinidad, Susan Brown

    2014-01-01

    Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine. PMID:24030437

  6. Genomic selection in a commercial winter wheat population.

    PubMed

    He, Sang; Schulthess, Albert Wilhelm; Mirdita, Vilson; Zhao, Yusheng; Korzun, Viktor; Bothe, Reiner; Ebmeyer, Erhard; Reif, Jochen C; Jiang, Yong

    2016-03-01

    Genomic selection models can be trained using historical data, and filtering genotypes based on phenotyping intensity and a reliability criterion can increase the prediction ability. We implemented genomic selection based on a large commercial population incorporating 2325 European winter wheat lines. Our objectives were (1) to study whether modeling epistasis in addition to additive genetic effects enhances the prediction ability of genomic selection, (2) to assess prediction ability when the training population comprised historical or less-intensively phenotyped lines, and (3) to explore the prediction ability in subpopulations selected based on the reliability criterion. We found a 5% increase in prediction ability when shifting from additive to additive plus epistatic effects models. In addition, only a marginal loss from 0.65 to 0.50 in accuracy was observed when using the data collected in 1 year to predict genotypes of the following year, revealing that stable genomic selection models can be accurately calibrated to predict subsequent breeding stages. Moreover, prediction ability was maximized when the genotypes evaluated in a single location were excluded from the training set but subsequently decreased again when the phenotyping intensity was increased above two locations, suggesting that the update of the training population should be performed considering all the selected genotypes but excluding those evaluated in a single location. The genomic prediction ability was substantially higher in subpopulations selected based on the reliability criterion, indicating that phenotypic selection for highly reliable individuals could be directly replaced by applying genomic selection to them. We empirically conclude that there is a high potential to assist commercial wheat breeding programs employing genomic selection approaches.

  7. Applications of Genomics to Genetic Improvement of Dairy Cattle

    USDA-ARS's Scientific Manuscript database

    Implementation of genomic evaluation has caused profound changes in dairy cattle breeding. All young bulls bought by major artificial-insemination (AI) organizations now are selected based on such evaluations. Evaluation reliability can reach about 75% for yield traits, which is adequate for marketi...

  8. SweeD: likelihood-based detection of selective sweeps in thousands of genomes.

    PubMed

    Pavlidis, Pavlos; Živkovic, Daniel; Stamatakis, Alexandros; Alachiotis, Nikolaos

    2013-09-01

    The advent of modern DNA sequencing technology is the driving force in obtaining complete intraspecific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single nucleotide polymorphism patterns in intraspecific genome alignments. In the last decade, a plethora of algorithms for identifying selective sweeps have been developed. However, the majority of these algorithms have not been designed for analyzing whole-genome data. We present SweeD (Sweep Detector), an open-source tool for the rapid detection of selective sweeps in whole genomes. It analyzes site frequency spectra and represents a substantial extension of the widely used SweepFinder program. The sequential version of SweeD is up to 22 times faster than SweepFinder and, more importantly, is able to analyze thousands of sequences. We also provide a parallel implementation of SweeD for multi-core processors. Furthermore, we implemented a checkpointing mechanism that allows SweeD to be deployed on cluster systems with queue execution time restrictions, and long-running analyses to be resumed after processor failures. In addition, the user can specify various demographic models via the command line to calculate their theoretically expected site frequency spectra. Therefore, in contrast to SweepFinder, the neutral site frequencies can optionally be calculated directly from a given demographic model. We show that an increase of sample size results in more precise detection of positive selection. Thus, the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection. We validate SweeD via simulations and by scanning the first chromosome from the 1000 Genomes Project for selective sweeps. We compare SweeD results with results from a linkage-disequilibrium-based approach and identify common outliers.
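
    As a rough illustration of the site frequency spectrum (SFS) that the abstract says SweeD analyzes, the Python sketch below tabulates an unfolded SFS from a small alignment. It is not SweeD's code and does not reproduce its likelihood computation; the function name `site_frequency_spectrum` and the 0/1 coding (0 = ancestral, 1 = derived allele) are assumptions made for this example.

```python
def site_frequency_spectrum(haplotypes):
    """Tabulate the unfolded SFS of an alignment.

    haplotypes: list of equal-length sequences coded 0 (ancestral) / 1 (derived).
    Returns a list sfs where sfs[k] is the number of sites at which exactly
    k of the sampled sequences carry the derived allele.
    """
    n = len(haplotypes)
    sfs = [0] * (n + 1)
    # zip(*haplotypes) walks the alignment column by column (site by site).
    for site in zip(*haplotypes):
        sfs[sum(site)] += 1
    return sfs


# Hypothetical toy alignment: 3 sequences, 4 sites.
toy_alignment = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
toy_sfs = site_frequency_spectrum(toy_alignment)
```

    In the toy alignment, two sites are singletons, one is fixed for the derived allele, and one is monomorphic ancestral.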

  9. Genomic selection using beef commercial carcass phenotypes.

    PubMed

    Todd, D L; Roughsedge, T; Woolliams, J A

    2014-03-01

    In this study, an industry terminal breeding goal was used in a deterministic simulation, using selection index methodology, to predict genetic gain in a beef population modelled on the UK pedigree Limousin, when using genomic selection (GS) and incorporating phenotype information from novel commercial carcass traits. The effect of genotype-environment interaction was investigated by including the model variations of the genetic correlation between purebred and commercial cross-bred performance (ρX). Three genomic scenarios were considered: (1) genomic breeding values (GBV) + estimated breeding values (EBV) for existing selection traits; (2) GBV for three novel commercial carcass traits + EBV in existing traits; and (3) GBV for novel and existing traits plus EBV for existing traits. Each of the three scenarios was simulated for a range of training population (TP) sizes and with three values of ρX. Scenarios 2 and 3 predicted substantially higher percentage increases over current selection than Scenario 1. A TP of 2000 sires, each with 20 commercial progeny with carcass phenotypes, and assuming a ρX of 0.7, is predicted to increase gain by 40% over current selection in Scenario 3. The percentage increase in gain over current selection increased with decreasing ρX; however, the effect of varying ρX was reduced at high TP sizes for Scenarios 2 and 3. A further non-genomic scenario (4) was considered simulating a conventional population-wide progeny test using EBV only. With 20 commercial cross-bred progenies per sire, similar gain was predicted to Scenario 3 with TP = 5000 and ρX = 1.0. The range of increases in genetic gain predicted for terminal traits when using GS is of similar magnitude to those observed after the implementation of BLUP technology in the United Kingdom. It is concluded that implementation of GS in a terminal sire breeding goal, using purebred phenotypes alone, will be sub-optimal compared with the inclusion of novel commercial carcass phenotypes in genomic evaluations.

  10. Performance comparison of two efficient genomic selection methods (gsbay & MixP) applied in aquacultural organisms

    NASA Astrophysics Data System (ADS)

    Su, Hailin; Li, Hengde; Wang, Shi; Wang, Yangfan; Bao, Zhenmin

    2017-02-01

    Genomic selection is increasingly popular in animal and plant breeding industries around the world, as it can be applied early in life without impacting selection candidates. The objective of this study was to bring the advantages of genomic selection to scallop breeding. Two genomic selection tools, MixP and gsbay, were applied to the genomic evaluation of simulated data and Zhikong scallop (Chlamys farreri) field data. The results were compared with those of the genomic best linear unbiased prediction (GBLUP) method, which has been applied widely. Our results showed that both MixP and gsbay could accurately estimate single-nucleotide polymorphism (SNP) marker effects, and thereby could be applied to the estimation of genomic estimated breeding values (GEBV). In simulated data from different scenarios, the accuracy of GEBV ranged from 0.20 to 0.78 with MixP, from 0.21 to 0.67 with gsbay, and from 0.21 to 0.61 with GBLUP. Estimates made by MixP and gsbay were expected to be more reliable than those made by GBLUP. Predictions made by gsbay were more robust, while computation with MixP was much faster, especially when dealing with large-scale data. These results suggest that the algorithms implemented in MixP and gsbay are both feasible for genomic selection in scallop breeding, and that more genotype data will be necessary to produce genomic estimated breeding values with higher accuracy for the industry.
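
    The two-step idea in this abstract (estimate SNP marker effects, then sum them into a GEBV) can be sketched in Python with plain ridge regression on centred genotypes. This is not the MixP or gsbay algorithm, and the penalty `lam` is an assumed fixed value rather than one derived from variance components; all function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_marker_effects(genotypes, phenotypes, lam):
    """Ridge estimates of SNP effects: solve (Z'Z + lam*I) b = Z'y."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y = [v - y_mean for v in phenotypes]
    lhs = [[sum(z[i][j] * z[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    rhs = [sum(z[i][j] * y[i] for i in range(n)) for j in range(p)]
    return col_means, solve(lhs, rhs)


def gebv(genotype, col_means, effects):
    """GEBV of one candidate: centred genotype codes times marker effects."""
    return sum((g - m) * b for g, m, b in zip(genotype, col_means, effects))


# Hypothetical training set: 5 individuals, 2 SNPs coded 0/1/2,
# phenotypes generated without noise from effects +2 and -1.
genotypes = [[0, 0], [1, 0], [2, 1], [0, 2], [1, 1]]
phenotypes = [10 + 2 * g1 - 1 * g2 for g1, g2 in genotypes]
col_means, effects = fit_marker_effects(genotypes, phenotypes, lam=1e-9)
candidate_gebv = gebv([2, 0], col_means, effects)  # unphenotyped candidate
```

    With a near-zero penalty and noise-free data the sketch recovers the generating effects; larger values of `lam` shrink the estimated effects toward zero.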

  11. Allele frequency changes due to hitch-hiking in genomic selection programs

    PubMed Central

    2014-01-01

    Background: Genomic selection makes it possible to reduce pedigree-based inbreeding over best linear unbiased prediction (BLUP) by increasing emphasis on own rather than family information. However, pedigree inbreeding might not accurately reflect loss of genetic variation and the true level of inbreeding due to changes in allele frequencies and hitch-hiking. This study aimed at understanding the impact of using long-term genomic selection on changes in allele frequencies, genetic variation and level of inbreeding. Methods: Selection was performed in simulated scenarios with a population of 400 animals for 25 consecutive generations. Six genetic models were considered with different heritabilities and numbers of QTL (quantitative trait loci) affecting the trait. Four selection criteria were used, including selection on own phenotype and on estimated breeding values (EBV) derived using phenotype-BLUP, genomic BLUP and Bayesian Lasso. Changes in allele frequencies at QTL, markers and linked neutral loci were investigated for the different selection criteria and different scenarios, along with the loss of favourable alleles and the rate of inbreeding measured by pedigree and runs of homozygosity. Results: For each selection criterion, hitch-hiking in the vicinity of the QTL appeared more extensive when accuracy of selection was higher and the number of QTL was lower. When inbreeding was measured by pedigree information, selection on genomic BLUP EBV resulted in lower levels of inbreeding than selection on phenotype BLUP EBV, but this did not always apply when inbreeding was measured by runs of homozygosity. Compared to genomic BLUP, selection on EBV from Bayesian Lasso led to less genetic drift, reduced loss of favourable alleles and more effectively controlled the rate of both pedigree and genomic inbreeding in all simulated scenarios. In addition, selection on EBV from Bayesian Lasso showed a higher selection differential for mendelian sampling terms than selection on genomic BLUP EBV. Conclusions: Neutral variation can be shaped to a great extent by the hitch-hiking effects associated with selection, rather than just by genetic drift. When implementing long-term genomic selection, strategies for genomic control of inbreeding are essential, due to a considerable hitch-hiking effect, regardless of the method that is used for prediction of EBV. PMID:24495634
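
    The abstract measures genomic inbreeding by runs of homozygosity (ROH). The Python sketch below shows one simplified way such a measure could be computed for a single individual. It counts SNPs rather than base pairs and allows no heterozygous calls inside a run, both of which are simplifying assumptions; the function names and the `min_snps` threshold are hypothetical and not taken from the study.

```python
def runs_of_homozygosity(genotypes, min_snps):
    """Return (start, end) index pairs of homozygous runs, end exclusive.

    genotypes: ordered SNP calls along one chromosome, coded 0/1/2,
    where 1 is heterozygous. Runs shorter than min_snps are discarded.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:                      # homozygous call extends or opens a run
            if start is None:
                start = i
        else:                           # heterozygous call closes any open run
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs


def f_roh(genotypes, min_snps):
    """Fraction of SNPs that fall inside a run of homozygosity."""
    covered = sum(end - start
                  for start, end in runs_of_homozygosity(genotypes, min_snps))
    return covered / len(genotypes)


# Hypothetical individual with 10 ordered SNP calls.
calls = [0, 0, 2, 2, 1, 0, 1, 2, 2, 2]
roh_segments = runs_of_homozygosity(calls, min_snps=3)
roh_inbreeding = f_roh(calls, min_snps=3)
```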

  12. Genetic diversity and signatures of selection in various goat breeds revealed by genome-wide SNP markers.

    PubMed

    Brito, Luiz F; Kijas, James W; Ventura, Ricardo V; Sargolzaei, Mehdi; Porto-Neto, Laercio R; Cánovas, Angela; Feng, Zeny; Jafarikia, Mohsen; Schenkel, Flávio S

    2017-03-14

    The detection of signatures of selection has the potential to elucidate the identities of genes and mutations associated with phenotypic traits important for livestock species. It is also very relevant to investigate the levels of genetic diversity of a population, as genetic diversity represents the raw material essential for breeding and has practical implications for implementation of genomic selection. A total of 1151 animals from nine goat populations selected for different breeding goals and genotyped with the Illumina Goat 50K single nucleotide polymorphisms (SNP) Beadchip were included in this investigation. The proportion of polymorphic SNPs ranged from 0.902 (Nubian) to 0.995 (Rangeland). The overall mean H_O and H_E was 0.374 ± 0.021 and 0.369 ± 0.023, respectively. The average pairwise genetic distance (D) ranged from 0.263 (Toggenburg) to 0.323 (Rangeland). The overall average for the inbreeding measures F_EH, F_VR, F_LEUT, F_ROH and F_PED was 0.129, -0.012, -0.010, 0.038 and 0.030, respectively. Several regions located on 19 chromosomes were potentially under selection in at least one of the goat breeds. The genomic population tree constructed using all SNPs differentiated breeds based on selection purpose, while genomic population tree built using only SNPs in the most significant region showed a great differentiation between LaMancha and the other breeds. We hypothesized that this region is related to ear morphogenesis. Furthermore, we identified genes potentially related to reproduction traits, adult body mass, efficiency of food conversion, abdominal fat deposition, conformation traits, liver fat metabolism, milk fatty acids, somatic cells score, milk protein, thermo-tolerance and ear morphogenesis. In general, moderate to high levels of genetic variability were observed for all the breeds and a characterization of runs of homozygosity gave insights into the breeds' development history. The information reported here will be useful for the implementation of genomic selection and other genomic studies in goats. We also identified various genome regions under positive selection using smoothed F_ST and hapFLK statistics and suggested genes, which are potentially under selection. These results can now provide a foundation to formulate biological hypotheses related to selection processes in goats.
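
    For readers unfamiliar with the observed and expected heterozygosity statistics reported above, the following Python sketch computes both for a single biallelic SNP. It is a textbook calculation under Hardy-Weinberg expectations, not the pipeline used in the study; the function name and the 0/1/2 genotype coding are assumptions. Population-level values such as those in the abstract would be averages of these per-SNP values.

```python
def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one biallelic SNP.

    genotypes: calls coded as the count of one allele (0, 1 or 2), so that
    1 marks a heterozygote. Expected heterozygosity is 2p(1 - p), where p
    is the frequency of the counted allele.
    """
    n = len(genotypes)
    h_obs = sum(1 for g in genotypes if g == 1) / n
    p = sum(genotypes) / (2 * n)
    h_exp = 2 * p * (1 - p)
    return h_obs, h_exp


# Hypothetical samples: one in Hardy-Weinberg proportions, one with no heterozygotes.
balanced = heterozygosity([0, 1, 1, 2])
no_hets = heterozygosity([0, 0, 2, 2])
```

    An observed value below the expected value, as in the second sample, indicates a deficit of heterozygotes.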

  13. Technical Report: Algorithm and Implementation for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McLoughlin, Kevin

    2016-01-11

    This report describes the design and implementation of an algorithm for estimating relative microbial abundances, together with confidence limits, using data from metagenomic DNA sequencing. For the background behind this project and a detailed discussion of our modeling approach for metagenomic data, we refer the reader to our earlier technical report, dated March 4, 2014. Briefly, we described a fully Bayesian generative model for paired-end sequence read data, incorporating the effects of the relative abundances, the distribution of sequence fragment lengths, fragment position bias, sequencing errors and variations between the sampled genomes and the nearest reference genomes. A distinctive feature of our modeling approach is the use of a Chinese restaurant process (CRP) to describe the selection of genomes to be sampled, and thus the relative abundances. The CRP component is desirable for fitting abundances to reads that may map ambiguously to multiple targets, because it naturally leads to sparse solutions that select the best representative from each set of nearly equivalent genomes.
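
    To make the Chinese restaurant process mentioned above concrete, the Python sketch below draws one sample from a standard CRP prior. It illustrates only the generic process, not the report's full generative model or inference algorithm; the function name, the concentration parameter `alpha` and the seed are assumptions made for this example.

```python
import random


def chinese_restaurant_process(n_customers, alpha, rng):
    """Draw one seating arrangement from a CRP with concentration alpha.

    Customer i (0-based) joins existing table k with probability
    size_k / (i + alpha) and opens a new table with probability
    alpha / (i + alpha). Returns (assignments, table_sizes).
    """
    table_sizes = []
    assignments = []
    for i in range(n_customers):
        r = rng.random() * (i + alpha)
        cumulative = 0.0
        for k, size in enumerate(table_sizes):
            cumulative += size
            if r < cumulative:          # landed in table k's share of the mass
                table_sizes[k] += 1
                assignments.append(k)
                break
        else:                           # landed in the alpha share: new table
            table_sizes.append(1)
            assignments.append(len(table_sizes) - 1)
    return assignments, table_sizes


# Hypothetical draw: 50 items (e.g. reads) clustered under alpha = 1.0.
assignments, table_sizes = chinese_restaurant_process(50, 1.0, random.Random(0))
```

    Because large tables attract new customers in proportion to their size, a draw typically concentrates most items in a few clusters, which is the sparsity property the abstract refers to.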

  14. An experimental validation of genomic selection in octoploid strawberry

    PubMed Central

    Gezan, Salvador A; Osorio, Luis F; Verma, Sujeet; Whitaker, Vance M

    2017-01-01

    The primary goal of genomic selection is to increase genetic gains for complex traits by predicting performance of individuals for which phenotypic data are not available. The objective of this study was to experimentally evaluate the potential of genomic selection in strawberry breeding and to define a strategy for its implementation. Four clonally replicated field trials, two in each of 2 years comprised of a total of 1628 individuals, were established in 2013–2014 and 2014–2015. Five complex yield and fruit quality traits with moderate to low heritability were assessed in each trial. High-density genotyping was performed with the Affymetrix Axiom IStraw90 single-nucleotide polymorphism array, and 17 479 polymorphic markers were chosen for analysis. Several methods were compared, including Genomic BLUP, Bayes B, Bayes C, Bayesian LASSO Regression, Bayesian Ridge Regression and Reproducing Kernel Hilbert Spaces. Cross-validation within training populations resulted in higher values than for true validations across trials. For true validations, Bayes B gave the highest predictive abilities on average and also the highest selection efficiencies, particularly for yield traits that were the lowest heritability traits. Selection efficiencies using Bayes B for parent selection ranged from 74% for average fruit weight to 34% for early marketable yield. A breeding strategy is proposed in which advanced selection trials are utilized as training populations and in which genomic selection can reduce the breeding cycle from 3 to 2 years for a subset of untested parents based on their predicted genomic breeding values. PMID:28090334

  15. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    NASA Astrophysics Data System (ADS)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model using whole-genome single nucleotide polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize Program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, DBN outperformed the other methods, namely reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. DBN achieved a correlation of 0.579 (on a scale from -1 to 1).
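
    The comparison in this abstract rests on the Pearson correlation between predicted and observed trait values. A minimal Python sketch of that metric follows; the function name and the toy numbers are hypothetical and unrelated to the maize data.

```python
from math import sqrt


def pearson_correlation(observed, predicted):
    """Pearson correlation between observed and predicted values (-1 to 1)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Hypothetical validation set: predictions that rank lines perfectly,
# and predictions that rank them in exactly the wrong order.
perfect = pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
reversed_rank = pearson_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```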

  16. A Primer on High-Throughput Computing for Genomic Selection

    PubMed Central

    Wu, Xiao-Lin; Beissinger, Timothy M.; Bauck, Stewart; Woodward, Brent; Rosa, Guilherme J. M.; Weigel, Kent A.; Gatti, Natalia de Leon; Gianola, Daniel

    2011-01-01

    High-throughput computing (HTC) uses computer clusters to solve advanced computational problems, with the goal of accomplishing high throughput over relatively long periods of time. In genomic selection, for example, a set of markers covering the entire genome is used to train a model based on known data, and the resulting model is used to predict the genetic merit of selection candidates. Sophisticated models are very computationally demanding and, with several traits to be evaluated sequentially, computing time is long, and output is low. In this paper, we present scenarios and basic principles of how HTC can be used in genomic selection, implemented using various techniques from simple batch processing to pipelining in distributed computer clusters. Various scripting languages, such as shell scripting, Perl, and R, are also very useful to devise pipelines. By pipelining, we can reduce total computing time and consequently increase throughput. In comparison to the traditional data processing pipeline residing on the central processors, performing general-purpose computation on a graphics processing unit provides a new-generation approach to massively parallel computing in genomic selection. While the concept of HTC may still be new to many researchers in animal breeding, plant breeding, and genetics, HTC infrastructures have already been built in many institutions, such as the University of Wisconsin–Madison, which can be leveraged for genomic selection, in terms of central processing unit capacity, network connectivity, storage availability, and middleware connectivity. Exploring existing HTC infrastructures as well as general-purpose computing environments will further expand our capability to meet increasing computing demands posed by unprecedented genomic data that we have today. We anticipate that HTC will impact genomic selection via better statistical models, faster solutions, and more competitive products (e.g., from design of marker panels to realized genetic gain). Eventually, HTC may change our view of data analysis as well as decision-making in the post-genomic era of selection programs in animals and plants, or in the study of complex diseases in humans. PMID:22303303

  17. Evaluating the role of public health in implementation of genomics-related recommendations: a case study of hereditary cancers using the CDC Science Impact Framework.

    PubMed

    Green, Ridgely Fisk; Ari, Mary; Kolor, Katherine; Dotson, W David; Bowen, Scott; Habarta, Nancy; Rodriguez, Juan L; Richardson, Lisa C; Khoury, Muin J

    2018-06-15

    Public health plays an important role in ensuring access to interventions that can prevent disease, including the implementation of evidence-based genomic recommendations. We used the Centers for Disease Control and Prevention (CDC) Science Impact Framework to trace the impact of public health activities and partnerships on the implementation of the 2009 Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Lynch Syndrome screening recommendation and the 2005 and 2013 United States Preventive Services Task Force (USPSTF) BRCA1 and BRCA2 testing recommendations. The EGAPP and USPSTF recommendations have each been cited by >300 peer-reviewed publications. CDC funds selected states to build capacity to integrate these recommendations into public health programs, through education, policy, surveillance, and partnerships. Most state cancer control plans include genomics-related goals, objectives, or strategies. Since the EGAPP recommendation, major public and private payers now provide coverage for Lynch Syndrome screening for all newly diagnosed colorectal cancers. National guidelines and initiatives, including Healthy People 2020, included similar recommendations and cited the EGAPP and USPSTF recommendations. However, disparities in implementation based on race, ethnicity, and rural residence remain challenges. Public health achievements in promoting the evidence-based use of genomics for the prevention of hereditary cancers can inform future applications of genomics in public health.

  18. Comparison of Models and Whole-Genome Profiling Approaches for Genomic-Enabled Prediction of Septoria Tritici Blotch, Stagonospora Nodorum Blotch, and Tan Spot Resistance in Wheat.

    PubMed

    Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E

    2017-07-01

    The leaf spotting diseases in wheat that include Septoria tritici blotch (STB) caused by Zymoseptoria tritici, Stagonospora nodorum blotch (SNB) caused by Parastagonospora nodorum, and tan spot (TS) caused by Pyrenophora tritici-repentis pose challenges to breeding programs in selecting for resistance. A promising approach that could enable selection prior to phenotyping is genomic selection that uses genome-wide markers to estimate breeding values (BVs) for quantitative traits. To evaluate this approach for seedling and/or adult plant resistance (APR) to STB, SNB, and TS, we compared the predictive ability of least-squares (LS) approach with genomic-enabled prediction models including genomic best linear unbiased predictor (GBLUP), Bayesian ridge regression (BRR), Bayes A (BA), Bayes B (BB), Bayes Cπ (BC), Bayesian least absolute shrinkage and selection operator (BL), and reproducing kernel Hilbert spaces markers (RKHS-M), a pedigree-based model (RKHS-P) and RKHS markers and pedigree (RKHS-MP). We observed that LS gave the lowest prediction accuracies and RKHS-MP, the highest. The genomic-enabled prediction models and RKHS-P gave similar accuracies. The increase in accuracy using genomic prediction models over LS was 48%. The mean genomic prediction accuracies were 0.45 for STB (APR), 0.55 for SNB (seedling), 0.66 for TS (seedling) and 0.48 for TS (APR). We also compared markers from two whole-genome profiling approaches: genotyping by sequencing (GBS) and diversity arrays technology sequencing (DArTseq) for prediction. While GBS markers performed slightly better than DArTseq, combining markers from the two approaches did not improve accuracies. We conclude that implementing GS in breeding for these diseases would help to achieve higher accuracies and rapid gains from selection. Copyright © 2017 Crop Science Society of America.

  19. Simulating a base population in honey bee for molecular genetic studies

    PubMed Central

    2012-01-01

    Background: Over the past years, reports have indicated that honey bee populations are declining and that infestation by an ecto-parasitic mite (Varroa destructor) is one of the main causes. Selective breeding of resistant bees can help to prevent losses due to the parasite, but it requires that a robust breeding program and genetic evaluation are implemented. Genomic selection has emerged as an important tool in animal breeding programs and simulation studies have shown that it yields more accurate breeding value estimates, higher genetic gain and low rates of inbreeding. Since genomic selection relies on marker data, simulations conducted on a genomic dataset are a pre-requisite before selection can be implemented. Although genomic datasets have been simulated in other species undergoing genetic evaluation, simulation of a genomic dataset specific to the honey bee is required since this species has a distinct genetic and reproductive biology. Our software program was aimed at constructing a base population by simulating a random mating honey bee population. A forward-time population simulation approach was applied since it allows modeling of genetic characteristics and reproductive behavior specific to the honey bee. Results: Our software program yielded a genomic dataset for a base population in linkage disequilibrium. In addition, information was obtained on (1) the position of markers on each chromosome, (2) allele frequency, (3) χ² statistics for Hardy-Weinberg equilibrium, (4) a sorted list of markers with a minor allele frequency less than or equal to the input value, (5) average r² values of linkage disequilibrium between all simulated marker loci pairs for all generations and (6) average r² value of linkage disequilibrium in the last generation for selected markers with the highest minor allele frequency. Conclusion: We developed a software program that takes into account the genetic and reproductive biology specific to the honey bee and that can be used to constitute a genomic dataset compatible with the simulation studies necessary to optimize breeding programs. The source code together with an instruction file is freely accessible at http://msproteomics.org/Research/Misc/honeybeepopulationsimulator.html PMID:22520469
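
    Two of the summary statistics listed in this abstract, the χ² statistic for Hardy-Weinberg equilibrium and the r² measure of linkage disequilibrium, can be sketched in Python as below. These are the standard diploid textbook formulas, not code from the honey bee simulator, and they ignore the haplodiploid biology the paper emphasises; the function names and input layouts are assumptions.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)     # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("monomorphic marker: HWE test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1.
    r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical inputs.
chi2_in_hwe = hwe_chi_square(25, 50, 25)        # counts match expectations
chi2_no_hets = hwe_chi_square(50, 0, 50)        # complete heterozygote deficit
r2_complete = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
r2_none = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```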

  20. Simulating a base population in honey bee for molecular genetic studies.

    PubMed

    Gupta, Pooja; Conrad, Tim; Spötter, Andreas; Reinsch, Norbert; Bienefeld, Kaspar

    2012-06-27

    Over the past years, reports have indicated that honey bee populations are declining and that infestation by an ecto-parasitic mite (Varroa destructor) is one of the main causes. Selective breeding of resistant bees can help to prevent losses due to the parasite, but it requires that a robust breeding program and genetic evaluation are implemented. Genomic selection has emerged as an important tool in animal breeding programs and simulation studies have shown that it yields more accurate breeding value estimates, higher genetic gain and low rates of inbreeding. Since genomic selection relies on marker data, simulations conducted on a genomic dataset are a pre-requisite before selection can be implemented. Although genomic datasets have been simulated in other species undergoing genetic evaluation, simulation of a genomic dataset specific to the honey bee is required since this species has a distinct genetic and reproductive biology. Our software program was aimed at constructing a base population by simulating a random mating honey bee population. A forward-time population simulation approach was applied since it allows modeling of genetic characteristics and reproductive behavior specific to the honey bee. Our software program yielded a genomic dataset for a base population in linkage disequilibrium. In addition, information was obtained on (1) the position of markers on each chromosome, (2) allele frequency, (3) χ² statistics for Hardy-Weinberg equilibrium, (4) a sorted list of markers with a minor allele frequency less than or equal to the input value, (5) average r² values of linkage disequilibrium between all simulated marker loci pair for all generations and (6) average r² value of linkage disequilibrium in the last generation for selected markers with the highest minor allele frequency. We developed a software program that takes into account the genetic and reproductive biology specific to the honey bee and that can be used to constitute a genomic dataset compatible with the simulation studies necessary to optimize breeding programs. The source code together with an instruction file is freely accessible at http://msproteomics.org/Research/Misc/honeybeepopulationsimulator.html.
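
    The per-marker summaries listed in this abstract (allele frequency, a χ² test for Hardy-Weinberg equilibrium, and r² as a measure of linkage disequilibrium) can be illustrated with a minimal sketch. This is not the authors' simulator: the function names are hypothetical, genotypes are assumed to be diploid alternate-allele counts (0/1/2, as for queens or workers), and haplotypes are assumed to be phased and coded 0/1.

```python
def allele_frequency(genotypes):
    """Alternate-allele frequency from diploid genotypes coded 0/1/2."""
    return sum(genotypes) / (2 * len(genotypes))


def hwe_chi_square(genotypes):
    """Pearson chi-square statistic for Hardy-Weinberg proportions at one biallelic locus."""
    n = len(genotypes)
    p = allele_frequency(genotypes)  # alternate allele
    q = 1.0 - p                      # reference allele
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [genotypes.count(k) for k in (0, 1, 2)]
    # Skip classes with zero expectation (monomorphic loci).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def ld_r2(hap_a, hap_b):
    """r^2 between two loci from phased haplotypes coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")
```

    For example, `hwe_chi_square([0] * 25 + [1] * 50 + [2] * 25)` is zero because the genotype counts match Hardy-Weinberg proportions exactly, and `ld_r2([0, 0, 1, 1], [0, 0, 1, 1])` is 1.0 because the two loci are perfectly correlated.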

  1. Genome-wide detection of selection signatures in Chinese indigenous Laiwu pigs revealed candidate genes regulating fat deposition in muscle.

    PubMed

    Chen, Minhui; Wang, Jiying; Wang, Yanping; Wu, Ying; Fu, Jinluan; Liu, Jian-Feng

    2018-05-18

    Genome-wide scans for positive selection signatures have been carried out in commercial breeds. However, few studies have focused on selection footprints of indigenous breeds. The Laiwu pig is an invaluable Chinese indigenous pig breed with an extremely high proportion of intramuscular fat (IMF), and an excellent model for detecting footprints resulting from natural and artificial selection for fat deposition in muscle. In this study, based on GeneSeek Genomic profiler Porcine HD data, three complementary methods, FST, iHS (integrated haplotype homozygosity score) and CLR (composite likelihood ratio), were implemented to detect selection signatures in the whole genome of Laiwu pigs. In total, 175 candidate selected regions were obtained by at least two of the three methods, which covered 43.75 Mb of genomic regions and corresponded to 1.79% of the genome sequence. Gene annotation of the selected regions revealed a list of functionally important genes for feed intake and fat deposition, reproduction, and immune response. In particular, in accordance with the phenotypic features of Laiwu pigs, among the candidate genes, we identified several genes, NPY1R, NPY5R, PIK3R1 and JAKMIP1, involved in the actions of two sets of neurons, which are central regulators in maintaining the balance between food intake and energy expenditure. Our results identified a number of regions showing signatures of selection, as well as a list of functional candidate genes with potential effects on phenotypic traits, especially fat deposition in muscle. Our findings provide insights into the mechanisms of artificial selection of fat deposition and further facilitate follow-up functional studies.
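
    As a rough illustration of two ideas in this abstract, the sketch below computes a simple FST for one biallelic SNP and keeps only genomic windows flagged by at least two methods. The abstract does not say which FST estimator was used; this sketch assumes Wright's heterozygosity-based definition with two equally weighted populations, and the function names and window identifiers are hypothetical.

```python
from collections import Counter


def wright_fst(p1, p2):
    """FST = (HT - HS) / HT for one biallelic SNP in two equally weighted populations."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # heterozygosity of the pooled population
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


def consensus_windows(*hit_sets, min_methods=2):
    """Window IDs reported by at least `min_methods` of the supplied methods."""
    counts = Counter(w for hits in hit_sets for w in set(hits))
    return sorted(w for w, c in counts.items() if c >= min_methods)
```

    With allele frequencies of 0.2 and 0.8 the function returns 0.36, and `consensus_windows({1, 2, 3}, {2, 3, 4}, {3, 5})` returns `[2, 3]`.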

  2. iPat: intelligent prediction and association tool for genomic research.

    PubMed

    Chen, Chunpeng James; Zhang, Zhiwu

    2018-06-01

    The ultimate goal of genomic research is to effectively predict phenotypes from genotypes so that medical management can improve human health and molecular breeding can increase agricultural production. Genomic prediction or selection (GS) plays a complementary role to genome-wide association studies (GWAS), which is the primary method to identify genes underlying phenotypes. Unfortunately, most computing tools cannot perform data analyses for both GWAS and GS. Furthermore, the majority of these tools are executed through a command-line interface (CLI), which requires programming skills. Non-programmers struggle to use them efficiently because of the steep learning curves and zero tolerance for data formats and mistakes when inputting keywords and parameters. To address these problems, this study developed a software package, named the Intelligent Prediction and Association Tool (iPat), with a user-friendly graphical user interface. With iPat, GWAS or GS can be performed using a pointing device to simply drag and/or click on graphical elements to specify input data files, choose input parameters and select analytical models. Models available to users include those implemented in third party CLI packages such as GAPIT, PLINK, FarmCPU, BLINK, rrBLUP and BGLR. Users can choose any data format and conduct analyses with any of these packages. File conversions are automatically conducted for specified input data and selected packages. A GWAS-assisted genomic prediction method was implemented to perform genomic prediction using any GWAS method such as FarmCPU. iPat was written in Java for adaptation to multiple operating systems including Windows, Mac and Linux. The iPat executable file, user manual, tutorials and example datasets are freely available at http://zzlab.net/iPat. zhiwu.zhang@wsu.edu.

  3. Accuracy of estimation of genomic breeding values in pigs using low-density genotypes and imputation.

    PubMed

    Badke, Yvonne M; Bates, Ronald O; Ernst, Catherine W; Fix, Justin; Steibel, Juan P

    2014-04-16

    Genomic selection has the potential to increase genetic progress. Genotype imputation of high-density single-nucleotide polymorphism (SNP) genotypes can improve the cost efficiency of genomic breeding value (GEBV) prediction for pig breeding. Consequently, the objectives of this work were to: (1) estimate accuracy of genomic evaluation and GEBV for three traits in a Yorkshire population and (2) quantify the loss of accuracy of genomic evaluation and GEBV when genotypes were imputed under two scenarios: a high-cost, high-accuracy scenario in which only selection candidates were imputed from a low-density platform and a low-cost, low-accuracy scenario in which all animals were imputed using a small reference panel of haplotypes. Phenotypes and genotypes obtained with the PorcineSNP60 BeadChip were available for 983 Yorkshire boars. Genotypes of selection candidates were masked and imputed using tagSNP in the GeneSeek Genomic Profiler (10K). Imputation was performed with BEAGLE using 128 or 1800 haplotypes as reference panels. GEBV were obtained through an animal-centric ridge regression model using de-regressed breeding values as response variables. Accuracy of genomic evaluation was estimated as the correlation between estimated breeding values and GEBV in a 10-fold cross validation design. Accuracy of genomic evaluation using observed genotypes was high for all traits (0.65-0.68). Using genotypes imputed from a large reference panel (accuracy: R² = 0.95) for genomic evaluation did not significantly decrease accuracy, whereas a scenario with genotypes imputed from a small reference panel (R² = 0.88) did show a significant decrease in accuracy. Genomic evaluation based on imputed genotypes in selection candidates can be implemented at a fraction of the cost of a genomic evaluation using observed genotypes and still yield virtually the same accuracy. On the other hand, using a very small reference panel of haplotypes to impute training animals and candidates for selection results in lower accuracy of genomic evaluation.
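
    The evaluation design described here (ridge regression, with accuracy taken as the correlation between predictions and reference values in k-fold cross-validation) can be sketched in plain Python. This is only an illustration: it uses a marker-based ridge regression rather than the animal-centric model named in the abstract, the penalty `lam` is an arbitrary assumption, and all function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Marker effects from (X'X + lam I) beta = X'y (no intercept; centre y beforehand)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


def cv_accuracy(X, y, lam, k=10, seed=0):
    """Correlation between held-out predictions and reference values over k folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    pred = [0.0] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i in fold:
            pred[i] = sum(b * x for b, x in zip(beta, X[i]))
    return pearson(pred, y)
```

    Solving the normal equations directly is only practical for a handful of markers; it is used here to keep the sketch free of external dependencies.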

  4. Genomic selection for fruit quality traits in apple (Malus×domestica Borkh.).

    PubMed

    Kumar, Satish; Chagné, David; Bink, Marco C A M; Volz, Richard K; Whitworth, Claire; Carlisle, Charmaine

    2012-01-01

    The genome sequence of apple (Malus×domestica Borkh.) was published more than a year ago, which helped develop an 8K SNP chip to assist in implementing genomic selection (GS). In apple breeding programmes, GS can be used to obtain genomic breeding values (GEBV) for choosing next-generation parents or selections for further testing as potential commercial cultivars at a very early stage. Thus GS has the potential to accelerate breeding efficiency significantly because of decreased generation interval or increased selection intensity. We evaluated the accuracy of GS in a population of 1120 seedlings generated from a factorial mating design of four females and two male parents. All seedlings were genotyped using an Illumina Infinium chip comprising 8,000 single nucleotide polymorphisms (SNPs), and were phenotyped for various fruit quality traits. Random-regression best linear unbiased prediction (RR-BLUP) and the Bayesian LASSO method were used to obtain GEBV, and compared using a cross-validation approach for their accuracy to predict unobserved BLUP-BV. Accuracies were very similar for both methods, varying from 0.70 to 0.90 for various fruit quality traits. The selection response per unit time using GS compared with the traditional BLUP-based selection was very high (>100%), especially for low-heritability traits. Genome-wide average estimated linkage disequilibrium (LD) between adjacent SNPs was 0.32, with a relatively slow decay of LD in the long range (r² = 0.33 and 0.19 at 100 kb and 1,000 kb respectively), contributing to the higher accuracy of GS. Distribution of estimated SNP effects revealed involvement of large effect genes with likely pleiotropic effects. These results demonstrated that genomic selection is a credible alternative to conventional selection for fruit quality traits.
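
    The comparison of selection response per unit time can be read against the standard breeder's equation. The symbols below are assumptions introduced for illustration and do not appear in the abstract: i is selection intensity, r is selection accuracy, σ_A is the additive genetic standard deviation and L is the generation interval.

```latex
\[
  \frac{\Delta G}{t} = \frac{i \, r \, \sigma_A}{L},
  \qquad
  \frac{\Delta G_{\mathrm{GS}} / t}{\Delta G_{\mathrm{BLUP}} / t}
  = \frac{r_{\mathrm{GS}} / L_{\mathrm{GS}}}{r_{\mathrm{BLUP}} / L_{\mathrm{BLUP}}}
  \quad \text{(equal } i \text{ and } \sigma_A \text{)}.
\]
```

    As a purely hypothetical case, if both methods had the same accuracy and GS halved the generation interval, the ratio would be 2, that is, a 100% increase in response per unit time.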

  5. Strategies for implementing genomic selection for feed efficiency in dairy cattle breeding schemes.

    PubMed

    Wallén, S E; Lillehammer, M; Meuwissen, T H E

    2017-08-01

    Alternative genomic selection and traditional BLUP breeding schemes were compared for the genetic improvement of feed efficiency in simulated Norwegian Red dairy cattle populations. The change in genetic gain over time and achievable selection accuracy were studied for milk yield and residual feed intake, as a measure of feed efficiency. When including feed efficiency in genomic BLUP schemes, it was possible to achieve high selection accuracies for genomic selection, and all genomic BLUP schemes gave better genetic gain for feed efficiency than BLUP using a pedigree relationship matrix. However, introducing a second trait in the breeding goal caused a reduction in the genetic gain for milk yield. When using contracted test herds with genotyped and feed efficiency recorded cows as a reference population, adding an additional 4,000 new heifers per year to the reference population gave accuracies that were comparable to a male reference population that used progeny testing with 250 daughters per sire. When the test herd consisted of 500 or 1,000 cows, lower genetic gain was found than using progeny test records to update the reference population. It was concluded that to improve difficult to record traits, the use of contracted test herds that had additional recording (e.g., measurements required to calculate feed efficiency) is a viable option, possibly through international collaborations. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  6. Genomics DNA Profiling in Elite Professional Soccer Players: A Pilot Study

    PubMed Central

    Kambouris, M; Del Buono, A; Maffulli, N

    2014-01-01

    Functional variants in exonic regions have been associated with development of cardiovascular disease, diabetes and cancer. Athletic performance can be considered a multi-factorial complex phenotype. Genomic DNA was extracted from buccal swabs of seven soccer players from the Fulham football team. Single nucleotide polymorphism (SNPs) genotyping was undertaken. To achieve optimal athletic performance, predictive genomics DNA profiling for sports performance can be used to aid in sport selection and elaboration of personalized training and nutrition programs. Predictive DNA profiling may be able to detect athletes with potential or frank injuries, or screening and selection of future athletes, and can help them to maximize utilization of their potential and improve performance in sports. The aim of this study is to provide a wide scenario of specific genomic variants that an athlete carries, to implement which measures should be taken to maximize the athlete’s potential. PMID:24809029

  7. TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS

    PubMed Central

    Jones, Matthew R.; Good, Jeffrey M.

    2016-01-01

    The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993

  8. Changes in genetic selection differentials and generation intervals in US Holstein dairy cattle as a result of genomic selection.

    PubMed

    García-Ruiz, Adriana; Cole, John B; VanRaden, Paul M; Wiggans, George R; Ruiz-López, Felipe J; Van Tassell, Curtis P

    2016-07-12

    Seven years after the introduction of genomic selection in the United States, it is now possible to evaluate the impact of this technology on the population. Selection differential(s) (SD) and generation interval(s) (GI) were characterized in a four-path selection model that included sire(s) of bulls (SB), sire(s) of cows (SC), dam(s) of bulls (DB), and dam(s) of cows (DC). Changes in SD over time were estimated for milk, fat, and protein yield; somatic cell score (SCS); productive life (PL); and daughter pregnancy rate (DPR) for the Holstein breed. In the period following implementation of genomic selection, dramatic reductions were seen in GI, especially the SB and SC paths. The SB GI reduced from ∼7 y to less than 2.5 y, and the DB GI fell from about 4 y to nearly 2.5 y. SD were relatively stable for yield traits, although modest gains were noted in recent years. The most dramatic response to genomic selection was observed for the lowly heritable traits DPR, PL, and SCS. Genetic trends changed from close to zero to large and favorable, resulting in rapid genetic improvement in fertility, lifespan, and health in a breed where these traits eroded over time. These results clearly demonstrate the positive impact of genomic selection in US dairy cattle, even though this technology has only been in use for a short time. Based on the four-path selection model, rates of genetic gain per year increased from ∼50-100% for yield traits and from threefold to fourfold for lowly heritable traits.
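
    A four-path model of this kind is commonly summarized as the sum of the selection differentials over the four paths divided by the sum of the generation intervals. The sketch below assumes that classical form; the abstract does not give the exact computation, and the function name and any input values are hypothetical.

```python
PATHS = ("SB", "SC", "DB", "DC")  # sires of bulls, sires of cows, dams of bulls, dams of cows


def annual_genetic_gain(sd, gi):
    """Genetic gain per year: summed selection differentials over summed generation intervals."""
    total_sd = sum(sd[p] for p in PATHS)  # in trait units
    total_gi = sum(gi[p] for p in PATHS)  # in years
    return total_sd / total_gi
```

    With unchanged selection differentials, shortening the generation intervals alone raises the annual gain, which is the mechanism the abstract describes for the SB and DB paths.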

  9. Changes in genetic selection differentials and generation intervals in US Holstein dairy cattle as a result of genomic selection

    PubMed Central

    García-Ruiz, Adriana; Cole, John B.; VanRaden, Paul M.; Wiggans, George R.; Ruiz-López, Felipe J.; Van Tassell, Curtis P.

    2016-01-01

    Seven years after the introduction of genomic selection in the United States, it is now possible to evaluate the impact of this technology on the population. Selection differential(s) (SD) and generation interval(s) (GI) were characterized in a four-path selection model that included sire(s) of bulls (SB), sire(s) of cows (SC), dam(s) of bulls (DB), and dam(s) of cows (DC). Changes in SD over time were estimated for milk, fat, and protein yield; somatic cell score (SCS); productive life (PL); and daughter pregnancy rate (DPR) for the Holstein breed. In the period following implementation of genomic selection, dramatic reductions were seen in GI, especially the SB and SC paths. The SB GI reduced from ∼7 y to less than 2.5 y, and the DB GI fell from about 4 y to nearly 2.5 y. SD were relatively stable for yield traits, although modest gains were noted in recent years. The most dramatic response to genomic selection was observed for the lowly heritable traits DPR, PL, and SCS. Genetic trends changed from close to zero to large and favorable, resulting in rapid genetic improvement in fertility, lifespan, and health in a breed where these traits eroded over time. These results clearly demonstrate the positive impact of genomic selection in US dairy cattle, even though this technology has only been in use for a short time. Based on the four-path selection model, rates of genetic gain per year increased from ∼50–100% for yield traits and from threefold to fourfold for lowly heritable traits. PMID:27354521

  10. Potential and limits to unravel the genetic architecture and predict the variation of Fusarium head blight resistance in European winter wheat (Triticum aestivum L.).

    PubMed

    Jiang, Y; Zhao, Y; Rodemann, B; Plieske, J; Kollers, S; Korzun, V; Ebmeyer, E; Argillier, O; Hinze, M; Ling, J; Röder, M S; Ganal, M W; Mette, M F; Reif, J C

    2015-03-01

    Genome-wide mapping approaches in diverse populations are powerful tools to unravel the genetic architecture of complex traits. The main goals of our study were to investigate the potential and limits to unravel the genetic architecture and to identify the factors determining the accuracy of prediction of the genotypic variation of Fusarium head blight (FHB) resistance in wheat (Triticum aestivum L.) based on data collected with a diverse panel of 372 European varieties. The wheat lines were phenotyped in multi-location field trials for FHB resistance and genotyped with 782 simple sequence repeat (SSR) markers, and 9k and 90k single-nucleotide polymorphism (SNP) arrays. We applied genome-wide association mapping in combination with fivefold cross-validations and observed surprisingly high accuracies of prediction for marker-assisted selection based on the detected quantitative trait loci (QTLs). Using a random sample of markers not selected for marker-trait associations revealed only a slight decrease in prediction accuracy compared with marker-based selection exploiting the QTL information. The same picture was confirmed in a simulation study, suggesting that relatedness is a main driver of the accuracy of prediction in marker-assisted selection of FHB resistance. When the accuracy of prediction of three genomic selection models was contrasted for the three marker data sets, no significant differences in accuracies among marker platforms and genomic selection models were observed. Marker density impacted the accuracy of prediction only marginally. Consequently, genomic selection of FHB resistance can be implemented most cost-efficiently based on low- to medium-density SNP arrays.

  11. Testing for the Occurrence of Selective Episodes During the Divergence of Otophysan Fishes: Insights from Mitogenomics.

    PubMed

    D'Anatro, Alejandro; Giorello, Facundo; Feijoo, Matías; Lessa, Enrique P

    2017-04-01

    How natural selection shapes biodiversity constitutes a topic of renewed interest during the last few decades. The division Otophysi comprises approximately two-thirds of freshwater fish diversity and probably underwent an extensive adaptive radiation derived from a single invasion of the supercontinent Pangaea, giving place to the evolution of the main five Otophysan lineages during a short period of time. Little is known about the factors involved in the processes that lead to lineage diversification among this group of fishes and identifying directional selection acting over protein-coding genes could offer clues about the processes acting on species diversification. The main objective of this study was to explore the otophysan mitochondrial genome evolution, in order to account for the possible signatures of selective events in this lineage, and to explore for the functional connotations of these molecular substitutions. Mainly, three different approaches were used: the "ω-based" BS-REL and MEME methods, implemented in the DATAMONKEY web server, and analysis of selection on amino acid properties, implemented in the software TreeSAAP. We found evidence of selective episodes along several branches of the evolutionary history of otophysan fishes. Analyses carried out using the BS-REL algorithm suggest episodic diversifying selection at basal branches of the otophysan lineage, which was also supported by analyses implemented in MEME and TreeSAAP. These results suggest that throughout the Siluriformes radiation, an important number of adaptive changes occurred in their mitochondrial genome. The metabolic consequences and ecological correlates of these molecular substitutions should be addressed in future studies.

  12. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo

    2017-01-01

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241

  13. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo

    2017-06-07

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.

  14. Ancient genomic changes associated with domestication of the horse.

    PubMed

    Librado, Pablo; Gamba, Cristina; Gaunitz, Charleen; Der Sarkissian, Clio; Pruvost, Mélanie; Albrechtsen, Anders; Fages, Antoine; Khan, Naveed; Schubert, Mikkel; Jagannathan, Vidhya; Serres-Armero, Aitor; Kuderna, Lukas F K; Povolotskaya, Inna S; Seguin-Orlando, Andaine; Lepetz, Sébastien; Neuditschko, Markus; Thèves, Catherine; Alquraishi, Saleh; Alfarhan, Ahmed H; Al-Rasheid, Khaled; Rieder, Stefan; Samashev, Zainolla; Francfort, Henri-Paul; Benecke, Norbert; Hofreiter, Michael; Ludwig, Arne; Keyser, Christine; Marques-Bonet, Tomas; Ludes, Bertrand; Crubézy, Eric; Leeb, Tosso; Willerslev, Eske; Orlando, Ludovic

    2017-04-28

    The genomic changes underlying both early and late stages of horse domestication remain largely unknown. We examined the genomes of 14 early domestic horses from the Bronze and Iron Ages, dating to between ~4.1 and 2.3 thousand years before present. We find early domestication selection patterns supporting the neural crest hypothesis, which provides a unified developmental origin for common domestic traits. Within the past 2.3 thousand years, horses lost genetic diversity and archaic DNA tracts introgressed from a now-extinct lineage. They accumulated deleterious mutations later than expected under the cost-of-domestication hypothesis, probably because of breeding from limited numbers of stallions. We also reveal that Iron Age Scythian steppe nomads implemented breeding strategies involving no detectable inbreeding and selection for coat-color variation and robust forelimbs. Copyright © 2017, American Association for the Advancement of Science.

  15. Mojo Hand, a TALEN design tool for genome editing applications.

    PubMed

    Neff, Kevin L; Argue, David P; Ma, Alvin C; Lee, Han B; Clark, Karl J; Ekker, Stephen C

    2013-01-16

    Recent studies of transcription activator-like (TAL) effector domains fused to nucleases (TALENs) demonstrate enormous potential for genome editing. Effective design of TALENs requires a combination of selecting appropriate genetic features, finding pairs of binding sites based on a consensus sequence, and, in some cases, identifying endogenous restriction sites for downstream molecular genetic applications. We present the web-based program Mojo Hand for designing TAL and TALEN constructs for genome editing applications (http://www.talendesign.org). We describe the algorithm and its implementation. The features of Mojo Hand include (1) automatic download of genomic data from the National Center for Biotechnology Information, (2) analysis of any DNA sequence to reveal pairs of binding sites based on a user-defined template, (3) selection of restriction-enzyme recognition sites in the spacer between the TAL monomer binding sites including options for the selection of restriction enzyme suppliers, and (4) output files designed for subsequent TALEN construction using the Golden Gate assembly method. Mojo Hand enables the rapid identification of TAL binding sites for use in TALEN design. The assembly of TALEN constructs, is also simplified by using the TAL-site prediction program in conjunction with a spreadsheet management aid of reagent concentrations and TALEN formulation. Mojo Hand enables scientists to more rapidly deploy TALENs for genome editing applications.

  16. Estimating P-coverage of biosynthetic pathways in DNA libraries and screening by genetic selection: biotin biosynthesis in the marine microorganism Chromohalobacter.

    PubMed

    Kim, Eun Jin; Angell, Scott; Janes, Jeff; Watanabe, Coran M H

    2008-06-01

    Traditional approaches to natural product discovery involve cell-based screening of natural product extracts followed by compound isolation and characterization. Their importance notwithstanding, continued mining leads to depletion of natural resources and the reisolation of previously identified metabolites. Metagenomic strategies aimed at localizing the biosynthetic cluster genes and expressing them in surrogate hosts offer one possible alternative. A fundamental question that naturally arises when pursuing such a strategy is, how large must the genomic library be to effectively represent the genome of an organism(s) and the biosynthetic gene clusters they harbor? Such an issue is certainly augmented in the absence of expensive robotics to expedite colony picking and/or screening of clones. We have developed an algorithm, named BPC (biosynthetic pathway coverage), supported by molecular simulations to deduce the number of BAC clones required to achieve proper coverage of the genome and their respective biosynthetic pathways. The strategy has been applied to the construction of a large-insert BAC library from a marine microorganism, Hon6 (isolated from Honokohau, Maui) thought to represent a new species. The genomic library is constructed with a BAC yeast shuttle vector pClasper lacZ paving the way for the culturing of libraries in both prokaryotic and eukaryotic hosts. Flow cytometric methods are utilized to estimate the genome size of the organism and BPC implemented to assess P-coverage or percent coverage. A genetic selection strategy is illustrated, applications of which could expedite screening efforts in the identification and localization of biosynthetic pathways from marine microbial consortia, offering a powerful complement to genome sequencing and degenerate probe strategies. Implementing this approach, we report on the biotin biosynthetic pathway from the marine microorganism Hon6.
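
    The library-size question posed in this abstract is classically approached with the Clarke-Carbon relation, N = ln(1 - P) / ln(1 - f). The sketch below implements that textbook relation, not the authors' BPC algorithm, whose details are not given here. The function name, the optional `target_bp` adjustment for a cluster that must fit entirely within one insert, and the example sizes are assumptions.

```python
import math


def clones_needed(insert_bp, genome_bp, probability, target_bp=0):
    """Clones required so that a target is fully contained in at least one clone
    with the given probability, assuming random, uniformly placed inserts."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if target_bp >= insert_bp:
        raise ValueError("target must be shorter than the insert")
    # Chance that a single clone covers the whole target.
    f = (insert_bp - target_bp) / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))
```

    For a hypothetical 4 Mb genome and 100 kb inserts, `clones_needed(100_000, 4_000_000, 0.99)` gives 182 clones; requiring a 30 kb cluster to sit inside a single insert raises that number because fewer insert positions qualify.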

  17. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

    PubMed Central

    2010-01-01

    Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788

  18. Signatures of Long-Term Balancing Selection in Human Genomes

    PubMed Central

    de Filippo, Cesare; Teixeira, João C; Schmidt, Joshua M; Kleinert, Philip; Meyer, Diogo; Andrés, Aida M

    2018-01-01

    Abstract Balancing selection maintains advantageous diversity in populations through various mechanisms. Although extensively explored from a theoretical perspective, an empirical understanding of its prevalence and targets lags behind our knowledge of positive selection. Here, we describe the Non-central Deviation (NCD), a simple yet powerful statistic to detect long-term balancing selection (LTBS) that quantifies how close frequencies are to expectations under LTBS, and provides the basis for a neutrality test. NCD can be applied to a single locus or genomic data, and can be implemented considering only polymorphisms (NCD1) or also considering fixed differences with respect to an outgroup (NCD2) species. Incorporating fixed differences improves power, and NCD2 has higher power to detect LTBS in humans under different frequencies of the balanced allele(s) than other available methods. Applied to genome-wide data from African and European human populations, in both cases using chimpanzee as an outgroup, NCD2 shows that, albeit not prevalent, LTBS affects a sizable portion of the genome: ∼0.6% of analyzed genomic windows and 0.8% of analyzed positions. Significant windows (P < 0.0001) contain 1.6% of SNPs in the genome, which disproportionally fall within exons and change protein sequence, but are not enriched in putatively regulatory sites. These windows overlap ∼8% of the protein-coding genes, and these have larger number of transcripts than expected by chance even after controlling for gene length. Our catalog includes known targets of LTBS but a majority of them (90%) are novel. As expected, immune-related genes are among those with the strongest signatures, although most candidates are involved in other biological functions, suggesting that LTBS potentially influences diverse human phenotypes. PMID:29608730
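
    The idea of a statistic that "quantifies how close frequencies are to expectations under LTBS" can be sketched as a root-mean-square deviation of minor allele frequencies from a target frequency. The formula below is an assumption about the form of NCD based on that description, and the function name and arguments are hypothetical. Fixed differences are counted as sites with frequency 0, which corresponds to the NCD2 variant as described in the abstract.

```python
import math


def ncd(minor_allele_freqs, target_freq=0.5, n_fixed_differences=0):
    """Root-mean-square deviation of allele frequencies from target_freq.

    minor_allele_freqs: frequencies of polymorphic sites in one window.
    n_fixed_differences: fixed differences to an outgroup, each entered
    as frequency 0 (leave at 0 for an NCD1-style calculation).
    Assumed form of the statistic; lower values mean frequencies sit
    closer to the target expected under balancing selection.
    """
    freqs = list(minor_allele_freqs) + [0.0] * n_fixed_differences
    if not freqs:
        raise ValueError("window contains no informative sites")
    squared = sum((p - target_freq) ** 2 for p in freqs)
    return math.sqrt(squared / len(freqs))
```

    Under this form, a window whose polymorphisms all sit at the target frequency scores 0, and adding fixed differences raises the score, which is one way to see why including them can sharpen the contrast with neutral windows.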

  19. Genetic and economic benefits of selection based on performance recording and genotyping in lower tiers of multi-tiered sheep breeding schemes.

    PubMed

    Santos, Bruno F S; van der Werf, Julius H J; Gibson, John P; Byrne, Timothy J; Amer, Peter R

    2017-01-17

    Performance recording and genotyping in the multiplier tier of multi-tiered sheep breeding schemes could potentially reduce the difference in the average genetic merit between nucleus and commercial flocks, and create additional economic benefits for the breeding structure. The genetic change in a multiple-trait breeding objective was predicted for various selection strategies that included performance recording, parentage testing and genomic selection. A deterministic simulation model was used to predict selection differentials and the flow of genetic superiority through the different tiers. Cumulative discounted economic benefits were calculated based on trait gains achieved in each of the tiers and considering the extra revenue and associated costs of applying recording, genotyping and selection practices in the multiplier tier of the breeding scheme. Performance recording combined with genomic or parentage information in the multiplier tier reduced the genetic lag between the nucleus and commercial flock by 2 to 3 years. The overall economic benefits of improved performance in the commercial tier offset the costs of recording the multiplier. However, it took more than 18 years before the cumulative net present value of benefits offset the costs at current test prices. Strategies in which recorded multiplier ewes were selected as replacements for the nucleus flock did modestly increase profitability when compared to a closed nucleus structure. Applying genomic selection is the most beneficial strategy if testing costs can be reduced or by genotyping only a proportion of the selection candidates. When the cost of genotyping was reduced, scenarios that combine performance recording with genomic selection were more profitable and reached breakeven point about 10 years earlier. Economic benefits can be generated in multiplier flocks by implementing performance recording in conjunction with either DNA pedigree recording or genomic technology. 
    These recording practices reduce the long genetic lag between the nucleus and commercial flocks in multi-tiered breeding programs. Under current genotyping costs, the time to breakeven was found to be generally very long, although this varied between strategies. Strategies using either genomic selection or DNA pedigree verification were found to be economically viable provided that, in the long term, the price paid for the tests is lower than current prices.
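
    The breakeven reasoning in this abstract (cumulative discounted benefits set against costs) can be sketched as a running net present value. This is a minimal illustration, not the deterministic simulation model used in the study; the function name, the convention that year 1 is discounted once, and any cash flows passed in are assumptions.

```python
def breakeven_year(annual_costs, annual_benefits, discount_rate):
    """Return the first year in which cumulative discounted net benefit
    is non-negative, or None if that never happens in the horizon.

    annual_costs / annual_benefits: per-year amounts, year 1 first.
    discount_rate: e.g. 0.05 for 5% per year (hypothetical value).
    """
    cumulative = 0.0
    flows = zip(annual_costs, annual_benefits)
    for year, (cost, benefit) in enumerate(flows, start=1):
        # Discount this year's net cash flow back to the present.
        cumulative += (benefit - cost) / (1.0 + discount_rate) ** year
        if cumulative >= 0.0:
            return year
    return None
```

    Lowering the per-test cost shrinks the early negative flows, which moves the breakeven year forward; that is the mechanism behind the earlier breakeven the abstract reports for cheaper genotyping.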

  20. Looking for Trouble: Preventive Genomic Sequencing in the General Population and the Role of Patient Choice

    PubMed Central

    Lázaro-Muñoz, Gabriel; Conley, John M.; Davis, Arlene M.; Van Riper, Marcia; Walker, Rebecca L.; Juengst, Eric T.

    2015-01-01

    Advances in genomics have led to calls for developing population-based preventive genomic sequencing (PGS) programs with the goal of identifying genetic health risks in adults without known risk factors. One critical issue for minimizing the harms and maximizing the benefits of PGS is determining the kind and degree of control individuals should have over the generation, use, and handling of their genomic information. In this article we examine whether PGS programs should offer individuals the opportunity to selectively opt-out of the sequencing or analysis of specific genomic conditions (the menu approach) or whether PGS should be implemented using an all-or-nothing panel approach. We conclude that any responsible scale up of PGS will require a menu approach that may seem impractical to some, but which draws its justification from a rich mix of normative, legal, and practical considerations. PMID:26147254

  1. ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

    PubMed

    Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

    2014-02-01

    Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.

  2. Evaluation of methods and marker Systems in Genomic Selection of oil palm (Elaeis guineensis Jacq.).

    PubMed

    Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Yeoh, Suat Hui; Appleton, David Ross; Harikrishna, Jennifer Ann

    2017-12-11

    Genomic selection (GS) uses genome-wide markers as an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family derived from a Deli dura x Nigerian dura (Deli x Nigerian) with 112 individuals. This family is an important breeding source for developing new mother palms for superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy of the traits. The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. The accuracies using 135 SSRs were low, with accuracies of the traits around 0.20. The average accuracy of machine learning methods was 0.24, as compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were both M/F and O/P (0.18). By using whole genomic SNPs, the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30) were improved. The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods. Due to high genomic resolution, the use of whole-genome SNPs improved the efficiency of GS dramatically for oil palm and is recommended for dura breeding programs. 
    Machine learning slightly outperformed other methods, but required parameter optimization for GS implementation.
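
    Several of the methods compared here (RR-BLUP, Ridge Regression) estimate all marker effects jointly with a shrinkage penalty, and accuracy is commonly reported as the correlation between predicted and observed values. The sketch below shows plain ridge regression on marker genotypes in pure Python. It is a generic illustration, not the study's pipeline: the function names, the -1/0/1 genotype coding, and the assumption that genotype columns are centered are all hypothetical choices.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam):
    """Estimate marker effects as (X'X + lam*I)^-1 X'(y - mean(y)).

    genotypes: rows are individuals, columns are centered marker codes.
    lam: shrinkage parameter, must be > 0 (larger shrinks effects more).
    """
    p = len(genotypes[0])
    mean_y = sum(phenotypes) / len(phenotypes)
    xtx = [[sum(row[j] * row[k] for row in genotypes) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * (y - mean_y) for row, y in zip(genotypes, phenotypes))
           for j in range(p)]
    return mean_y, solve_linear(xtx, xty)


def predict(genotypes, intercept, effects):
    """Predicted value = intercept + sum of marker code times effect."""
    return [intercept + sum(g * b for g, b in zip(row, effects))
            for row in genotypes]


def pearson(x, y):
    """Pearson correlation, the usual measure of prediction accuracy."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)
```

    In practice the correlation would be computed on individuals held out from training; evaluating on the training set, as a quick check might, overstates accuracy.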

  3. New tool to assemble repetitive regions using next-generation sequencing data

    NASA Astrophysics Data System (ADS)

    Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz

    2017-08-01

    Next-generation sequencing techniques produce a large amount of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for the existing genome assemblers. We propose a modification of the algorithm for DNA assembly, which uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. For the new implementation, a three-layer software architecture was selected, in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, a demo application with a web interface, and the additional data are available at the project web page: http://dnaasm.sourceforge.net.

  4. Recovery and characterization of a Citrus clementina Hort. ex Tan. 'Clemenules' haploid plant selected to establish the reference whole Citrus genome sequence.

    PubMed

    Aleza, Pablo; Juárez, José; Hernández, María; Pina, José A; Ollitrault, Patrick; Navarro, Luis

    2009-08-22

    In recent years, the development of structural genomics has generated a growing interest in obtaining haploid plants. The use of homozygous lines presents a significant advantage for the accomplishment of sequencing projects. Commercial citrus species are characterized by high heterozygosity, making it difficult to assemble large genome sequences. Thus, the International Citrus Genomic Consortium (ICGC) decided to establish a reference whole citrus genome sequence from a homozygous plant. Due to the existence of important molecular resources and previous success in obtaining haploid clementine plants, haploid clementine was selected as the target for the implementation of the reference whole genome citrus sequence. To obtain haploid clementine lines we used the technique of in situ gynogenesis induced by irradiated pollen. Flow cytometry, chromosome counts and SSR marker (Simple Sequence Repeats) analysis facilitated the identification of six different haploid lines (2n = x = 9), one aneuploid line (2n = 2x+4 = 22) and one doubled haploid plant (2n = 2x = 18) of 'Clemenules' clementine. One of the haploids, obtained directly from an original haploid embryo, grew vigorously and produced flowers after four years. This is the first haploid plant of clementine that has bloomed and we have, for the first time, characterized the histology of haploid and diploid flowers of clementine. Additionally a double haploid plant was obtained spontaneously from this haploid line. The first haploid plant of 'Clemenules' clementine produced directly by germination of a haploid embryo, which grew vigorously and produced flowers, has been obtained in this work. This haploid line has been selected and it is being used by the ICGC to establish the reference sequence of the nuclear genome of citrus.

  5. MicroScope: a platform for microbial genome annotation and comparative genomics

    PubMed Central

    Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C.; Médigue, C.

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope’s rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). 
    The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of microbial genome annotation, especially for genomes initially analyzed by automatic procedures alone. Database URLs: http://www.genoscope.cns.fr/agc/mage and http://www.genoscope.cns.fr/agc/microcyc PMID:20157493

  6. MicroScope: a platform for microbial genome annotation and comparative genomics.

    PubMed

    Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). 
    The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of microbial genome annotation, especially for genomes initially analyzed by automatic procedures alone. Database URLs: http://www.genoscope.cns.fr/agc/mage and http://www.genoscope.cns.fr/agc/microcyc.

  7. MaGnET: Malaria Genome Exploration Tool

    PubMed Central

    Sharman, Joanna L.; Gerloff, Dietlind L.

    2013-01-01

    Summary: The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive ‘exploration-style’ visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein–protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Availability and Implementation: Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org Contact: joanna.sharman@ed.ac.uk or dgerloff@ffame.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23894142

  8. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

    PubMed Central

    2013-01-01

    Background Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for even the most skilful clinician to come up with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results In the first stage of this research, five feature selection methods were proposed and tested on the oral cancer prognosis dataset. In the second stage, the models with the features selected by each feature selection method were tested on the proposed classifiers. Four types of classifiers were chosen, namely ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation was implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate their potential to become a significant prognostic signature in oral cancer studies. PMID:23725313
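
    The k-fold cross-validation mentioned in this abstract can be sketched generically: the samples are split into k folds, each fold is held out once, and accuracy is pooled over all held-out predictions. This is an illustration only, not the study's code; the function names and the fit/predict callback convention are hypothetical, and the value of k is left to the caller because the abstract does not state it.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold CV.

    Indices are shuffled once with a fixed seed so runs are repeatable;
    every sample lands in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Pooled accuracy over all held-out samples.

    fit(train_features, train_labels) returns a model object;
    predict(model, feature_row) returns a predicted label.
    """
    correct = 0
    for train, test in k_fold_splits(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, features[i]) == labels[i] for i in test)
    return correct / len(labels)
```

    With small samples, any feature selection step should be run inside `fit` on the training folds only; selecting features on the full dataset first leaks information into the held-out folds and inflates the estimate.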

  9. Goat domestication and breeding: a jigsaw of historical, biological and molecular data with missing pieces.

    PubMed

    Amills, M; Capote, J; Tosser-Klopp, G

    2017-12-01

    Domestic goats (Capra hircus) are spread across the five continents with a census of 1 billion individuals. The worldwide population of goats descends from a limited number of bezoars (Capra aegagrus) domesticated 10 000 YBP (years before the present) in the Fertile Crescent. The extraordinary adaptability and hardiness of goats favoured their rapid spread over the Old World, reaching the Iberian Peninsula and Southern Africa 7000 YBP and 2000 YBP respectively. Molecular studies have revealed one major mitochondrial haplogroup A and five less frequent haplogroups B, C, D, F and G. Moreover, the analysis of autosomal and Y-chromosome markers has evidenced an appreciable geographic differentiation. The implementation of new molecular technologies, such as whole-genome sequencing and genome-wide genotyping, allows for the exploration of caprine diversity at an unprecedented scale, thus providing new insights into the evolutionary history of goats. In spite of a number of pitfalls, the characterization of the functional elements of the goat genome is expected to play a key role in understanding the genetic determination of economically relevant traits. Genomic selection and genome editing also hold great potential, particularly for improving traits that cannot be modified easily by traditional selection. © 2017 Stichting International Foundation for Animal Genetics.

  10. Toward a Better Compression for DNA Sequences Using Huffman Encoding

    PubMed Central

    Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-01-01

    Abstract Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they take less space and transmit faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016). PMID:27960065
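
    For orientation, the baseline this abstract compares against, conventional Huffman coding, can be sketched over single nucleotides as follows. This is the standard algorithm only; it does not implement the repeat selection or the multiple-tree construction the authors describe, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(sequence):
    """Build a prefix code (symbol -> bit string) from symbol counts."""
    counts = Counter(sequence)
    if not counts:
        raise ValueError("sequence is empty")
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, partial code table).
    # The unique tie_breaker keeps tuples comparable when counts are equal.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(sequence, code):
    """Concatenate the code words for each symbol."""
    return "".join(code[s] for s in sequence)


def decode(bits, code):
    """Invert encode(); works because no code word prefixes another."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

    With four nucleotides at roughly equal frequency, this per-base code cannot beat a fixed 2 bits per base, which helps explain why the authors encode frequent repeats as symbols to force a skewed tree.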

  11. Toward a Better Compression for DNA Sequences Using Huffman Encoding.

    PubMed

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-04-01

    Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they take less space and transmit faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016).

  12. Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.

    PubMed

    Zhang, Xiaoshuai; Xue, Fuzhong; Liu, Hong; Zhu, Dianwen; Peng, Bin; Wiemels, Joseph L; Yang, Xiaowei

    2014-12-10

    Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this "missing heritability" problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Simulation studies indicated that the iBVS method was advantageous in its performance with highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO based strategies. In an analysis of a leprosy case-control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method had comparable performance with that of LASSO, but better than Stepwise strategies. The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage in that it produces more interpretable posterior probabilities for each variable unlike LASSO and other penalized regression methods.

  13. Optimal marker-assisted selection to increase the effective size of small populations.

    PubMed

    Wang, J

    2001-02-01

    An approach to the optimal utilization of marker and pedigree information in minimizing the rates of inbreeding and genetic drift at the average locus of the genome (not just the marked loci) in a small diploid population is proposed, and its efficiency is investigated by stochastic simulations. The approach is based on estimating the expected pedigree of each chromosome by using marker and individual pedigree information and minimizing the average coancestry of selected chromosomes by quadratic integer programming. It is shown that the approach is much more effective and much less computer demanding in implementation than previous ones. For pigs with 10 offspring per mother genotyped for two markers (each with four alleles at equal initial frequency) per chromosome of 100 cM, the approach can increase the average effective size for the whole genome by approximately 40 and 55% if mating ratios (the number of females mated with a male) are 3 and 12, respectively, compared with the corresponding values obtained by optimizing between-family selection using pedigree information only. The efficiency of the marker-assisted selection method increases with increasing amount of marker information (number of markers per chromosome, heterozygosity per marker) and family size, but decreases with increasing genome size. For less prolific species, the approach is still effective if the mating ratio is large so that a high marker-assisted selection pressure on the rarer sex can be maintained.

  14. The current state of funded NIH grants in implementation science in genomic medicine: a portfolio analysis.

    PubMed

    Roberts, Megan C; Clyne, Mindy; Kennedy, Amy E; Chambers, David A; Khoury, Muin J

    2017-10-26

    Purpose: Implementation science offers methods to evaluate the translation of genomic medicine research into practice. The extent to which the National Institutes of Health (NIH) human genomics grant portfolio includes implementation science is unknown. This brief report's objective is to describe recently funded implementation science studies in genomic medicine in the NIH grant portfolio, and identify remaining gaps. Methods: We identified investigator-initiated NIH research grants on implementation science in genomic medicine (funding initiated 2012-2016). A codebook was adapted from the literature, three authors coded grants, and descriptive statistics were calculated for each code. Results: Forty-two grants fit the inclusion criteria (~1.75% of investigator-initiated genomics grants). The majority of included grants proposed qualitative and/or quantitative methods with cross-sectional study designs, and described clinical settings and primarily white, non-Hispanic study populations. Most grants were in oncology and examined genetic testing for risk assessment. Finally, grants lacked the use of implementation science frameworks, and most examined uptake of genomic medicine and/or assessed patient-centeredness. Conclusion: We identified large gaps in implementation science studies in genomic medicine in the funded NIH portfolio over the past 5 years. To move the genomics field forward, investigator-initiated research grants should employ rigorous implementation science methods within diverse settings and populations. Genetics in Medicine advance online publication, 26 October 2017; doi:10.1038/gim.2017.180.

  15. Trends in genome-wide and region-specific genetic diversity in the Dutch-Flemish Holstein-Friesian breeding program from 1986 to 2015.

    PubMed

    Doekes, Harmen P; Veerkamp, Roel F; Bijma, Piter; Hiemstra, Sipke J; Windig, Jack J

    2018-04-11

    In recent decades, Holstein-Friesian (HF) selection schemes have undergone profound changes, including the introduction of optimal contribution selection (OCS; around 2000), a major shift in breeding goal composition (around 2000) and the implementation of genomic selection (GS; around 2010). These changes are expected to have influenced genetic diversity trends. Our aim was to evaluate genome-wide and region-specific diversity in HF artificial insemination (AI) bulls in the Dutch-Flemish breeding program from 1986 to 2015. Pedigree and genotype data (~75.5 k) of 6280 AI-bulls were used to estimate rates of genome-wide inbreeding and kinship and corresponding effective population sizes. Region-specific inbreeding trends were evaluated using regions of homozygosity (ROH). Changes in observed allele frequencies were compared to those expected under pure drift to identify putative regions under selection. We also investigated the direction of changes in allele frequency over time. Effective population size estimates for the 1986-2015 period ranged from 69 to 102. Two major breakpoints were observed in genome-wide inbreeding and kinship trends. Around 2000, inbreeding and kinship levels temporarily dropped. From 2010 onwards, they steeply increased, with pedigree-based, ROH-based and marker-based inbreeding rates as high as 1.8, 2.1 and 2.8% per generation, respectively. Accumulation of inbreeding varied substantially across the genome. A considerable fraction of markers showed changes in allele frequency that were greater than expected under pure drift. Putative selected regions harboured many quantitative trait loci (QTL) associated with a wide range of traits. In consecutive 5-year periods, allele frequencies changed more often in the same direction than in opposite directions, except when comparing the 1996-2000 and 2001-2005 periods. Genome-wide and region-specific diversity trends reflect major changes in the Dutch-Flemish HF breeding program. Introduction of OCS and the shift in breeding goal were followed by a drop in inbreeding and kinship and a shift in the direction of changes in allele frequency. After introduction of GS, rates of inbreeding and kinship increased substantially while allele frequencies continued to change in the same direction as before GS. These results provide insight into the effect of breeding practices on genomic diversity and emphasize the need for efficient management of genetic diversity in GS schemes.
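
    The link between a rate of inbreeding and an effective population size can be illustrated with the standard relation ΔF = 1/(2Ne). The sketch below is not the authors' calculation: it simply applies that textbook formula to the three post-2010 rates quoted in the abstract, and the function name is a hypothetical helper. The reported range of 69 to 102 refers to the whole 1986-2015 period, so the values printed here are only an indication of what the recent rates would imply if they persisted.

```python
def effective_population_size(delta_f: float) -> float:
    """Return Ne implied by a rate of inbreeding per generation: Ne = 1 / (2 * delta_F)."""
    if delta_f <= 0:
        raise ValueError("delta_f must be positive")
    return 1.0 / (2.0 * delta_f)


# Post-2010 rates from the abstract, converted from % to fractions per generation.
rates = {"pedigree-based": 0.018, "ROH-based": 0.021, "marker-based": 0.028}

for label, delta_f in rates.items():
    ne = effective_population_size(delta_f)
    print(f"{label}: delta_F = {delta_f:.3f} -> implied Ne ~ {ne:.1f}")
```

    Under this relation a higher rate of inbreeding always maps to a smaller implied Ne, which is why the marker-based rate gives the lowest value of the three.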

  16. Acceleration of genetic gain in cattle by reduction of generation interval.

    PubMed

    Kasinathan, Poothappillai; Wei, Hong; Xiang, Tianhao; Molina, Jose A; Metzger, John; Broek, Diane; Kasinathan, Sivakanthan; Faber, David C; Allan, Mark F

    2015-03-02

    Genomic selection (GS) approaches, in combination with reproductive technologies, are revolutionizing the design and implementation of breeding programs in livestock species, particularly in cattle. GS leverages genomic readouts to provide estimates of breeding value early in the life of animals. However, the capacity of these approaches for improving genetic gain in breeding programs is limited by generation interval, the average age of an animal when replacement progeny are born. Here, we present a cost-effective approach that combines GS with reproductive technologies to reduce generation interval by rapidly producing high genetic merit calves.
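
    Why generation interval limits genetic gain can be seen from the textbook breeder's equation, in which annual gain equals selection intensity times accuracy times additive genetic standard deviation, divided by the generation interval. The sketch below is an illustration under that assumption only; the function name and every numeric input are hypothetical and are not taken from the study.

```python
def annual_genetic_gain(intensity: float, accuracy: float,
                        sigma_a: float, generation_interval: float) -> float:
    """Breeder's equation per year: (i * r * sigma_A) / L."""
    if generation_interval <= 0:
        raise ValueError("generation_interval must be positive")
    return intensity * accuracy * sigma_a / generation_interval


# Hypothetical inputs: same selection intensity, accuracy and genetic SD,
# compared at a long and a shortened generation interval (in years).
conventional = annual_genetic_gain(1.4, 0.7, 1.0, generation_interval=5.0)
shortened = annual_genetic_gain(1.4, 0.7, 1.0, generation_interval=2.5)

print(f"gain per year at L = 5.0: {conventional:.3f} genetic SD")
print(f"gain per year at L = 2.5: {shortened:.3f} genetic SD")
```

    With everything else held constant, halving the generation interval doubles the expected gain per year, which is the lever the abstract describes.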

  17. Efficient Breeding by Genomic Mating.

    PubMed

    Akdemir, Deniz; Sánchez, Julio I

    2016-01-01

    Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we propose an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem, and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that the proposed approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.

  18. Accuracy of Genomic Prediction in a Commercial Perennial Ryegrass Breeding Program.

    PubMed

    Fè, Dario; Ashraf, Bilal H; Pedersen, Morten G; Janss, Luc; Byrne, Stephen; Roulund, Niels; Lenk, Ingo; Didion, Thomas; Asp, Torben; Jensen, Christian S; Jensen, Just

    2016-11-01

    The implementation of genomic selection (GS) in plant breeding, so far, has been mainly evaluated in crops farmed as homogeneous varieties, and the results have been generally positive. Fewer results are available for species, such as forage grasses, that are grown as heterogeneous families (developed from multiparent crosses) in which the control of the genetic variation is far more complex. Here we test the potential for implementing GS in the breeding of perennial ryegrass (Lolium perenne L.) using empirical data from a commercial forage breeding program. Biparental F2 and multiparental synthetic (SYN) families of diploid perennial ryegrass were genotyped using genotyping-by-sequencing, and phenotypes for five different traits were analyzed. Genotypes were expressed as family allele frequencies, and phenotypes were recorded as family means. Different models for genomic prediction were compared by using practically relevant cross-validation strategies. All traits showed a highly significant level of genetic variance, which could be traced using the genotyping assay. While there was significant genotype × environment (G × E) interaction for some traits, accuracies were high among F2 families and between biparental F2 and multiparental SYN families. We have demonstrated that the implementation of GS in grass breeding is now possible and presents an opportunity to make significant gains for various traits. Copyright © 2016 Crop Science Society of America.

  19. UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs

    PubMed Central

    Mignone, Flavio; Grillo, Giorgio; Licciulli, Flavio; Iacono, Michele; Liuni, Sabino; Kersey, Paul J.; Duarte, Jorge; Saccone, Cecilia; Pesole, Graziano

    2005-01-01

    The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/. PMID:15608165

  20. Development of plant condition measurement - The Jimah Model

    NASA Astrophysics Data System (ADS)

    Evans, Roy F.; Syuhaimi, Mohd; Mazli, Mohammad; Kamarudin, Nurliyana; Maniza Othman, Faiz

    2012-05-01

    The Jimah Model is an information management model. The model has been designed to facilitate analysis of machine condition by integrating diagnostic data with quantitative and qualitative information. The model treats data as a single strand of information - metaphorically a 'genome' of data. The 'Genome' is structured to be representative of plant function and identifies the condition of selected components (or genes) in each machine. To date in industry, computer aided work processes used with traditional industrial practices, have been unable to consistently deliver a standard of information suitable for holistic evaluation of machine condition and change. Significantly the reengineered site strategies necessary for implementation of this "data genome concept" have resulted in enhanced knowledge and management of plant condition. In large plant with high initial equipment cost and subsequent high maintenance costs, accurate measurement of major component condition becomes central to whole of life management and replacement decisions. A case study following implementation of the model at a major power station site in Malaysia (Jimah) shows that modeling of plant condition and wear (in real time) can be made a practical reality.

  1. Physician Response to Implementation of Genotype-Tailored Antiplatelet Therapy

    PubMed Central

    Peterson, Josh F.; Field, Julie R.; Unertl, Kim; Schildcrout, Jonathan S.; Johnson, Daniel C.; Shi, Yaping; Danciu, Ioana; Cleator, John H.; Pulley, Jill M.; McPherson, John A.; Denny, Josh C.; Laposata, Michael; Roden, Dan M.; Johnson, Kevin B.

    2016-01-01

    Physician responses to genomic information are vital to the success of precision medicine initiatives. We prospectively studied a pharmacogenomics implementation program for the propensity of clinicians to select antiplatelet therapy based on CYP2C19 loss-of-function (LOF) variants in stented patients. Among 2,676 patients, 514 (19.2%) were found to have a CYP2C19 variant affecting clopidogrel metabolism. For the majority (93.6%) of the cohort, cardiologists received active and direct notification of CYP2C19 status. Over 12 months, 57.6% of poor metabolizers and 33.2% of intermediate metabolizers received alternatives to clopidogrel. CYP2C19 variant status was the most influential factor impacting the prescribing decision [HR in poor metabolizers 8.1, 95% CI (5.4,12.2) and HR 5.0, 95% CI (4.0,6.3) in intermediate metabolizers], followed by patient age and type of stent implanted. We conclude that cardiologists tailored antiplatelet therapy for a minority of patients with a CYP2C19 variant and considered both genomic and non-genomic risks in their clinical decision-making. PMID:26693963

  2. The African Genome Variation Project shapes medical genetics in Africa

    PubMed Central

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2014-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterisation of African genetic diversity is needed. The African Genome Variation Project (AGVP) provides a resource to help design, implement and interpret genomic studies in sub-Saharan Africa (SSA) and worldwide. The AGVP represents dense genotypes from 1,481 and whole genome sequences (WGS) from 320 individuals across SSA. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across SSA. We identify new loci under selection, including for malaria and hypertension. We show that modern imputation panels can identify association signals at highly differentiated loci across populations in SSA. Using WGS, we show further improvement in imputation accuracy supporting efforts for large-scale sequencing of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa, showing for the first time that such designs are feasible. PMID:25470054

  3. Structural-Functional Organization of the Eukaryotic Cell Nucleus and Transcription Regulation: Introduction to This Special Issue of Biochemistry (Moscow).

    PubMed

    Razin, S V

    2018-04-01

    This issue of Biochemistry (Moscow) is devoted to the cell nucleus and mechanisms of transcription regulation. Over the years, biochemical processes in the cell nucleus have been studied in isolation, outside the context of their spatial organization. Now it is clear that segregation of functional processes within a compartmentalized cell nucleus is very important for the implementation of basic genetic processes. The functional compartmentalization of the cell nucleus is closely related to the spatial organization of the genome, which in turn plays a key role in the operation of epigenetic mechanisms. In this issue of Biochemistry (Moscow), we present a selection of review articles covering the functional architecture of the eukaryotic cell nucleus, the mechanisms of genome folding, the role of stochastic processes in establishing 3D architecture of the genome, and the impact of genome spatial organization on transcription regulation.

  4. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    PubMed Central

    2012-01-01

    Background: Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results: We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions: Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170
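
    The idea of wrapper-based greedy forward selection with regularized least squares (RLS) can be sketched as follows. This is a deliberately naive illustration, assuming NumPy is available: it refits a ridge model for every candidate feature and scores it by exact leave-one-out error, so it does not reproduce the linear-time short-cuts described in the abstract and does not use the RLScore API. The function names, the regularization value and the synthetic data are all hypothetical.

```python
import numpy as np


def loo_mse(X: np.ndarray, y: np.ndarray, lam: float) -> float:
    """Exact leave-one-out mean squared error of ridge regression (no intercept)."""
    k = X.shape[1]
    # Hat matrix H = X (X^T X + lam * I)^-1 X^T
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(k), X.T)
    residuals = y - H @ y
    # Standard identity: LOO residual_i = residual_i / (1 - H_ii)
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))


def greedy_forward_selection(X: np.ndarray, y: np.ndarray,
                             n_select: int, lam: float = 1.0) -> list:
    """Add, one at a time, the feature that most reduces leave-one-out error."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_select):
        scores = {j: loo_mse(X[:, selected + [j]], y, lam) for j in remaining}
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data: only columns 2 and 5 carry signal (hypothetical example).
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 5] + 0.01 * rng.standard_normal(60)

chosen = greedy_forward_selection(X, y, n_select=2)
print("selected feature indices:", chosen)
```

    Each round here costs one ridge fit per remaining feature, which is what makes the naive wrapper impractical for hundreds of thousands of variants and motivates the faster updates the abstract reports.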

  5. A divergent Artiodactyl MYADM-like repeat is associated with erythrocyte traits and weight of lamb weaned in domestic sheep.

    PubMed

    Gonzalez, Michael V; Mousel, Michelle R; Herndon, David R; Jiang, Yu; Dalrymple, Brian P; Reynolds, James O; Johnson, Wendell C; Herrmann-Hoesing, Lynn M; White, Stephen N

    2013-01-01

    A genome-wide association study (GWAS) was performed to investigate seven red blood cell (RBC) phenotypes in over 500 domestic sheep (Ovis aries) from three breeds (Columbia, Polypay, and Rambouillet). A single nucleotide polymorphism (SNP) showed genome-wide significant association with increased mean corpuscular hemoglobin concentration (MCHC, P = 6.2×10^-14) and genome-wide suggestive association with decreased mean corpuscular volume (MCV, P = 2.5×10^-6). The ovine HapMap project found the same genomic region and the same peak SNP has been under extreme historical selective pressure, demonstrating the importance of this region for survival, reproduction, and/or artificially selected traits. We observed a large (>50 kb) variant haplotype sequence containing a full-length divergent artiodactyl MYADM-like repeat in strong linkage disequilibrium with the associated SNP. MYADM gene family members play roles in membrane organization and formation in myeloid cells. However, to our knowledge, no member of the MYADM gene family has been identified in development of morphologically variant RBCs. The specific RBC differences may be indicative of alterations in morphology. Additionally, erythrocytes with altered morphological structure often exhibit increased structural fragility, leading to increased RBC turnover and energy expenditure. The divergent artiodactyl MYADM-like repeat was also associated with increased ewe lifetime kilograms of lamb weaned (P = 2×10^-4). This suggests selection for normal RBCs might increase lamb weights, although further validation is required before implementation in marker-assisted selection. These results provide clues to explain the strong selection on the artiodactyl MYADM-like repeat locus in sheep, and suggest MYADM family members may be important for RBC morphology in other mammals.

  6. A Divergent Artiodactyl MYADM-like Repeat Is Associated with Erythrocyte Traits and Weight of Lamb Weaned in Domestic Sheep

    PubMed Central

    Gonzalez, Michael V.; Mousel, Michelle R.; Herndon, David R.; Jiang, Yu; Dalrymple, Brian P.; Reynolds, James O.; Johnson, Wendell C.; Herrmann-Hoesing, Lynn M.; White, Stephen N.

    2013-01-01

    A genome-wide association study (GWAS) was performed to investigate seven red blood cell (RBC) phenotypes in over 500 domestic sheep (Ovis aries) from three breeds (Columbia, Polypay, and Rambouillet). A single nucleotide polymorphism (SNP) showed genome-wide significant association with increased mean corpuscular hemoglobin concentration (MCHC, P = 6.2×10^−14) and genome-wide suggestive association with decreased mean corpuscular volume (MCV, P = 2.5×10^−6). The ovine HapMap project found the same genomic region and the same peak SNP has been under extreme historical selective pressure, demonstrating the importance of this region for survival, reproduction, and/or artificially selected traits. We observed a large (>50 kb) variant haplotype sequence containing a full-length divergent artiodactyl MYADM-like repeat in strong linkage disequilibrium with the associated SNP. MYADM gene family members play roles in membrane organization and formation in myeloid cells. However, to our knowledge, no member of the MYADM gene family has been identified in development of morphologically variant RBCs. The specific RBC differences may be indicative of alterations in morphology. Additionally, erythrocytes with altered morphological structure often exhibit increased structural fragility, leading to increased RBC turnover and energy expenditure. The divergent artiodactyl MYADM-like repeat was also associated with increased ewe lifetime kilograms of lamb weaned (P = 2×10^−4). This suggests selection for normal RBCs might increase lamb weights, although further validation is required before implementation in marker-assisted selection. These results provide clues to explain the strong selection on the artiodactyl MYADM-like repeat locus in sheep, and suggest MYADM family members may be important for RBC morphology in other mammals. PMID:24023702

  7. Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus)

    PubMed Central

    Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

    2016-01-01

    Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance and widely distributed in northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) for each population was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai, respectively. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus. PMID:27336696
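
    The FST values reported above measure how much of the total heterozygosity is due to allele frequency differences between populations. As a rough illustration, the sketch below computes the textbook Wright's FST, (H_T - H_S) / H_T, for a single biallelic SNP in two equally weighted populations. It is not the estimator used in the study, nor the outlier procedure of BAYESCAN or LOSITAN, and the function name and allele frequencies are hypothetical.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus, two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)           # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                                  # monomorphic locus: no differentiation
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies: a weakly and a strongly differentiated SNP.
weak = fst_biallelic(0.6, 0.4)
strong = fst_biallelic(0.9, 0.1)

print(f"weakly differentiated SNP:   F_ST = {weak:.2f}")
print(f"strongly differentiated SNP: F_ST = {strong:.2f}")
```

    A SNP whose FST lies far above the genome-wide background, like the second one here, is the kind of locus an outlier test would flag as putatively under divergent selection.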

  8. Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus).

    PubMed

    Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

    2016-01-01

    Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance and widely distributed in northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) for each population was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai, respectively. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus.

  9. Developing a Common Framework for Evaluating the Implementation of Genomic Medicine Interventions in Clinical Care: The IGNITE Network’s Common Measures Working Group

    PubMed Central

    Orlando, Lori A.; Sperber, Nina R.; Voils, Corrine; Nichols, Marshall; Myers, Rachel A.; Wu, R. Ryanne; Rakhra-Burris, Tejinder; Levy, Kenneth D.; Levy, Mia; Pollin, Toni I.; Guan, Yue; Horowitz, Carol R.; Ramos, Michelle; Kimmel, Stephen E.; McDonough, Caitrin W.; Madden, Ebony B.; Damschroder, Laura J.

    2017-01-01

    Purpose: Implementation research provides a structure for evaluating the clinical integration of genomic medicine interventions. This paper describes the Implementing GeNomics In PracTicE (IGNITE) Network’s efforts to promote: 1) a broader understanding of genomic medicine implementation research; and 2) the sharing of knowledge generated in the network. Methods: To facilitate this goal the IGNITE Network Common Measures Working Group (CMG) members adopted the Consolidated Framework for Implementation Research (CFIR) to guide their approach to: identifying constructs and measures relevant to evaluating genomic medicine as a whole, standardizing data collection across projects, and combining data in a centralized resource for cross network analyses. Results: CMG identified ten high-priority CFIR constructs as important for genomic medicine. Of those, eight didn’t have standardized measurement instruments. Therefore, we developed four survey tools to address this gap. In addition, we identified seven high-priority constructs related to patients, families, and communities that did not map to CFIR constructs. Both sets of constructs were combined to create a draft genomic medicine implementation model. Conclusion: We developed processes to identify constructs deemed valuable for genomic medicine implementation and codified them in a model. These resources are freely available to facilitate knowledge generation and sharing across the field. PMID:28914267

  10. Developing a common framework for evaluating the implementation of genomic medicine interventions in clinical care: the IGNITE Network's Common Measures Working Group.

    PubMed

    Orlando, Lori A; Sperber, Nina R; Voils, Corrine; Nichols, Marshall; Myers, Rachel A; Wu, R Ryanne; Rakhra-Burris, Tejinder; Levy, Kenneth D; Levy, Mia; Pollin, Toni I; Guan, Yue; Horowitz, Carol R; Ramos, Michelle; Kimmel, Stephen E; McDonough, Caitrin W; Madden, Ebony B; Damschroder, Laura J

    2018-06-01

    Purpose: Implementation research provides a structure for evaluating the clinical integration of genomic medicine interventions. This paper describes the Implementing Genomics in Practice (IGNITE) Network's efforts to promote (i) a broader understanding of genomic medicine implementation research and (ii) the sharing of knowledge generated in the network. Methods: To facilitate this goal, the IGNITE Network Common Measures Working Group (CMG) members adopted the Consolidated Framework for Implementation Research (CFIR) to guide its approach to identifying constructs and measures relevant to evaluating genomic medicine as a whole, standardizing data collection across projects, and combining data in a centralized resource for cross-network analyses. Results: CMG identified 10 high-priority CFIR constructs as important for genomic medicine. Of those, eight did not have standardized measurement instruments. Therefore, we developed four survey tools to address this gap. In addition, we identified seven high-priority constructs related to patients, families, and communities that did not map to CFIR constructs. Both sets of constructs were combined to create a draft genomic medicine implementation model. Conclusion: We developed processes to identify constructs deemed valuable for genomic medicine implementation and codified them in a model. These resources are freely available to facilitate knowledge generation and sharing across the field.

  11. Challenges and strategies for implementing genomic services in diverse settings: experiences from the Implementing GeNomics In pracTicE (IGNITE) network.

    PubMed

    Sperber, Nina R; Carpenter, Janet S; Cavallari, Larisa H; Damschroder, Laura J; Cooper-DeHoff, Rhonda M; Denny, Joshua C; Ginsburg, Geoffrey S; Guan, Yue; Horowitz, Carol R; Levy, Kenneth D; Levy, Mia A; Madden, Ebony B; Matheny, Michael E; Pollin, Toni I; Pratt, Victoria M; Rosenman, Marc; Voils, Corrine I; Weitzel, Kristen W; Wilke, Russell A; Wu, R Ryanne; Orlando, Lori A

    2017-05-22

    To realize potential public health benefits from genetic and genomic innovations, understanding how best to implement the innovations into clinical care is important. The objective of this study was to synthesize data on challenges identified by six diverse projects that are part of a National Human Genome Research Institute (NHGRI)-funded network focused on implementing genomics into practice and strategies to overcome these challenges. We used a multiple-case study approach with each project considered as a case and qualitative methods to elicit and describe themes related to implementation challenges and strategies. We describe challenges and strategies in an implementation framework and typology to enable consistent definitions and cross-case comparisons. Strategies were linked to challenges based on expert review and shared themes. Three challenges were identified by all six projects, and strategies to address these challenges varied across the projects. One common challenge was to increase the relative priority of integrating genomics within the health system electronic health record (EHR). Four projects used data warehousing techniques to accomplish the integration. The second common challenge was to strengthen clinicians' knowledge and beliefs about genomic medicine. To overcome this challenge, all projects developed educational materials and conducted meetings and outreach focused on genomic education for clinicians. The third challenge was engaging patients in the genomic medicine projects. Strategies to overcome this challenge included use of mass media to spread the word, actively involving patients in implementation (e.g., a patient advisory board), and preparing patients to be active participants in their healthcare decisions. This is the first collaborative evaluation focusing on the description of genomic medicine innovations implemented in multiple real-world clinical settings. Findings suggest that strategies to facilitate integration of genomic data within existing EHRs and educate stakeholders about the value of genomic services are considered important for effective implementation. Future work could build on these findings to evaluate which strategies are optimal under what conditions. This information will be useful for guiding translation of discoveries to clinical care, which, in turn, can provide data to inform continual improvement of genomic innovations and their applications.

  12. The African Genome Variation Project shapes medical genetics in Africa

    NASA Astrophysics Data System (ADS)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2015-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  13. The African Genome Variation Project shapes medical genetics in Africa.

    PubMed

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O; Choudhury, Ananyo; Ritchie, Graham R S; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N; Young, Elizabeth H; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S

    2015-01-15

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  14. Genetic counselors’ (GC) knowledge, awareness, and understanding of clinical next-generation sequencing (NGS) genomic testing

    PubMed Central

    Boland, PM; Ruth, K; Matro, JM; Rainey, KL; Fang, CY; Wong, YN; Daly, MB; Hall, MJ

    2014-01-01

    Genomic tests are increasingly complex, less expensive, and more widely available with the advent of next-generation sequencing (NGS). We assessed knowledge and perceptions among genetic counselors pertaining to NGS genomic testing via an online survey. Associations between selected characteristics and perceptions were examined. Recent education on NGS testing was common, but practical experience limited. Perceived understanding of clinical NGS was modest, specifically concerning tumor testing. Greater perceived understanding of clinical NGS testing correlated with more time spent in cancer-related counseling, exposure to NGS testing, and NGS-focused education. Substantial disagreement about the role of counseling for tumor-based testing was seen. Finally, a majority of counselors agreed with the need for more education about clinical NGS testing, supporting this approach to optimizing implementation. PMID:25523111

  15. Multi-allelic haplotype model based on genetic partition for genomic prediction and variance component estimation using SNP markers.

    PubMed

    Da, Yang

    2015-12-18

    The amount of functional genomic information has been growing rapidly but remains largely unused in genomic selection. Genomic prediction and estimation using haplotypes in genome regions with functional elements such as all genes of the genome can be an approach to integrate functional and structural genomic information for genomic selection. Towards this goal, this article develops a new haplotype approach for genomic prediction and estimation. A multi-allelic haplotype model treating each haplotype as an 'allele' was developed for genomic prediction and estimation based on the partition of a multi-allelic genotypic value into additive and dominance values. Each additive value is expressed as a function of h - 1 additive effects, where h = number of alleles or haplotypes, and each dominance value is expressed as a function of h(h - 1)/2 dominance effects. For a sample of q individuals, the limit number of effects is 2q - 1 for additive effects and is the number of heterozygous genotypes for dominance effects. Additive values are factorized as a product between the additive model matrix and the h - 1 additive effects, and dominance values are factorized as a product between the dominance model matrix and the h(h - 1)/2 dominance effects. Genomic additive relationship matrix is defined as a function of the haplotype model matrix for additive effects, and genomic dominance relationship matrix is defined as a function of the haplotype model matrix for dominance effects. Based on these results, a mixed model implementation for genomic prediction and variance component estimation that jointly use haplotypes and single markers is established, including two computing strategies for genomic prediction and variance component estimation with identical results. 
    The multi-allelic genetic partition fills a theoretical gap in genetic partition by providing general formulations for partitioning multi-allelic genotypic values and provides a haplotype method based on the quantitative genetics model towards the utilization of functional and structural genomic information for genomic prediction and estimation.
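    The effect counts stated in this abstract can be checked with a short sketch. The function names below are illustrative and do not come from the paper. The 2q - 1 limit is read here as following from the fact that q diploid individuals carry at most 2q distinct haplotypes, which is an interpretation rather than a statement from the abstract.

```python
from typing import Tuple


def haplotype_effect_counts(h: int) -> Tuple[int, int]:
    """Return (additive, dominance) effect counts for h haplotypes.

    Counts follow the abstract: h - 1 additive effects and
    h(h - 1)/2 dominance effects. The function name is illustrative.
    """
    if h < 1:
        raise ValueError("h must be at least 1")
    return h - 1, h * (h - 1) // 2


def max_additive_effects(q: int) -> int:
    """Upper limit on the number of additive effects for q individuals.

    Assumption: q diploid individuals carry at most 2q distinct
    haplotypes, so h - 1 <= 2q - 1, matching the abstract's limit.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    return 2 * q - 1


# Print the counts for a few haplotype numbers.
for h in (2, 3, 10):
    additive, dominance = haplotype_effect_counts(h)
    print(f"h={h}: additive={additive}, dominance={dominance}")
```

    With h = 2 the counts reduce to one additive and one dominance effect, which is the familiar biallelic single-marker case.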

  16. Accuracies of genomically estimated breeding values from pure-breed and across-breed predictions in Australian beef cattle.

    PubMed

    Boerner, Vinzent; Johnston, David J; Tier, Bruce

    2014-10-24

    The major obstacles for the implementation of genomic selection in Australian beef cattle are the variety of breeds and in general, small numbers of genotyped and phenotyped individuals per breed. The Australian Beef Cooperative Research Center (Beef CRC) investigated these issues by deriving genomic prediction equations (PE) from a training set of animals that covers a range of breeds and crosses including Angus, Murray Grey, Shorthorn, Hereford, Brahman, Belmont Red, Santa Gertrudis and Tropical Composite. This paper presents accuracies of genomically estimated breeding values (GEBV) that were calculated from these PE in the commercial pure-breed beef cattle seed stock sector. PE derived by the Beef CRC from multi-breed and pure-breed training populations were applied to genotyped Angus, Limousin and Brahman sires and young animals, but with no pure-breed Limousin in the training population. The accuracy of the resulting GEBV was assessed by their genetic correlation to their phenotypic target trait in a bi-variate REML approach that models GEBV as trait observations. Accuracies of most GEBV for Angus and Brahman were between 0.1 and 0.4, with accuracies for abattoir carcass traits generally greater than for live animal body composition traits and reproduction traits. Estimated accuracies greater than 0.5 were only observed for Brahman abattoir carcass traits and for Angus carcass rib fat. Averaged across traits within breeds, accuracies of GEBV were highest when PE from the pooled across-breed training population were used. However, for the Angus and Brahman breeds the difference in accuracy from using pure-breed PE was small. For the Limousin breed no reasonable results could be achieved for any trait. Although accuracies were generally low compared to published accuracies estimated within breeds, they are in line with those derived in other multi-breed populations. 
    Thus PE developed by the Beef CRC can contribute to the implementation of genomic selection in Australian beef cattle breeding.

  17. Linkage analysis of systolic blood pressure: a score statistic and computer implementation

    PubMed Central

    Wang, Kai; Peng, Yingwei

    2003-01-01

    A genome-wide linkage analysis was conducted on systolic blood pressure using a score statistic. The randomly selected Replicate 34 of the simulated data was used. The score statistic was applied to the sibships derived from the general pedigrees. An add-on R program to GENEHUNTER was developed for this analysis and is freely available. PMID:14975145

  18. Genome-wide association study for backfat thickness in Canchim beef cattle using Random Forest approach

    PubMed Central

    2013-01-01

    Background: Meat quality involves many traits, such as marbling, tenderness, juiciness, and backfat thickness, all of which require attention from livestock producers. Improving backfat thickness by traditional selection techniques in Canchim beef cattle has been challenging because the trait has low heritability and is measured late in an animal’s life. Therefore, implementing new methodologies to identify single nucleotide polymorphisms (SNPs) linked to backfat thickness is an important strategy for genetic improvement of carcass and meat quality. Results: The set of SNPs identified by the random forest approach explained as much as 50% of the deregressed estimated breeding value (dEBV) variance associated with backfat thickness, and a small set of 5 SNPs was able to explain 34% of the dEBV variance for backfat thickness. Several quantitative trait loci (QTL) for fat-related traits were found in the areas surrounding the SNPs, as well as many genes with roles in lipid metabolism. Conclusions: These results provided a better understanding of the backfat deposition and regulation pathways, and can be considered a starting point for future implementation of a genomic selection program for backfat thickness in Canchim beef cattle. PMID:23738659

  19. Genomic and pedigree-based prediction for leaf, stem, and stripe rust resistance in wheat.

    PubMed

    Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Huerta-Espino, Julio; Lan, Caixia; Bhavani, Sridhar; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E

    2017-07-01

    Genomic prediction for seedling and adult plant resistance to wheat rusts was compared to prediction using few markers as fixed effects in a least-squares approach and pedigree-based prediction. The unceasing plant-pathogen arms race and ephemeral nature of some rust resistance genes have been challenging for wheat (Triticum aestivum L.) breeding programs and farmers. Hence, it is important to devise strategies for effective evaluation and exploitation of quantitative rust resistance. One promising approach that could accelerate gain from selection for rust resistance is 'genomic selection' which utilizes dense genome-wide markers to estimate the breeding values (BVs) for quantitative traits. Our objective was to compare three genomic prediction models including genomic best linear unbiased prediction (GBLUP), GBLUP A that was GBLUP with selected loci as fixed effects and reproducing kernel Hilbert spaces-markers (RKHS-M) with least-squares (LS) approach, RKHS-pedigree (RKHS-P), and RKHS markers and pedigree (RKHS-MP) to determine the BVs for seedling and/or adult plant resistance (APR) to leaf rust (LR), stem rust (SR), and stripe rust (YR). The 333 lines in the 45th IBWSN and the 313 lines in the 46th IBWSN were genotyped using genotyping-by-sequencing and phenotyped in replicated trials. The mean prediction accuracies ranged from 0.31-0.74 for LR seedling, 0.12-0.56 for LR APR, 0.31-0.65 for SR APR, 0.70-0.78 for YR seedling, and 0.34-0.71 for YR APR. For most datasets, the RKHS-MP model gave the highest accuracies, while LS gave the lowest. GBLUP, GBLUP A, RKHS-M, and RKHS-P models gave similar accuracies. Using genome-wide marker-based models resulted in an average of 42% increase in accuracy over LS. We conclude that GS is a promising approach for improvement of quantitative rust resistance and can be implemented in the breeding pipeline.
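    Prediction accuracy in studies of this kind is commonly reported as the Pearson correlation between predicted and observed values in a validation set. The abstract does not state its exact definition, so the sketch below assumes that convention. It also assumes that the reported 42% is a relative increase over the LS accuracy. The function names and all numbers in the example are made up for illustration.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def relative_increase(baseline: float, model: float) -> float:
    """Relative gain of a model's accuracy over a baseline accuracy."""
    return (model - baseline) / baseline


# Hypothetical observed phenotypes and predicted breeding values.
observed = [3.1, 2.4, 4.0, 1.8, 3.6]
predicted = [2.9, 2.6, 3.7, 2.1, 3.3]
print(f"accuracy: {pearson(predicted, observed):.3f}")

# Hypothetical accuracies for a least-squares baseline and a marker model.
print(f"relative increase: {relative_increase(0.50, 0.71):.0%}")
```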

  20. Global Implementation of Genomic Medicine: We Are Not Alone

    PubMed Central

    Manolio, Teri A.; Abramowicz, Marc; Al-Mulla, Fahd; Anderson, Warwick; Balling, Rudi; Berger, Adam C.; Bleyl, Steven; Chakravarti, Aravinda; Chantratita, Wasun; Chisholm, Rex L.; Dissanayake, Vajira H. W.; Dunn, Michael; Dzau, Victor J.; Han, Bok-Ghee; Hubbard, Tim; Kolbe, Anne; Korf, Bruce; Kubo, Michiaki; Lasko, Paul; Leego, Erkki; Mahasirimongkol, Surakameth; Majumdar, Partha P.; Matthijs, Gert; McLeod, Howard L.; Metspalu, Andres; Meulien, Pierre; Miyano, Satoru; Naparstek, Yaakov; O’Rourke, P. Pearl; Patrinos, George P.; Rehm, Heidi L.; Relling, Mary V.; Rennert, Gad; Rodriguez, Laura Lyman; Roden, Dan M.; Shuldiner, Alan R.; Sinha, Sukdev; Tan, Patrick; Ulfendahl, Mats; Ward, Robyn; Williams, Marc S.; Wong, John E.L.; Green, Eric D.; Ginsburg, Geoffrey S.

    2016-01-01

    Advances in high-throughput genomic technologies coupled with a growing number of genomic results potentially useful in clinical care have led to ground-breaking genomic medicine implementation programs in various nations. Many of these innovative programs capitalize on unique local capabilities arising from the structure of their health care systems or their cultural or political milieu, as well as from unusual burdens of disease or risk alleles. Many such programs are being conducted in relative isolation and might benefit from sharing of approaches and lessons learned in other nations. The National Human Genome Research Institute recently brought together 25 of these groups from around the world to describe and compare projects, examine the current state of implementation and desired near-term capabilities, and identify opportunities for collaboration to promote the responsible implementation of genomic medicine. The wide variety of nascent programs in diverse settings demonstrates that implementation of genomic medicine is expanding globally in varied and highly innovative ways. Opportunities for collaboration abound in the areas of evidence generation, health information technology, education, workforce development, pharmacogenomics, and policy and regulatory issues. Several international organizations that are already facilitating effective research collaborations should engage to ensure implementation proceeds collaboratively without potentially wasteful duplication. Efforts to coalesce these groups around concrete but compelling signature projects, such as global eradication of genetically-mediated drug reactions or developing a truly global genomic variant data resource across a wide number of ethnicities, would accelerate appropriate implementation of genomics to improve clinical care world-wide. PMID:26041702

  1. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis.

    PubMed

    Jakupciak, John P; Wells, Jeffrey M; Karalus, Richard J; Pawlowski, David R; Lin, Jeffrey S; Feldman, Andrew B

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

  2. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    PubMed Central

    Jakupciak, John P.; Wells, Jeffrey M.; Karalus, Richard J.; Pawlowski, David R.; Lin, Jeffrey S.; Feldman, Andrew B.

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations. PMID:24455204

  3. An integrated clinical and genomic information system for cancer precision medicine.

    PubMed

    Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk

    2018-04-20

    Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.

  4. Selfish drive can trump function when animal mitochondrial genomes compete.

    PubMed

    Ma, Hansong; O'Farrell, Patrick H

    2016-07-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. In contrast, matchups between distantly related genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome, leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes showed that the noncoding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, in each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection, promoting change in the sequences influencing transmission.

  5. Selfish drive can trump function when animal mitochondrial genomes compete

    PubMed Central

    Ma, Hansong; O’Farrell, Patrick H.

    2016-01-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. Contrastingly, matchups between distant genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes revealed that the non-coding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, within each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection promoting change in the sequences influencing transmission. PMID:27270106

  6. The scope and strength of sex-specific selection in genome evolution

    PubMed Central

    Wright, A E; Mank, J E

    2013-01-01

    Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. PMID:23848139

  7. Genomic selection of agronomic traits in hybrid rice using an NCII population.

    PubMed

    Xu, Yang; Wang, Xin; Ding, Xiaowen; Zheng, Xingfei; Yang, Zefeng; Xu, Chenwu; Hu, Zhongli

    2018-05-10

    Hybrid breeding is an effective tool to improve yield in rice, while parental selection remains the key and difficult issue. Genomic selection (GS) provides opportunities to predict the performance of hybrids before phenotypes are measured. However, the application of GS is influenced by several genetic and statistical factors. Here, we used a rice North Carolina II (NC II) population constructed by crossing 115 rice varieties with five male sterile lines as a model to evaluate the effects of statistical methods, heritability, marker density and training population size on the prediction of hybrid performance. Comparing six GS methods, we found that predictabilities differed significantly among methods, with genomic best linear unbiased prediction (GBLUP) and least absolute shrinkage and selection operator (LASSO) being the best, and support vector machine (SVM) and partial least squares (PLS) being the worst. Marker density had less influence on the prediction of rice hybrid performance than the size of the training population. Additionally, we used the 575 (115 × 5) rice hybrids as a training population to predict eight agronomic traits of all hybrids derived from 120 (115 + 5) rice varieties, each mated with 3023 rice accessions from the 3000 rice genomes project (3 K RGP). Of the 362,760 potential hybrids, selection of the top 100 predicted hybrids would lead to 35.5%, 23.25%, 30.21%, 42.87%, 61.80%, 75.83%, 19.24% and 36.12% increases in grain yield per plant, thousand-grain weight, panicle number per plant, plant height, secondary branch number, grain number per panicle, panicle length and primary branch number, respectively. This study evaluated the factors affecting predictabilities for hybrid prediction and demonstrated the implementation of GS to predict hybrid performance of rice. Our results suggest that GS could enable the rapid selection of superior hybrids, thus increasing the efficiency of rice hybrid breeding.
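    The cross counts in this abstract follow directly from the mating design, and picking the top 100 predicted hybrids is a simple ranking step. The sketch below reproduces the arithmetic and shows one way such a ranking could be done. The predicted values are random placeholders, not results from the study, and the names `top_k` and `predictions` are hypothetical.

```python
import heapq
import random

N_VARIETIES = 115       # rice varieties used as one set of parents
N_STERILE_LINES = 5     # male sterile lines used as the other set
N_ACCESSIONS = 3023     # accessions from the 3000 rice genomes project

# NC II design: every variety is crossed with every male sterile line.
training_hybrids = N_VARIETIES * N_STERILE_LINES
# Candidate set: each of the 120 parents mated with each accession.
candidate_hybrids = (N_VARIETIES + N_STERILE_LINES) * N_ACCESSIONS


def top_k(predictions: dict, k: int) -> list:
    """Return the k (hybrid_id, value) pairs with the highest value."""
    return heapq.nlargest(k, predictions.items(), key=lambda item: item[1])


# Placeholder predictions: random numbers stand in for model output.
rng = random.Random(0)
predictions = {
    (parent, accession): rng.gauss(0.0, 1.0)
    for parent in range(N_VARIETIES + N_STERILE_LINES)
    for accession in range(N_ACCESSIONS)
}

selected = top_k(predictions, 100)
print(training_hybrids, candidate_hybrids, len(selected))
```

    Using `heapq.nlargest` avoids sorting all candidate hybrids when only a small top set is needed.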

  8. The Joint Effects of Background Selection and Genetic Recombination on Local Gene Genealogies

    PubMed Central

    Zeng, Kai; Charlesworth, Brian

    2011-01-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data. PMID:21705759

  9. The joint effects of background selection and genetic recombination on local gene genealogies.

    PubMed

    Zeng, Kai; Charlesworth, Brian

    2011-09-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

  10. Multilevel Research and the Challenges of Implementing Genomic Medicine

    PubMed Central

    Coates, Ralph J.; Fennell, Mary L.; Glasgow, Russell E.; Scheuner, Maren T.; Schully, Sheri D.; Williams, Marc S.; Clauser, Steven B.

    2012-01-01

    Advances in genomics and related fields promise a new era of personalized medicine in the cancer care continuum. Nevertheless, there are fundamental challenges in integrating genomic medicine into cancer practice. We explore how multilevel research can contribute to implementation of genomic medicine. We first review the rapidly developing scientific discoveries in this field and the paucity of current applications that are ready for implementation in clinical and public health programs. We then define a multidisciplinary translational research agenda for successful integration of genomic medicine into policy and practice and consider challenges for successful implementation. We illustrate the agenda using the example of Lynch syndrome testing in newly diagnosed cases of colorectal cancer and cascade testing in relatives. We synthesize existing information in a framework for future multilevel research for integrating genomic medicine into the cancer care continuum. PMID:22623603

  11. Multilevel research and the challenges of implementing genomic medicine.

    PubMed

    Khoury, Muin J; Coates, Ralph J; Fennell, Mary L; Glasgow, Russell E; Scheuner, Maren T; Schully, Sheri D; Williams, Marc S; Clauser, Steven B

    2012-05-01

    Advances in genomics and related fields promise a new era of personalized medicine in the cancer care continuum. Nevertheless, there are fundamental challenges in integrating genomic medicine into cancer practice. We explore how multilevel research can contribute to implementation of genomic medicine. We first review the rapidly developing scientific discoveries in this field and the paucity of current applications that are ready for implementation in clinical and public health programs. We then define a multidisciplinary translational research agenda for successful integration of genomic medicine into policy and practice and consider challenges for successful implementation. We illustrate the agenda using the example of Lynch syndrome testing in newly diagnosed cases of colorectal cancer and cascade testing in relatives. We synthesize existing information in a framework for future multilevel research for integrating genomic medicine into the cancer care continuum.

  12. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute’s genomic medicine portfolio

    PubMed Central

    Manolio, Teri A.

    2016-01-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual’s genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of “Genomic Medicine Meetings,” under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI’s genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. PMID:27612677

  13. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    PubMed

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. Published by Elsevier Ireland Ltd.

  14. The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes.

    PubMed

    Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars

    2017-02-10

    The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
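
    The relative-entropy measure described in this abstract can be illustrated with a short sketch. It assumes the measure is the Kullback-Leibler divergence (in bits) between observed trinucleotide frequencies and the frequencies expected if bases were independent with the observed mononucleotide frequencies; the abstract does not give the exact formula, and the use of overlapping windows and the function names are illustrative choices, not the authors' implementation.

```python
from collections import Counter
from math import log2


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_entropy(seq: str) -> float:
    """KL divergence (bits) of observed trinucleotide frequencies from the
    frequencies expected under independent bases (assumed definition)."""
    seq = seq.upper()
    n = len(seq)
    # Mononucleotide frequencies observed in the sequence.
    mono = {base: count / n for base, count in Counter(seq).items()}
    # Overlapping trinucleotide windows (an assumption of this sketch).
    tri = Counter(seq[i:i + 3] for i in range(n - 2))
    total = sum(tri.values())
    divergence = 0.0
    for word, count in tri.items():
        observed = count / total
        expected = mono[word[0]] * mono[word[1]] * mono[word[2]]
        divergence += observed * log2(observed / expected)
    return divergence


# A strictly periodic sequence is far from what independent bases predict.
periodic = relative_entropy("ACG" * 10)
# A homopolymer matches its own expectation exactly.
uniform = relative_entropy("AAAAAA")
```

    Under this definition a value of zero means the trinucleotide usage is fully explained by base composition, and larger values indicate stronger higher-order structure in the coding sequence.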

  15. Natural positive selection and north-south genetic diversity in East Asia.

    PubMed

    Suo, Chen; Xu, Haiyan; Khor, Chiea-Chuen; Ong, Rick Th; Sim, Xueling; Chen, Jieming; Tay, Wan-Ting; Sim, Kar-Seng; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun; Tai, E-Shyong; Wong, Tien-Yin; Chia, Kee-Seng; Teo, Yik-Ying

    2012-01-01

    Recent reports have identified a north-south cline in genetic variation in East and South-East Asia, but these studies have not formally explored the basis of these clinal differences. Understanding the origins of these variations may provide valuable insights in tracking down the functional variants in genomic regions identified by genetic association studies. Here we investigate the genetic basis of these differences with genome-wide data from the HapMap, the Human Genome Diversity Project and the Singapore Genome Variation Project. We implemented four bioinformatic measures to discover genomic regions that are considerably differentiated either between two Han Chinese populations in the north and south of China, or across 22 populations in East and South-East Asia. These measures prioritized genomic stretches with: (i) regional differences in the allelic spectrum for SNPs common to the two Han Chinese populations; (ii) differential evidence of positive selection between the two populations as quantified by integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH); (iii) significant correlation between allele frequencies and geographical latitudes of the 22 populations. We also explored the extent of linkage disequilibrium variations in these regions, which is important in combining genetic association studies from North and South Chinese. Two of the regions that emerged are found in HLA class I and II, suggesting that the HLA imputation panel from the HapMap may not be directly applicable to every Chinese sample. This has important implications for autoimmune studies that plan to impute the classical HLA alleles to fine map the SNP association signals.
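
    Measure (iii) above, the correlation between allele frequency and latitude, can be sketched as a plain Pearson correlation. This is a minimal illustration only: the abstract does not state which correlation statistic or significance test was used, and the latitudes and allele frequencies below are made-up toy values, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical populations: latitude in degrees north and the frequency of
# one allele at a single SNP (toy numbers chosen for illustration).
latitudes = [1.3, 13.7, 23.1, 31.2, 39.9]
allele_freqs = [0.12, 0.20, 0.31, 0.38, 0.47]

r = pearson_r(latitudes, allele_freqs)
```

    In a genome-wide scan this statistic would be computed per SNP, and regions enriched for strongly correlated SNPs would be prioritized; correcting for multiple testing and population structure is outside the scope of this sketch.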

  16. Natural positive selection and north–south genetic diversity in East Asia

    PubMed Central

    Suo, Chen; Xu, Haiyan; Khor, Chiea-Chuen; Ong, Rick TH; Sim, Xueling; Chen, Jieming; Tay, Wan-Ting; Sim, Kar-Seng; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun; Tai, E-Shyong; Wong, Tien-Yin; Chia, Kee-Seng; Teo, Yik-Ying

    2012-01-01

    Recent reports have identified a north–south cline in genetic variation in East and South-East Asia, but these studies have not formally explored the basis of these clinal differences. Understanding the origins of these variations may provide valuable insights in tracking down the functional variants in genomic regions identified by genetic association studies. Here we investigate the genetic basis of these differences with genome-wide data from the HapMap, the Human Genome Diversity Project and the Singapore Genome Variation Project. We implemented four bioinformatic measures to discover genomic regions that are considerably differentiated either between two Han Chinese populations in the north and south of China, or across 22 populations in East and South-East Asia. These measures prioritized genomic stretches with: (i) regional differences in the allelic spectrum for SNPs common to the two Han Chinese populations; (ii) differential evidence of positive selection between the two populations as quantified by integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH); (iii) significant correlation between allele frequencies and geographical latitudes of the 22 populations. We also explored the extent of linkage disequilibrium variations in these regions, which is important in combining genetic association studies from North and South Chinese. Two of the regions that emerged are found in HLA class I and II, suggesting that the HLA imputation panel from the HapMap may not be directly applicable to every Chinese sample. This has important implications for autoimmune studies that plan to impute the classical HLA alleles to fine map the SNP association signals. PMID:21792231

  17. Genomic selection in plant breeding

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor ...

  18. [Implementation of Italian guidelines on public health genomics in Italy: a challenging policy of the NHS].

    PubMed

    Boccia, Stefania; Federici, Antonio; Colotto, Marco; Villari, Paolo

    2014-01-01

    Genomics and related fields are becoming increasingly relevant in health care practice. Italy is the first European country that has a structured policy of Public Health Genomics. Nevertheless, it is still not entirely clear what the role of genomics should be from a public health perspective, or how public health professionals should engage with advances in genomic knowledge and technology. A description of the regulatory framework established by the Italian government in recent years is provided. In order to implement the national guidelines on Public Health Genomics published in 2013, key issues, including the ethical, legal and social aspects, should be addressed within an evidence-based framework and are discussed herein. Genomics and predictive medicine are considered one of the main intervention areas by the National Prevention Plan 2010-2012, and dedicated guidelines were published in 2013. In order to implement such guidelines, we envisage a coordinated effort between stakeholders to guide development in genomic medicine, towards an impact on population health. There is also room to improve knowledge on how genomics can be integrated into health systems in an appropriate and sustainable way. Learning programs are needed to spread knowledge and awareness of genomics technology, in particular on genomic testing for complex diseases.

  19. The scope and strength of sex-specific selection in genome evolution.

    PubMed

    Wright, A E; Mank, J E

    2013-09-01

    Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.

  20. Genomic selection accuracies within and between environments and small breeding groups in white spruce.

    PubMed

    Beaulieu, Jean; Doerksen, Trevor K; MacKay, John; Rainville, André; Bousquet, Jean

    2014-12-02

    Genomic selection (GS) may improve selection response over conventional pedigree-based selection if markers capture more detailed information than pedigrees in recently domesticated tree species and/or make it more cost effective. Genomic prediction accuracies using 1748 trees and 6932 SNPs representative of as many distinct gene loci were determined for growth and wood traits in white spruce, within and between environments and breeding groups (BG), each with an effective size of Ne ≈ 20. Marker subsets were also tested. Model fits and/or cross-validation (CV) prediction accuracies for ridge regression (RR) and the least absolute shrinkage and selection operator models approached those of pedigree-based models. With strong relatedness between CV sets, prediction accuracies for RR within environment and BG were high for wood (r = 0.71-0.79) and moderately high for growth (r = 0.52-0.69) traits, in line with trends in heritabilities. For both classes of traits, these accuracies achieved between 83% and 92% of those obtained with phenotypes and pedigree information. Prediction into untested environments remained moderately high for wood (r ≥ 0.61) but dropped significantly for growth (r ≥ 0.24) traits, emphasizing the need to phenotype in all test environments and model genotype-by-environment interactions for growth traits. Removing relatedness between CV sets sharply decreased prediction accuracies for all traits and subpopulations, falling near zero between BGs with no known shared ancestry. For marker subsets, similar patterns were observed but with lower prediction accuracies. Given the need for high relatedness between CV sets to obtain good prediction accuracies, we recommend to build GS models for prediction within the same breeding population only. 
    Breeding groups could be merged to build genomic prediction models as long as the total effective population size does not exceed 50 individuals in order to obtain high prediction accuracy such as that obtained in the present study. A number of markers limited to a few hundred would not negatively impact prediction accuracies, but these could decrease more rapidly over generations. The most promising short-term approach for genomic selection would likely be the selection of superior individuals within large full-sib families vegetatively propagated to implement multiclonal forestry.

  1. Assessing the impact of natural service bulls and genotype by environment interactions on genetic gain and inbreeding in organic dairy cattle genomic breeding programs.

    PubMed

    Yin, T; Wensch-Dorendorf, M; Simianer, H; Swalve, H H; König, S

    2014-06-01

    The objective of the present study was to compare genetic gain and inbreeding coefficients of dairy cattle in organic breeding program designs by applying stochastic simulations. Evaluated breeding strategies were: (i) selecting bulls from conventional breeding programs, and taking into account genotype by environment (G×E) interactions, (ii) selecting genotyped bulls within the organic environment for artificial insemination (AI) programs and (iii) selecting genotyped natural service bulls within organic herds. The simulated conventional population comprised 148 800 cows from 2976 herds with an average herd size of 50 cows per herd, and 1200 cows were assigned to 60 organic herds. In a young bull program, selection criteria of young bulls in both production systems (conventional and organic) were either 'conventional' estimated breeding values (EBV) or genomic estimated breeding values (GEBV) for two traits with low (h² = 0.05) and moderate heritability (h² = 0.30). GEBV were calculated for different accuracies (r_mg), and G×E interactions were considered by modifying originally simulated true breeding values in the range from r_g = 0.5 to 1.0. For both traits (h² = 0.05 and 0.30) and r_mg ≥ 0.8, genomic selection of bulls directly in the organic population and using selected bulls via AI revealed higher genetic gain than selecting young bulls in the larger conventional population based on EBV, even without G×E interactions. Only for pronounced G×E interactions (r_g = 0.5), and for highly accurate GEBV for natural service bulls (r_mg > 0.9), do the results suggest the use of genotyped organic natural service bulls instead of implementing an AI program. Inbreeding coefficients of selected bulls and their offspring were generally lower when basing selection decisions for young bulls on GEBV compared with selection strategies based on pedigree indices.

  2. Accuracy of Genomic Prediction in Switchgrass (Panicum virgatum L.) Improved by Accounting for Linkage Disequilibrium

    PubMed Central

    Ramstein, Guillaume P.; Evans, Joseph; Kaeppler, Shawn M.; Mitchell, Robert B.; Vogel, Kenneth P.; Buell, C. Robin; Casler, Michael D.

    2016-01-01

    Switchgrass is a relatively high-yielding and environmentally sustainable biomass crop, but further genetic gains in biomass yield must be achieved to make it an economically viable bioenergy feedstock. Genomic selection (GS) is an attractive technology to generate rapid genetic gains in switchgrass, and meet the goals of a substantial displacement of petroleum use with biofuels in the near future. In this study, we empirically assessed prediction procedures for genomic selection in two different populations, consisting of 137 and 110 half-sib families of switchgrass, tested in two locations in the United States for three agronomic traits: dry matter yield, plant height, and heading date. Marker data were produced for the families’ parents by exome capture sequencing, generating up to 141,030 polymorphic markers with available genomic-location and annotation information. We evaluated prediction procedures that varied not only by learning schemes and prediction models, but also by the way the data were preprocessed to account for redundancy in marker information. More complex genomic prediction procedures were generally not significantly more accurate than the simplest procedure, likely due to limited population sizes. Nevertheless, a highly significant gain in prediction accuracy was achieved by transforming the marker data through a marker correlation matrix. Our results suggest that marker-data transformations and, more generally, the account of linkage disequilibrium among markers, offer valuable opportunities for improving prediction procedures in GS. Some of the achieved prediction accuracies should motivate implementation of GS in switchgrass breeding programs. PMID:26869619

  3. Use of genomic recursions and algorithm for proven and young animals for single-step genomic BLUP analyses--a simulation study.

    PubMed

    Fragomeni, B O; Lourenco, D A L; Tsuruta, S; Masuda, Y; Aguilar, I; Misztal, I

    2015-10-01

    The purpose of this study was to examine accuracy of genomic selection via single-step genomic BLUP (ssGBLUP) when the direct inverse of the genomic relationship matrix (G) is replaced by an approximation of G(-1) based on recursions for young genotyped animals conditioned on a subset of proven animals, termed algorithm for proven and young animals (APY). With the efficient implementation, this algorithm has a cubic cost with proven animals and linear with young animals. Ten duplicate data sets mimicking a dairy cattle population were simulated. In a first scenario, genomic information for 20k genotyped bulls, divided in 7k proven and 13k young bulls, was generated for each replicate. In a second scenario, 5k genotyped cows with phenotypes were included in the analysis as young animals. Accuracies (average for the 10 replicates) in regular EBV were 0.72 and 0.34 for proven and young animals, respectively. When genomic information was included, they increased to 0.75 and 0.50. No differences between genomic EBV (GEBV) obtained with the regular G(-1) and the approximated G(-1) via the recursive method were observed. In the second scenario, accuracies in GEBV (0.76, 0.51 and 0.59 for proven bulls, young males and young females, respectively) were also higher than those in EBV (0.72, 0.35 and 0.49). Again, no differences between GEBV with regular G(-1) and with recursions were observed. With the recursive algorithm, the number of iterations to achieve convergence was reduced from 227 to 206 in the first scenario and from 232 to 209 in the second scenario. Cows can be treated as young animals in APY without reducing the accuracy. The proposed algorithm can be implemented to reduce computing costs and to overcome current limitations on the number of genotyped animals in the ssGBLUP method. © 2015 Blackwell Verlag GmbH.

  4. RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

    PubMed

    Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

    2013-11-01

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes.
    RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.

  5. Breeding for resistance to gastrointestinal nematodes - the potential in low-input/output small ruminant production systems.

    PubMed

    Zvinorova, P I; Halimani, T E; Muchadeyi, F C; Matika, O; Riggio, V; Dzama, K

    2016-07-30

    The control of gastrointestinal nematodes (GIN) is mainly based on the use of drugs, grazing management, use of copper oxide wire particles and bioactive forages. Resistance to anthelmintic drugs in small ruminants is documented worldwide. Host genetic resistance to parasites has been increasingly used as a complementary control strategy, along with the conventional intervention methods mentioned above. Genetic diversity in resistance to GIN has been well studied in experimental and commercial flocks in temperate climates and more developed economies. However, there are very few reports from the more extensive low-input/output smallholder systems in developing and emerging countries. Furthermore, results on quantitative trait loci (QTL) associated with nematode resistance from various studies have not always been consistent, mainly due to the different nematodes studied, different host breeds, ages, climates, natural infections versus artificial challenges, and infection level at sampling periods, among others. The increasing use of genetic markers (Single Nucleotide Polymorphisms, SNPs) in GWAS or the use of whole genome sequence data and a plethora of analytic methods offer the potential to identify loci or regions associated with nematode resistance. Genomic selection as a genome-wide level method overcomes the need to identify candidate genes. Benefits in genomic selection are now being realised in dairy cattle and sheep under commercial settings in the more advanced countries. However, despite the commercial benefits of using these tools, there are practical problems associated with incorporating the use of marker-assisted selection or genomic selection in low-input/output smallholder farming systems breeding schemes. Unlike anthelmintic resistance, there is no empirical evidence suggesting that nematodes will evolve rapidly in response to resistant hosts.
    The strategy of nematode control has evolved to a more practical manipulation of host-parasite equilibrium in grazing systems by implementation of various strategies, in which improvement of genetic resistance of small ruminants should be included. Therefore, selection for resistant hosts can be considered a sustainable control strategy, although it will be most effective when used to complement other control strategies such as grazing management and improving the efficiency of currently used anthelmintics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  6. High-confidence assessment of functional impact of human mitochondrial non-synonymous genome variations by APOGEE.

    PubMed

    Castellana, Stefano; Fusilli, Caterina; Mazzoccoli, Gianluigi; Biagini, Tommaso; Capocefalo, Daniele; Carella, Massimo; Vescovi, Angelo Luigi; Mazza, Tommaso

    2017-06-01

    There are 24,189 possible non-synonymous amino acid changes that could affect human mitochondrial DNA. Only a tiny subset has been functionally evaluated with certainty so far, while the pathogenicity of the vast majority has been assessed only in silico by software predictors. Since these tools have proved rather incongruent, we designed and implemented APOGEE, a machine-learning algorithm that outperforms all existing prediction methods in estimating the harmfulness of mitochondrial non-synonymous genome variations. We provide a detailed description of the underlying algorithm, of the selected and manually curated training and test sets of variants, as well as of its classification ability.

  7. Preemptive Genotyping for Personalized Medicine: Design of the Right Drug, Right Dose, Right Time – Using Genomic Data to Individualize Treatment Protocol

    PubMed Central

    Bielinski, Suzette J.; Olson, Janet E.; Pathak, Jyotishman; Weinshilboum, Richard M.; Wang, Liewei; Lyke, Kelly J.; Ryu, Euijung; Targonski, Paul V.; Van Norstrand, Michael D.; Hathcock, Matthew A.; Takahashi, Paul Y.; McCormick, Jennifer B.; Johnson, Kiley J.; Maschke, Karen J.; Rohrer Vitek, Carolyn R.; Ellingson, Marissa S.; Wieben, Eric D.; Farrugia, Gianrico; Morrisette, Jody A.; Kruckeberg, Keri J.; Bruflat, Jamie K.; Peterson, Lisa M.; Blommel, Joseph H.; Skierka, Jennifer M.; Ferber, Matthew J.; Black, John L.; Baudhuin, Linnea M.; Klee, Eric W.; Ross, Jason L.; Veldhuizen, Tamra L.; Schultz, Cloann G.; Caraballo, Pedro J.; Freimuth, Robert R.; Chute, Christopher G.; Kullo, Iftikhar J.

    2014-01-01

    Objective To report the design and implementation of the Right Drug, Right Dose, Right Time: Using Genomic Data to Individualize Treatment Protocol that was developed to test the concept that prescribers can deliver genome guided therapy at the point-of-care by using preemptive pharmacogenomics (PGx) data and clinical decision support (CDS) integrated in the electronic medical record (EMR). Patients and Methods We used a multivariable prediction model to identify patients with a high risk of initiating statin therapy within 3 years. The model was used to target a study cohort most likely to benefit from preemptive PGx testing among Mayo Clinic Biobank participants with a recruitment goal of 1000 patients. Cox proportional hazards model was utilized using the variables selected through the Lasso shrinkage method. An operational CDS model was adapted to implement PGx rules within the EMR. Results The prediction model included age, sex, race, and 6 chronic diseases categorized by the Clinical Classifications Software for ICD-9 codes (dyslipidemia, diabetes, peripheral atherosclerosis, disease of the blood-forming organs, coronary atherosclerosis and other heart diseases, and hypertension). Of the 2000 Biobank participants invited, 50% provided blood samples, 13% refused, 28% did not respond, and 9% consented but did not provide a blood sample within the recruitment window (October 4, 2012 – March 20, 2013). Preemptive PGx testing included CYP2D6 genotyping and targeted sequencing of 84 PGx genes. Synchronous real-time CDS is integrated in the EMR and flags potential patient-specific drug-gene interactions and provides therapeutic guidance. Conclusion These interventions will improve understanding and implementation of genomic data in clinical practice. PMID:24388019

  8. AGORA : Organellar genome annotation from the amino acid and nucleotide references.

    PubMed

    Jung, Jaehee; Kim, Jong Im; Jeong, Young-Sik; Yi, Gangman

    2018-03-29

    Next-generation sequencing (NGS) technologies have led to the accumulation of high-throughput sequence data from various organisms in biology. To apply gene annotation of organellar genomes for various organisms, more optimized tools for functional gene annotation are required. Almost all gene annotation tools are mainly focused on the chloroplast genome of land plants or the mitochondrial genome of animals. We have developed a web application, AGORA, for the fast, user-friendly, and improved annotation of organellar genomes. AGORA annotates genes based on a BLAST-based homology search and clustering with selected reference sequences from the NCBI database or user-defined uploaded data. AGORA can annotate the functional genes in almost all mitochondrion and plastid genomes of eukaryotes. The gene annotation of a genome with an exon-intron structure within a gene or inverted repeat region is also available. It provides the start and end positions of each gene, BLAST results compared with the reference sequence, and a visualization of the gene map by OGDRAW. Users can freely use the software, and the accessible URL is https://bigdata.dongguk.edu/gene_project/AGORA/. The main module of the tool is implemented in Python and PHP, and the web page is built with HTML and CSS to support all browsers. Contact: gangman@dongguk.edu.

  9. Genomic selection in plant breeding.

    PubMed

    Newell, Mark A; Jannink, Jean-Luc

    2014-01-01

    Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor and major marker effects. Thus, the GEBV may capture more of the genetic variation for the particular trait under selection.
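
    As a rough illustration of the idea above, a GEBV can be pictured as allele dosages weighted by estimated marker effects and summed over every marker, large and small alike. The sketch below assumes a purely additive model; the function name, marker effects, and candidate genotypes are hypothetical toy values, not data from the record.

```python
# Minimal sketch, assuming an additive model: GEBV = sum over markers of
# (allele dosage x estimated marker effect). All numbers are made up.

def gebv(genotypes, marker_effects):
    """Return the genomic estimated breeding value for one candidate.

    genotypes      -- allele dosages (0, 1 or 2) at each marker
    marker_effects -- estimated additive effect of each marker
    """
    if len(genotypes) != len(marker_effects):
        raise ValueError("one effect is needed per marker")
    return sum(g * a for g, a in zip(genotypes, marker_effects))


# Hypothetical effects: every marker contributes, not only the large ones.
marker_effects = [0.5, -0.25, 0.125, 1.0]
candidates = {
    "cand_A": [2, 0, 1, 1],
    "cand_B": [0, 2, 2, 0],
}

# Rank selection candidates from highest to lowest GEBV.
ranking = sorted(
    candidates,
    key=lambda name: gebv(candidates[name], marker_effects),
    reverse=True,
)
```

    In practice the marker effects would come from a model trained on genotyped and phenotyped individuals; here they are simply given.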

  10. Genomic predictions can accelerate selection for resistance against Piscirickettsia salmonis in Atlantic salmon (Salmo salar).

    PubMed

    Bangera, Rama; Correa, Katharina; Lhorente, Jean P; Figueroa, René; Yáñez, José M

    2017-01-31

    Salmon Rickettsial Syndrome (SRS) caused by Piscirickettsia salmonis is a major disease affecting the Chilean salmon industry. Genomic selection (GS) is a method wherein genome-wide markers and phenotype information of full-sibs are used to predict genomic EBV (GEBV) of selection candidates and is expected to have increased accuracy and response to selection over traditional pedigree based Best Linear Unbiased Prediction (PBLUP). Widely used GS methods such as genomic BLUP (GBLUP), SNPBLUP, Bayes C and Bayesian Lasso may perform differently with respect to accuracy of GEBV prediction. Our aim was to compare the accuracy, in terms of reliability of genome-enabled prediction, from different GS methods with PBLUP for resistance to SRS in an Atlantic salmon breeding program. Number of days to death (DAYS), binary survival status (STATUS) phenotypes, and 50 K SNP array genotypes were obtained from 2601 smolts challenged with P. salmonis. The reliability of different GS methods at different SNP densities with and without pedigree were compared to PBLUP using a five-fold cross validation scheme. Heritability estimated from GS methods was significantly higher than PBLUP. Pearson's correlation between predicted GEBV from PBLUP and GS models ranged from 0.79 to 0.91 and 0.79-0.95 for DAYS and STATUS, respectively. The relative increase in reliability from different GS methods for DAYS and STATUS with 50 K SNP ranged from 8 to 25% and 27-30%, respectively. All GS methods outperformed PBLUP at all marker densities. DAYS and STATUS showed superior reliability over PBLUP even at the lowest marker density of 3 K and 500 SNP, respectively. 20 K SNP showed close to maximal reliability for both traits with little improvement using higher densities. These results indicate that genomic predictions can accelerate genetic progress for SRS resistance in Atlantic salmon and implementation of this approach will contribute to the control of SRS in Chile. 
We recommend GBLUP for routine GS evaluation because this method is computationally faster and the results are very similar with other GS methods. The use of lower density SNP or the combination of low density SNP and an imputation strategy may help to reduce genotyping costs without compromising gain in reliability.
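
    The five-fold cross-validation scheme mentioned above can be sketched as follows. This is only an outline under stated assumptions: the predictor is a toy single-marker regression standing in for PBLUP or a GS method, the data are invented, and accuracy is summarised as the Pearson correlation between held-out predictions and observations rather than the reliability measure used in the study. All function names are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validate(records, phenotypes, fit, predict, k=5, seed=0):
    """Predict each record from a model trained on the other folds only."""
    predicted = [None] * len(records)
    for test_idx in k_fold_indices(len(records), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(records)) if i not in held_out]
        model = fit([records[i] for i in train_idx],
                    [phenotypes[i] for i in train_idx])
        for i in test_idx:
            predicted[i] = predict(model, records[i])
    return pearson(predicted, phenotypes)


# Toy stand-in predictor (an assumption): least squares on one marker dosage.
def fit_line(dosages, values):
    mx, my = mean(dosages), mean(values)
    slope = (sum((x - mx) * (y - my) for x, y in zip(dosages, values))
             / sum((x - mx) ** 2 for x in dosages))
    return my - slope * mx, slope


def predict_line(model, dosage):
    intercept, slope = model
    return intercept + slope * dosage


dosages = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]          # invented genotypes
phenotypes = [1.0 + 2.0 * d for d in dosages]     # invented, noise-free trait
accuracy = cross_validate(dosages, phenotypes, fit_line, predict_line)
```

    Swapping `fit_line` and `predict_line` for another model while keeping the same folds is what makes a comparison between methods fair.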

  11. Preemptive genotyping for personalized medicine: design of the right drug, right dose, right time-using genomic data to individualize treatment protocol.

    PubMed

    Bielinski, Suzette J; Olson, Janet E; Pathak, Jyotishman; Weinshilboum, Richard M; Wang, Liewei; Lyke, Kelly J; Ryu, Euijung; Targonski, Paul V; Van Norstrand, Michael D; Hathcock, Matthew A; Takahashi, Paul Y; McCormick, Jennifer B; Johnson, Kiley J; Maschke, Karen J; Rohrer Vitek, Carolyn R; Ellingson, Marissa S; Wieben, Eric D; Farrugia, Gianrico; Morrisette, Jody A; Kruckeberg, Keri J; Bruflat, Jamie K; Peterson, Lisa M; Blommel, Joseph H; Skierka, Jennifer M; Ferber, Matthew J; Black, John L; Baudhuin, Linnea M; Klee, Eric W; Ross, Jason L; Veldhuizen, Tamra L; Schultz, Cloann G; Caraballo, Pedro J; Freimuth, Robert R; Chute, Christopher G; Kullo, Iftikhar J

    2014-01-01

    To report the design and implementation of the Right Drug, Right Dose, Right Time-Using Genomic Data to Individualize Treatment protocol that was developed to test the concept that prescribers can deliver genome-guided therapy at the point of care by using preemptive pharmacogenomics (PGx) data and clinical decision support (CDS) integrated into the electronic medical record (EMR). We used a multivariate prediction model to identify patients with a high risk of initiating statin therapy within 3 years. The model was used to target a study cohort most likely to benefit from preemptive PGx testing among the Mayo Clinic Biobank participants, with a recruitment goal of 1000 patients. We used a Cox proportional hazards model with variables selected through the Lasso shrinkage method. An operational CDS model was adapted to implement PGx rules within the EMR. The prediction model included age, sex, race, and 6 chronic diseases categorized by the Clinical Classifications Software for International Classification of Diseases, Ninth Revision codes (dyslipidemia, diabetes, peripheral atherosclerosis, disease of the blood-forming organs, coronary atherosclerosis and other heart diseases, and hypertension). Of the 2000 Biobank participants invited, 1013 (51%) provided blood samples, 256 (13%) declined participation, 555 (28%) did not respond, and 176 (9%) consented but did not provide a blood sample within the recruitment window (October 4, 2012, through March 20, 2013). Preemptive PGx testing included CYP2D6 genotyping and targeted sequencing of 84 PGx genes. Synchronous real-time CDS was integrated into the EMR and flagged potential patient-specific drug-gene interactions and provided therapeutic guidance. This translational project provides an opportunity to begin to evaluate the impact of preemptive sequencing and EMR-driven genome-guided therapy. These interventions will improve understanding and implementation of genomic data in clinical practice. 
Copyright © 2014 Mayo Foundation for Medical Education and Research. Published by Elsevier Inc. All rights reserved.

  12. A predictive assessment of genetic correlations between traits in chickens using markers.

    PubMed

    Momen, Mehdi; Mehrgardi, Ahmad Ayatollahi; Sheikhy, Ayoub; Esmailizadeh, Ali; Fozi, Masood Asadi; Kranis, Andreas; Valente, Bruno D; Rosa, Guilherme J M; Gianola, Daniel

    2017-02-01

    Genomic selection has been successfully implemented in plant and animal breeding programs to shorten generation intervals and accelerate genetic progress per unit of time. In practice, genomic selection can be used to improve several correlated traits simultaneously via multiple-trait prediction, which exploits correlations between traits. However, few studies have explored multiple-trait genomic selection. Our aim was to infer genetic correlations between three traits measured in broiler chickens by exploring kinship matrices based on a linear combination of measures of pedigree and marker-based relatedness. A predictive assessment was used to gauge genetic correlations. A multivariate genomic best linear unbiased prediction model was designed to combine information from pedigree and genome-wide markers in order to assess genetic correlations between three complex traits in chickens, i.e. body weight at 35 days of age (BW), ultrasound area of breast meat (BM) and hen-house egg production (HHP). A dataset with 1351 birds that were genotyped with the 600 K Affymetrix platform was used. A kinship kernel (K) was constructed as K = λ G + (1 - λ)A, where A is the numerator relationship matrix, measuring pedigree-based relatedness, and G is a genomic relationship matrix. The weight (λ) assigned to each source of information varied over the grid λ = (0, 0.2, 0.4, 0.6, 0.8, 1). Maximum likelihood estimates of heritability and genetic correlations were obtained at each λ, and the "optimum" λ was determined using cross-validation. Estimates of genetic correlations were affected by the weight placed on the source of information used to build K. For example, the genetic correlation between BW-HHP and BM-HHP changed markedly when λ varied from 0 (only A used for measuring relatedness) to 1 (only genomic information used). 
As λ increased, predictive correlations (correlation between observed phenotypes and predicted breeding values) increased and mean-squared predictive error decreased. However, the improvement in predictive ability was not monotonic, with an optimum found at some 0 < λ < 1, i.e., when both sources of information were used together. Our findings indicate that multiple-trait prediction may benefit from combining pedigree and marker information. Also, it appeared that expected correlated responses to selection computed from standard theory may differ from realized responses. The predictive assessment provided a metric for performance evaluation as well as a means for expressing uncertainty of outcomes of multiple-trait selection.
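
    The kinship kernel K = λ G + (1 - λ)A and its evaluation over the grid of λ values can be written down directly. The sketch below only builds the blended matrices; the 2 x 2 matrices are hypothetical toy values, and the cross-validation step that picks the "optimum" λ in the study is not reproduced here.

```python
# Minimal sketch: blend a genomic (G) and a pedigree (A) relationship matrix
# as K = lam * G + (1 - lam) * A for each weight on the grid from the abstract.

def blend_kinship(G, A, lam):
    """Return K = lam * G + (1 - lam) * A as a list of rows."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [
        [lam * g + (1.0 - lam) * a for g, a in zip(g_row, a_row)]
        for g_row, a_row in zip(G, A)
    ]


# Hypothetical relationships for two birds (not data from the study).
A = [[1.0, 0.5], [0.5, 1.0]]     # pedigree-based relatedness
G = [[1.0, 0.25], [0.25, 1.0]]   # marker-based relatedness

grid = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
kernels = {lam: blend_kinship(G, A, lam) for lam in grid}
```

    At λ = 0 the kernel is A alone and at λ = 1 it is G alone, matching the two extremes described in the abstract.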

  13. Divergence hitchhiking and the spread of genomic isolation during ecological speciation-with-gene-flow

    PubMed Central

    Via, Sara

    2012-01-01

    In allopatric populations, geographical separation simultaneously isolates the entire genome, allowing genetic divergence to accumulate virtually anywhere in the genome. In sympatric populations, however, the strong divergent selection required to overcome migration produces a genetic mosaic of divergent and non-divergent genomic regions. In some recent genome scans, each divergent genomic region has been interpreted as an independent incidence of migration/selection balance, such that the reduction of gene exchange is restricted to a few kilobases around each divergently selected gene. I propose an alternative mechanism, ‘divergence hitchhiking’ (DH), in which divergent selection can reduce gene exchange for several megabases around a gene under strong divergent selection. Not all genes/markers within a DH region are divergently selected, yet the entire region is protected to some degree from gene exchange, permitting genetic divergence from mechanisms other than divergent selection to accumulate secondarily. After contrasting DH and multilocus migration/selection balance (MM/SB), I outline a model in which genomic isolation at a given genomic location is jointly determined by DH and genome-wide effects of the progressive reduction in realized migration, then illustrate DH using data from several pairs of incipient species in the wild. PMID:22201174

  14. Genetic counselors' (GC) knowledge, awareness, understanding of clinical next-generation sequencing (NGS) genomic testing.

    PubMed

    Boland, P M; Ruth, K; Matro, J M; Rainey, K L; Fang, C Y; Wong, Y N; Daly, M B; Hall, M J

    2015-12-01

    Genomic tests are increasingly complex, less expensive, and more widely available with the advent of next-generation sequencing (NGS). We assessed knowledge and perceptions among genetic counselors pertaining to NGS genomic testing via an online survey. Associations between selected characteristics and perceptions were examined. Recent education on NGS testing was common, but practical experience limited. Perceived understanding of clinical NGS was modest, specifically concerning tumor testing. Greater perceived understanding of clinical NGS testing correlated with more time spent in cancer-related counseling, exposure to NGS testing, and NGS-focused education. Substantial disagreement about the role of counseling for tumor-based testing was seen. Finally, a majority of counselors agreed with the need for more education about clinical NGS testing, supporting this approach to optimizing implementation. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. Towards big data science in the decade ahead from ten years of InCoB and the 1st ISCB-Asia Joint Conference

    PubMed Central

    2011-01-01

    The 2011 International Conference on Bioinformatics (InCoB), the annual scientific conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted in Kuala Lumpur, Malaysia, and is co-organized with the first ISCB-Asia conference of the International Society for Computational Biology (ISCB). InCoB and the sequencing of the human genome are both celebrating their tenth anniversaries, and InCoB’s goalposts for the next decade, implementing standards in bioinformatics and globally distributed computational networks, will be discussed and adopted at this conference. Of the 49 manuscripts (selected from 104 submissions) accepted to BMC Genomics and BMC Bioinformatics conference supplements, 24 are featured in this issue, covering software tools, genome/proteome analysis, systems biology (networks, pathways, bioimaging) and drug discovery and design. PMID:22372736

  16. Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition

    PubMed Central

    Lefébure, Tristan; Stanhope, Michael J

    2007-01-01

    Background: The genus Streptococcus is one of the most diverse and important human and agricultural pathogens. This study employs comparative evolutionary analyses of 26 Streptococcus genomes to yield an improved understanding of the relative roles of recombination and positive selection in pathogen adaptation to their hosts. Results: Streptococcus genomes exhibit extreme levels of evolutionary plasticity, with high levels of gene gain and loss during species and strain evolution. S. agalactiae has a large pan-genome, with little recombination in its core-genome, while S. pyogenes has a smaller pan-genome and much more recombination of its core-genome, perhaps reflecting the greater habitat, and gene pool, diversity for S. agalactiae compared to S. pyogenes. Core-genome recombination was evident in all lineages (18% to 37% of the core-genome judged to be recombinant), while positive selection was mainly observed during species differentiation (from 11% to 34% of the core-genome). Positive selection pressure was unevenly distributed across lineages and biochemical main role categories. S. suis was the lineage with the greatest level of positive selection pressure, the largest number of unique loci selected, and the largest amount of gene gain and loss. Conclusion: Recombination is an important evolutionary force in shaping Streptococcus genomes, not only in the acquisition of significant portions of the genome as lineage specific loci, but also in facilitating rapid evolution of the core-genome. Positive selection, although undoubtedly a slower process, has nonetheless played an important role in adaptation of the core-genome of different Streptococcus species to different hosts. PMID:17475002

  17. Implementation of genomics research in Africa: challenges and recommendations

    PubMed Central

    Adebamowo, Sally N.; Francis, Veronica; Tambo, Ernest; Diallo, Seybou H.; Landouré, Guida; Nembaware, Victoria; Dareng, Eileen; Muhamed, Babu; Odutola, Michael; Akeredolu, Teniola; Nerima, Barbara; Ozumba, Petronilla J.; Mbhele, Slee; Ghanash, Anita; Wachinou, Ablo P.; Ngomi, Nicholas

    2018-01-01

    Background: There is exponential growth in the interest and implementation of genomics research in Africa. This growth has been facilitated by the Human Hereditary and Health in Africa (H3Africa) initiative, which aims to promote a contemporary research approach to the study of genomics and environmental determinants of common diseases in African populations. Objective: The purpose of this article is to describe important challenges affecting genomics research implementation in Africa. Methods: The observations, challenges and recommendations presented in this article were obtained through discussions by African scientists at teleconferences and face-to-face meetings, seminars at consortium conferences and in-depth individual discussions. Results: Challenges affecting genomics research implementation in Africa, which are related to limited resources include ill-equipped facilities, poor accessibility to research centers, lack of expertise and an enabling environment for research activities in local hospitals. Challenges related to the research study include delayed funding, extensive procedures and interventions requiring multiple visits, delays setting up research teams and insufficient staff training, language barriers and an underappreciation of cultural norms. While many African countries are struggling to initiate genomics projects, others have set up genomics research facilities that meet international standards. Conclusions: The lessons learned in implementing successful genomics projects in Africa are recommended as strategies to overcome these challenges. These recommendations may guide the development and application of new research programs in low-resource settings. PMID:29336236

  18. Implementation of genomics research in Africa: challenges and recommendations.

    PubMed

    Adebamowo, Sally N; Francis, Veronica; Tambo, Ernest; Diallo, Seybou H; Landouré, Guida; Nembaware, Victoria; Dareng, Eileen; Muhamed, Babu; Odutola, Michael; Akeredolu, Teniola; Nerima, Barbara; Ozumba, Petronilla J; Mbhele, Slee; Ghanash, Anita; Wachinou, Ablo P; Ngomi, Nicholas

    2018-01-01

    There is exponential growth in the interest and implementation of genomics research in Africa. This growth has been facilitated by the Human Hereditary and Health in Africa (H3Africa) initiative, which aims to promote a contemporary research approach to the study of genomics and environmental determinants of common diseases in African populations. The purpose of this article is to describe important challenges affecting genomics research implementation in Africa. The observations, challenges and recommendations presented in this article were obtained through discussions by African scientists at teleconferences and face-to-face meetings, seminars at consortium conferences and in-depth individual discussions. Challenges affecting genomics research implementation in Africa, which are related to limited resources include ill-equipped facilities, poor accessibility to research centers, lack of expertise and an enabling environment for research activities in local hospitals. Challenges related to the research study include delayed funding, extensive procedures and interventions requiring multiple visits, delays setting up research teams and insufficient staff training, language barriers and an underappreciation of cultural norms. While many African countries are struggling to initiate genomics projects, others have set up genomics research facilities that meet international standards. The lessons learned in implementing successful genomics projects in Africa are recommended as strategies to overcome these challenges. These recommendations may guide the development and application of new research programs in low-resource settings.

  19. Genome-wide association analysis of bacterial cold water disease resistance in rainbow trout reveals the potential of a hybrid approach between genomic selection and marker assisted selection

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) simultaneously incorporates dense SNP marker genotypes with phenotypic data from related animals to predict animal-specific genomic breeding value (GEBV), which circumvents the need to measure the disease phenotype in potential breeders. Marker assisted selection (MAS) involv...

  20. Canine hip dysplasia is predictable by genotyping.

    PubMed

    Guo, G; Zhou, Z; Wang, Y; Zhao, K; Zhu, L; Lust, G; Hunter, L; Friedenberg, S; Li, J; Zhang, Y; Harris, S; Jones, P; Sandler, J; Krotscheck, U; Todhunter, R; Zhang, Z

    2011-04-01

    To establish a predictive method using whole genome genotyping for early intervention in canine hip dysplasia (CHD) risk management, for the prevention of the progression of secondary osteoarthritis (OA), and for selective breeding. Two sets of dogs (six breeds) were genotyped with dense SNPs covering the entire canine genome. The first set contained 359 dogs upon which a predictive formula for genomic breeding value (GBV) was derived by using their estimated breeding value (EBV) of the Norberg angle (a measure of CHD) and their genotypes. To investigate how well the formula would work for an individual dog with genotype only (without using EBV), a cross validation was performed by masking the EBV of one dog at a time. The genomic data and the EBV of the remaining dogs were used to predict the GBV for the single dog that was left out. The second set of dogs included 38 new Labrador retriever dogs, which had no pedigree relationship to the dogs in the first set. The cross validation showed a strong correlation (R>0.7) between the EBV and the GBV. The independent validation showed a moderate correlation (R=0.5) between GBV for the Norberg angle and the observed Norberg angle (no EBV was available for the new 38 dogs). Sensitivity, specificity, positive and negative predictive values of the genomic data were all above 70%. Prediction of CHD from genomic data is feasible, and can be applied for risk management of CHD and early selection for genetic improvement to reduce the prevalence of CHD in breeding programs. The prediction can be implemented before maturity, at which age current radiographic screening programs are traditionally applied, and as soon as DNA is available. Copyright © 2010 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
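
    The sensitivity, specificity, and positive and negative predictive values quoted above all derive from a two-by-two table of predicted versus observed status. The sketch below shows that arithmetic only; it assumes genomic predictions have already been thresholded into affected / unaffected calls (the threshold is not given in the record), and the calls are invented toy data.

```python
# Minimal sketch: diagnostic metrics from paired boolean calls.
# True means "affected"; predicted calls are assumed to come from
# thresholding a genomic prediction, which is not shown here.

def diagnostic_metrics(predicted, observed):
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # true positives
    fp = sum(1 for p, o in pairs if p and not o)      # false positives
    fn = sum(1 for p, o in pairs if not p and o)      # false negatives
    tn = sum(1 for p, o in pairs if not p and not o)  # true negatives
    return {
        "sensitivity": tp / (tp + fn),  # affected dogs correctly flagged
        "specificity": tn / (tn + fp),  # unaffected dogs correctly cleared
        "ppv": tp / (tp + fp),          # flagged dogs that are affected
        "npv": tn / (tn + fn),          # cleared dogs that are unaffected
    }


# Invented example calls for eight dogs.
predicted = [True, True, True, False, False, False, True, False]
observed = [True, True, False, False, False, True, True, False]
metrics = diagnostic_metrics(predicted, observed)
```

    Unlike sensitivity and specificity, the two predictive values depend on how common the condition is in the group being tested.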

  1. Genome analysis of Legionella pneumophila strains using a mixed-genome microarray.

    PubMed

    Euser, Sjoerd M; Nagelkerke, Nico J; Schuren, Frank; Jansen, Ruud; Den Boer, Jeroen W

    2012-01-01

    Legionella, the causative agent for Legionnaires' disease, is ubiquitous in both natural and man-made aquatic environments. The distribution of Legionella genotypes within clinical strains is significantly different from that found in environmental strains. Developing novel genotypic methods that offer the ability to distinguish clinical from environmental strains could help to focus on more relevant (virulent) Legionella species in control efforts. Mixed-genome microarray data can be used to perform a comparative-genome analysis of strain collections, and advanced statistical approaches, such as the Random Forest algorithm are available to process these data. Microarray analysis was performed on a collection of 222 Legionella pneumophila strains, which included patient-derived strains from notified cases in The Netherlands in the period 2002-2006 and the environmental strains that were collected during the source investigation for those patients within the Dutch National Legionella Outbreak Detection Programme. The Random Forest algorithm combined with a logistic regression model was used to select predictive markers and to construct a predictive model that could discriminate between strains from different origin: clinical or environmental. Four genetic markers were selected that correctly predicted 96% of the clinical strains and 66% of the environmental strains collected within the Dutch National Legionella Outbreak Detection Programme. The Random Forest algorithm is well suited for the development of prediction models that use mixed-genome microarray data to discriminate between Legionella strains from different origin. The identification of these predictive genetic markers could offer the possibility to identify virulence factors within the Legionella genome, which in the future may be implemented in the daily practice of controlling Legionella in the public health environment.

  2. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

  3. Clinical Actionability of Comprehensive Genomic Profiling for Management of Rare or Refractory Cancers

    PubMed Central

    Hirshfield, Kim M.; Tolkunov, Denis; Zhong, Hua; Ali, Siraj M.; Stein, Mark N.; Murphy, Susan; Vig, Hetal; Vazquez, Alexei; Glod, John; Moss, Rebecca A.; Belyi, Vladimir; Chan, Chang S.; Chen, Suzie; Goodell, Lauri; Foran, David; Yelensky, Roman; Palma, Norma A.; Sun, James X.; Miller, Vincent A.; Stephens, Philip J.; Ross, Jeffrey S.; Kaufman, Howard; Poplin, Elizabeth; Mehnert, Janice; Tan, Antoinette R.; Bertino, Joseph R.; Aisner, Joseph; DiPaola, Robert S.

    2016-01-01

    Background. The frequency with which targeted tumor sequencing results will lead to implemented change in care is unclear. Prospective assessment of the feasibility and limitations of using genomic sequencing is critically important. Methods. A prospective clinical study was conducted on 100 patients with diverse-histology, rare, or poor-prognosis cancers to evaluate the clinical actionability of a Clinical Laboratory Improvement Amendments (CLIA)-certified, comprehensive genomic profiling assay (FoundationOne), using formalin-fixed, paraffin-embedded tumors. The primary objectives were to assess utility, feasibility, and limitations of genomic sequencing for genomically guided therapy or other clinical purpose in the setting of a multidisciplinary molecular tumor board. Results. Of the tumors from the 92 patients with sufficient tissue, 88 (96%) had at least one genomic alteration (average 3.6, range 0–10). Commonly altered pathways included p53 (46%), RAS/RAF/MAPK (rat sarcoma; rapidly accelerated fibrosarcoma; mitogen-activated protein kinase) (45%), receptor tyrosine kinases/ligand (44%), PI3K/AKT/mTOR (phosphatidylinositol-4,5-bisphosphate 3-kinase; protein kinase B; mammalian target of rapamycin) (35%), transcription factors/regulators (31%), and cell cycle regulators (30%). Many low frequency but potentially actionable alterations were identified in diverse histologies. Use of comprehensive profiling led to implementable clinical action in 35% of tumors with genomic alterations, including genomically guided therapy, diagnostic modification, and trigger for germline genetic testing. Conclusion. Use of targeted next-generation sequencing in the setting of an institutional molecular tumor board led to implementable clinical action in more than one third of patients with rare and poor-prognosis cancers. Major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access. 
Early and serial sequencing in the clinical course and expanded access to genomically guided early-phase clinical trials and targeted agents may increase actionability. Implications for Practice: Identification of key factors that facilitate use of genomic tumor testing results and implementation of genomically guided therapy may lead to enhanced benefit for patients with rare or difficult to treat cancers. Clinical use of a targeted next-generation sequencing assay in the setting of an institutional molecular tumor board led to implementable clinical action in over one third of patients with rare and poor prognosis cancers. The major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access both on trial and off label. Approaches to increase actionability include early and serial sequencing in the clinical course and expanded access to genomically guided early phase clinical trials and targeted agents. PMID:27566247

  4. Precision medicine in oncology: New practice models and roles for oncology pharmacists.

    PubMed

    Walko, Christine; Kiel, Patrick J; Kolesar, Jill

    2016-12-01

    Three different precision medicine practice models developed by oncology pharmacists are described, including strategies for implementation and recommendations for educating the next generation of oncology pharmacy practitioners. Oncology is unique in that somatic mutations can both drive the development of a tumor and serve as a therapeutic target for treating the cancer. Precision medicine practice models are a forum through which interprofessional teams, including pharmacists, discuss tumor somatic mutations to guide patient-specific treatment. The University of Wisconsin, Indiana University, and Moffitt Cancer Center have implemented precision medicine practice models developed and led by oncology pharmacists. Different practice models, including a clinic, a clinical consultation service, and a molecular tumor board (MTB), were adopted to enhance integration into health systems and payment structures. Although the practice models vary, commonalities of the three models include leadership by the clinical pharmacist, specific therapeutic recommendations, procurement of medications for off-label use, and a research component. These three practice models function as interprofessional training sites for pharmacy and medical students and residents, providing an important training resource at these institutions. Key implementation strategies include interprofessional involvement, institutional support, integration into clinical workflow, and selection of model by payer mix. MTBs are a pathway for clinical implementation of genomic medicine in oncology and are an emerging practice model for oncology pharmacists. Because pharmacists must be prepared to participate fully in contemporary practice, oncology pharmacy residents must be trained in genomic oncology, schools of pharmacy should expand precision medicine and genomics education, and opportunities for continuing education in precision medicine should be made available to practicing pharmacists. 
Copyright © 2016 by the American Society of Health-System Pharmacists, Inc. All rights reserved.

  5. Implementing Genome-Driven Oncology

    PubMed Central

    Hyman, David M.; Taylor, Barry S.; Baselga, José

    2017-01-01

    Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282

  6. Genotype Imputation with Thousands of Genomes

    PubMed Central

    Howie, Bryan; Marchini, Jonathan; Stephens, Matthew

    2011-01-01

    Genotype imputation is a statistical technique that is often used to increase the power and resolution of genetic association studies. Imputation methods work by using haplotype patterns in a reference panel to predict unobserved genotypes in a study dataset, and a number of approaches have been proposed for choosing subsets of reference haplotypes that will maximize accuracy in a given study population. These panel selection strategies become harder to apply and interpret as sequencing efforts like the 1000 Genomes Project produce larger and more diverse reference sets, which led us to develop an alternative framework. Our approach is built around a new approximation that uses local sequence similarity to choose a custom reference panel for each study haplotype in each region of the genome. This approximation makes it computationally efficient to use all available reference haplotypes, which allows us to bypass the panel selection step and to improve accuracy at low-frequency variants by capturing unexpected allele sharing among populations. Using data from HapMap 3, we show that our framework produces accurate results in a wide range of human populations. We also use data from the Malaria Genetic Epidemiology Network (MalariaGEN) to provide recommendations for imputation-based studies in Africa. We demonstrate that our approximation improves efficiency in large, sequence-based reference panels, and we discuss general computational strategies for modern reference datasets. Genome-wide association studies will soon be able to harness the power of thousands of reference genomes, and our work provides a practical way for investigators to use this rich information. New methodology from this study is implemented in the IMPUTE2 software package. PMID:22384356
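
    The panel-selection idea in this abstract (choosing a custom reference panel for each study haplotype by local sequence similarity) can be illustrated with a toy sketch. This is not the IMPUTE2 algorithm, which embeds the selection step in a hidden Markov model; the function names, the use of Hamming distance over typed sites, and the averaging step are assumptions made purely for illustration, and the data are invented.

```python
import numpy as np

def select_custom_panel(study_hap, ref_typed, k):
    """Return indices of the k reference haplotypes closest to study_hap.

    Closeness is the Hamming distance over the sites typed in the study
    (hypothetical helper; real software handles this inside an HMM).
    """
    study_hap = np.asarray(study_hap)
    ref_typed = np.asarray(ref_typed)
    distances = (ref_typed != study_hap).sum(axis=1)
    # Stable sort so that ties are broken by reference order.
    return np.argsort(distances, kind="stable")[:k]

def impute_untyped(ref_untyped, panel_idx):
    """Toy prediction: mean allele dosage of the chosen panel at untyped sites."""
    return np.asarray(ref_untyped)[panel_idx].mean(axis=0)

# Invented data: 4 reference haplotypes, 5 typed sites, 2 untyped sites.
ref_typed = np.array([
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
])
ref_untyped = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
])
study = np.array([0, 0, 1, 1, 0])

panel = select_custom_panel(study, ref_typed, k=2)
dosage = impute_untyped(ref_untyped, panel)
print(panel, dosage)
```

    Restricting each prediction to the most similar haplotypes is what keeps the cost manageable as the reference set grows, which is the efficiency argument the abstract makes.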

  7. Accuracy of genomic prediction in switchgrass ( Panicum virgatum L.) improved by accounting for linkage disequilibrium

    DOE PAGES

    Ramstein, Guillaume P.; Evans, Joseph; Kaeppler, Shawn M.; ...

    2016-02-11

    Switchgrass is a relatively high-yielding and environmentally sustainable biomass crop, but further genetic gains in biomass yield must be achieved to make it an economically viable bioenergy feedstock. Genomic selection (GS) is an attractive technology to generate rapid genetic gains in switchgrass, and meet the goals of a substantial displacement of petroleum use with biofuels in the near future. In this study, we empirically assessed prediction procedures for genomic selection in two different populations, consisting of 137 and 110 half-sib families of switchgrass, tested in two locations in the United States for three agronomic traits: dry matter yield, plant height, and heading date. Marker data were produced for the families’ parents by exome capture sequencing, generating up to 141,030 polymorphic markers with available genomic-location and annotation information. We evaluated prediction procedures that varied not only by learning schemes and prediction models, but also by the way the data were preprocessed to account for redundancy in marker information. More complex genomic prediction procedures were generally not significantly more accurate than the simplest procedure, likely due to limited population sizes. Nevertheless, a highly significant gain in prediction accuracy was achieved by transforming the marker data through a marker correlation matrix. Our results suggest that marker-data transformations and, more generally, the account of linkage disequilibrium among markers, offer valuable opportunities for improving prediction procedures in GS. Furthermore, some of the achieved prediction accuracies should motivate implementation of GS in switchgrass breeding programs.

  8. Accuracy of genomic prediction in switchgrass ( Panicum virgatum L.) improved by accounting for linkage disequilibrium

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramstein, Guillaume P.; Evans, Joseph; Kaeppler, Shawn M.

    Switchgrass is a relatively high-yielding and environmentally sustainable biomass crop, but further genetic gains in biomass yield must be achieved to make it an economically viable bioenergy feedstock. Genomic selection (GS) is an attractive technology to generate rapid genetic gains in switchgrass, and meet the goals of a substantial displacement of petroleum use with biofuels in the near future. In this study, we empirically assessed prediction procedures for genomic selection in two different populations, consisting of 137 and 110 half-sib families of switchgrass, tested in two locations in the United States for three agronomic traits: dry matter yield, plant height, and heading date. Marker data were produced for the families’ parents by exome capture sequencing, generating up to 141,030 polymorphic markers with available genomic-location and annotation information. We evaluated prediction procedures that varied not only by learning schemes and prediction models, but also by the way the data were preprocessed to account for redundancy in marker information. More complex genomic prediction procedures were generally not significantly more accurate than the simplest procedure, likely due to limited population sizes. Nevertheless, a highly significant gain in prediction accuracy was achieved by transforming the marker data through a marker correlation matrix. Our results suggest that marker-data transformations and, more generally, the account of linkage disequilibrium among markers, offer valuable opportunities for improving prediction procedures in GS. Furthermore, some of the achieved prediction accuracies should motivate implementation of GS in switchgrass breeding programs.

  9. Differential positive selection of malaria resistance genes in three indigenous populations of Peninsular Malaysia.

    PubMed

    Liu, Xuanyao; Yunus, Yushimah; Lu, Dongsheng; Aghakhanian, Farhang; Saw, Woei-Yuh; Deng, Lian; Ali, Mohammad; Wang, Xu; Nor, Fadzilah Mohd; Ghazali, Fadzilah; Rahman, Thuhairah Abdul; Shaari, Shahrul Azlin; Salleh, Mohd Zaki; Phipps, Maude E; Ong, Rick Twee-Hee; Xu, Shuhua; Teo, Yik-Ying; Hoh, Boon-Peng

    2015-04-01

    The indigenous populations from Peninsular Malaysia, locally known as Orang Asli, continue to adopt an agro-subsistence nomadic lifestyle, residing primarily within natural jungle habitats. Leading a hunter-gatherer lifestyle in a tropical jungle environment, the Orang Asli are routinely exposed to malaria. Here we surveyed the genetic architecture of individuals from four Orang Asli tribes with high-density genotyping across more than 2.5 million polymorphisms. These tribes reside in different geographical locations in Peninsular Malaysia and belong to three main ethno-linguistic groups, where there is minimal interaction between the tribes. We first dissect the genetic diversity and admixture between the tribes and with neighboring urban populations. Later, by implementing five metrics, we investigated the genome-wide signatures for positive natural selection of these Orang Asli, respectively. Finally, we searched for evidence of genomic adaptation to the pressure of malaria infection. We observed that different evolutionary responses might have emerged in the different Orang Asli communities to mitigate malaria infection.

  10. Signatures of selection in tilapia revealed by whole genome resequencing.

    PubMed

    Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua

    2015-09-16

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that help accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10 to 100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.

  11. MendeLIMS: a web-based laboratory information management system for clinical genome sequencing.

    PubMed

    Grimes, Susan M; Ji, Hanlee P

    2014-08-27

    Large clinical genomics studies using next generation DNA sequencing require the ability to select and track samples from a large population of patients through many experimental steps. With the number of clinical genome sequencing studies increasing, it is critical to maintain adequate laboratory information management systems to manage the thousands of patient samples that are subject to this type of genetic analysis. To meet the needs of clinical population studies using genome sequencing, we developed a web-based laboratory information management system (LIMS) with a flexible configuration that is adaptable to continuously evolving experimental protocols of next generation DNA sequencing technologies. Our system is referred to as MendeLIMS, is easily implemented with open source tools and is also highly configurable and extensible. MendeLIMS has been invaluable in the management of our clinical genome sequencing studies. We maintain a publicly available demonstration version of the application for evaluation purposes at http://mendelims.stanford.edu. MendeLIMS is programmed in Ruby on Rails (RoR) and accesses data stored in SQL-compliant relational databases. Software is freely available for non-commercial use at http://dna-discovery.stanford.edu/software/mendelims/.

  12. designGG: an R-package and web tool for the optimal design of genetical genomics experiments.

    PubMed

    Li, Yang; Swertz, Morris A; Vera, Gonzalo; Fu, Jingyuan; Breitling, Rainer; Jansen, Ritsert C

    2009-06-18

    High-dimensional biomolecular profiling of genetically different individuals in one or more environmental conditions is an increasingly popular strategy for exploring the functioning of complex biological systems. The optimal design of such genetical genomics experiments in a cost-efficient and effective way is not trivial. This paper presents designGG, an R package for designing optimal genetical genomics experiments. A web implementation for designGG is available at http://gbic.biol.rug.nl/designGG. All software, including source code and documentation, is freely available. DesignGG allows users to intelligently select and allocate individuals to experimental units and conditions such as drug treatment. The user can maximize the power and resolution of detecting genetic, environmental and interaction effects in a genome-wide or local mode by giving more weight to genome regions of special interest, such as previously detected phenotypic quantitative trait loci. This will help to achieve high power and more accurate estimates of the effects of interesting factors, and thus yield a more reliable biological interpretation of data. DesignGG is applicable to linkage analysis of experimental crosses, e.g. recombinant inbred lines, as well as to association analysis of natural populations.

  13. Self-guided management of exome and whole-genome sequencing results: changing the results return model.

    PubMed

    Yu, Joon-Ho; Jamal, Seema M; Tabor, Holly K; Bamshad, Michael J

    2013-09-01

    Researchers and clinicians face the practical and ethical challenge of whether and how to offer for return the wide and varied scope of results available from individual exome sequencing and whole-genome sequencing. We argue that, rather than viewing individual exome sequencing and whole-genome sequencing as a test for which results need to be "returned," the technology should instead be framed as a dynamic resource of information from which results should be "managed" over the lifetime of an individual. We further suggest that individual exome sequencing and whole-genome sequencing results management is optimized using a self-guided approach that enables individuals to self-select among results offered for return in a convenient, confidential, personalized context that is responsive to their value system. This approach respects autonomy, allows individuals to maximize potential benefits of genomic information (beneficence) and minimize potential harms (nonmaleficence), and also preserves their right to an open future to the extent they desire or think is appropriate. We describe key challenges and advantages of such a self-guided management system and offer guidance on implementation using an information systems approach.

  14. Genomic selection for the improvement of meat quality in beef.

    PubMed

    Pimentel, E C G; König, S

    2012-10-01

    Selection index theory was used to compare different selection strategies aiming at the improvement of meat quality in beef cattle. Alternative strategies were compared with a reference scenario with three basic traits in the selection index: BW at 200 d (W200) and 400 d (W400) and muscling score (MUSC). These traits resemble the combination currently used in the German national beef genetic evaluation system. Traits in the breeding goal were defined as the 3 basic traits plus marbling score (MARB), to depict a situation where an established breeding program currently selecting for growth and carcass yield intends to incorporate meat quality in its selection program. Economic weights were either the same for all 4 traits, or doubled or tripled for MARB. Two additional selection criteria for improving MARB were considered: Live animal intramuscular fat content measured by ultrasound (UIMF) as an indicator trait and a genomic breeding value (GEBV) for the target trait directly (gMARB). Results were used to estimate the required number of genotyped animals in an own calibration set for implementing genomic selection focusing on meat quality. Adding UIMF to the basic index increased the overall genetic gain per generation by 15% when the economic weight on MARB was doubled and by 44% when it was tripled. When a genomic breeding value for marbling could be estimated with an accuracy of 0.5, adding gMARB to the index provided larger genetic gain than adding UIMF. Greatest genetic gain per generation was obtained with the scenario containing GEBV for 4 traits (gW200, gW400, gMUSC, and gMARB) when the accuracies of these GEBV were ≥0.7. Adding UIMF to the index substantially improved response to selection for MARB, which switched from negative to positive when the economic weight on MARB was doubled or tripled. 
For all scenarios that contained gMARB in the selection index, the response to selection in MARB was positive for all relative economic weights on MARB, when the accuracy of GEBV was >0.7. Results indicated that setting up a calibration set of ∼500 genotyped animals with carcass phenotypes for MARB could suffice to obtain a larger response to selection than measuring UIMF. If the size of the calibration set is ∼2,500, adding the ultrasound trait to an index containing already the GEBV would bring little benefit, unless the relative economic weight for marbling is much larger than for the other traits.
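
    The selection index theory this abstract relies on can be sketched with the classical index weights b = P^-1 G a, where P is the phenotypic covariance matrix of the index traits, G the genetic covariance between index traits and breeding-goal traits, and a the economic weights. The sketch below is a generic two-trait illustration; every number in it is hypothetical and none is taken from the study.

```python
import numpy as np

# Hypothetical two-trait example; the index traits equal the goal traits,
# so the same genetic covariance matrix serves as both G and C.
P = np.array([[1.0, 0.1],    # phenotypic (co)variances of index traits
              [0.1, 1.0]])
G = np.array([[0.3, 0.1],    # genetic (co)variances
              [0.1, 0.4]])
C = G                        # genetic (co)variances among goal traits
a = np.array([1.0, 2.0])     # economic weights, second trait doubled

# Index weights: solve P b = G a rather than inverting P explicitly.
b = np.linalg.solve(P, G @ a)

var_index = b @ P @ b        # variance of the index I = b'x
var_goal = a @ C @ a         # variance of the aggregate genotype H = a'g
accuracy = np.sqrt(var_index / var_goal)   # correlation between I and H

# Response in H per unit of selection intensity is the index SD.
response_per_intensity = np.sqrt(var_index)
print(b, accuracy, response_per_intensity)
```

    Adding a further selection criterion (an ultrasound indicator trait or a genomic breeding value, as in the abstract) amounts to adding a row and column to P and a row to G, then recomputing b and the accuracy.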

  15. SNP marker discovery, linkage map construction and identification of QTLs for enhanced salinity tolerance in field pea (Pisum sativum L.)

    PubMed Central

    2013-01-01

    Background: Field pea (Pisum sativum L.) is a self-pollinating, diploid, cool-season food legume. Crop production is constrained by multiple biotic and abiotic stress factors, including salinity, that cause reduced growth and yield. Recent advances in genomics have permitted the development of low-cost high-throughput genotyping systems, allowing the construction of saturated genetic linkage maps for identification of quantitative trait loci (QTLs) associated with traits of interest. Genetic markers in close linkage with the relevant genomic regions may then be implemented in varietal improvement programs. Results: In this study, single nucleotide polymorphism (SNP) markers associated with expressed sequence tags (ESTs) were developed and used to generate comprehensive linkage maps for field pea. From a set of 36,188 variant nucleotide positions detected through in silico analysis, 768 were selected for genotyping of a recombinant inbred line (RIL) population. A total of 705 SNPs (91.7%) successfully detected segregating polymorphisms. In addition to SNPs, genomic and EST-derived simple sequence repeats (SSRs) were assigned to the genetic map in order to obtain an evenly distributed genome-wide coverage. Sequences associated with the mapped molecular markers were used for comparative genomic analysis with other legume species. Higher levels of conserved synteny were observed with the genomes of Medicago truncatula Gaertn. and chickpea (Cicer arietinum L.) than with soybean (Glycine max [L.] Merr.), Lotus japonicus L. and pigeon pea (Cajanus cajan [L.] Millsp.). Parents and RIL progeny were screened at the seedling growth stage for responses to salinity stress, imposed by addition of NaCl in the watering solution at a concentration of 18 dS m-1. Salinity-induced symptoms showed normal distribution, and the severity of the symptoms increased over time. QTLs for salinity tolerance were identified on linkage groups Ps III and VII, with flanking SNP markers suitable for selection of resistant cultivars. Comparison of sequences underpinning these SNP markers to the M. truncatula genome defined genomic regions containing candidate genes associated with saline stress tolerance. Conclusion: The SNP assays and associated genetic linkage maps developed in this study permitted identification of salinity tolerance QTLs and candidate genes. This constitutes an important set of tools for marker-assisted selection (MAS) programs aimed at performance enhancement of field pea cultivars. PMID:24134188

  16. SNP marker discovery, linkage map construction and identification of QTLs for enhanced salinity tolerance in field pea (Pisum sativum L.).

    PubMed

    Leonforte, Antonio; Sudheesh, Shimna; Cogan, Noel O I; Salisbury, Philip A; Nicolas, Marc E; Materne, Michael; Forster, John W; Kaur, Sukhjiwan

    2013-10-17

    Field pea (Pisum sativum L.) is a self-pollinating, diploid, cool-season food legume. Crop production is constrained by multiple biotic and abiotic stress factors, including salinity, that cause reduced growth and yield. Recent advances in genomics have permitted the development of low-cost high-throughput genotyping systems, allowing the construction of saturated genetic linkage maps for identification of quantitative trait loci (QTLs) associated with traits of interest. Genetic markers in close linkage with the relevant genomic regions may then be implemented in varietal improvement programs. In this study, single nucleotide polymorphism (SNP) markers associated with expressed sequence tags (ESTs) were developed and used to generate comprehensive linkage maps for field pea. From a set of 36,188 variant nucleotide positions detected through in silico analysis, 768 were selected for genotyping of a recombinant inbred line (RIL) population. A total of 705 SNPs (91.7%) successfully detected segregating polymorphisms. In addition to SNPs, genomic and EST-derived simple sequence repeats (SSRs) were assigned to the genetic map in order to obtain an evenly distributed genome-wide coverage. Sequences associated with the mapped molecular markers were used for comparative genomic analysis with other legume species. Higher levels of conserved synteny were observed with the genomes of Medicago truncatula Gaertn. and chickpea (Cicer arietinum L.) than with soybean (Glycine max [L.] Merr.), Lotus japonicus L. and pigeon pea (Cajanus cajan [L.] Millsp.). Parents and RIL progeny were screened at the seedling growth stage for responses to salinity stress, imposed by addition of NaCl in the watering solution at a concentration of 18 dS m-1. Salinity-induced symptoms showed normal distribution, and the severity of the symptoms increased over time. 
QTLs for salinity tolerance were identified on linkage groups Ps III and VII, with flanking SNP markers suitable for selection of resistant cultivars. Comparison of sequences underpinning these SNP markers to the M. truncatula genome defined genomic regions containing candidate genes associated with saline stress tolerance. The SNP assays and associated genetic linkage maps developed in this study permitted identification of salinity tolerance QTLs and candidate genes. This constitutes an important set of tools for marker-assisted selection (MAS) programs aimed at performance enhancement of field pea cultivars.

  17. Conservatism and novelty in the genetic architecture of adaptation in Heliconius butterflies.

    PubMed

    Huber, B; Whibley, A; Poul, Y L; Navarro, N; Martin, A; Baxter, S; Shah, A; Gilles, B; Wirth, T; McMillan, W O; Joron, M

    2015-05-01

    Understanding the genetic architecture of adaptive traits has been at the centre of modern evolutionary biology since Fisher; however, evaluating how the genetic architecture of ecologically important traits influences their diversification has been hampered by the scarcity of empirical data. Now, high-throughput genomics facilitates the detailed exploration of variation in the genome-to-phenotype map among closely related taxa. Here, we investigate the evolution of wing pattern diversity in Heliconius, a clade of neotropical butterflies that have undergone an adaptive radiation for wing-pattern mimicry and are influenced by distinct selection regimes. Using crosses between natural wing-pattern variants, we used genome-wide restriction site-associated DNA (RAD) genotyping, traditional linkage mapping and multivariate image analysis to study the evolution of the architecture of adaptive variation in two closely related species: Heliconius hecale and H. ismenius. We implemented a new morphometric procedure for the analysis of whole-wing pattern variation, which allows visualising spatial heatmaps of genotype-to-phenotype association for each quantitative trait locus separately. We used the H. melpomene reference genome to fine-map variation for each major wing-patterning region uncovered, evaluated the role of candidate genes and compared genetic architectures across the genus. Our results show that, although the loci responding to mimicry selection are highly conserved between species, their effect size and phenotypic action vary throughout the clade. Multilocus architecture is ancestral and maintained across species under directional selection, whereas the single-locus (supergene) inheritance controlling polymorphism in H. numata appears to have evolved only once. Nevertheless, the conservatism in the wing-patterning toolkit found throughout the genus does not appear to constrain phenotypic evolution towards local adaptive optima.

  18. Localization of canine brachycephaly using an across breed mapping approach.

    PubMed

    Bannasch, Danika; Young, Amy; Myers, Jeffrey; Truvé, Katarina; Dickinson, Peter; Gregg, Jeffrey; Davis, Ryan; Bongcam-Rudloff, Eric; Webster, Matthew T; Lindblad-Toh, Kerstin; Pedersen, Niels

    2010-03-10

    The domestic dog, Canis familiaris, exhibits profound phenotypic diversity and is an ideal model organism for the genetic dissection of simple and complex traits. However, some of the most interesting phenotypes are fixed in particular breeds and are therefore less tractable to genetic analysis using classical segregation-based mapping approaches. We implemented an across breed mapping approach using a moderately dense SNP array, a low number of animals and breeds carefully selected for the phenotypes of interest to identify genetic variants responsible for breed-defining characteristics. Using a modest number of affected (10-30) and control (20-60) samples from multiple breeds, the correct chromosomal assignment was identified in a proof of concept experiment using three previously defined loci; hyperuricosuria, white spotting and chondrodysplasia. Genome-wide association was performed in a similar manner for one of the most striking morphological traits in dogs: brachycephalic head type. Although candidate gene approaches based on comparable phenotypes in mice and humans have been utilized for this trait, the causative gene has remained elusive using this method. Samples from nine affected breeds and thirteen control breeds identified strong genome-wide associations for brachycephalic head type on Cfa 1. Two independent datasets identified the same genomic region. Levels of relative heterozygosity in the associated region indicate that it has been subjected to a selective sweep, consistent with it being a breed defining morphological characteristic. Genotyping additional dogs in the region confirmed the association. To date, the genetic structure of dog breeds has primarily been exploited for genome wide association for segregating traits. These results demonstrate that non-segregating traits under strong selection are equally tractable to genetic analysis using small sample numbers.

  19. Integrating genomic selection into dairy cattle breeding programmes: a review.

    PubMed

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. 
However, recent simulation studies have shown that putting constraints on genomic inbreeding rates for defining optimal contributions of breeding animals could significantly reduce achievable genetic gain. Finally, the article summarizes the potential of genomic selection to include new traits in the breeding goal to meet societal demands regarding animal health and environmental efficiency in animal production.
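
    The claim that genomic selection can roughly double genetic gain through a shorter generation interval follows from the standard expression for annual gain, intensity x accuracy x genetic SD / generation interval. The sketch below uses a single selection pathway and hypothetical parameter values chosen only to illustrate the trade-off; real dairy schemes combine several pathways, and none of these numbers comes from the review.

```python
def annual_genetic_gain(intensity, accuracy, sigma_a, generation_interval):
    """Expected genetic gain per year for a single selection pathway."""
    return intensity * accuracy * sigma_a / generation_interval

# Hypothetical values: progeny testing is accurate but slow, whereas
# genomic selection of young sires is less accurate but much faster.
conventional = annual_genetic_gain(intensity=1.8, accuracy=0.90,
                                   sigma_a=1.0, generation_interval=6.0)
genomic = annual_genetic_gain(intensity=1.8, accuracy=0.75,
                              sigma_a=1.0, generation_interval=2.5)

ratio = genomic / conventional
print(conventional, genomic, ratio)
```

    Under these assumed values the loss in accuracy is more than offset by the shorter interval, which is the mechanism the abstract describes.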

  20. Vive la résistance: genome-wide selection against introduced alleles in invasive hybrid zones

    USGS Publications Warehouse

    Kovach, Ryan P.; Hand, Brian K.; Hohenlohe, Paul A.; Cosart, Ted F.; Boyer, Matthew C.; Neville, Helen H.; Muhlfeld, Clint C.; Amish, Stephen J.; Carim, Kellie; Narum, Shawn R.; Lowe, Winsor H.; Allendorf, Fred W.; Luikart, Gordon

    2016-01-01

    Evolutionary and ecological consequences of hybridization between native and invasive species are notoriously complicated because patterns of selection acting on non-native alleles can vary throughout the genome and across environments. Rapid advances in genomics now make it feasible to assess locus-specific and genome-wide patterns of natural selection acting on invasive introgression within and among natural populations occupying diverse environments. We quantified genome-wide patterns of admixture across multiple independent hybrid zones of native westslope cutthroat trout and invasive rainbow trout, the world's most widely introduced fish, by genotyping 339 individuals from 21 populations using 9380 species-diagnostic loci. A significantly greater proportion of the genome appeared to be under selection favouring native cutthroat trout (rather than rainbow trout), and this pattern was pervasive across the genome (detected on most chromosomes). Furthermore, selection against invasive alleles was consistent across populations and environments, even in those where rainbow trout were predicted to have a selective advantage (warm environments). These data corroborate field studies showing that hybrids between these species have lower fitness than the native taxa, and show that these fitness differences are due to selection favouring many native genes distributed widely throughout the genome.

  1. Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.

    PubMed

    Jiménez-Montero, J A; González-Recio, O; Alenda, R

    2013-01-01

    The aim of this study was to evaluate methods for genomic evaluation of the Spanish Holstein population as an initial step toward the implementation of routine genomic evaluations. This study provides a description of the population structure of progeny tested bulls in Spain at the genomic level and compares different genomic evaluation methods with regard to accuracy and bias. Two bayesian linear regression models, Bayes-A and Bayesian-LASSO (B-LASSO), as well as a machine learning algorithm, Random-Boosting (R-Boost), and BLUP using a realized genomic relationship matrix (G-BLUP), were compared. Five traits that are currently under selection in the Spanish Holstein population were used: milk yield, fat yield, protein yield, fat percentage, and udder depth. In total, genotypes from 1859 progeny tested bulls were used. The training sets were composed of bulls born before 2005; including 1601 bulls for production and 1574 bulls for type, whereas the testing sets contained 258 and 235 bulls born in 2005 or later for production and type, respectively. Deregressed proofs (DRP) from January 2009 Interbull (Uppsala, Sweden) evaluation were used as the dependent variables for bulls in the training sets, whereas DRP from the December 2011 DRPs Interbull evaluation were used to compare genomic predictions with progeny test results for bulls in the testing set. Genomic predictions were more accurate than traditional pedigree indices for predicting future progeny test results of young bulls. The gain in accuracy, due to inclusion of genomic data varied by trait and ranged from 0.04 to 0.42 Pearson correlation units. Results averaged across traits showed that B-LASSO had the highest accuracy with an advantage of 0.01, 0.03 and 0.03 points in Pearson correlation compared with R-Boost, Bayes-A, and G-BLUP, respectively. 
The B-LASSO predictions also showed the least bias (0.02, 0.03 and 0.10 SD units less than Bayes-A, R-Boost and G-BLUP, respectively) as measured by mean difference between genomic predictions and progeny test results. The R-Boosting algorithm provided genomic predictions with regression coefficients closer to unity, which is an alternative measure of bias, for 4 out of 5 traits and also resulted in mean squared errors estimates that were 2%, 10%, and 12% smaller than B-LASSO, Bayes-A, and G-BLUP, respectively. The observed prediction accuracy obtained with these methods was within the range of values expected for a population of similar size, suggesting that the prediction method and reference population described herein are appropriate for implementation of routine genome-assisted evaluations in Spanish dairy cattle. R-Boost is a competitive marker regression methodology in terms of predictive ability that can accommodate large data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
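
    The validation statistics named in this abstract (Pearson correlation as accuracy, mean difference and regression slope as bias measures, and mean squared error) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the choice to regress observed values on predictions, and the unscaled mean difference (the abstract reports it in SD units) are assumptions.

```python
import math

def validation_metrics(predicted, observed):
    """Compare genomic predictions with later progeny-test results.

    Hypothetical helper: both inputs are equal-length lists of floats.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Centered sums of squares and cross-products
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    return {
        # Accuracy: Pearson correlation between predictions and observations
        "pearson_r": sxy / math.sqrt(sxx * syy),
        # Bias as the raw mean difference (not scaled to SD units here)
        "mean_diff": mean_p - mean_o,
        # Slope of observed on predicted (assumed direction); 1.0 means no inflation
        "slope": sxy / sxx,
        # Mean squared error of prediction
        "mse": sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n,
    }
```

    A slope close to 1.0 together with a small mean difference would correspond to the low-bias behaviour the abstract describes.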

  2. Simultaneous gene finding in multiple genomes.

    PubMed

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP-hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/ Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  3. Artificial selection increased body weight but induced increase of runs of homozygosity in Hanwoo cattle

    PubMed Central

    Kim, Kwondo; Jung, Jaehoon; Caetano-Anollés, Kelsey; Sung, Samsun; Yoo, DongAhn; Choi, Bong-Hwan; Kim, Hyung-Chul; Jeong, Jin-Young; Cho, Yong-Min; Park, Eung-Woo; Choi, Tae-Jeong; Park, Byoungho; Lim, Dajeong

    2018-01-01

    Artificial selection has been demonstrated to have a rapid and significant effect on the phenotype and genome of an organism. However, most previous studies on artificial selection have focused solely on genomic sequences modified by artificial selection or genomic sequences associated with a specific trait. In this study, we generated whole genome sequencing data of 126 cattle under artificial selection, and 24,973,862 single nucleotide variants, to investigate the relationship among artificial selection, genomic sequences and trait. Using runs of homozygosity detected from the variants, we showed an increase in inbreeding over decades, and at the same time demonstrated a small influence of recent inbreeding on body weight. We also identified a ~0.2 Mb run-of-homozygosity segment that may have been created by recent artificial selection. This approach may aid in the development of genetic markers directly influenced by artificial selection, and provide insight into the process of artificial selection. PMID:29561881

  4. A combined strategy involving Sanger and 454 pyrosequencing increases genomic resources to aid in the management of reproduction, disease control and genetic selection in the turbot (Scophthalmus maximus).

    PubMed

    Ribas, Laia; Pardo, Belén G; Fernández, Carlos; Alvarez-Diós, José Antonio; Gómez-Tato, Antonio; Quiroga, María Isabel; Planas, Josep V; Sitjà-Bobadilla, Ariadna; Martínez, Paulino; Piferrer, Francesc

    2013-03-15

    Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Currently available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to understand better reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database ("Turbot 2 database") was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences ("Turbot 3 database"), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR, FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50-90% of the total gene count of each pathway.
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs.

  5. A combined strategy involving Sanger and 454 pyrosequencing increases genomic resources to aid in the management of reproduction, disease control and genetic selection in the turbot (Scophthalmus maximus)

    PubMed Central

    2013-01-01

    Background: Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Currently available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to understand better reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Results: Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database (“Turbot 2 database”) was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences (“Turbot 3 database”), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR, FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50–90% of the total gene count of each pathway.
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. Conclusions: The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs. PMID:23497389

  6. Short communication: Implementation of a breeding value for heat tolerance in Australian dairy cattle.

    PubMed

    Nguyen, Thuy T T; Bowman, Phil J; Haile-Mariam, Mekonnen; Nieuwhof, Gert J; Hayes, Benjamin J; Pryce, Jennie E

    2017-09-01

    Excessive ambient temperature and humidity can impair milk production and fertility of dairy cows. Selection for heat-tolerant animals is one possible option to mitigate the effects of heat stress. To enable selection for this trait, we describe the development of a heat tolerance breeding value for Australian dairy cattle. We estimated the direct genomic values of decline in milk, fat, and protein yield per unit increase of temperature-humidity index (THI) using 46,726 single nucleotide polymorphisms and a reference population of 2,236 sires and 11,853 cows for Holsteins and 506 sires and 4,268 cows for Jerseys. This new direct genomic value is the Australian genomic breeding value for heat tolerance (HT ABVg). The components of the HT ABVg are the decline in milk, fat, and protein per unit increase in THI when THI increases above the threshold of 60. These components are weighted by their respective economic values, assumed to be equivalent to the weights applied to milk, fat, and protein yield in the Australian selection indices. Within each breed, the HT ABVg is then standardized to have a mean of 100 and standard deviation (SD) of 5, which is consistent with the presentation of breeding values for many other traits in Australia. The HT ABVg ranged from -4 to +3 SD in Holsteins and -3 to +4 SD in Jerseys. The mean reliabilities of HT ABVg among validation sires, calculated from the prediction error variance and additive genetic variance, were 38% in both breeds. The range in ABVg and their reliability suggests that HT can be improved using genomic selection. There has been a deterioration in the genetic trend of HT, and to moderate the decline it is suggested that the HT ABVg should be included in a multitrait economic index with other traits that contribute to farm profit. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
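
    The index construction described in this abstract (economic weighting of the three decline components, then standardization to a mean of 100 and SD of 5 within breed) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the example weights, the use of the population SD, and the sign convention are hypothetical and are not taken from the paper.

```python
import statistics

def standardized_index(declines, weights, target_mean=100.0, target_sd=5.0):
    """Weight per-animal decline components and rescale within one breed.

    declines: list of (milk, fat, protein) direct genomic values per animal.
    weights:  economic weights for the three components (hypothetical values).
    """
    # Economic-weighted sum of the three components for each animal
    raw = [sum(w * d for w, d in zip(weights, animal)) for animal in declines]
    mu = statistics.mean(raw)
    sigma = statistics.pstdev(raw)  # population SD; an assumption
    # Linear rescaling to the target mean and SD
    return [target_mean + target_sd * (x - mu) / sigma for x in raw]
```

    On this scale an animal at 110 would sit 2 SD above the breed mean, which is how the reported ranges of -4 to +3 SD and -3 to +4 SD would be read.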

  7. Genetic counselling in the era of genomic medicine

    PubMed Central

    Middleton, Anna

    2018-01-01

    Background: Genomic technology can now deliver cost effective, targeted diagnosis and treatment for patients. Genetic counselling is a communication process empowering patients and families to make autonomous decisions and effectively use new genetic information. The skills of genetic counselling and expertise of genetic counsellors are integral to the effective implementation of genomic medicine. Sources of data: Original papers, reviews, guidelines, policy papers and web-resources. Areas of agreement: An international consensus on the definition of genetic counselling. Genetic counselling is necessary for implementation of genomic medicine. Areas of controversy: Models of genetic counselling. Growing points: Genomic medicine is a growing and strategic priority for many health care systems. Genetic counselling is part of this. Areas timely for developing research: An evidence base is necessary, incorporating implementation and outcome research, to enable health care systems, practitioners, patients and families to maximize the utility (medically and psychologically) of the new genomic possibilities. PMID:29617718

  8. GWASinlps: Nonlocal prior based iterative SNP selection tool for genome-wide association studies.

    PubMed

    Sanyal, Nilotpal; Lo, Min-Tzu; Kauppi, Karolina; Djurovic, Srdjan; Andreassen, Ole A; Johnson, Valen E; Chen, Chi-Hua

    2018-06-19

    Multiple marker analysis of genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultra high-dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to a large number of false positives, whereas Bayesian methods are computationally very expensive. Motivated to ameliorate these issues simultaneously, we consider the novel approach of using nonlocal priors in an iterative variable selection framework. We develop a variable selection method, named iterative nonlocal prior based selection for GWAS, or GWASinlps, that combines, in an iterative variable selection framework, the computational efficiency of the screen-and-select approach based on some association learning and the parsimonious uncertainty quantification provided by the use of nonlocal priors. The hallmark of our method is the introduction of a 'structured screen-and-select' strategy that considers hierarchical screening, which is based not only on response-predictor associations but also on response-response associations, and concatenates variable selection within that hierarchy. Extensive simulation studies with SNPs having realistic linkage disequilibrium structures demonstrate the advantages of our computationally efficient method compared to several frequentist and Bayesian variable selection methods, in terms of true positive rate, false discovery rate, mean squared error, and effect size estimation error. Further, we provide empirical power analysis useful for study design. Finally, a real GWAS data application was considered with human height as phenotype. An R-package for implementing the GWASinlps method is available at https://cran.r-project.org/web/packages/GWASinlps/index.html. Supplementary data are available at Bioinformatics online.

  9. Signatures of selection in tilapia revealed by whole genome resequencing

    PubMed Central

    Hong Xia, Jun; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Yi Wan, Zi; Li, Jiale; Lin, Haoran; Hua Yue, Gen

    2015-01-01

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that help accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10–100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia. PMID:26373374

  10. High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

    PubMed Central

    2011-01-01

    Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%.
The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434

  11. Genome-Wide Analysis of Grain Yield Stability and Environmental Interactions in a Multiparental Soybean Population.

    PubMed

    Xavier, Alencar; Jarquin, Diego; Howard, Reka; Ramasubramanian, Vishnu; Specht, James E; Graef, George L; Beavis, William D; Diers, Brian W; Song, Qijian; Cregan, Perry B; Nelson, Randall; Mian, Rouf; Shannon, J Grover; McHale, Leah; Wang, Dechun; Schapaugh, William; Lorenz, Aaron J; Xu, Shizhong; Muir, William M; Rainey, Katy M

    2018-02-02

    Genetic improvement toward optimized and stable agronomic performance of soybean genotypes is desirable for food security. Understanding how genotypes perform in different environmental conditions helps breeders develop sustainable cultivars adapted to target regions. Complex traits of importance are known to be controlled by a large number of genomic regions with small effects whose magnitude and direction are modulated by environmental factors. Knowledge of the constraints and undesirable effects resulting from genotype by environmental interactions is a key objective in improving selection procedures in soybean breeding programs. In this study, the genetic basis of soybean grain yield responsiveness to environmental factors was examined in a large soybean nested association population. To this end, a genome-wide association analysis of performance stability estimates generated from a Finlay-Wilkinson analysis was implemented, including the interaction between marker genotypes and environmental factors. Genomic footprints were investigated by analysis and meta-analysis using a recently published multiparent model. Results indicated that specific soybean genomic regions were associated with stability, and that multiplicative interactions were present between environments and genetic background. Seven genomic regions in six chromosomes were identified as being associated with genotype-by-environment interactions. This study provides insight into genomic assisted breeding aimed at achieving a more stable agronomic performance of soybean, and documented opportunities to exploit genomic regions that were specifically associated with interactions involving environments and subpopulations. Copyright © 2018 Xavier et al.
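
    The stability estimates mentioned in this abstract come from a Finlay-Wilkinson analysis, which in its simplest form regresses each genotype's yield on an environmental index. The sketch below shows only that textbook regression on balanced data. It is not the authors' pipeline, and the function name and data layout are assumptions.

```python
def finlay_wilkinson_slopes(yields):
    """Return one regression slope per genotype.

    yields: dict mapping genotype -> list of yields, one per environment,
            in the same environment order and with no missing values.
    """
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])
    # Environmental index: mean yield of all genotypes in each environment,
    # centered on the grand mean
    env_means = [
        sum(yields[g][j] for g in genotypes) / len(genotypes)
        for j in range(n_env)
    ]
    grand_mean = sum(env_means) / n_env
    index = [m - grand_mean for m in env_means]
    sxx = sum(e * e for e in index)
    slopes = {}
    for g in genotypes:
        g_mean = sum(yields[g]) / n_env
        sxy = sum(e * (y - g_mean) for e, y in zip(index, yields[g]))
        # Slope near 1: average responsiveness; below 1: more stable;
        # above 1: more responsive to favourable environments
        slopes[g] = sxy / sxx
    return slopes
```

    In a study like this one, per-genotype slopes of this kind would then serve as the phenotype in a genome-wide association scan.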

  12. Bayesian variable selection for post-analytic interrogation of susceptibility loci.

    PubMed

    Chen, Siying; Nunez, Sara; Reilly, Muredach P; Foulkes, Andrea S

    2017-06-01

    Understanding the complex interplay among protein coding genes and regulatory elements requires rigorous interrogation with analytic tools designed for discerning the relative contributions of overlapping genomic regions. To this aim, we offer a novel application of Bayesian variable selection (BVS) for classifying genomic class level associations using existing large meta-analysis summary level resources. This approach is applied using the expectation maximization variable selection (EMVS) algorithm to typed and imputed SNPs across 502 protein coding genes (PCGs) and 220 long intergenic non-coding RNAs (lncRNAs) that overlap 45 known loci for coronary artery disease (CAD) using publicly available Global Lipids Genetics Consortium (GLGC) (Teslovich et al., 2010; Willer et al., 2013) meta-analysis summary statistics for low-density lipoprotein cholesterol (LDL-C). The analysis reveals 33 PCGs and three lncRNAs across 11 loci with >50% posterior probabilities for inclusion in an additive model of association. The findings are consistent with previous reports, while providing some new insight into the architecture of LDL-cholesterol to be investigated further. As genomic taxonomies continue to evolve, additional classes, such as enhancer elements and splicing regions, can easily be layered into the proposed analysis framework. Moreover, application of this approach to alternative publicly available meta-analysis resources, or more generally as a post-analytic strategy to further interrogate regions that are identified through single point analysis, is straightforward. All coding examples are implemented in R version 3.2.1 and provided as supplemental material. © 2016, The International Biometric Society.

  13. Signatures of selection in five Italian cattle breeds detected by a 54K SNP panel.

    PubMed

    Mancini, Giordano; Gargani, Maria; Chillemi, Giovanni; Nicolazzi, Ezequiel Luis; Marsan, Paolo Ajmone; Valentini, Alessio; Pariset, Lorraine

    2014-02-01

    In this study we used a medium density panel of SNP markers to perform population genetic analysis in five Italian cattle breeds. The BovineSNP50 BeadChip was used to genotype a total of 2,935 bulls of Piedmontese, Marchigiana, Italian Holstein, Italian Brown and Italian Pezzata Rossa breeds. To determine a genome-wide pattern of positive selection we mapped the Fst values against genome location. The highest Fst peaks were obtained on BTA6 and BTA13 where some candidate genes are located. We identified selection signatures peculiar to each breed which suggest selection for genes involved in milk or meat traits. The genetic structure was investigated by using a multidimensional scaling of the genetic distance matrix and a Bayesian approach implemented in the STRUCTURE software. The genotyping data showed a clear partitioning of the cattle genetic diversity into distinct breeds if a number of clusters equal to the number of populations was given. Assuming a lower number of clusters, beef breeds group together. Both methods showed all five breeds separated in well defined clusters and the Bayesian approach assigned individuals to the breed of origin. The work is of interest not only because it enriches the knowledge on the process of evolution but also because the results generated could have implications for selective breeding programs.
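
    Mapping Fst against genome location starts from a per-SNP Fst value. The abstract does not say which estimator was used, so the sketch below assumes the basic Wright definition, (H_T - H_S) / H_T, with equally weighted breeds. The function name and inputs are hypothetical.

```python
def fst_per_snp(freqs):
    """Wright's Fst for one biallelic SNP.

    freqs: allele frequency of the same allele in each breed
           (breeds are weighted equally; an assumption).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity in the pooled population
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within breeds
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    if h_t == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    return (h_t - h_s) / h_t
```

    Plotting these values along each chromosome, often after window smoothing, would give the kind of peak profile described for BTA6 and BTA13.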

  14. Genome-wide evidence for divergent selection between populations of a major agricultural pathogen.

    PubMed

    Hartmann, Fanny E; McDonald, Bruce A; Croll, Daniel

    2018-06-01

    The genetic and environmental homogeneity in agricultural ecosystems is thought to impose strong and uniform selection pressures. However, the impact of this selection on plant pathogen genomes remains largely unknown. We aimed to identify the proportion of the genome and the specific gene functions under positive selection in populations of the fungal wheat pathogen Zymoseptoria tritici. First, we performed genome scans in four field populations that were sampled from different continents and on distinct wheat cultivars to test which genomic regions are under recent selection. Based on extended haplotype homozygosity and composite likelihood ratio tests, we identified 384 and 81 selective sweeps affecting 4% and 0.5% of the 35 Mb core genome, respectively. We found differences both in the number and the position of selective sweeps across the genome between populations. Using a XtX-based outlier detection approach, we identified 51 extremely divergent genomic regions between the allopatric populations, suggesting that divergent selection led to locally adapted pathogen populations. We performed an outlier detection analysis between two sympatric populations infecting two different wheat cultivars to identify evidence for host-driven selection. Selective sweep regions harboured genes that are likely to play a role in successfully establishing host infections. We also identified secondary metabolite gene clusters and an enrichment in genes encoding transporter and protein localization functions. The latter gene functions mediate responses to environmental stress, including interactions with the host. The distinct gene functions under selection indicate that both local host genotypes and abiotic factors contributed to local adaptation. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  15. Genomic signatures of positive selection in humans and the limits of outlier approaches.

    PubMed

    Kelley, Joanna L; Madeoy, Jennifer; Calhoun, John C; Swanson, Willie; Akey, Joshua M

    2006-08-01

    Identifying regions of the human genome that have been targets of positive selection will provide important insights into recent human evolutionary history and may facilitate the search for complex disease genes. However, the confounding effects of population demographic history and selection on patterns of genetic variation complicate inferences of selection when a small number of loci are studied. To this end, identifying outlier loci from empirical genome-wide distributions of genetic variation is a promising strategy to detect targets of selection. Here, we evaluate the power and efficiency of a simple outlier approach and describe a genome-wide scan for positive selection using a dense catalog of 1.58 million SNPs that were genotyped in three human populations. In total, we analyzed 14,589 genes, 385 of which possess patterns of genetic variation consistent with the hypothesis of positive selection. Furthermore, several extended genomic regions were found, spanning >500 kb, that contained multiple contiguous candidate selection genes. More generally, these data provide important practical insights into the limits of outlier approaches in genome-wide scans for selection, provide strong candidate selection genes to study in greater detail, and may have important implications for disease related research.
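
    A simple outlier approach of the kind evaluated here ranks loci by a summary statistic and flags those in the extreme tail of the empirical genome-wide distribution. The sketch below illustrates the idea only. The function name, the upper-tail convention, and the default cutoff are assumptions and do not reflect the thresholds used in the study.

```python
def empirical_outliers(stat_by_locus, tail=0.01):
    """Return loci in the upper `tail` fraction of the empirical distribution.

    stat_by_locus: dict mapping locus name -> summary statistic
                   (for example a per-gene differentiation measure).
    """
    # Rank loci from the most to the least extreme value
    ranked = sorted(stat_by_locus.items(), key=lambda kv: kv[1], reverse=True)
    # Keep at least one locus even for very small inputs
    n_keep = max(1, int(len(ranked) * tail))
    return [locus for locus, _ in ranked[:n_keep]]
```

    Because the cutoff is relative, this procedure always returns candidates, even when no locus is under selection, which is one of the limits of outlier approaches that the abstract alludes to.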

  16. Selections that isolate recombinant mitochondrial genomes in animals

    PubMed Central

    Ma, Hansong; O'Farrell, Patrick H

    2015-01-01

    Homologous recombination is widespread and catalyzes evolution. Nonetheless, its existence in animal mitochondrial DNA is questioned. We designed selections for recombination between co-resident mitochondrial genomes in various heteroplasmic Drosophila lines. In four experimental settings, recombinant genomes became the sole or dominant genome in the progeny. Thus, selection uncovers occurrence of homologous recombination in Drosophila mtDNA and documents its functional benefit. Double-strand breaks enhanced recombination in the germline and revealed somatic recombination. When the recombination partner was a diverged Drosophila melanogaster genome or a genome from a different species such as Drosophila yakuba, sequencing revealed long continuous stretches of exchange. In addition, the distribution of sequence polymorphisms in recombinants allowed us to map a selected trait to a particular region in the Drosophila mitochondrial genome. Thus, recombination can be harnessed to dissect function and evolution of mitochondrial genome. DOI: http://dx.doi.org/10.7554/eLife.07247.001 PMID:26237110

  17. Spiked GBS: A unified, open platform for single marker genotyping and whole-genome profiling

    USDA-ARS's Scientific Manuscript database

    In plant breeding, there are two primary applications for DNA markers in selection: 1) selection of known genes using a single marker assay (marker-assisted selection; MAS); and 2) whole-genome profiling and prediction (genomic selection; GS). Typically, marker platforms have addressed only one of t...

  18. Effect of Artificial Selection on Runs of Homozygosity in U.S. Holstein Cattle

    PubMed Central

    Kim, Eui-Soo; Cole, John B.; Huson, Heather; Wiggans, George R.; Van Tassell, Curtis P.; Crooker, Brian A.; Liu, George; Da, Yang; Sonstegard, Tad S.

    2013-01-01

    The intensive selection programs for milk made possible by mass artificial insemination increased the similarity among the genomes of North American (NA) Holsteins tremendously since the 1960s. This migration of elite alleles has caused certain regions of the genome to have runs of homozygosity (ROH) occasionally spanning millions of continuous base pairs at a specific locus. In this study, genome signatures of artificial selection in NA Holsteins born between 1953 and 2008 were identified by comparing changes in ROH between three distinct groups under different selective pressure for milk production. The ROH regions were also used to estimate the inbreeding coefficients. The comparisons of genomic autozygosity between groups selected or unselected since 1964 for milk production revealed significant differences with respect to overall ROH frequency and distribution. These results indicate selection has increased overall autozygosity across the genome, whereas the autozygosity in an unselected line has not changed significantly across most of the chromosomes. In addition, ROH distribution was more variable across the genomes of selected animals in comparison to a more even ROH distribution for unselected animals. Further analysis of genome-wide autozygosity changes and the association between traits and haplotypes identified more than 40 genomic regions under selection on several chromosomes (Chr) including Chr 2, 7, 16 and 20. Many of these selection signatures corresponded to quantitative trait loci for milk, fat, and protein yield previously found in contemporary Holsteins. PMID:24348915
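
    Detecting runs of homozygosity and converting them to an inbreeding coefficient can be sketched as follows. This is a deliberately simplified scan that tolerates no heterozygous or missing calls. The function names and the thresholds are illustrative assumptions, not the criteria used in the study.

```python
def find_roh(snps, min_snps=3, min_length_bp=1_000_000):
    """Find runs of consecutive homozygous SNPs on one chromosome.

    snps: list of (position_bp, is_homozygous), sorted by position.
    Returns a list of (start_bp, end_bp) tuples.
    """
    runs = []
    start, last, count = None, None, 0

    def close_run():
        # Keep the current run only if it meets both thresholds
        if start is not None and count >= min_snps and last - start >= min_length_bp:
            runs.append((start, last))

    for pos, hom in snps:
        if hom:
            if start is None:
                start, count = pos, 0
            count += 1
            last = pos
        else:
            close_run()
            start, last, count = None, None, 0
    close_run()  # a run may extend to the end of the chromosome
    return runs

def f_roh(runs, genome_length_bp):
    """Inbreeding coefficient: fraction of the genome covered by ROH."""
    return sum(end - start for start, end in runs) / genome_length_bp
```

    Comparing this coverage fraction between selected and unselected lines is the type of contrast the abstract reports.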

  19. Initiative for Molecular Profiling and Advanced Cancer Therapy and challenges in the implementation of precision medicine.

    PubMed

    Tsimberidou, Apostolia-Maria

    In the last decade, breakthroughs in technology have improved our understanding of genomic, transcriptional, proteomic, epigenetic aberrations and immune mechanisms in carcinogenesis. Genomics and model systems have enabled the validation of novel therapeutic strategies. Based on these developments, in 2007, we initiated the IMPACT (Initiative for Molecular Profiling and Advanced Cancer Therapy) study, the first personalized medicine program for patients with advanced cancer at The University of Texas MD Anderson Cancer Center. We demonstrated that in patients referred for Phase I clinical trials, the use of tumor molecular profiling and treatment with matched targeted therapy was associated with encouraging rates of response, progression-free survival and overall survival compared to non-matched therapy. We are currently conducting IMPACT2, a randomized study evaluating molecular profiling and targeted agents in patients with metastatic cancer. Optimization of innovative biomarker-driven clinical trials that include targeted therapy and/or immunotherapeutic approaches for carefully selected patients will accelerate the development of novel drugs and the implementation of precision medicine. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Parallel altitudinal clines reveal trends in adaptive evolution of genome size in Zea mays

    PubMed Central

    Berg, Jeremy J.; Birchler, James A.; Grote, Mark N.; Lorant, Anne; Quezada, Juvenal

    2018-01-01

    While the vast majority of genome size variation in plants is due to differences in repetitive sequence, we know little about how selection acts on repeat content in natural populations. Here we investigate parallel changes in intraspecific genome size and repeat content of domesticated maize (Zea mays) landraces and their wild relative teosinte across altitudinal gradients in Mesoamerica and South America. We combine genotyping, low coverage whole-genome sequence data, and flow cytometry to test for evidence of selection on genome size and individual repeat abundance. We find that population structure alone cannot explain the observed variation, implying that clinal patterns of genome size are maintained by natural selection. Our modeling additionally provides evidence of selection on individual heterochromatic knob repeats, likely due to their large individual contribution to genome size. To better understand the phenotypes driving selection on genome size, we conducted a growth chamber experiment using a population of highland teosinte exhibiting extensive variation in genome size. We find weak support for a positive correlation between genome size and cell size, but stronger support for a negative correlation between genome size and the rate of cell production. Reanalyzing published data of cell counts in maize shoot apical meristems, we then identify a negative correlation between cell production rate and flowering time. Together, our data suggest a model in which variation in genome size is driven by natural selection on flowering time across altitudinal clines, connecting intraspecific variation in repetitive sequence to important differences in adaptive phenotypes. PMID:29746459

  1. PREMIX: PRivacy-preserving EstiMation of Individual admiXture.

    PubMed

    Chen, Feng; Dow, Michelle; Ding, Sijie; Lu, Yao; Jiang, Xiaoqian; Tang, Hua; Wang, Shuang

    2016-01-01

    In this paper we propose a framework, PRivacy-preserving EstiMation of Individual admiXture (PREMIX), that uses Intel Software Guard Extensions (SGX). SGX is a suite of software and hardware architectures that enables efficient and secure computation over confidential data. PREMIX enables multiple sites to securely collaborate on estimating individual admixture within a secure enclave inside Intel SGX. We implemented a feature selection module to identify the most discriminative single nucleotide polymorphisms (SNPs) based on informativeness, and an Expectation Maximization (EM)-based maximum likelihood estimator to identify the individual admixture. Experimental results based on both simulated and 1000 Genomes data demonstrated the efficiency and accuracy of the proposed framework. PREMIX ensures a high level of security, as all operations on sensitive genomic data are conducted within a secure enclave using SGX.

  2. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    USDA-ARS's Scientific Manuscript database

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  3. Genomes as geography: using GIS technology to build interactive genome feature maps

    PubMed Central

    Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J

    2006-01-01

    Background: Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results: We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. Conclusion: Mapping tools developed specifically for geographic data can be exploited to display, explore and interact with genome data. The approach we describe here is organism independent and is equally useful for linear and circular chromosomes. One of the unique capabilities of GenoSIS compared to existing genome browsers is the capacity to generate genome feature maps dynamically in response to complex attribute and spatial queries. PMID:16984652

  4. Assessing genomic selection prediction accuracy in a dynamic barley breeding

    USDA-ARS's Scientific Manuscript database

    Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, however validation based on prog...

  5. Genome-enabled prediction models for yield related traits in chickpea

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) unlike marker-assisted backcrossing (MABC) predicts breeding values of lines using genome-wide marker profiling and allows selection of lines prior to field-phenotyping, thereby shortening the breeding cycle. A collection of 320 elite breeding lines was selected and phenotyped...

  6. Measuring genomic pre-selection in theory and in practice

    USDA-ARS's Scientific Manuscript database

    Potential biases from genomic pre-selection were estimated from actual selection and mating patterns of US Holsteins. Traditional models using only phenotypes and pedigrees do not adjust for average genomic merit of an animal’s parents, progeny, mates, or contemporaries. Positive assortative mating ...

  7. Developing genome-wide microsatellite markers of bamboo and their applications on molecular marker assisted taxonomy for accessions in the genus Phyllostachys.

    PubMed

    Zhao, Hansheng; Yang, Li; Peng, Zhenhua; Sun, Huayu; Yue, Xianghua; Lou, Yongfeng; Dong, Lili; Wang, Lili; Gao, Zhimin

    2015-01-26

    Morphology-based taxonomy that relies on reproductive organs is severely limited in bamboo, mainly owing to the infrequent and unpredictable flowering events of bamboo. Here, we present the first genome-wide analysis and application of microsatellites based on the genome of moso bamboo (Phyllostachys edulis) to assist bamboo taxonomy. Of the 127,593 microsatellite repeat motifs identified, primers were designed for 1,451 microsatellites, and 1,098 markers were physically mapped on the genome of moso bamboo. A total of 917 markers were successfully validated in 9 accessions with ~39.8% polymorphic potential. From the validated microsatellite markers, 23 were selected for polymorphism analysis among 78 accessions, and 64 alleles were detected, an average of 2.78 alleles per primer pair. The clustering result indicated that the majority of the accessions were consistent with their current taxonomic classification, confirming the suitability and effectiveness of the developed microsatellite markers. Variation of the microsatellite markers among species was confirmed by sequencing, and in silico comparative genome mapping was investigated. Lastly, a bamboo microsatellite database (http://www.bamboogdb.org/ssr) was implemented for browsing and searching the large body of information on bamboo microsatellites. Consequently, our results on microsatellite marker development are valuable for assisting bamboo taxonomy and for genomic studies in bamboo and related grass species.

  8. Economic evaluation of genomic selection in small ruminants: a sheep meat breeding program.

    PubMed

    Shumbusho, F; Raoul, J; Astruc, J M; Palhiere, I; Lemarié, S; Fugeray-Scarbel, A; Elsen, J M

    2016-06-01

    Recent genomic evaluation studies using real data and predicting genetic gain by modeling breeding programs have reported moderate expected benefits from the replacement of classic selection schemes by genomic selection (GS) in small ruminants. The objectives of this study were to compare the cost, monetary genetic gain and economic efficiency of classic selection and GS schemes in the meat sheep industry. Deterministic methods were used to model selection based on multi-trait indices from a sheep meat breeding program. Decisional variables related to male selection candidates and progeny testing were optimized to maximize the annual monetary genetic gain (AMGG), that is, a weighted sum of meat and maternal traits annual genetic gains. For GS, a reference population of 2000 individuals was assumed and genomic information was available for evaluation of male candidates only. In the classic selection scheme, male breeding values were estimated from own and offspring phenotypes. In GS, different scenarios were considered, differing by the information used to select males (genomic only, genomic+own performance, genomic+offspring phenotypes). The results showed that all GS scenarios were associated with higher total variable costs than classic selection (if the cost of genotyping was 123 euros/animal). In terms of AMGG and economic returns, GS scenarios were found to be superior to classic selection only if genomic information was combined with their own meat phenotypes (GS-Pheno) or with their progeny test information. The predicted economic efficiency, defined as returns (proportional to number of expressions of AMGG in the nucleus and commercial flocks) minus total variable costs, showed that the best GS scenario (GS-Pheno) was up to 15% more efficient than classic selection. For all selection scenarios, optimization increased the overall AMGG, returns and economic efficiency. In conclusion, our study shows that some forms of GS strategies are more advantageous than classic selection, provided that GS is already initiated (i.e. the initial reference population is available). Optimizing decisional variables of the classic selection scheme could be of greater benefit than including genomic information in optimized designs.

  9. The i5k Workspace@NAL—enabling genomic data access, visualization and curation of arthropod genomes

    PubMed Central

    Poelchau, Monica; Childers, Christopher; Moore, Gary; Tsavatapalli, Vijaya; Evans, Jay; Lee, Chien-Yueh; Lin, Han; Lin, Jun-Wei; Hackett, Kevin

    2015-01-01

    The 5000 arthropod genomes initiative (i5k) has tasked itself with coordinating the sequencing of 5000 insect or related arthropod genomes. The resulting influx of data, mostly from small research groups or communities with little bioinformatics experience, will require visualization, dissemination and curation, preferably from a centralized platform. The National Agricultural Library (NAL) has implemented the i5k Workspace@NAL (http://i5k.nal.usda.gov/) to help meet the i5k initiative's genome hosting needs. Any i5k member is encouraged to contact the i5k Workspace with their genome project details. Once submitted, new content will be accessible via organism pages, genome browsers and BLAST search engines, which are implemented via the open-source Tripal framework, a web interface for the underlying Chado database schema. We also implement the Web Apollo software for groups that choose to curate gene models. New content will add to the existing body of 35 arthropod species, which include species relevant for many aspects of arthropod genomic research, including agriculture, invasion biology, systematics, ecology and evolution, and developmental research. PMID:25332403

  10. PSP: rapid identification of orthologous coding genes under positive selection across multiple closely related prokaryotic genomes.

    PubMed

    Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2013-12-27

    With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaptation to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, designated PSP (Positive Selection analysis for Prokaryotic genomes), for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at multiple levels through assignment and enrichment of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to perform genome-scale analysis of evolutionary selection across multiple prokaryotic genomes rapidly and easily, and to identify the genes undergoing positive selection, which may play key roles in host-pathogen interactions and/or environmental adaptation.

  11. The Undergraduate Training in Genomics (UTRIG) Initiative: early & active training for physicians in the genomic medicine era.

    PubMed

    Wilcox, Rebecca L; Adem, Patricia V; Afshinnekoo, Ebrahim; Atkinson, James B; Burke, Leah W; Cheung, Hoiwan; Dasgupta, Shoumita; DeLaGarza, Julia; Joseph, Loren; LeGallo, Robin; Lew, Madelyn; Lockwood, Christina M; Meiss, Alice; Norman, Jennifer; Markwood, Priscilla; Rizvi, Hasan; Shane-Carson, Kate P; Sobel, Mark E; Suarez, Eric; Tafe, Laura J; Wang, Jason; Haspel, Richard L

    2018-05-01

    Genomic medicine is transforming patient care. However, the speed of development has left a knowledge gap between discovery and effective implementation into clinical practice. Since 2010, the Training Residents in Genomics (TRIG) Working Group has found success in building a rigorous genomics curriculum with implementation tools aimed at pathology residents in postgraduate training years 1-4. Based on the TRIG model, the interprofessional Undergraduate Training in Genomics (UTRIG) Working Group was formed. Under the aegis of the Undergraduate Medical Educators Section of the Association of Pathology Chairs and representation from nine additional professional societies, UTRIG's collaborative goal is building medical student genomic literacy through development of a ready-to-use genomics curriculum. Key elements to the UTRIG curriculum are expert consensus-driven objectives, active learning methods, rigorous assessment and integration.

  12. Practical implementation of cost-effective genomic selection in commercial pig breeding using imputation.

    PubMed

    Cleveland, M A; Hickey, J M

    2013-08-01

    Genomic selection can be implemented in pig breeding at a reduced cost using genotype imputation. Accuracy of imputation and the impact on resulting genomic breeding values (gEBV) was investigated. High-density genotype data was available for 4,763 animals from a single pig line. Three low-density genotype panels were constructed with SNP densities of 450 (L450), 3,071 (L3k) and 5,963 (L6k). Accuracy of imputation was determined using 184 test individuals with no genotyped descendants in the data but with parents and grandparents genotyped using the Illumina PorcineSNP60 Beadchip. Alternative genotyping scenarios were created in which parents, grandparents, and individuals that were not direct ancestors of test animals (Other) were genotyped at high density (S1), grandparents were not genotyped (S2), dams and granddams were not genotyped (S3), and dams and granddams were genotyped at low density (S4). Four additional scenarios were created by excluding Other animal genotypes. Test individuals were always genotyped at low density. Imputation was performed with AlphaImpute. Genomic breeding values were calculated using the single-step genomic evaluation. Test animals were evaluated for the information retained in the gEBV, calculated as the correlation between gEBV using imputed genotypes and gEBV using true genotypes. Accuracy of imputation was high for all scenarios but decreased with fewer SNP on the low-density panel (0.995 to 0.965 for S1) and with reduced genotyping of ancestors, where the largest changes were for L450 (0.965 in S1 to 0.914 in S3). Exclusion of genotypes for Other animals resulted in only small accuracy decreases. Imputation accuracy was not consistent across the genome. Information retained in the gEBV was related to genotyping scenario and thus to imputation accuracy. Reducing the number of SNP on the low-density panel reduced the information retained in the gEBV, with the largest decrease observed from L3k to L450. Excluding Other animal genotypes had little impact on imputation accuracy but caused large decreases in the information retained in the gEBV. These results indicate that accuracy of gEBV from imputed genotypes depends on the level of genotyping in close relatives and the size of the genotyped dataset. Fewer high-density genotyped individuals are needed to obtain accurate imputation than are needed to obtain accurate gEBV. Strategies to optimize development of low-density panels can improve both imputation and gEBV accuracy.
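    The abstract above defines the information retained in the gEBV as the correlation between gEBV from imputed genotypes and gEBV from true genotypes. A minimal sketch of that calculation follows, using a hand-written Pearson correlation; the function name and the gEBV values are hypothetical and serve only to illustrate the formula, not to reproduce the study's results.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical gEBV for six test animals (illustration only).
gebv_true = [1.20, -0.35, 0.80, 0.05, -1.10, 0.45]     # from true genotypes
gebv_imputed = [1.05, -0.20, 0.90, 0.00, -0.95, 0.30]  # from imputed genotypes

information_retained = pearson(gebv_imputed, gebv_true)
print(f"information retained in gEBV: {information_retained:.3f}")
```

    The same function could be applied to true and imputed genotype codes to express imputation accuracy as a correlation, although the abstract does not state how that accuracy was computed.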

  13. Patient-Centered Precision Health In A Learning Health Care System: Geisinger's Genomic Medicine Experience.

    PubMed

    Williams, Marc S; Buchanan, Adam H; Davis, F Daniel; Faucett, W Andrew; Hallquist, Miranda L G; Leader, Joseph B; Martin, Christa L; McCormick, Cara Z; Meyer, Michelle N; Murray, Michael F; Rahm, Alanna K; Schwartz, Marci L B; Sturm, Amy C; Wagner, Jennifer K; Williams, Janet L; Willard, Huntington F; Ledbetter, David H

    2018-05-01

    Health care delivery is increasingly influenced by the emerging concepts of precision health and the learning health care system. Although not synonymous with precision health, genomics is a key enabler of individualized care. Delivering patient-centered, genomics-informed care based on individual-level data in the current national landscape of health care delivery is a daunting challenge. Problems to overcome include data generation, analysis, storage, and transfer; knowledge management and representation for patients and providers at the point of care; process management; and outcomes definition, collection, and analysis. Development, testing, and implementation of a genomics-informed program requires multidisciplinary collaboration and building the concepts of precision health into a multilevel implementation framework. Using the principles of a learning health care system provides a promising solution. This article describes the implementation of population-based genomic medicine in an integrated learning health care system: a working example of a precision health program.

  14. Genomic selection for quantitative adult plant stem rust resistance in wheat

    USDA-ARS's Scientific Manuscript database

    Quantitative adult plant resistance (APR) to stem rust (Puccinia graminis f. sp. tritici) is an important breeding target in wheat (Triticum aestivum L.) and a potential target for genomic selection (GS). To evaluate the relative importance of known APR loci in applying genomic selection, we charact...

  15. Increased prediction accuracy in wheat breeding trials using a marker x environment interaction genomic selection model

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) models use genome-wide genetic information to predict genetic values of candidates for selection. Originally these models were developed without considering genotype × environment interaction (GE). Several authors have proposed extensions of the canonical GS model that accomm...

  16. Array-Based Comparative Genomic Hybridization for the Genomewide Detection of Submicroscopic Chromosomal Abnormalities

    PubMed Central

    Vissers, Lisenka E. L. M.; de Vries, Bert B. A.; Osoegawa, Kazutoyo; Janssen, Irene M.; Feuth, Ton; Choy, Chik On; Straatman, Huub; van der Vliet, Walter; Huys, Erik H. L. P. G.; van Rijk, Anke; Smeets, Dominique; van Ravenswaaij-Arts, Conny M. A.; Knoers, Nine V.; van der Burgt, Ineke; de Jong, Pieter J.; Brunner, Han G.; van Kessel, Ad Geurts; Schoenmakers, Eric F. P. M.; Veltman, Joris A.

    2003-01-01

    Microdeletions and microduplications, not visible by routine chromosome analysis, are a major cause of human malformation and mental retardation. Novel high-resolution, whole-genome technologies can improve the diagnostic detection rate of these small chromosomal abnormalities. Array-based comparative genomic hybridization allows such a high-resolution screening by hybridizing differentially labeled test and reference DNAs to arrays consisting of thousands of genomic clones. In this study, we tested the diagnostic capacity of this technology using ∼3,500 fluorescent in situ hybridization–verified clones selected to cover the genome with an average of 1 clone per megabase (Mb). The sensitivity and specificity of the technology were tested in normal-versus-normal control experiments and through the screening of patients with known microdeletion syndromes. Subsequently, a series of 20 cytogenetically normal patients with mental retardation and dysmorphisms suggestive of a chromosomal abnormality were analyzed. In this series, three microdeletions and two microduplications were identified and validated. Two of these genomic changes were identified also in one of the parents, indicating that these are large-scale genomic polymorphisms. Deletions and duplications as small as 1 Mb could be reliably detected by our approach. The percentage of false-positive results was reduced to a minimum by use of a dye-swap-replicate analysis, all but eliminating the need for laborious validation experiments and facilitating implementation in a routine diagnostic setting. This high-resolution assay will facilitate the identification of novel genes involved in human mental retardation and/or malformation syndromes and will provide insight into the flexibility and plasticity of the human genome. PMID:14628292

  17. Genome-wide identification of allele-specific expression (ASE) in response to Marek's disease virus infection using next generation sequencing.

    PubMed

    Maceachern, Sean; Muir, William M; Crosby, Seth; Cheng, Hans H

    2011-06-03

    Marek's disease (MD), a T cell lymphoma induced by the highly oncogenic α-herpesvirus Marek's disease virus (MDV), is the main chronic infectious disease concern threatening the poultry industry. Enhancing genetic resistance to MD in commercial poultry is an attractive method to augment MD vaccination, which is currently the control method of choice. In order to optimally implement this control strategy through marker-assisted selection (MAS) and to gain biological information, it is necessary to identify specific genes that influence MD incidence. A genome-wide screen for allele-specific expression (ASE) in response to MDV infection was conducted. The highly inbred ADOL chicken lines 6 (MD resistant) and 7 (MD susceptible) were inter-mated in reciprocal crosses and half of the progeny were challenged with MDV. Splenic RNA pools for each treatment group at a single time point after infection were generated, sequenced using a next-generation sequencer, and then analyzed for ASE. To validate and extend the results, Illumina GoldenGate assays for selected cSNPs were developed and used on all RNA samples from all 6 time points following MDV challenge. RNA sequencing resulted in 11-13+ million mappable reads per treatment group, 1.7+ Gb total sequence, and 22,655 high-confidence cSNPs. Analysis of these cSNPs revealed that 5360 cSNPs in 3773 genes exhibited statistically significant allelic imbalance. Of the 1536 GoldenGate assays, 1465 were successfully scored, with all but 19 exhibiting evidence for allelic imbalance. ASE is an efficient method to identify potentially all or most of the genes influencing this complex trait. The identified cSNPs can be further evaluated in resource populations to determine their allelic direction and size of effect on genetic resistance to MD, as well as being directly implemented in genomic selection programs. The described method, although demonstrated in inbred chicken lines, is applicable to all traits in any diploid species, and should prove to be a simple method to identify the majority of genes controlling any complex trait.

  18. Repeated divergent selection on pigmentation genes in a rapid finch radiation

    PubMed Central

    Campagna, Leonardo; Repenning, Márcio; Silveira, Luís Fábio; Fontana, Carla Suertegaray; Tubaro, Pablo L.; Lovette, Irby J.

    2017-01-01

    Instances of recent and rapid speciation are suitable for associating phenotypes with their causal genotypes, especially if gene flow homogenizes areas of the genome that are not under divergent selection. We study a rapid radiation of nine sympatric bird species known as capuchino seedeaters, which are differentiated in sexually selected characters of male plumage and song. We sequenced the genomes of a phenotypically diverse set of species to search for differentiated genomic regions. Capuchinos show differences in a small proportion of their genomes, yet selection has acted independently on the same targets in different members of this radiation. Many divergent regions contain genes involved in the melanogenesis pathway, with the strongest signal originating from putative regulatory regions. Selection has acted on these same genomic regions in different lineages, likely shaping the evolution of cis-regulatory elements, which control how more conserved genes are expressed and thereby generate diversity in classically sexually selected traits. PMID:28560331

  19. Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax

    PubMed Central

    Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

    2018-01-01

    Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplication, accompanied by geographic and ecological isolation, promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphism in the genic regions. Genome-wide scanning for positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped the nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes. PMID:29670636

  20. Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax.

    PubMed

    Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

    2018-01-01

    Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplication, accompanied by geographic and ecological isolation, promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphism in the genic regions. Genome-wide scanning for positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped the nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes.

  1. Genomic signatures of selection at linked sites: unifying the disparity among species

    PubMed Central

    Cutter, Asher D.; Payseur, Bret A.

    2014-01-01

    Population genetics theory supplies powerful predictions about how natural selection interacts with genetic linkage to sculpt the genomic landscape of nucleotide polymorphism. Both the spread of beneficial mutations and removal of deleterious mutations act to depress polymorphism levels, especially in low-recombination regions. However, empiricists have documented extreme disparities among species. Here we characterize the dominant features that could drive variation in linked selection among species, including roles for selective sweeps being ‘hard’ or ‘soft’, and concealing by demography and genomic confounds. We advocate targeted studies of close relatives to unify our understanding of how selection and linkage interact to shape genome evolution. PMID:23478346

  2. Genomic selection & association mapping in rice: effect of trait genetic architecture, training population composition, marker number & statistical model on accuracy of rice genomic selection in elite, tropical rice breeding

    USDA-ARS's Scientific Manuscript database

    Genomic Selection (GS) is a new breeding method in which genome-wide markers are used to predict the breeding value of individuals in a breeding population. GS has been shown to improve breeding efficiency in dairy cattle and several crop plant species, and here we evaluate for the first time its ef...

  3. Imputation of unordered markers and the impact on genomic selection accuracy

    USDA-ARS's Scientific Manuscript database

    Genomic selection, a breeding method that promises to accelerate rates of genetic gain, requires dense, genome-wide marker data. Genotyping-by-sequencing can generate a large number of de novo markers. However, without a reference genome, these markers are unordered and typically have a large propo...

  4. The development of genomics applied to dairy breeding

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) has profoundly changed dairy cattle breeding in the last decade and can be defined as the use of genomic breeding values (GEBV) in selection programs. The GEBV is the sum of the effects of dense DNA markers across the whole genome, capturing all the quantitative trait loci (QT...
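    The definition above (a GEBV as the sum of marker effects across the genome) can be illustrated with a minimal sketch. It assumes genotypes coded as allele dosages (0, 1 or 2) and marker effects that have already been estimated elsewhere; the function name and the numbers are hypothetical and are not taken from the record.

```python
def gebv(dosages, marker_effects):
    """Genomic breeding value as the sum of dosage x effect over all markers.

    dosages        -- allele counts (0, 1 or 2) for one animal, one per marker
    marker_effects -- previously estimated additive effect of each marker
    """
    if len(dosages) != len(marker_effects):
        raise ValueError("one effect is needed per genotyped marker")
    return sum(g * a for g, a in zip(dosages, marker_effects))


# Illustrative values only: three markers, made-up effects.
example = gebv([2, 1, 0], [0.5, -0.25, 1.0])
```

    In practice the effects come from a statistical model fitted on a reference population; this sketch covers only the final summation step.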

  5. Estimation of genomic breeding values for milk yield in UK dairy goats.

    PubMed

    Mucha, S; Mrode, R; MacLaren-Lee, I; Coffey, M; Conington, J

    2015-11-01

    The objective of this study was to estimate genomic breeding values for milk yield in crossbred dairy goats. The research was based on data provided by 2 commercial goat farms in the UK comprising 590,409 milk yield records on 14,453 dairy goats kidding between 1987 and 2013. The population was created by crossing 3 breeds: Alpine, Saanen, and Toggenburg. In each generation the best performing animals were selected for breeding, and as a result, a synthetic breed was created. The pedigree file contained 30,139 individuals, of which 2,799 were founders. The data set contained test-day records of milk yield, lactation number, farm, age at kidding, and year and season of kidding. Data on milk composition was unavailable. In total 1,960 animals were genotyped with the Illumina 50K caprine chip. Two methods for estimation of genomic breeding value were compared: BLUP at the single nucleotide polymorphism level (BLUP-SNP) and single-step BLUP. The highest accuracy of 0.61 was obtained with single-step BLUP, and the lowest (0.36) with BLUP-SNP. Linkage disequilibrium (r², the squared correlation of the alleles at 2 loci) at 50 kb (distance between 2 SNP) was 0.18. This is the first attempt to implement genomic selection in UK dairy goats. Results indicate that the single-step method provides the highest accuracy for populations with a small number of genotyped individuals, where the number of genotyped males is low and females are predominant in the reference population. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
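    The linkage disequilibrium measure quoted in this abstract, the squared correlation of the alleles at 2 loci, can be sketched as follows. The sketch assumes phased haplotypes coded 0/1 at each locus, which is a simplification: the study worked from SNP chip genotypes, and its exact estimation procedure is not described in the record. The function name is hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """Squared correlation between alleles at two loci.

    hap_a, hap_b -- equal-length lists of 0/1 alleles, one entry per
                    phased haplotype, for locus A and locus B.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("need the same non-zero number of haplotypes at both loci")
    p_a = sum(hap_a) / n                                   # allele frequency at A
    p_b = sum(hap_b) / n                                   # allele frequency at B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    return d * d / denom


complete_ld = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])   # alleles always co-occur
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])         # alleles are independent
```

    Averaging such pairwise values over SNP pairs binned by physical distance gives the kind of summary reported here (0.18 at 50 kb).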

  6. Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

    PubMed

    Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

    2014-12-01

    Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.

  7. Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle.

    PubMed

    Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L

    2016-12-01

    Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. 
Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.

  8. Breeding and Genetics Symposium: networks and pathways to guide genomic selection.

    PubMed

    Snelling, W M; Cushman, R A; Keele, J W; Maltecca, C; Thomas, M G; Fortes, M R S; Reverter, A

    2013-02-01

    Many traits affecting profitability and sustainability of meat, milk, and fiber production are polygenic, with no single gene having an overwhelming influence on observed variation. No knowledge of the specific genes controlling these traits has been needed to make substantial improvement through selection. Significant gains have been made through phenotypic selection enhanced by pedigree relationships and continually improving statistical methodology. Genomic selection, recently enabled by assays for dense SNP located throughout the genome, promises to increase selection accuracy and accelerate genetic improvement by emphasizing the SNP most strongly correlated to phenotype although the genes and sequence variants affecting phenotype remain largely unknown. These genomic predictions theoretically rely on linkage disequilibrium (LD) between genotyped SNP and unknown functional variants, but familial linkage may increase effectiveness when predicting individuals related to those in the training data. Genomic selection with functional SNP genotypes should be less reliant on LD patterns shared by training and target populations, possibly allowing robust prediction across unrelated populations. Although the specific variants causing polygenic variation may never be known with certainty, a number of tools and resources can be used to identify those most likely to affect phenotype. Associations of dense SNP genotypes with phenotype provide a 1-dimensional approach for identifying genes affecting specific traits; in contrast, associations with multiple traits allow defining networks of genes interacting to affect correlated traits. Such networks are especially compelling when corroborated by existing functional annotation and established molecular pathways. The SNP occurring within network genes, obtained from public databases or derived from genome and transcriptome sequences, may be classified according to expected effects on gene products. 
As illustrated by functionally informed genomic predictions being more accurate than naive whole-genome predictions of beef tenderness, coupling evidence from livestock genotypes, phenotypes, gene expression, and genomic variants with existing knowledge of gene functions and interactions may provide greater insight into the genes and genomic mechanisms affecting polygenic traits and facilitate functional genomic selection for economically important traits.

  9. Signatures of natural selection and ecological differentiation in microbial genomes.

    PubMed

    Shapiro, B Jesse

    2014-01-01

    We live in a microbial world. Most of the genetic and metabolic diversity that exists on earth - and has existed for billions of years - is microbial. Making sense of this vast diversity is a daunting task, but one that can be approached systematically by analyzing microbial genome sequences. This chapter explores how the evolutionary forces of recombination and selection act to shape microbial genome sequences, leaving signatures that can be detected using comparative genomics and population-genetic tests for selection. I describe the major classes of tests, paying special attention to their relative strengths and weaknesses when applied to microbes. Specifically, I apply a suite of tests for selection to a set of closely-related bacterial genomes with different microhabitat preferences within the marine water column, shedding light on the genomic mechanisms of ecological differentiation in the wild. I will focus on the joint problem of simultaneously inferring the boundaries between microbial populations, and the selective forces operating within and between populations.

  10. Linkage disequilibrium, persistence of phase, and effective population size in Spanish local beef cattle breeds assessed through a high-density single nucleotide polymorphism chip.

    PubMed

    Cañas-Álvarez, J J; Mouresan, E F; Varona, L; Díaz, C; Molina, A; Baro, J A; Altarriba, J; Carabaño, M J; Casellas, J; Piedrafita, J

    2016-07-01

    Linkage disequilibrium (LD) and persistence of phase are fundamental approaches for exploring the genetic basis of economically important traits in cattle, including the identification of QTL for genomic selection and the estimation of effective population size (Ne) to determine the size of the training populations. In this study, we have used the Illumina BovineHD chip in 168 trios of 7 Spanish beef cattle breeds to obtain an overview of the magnitude of LD and the persistence of LD phase through the physical distance between markers. Also, we estimated the time of divergence based on the persistence of the LD phase and calculated past Ne from LD estimates using different alternatives to define the recombination rate. Estimates of average r² (as a measure of LD) for adjacent markers were close to 0.52 in the 7 breeds and decreased with the distance between markers, although in long distances, some LD still remained (0.07 and 0.05 for markers 200 kb and 1 Mb apart, respectively). A panel with a lower boundary of 38,000 SNP would be necessary to launch a successful within-breed genomic selection program. Persistence of phase, measured as the pairwise correlations between estimates of r in 2 breeds at short distances (10 kb), was in the 0.89 to 0.94 range and decreased from 0.33 to 0.52 to a range of 0.01 to 0.08 when marker distance increased from 200 kb to 1 Mb, respectively. The magnitude of the persistence of phase between the Spanish beef breeds was similar to those found in dairy breeds. For across-breed genomic selection, the size of the SNP panels must be in the range of 50,000 to 83,000 SNP. Estimates of past Ne showed values ranging from 26 to 31 for 1 generation ago in all breeds. The divergence among breeds occurred between 129 and 207 generations ago. The results of this study are relevant for the future implementation of within- and across-breed genomic selection programs in the Spanish beef cattle populations. 
Our results suggest that a reduced subset of the SNP panel would be enough to achieve an adequate precision of the genomic predictions.

  11. Integration of genome and phenotypic scanning gives evidence of genetic structure in Mesoamerican common bean (Phaseolus vulgaris L.) landraces from the southwest of Europe.

    PubMed

    Santalla, M; De Ron, A M; De La Fuente, M

    2010-05-01

    Southwestern Europe has been considered as a secondary centre of genetic diversity for the common bean. The dispersal of domesticated materials from their centres of origin provides an experimental system that reveals how human selection during cultivation and adaptation to novel environments affects the genetic composition. In this paper, our goal was to elucidate how distinct events could modify the structure and level of genetic diversity in the common bean. The genome-wide genetic composition was analysed at 42 microsatellite loci in individuals of 22 landraces of domesticated common bean from the Mesoamerican gene pool. The accessions were also characterised for phaseolin seed protein and for nine allozyme polymorphisms and phenotypic traits. One of this study's important findings was the complementary information obtained from all the polymorphisms examined. Most of the markers found to be potentially under the influence of selection were located in the proximity of previously mapped genes and quantitative trait loci (QTLs) related to important agronomic traits, which indicates that population genomics approaches are very efficient in detecting QTLs. As it was revealed by outlier simple sequence repeats, loci analysis with STRUCTURE software and multivariate analysis of phenotypic data, the landraces were grouped into three clusters according to seed size and shape, vegetative growth habit and genetic resistance. A total of 151 alleles were detected with an average of 4 alleles per locus and an average polymorphism information content of 0.31. Using a model-based approach, on the basis of neutral markers implemented in the software STRUCTURE, three clusters were inferred, which were in good agreement with multivariate analysis. Geographic and genetic distances were congruent with the exception of a few putative hybrids identified in this study, suggesting a predominant effect of isolation by distance. 
Genomic scans using both markers linked to genes affected by selection (outlier) and neutral markers showed advantages relative to other approaches, since they help to create a more complete picture of how adaptation to environmental conditions has sculpted the common bean genomes in southern Europe. The use of outlier loci also gives a clue about what selective forces gave rise to the actual phenotypes of the analysed landraces.

  12. Genome-wide association mapping and agronomic impact of cowpea root architecture.

    PubMed

    Burridge, James D; Schneider, Hannah M; Huynh, Bao-Lam; Roberts, Philip A; Bucksch, Alexander; Lynch, Jonathan P

    2017-02-01

    Genetic analysis of data produced by novel root phenotyping tools was used to establish relationships between cowpea root traits and performance indicators as well as between root traits and Striga tolerance. Selection and breeding for better root phenotypes can improve acquisition of soil resources and hence crop production in marginal environments. We hypothesized that biologically relevant variation is measurable in cowpea root architecture. This study implemented manual phenotyping (shovelomics) and automated image phenotyping (DIRT) on a 189-entry diversity panel of cowpea to reveal biologically important variation and genome regions affecting root architecture phenes. Significant variation in root phenes was found and relatively high heritabilities were detected for root traits assessed manually (0.4 for nodulation and 0.8 for number of larger laterals) as well as repeatability traits phenotyped via DIRT (0.5 for a measure of root width and 0.3 for a measure of root tips). Genome-wide association study identified 11 significant quantitative trait loci (QTL) from manually scored root architecture traits and 21 QTL from root architecture traits phenotyped by DIRT image analysis. Subsequent comparisons of results from this root study with other field studies revealed QTL co-localizations between root traits and performance indicators including seed weight per plant, pod number, and Striga (Striga gesnerioides) tolerance. The data suggest selection for root phenotypes could be employed by breeding programs to improve production in multiple constraint environments.

  13. Development of biosensors and their application in metabolic engineering.

    PubMed

    Zhang, Jie; Jensen, Michael K; Keasling, Jay D

    2015-10-01

    In a sustainable bioeconomy, many commodities and high value chemicals, including pharmaceuticals, will be manufactured using microbial cell factories from renewable feedstocks. These cell factories can be efficiently generated by constructing libraries of diversified genomes followed by screening for the desired phenotypes. However, methods available for microbial genome diversification far exceed our ability to screen and select for those variants with optimal performance. Genetically encoded biosensors have shown the potential to address this gap, given their ability to respond to small molecule binding and ease of implementation with high-throughput analysis. Here we describe recent progress in biosensor development and their applications in a metabolic engineering context. We also highlight examples of how biosensors can be integrated with synthetic circuits to exert feedback regulation on the metabolism for improved performance of cell factories. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. High throughput selection of novel plant growth regulators: Assessing the translatability of small bioactive molecules from Arabidopsis to crops.

    PubMed

    Rodriguez-Furlán, Cecilia; Miranda, Giovanna; Reggiardo, Martín; Hicks, Glenn R; Norambuena, Lorena

    2016-04-01

    Plant growth regulators (PGRs) have become an integral part of agricultural and horticultural practices. Accordingly, there is an increased demand for new and cost-effective products. Nevertheless, the market is limited by insufficient innovation. In this context chemical genomics has gained increasing attention as a powerful approach addressing specific traits. Here is described the successful implementation of a highly specific, sensitive and efficient high throughput screening approach using Arabidopsis as a model. Using a combination of techniques, 10,000 diverse compounds were screened and evaluated for several important plant growth traits including root and leaf growth. The phenotype-based selection allowed the compilation of a collection of putative Arabidopsis growth regulators with a broad range of activities and specificities. A subset was selected for evaluating their bioactivity in agronomically valuable plants. Their validation as growth regulators in commercial species such as tomato, lettuce, carrot, maize and turfgrasses reinforced the success of the screening in Arabidopsis and indicated that small molecules activity can be efficiently translated to commercial species. Therefore, the chemical genomics approach in Arabidopsis is a promising field that can be incorporated in PGR discovery programs and has a great potential to develop new products that can be efficiently used in crops. Copyright © 2016. Published by Elsevier Ireland Ltd.

  15. Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

    PubMed Central

    Lin, Wei; Feng, Rui; Li, Hongzhe

    2014-01-01

    In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be much larger than the sample size. Motivated by such modern applications, we consider the problem of variable selection and estimation in high-dimensional sparse instrumental variables models. To overcome the difficulty of high dimensionality and unknown optimal instruments, we propose a two-stage regularization framework for identifying and estimating important covariate effects while selecting and estimating optimal instruments. The methodology extends the classical two-stage least squares estimator to high dimensions by exploiting sparsity using sparsity-inducing penalty functions in both stages. The resulting procedure is efficiently implemented by coordinate descent optimization. For the representative L1 regularization and a class of concave regularization methods, we establish estimation, prediction, and model selection properties of the two-stage regularized estimators in the high-dimensional setting where the dimensionality of co-variates and instruments are both allowed to grow exponentially with the sample size. The practical performance of the proposed method is evaluated by simulation studies and its usefulness is illustrated by an analysis of mouse obesity data. Supplementary materials for this article are available online. PMID:26392642

  16. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    PubMed Central

    2013-01-01

    Background: Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re-sequencing accessions, which represent wild, domesticated landrace, and Chinese elite soybean populations were analyzed. Results: A total of 5,102,244 single nucleotide polymorphisms (SNPs) and 707,969 insertion/deletions were identified. Among the SNPs detected, 25.5% were not described previously. We found that artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits. The selection regions were not distributed randomly or uniformly throughout the genome. Instead, clusters of selection hotspots in certain genomic regions were observed. Moreover, a set of candidate genes (4.38% of the total annotated genes) significantly affected by selection underlying soybean domestication and genetic improvement were identified. Conclusions: Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes/loci underlying agronomically important traits. PMID:23984715

  17. In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB

    PubMed Central

    2013-01-01

    Background: Though India has sequenced water buffalo genome but its draft assembly is based on cattle genome BTau 4.0, thus de novo chromosome wise assembly is a major pending issue for global community. The existing radiation hybrid of buffalo and these reported STR can be used further in final gap plugging and “finishing” expected in de novo genome assembly. QTL and gene mapping needs mining of putative STR from buffalo genome at equal interval on each and every chromosome. Such markers have potential role in improvement of desirable characteristics, such as high milk yields, resistance to diseases, high growth rate. The STR mining from whole genome and development of user friendly database is yet to be done to reap the benefit of whole genome sequence. Description: By in silico microsatellite mining of whole genome, we have developed first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database; http://cabindb.iasri.res.in/buffsatdb/), which is a web based relational database of 910529 microsatellite markers, developed using PHP and MySQL database. Microsatellite markers have been generated using MIcroSAtellite tool. It is simple and systematic web based search for customised retrieval of chromosome wise and genome-wide microsatellites. Search has been enabled based on chromosomes, motif type (mono-hexa), repeat motif and repeat kind (simple and composite). The search may be customised by limiting location of STR on chromosome as well as number of markers in that range. This is a novel approach and not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of the selected markers enabling researcher to select markers of choice at desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving presence of degenerate bases in current buffalo assembly. 
Conclusion: Being first buffalo STR database in the world, this would not only pave the way in resolving current assembly problem but shall be of immense use for global community in QTL/gene mapping critically required to increase knowledge in the endeavour to increase buffalo productivity, especially for third world country where rural economy is significantly dependent on buffalo productivity. PMID:23336431
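    The in silico mining step described in this record (scanning a sequence for mono- to hexanucleotide repeats) can be sketched with a regular expression. This is a simplified illustration and not the MIcroSAtellite tool used by the authors: the minimum repeat counts are assumed values chosen for the example, only simple (not composite) repeats are reported, and the function name is hypothetical.

```python
import re

# Assumed minimum number of repeat units per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each simple repeat found.

    start is the 0-based position of the repeat in seq.
    """
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        n = min_repeats[k]
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length reports them.
            if any(k % j == 0 and motif == motif[:j] * (k // j)
                   for j in range(1, k)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


demo = find_ssrs("GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T")
```

    A database such as the one described would store each hit with its chromosome, position, motif type and repeat count so that searches can be restricted to a chromosomal interval.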

  18. In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB.

    PubMed

    Sarika; Arora, Vasu; Iquebal, Mir Asif; Rai, Anil; Kumar, Dinesh

    2013-01-19

    Though India has sequenced water buffalo genome but its draft assembly is based on cattle genome BTau 4.0, thus de novo chromosome wise assembly is a major pending issue for global community. The existing radiation hybrid of buffalo and these reported STR can be used further in final gap plugging and "finishing" expected in de novo genome assembly. QTL and gene mapping needs mining of putative STR from buffalo genome at equal interval on each and every chromosome. Such markers have potential role in improvement of desirable characteristics, such as high milk yields, resistance to diseases, high growth rate. The STR mining from whole genome and development of user friendly database is yet to be done to reap the benefit of whole genome sequence. By in silico microsatellite mining of whole genome, we have developed first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database; http://cabindb.iasri.res.in/buffsatdb/), which is a web based relational database of 910529 microsatellite markers, developed using PHP and MySQL database. Microsatellite markers have been generated using MIcroSAtellite tool. It is simple and systematic web based search for customised retrieval of chromosome wise and genome-wide microsatellites. Search has been enabled based on chromosomes, motif type (mono-hexa), repeat motif and repeat kind (simple and composite). The search may be customised by limiting location of STR on chromosome as well as number of markers in that range. This is a novel approach and not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of the selected markers enabling researcher to select markers of choice at desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving presence of degenerate bases in current buffalo assembly. 
Being first buffalo STR database in the world, this would not only pave the way in resolving current assembly problem but shall be of immense use for global community in QTL/gene mapping critically required to increase knowledge in the endeavour to increase buffalo productivity, especially for third world country where rural economy is significantly dependent on buffalo productivity.

  19. Selective Gene Delivery for Integrating Exogenous DNA into Plastid and Mitochondrial Genomes Using Peptide-DNA Complexes.

    PubMed

    Yoshizumi, Takeshi; Oikawa, Kazusato; Chuah, Jo-Ann; Kodama, Yutaka; Numata, Keiji

    2018-05-14

    Selective gene delivery into organellar genomes (mitochondrial and plastid genomes) has been limited because of a lack of appropriate platform technology, even though these organelles are essential for metabolite and energy production. Techniques for selective organellar modification are needed to functionally improve organelles and produce transplastomic/transmitochondrial plants. However, no method for mitochondrial genome modification has yet been established for multicellular organisms including plants. Likewise, modification of plastid genomes has been limited to a few plant species and algae. In the present study, we developed ionic complexes of fusion peptides containing organellar targeting signal and plasmid DNA for selective delivery of exogenous DNA into the plastid and mitochondrial genomes of intact plants. This is the first report of exogenous DNA being integrated into the mitochondrial genomes of not only plants, but also multicellular organisms in general. This fusion peptide-mediated gene delivery system is a breakthrough platform for both plant organellar biotechnology and gene therapy for mitochondrial diseases in animals.

  20. Prospects for Genomic Selection in Cassava Breeding.

    PubMed

    Wolfe, Marnin D; Del Carpio, Dunia Pino; Alabi, Olumide; Ezenwaka, Lydia C; Ikeogu, Ugochukwu N; Kayondo, Ismail S; Lozano, Roberto; Okeke, Uche G; Ozimati, Alfred A; Williams, Esuma; Egesi, Chiedozie; Kawuki, Robert S; Kulakow, Peter; Rabbi, Ismail Y; Jannink, Jean-Luc

    2017-11-01

    Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on previous analyses by assessing the accuracy of seven prediction models for seven traits in three prediction scenarios: cross-validation within populations, cross-population prediction and cross-generation prediction. We also evaluated the impact of increasing the training population (TP) size by phenotyping progenies selected either at random or with a genetic algorithm. Cross-validation results were mostly consistent across programs, with nonadditive models predicting 10% better on average. Cross-population accuracy was generally low (mean = 0.18) but prediction of cassava mosaic disease increased up to 57% in one Nigerian population when data from another related population were combined. Accuracy across generations was poorer than within-generation accuracy, as expected, but accuracy for dry matter content and mosaic disease severity should be sufficient for rapid-cycling GS. Selection of a prediction model made some difference across generations, but increasing TP size was more important. With a genetic algorithm, selection of one-third of progeny could achieve an accuracy equivalent to phenotyping all progeny. We are in the early stages of GS for this crop but the results are promising for some traits. General guidelines that are emerging are that TPs need to continue to grow but phenotyping can be done on a cleverly selected subset of individuals, reducing the overall phenotyping burden. Copyright © 2017 Crop Science Society of America.

  1. Evolution and the complexity of bacteriophages.

    PubMed

    Serwer, Philip

    2007-03-13

    The genomes of both long-genome (> 200 Kb) bacteriophages and long-genome eukaryotic viruses have cellular gene homologs whose selective advantage is not explained. These homologs add genomic and possibly biochemical complexity. Understanding their significance requires a definition of complexity that is more biochemically oriented than past empirically based definitions. Initially, I propose two biochemistry-oriented definitions of complexity: either decreased randomness or increased encoded information that does not serve immediate needs. Then, I make the assumption that these two definitions are equivalent. This assumption and recent data lead to the following four-part hypothesis that explains the presence of cellular gene homologs in long bacteriophage genomes and also provides a pathway for complexity increases in prokaryotic cells: (1) Prokaryotes underwent evolutionary increases in biochemical complexity after the eukaryote/prokaryote splits. (2) Some of the complexity increases occurred via multi-step, weak selection that was both protected from strong selection and accelerated by embedding evolving cellular genes in the genomes of bacteriophages and, presumably, also archaeal viruses (first tier selection). (3) The mechanisms for retaining cellular genes in viral genomes evolved under additional, longer-term selection that was stronger (second tier selection). (4) The second tier selection was based on increased access by prokaryotic cells to improved biochemical systems. This access was achieved when DNA transfer moved to prokaryotic cells both the more evolved genes and their more competitive and complex biochemical systems. I propose testing this hypothesis by controlled evolution in microbial communities to (1) determine the effects of deleting individual cellular gene homologs on the growth and evolution of long genome bacteriophages and hosts, (2) find the environmental conditions that select for the presence of cellular gene homologs, (3) determine which, if any, bacteriophage genes were selected for maintaining the homologs and (4) determine the dynamics of homolog evolution. This hypothesis is an explanation of evolutionary leaps in general. If accurate, it will assist both understanding and influencing the evolution of microbes and their communities. Analysis of evolutionary complexity increase for at least prokaryotes should include analysis of genomes of long-genome bacteriophages.

  2. Genomic Signature of Kin Selection in an Ant with Obligately Sterile Workers

    PubMed Central

    Warner, Michael R.; Mikheyev, Alexander S.

    2017-01-01

    Kin selection is thought to drive the evolution of cooperation and conflict, but the specific genes and genome-wide patterns shaped by kin selection are unknown. We identified thousands of genes associated with the sterile ant worker caste, the archetype of an altruistic phenotype shaped by kin selection, and then used population and comparative genomic approaches to study patterns of molecular evolution at these genes. Consistent with population genetic theoretical predictions, worker-upregulated genes experienced reduced selection compared with genes upregulated in reproductive castes. Worker-upregulated genes included more taxonomically restricted genes, indicating that the worker caste has recruited more novel genes, yet these genes also experienced reduced selection. Our study identifies a putative genomic signature of kin selection and helps to integrate emerging sociogenomic data with longstanding social evolution theory. PMID:28419349

  3. A decade of pig genome sequencing: a window on pig domestication and evolution.

    PubMed

    Groenen, Martien A M

    2016-03-29

    Insight into how genomes change and adapt due to selection addresses key questions in evolutionary biology and in domestication of animals and plants by humans. In that regard, the pig and its close relatives found in Africa and Eurasia represent an excellent group of species that enables studies of the effect of both natural and human-mediated selection on the genome. The recent completion of the draft genome sequence of a domestic pig and the development of next-generation sequencing technology during the past decade have created unprecedented possibilities to address these questions in great detail. In this paper, I review recent whole-genome sequencing studies in the pig and closely-related species that provide insight into the demography, admixture and selection of these species and, in particular, how domestication and subsequent selection of Sus scrofa have shaped the genomes of these animals.

  4. Accounting for discovery bias in genomic EPD

    USDA-ARS's Scientific Manuscript database

    Genomics has contributed substantially to genetic improvement of beef cattle. The implementation is through computation of genomically enhanced expected progeny differences (GE-EPD), which are predictions of genetic merit of individual animals based on genomic information, pedigree, and data on the ...

  5. Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data

    PubMed Central

    Xu, Shuhua

    2015-01-01

    Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection that typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) of Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants, reported as risk alleles by genome-wide association studies, showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution. PMID:26053627

  6. Improving the baking quality of bread wheat by genomic selection in early generations.

    PubMed

    Michel, Sebastian; Kummer, Christian; Gallee, Martin; Hellinger, Jakob; Ametz, Christian; Akgöl, Batuhan; Epure, Doru; Güngör, Huseyin; Löschenberger, Franziska; Buerstmayr, Hermann

    2018-02-01

    Genomic selection shows great promise for pre-selecting lines with superior bread baking quality in early generations, 3 years ahead of labour-intensive, time-consuming, and costly quality analysis. The genetic improvement of baking quality is one of the grand challenges in wheat breeding as the assessment of the associated traits often involves time-consuming, labour-intensive, and costly testing forcing breeders to postpone sophisticated quality tests to the very last phases of variety development. The prospect of genomic selection for complex traits like grain yield has been shown in numerous studies, and might thus be also an interesting method to select for baking quality traits. Hence, we focused in this study on the accuracy of genomic selection for laborious and expensive to phenotype quality traits as well as its selection response in comparison with phenotypic selection. More than 400 genotyped wheat lines were, therefore, phenotyped for protein content, dough viscoelastic and mixing properties related to baking quality in multi-environment trials 2009-2016. The average prediction accuracy across three independent validation populations was r = 0.39 and could be increased to r = 0.47 by modelling major QTL as fixed effects as well as employing multi-trait prediction models, which resulted in an acceptable prediction accuracy for all dough rheological traits (r = 0.38-0.63). Genomic selection can furthermore be applied 2-3 years earlier than direct phenotypic selection, and the estimated selection response was nearly twice as high in comparison with indirect selection by protein content for baking quality related traits. This considerable advantage of genomic selection could accordingly support breeders in their selection decisions and aid in efficiently combining superior baking quality with grain yield in newly developed wheat varieties.

  7. A genomic overview of the population structure of Salmonella.

    PubMed

    Alikhan, Nabil-Fareed; Zhou, Zhemin; Sergeant, Martin J; Achtman, Mark

    2018-04-01

    For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.

  8. A tool for selecting SNPs for association studies based on observed linkage disequilibrium patterns.

    PubMed

    De La Vega, Francisco M; Isaac, Hadar I; Scafe, Charles R

    2006-01-01

    The design of genetic association studies using single-nucleotide polymorphisms (SNPs) requires the selection of subsets of the variants providing high statistical power at a reasonable cost. SNPs must be selected to maximize the probability that a causative mutation is in linkage disequilibrium (LD) with at least one marker genotyped in the study. The HapMap project performed a genome-wide survey of genetic variation with about a million SNPs typed in four populations, providing a rich resource to inform the design of association studies. A number of strategies have been proposed for the selection of SNPs based on observed LD, including construction of metric LD maps and the selection of haplotype tagging SNPs. Power calculations are important at the study design stage to ensure successful results. Integrating these methods and annotations can be challenging: the algorithms required to implement these methods are complex to deploy, and all the necessary data and annotations are deposited in disparate databases. Here, we present the SNPbrowser Software, a freely available tool to assist in the LD-based selection of markers for association studies. This stand-alone application provides fast query capabilities and swift visualization of SNPs, gene annotations, power, haplotype blocks, and LD map coordinates. Wizards implement several common SNP selection workflows including the selection of optimal subsets of SNPs (e.g. tagging SNPs). Selected SNPs are screened for their conversion potential to either TaqMan SNP Genotyping Assays or the SNPlex Genotyping System, two commercially available genotyping platforms, expediting the set-up of genetic studies with an increased probability of success.
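
    The LD-based selection of tagging SNPs mentioned in this abstract can be sketched as follows. This is a generic greedy procedure under stated assumptions, not the algorithm implemented in the SNPbrowser Software: it takes phased haplotypes coded as 0/1 alleles, computes pairwise r^2, and repeatedly picks the SNP that tags the most still-untagged SNPs. The function names, the SNP identifiers and the 0.8 threshold are illustrative choices.

```python
def r_squared(a, b):
    """LD r^2 between two biallelic SNPs from phased 0/1 alleles per haplotype."""
    n = len(a)
    pa = sum(a) / n                      # allele frequency at SNP A
    pb = sum(b) / n                      # allele frequency at SNP B
    pab = sum(x * y for x, y in zip(a, b)) / n   # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0                       # at least one SNP is monomorphic
    d = pab - pa * pb                    # disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with some tag.

    haplotypes: dict mapping SNP id -> list of 0/1 alleles (same haplotype order).
    Ties are broken by insertion order of the dict.
    """
    ids = list(haplotypes)
    covers = {
        s: {t for t in ids
            if t == s or r_squared(haplotypes[s], haplotypes[t]) >= threshold}
        for s in ids
    }
    untagged = set(ids)
    tags = []
    while untagged:
        best = max(ids, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


if __name__ == "__main__":
    # Made-up haplotypes for illustration only.
    haps = {
        "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
        "snp2": [0, 0, 1, 1, 0, 1, 0, 1],
        "snp3": [1, 1, 0, 0, 1, 0, 1, 0],
        "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    print(greedy_tag_snps(haps))
```

    A greedy cover of this kind does not guarantee the smallest possible tag set, but it is simple and usually close; real tools also account for genotyping assay availability and power.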

  9. Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

    USDA-ARS's Scientific Manuscript database

    A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...

  10. A strategy for implementing genomics into nursing practice informed by three behaviour change theories.

    PubMed

    Leach, Verity; Tonkin, Emma; Lancastle, Deborah; Kirk, Maggie

    2016-06-01

    Genomics is an ever increasing aspect of nursing practice, with focus being directed towards improving health. The authors present an implementation strategy for the incorporation of genomics into nursing practice within the UK, based on three behaviour change theories and the identification of individuals who are likely to provide support for change. Individuals identified as Opinion Leaders and Adopters of genomics illustrate how changes in behaviour might occur among the nursing profession. The core philosophy of the strategy is that genomic nurse Adopters and Opinion Leaders who have direct interaction with their peers in practice will be best placed to highlight the importance of genomics within the nursing role. The strategy discussed in this paper provides scope for continued nursing education and development of genomics within nursing practice on a larger scale. The recommendations might be of particular relevance for senior staff and management. © 2016 John Wiley & Sons Australia, Ltd.

  11. Fast Evolution from Precast Bricks: Genomics of Young Freshwater Populations of Threespine Stickleback Gasterosteus aculeatus

    PubMed Central

    Terekhanova, Nadezhda V.; Logacheva, Maria D.; Penin, Aleksey A.; Neretina, Tatiana V.; Barmintseva, Anna E.; Bazykin, Georgii A.; Kondrashov, Alexey S.; Mugue, Nikolai S.

    2014-01-01

    Adaptation is driven by natural selection; however, many adaptations are caused by weak selection acting over large timescales, complicating its study. Therefore, it is rarely possible to study selection comprehensively in natural environments. The threespine stickleback (Gasterosteus aculeatus) is a well-studied model organism with a short generation time, small genome size, and many genetic and genomic tools available. Within this originally marine species, populations have recurrently adapted to freshwater all over its range. This evolution involved extensive parallelism: pre-existing alleles that adapt sticklebacks to freshwater habitats, but are also present at low frequencies in marine populations, have been recruited repeatedly. While a number of genomic regions responsible for this adaptation have been identified, the details of selection remain poorly understood. Using whole-genome resequencing, we compare pooled genomic samples from marine and freshwater populations of the White Sea basin, and identify 19 short genomic regions that are highly divergent between them, including three known inversions. 17 of these regions overlap protein-coding genes, including a number of genes with predicted functions that are relevant for adaptation to the freshwater environment. We then analyze four additional independently derived young freshwater populations of known ages, two natural and two artificially established, and use the observed shifts of allelic frequencies to estimate the strength of positive selection. Adaptation turns out to be quite rapid, indicating strong selection acting simultaneously at multiple regions of the genome, with selection coefficients of up to 0.27. High divergence between marine and freshwater genotypes, lack of reduction in polymorphism in regions responsible for adaptation, and high frequencies of freshwater alleles observed even in young freshwater populations are all consistent with rapid assembly of G. aculeatus freshwater genotypes from pre-existing genomic regions of adaptive variation, with strong selection that favors this assembly acting simultaneously at multiple loci. PMID:25299485
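
    How a shift in allele frequency over a known number of generations translates into a selection coefficient can be sketched with a textbook approximation. This is not necessarily the estimator used by the authors: it assumes deterministic genic selection in a large population, under which the log-odds of the favoured allele grow by roughly s per generation. The function names and the numbers in the usage example are illustrative only.

```python
import math


def logit(p):
    """Log-odds of an allele frequency strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def selection_coefficient(p0, pt, generations):
    """Estimate s from a frequency shift p0 -> pt over `generations`.

    Assumes logistic growth of the favoured allele:
        logit(pt) = logit(p0) + s * generations
    Drift, dominance and migration are ignored.
    """
    if not (0.0 < p0 < 1.0 and 0.0 < pt < 1.0):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    return (logit(pt) - logit(p0)) / generations


def project_frequency(p0, s, generations):
    """Inverse of selection_coefficient: expected frequency after selection."""
    x = logit(p0) + s * generations
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Illustrative values, not data from the study.
    pt = project_frequency(0.05, 0.27, 20)
    print(round(pt, 3), round(selection_coefficient(0.05, pt, 20), 3))
```

    Because the estimate depends only on the log-odds change divided by elapsed generations, populations of known age are what make this kind of calculation possible.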

  12. Fast evolution from precast bricks: genomics of young freshwater populations of threespine stickleback Gasterosteus aculeatus.

    PubMed

    Terekhanova, Nadezhda V; Logacheva, Maria D; Penin, Aleksey A; Neretina, Tatiana V; Barmintseva, Anna E; Bazykin, Georgii A; Kondrashov, Alexey S; Mugue, Nikolai S

    2014-10-01

    Adaptation is driven by natural selection; however, many adaptations are caused by weak selection acting over large timescales, complicating its study. Therefore, it is rarely possible to study selection comprehensively in natural environments. The threespine stickleback (Gasterosteus aculeatus) is a well-studied model organism with a short generation time, small genome size, and many genetic and genomic tools available. Within this originally marine species, populations have recurrently adapted to freshwater all over its range. This evolution involved extensive parallelism: pre-existing alleles that adapt sticklebacks to freshwater habitats, but are also present at low frequencies in marine populations, have been recruited repeatedly. While a number of genomic regions responsible for this adaptation have been identified, the details of selection remain poorly understood. Using whole-genome resequencing, we compare pooled genomic samples from marine and freshwater populations of the White Sea basin, and identify 19 short genomic regions that are highly divergent between them, including three known inversions. 17 of these regions overlap protein-coding genes, including a number of genes with predicted functions that are relevant for adaptation to the freshwater environment. We then analyze four additional independently derived young freshwater populations of known ages, two natural and two artificially established, and use the observed shifts of allelic frequencies to estimate the strength of positive selection. Adaptation turns out to be quite rapid, indicating strong selection acting simultaneously at multiple regions of the genome, with selection coefficients of up to 0.27. High divergence between marine and freshwater genotypes, lack of reduction in polymorphism in regions responsible for adaptation, and high frequencies of freshwater alleles observed even in young freshwater populations are all consistent with rapid assembly of G. aculeatus freshwater genotypes from pre-existing genomic regions of adaptive variation, with strong selection that favors this assembly acting simultaneously at multiple loci.

  13. Genomic Signatures Reveal New Evidences for Selection of Important Traits in Domestic Cattle

    PubMed Central

    Xu, Lingyang; Bickhart, Derek M.; Cole, John B.; Schroeder, Steven G.; Song, Jiuzhou; Tassell, Curtis P. Van; Sonstegard, Tad S.; Liu, George E.

    2015-01-01

    We investigated diverse genomic selections using high-density single nucleotide polymorphism data of five distinct cattle breeds. Based on allele frequency differences, we detected hundreds of candidate regions under positive selection across Holstein, Angus, Charolais, Brahman, and N'Dama. In addition to well-known genes such as KIT, MC1R, ASIP, GHR, LCORL, NCAPG, WIF1, and ABCA12, we found evidence for a variety of novel and less-known genes under selection in cattle, such as LAP3, SAR1B, LRIG3, FGF5, and NUDCD3. Selective sweeps near LAP3 were then validated by next-generation sequencing. Genome-wide association analysis involving 26,362 Holsteins confirmed that LAP3 and SAR1B were related to milk production traits, suggesting that our candidate regions were likely functional. In addition, haplotype network analyses further revealed distinct selective pressures and evolution patterns across these five cattle breeds. Our results provided a glimpse into diverse genomic selection during cattle domestication, breed formation, and recent genetic improvement. These findings will facilitate genome-assisted breeding to improve animal production and health. PMID:25431480
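
    Detecting candidate regions from allele frequency differences between breeds can be illustrated with a per-SNP differentiation statistic averaged over windows. The sketch below uses Wright's F_ST for two equally weighted populations, which is an assumption made for illustration; the abstract does not say which statistic the authors computed. Function names and window size are hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's F_ST at one biallelic SNP from allele frequencies in two populations.

    F_ST = (H_T - H_S) / H_T, with both populations weighted equally.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)            # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0                               # SNP fixed for the same allele in both
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    return (h_t - h_s) / h_t


def windowed_mean_fst(freqs1, freqs2, window):
    """Mean F_ST over consecutive, non-overlapping windows of `window` SNPs."""
    values = [wright_fst(a, b) for a, b in zip(freqs1, freqs2)]
    return [
        sum(values[i:i + window]) / len(values[i:i + window])
        for i in range(0, len(values), window)
    ]


if __name__ == "__main__":
    # Made-up allele frequencies along a chromosome segment in two breeds.
    breed_a = [0.5, 1.0, 0.5, 0.0]
    breed_b = [0.5, 0.0, 0.5, 1.0]
    print(windowed_mean_fst(breed_a, breed_b, window=2))
```

    Windows in the upper tail of the genome-wide distribution would then be flagged as candidate regions for follow-up.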

  14. The Genome-based Knowledge Management in Cycles model: a complex adaptive systems framework for implementation of genomic applications.

    PubMed

    Arar, Nedal; Knight, Sara J; Modell, Stephen M; Issa, Amalia M

    2011-03-01

    The main mission of the Genomic Applications in Practice and Prevention Network™ is to advance collaborative efforts involving partners from across the public health sector to realize the promise of genomics in healthcare and disease prevention. We introduce a new framework that supports the Genomic Applications in Practice and Prevention Network mission and leverages the characteristics of the complex adaptive systems approach. We call this framework the Genome-based Knowledge Management in Cycles model (G-KNOMIC). G-KNOMIC proposes that the collaborative work of multidisciplinary teams utilizing genome-based applications will enhance translating evidence-based genomic findings by creating ongoing knowledge management cycles. Each cycle consists of knowledge synthesis, knowledge evaluation, knowledge implementation and knowledge utilization. Our framework acknowledges that all the elements in the knowledge translation process are interconnected and continuously changing. It also recognizes the importance of feedback loops, and the ability of teams to self-organize within a dynamic system. We demonstrate how this framework can be used to improve the adoption of genomic technologies into practice using two case studies of genomic uptake.

  15. Aligning the unalignable: bacteriophage whole genome alignments.

    PubMed

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  16. SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes.

    PubMed

    Jaron, Kamil S; Moravec, Jiří C; Martínková, Natália

    2014-04-15

    Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt are implemented in C and R and are platform-independent. Contact: 376090@mail.muni.cz or martinkova@ivb.cz. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
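
    The first step of the method described in this abstract, a genomic signature computed as a vector of tetranucleotide frequencies in a sliding window, can be sketched as below. This covers only the signature step; the kernel density estimate and the DIAS score are not reproduced, and SigHunt itself is implemented in C and R. The function names, window size and step are hypothetical.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the 256-element vector of 4-mer relative frequencies in seq.

    4-mers containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    valid = [counts[k] for k in KMERS]
    total = sum(valid)
    return [c / total if total else 0.0 for c in valid]


def sliding_signatures(seq, window, step):
    """Yield (start, frequency_vector) for every full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, tetranucleotide_frequencies(seq[start:start + window])


if __name__ == "__main__":
    demo = "ACGTACGTAC"
    for start, freqs in sliding_signatures(demo, window=8, step=2):
        top = max(range(256), key=lambda i: freqs[i])
        print(start, KMERS[top], round(freqs[top], 2))
```

    Windows whose signature departs strongly from that of their neighbourhood would be the ones a composition-based method examines further as putative genomic islands.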

  17. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    PubMed

    Angly, Florent E; Willner, Dana; Prieto-Davó, Alejandra; Edwards, Robert A; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A; Barott, Katie; Cottrell, Matthew T; Desnues, Christelle; Dinsdale, Elizabeth A; Furlan, Mike; Haynes, Matthew; Henn, Matthew R; Hu, Yongfei; Kirchman, David L; McDole, Tracey; McPherson, John D; Meyer, Folker; Miller, R Michael; Mundt, Egbert; Naviaux, Robert K; Rodriguez-Mueller, Beltran; Stevens, Rick; Wegley, Linda; Zhang, Lixin; Zhu, Baoli; Rohwer, Forest

    2009-12-01

    Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.

  18. Genome-wide scans for loci under selection in humans

    PubMed Central

    2005-01-01

    Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity; it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (i.e. local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection. PMID:16004726

  19. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat.

    PubMed

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-06-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.
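
    As a rough illustration of the ridge-regression idea behind RR-BLUP, the following Python sketch estimates marker effects with a per-marker penalty. It is a simplified stand-in, not the authors' RR-BLUP or W-BLUP: the penalty values, the simulated genotypes, and the choice to give a single "known functional" marker a much smaller penalty are all assumptions used only to show how such a marker could be shrunk less than the anonymous SNPs.

```python
import numpy as np


def fit_ridge_markers(Z, y, penalties):
    """Solve (Zc'Zc + D) u = Zc'(y - mean(y)) for marker effects u.

    Z: (lines x markers) genotype matrix; y: phenotypes;
    penalties: one positive ridge penalty per marker (diagonal of D).
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means  # centre markers so the intercept is mean(y)
    y_mean = y.mean()
    lhs = Zc.T @ Zc + np.diag(np.asarray(penalties, dtype=float))
    effects = np.linalg.solve(lhs, Zc.T @ (y - y_mean))
    return y_mean, col_means, effects


def predict(Z_new, y_mean, col_means, effects):
    """Predict genetic values for new genotypes from fitted effects."""
    return y_mean + (np.asarray(Z_new, dtype=float) - col_means) @ effects


# Simulated toy data (hypothetical): marker 0 plays the role of a
# known functional marker with a large effect.
rng = np.random.default_rng(0)
n_lines, n_markers = 60, 200
Z = rng.integers(0, 3, size=(n_lines, n_markers))
true_effects = rng.normal(0.0, 0.1, n_markers)
true_effects[0] = 2.0
y = Z @ true_effects + rng.normal(0.0, 1.0, n_lines)

uniform = np.full(n_markers, 50.0)  # same shrinkage for every marker
weighted = uniform.copy()
weighted[0] = 1e-3                  # assumed: almost no shrinkage here

model_uniform = fit_ridge_markers(Z, y, uniform)
model_weighted = fit_ridge_markers(Z, y, weighted)
fitted = predict(Z, *model_weighted)
```

    In practice the accuracy of either variant would be judged by cross-validation, as in the study, rather than on the training lines used here.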

  20. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat

    PubMed Central

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-01-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection. PMID:24518889

  1. Genomic newborn screening: public health policy considerations and recommendations.

    PubMed

    Friedman, Jan M; Cornel, Martina C; Goldenberg, Aaron J; Lister, Karla J; Sénécal, Karine; Vears, Danya F

    2017-02-21

    The use of genome-wide (whole genome or exome) sequencing for population-based newborn screening presents an opportunity to detect and treat or prevent many more serious early-onset health conditions than is possible today. The Paediatric Task Team of the Global Alliance for Genomics and Health's Regulatory and Ethics Working Group reviewed current understanding and concerns regarding the use of genomic technologies for population-based newborn screening and developed, by consensus, eight recommendations for clinicians, clinical laboratory scientists, and policy makers. Before genome-wide sequencing can be implemented in newborn screening programs, its clinical utility and cost-effectiveness must be demonstrated, and the ability to distinguish disease-causing and benign variants of all genes screened must be established. In addition, each jurisdiction needs to resolve ethical and policy issues regarding the disclosure of incidental or secondary findings to families and ownership, appropriate storage and sharing of genomic data. The best interests of children should be the basis for all decisions regarding the implementation of genomic newborn screening.

  2. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow.

    PubMed

    Ravinet, M; Faria, R; Butlin, R K; Galindo, J; Bierne, N; Rafajlović, M; Noor, M A F; Mehlig, B; Westram, A M

    2017-08-01

    Speciation, the evolution of reproductive isolation among populations, is continuous, complex, and involves multiple, interacting barriers. Until it is complete, the effects of this process vary along the genome and can lead to a heterogeneous genomic landscape with peaks and troughs of differentiation and divergence. When gene flow occurs during speciation, barriers restricting gene flow locally in the genome lead to patterns of heterogeneity. However, genomic heterogeneity can also be produced or modified by variation in factors such as background selection and selective sweeps, recombination and mutation rate variation, and heterogeneous gene density. Extracting the effects of gene flow, divergent selection and reproductive isolation from such modifying factors presents a major challenge to speciation genomics. We argue one of the principal aims of the field is to identify the barrier loci involved in limiting gene flow. We first summarize the expected signatures of selection at barrier loci, at the genomic regions linked to them and across the entire genome. We then discuss the modifying factors that complicate the interpretation of the observed genomic landscape. Finally, we end with a road map for future speciation research: a proposal for how to account for these modifying factors and to progress towards understanding the nature of barrier loci. Despite the difficulties of interpreting empirical data, we argue that the availability of promising technical and analytical methods will shed further light on the important roles that gene flow and divergent selection have in shaping the genomic landscape of speciation. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.

  3. Implementing meta-analysis from genome-wide association studies for pork quality traits

    USDA-ARS's Scientific Manuscript database

    Pork quality plays an important role in the meat processing industry, thus different methodologies have been implemented to elucidate the genetic architecture of traits affecting meat quality. One of the most common and widely used approaches is to perform genome-wide association (GWA) studies. Howe...

  4. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    USDA-ARS's Scientific Manuscript database

    Background: BAC-based physical maps provide for sequencing across an entire genome or selected sub-genome regions of biological interest. Using the minimum tiling path as a guide, it is possible to select specific BAC clones from prioritized genome sections such as a genetically defined QTL interv...

  5. Accuracy of genomic prediction for BCWD resistance in rainbow trout using different genotyping platforms and genomic selection models

    USDA-ARS's Scientific Manuscript database

    In this study, we aimed to (1) predict genomic estimated breeding value (GEBV) for bacterial cold water disease (BCWD) resistance by genotyping training (n=583) and validation samples (n=53) with two genotyping platforms (24K RAD-SNP and 49K SNP) and using different genomic selection (GS) models (Ba...

  6. Evaluation of genome-enabled selection for bacterial cold water disease resistance using progeny performance data in Rainbow Trout: Insights on genotyping methods and genomic prediction models

    USDA-ARS's Scientific Manuscript database

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic br...

  7. Evaluation of a decision aid for incidental genomic results, the Genomics ADvISER: protocol for a mixed methods randomised controlled trial.

    PubMed

    Shickh, Salma; Clausen, Marc; Mighton, Chloe; Casalino, Selina; Joshi, Esha; Glogowski, Emily; Schrader, Kasmintan A; Scheer, Adena; Elser, Christine; Panchal, Seema; Eisen, Andrea; Graham, Tracy; Aronson, Melyssa; Semotiuk, Kara M; Winter-Paquette, Laura; Evans, Michael; Lerner-Ellis, Jordan; Carroll, June C; Hamilton, Jada G; Offit, Kenneth; Robson, Mark; Thorpe, Kevin E; Laupacis, Andreas; Bombard, Yvonne

    2018-04-26

    Genome sequencing, a novel genetic diagnostic technology that analyses the billions of base pairs of DNA, promises to optimise healthcare through personalised diagnosis and treatment. However, implementation of genome sequencing faces challenges including the lack of consensus on disclosure of incidental results, gene changes unrelated to the disease under investigation, but of potential clinical significance to the patient and their provider. Current recommendations encourage clinicians to return medically actionable incidental results and stress the importance of education and informed consent. Given the shortage of genetics professionals and genomics expertise among healthcare providers, decision aids (DAs) can help fill a critical gap in the clinical delivery of genome sequencing. We aim to assess the effectiveness of an interactive DA developed for selection of incidental results. We will compare the DA in combination with a brief Q&A session with a genetic counsellor to genetic counselling alone in a mixed-methods randomised controlled trial. Patients who received negative standard cancer genetic results for their personal and family history of cancer and are thus eligible for sequencing will be recruited from cancer genetics clinics in Toronto. Our primary outcome is decisional conflict. Secondary outcomes are knowledge, satisfaction, preparation for decision-making, anxiety and length of session with the genetic counsellor. A subset of participants will complete a qualitative interview about preferences for incidental results. This study has been approved by research ethics boards of St. Michael's Hospital, Mount Sinai Hospital and Sunnybrook Health Sciences Centre. This research poses no significant risk to participants. This study evaluates the effectiveness of a novel patient-centred tool to support clinical delivery of incidental results. 
    Results will be shared through national and international conferences, and at a stakeholder workshop to develop a consensus statement to optimise implementation of the DA in practice. NCT03244202; Pre-results.

  8. Current Priorities for Public Health Practice in Addressing the Role of Human Genomics in Improving Population Health

    PubMed Central

    Khoury, Muin J.; Bowen, Michael S.; Burke, Wylie; Coates, Ralph J.; Dowling, Nicole F.; Evans, James P.; Reyes, Michele; St. Pierre, Jeannette

    2017-01-01

    In spite of accelerating human genome discoveries in a wide variety of diseases of public health significance, the promise of personalized health care and disease prevention based on genomics has lagged behind. In a time of limited resources, public health agencies must continue to focus on implementing programs that can improve health and prevent disease now. Nevertheless, public health has an important and assertive leadership role in addressing the promise and pitfalls of human genomics for population health. Such efforts are needed not only to implement what is known in genomics to improve health but also to reduce potential harm and create the infrastructure needed to derive health benefits in the future. PMID:21406285

  9. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    PubMed

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, are complex to implement and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.

  10. Design and Implementation of a Randomized Controlled Trial of Genomic Counseling for Patients with Chronic Disease

    PubMed Central

    Sweet, Kevin; Gordon, Erynn S.; Sturm, Amy C.; Schmidlen, Tara J.; Manickam, Kandamurugu; Toland, Amanda Ewart; Keller, Margaret A.; Stack, Catharine B.; García-España, J. Felipe; Bellafante, Mark; Tayal, Neeraj; Embi, Peter; Binkley, Philip; Hershberger, Ray E.; Sadee, Wolfgang; Christman, Michael; Marsh, Clay

    2014-01-01

    We describe the development and implementation of a randomized controlled trial to investigate the impact of genomic counseling on a cohort of patients with heart failure (HF) or hypertension (HTN), managed at a large academic medical center, the Ohio State University Wexner Medical Center (OSUWMC). Our study is built upon the existing Coriell Personalized Medicine Collaborative (CPMC®). OSUWMC patient participants with chronic disease (CD) receive eight actionable complex disease and one pharmacogenomic test report through the CPMC® web portal. Participants are randomized to either the in-person post-test genomic counseling—active arm, versus web-based only return of results—control arm. Study-specific surveys measure: (1) change in risk perception; (2) knowledge retention; (3) perceived personal control; (4) health behavior change; and, for the active arm (5), overall satisfaction with genomic counseling. This ongoing partnership has spurred creation of both infrastructure and procedures necessary for the implementation of genomics and genomic counseling in clinical care and clinical research. This included creation of a comprehensive informed consent document and processes for prospective return of actionable results for multiple complex diseases and pharmacogenomics (PGx) through a web portal, and integration of genomic data files and clinical decision support into an EPIC-based electronic medical record. We present this partnership, the infrastructure, genomic counseling approach, and the challenges that arose in the design and conduct of this ongoing trial to inform subsequent collaborative efforts and best genomic counseling practices. PMID:24926413

  11. solGS: a web-based tool for genomic selection

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and statistical complexity, however, is a serious challenge in data management, ana...

  12. The business of genomic testing: a survey of early adopters.

    PubMed

    Crawford, James M; Bry, Lynn; Pfeifer, John; Caughron, Samuel K; Black-Schaffer, Stephen; Kant, Jeffrey A; Kaufman, Jill H

    2014-12-01

    The practice of "genomic" (or "personalized") medicine requires the availability of appropriate diagnostic testing. Our study objective was to identify the reasons for health systems to bring next-generation sequencing into their clinical laboratories and to understand the process by which such decisions were made. Such information may be of value to other health systems seeking to provide next-generation sequencing testing to their patient populations. A standardized open-ended interview was conducted with the laboratory medical directors and/or department of pathology chairs of 13 different academic institutions in 10 different states. Genomic testing for cancer dominated the institutional decision making, with three primary reasons: more effective delivery of cancer care, the perceived need for institutional leadership in the field of genomics, and the premise that genomics will eventually be cost-effective. Barriers to implementation included implementation cost; the time and effort needed to maintain this newer testing; challenges in interpreting genetic variants; establishing the bioinformatics infrastructure; and curating data from medical, ethical, and legal standpoints. Ultimate success depended on alignment with institutional strengths and priorities and working closely with institutional clinical programs. These early adopters uniformly viewed genomic analysis as an imperative for developing their expertise in the implementation and practice of genomic medicine.

  13. Theory of prokaryotic genome evolution.

    PubMed

    Sela, Itamar; Wolf, Yuri I; Koonin, Eugene V

    2016-10-11

    Bacteria and archaea typically possess small genomes that are tightly packed with protein-coding genes. The compactness of prokaryotic genomes is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. Here, by fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. These results suggest that the number of genes in prokaryotic genomes reflects the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias (i.e., the rate of deletion of genetic material being slightly greater than the rate of acquisition). Thus, new genes acquired by microbial genomes, on average, appear to be adaptive. The tight spacing of protein-coding genes likely results from a combination of the deletion bias and purifying selection that efficiently eliminates nonfunctional, noncoding sequences.

  14. Selective significance of genome size in a plant community with heavy metal pollution.

    PubMed

    Vidic, T; Greilhuber, J; Vilhar, B; Dermastia, M

    2009-09-01

    In eukaryotes, nuclear genome sizes vary by more than five orders of magnitude. This variation is not related to organismal complexity, and its origin and biological significance are still disputed. One of the open questions is whether genome size has an adaptive role. We tested the hypothesis that genome size has selective significance, using five grassland communities occurring on a gradient of metal pollution of the soil as a model. We detected a negative correlation between the concentration of contaminating metals in the soil and the number of vascular plant species. Analysis of genome sizes of 70 herbaceous dicot perennial species occurring on the investigated plots revealed a negative correlation between the concentration of contaminating metals in the soil and the proportion of species with large genomes in plant communities. Consistent with the hypothesis, these results show that species with large genomes are at selective disadvantage in extreme environmental conditions.

  15. The genome landscape of indigenous African cattle.

    PubMed

    Kim, Jaemin; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Bashir, Salim; Diallo, Boubacar; Agaba, Morris; Kim, Kwondo; Kwak, Woori; Sung, Samsun; Seo, Minseok; Jeong, Hyeonsoo; Kwon, Taehyung; Taye, Mengistie; Song, Ki-Duk; Lim, Dajeong; Cho, Seoae; Lee, Hyun-Jeong; Yoon, Duhak; Oh, Sung Jong; Kemp, Stephen; Lee, Hak-Kyo; Kim, Heebal

    2017-02-20

    The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N'Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle especially in zebu breeds. Our findings unravel at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent.

  16. Multiplexed genome engineering and genotyping methods applications for synthetic biology and metabolic engineering.

    PubMed

    Wang, Harris H; Church, George M

    2011-01-01

    Engineering at the scale of whole genomes requires fundamentally new molecular biology tools. Recent advances in recombineering using synthetic oligonucleotides enable the rapid generation of mutants at high efficiency and specificity and can be implemented at the genome scale. With these techniques, libraries of mutants can be generated, from which individuals with functionally useful phenotypes can be isolated. Furthermore, populations of cells can be evolved in situ by directed evolution using complex pools of oligonucleotides. Here, we discuss ways to utilize these multiplexed genome engineering methods, with special emphasis on experimental design and implementation. Copyright © 2011 Elsevier Inc. All rights reserved.

  17. Performance of genomic prediction within and across generations in maritime pine.

    PubMed

    Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent

    2016-08-11

    Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.

  18. Signatures of selection in the three-spined stickleback along a small-scale brackish water - freshwater transition zone.

    PubMed

    Konijnendijk, Nellie; Shikano, Takahito; Daneels, Dorien; Volckaert, Filip A M; Raeymaekers, Joost A M

    2015-09-01

    Local adaptation is often obvious when gene flow is impeded, such as observed at large spatial scales and across strong ecological contrasts. However, it becomes less certain at small scales such as between adjacent populations or across weak ecological contrasts, when gene flow is strong. While studies on genomic adaptation tend to focus on the former, less is known about the genomic targets of natural selection in the latter situation. In this study, we investigate genomic adaptation in populations of the three-spined stickleback Gasterosteus aculeatus L. across a small-scale ecological transition with salinities ranging from brackish to fresh. Adaptation to salinity has been repeatedly demonstrated in this species. A genome scan based on 87 microsatellite markers revealed only few signatures of selection, likely owing to the constraints that homogenizing gene flow puts on adaptive divergence. However, the detected loci appear repeatedly as targets of selection in similar studies of genomic adaptation in the three-spined stickleback. We conclude that the signature of genomic selection in the face of strong gene flow is weak, yet detectable. We argue that the range of studies of genomic divergence should be extended to include more systems characterized by limited geographical and ecological isolation, which is often a realistic setting in nature.

  19. Signatures of positive selection in East African Shorthorn Zebu: A genome-wide single nucleotide polymorphism analysis

    PubMed Central

    Bahbahani, Hussain; Clifford, Harry; Wragg, David; Mbole-Kariuki, Mary N; Van Tassell, Curtis; Sonstegard, Tad; Woolhouse, Mark; Hanotte, Olivier

    2015-01-01

    The small East African Shorthorn Zebu (EASZ) is the main indigenous cattle across East Africa. A recent genome wide SNP analysis revealed an ancient stable African taurine x Asian zebu admixture. Here, we assess the presence of candidate signatures of positive selection in their genome, with the aim to provide qualitative insights about the corresponding selective pressures. Four hundred and twenty-five EASZ and four reference populations (Holstein-Friesian, Jersey, N’Dama and Nellore) were analysed using 46,171 SNPs covering all autosomes and the X chromosome. Following FST and two extended haplotype homozygosity-based (iHS and Rsb) analyses 24 candidate genome regions within 14 autosomes and the X chromosome were revealed, of which 18 and 4 were previously identified in tropical-adapted and commercial breeds, respectively. These regions overlap with 340 bovine QTL. They include 409 annotated genes, of which 37 were considered as candidates. These genes are involved in various biological pathways (e.g. immunity, reproduction, development and heat tolerance). Our results support that different selection pressures (e.g. environmental constraints, human selection, genome admixture constraints) have shaped the genome of EASZ. We argue that these candidate regions represent genome landmarks to be maintained in breeding programs aiming to improve sustainable livestock productivity in the tropics. PMID:26130263
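
    For readers unfamiliar with the FST statistic mentioned in this abstract, a minimal Python sketch of a per-SNP calculation for two populations follows. The abstract does not state which FST estimator was used, so the heterozygosity-based form below, with equal weighting of the two populations, is an assumption; the haplotype-based iHS and Rsb statistics are not shown.

```python
import math


def fst_two_populations(p1, p2):
    """Per-SNP FST = (H_T - H_S) / H_T for two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and 2.
    Returns NaN when the site is monomorphic in the pooled sample.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return float("nan")
    # Mean expected heterozygosity within populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies at three SNPs (population 1, population 2).
snps = [(0.1, 0.9), (0.5, 0.5), (0.3, 0.4)]
fst_values = [fst_two_populations(p1, p2) for p1, p2 in snps]
```

    SNPs with unusually high values relative to the genome-wide distribution are the ones a scan of this kind would flag as candidate regions.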

  20. Efficient engineering of chromosomal ribosome binding site libraries in mismatch repair proficient Escherichia coli.

    PubMed

    Oesterle, Sabine; Gerngross, Daniel; Schmitt, Steven; Roberts, Tania Michelle; Panke, Sven

    2017-09-26

    Multiplexed gene expression optimization via modulation of gene translation efficiency through ribosome binding site (RBS) engineering is a valuable approach for optimizing artificial properties in bacteria, ranging from genetic circuits to production pathways. Established algorithms design smart RBS-libraries based on a single partially-degenerate sequence that efficiently samples the entire space of translation initiation rates. However, the sequence space that is accessible when integrating the library by CRISPR/Cas9-based genome editing is severely restricted by DNA mismatch repair (MMR) systems. MMR efficiency depends on the type and length of the mismatch and thus effectively removes potential library members from the pool. Rather than working in MMR-deficient strains, which accumulate off-target mutations, or depending on temporary MMR inactivation, which requires additional steps, we eliminate this limitation by developing a pre-selection rule of genome-library-optimized-sequences (GLOS) that enables introducing large functional diversity into MMR-proficient strains with sequences that are no longer subject to MMR-processing. We implement several GLOS-libraries in Escherichia coli and show that GLOS-libraries indeed retain diversity during genome editing and that such libraries can be used in complex genome editing operations such as concomitant deletions. We argue that this approach allows for stable and efficient fine tuning of chromosomal functions with minimal effort.

  1. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles.

    PubMed

    Orozco-terWengel, Pablo; Kapun, Martin; Nolte, Viola; Kofler, Robert; Flatt, Thomas; Schlötterer, Christian

    2012-10-01

    The genomic basis of adaptation to novel environments is a fundamental problem in evolutionary biology that has gained additional importance in the light of the recent global change discussion. Here, we combined laboratory natural selection (experimental evolution) in Drosophila melanogaster with genome-wide next generation sequencing of DNA pools (Pool-Seq) to identify alleles that are favourable in a novel laboratory environment and traced their trajectories during the adaptive process. Already after 15 generations, we identified a pronounced genomic response to selection, with almost 5000 single nucleotide polymorphisms (SNP; genome-wide false discovery rates < 0.005%) deviating from neutral expectation. Importantly, the evolutionary trajectories of the selected alleles were heterogeneous, with the alleles falling into two distinct classes: (i) alleles that continuously rise in frequency; and (ii) alleles that at first increase rapidly but whose frequencies then reach a plateau. Our data thus suggest that the genomic response to selection can involve a large number of selected SNPs that show unexpectedly complex evolutionary trajectories, possibly due to nonadditive effects. © 2012 Blackwell Publishing Ltd.

  2. Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics.

    PubMed

    Fisher, Charles K; Mehta, Pankaj

    2015-06-01

    Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach, the Bayesian Ising Approximation (BIA), to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, is freely available at http://physics.bu.edu/~pankajm/BIACode. © The Author 2015. Published by Oxford University Press.

  3. Discovering functional DNA elements using population genomic information: a proof of concept using human mtDNA.

    PubMed

    Schrider, Daniel R; Kern, Andrew D

    2014-06-09

    Identifying the complete set of functional elements within the human genome would be a windfall for multiple areas of biological research including medicine, molecular biology, and evolution. Complete knowledge of function would aid in the prioritization of loci when searching for the genetic bases of disease or adaptive phenotypes. Because mutations that disrupt function are disfavored by natural selection, purifying selection leaves a detectable signature within functional elements; accordingly, this signal has been exploited for over a decade through the use of genomic comparisons of distantly related species. While this is so, the functional complement of the genome changes extensively across time and between lineages; therefore, evidence of the current action of purifying selection in humans is essential. Because the removal of deleterious mutations by natural selection also reduces within-species genetic diversity within functional loci, dense population genetic data have the potential to reveal genomic elements that are currently functional. Here, we assess the potential of this approach by examining an ultradeep sample of human mitochondrial genomes (n = 16,411). We show that the high density of polymorphism in this data set precisely delineates regions experiencing purifying selection. Furthermore, we show that the number of segregating alleles at a site is strongly correlated with its divergence across species after accounting for known mutational biases in human mitochondrial DNA (ρ = 0.51; P < 2.2 × 10^-16). These two measures track one another at a remarkably fine scale across many loci, a correlation that is purely the result of natural selection. Our results demonstrate that genetic variation has the potential to reveal with surprising precision which regions in the genome are currently performing important functions and likely to have deleterious fitness effects when mutated.
    As more complete human genomes are sequenced, similar power to reveal purifying selection may be achievable in the human nuclear genome. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  4. A systems approach defining constraints of the genome architecture on lineage selection and evolvability during somatic cancer evolution

    PubMed Central

    Rübben, Albert; Nordhoff, Ole

    2013-01-01

    Summary: Most clinically distinguishable malignant tumors are characterized by specific mutations, specific patterns of chromosomal rearrangements and a predominant mechanism of genetic instability but it remains unsolved whether modifications of cancer genomes can be explained solely by mutations and selection through the cancer microenvironment. It has been suggested that internal dynamics of genomic modifications as opposed to the external evolutionary forces have a significant and complex impact on Darwinian species evolution. A similar situation can be expected for somatic cancer evolution as molecular key mechanisms encountered in species evolution also constitute prevalent mutation mechanisms in human cancers. This assumption is developed into a systems approach of carcinogenesis which focuses on possible inner constraints of the genome architecture on lineage selection during somatic cancer evolution. The proposed systems approach can be considered an analogy to the concept of evolvability in species evolution. The principal hypothesis is that permissive or restrictive effects of the genome architecture on lineage selection during somatic cancer evolution exist and have a measurable impact. The systems approach postulates three classes of lineage selection effects of the genome architecture on somatic cancer evolution: i) effects mediated by changes of fitness of cells of cancer lineage, ii) effects mediated by changes of mutation probabilities and iii) effects mediated by changes of gene designation and physical and functional genome redundancy. Physical genome redundancy is the copy number of identical genetic sequences. Functional genome redundancy of a gene or a regulatory element is defined as the number of different genetic elements, regardless of copy number, coding for the same specific biological function within a cancer cell.
    Complex interactions of the genome architecture on lineage selection may be expected when modifications of the genome architecture have multiple and possibly opposed effects which manifest themselves at disparate times and progression stages. Dissection of putative mechanisms mediating constraints exerted by the genome architecture on somatic cancer evolution may provide an algorithm for understanding and predicting as well as modifying somatic cancer evolution in individual patients. PMID:23336076

  5. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    PubMed

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. 
    Its application may enhance progress in elucidating transcription regulation mechanisms and thus benefit the genomic research community and prokaryotic genome researchers in particular.

  6. Accuracy of genomic selection in European maize elite breeding populations.

    PubMed

    Zhao, Yusheng; Gowda, Manje; Liu, Wenxin; Würschum, Tobias; Maurer, Hans P; Longin, Friedrich H; Ranc, Nicolas; Reif, Jochen C

    2012-03-01

    Genomic selection is a promising breeding strategy for rapid improvement of complex traits. The objective of our study was to investigate the prediction accuracy of genomic breeding values through cross validation. The study was based on experimental data of six segregating populations from a half-diallel mating design with 788 testcross progenies from an elite maize breeding program. The plants were intensively phenotyped in multi-location field trials and fingerprinted with 960 SNP markers. We used random regression best linear unbiased prediction in combination with fivefold cross validation. The prediction accuracy across populations was higher for grain moisture (0.90) than for grain yield (0.58). The accuracy of genomic selection realized for grain yield corresponds to the precision of phenotyping at unreplicated field trials in 3-4 locations. As for maize up to three generations are feasible per year, selection gain per unit time is high and, consequently, genomic selection holds great promise for maize breeding programs.
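
    The cross-validation scheme described in this abstract can be illustrated with a minimal sketch. It assumes the marker model takes the usual ridge form (marker effects shrunk by a penalty equal to the ratio of residual to marker variance) and uses simulated data, not the maize data; only the sample sizes (788 progenies, 960 markers) are borrowed from the abstract. The heritability, the marker coding, the fixed penalty and all function names are hypothetical choices made for illustration. In practice the variance ratio would be estimated rather than fixed.

```python
import numpy as np


def fit_marker_effects(Z, y, lam):
    """Ridge solution: solve (Z'Z + lam*I) u = Z'(y - mean) on centred markers."""
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    mu = y.mean()
    lhs = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    u = np.linalg.solve(lhs, Zc.T @ (y - mu))
    return mu, col_means, u


def predict(Z, mu, col_means, u):
    """Predicted value = intercept + centred marker scores times estimated effects."""
    return mu + (Z - col_means) @ u


def fivefold_predictive_ability(Z, y, lam, n_folds=5, seed=0):
    """Mean correlation between predicted and observed values over held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    cors = []
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        mu, col_means, u = fit_marker_effects(Z[train], y[train], lam)
        cors.append(np.corrcoef(predict(Z[test], mu, col_means, u), y[test])[0, 1])
    return float(np.mean(cors))


# Simulated stand-in data (NOT the maize data); sizes borrowed from the abstract.
rng = np.random.default_rng(1)
n_lines, n_markers, h2 = 788, 960, 0.6            # h2 is a hypothetical heritability
Z = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))  # assumed -1/+1 marker coding
g = Z @ rng.normal(size=n_markers)                # simulated genetic values
y = g + rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n_lines)

lam = n_markers * (1 - h2) / h2                   # assumed ratio sigma_e^2 / sigma_u^2
ability = fivefold_predictive_ability(Z, y, lam)
print(f"mean fivefold predictive ability: {ability:.2f}")
```

    The sketch reports predictive ability, the correlation between predictions and observed phenotypes. Whether and how that value is rescaled to an accuracy of predicted breeding values is not stated in the abstract, so no rescaling is applied here.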

  7. The iSelect 9 K SNP analysis revealed polyploidization induced revolutionary changes and intense human selection causing strong haplotype blocks in wheat.

    PubMed

    Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong

    2017-01-30

    A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. A total of 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, respectively, which formed 878 haplotype blocks. There were more blocks in the B genome, but the average block size was significantly (P < 0.05) smaller than those in the A genome. Intense selection (domestication and breeding) had a stronger effect on the A than on the B genome chromosomes. Based on the genetic pedigrees, many blocks can be traced back to a well-known Strampelli cross, which was made one century ago. Furthermore, polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.

  8. The population genomic signature of environmental selection in the widespread insect-pollinated tree species Frangula alnus at different geographical scales

    PubMed Central

    De Kort, H; Vandepitte, K; Mergeay, J; Mijnsbrugge, K V; Honnay, O

    2015-01-01

    The evaluation of the molecular signatures of selection in species lacking an available closely related reference genome remains challenging, yet it may provide valuable fundamental insights into the capacity of populations to respond to environmental cues. We screened 25 native populations of the tree species Frangula alnus subsp. alnus (Rhamnaceae), covering three different geographical scales, for 183 annotated single-nucleotide polymorphisms (SNPs). Standard population genomic outlier screens were combined with individual-based and multivariate landscape genomic approaches to examine the strength of selection relative to neutral processes in shaping genomic variation, and to identify the main environmental agents driving selection. Our results demonstrate a more distinct signature of selection with increasing geographical distance, as indicated by the proportion of SNPs (i) showing exceptional patterns of genetic diversity and differentiation (outliers) and (ii) associated with climate. Both temperature and precipitation have an important role as selective agents in shaping adaptive genomic differentiation in F. alnus subsp. alnus, although their relative importance differed among spatial scales. At the ‘intermediate’ and ‘regional’ scales, where limited genetic clustering and high population diversity were observed, some indications of natural selection may suggest a major role for gene flow in safeguarding adaptability. High genetic diversity, at loci under selection in particular, indicated considerable adaptive potential, which may nevertheless be compromised by the combined effects of climate change and habitat fragmentation. PMID:25944466

  9. The Causal Meaning of Genomic Predictors and How It Affects Construction and Comparison of Genome-Enabled Selection Models

    PubMed Central

    Valente, Bruno D.; Morota, Gota; Peñagaricano, Francisco; Gianola, Daniel; Weigel, Kent; Rosa, Guilherme J. M.

    2015-01-01

    The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability. PMID:25908318

  10. Sunflower Hybrid Breeding: From Markers to Genomic Selection

    PubMed Central

    Dimitrijevic, Aleksandra; Horn, Renate

    2018-01-01

    In sunflower, molecular markers for simple traits as, e.g., fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotype-by-sequencing, and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive toward other oil crops higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare. 
    Integrative approaches combining omic technologies (genomics, transcriptomics, proteomics, metabolomics and phenomics) using bioinformatic tools will facilitate the identification of target genes and markers for complex traits and will give a better insight into the mechanisms behind the traits. PMID:29387071

  11. Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

    PubMed

    Muley, Vijaykumar Yogesh; Ranjan, Akash

    2012-01-01

    Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample few genomes of related genera of prokaryotes from the large number of available genomes. 
    Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.
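
    As a rough illustration of phylogenetic profiling, one of the genomic context methods named in this abstract, the sketch below scores a protein pair by correlating their presence/absence profiles across a set of reference genomes, and shows that the score depends on which reference genomes are included. The Pearson correlation is only one possible similarity measure and is an assumption here, as are the toy profiles, the protein names and the function name; none of them come from the study.

```python
import numpy as np


def profile_correlation(profile_a, profile_b, genomes=None):
    """Pearson correlation of two binary presence/absence profiles.

    `genomes` optionally restricts the comparison to a subset of reference
    genomes (column indices); constant profiles get a score of 0.0.
    """
    a = np.asarray(profile_a, dtype=float)
    b = np.asarray(profile_b, dtype=float)
    if genomes is not None:
        a, b = a[genomes], b[genomes]
    if a.std() == 0.0 or b.std() == 0.0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical profiles over 8 reference genomes (1 = homolog present).
prot_a = [1, 1, 0, 0, 1, 0, 1, 1]
prot_b = [1, 1, 0, 0, 1, 0, 1, 0]   # co-occurs with prot_a in all but one genome
prot_c = [0, 0, 1, 1, 0, 1, 0, 0]   # present exactly where prot_a is absent

full_score = profile_correlation(prot_a, prot_b)
subset_score = profile_correlation(prot_a, prot_b, genomes=[0, 1, 2, 3])
opposite_score = profile_correlation(prot_a, prot_c)
print(full_score, subset_score, opposite_score)
```

    In this toy case the same protein pair receives a different score on the four-genome subset than on all eight genomes, which is the kind of dependence on reference genome selection the abstract investigates.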

  12. Reproductive technologies combine well with genomic selection in dairy breeding programs.

    PubMed

    Thomasen, J R; Willam, A; Egger-Danner, C; Sørensen, A C

    2016-02-01

    The objective of the present study was to examine whether genomic selection of females interacts with the use of reproductive technologies (RT) to increase annual monetary genetic gain (AMGG). This was tested using a factorial design with 3 factors: genomic selection of females (0 or 2,000 genotyped heifers per year), RT (0 or 50 donors selected at 14 mo of age for producing 10 offspring), and 2 reliabilities of genomic prediction. In addition, different strategies for use of RT and how strategies interact with the reliability of genomic prediction were investigated using stochastic simulation by varying (1) number of donors (25, 50, 100, 200), (2) number of calves born per donor (10 or 20), (3) age of donor (2 or 14 mo), and (4) number of sires (25, 50, 100, 200). In total, 72 different breeding schemes were investigated. The profitability of the different breeding strategies was evaluated by deterministic simulation by varying the costs of a born calf with reproductive technologies at levels of €500, €1,000, and €1,500. The results confirm our hypothesis that combining genomic selection of females with use of RT increases AMGG more than in a reference scheme without genomic selection in females. When the reliability of genomic prediction is high, the effect on rate of inbreeding (ΔF) is small. The study also demonstrates favorable interaction effects between the components of the breeder's equation (selection intensity, selection accuracy, generation interval) for the bull dam donor path, leading to higher AMGG. Increasing the donor program and number of born calves to achieve higher AMGG is associated with the undesirable effect of increased ΔF. This can be alleviated, however, by increasing the numbers of sires without compromising AMGG remarkably. For the major part of the investigated donor schemes, the investment in RT is profitable in dairy cattle populations, even at high levels of costs for RT. Copyright © 2016 American Dairy Science Association. 
    Published by Elsevier Inc. All rights reserved.
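
    The components of the breeder's equation mentioned in this abstract (selection intensity, selection accuracy, generation interval) combine into genetic gain per year as i × r × σ_A / L for a single selection path. The sketch below is a generic textbook calculation, not the simulation used in the study; the proportion selected, accuracy, additive genetic standard deviation and generation intervals are hypothetical numbers, and the function names are illustrative.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardised selection differential for truncation selection on a normal trait."""
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must be strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(threshold) / proportion_selected


def annual_genetic_gain(intensity, accuracy, sigma_a, generation_interval_years):
    """Breeder's equation per unit time for one path: i * r * sigma_A / L."""
    if generation_interval_years <= 0:
        raise ValueError("generation_interval_years must be positive")
    return intensity * accuracy * sigma_a / generation_interval_years


# Hypothetical single-path scenarios; only the generation interval differs.
i_top10 = selection_intensity(0.10)   # top 10% selected
gain_long = annual_genetic_gain(i_top10, accuracy=0.60, sigma_a=1.0,
                                generation_interval_years=4.0)
gain_short = annual_genetic_gain(i_top10, accuracy=0.60, sigma_a=1.0,
                                 generation_interval_years=2.0)
print(f"i = {i_top10:.3f}, gain/year: {gain_long:.3f} vs {gain_short:.3f}")
```

    With intensity and accuracy held equal, halving the generation interval doubles the gain per year. A full breeding scheme would sum the numerators and the generation intervals over several selection paths, which this single-path sketch does not attempt.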

  13. regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests.

    PubMed

    Gel, Bernat; Díez-Villanueva, Anna; Serra, Eduard; Buschbeck, Marcus; Peinado, Miguel A; Malinverni, Roberto

    2016-01-15

    Statistically assessing the relation between a set of genomic regions and other genomic features is a common and challenging task in genomic and epigenomic analyses. Randomization based approaches implicitly take into account the complexity of the genome without the need of assuming an underlying statistical model. regioneR is an R package that implements a permutation test framework specifically designed to work with genomic regions. In addition to the predefined randomization and evaluation strategies, regioneR is fully customizable allowing the use of custom strategies to adapt it to specific questions. Finally, it also implements a novel function to evaluate the local specificity of the detected association. regioneR is an R package released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/regioneR). Contact: rmalinverni@carrerasresearch.org. © The Author 2015. Published by Oxford University Press.
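
    The general idea of a permutation test on genomic regions, as described in this abstract, can be sketched in a few lines. This is not the regioneR API and does not reproduce its randomization or evaluation strategies; it is a minimal Python illustration that assumes a single linear chromosome, half-open intervals, uniform random relocation of each region as the randomization step, and an overlap count as the evaluation step. All names and the toy coordinates are hypothetical.

```python
import random


def count_overlaps(query, subject):
    """Number of query intervals that overlap at least one subject interval."""
    return sum(
        any(q_start < s_end and s_start < q_end for s_start, s_end in subject)
        for q_start, q_end in query
    )


def randomize_regions(regions, genome_length, rng):
    """Move each region to a uniformly random position, keeping its length."""
    shuffled = []
    for start, end in regions:
        length = end - start
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled


def permutation_test(regions, features, genome_length, n_perm=1000, seed=0):
    """Return the observed overlap count and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, features)
    null_counts = [
        count_overlaps(randomize_regions(regions, genome_length, rng), features)
        for _ in range(n_perm)
    ]
    # Add-one correction so the p-value is never exactly zero.
    p_value = (1 + sum(c >= observed for c in null_counts)) / (n_perm + 1)
    return observed, p_value


# Toy data: 20 features of 1 kb spaced every 50 kb, and 10 regions placed on them.
genome_length = 1_000_000
features = [(k * 50_000, k * 50_000 + 1_000) for k in range(20)]
regions = [(k * 50_000 + 500, k * 50_000 + 1_500) for k in range(10)]

observed, p_value = permutation_test(regions, features, genome_length)
print(f"observed overlaps: {observed}, permutation p-value: {p_value:.4f}")
```

    Both the randomization and the evaluation function are passed around as plain functions here, so either could be swapped for a different strategy, which mirrors the customizability the abstract describes without claiming to match the package's interface.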

  14. Optimization of Swine Breeding Programs Using Genomic Selection with ZPLAN+

    PubMed Central

    Lopez, B. M.; Kang, H. S.; Kim, T. H.; Viterbo, V. S.; Kim, H. S.; Na, C. S.; Seo, K. S.

    2016-01-01

    The objective of this study was to evaluate the present conventional selection program of a swine nucleus farm and compare it with a new selection strategy employing genomic enhanced breeding value (GEBV) as the selection criteria. The ZPLAN+ software was employed to calculate and compare the genetic gain, total cost, return and profit of each selection strategy. The first strategy reflected the current conventional breeding program, which was a progeny test system (CS). The second strategy was a selection scheme based strictly on genomic information (GS1). The third scenario was the same as GS1, but the selection by GEBV was further supplemented by the performance test (GS2). The last scenario was a mixture of genomic information and progeny tests (GS3). The results showed that the accuracy of the selection index of young boars of GS1 was 26% higher than that of CS. On the other hand, both GS2 and GS3 gave 31% higher accuracy than CS for young boars. The annual monetary genetic gain of GS1, GS2 and GS3 was 10%, 12%, and 11% higher, respectively, than that of CS. As expected, the discounted costs of genomic selection strategies were higher than those of CS. The costs of GS1, GS2 and GS3 were 35%, 73%, and 89% higher than those of CS, respectively, assuming a genotyping cost of $120. As a result, the discounted profit per animal of GS1 and GS2 was 8% and 2% higher, respectively, than that of CS while GS3 was 6% lower. Comparison among genomic breeding scenarios revealed that GS1 was more profitable than GS2 and GS3. The genomic selection schemes, especially GS1 and GS2, were clearly superior to the conventional scheme in terms of monetary genetic gain and profit. PMID:26954222

  15. Background selection as null hypothesis in population genomics: insights and challenges from Drosophila studies

    PubMed Central

    2017-01-01

    The consequences of selection at linked sites are multiple and widespread across the genomes of most species. Here, I first review the main concepts behind models of selection and linkage in recombining genomes, present the difficulty in parametrizing these models simply as a reduction in effective population size (Ne) and discuss the predicted impact of recombination rates on levels of diversity across genomes. Arguments are then put forward in favour of using a model of selection and linkage with neutral and deleterious mutations (i.e. the background selection model, BGS) as a sensible null hypothesis for investigating the presence of other forms of selection, such as balancing or positive. I also describe and compare two studies that have generated high-resolution landscapes of the predicted consequences of selection at linked sites in Drosophila melanogaster. Both studies show that BGS can explain a very large fraction of the observed variation in diversity across the whole genome, thus supporting its use as null model. Finally, I identify and discuss a number of caveats and challenges in studies of genetic hitchhiking that have been often overlooked, with several of them sharing a potential bias towards overestimating the evidence supporting recent selective sweeps to the detriment of a BGS explanation. One potential source of bias is the analysis of non-equilibrium populations: it is precisely because models of selection and linkage predict variation in Ne across chromosomes that demographic dynamics are not expected to be equivalent chromosome- or genome-wide. Other challenges include the use of incomplete genome annotations, the assumption of temporally stable recombination landscapes, the presence of genes under balancing selection and the consequences of ignoring non-crossover (gene conversion) recombination events. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’. 
    PMID:29109230

  16. Multimedia presentations on the human genome: Implementation and assessment of a teaching program for the introduction to genome science using a poster and animations.

    PubMed

    Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto

    2008-11-01

    Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from Mendel's peas not only because many students do not understand that plants are organisms, but also because human biology contains important social and health issues. Therefore, we have developed a teaching program for the introduction to genome science, whose subjects are focused on the human genome. This program comprises mixed multimedia presentations: a large poster with illustrations and text on the human genome (a human genome map for every home), and animations on the basics of genome science. We implemented and assessed this program at four high schools. Our results indicate that students felt that they learned about the human genome from the program and some increases in students' understanding were observed with longer exposure to the mixed multimedia presentations. Copyright © 2008 International Union of Biochemistry and Molecular Biology, Inc.

  17. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    PubMed

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed the YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic tree of Yersinia.
    We ran analyses based on the tools and genomic data in YersiniaBase and the preliminary results showed differences in virulence genes found in Yersinia pestis and Yersinia pseudotuberculosis compared to other Yersinia species, and differences between Yersinia enterocolitica subsp. enterocolitica and Yersinia enterocolitica subsp. palearctica. YersiniaBase offers free access to a wide range of genomic data and analysis tools for the analysis of Yersinia. YersiniaBase can be accessed at http://yersinia.um.edu.my.

  18. A Meta-Assembly of Selection Signatures in Cattle

    PubMed Central

    Randhawa, Imtiaz A. S.; Khatkar, Mehar S.; Thomson, Peter C.; Raadsma, Herman W.

    2016-01-01

    Since domestication, significant genetic improvement has been achieved for many traits of commercial importance in cattle, including adaptation, appearance and production. In response to such intense selection pressures, the bovine genome has undergone changes at the underlying regions of functional genetic variants, which are termed “selection signatures”. This article reviews 64 recent (2009–2015) investigations testing genomic diversity for departure from neutrality in worldwide cattle populations. In particular, we constructed a meta-assembly of 16,158 selection signatures for individual breeds and their archetype groups (European, African, Zebu and composite) from 56 genome-wide scans representing 70,743 animals of 90 pure and crossbred cattle breeds. Meta-selection-scores (MSS) were computed by combining published results at every given locus, within a sliding window span. MSS were adjusted for common samples across studies and were weighted for significance thresholds across and within studies. Published selection signatures show extensive coverage across the bovine genome; however, the meta-assembly provides a consensus profile of 263 genomic regions of which 141 were unique (113 were breed-specific) and 122 were shared across cattle archetypes. The most prominent peaks of MSS represent regions under selection across multiple populations and harboured genes of known major effects (coat color, polledness and muscle hypertrophy) and genes known to influence polygenic traits (stature, adaptation, feed efficiency, immunity, behaviour, reproduction, beef and dairy production). As the first meta-assembly of selection signatures, it offers novel insights about the hotspots of selective sweeps in the bovine genome, and this method could equally be applied to other species. PMID:27045296

  19. Seamless Genome Editing in Rice via Gene Targeting and Precise Marker Elimination.

    PubMed

    Nishizawa-Yokoi, Ayako; Saika, Hiroaki; Toki, Seiichi

    2016-01-01

    Positive-negative selection using hygromycin phosphotransferase (hpt) and diphtheria toxin A-fragment (DT-A) as positive and negative selection markers, respectively, allows enrichment of cells harboring target genes modified via gene targeting (GT). We have developed a successful GT system employing positive-negative selection and subsequent precise marker excision via the piggyBac transposon derived from the cabbage looper moth to introduce desired modifications into target genes in the rice genome. This approach could be applied to the precision genome editing of almost all endogenous genes throughout the genome, at least in rice.

  20. Genomic selection in sugar beet breeding populations.

    PubMed

    Würschum, Tobias; Reif, Jochen C; Kraft, Thomas; Janssen, Geert; Zhao, Yusheng

    2013-09-18

    Genomic selection exploits dense genome-wide marker data to predict breeding values. In this study we used a large sugar beet population of 924 lines representing different germplasm types present in breeding populations: unselected segregating families and diverse lines from more advanced stages of selection. All lines have been intensively phenotyped in multi-location field trials for six agronomically important traits and genotyped with 677 SNP markers. We used ridge regression best linear unbiased prediction in combination with fivefold cross-validation and obtained high prediction accuracies for all except one trait. In addition, we investigated whether a calibration developed based on a training population composed of diverse lines is suited to predict the phenotypic performance within families. Our results show that the prediction accuracy is lower than that obtained within the diverse set of lines, but comparable to that obtained by cross-validation within the respective families. The results presented in this study suggest that a training population derived from intensively phenotyped and genotyped diverse lines from a breeding program does hold potential to build up robust calibration models for genomic selection. Taken together, our results indicate that genomic selection is a valuable tool and can thus complement the genomics toolbox in sugar beet breeding.
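
    The combination of ridge regression BLUP and fivefold cross-validation named in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, assuming NumPy is available; the function names, the simulated marker matrix and the fixed ridge parameter `lam` (set here to the simulated ratio of residual to marker-effect variance) are assumptions, not details from the study.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for marker effects b (X and y centred)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def cross_validated_accuracy(X, y, lam, k=5, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Centre with training-set means only, so the test fold stays unseen.
        x_mean = X[train_idx].mean(axis=0)
        y_mean = y[train_idx].mean()
        b = ridge_marker_effects(X[train_idx] - x_mean, y[train_idx] - y_mean, lam)
        pred = (X[test_idx] - x_mean) @ b + y_mean
        accuracies.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accuracies))


# Synthetic example: 200 lines, 100 SNPs coded 0/1/2 (sizes chosen for speed).
rng = np.random.default_rng(1)
n_lines, n_markers = 200, 100
X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
true_effects = rng.normal(0.0, 1.0, n_markers)
y = X @ true_effects + rng.normal(0.0, 5.0, n_lines)

accuracy = cross_validated_accuracy(X, y, lam=25.0)
print(round(accuracy, 2))
```

    Prediction accuracy is taken here as the correlation between predicted and observed phenotypes in the held-out fold; in practice the ridge parameter would be estimated from the data rather than fixed.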

  1. Combining genomic selection and gene identification for crop improvement

    USDA-ARS's Scientific Manuscript database

    The use of genetic information to predict the value of individuals in plant breeding populations began about 40 years ago. The original paradigm was to identify genomic regions with outsize influence on a trait of economic value, then to use markers in that genomic region to select individuals carry...

  2. Using genomics to enhance selection of novel traits in North American dairy cattle

    USDA-ARS's Scientific Manuscript database

    Genomics offers new opportunities for the effective selection of novel traits. For traits such as mastitis resistance, hoof health, or the prediction of milk composition from mid-infrared (MIR) data, for example, enough records are usually available to carry out genomic evaluations using sire genoty...

  3. Perspectives for genomic selection applications and research in plants

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) has created a lot of excitement and expectations in the animal and plant breeding research communities. In this review, we briefly describe how genomic prediction can be integrated into breeding efforts and point out achievements and areas where more research is needed. GS pro...

  4. Footprints of Directional Selection in Wild Atlantic Salmon Populations: Evidence for Parasite-Driven Evolution?

    PubMed Central

    Zueva, Ksenia J.; Lumme, Jaakko; Veselov, Alexey E.; Kent, Matthew P.; Lien, Sigbjørn; Primmer, Craig R.

    2014-01-01

    Mechanisms of host-parasite co-adaptation have long been of interest in evolutionary biology; however, determining the genetic basis of parasite resistance has been challenging. Current advances in genome technologies provide new opportunities for obtaining a genome-scale view of the action of parasite-driven natural selection in wild populations and thus facilitate the search for specific genomic regions underlying inter-population differences in pathogen response. European populations of Atlantic salmon (Salmo salar L.) exhibit natural variance in susceptibility levels to the ectoparasite Gyrodactylus salaris Malmberg 1957, ranging from resistance to extreme susceptibility, and are therefore a good model for studying the evolution of virulence and resistance. However, distinguishing the molecular signatures of genetic drift and environment-associated selection in small populations such as land-locked Atlantic salmon populations presents a challenge, specifically in the search for pathogen-driven selection. We used a novel genome-scan analysis approach that enabled us to i) identify signals of selection in salmon populations affected by varying levels of genetic drift and ii) separate potentially selected loci into the categories of pathogen (G. salaris)-driven selection and selection acting upon other environmental characteristics. A total of 4631 single nucleotide polymorphisms (SNPs) were screened in Atlantic salmon from 12 different northern European populations. We identified three genomic regions potentially affected by parasite-driven selection, as well as three regions presumably affected by salinity-driven directional selection. Functional annotation of candidate SNPs is consistent with the role of the detected genomic regions in immune defence and, implicitly, in osmoregulation. These results provide new insights into the genetic basis of pathogen susceptibility in Atlantic salmon and will enable future searches for the specific genes involved. 
    PMID:24670947

  5. Footprints of directional selection in wild Atlantic salmon populations: evidence for parasite-driven evolution?

    PubMed

    Zueva, Ksenia J; Lumme, Jaakko; Veselov, Alexey E; Kent, Matthew P; Lien, Sigbjørn; Primmer, Craig R

    2014-01-01

    Mechanisms of host-parasite co-adaptation have long been of interest in evolutionary biology; however, determining the genetic basis of parasite resistance has been challenging. Current advances in genome technologies provide new opportunities for obtaining a genome-scale view of the action of parasite-driven natural selection in wild populations and thus facilitate the search for specific genomic regions underlying inter-population differences in pathogen response. European populations of Atlantic salmon (Salmo salar L.) exhibit natural variance in susceptibility levels to the ectoparasite Gyrodactylus salaris Malmberg 1957, ranging from resistance to extreme susceptibility, and are therefore a good model for studying the evolution of virulence and resistance. However, distinguishing the molecular signatures of genetic drift and environment-associated selection in small populations such as land-locked Atlantic salmon populations presents a challenge, specifically in the search for pathogen-driven selection. We used a novel genome-scan analysis approach that enabled us to i) identify signals of selection in salmon populations affected by varying levels of genetic drift and ii) separate potentially selected loci into the categories of pathogen (G. salaris)-driven selection and selection acting upon other environmental characteristics. A total of 4631 single nucleotide polymorphisms (SNPs) were screened in Atlantic salmon from 12 different northern European populations. We identified three genomic regions potentially affected by parasite-driven selection, as well as three regions presumably affected by salinity-driven directional selection. Functional annotation of candidate SNPs is consistent with the role of the detected genomic regions in immune defence and, implicitly, in osmoregulation. These results provide new insights into the genetic basis of pathogen susceptibility in Atlantic salmon and will enable future searches for the specific genes involved.

  6. Efficiency of multi-breed genomic selection for dairy cattle breeds with different sizes of reference population.

    PubMed

    Hozé, C; Fritz, S; Phocas, F; Boichard, D; Ducrocq, V; Croiseau, P

    2014-01-01

    Single-breed genomic selection (GS) based on medium single nucleotide polymorphism (SNP) density (~50,000; 50K) is now routinely implemented in several large cattle breeds. However, building large enough reference populations remains a challenge for many medium or small breeds. The high-density BovineHD BeadChip (HD chip; Illumina Inc., San Diego, CA) containing 777,609 SNP developed in 2010 is characterized by short-distance linkage disequilibrium expected to be maintained across breeds. Therefore, combining reference populations can be envisioned. A population of 1,869 influential ancestors from 3 dairy breeds (Holstein, Montbéliarde, and Normande) was genotyped with the HD chip. Using this sample, 50K genotypes were imputed within breed to high-density genotypes, leading to a large HD reference population. This population was used to develop a multi-breed genomic evaluation. The goal of this paper was to investigate the gain of multi-breed genomic evaluation for a small breed. The advantage of using a large breed (Normande in the present study) to mimic a small breed is the large potential validation population to compare alternative genomic selection approaches more reliably. In the Normande breed, 3 training sets were defined with 1,597, 404, and 198 bulls, and a unique validation set included the 394 youngest bulls. For each training set, estimated breeding values (EBV) were computed using pedigree-based BLUP, single-breed BayesC, or multi-breed BayesC for which the reference population was formed by any of the Normande training data sets and 4,989 Holstein and 1,788 Montbéliarde bulls. Phenotypes were standardized by within-breed genetic standard deviation, the proportion of polygenic variance was set to 30%, and the estimated number of SNP with a nonzero effect was about 7,000. The 2 genomic selection (GS) approaches were performed using either the 50K or HD genotypes. 
The correlations between EBV and observed daughter yield deviations (DYD) were computed for 6 traits and using the different prediction approaches. Compared with pedigree-based BLUP, the average gain in accuracy with GS in small populations was 0.057 for the single-breed and 0.086 for multi-breed approach. This gain was up to 0.193 and 0.209, respectively, with the large reference population. Improvement of EBV prediction due to the multi-breed evaluation was higher for animals not closely related to the reference population. In the case of a breed with a small reference population size, the increase in correlation due to multi-breed GS was 0.141 for bulls without their sire in reference population compared with 0.016 for bulls with their sire in reference population. These results demonstrate that multi-breed GS can contribute to increase genomic evaluation accuracy in small breeds. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  7. Community engagement strategies for genomic studies in Africa: a review of the literature.

    PubMed

    Tindana, Paulina; de Vries, Jantina; Campbell, Megan; Littler, Katherine; Seeley, Janet; Marshall, Patricia; Troyer, Jennifer; Ogundipe, Morisola; Alibu, Vincent Pius; Yakubu, Aminu; Parker, Michael

    2015-04-12

    Community engagement has been recognised as an important aspect of the ethical conduct of biomedical research, especially when research is focused on ethnically or culturally distinct populations. While this is a generally accepted tenet of biomedical research, it is unclear what components are necessary for effective community engagement, particularly in the context of genomic research in Africa. We conducted a review of the published literature to identify the community engagement strategies that can support the successful implementation of genomic studies in Africa. Our search strategy used the online databases PubMed (National Library of Medicine), Medline and Google Scholar. Search terms included a combination of the following: community engagement, community advisory boards, community consultation, community participation, effectiveness, genetic and genomic research, Africa, developing countries. A total of 44 articles and 1 thesis were retrieved, of which 38 met the selection criteria. Of these, 21 were primary studies on community engagement, while the rest were secondary reports on community engagement efforts in biomedical research studies. Thirty-four related to biomedical research generally, while 4 were specific to genetic and genomic research in Africa. We concluded that there were several community engagement strategies that could support genomic studies in Africa. While many of the strategies could support the early stages of a research project such as the recruitment of research participants, further research is needed to identify effective strategies to engage research participants and their communities beyond the participant recruitment stage. Research is also needed to address how the views of local communities should be incorporated into future uses of human biological samples. Finally, studies evaluating the impact of community engagement (CE) on genetic research are lacking. Systematic evaluation of CE strategies is essential to determine the most effective models of CE for genetic and genomic research conducted in African settings.

  8. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    PubMed

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.

  9. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    PubMed Central

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. 
Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096

  10. SVGenes: a library for rendering genomic features in scalable vector graphic format.

    PubMed

    Etherington, Graham J; MacLean, Daniel

    2013-08-01

    Drawing genomic features in attractive and informative ways is a key task in visualization of genomics data. Scalable Vector Graphics (SVG) format is a modern and flexible open standard that provides advanced features including modular graphic design, advanced web interactivity and animation within a suitable client. SVGs do not suffer from loss of image quality on re-scaling and provide the ability to edit individual elements of a graphic on the whole object level independent of the whole image. These features make SVG a potentially useful format for the preparation of publication quality figures including genomic objects such as genes or sequencing coverage and for web applications that require rich user-interaction with the graphical elements. SVGenes is a Ruby-language library that uses SVG primitives to render typical genomic glyphs through a simple and flexible Ruby interface. The library implements a simple Page object that spaces and contains horizontal Track objects that in turn style, colour and position features within them. Tracks are the level at which visual information is supplied, providing the full styling capability of the SVG standard. Genomic entities like genes, transcripts and histograms are modelled in Glyph objects that are attached to a track and take advantage of SVG primitives to render the genomic features in a track as any of a selection of defined glyphs. The feature model within SVGenes is simple but flexible and not dependent on particular existing gene feature formats, meaning graphics for any existing datasets can easily be created without need for conversion. The library is provided as a Ruby Gem from https://rubygems.org/gems/bio-svgenes under the MIT license, and open source code is available at https://github.com/danmaclean/bioruby-svgenes also under the MIT License. Contact: dan.maclean@tsl.ac.uk.

  11. SearchSmallRNA: a graphical interface tool for the assemblage of viral genomes using small RNA libraries data

    PubMed Central

    2014-01-01

    Background: Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. Methods: To bypass the need for the command line and basic bioinformatics knowledge, we developed mapping software with a graphical interface for the assembly of viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. Results: The program compares each sequenced read in a library with a chosen reference genome. Reads with a Hamming distance smaller than or equal to the allowed number of mismatches are selected as positives and used for the assembly of a long nucleotide genome sequence. To validate the software, distinct analyses using NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. Conclusions: SearchSmallRNA was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it will be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. Availability and implementation: SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/. PMID:24607237
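
    The read-selection step described in this abstract (keeping reads whose Hamming distance to the reference is within an allowed number of mismatches) can be sketched as below. SearchSmallRNA itself is written in Java; this Python version is only an illustration of the technique, and the function names, the brute-force scan over every reference position and the toy sequences are assumptions, not the tool's actual implementation.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_reads(reads, reference, max_mismatches):
    """Return (read, position, distance) for every placement within the limit."""
    hits = []
    for read in reads:
        for pos in range(len(reference) - len(read) + 1):
            distance = hamming(read, reference[pos:pos + len(read)])
            if distance <= max_mismatches:
                hits.append((read, pos, distance))
    return hits


def covered_positions(hits, reference_length):
    """Mark which reference positions are covered by at least one accepted read."""
    covered = [False] * reference_length
    for read, pos, _ in hits:
        for i in range(pos, pos + len(read)):
            covered[i] = True
    return covered


# Toy sequences, invented for illustration.
reference = "ACGTACGTTAGC"
reads = ["ACGT", "TTAG", "GGGG", "CGTA"]

hits = map_reads(reads, reference, max_mismatches=1)
covered = covered_positions(hits, len(reference))
print(f"{sum(covered)} of {len(reference)} reference positions covered")
```

    Reads that match nowhere within the mismatch limit (here "GGGG") are simply dropped, and the covered positions indicate how much of the reference the accepted reads could reconstruct.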

  12. Nuclease Target Site Selection for Maximizing On-target Activity and Minimizing Off-target Effects in Genome Editing

    PubMed Central

    Lee, Ciaran M; Cradick, Thomas J; Fine, Eli J; Bao, Gang

    2016-01-01

    The rapid advancement in targeted genome editing using engineered nucleases such as ZFNs, TALENs, and CRISPR/Cas9 systems has resulted in a suite of powerful methods that allows researchers to target any genomic locus of interest. A complementary set of design tools has been developed to aid researchers with nuclease design, target site selection, and experimental validation. Here, we review the various tools available for target selection in designing engineered nucleases, and for quantifying nuclease activity and specificity, including web-based search tools and experimental methods. We also elucidate challenges in target selection, especially in predicting off-target effects, and discuss future directions in precision genome editing and its applications. PMID:26750397

  13. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species

    PubMed Central

    Wang, Jing; Street, Nathaniel R.; Scofield, Douglas G.; Ingvarsson, Pär K.

    2016-01-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. PMID:26721855

  14. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    PubMed

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  15. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations.

    PubMed

    Fuentes-Pardo, Angela P; Ruzzante, Daniel E

    2017-10-01

    Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.

  16. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    PubMed

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome sequence data alongside the 54k SNP set. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  17. Stable plastid transformation in Scoparia dulcis L.

    PubMed

    Muralikrishna, Narra; Srinivas, Kota; Kumar, Kalva Bharath; Sadanandam, Abbagani

    2016-10-01

    In the present investigation we report stable plastid transformation in Scoparia dulcis L., a versatile medicinal herb, via the particle gun method. The vector KNTc, harbouring aadA as a selectable marker and egfp as a reporter gene, both under the control of the synthetic promoter pNG1014a, targets the inverted repeats, trnR/trnN, of the plastid genome. Using this heterologous vector with a suitable selection protocol, transplastomic lines were successfully recovered with an overall efficiency of two transgenic lines per 25 bombarded leaf explants. PCR and Southern blot analysis demonstrated stable integration of the foreign gene into the target sequences. The results represent a significant advancement of plastid transformation technology in medicinal plants, with relevance for enhancing and regulating certain metabolic pathways.

  18. Genome dynamics of the human embryonic kidney 293 lineage in response to cell biology manipulations.

    PubMed

    Lin, Yao-Cheng; Boone, Morgane; Meuris, Leander; Lemmens, Irma; Van Roy, Nadine; Soete, Arne; Reumers, Joke; Moisse, Matthieu; Plaisance, Stéphane; Drmanac, Radoje; Chen, Jason; Speleman, Frank; Lambrechts, Diether; Van de Peer, Yves; Tavernier, Jan; Callewaert, Nico

    2014-09-03

    The HEK293 human cell lineage is widely used in cell biology and biotechnology. Here we use whole-genome resequencing of six 293 cell lines to study the dynamics of this aneuploid genome in response to the manipulations used to generate common 293 cell derivatives, such as transformation and stable clone generation (293T); suspension growth adaptation (293S); and cytotoxic lectin selection (293SG). Remarkably, we observe that copy number alteration detection could identify the genomic region that enabled cell survival under selective conditions (in this case, ricin selection). Furthermore, we present methods to detect human/vector genome breakpoints and a user-friendly visualization tool for the 293 genome data. We also establish that the genome structure composition is in steady state for most of these cell lines when standard cell culturing conditions are used. This resource enables novel and more informed studies with 293 cells, and we will distribute the sequenced cell lines to this effect.

  19. Methods of Genomic Competency Integration in Practice

    PubMed Central

    Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

    2015-01-01

    Purpose: Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design: Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods: Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings: The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions: Nursing champions can facilitate change in genomic nursing capacity through varied strategies but require substantial training in order to design and implement interventions. Clinical Relevance: Genomics is critical to the practice of all nurses. There is a great opportunity and interest to address genomic knowledge deficits in the practicing nurse workforce as a strategy to improve patient outcomes. Exemplars of champion dyad interventions designed to increase nursing capacity focus on improving education, policy, and healthcare services. PMID:25808828

  20. Economic evaluation of progeny-testing and genomic selection schemes for small-sized nucleus dairy cattle breeding programs in developing countries.

    PubMed

    Kariuki, C M; Brascamp, E W; Komen, H; Kahi, A K; van Arendonk, J A M

    2017-03-01

    In developing countries minimal and erratic performance and pedigree recording impede implementation of large-sized breeding programs. Small-sized nucleus programs offer an alternative but rely on their economic performance for their viability. We investigated the economic performance of 2 alternative small-sized dairy nucleus programs [i.e., progeny testing (PT) and genomic selection (GS)] over a 20-yr investment period. The nucleus was made up of 453 male and 360 female animals distributed in 8 non-overlapping age classes. Each year 10 active sires and 100 elite dams were selected. Populations of commercial recorded cows (CRC) of sizes 12,592 and 25,184 were used to produce test daughters in PT or to create a reference population in GS, respectively. Economic performance was defined as gross margins, calculated as discounted revenues minus discounted costs following a single generation of selection. Revenues were calculated as cumulative discounted expressions (CDE, kg) × 0.32 (€/kg of milk) × 100,000 (size commercial population). Genetic superiorities, deterministically simulated using pseudo-BLUP index and CDE, were determined using gene flow. Costs were for one generation of selection. Results show that GS schemes had higher cumulated genetic gain in the commercial cow population and higher gross margins compared with PT schemes. Gross margins were between 3.2- and 5.2-fold higher for GS, depending on size of the CRC population. The increase in gross margin was mostly due to a decreased generation interval and lower running costs in GS schemes. In PT schemes many bulls are culled before selection. We therefore also compared 2 schemes in which semen was stored instead of keeping live bulls. As expected, semen storage resulted in an increase in gross margins in PT schemes, but gross margins remained lower than those of GS schemes. We conclude that implementation of small-sized GS breeding schemes can be economically viable for developing countries. The Authors. Published by the Federation of Animal Science Societies and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
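
    The revenue and gross-margin definitions in the abstract above can be sketched in a few lines of Python. The milk price (0.32 €/kg) and the commercial population size (100,000) are taken from the abstract; the function names and the CDE and cost figures in the usage lines are hypothetical placeholders, not values reported by the study.

```python
MILK_PRICE_EUR_PER_KG = 0.32      # from the abstract
COMMERCIAL_POPULATION = 100_000   # from the abstract


def discounted_revenue(cde_kg: float) -> float:
    """Revenue = CDE (kg) x milk price (EUR/kg) x commercial population size."""
    return cde_kg * MILK_PRICE_EUR_PER_KG * COMMERCIAL_POPULATION


def gross_margin(cde_kg: float, discounted_costs_eur: float) -> float:
    """Gross margin = discounted revenues minus discounted costs."""
    return discounted_revenue(cde_kg) - discounted_costs_eur


# Hypothetical inputs, for illustration only (not results from the study).
example_margin = gross_margin(cde_kg=50.0, discounted_costs_eur=400_000.0)
print(f"Illustrative gross margin: {example_margin:,.0f} EUR")
```

    Comparing two schemes then amounts to calling gross_margin once per scheme with that scheme's own CDE and cost estimates.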

  1. Genetic signatures of natural selection in a model invasive ascidian

    NASA Astrophysics Data System (ADS)

    Lin, Yaping; Chen, Yiyong; Yi, Changho; Fong, Jonathan J.; Kim, Won; Rius, Marc; Zhan, Aibin

    2017-03-01

    Invasive species represent promising models to study species’ responses to rapidly changing environments. Although local adaptation frequently occurs during contemporary range expansion, the associated genetic signatures at both population and genomic levels remain largely unknown. Here, we use genome-wide gene-associated microsatellites to investigate genetic signatures of natural selection in a model invasive ascidian, Ciona robusta. Population genetic analyses of 150 individuals sampled in Korea, New Zealand, South Africa and Spain showed significant genetic differentiation among populations. Based on outlier tests, we found high incidence of signatures of directional selection at 19 loci. Hitchhiking mapping analyses identified 12 directional selective sweep regions, and all selective sweep windows on chromosomes were narrow (~8.9 kb). Further analyses identified 132 candidate genes under selection. When we compared our genetic data and six crucial environmental variables, 16 putatively selected loci showed significant correlation with these environmental variables. This suggests that the local environmental conditions have left significant signatures of selection at both population and genomic levels. Finally, we identified “plastic” genomic regions and genes that are promising regions to investigate evolutionary responses to rapid environmental change in C. robusta.

  2. The locus of sexual selection: moving sexual selection studies into the post-genomics era.

    PubMed

    Wilkinson, G S; Breden, F; Mank, J E; Ritchie, M G; Higginson, A D; Radwan, J; Jaquiery, J; Salzburger, W; Arriero, E; Barribeau, S M; Phillips, P C; Renn, S C P; Rowe, L

    2015-04-01

    Sexual selection drives fundamental evolutionary processes such as trait elaboration and speciation. Despite this importance, there are surprisingly few examples of genes unequivocally responsible for variation in sexually selected phenotypes. This lack of information inhibits our ability to predict phenotypic change due to universal behaviours, such as fighting over mates and mate choice. Here, we discuss reasons for this apparent gap and provide recommendations for how it can be overcome by adopting contemporary genomic methods, exploiting underutilized taxa that may be ideal for detecting the effects of sexual selection and adopting appropriate experimental paradigms. Identifying genes that determine variation in sexually selected traits has the potential to improve theoretical models and reveal whether the genetic changes underlying phenotypic novelty utilize common or unique molecular mechanisms. Such a genomic approach to sexual selection will help answer questions in the evolution of sexually selected phenotypes that were first asked by Darwin and can furthermore serve as a model for the application of genomics in all areas of evolutionary biology. © 2015 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2015 European Society For Evolutionary Biology.

  3. Genomic Selection in Dairy Cattle: The USDA Experience.

    PubMed

    Wiggans, George R; Cole, John B; Hubbard, Suzanne M; Sonstegard, Tad S

    2017-02-08

    Genomic selection has revolutionized dairy cattle breeding. Since 2000, assays have been developed to genotype large numbers of single-nucleotide polymorphisms (SNPs) at relatively low cost. The first commercial SNP genotyping chip was released with a set of 54,001 SNPs in December 2007. Over 15,000 genotypes were used to determine which SNPs should be used in genomic evaluation of US dairy cattle. Official USDA genomic evaluations were first released in January 2009 for Holsteins and Jerseys, in August 2009 for Brown Swiss, in April 2013 for Ayrshires, and in April 2016 for Guernseys. Producers have accepted genomic evaluations as accurate indications of a bull's eventual daughter-based evaluation. The integration of DNA marker technology and genomics into the traditional evaluation system has doubled the rate of genetic progress for traits of economic importance, decreased generation interval, increased selection accuracy, reduced previous costs of progeny testing, and allowed identification of recessive lethals.

  4. Calculation of genomic predicted transmitting abilities for bovine respiratory disease complex in Holsteins

    USDA-ARS's Scientific Manuscript database

    Bovine Respiratory Disease Complex is a disease that is very costly to the dairy industry. Genomic selection may be an effective tool to improve host resistance to the pathogens that cause this disease. Use of genomic predicted transmitting abilities (GPTA) for selection has had a dramatic effect on...

  5. Accuracy of genomic selection for BCWD resistance in rainbow trout

    USDA-ARS's Scientific Manuscript database

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonids. In this study, we aimed to (1) predict genomic breeding values (GEBV) by genotyping training (n=583) and validation samples (n=53) with a SNP50K chip; and (2) assess the accuracy of genomic selection (GS) for BCWD r...

  6. Signatures of positive selection in East African Shorthorn Zebu: a genome-wide SNP analysis

    USDA-ARS's Scientific Manuscript database

    The small East African Shorthorn Zebu is the main indigenous cattle across East Africa. A recent genome-wide SNP analysis has revealed their ancient, stable African taurine x Asian zebu admixture. Here, we assess the presence of candidate signatures of positive selection in their genome, with the aim...

  7. GenomicTools: a computational platform for developing high-throughput analytics in genomics.

    PubMed

    Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

    2012-01-15

    Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
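
    As an illustration of the kind of "operations between sets of genomic regions" the abstract refers to, the following Python sketch intersects two region sets. It is not the GenomicTools API or command-line interface; the Region type, the function name and the half-open coordinate convention are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open [start, end).
Region = Tuple[str, int, int]


def intersect_regions(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the overlapping parts of two region sets.

    Inputs are sorted here, so callers need not pre-sort them.
    Assumes regions within each individual set do not overlap one another.
    """
    a, b = sorted(a), sorted(b)
    out: List[Region] = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever set is behind in chromosome order.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            out.append((chrom_a, start, end))
        # Advance the region that ends first.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
genes = [("chr1", 150, 350), ("chr2", 60, 70)]
overlaps = intersect_regions(peaks, genes)
print(overlaps)
```

    The two-pointer sweep visits each region once after sorting, and it holds only the two input lists and the output in memory.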

  8. Developing knowledge resources to support precision medicine: principles from the Clinical Pharmacogenetics Implementation Consortium (CPIC).

    PubMed

    Hoffman, James M; Dunnenberger, Henry M; Kevin Hicks, J; Caudle, Kelly E; Whirl Carrillo, Michelle; Freimuth, Robert R; Williams, Marc S; Klein, Teri E; Peterson, Josh F

    2016-07-01

    To move beyond a select few genes/drugs, the successful adoption of pharmacogenomics into routine clinical care requires a curated and machine-readable database of pharmacogenomic knowledge suitable for use in an electronic health record (EHR) with clinical decision support (CDS). Recognizing that EHR vendors do not yet provide a standard set of CDS functions for pharmacogenetics, the Clinical Pharmacogenetics Implementation Consortium (CPIC) Informatics Working Group is developing and systematically incorporating a set of EHR-agnostic implementation resources into all CPIC guidelines. These resources illustrate how to integrate pharmacogenomic test results in clinical information systems with CDS to facilitate the use of patient genomic data at the point of care. Based on our collective experience creating existing CPIC resources and implementing pharmacogenomics at our practice sites, we outline principles to define the key features of future knowledge bases and discuss the importance of these knowledge resources for pharmacogenomics and ultimately precision medicine. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  9. Going where traditional markers have not gone before: utility of and promise for RAD sequencing in marine invertebrate phylogeography and population genomics.

    PubMed

    Reitzel, A M; Herrera, S; Layden, M J; Martindale, M Q; Shank, T M

    2013-06-01

    Characterization of large numbers of single-nucleotide polymorphisms (SNPs) throughout a genome has the power to refine the understanding of population demographic history and to identify genomic regions under selection in natural populations. To this end, population genomic approaches that harness the power of next-generation sequencing to understand the ecology and evolution of marine invertebrates represent a boon to test long-standing questions in marine biology and conservation. We employed restriction-site-associated DNA sequencing (RAD-seq) to identify SNPs in natural populations of the sea anemone Nematostella vectensis, an emerging cnidarian model with a broad geographic range in estuarine habitats in North and South America, and portions of England. We identified hundreds of SNP-containing tags in thousands of RAD loci from 30 barcoded individuals inhabiting four locations from Nova Scotia to South Carolina. Population genomic analyses using high-confidence SNPs resulted in a highly-resolved phylogeography, a result not achieved in previous studies using traditional markers. Plots of locus-specific FST against heterozygosity suggest that a majority of polymorphic sites are neutral, with a smaller proportion suggesting evidence for balancing selection. Loci inferred to be under balancing selection were mapped to the genome, where 90% were located in gene bodies, indicating potential targets of selection. The results from analyses with and without a reference genome supported similar conclusions, further highlighting RAD-seq as a method that can be efficiently applied to species lacking existing genomic resources. We discuss the utility of RAD-seq approaches in burgeoning Nematostella research as well as in other cnidarian species, particularly corals and jellyfishes, to determine phylogeographic relationships of populations and identify regions of the genome undergoing selection. © 2013 John Wiley & Sons Ltd.
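
    The locus-specific FST and heterozygosity values mentioned above can be illustrated with a minimal Python sketch. It uses the textbook definition FST = (HT - HS) / HT for a single biallelic locus with equally weighted populations; this is an assumption chosen for clarity and is not necessarily the estimator used by the authors. The function name and the example frequencies are hypothetical.

```python
from typing import Sequence, Tuple


def fst_and_heterozygosity(freqs: Sequence[float]) -> Tuple[float, float]:
    """Return (F_ST, H_T) for one biallelic locus.

    freqs holds the frequency of one allele in each sampled population;
    populations are weighted equally (a simplifying assumption).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)                            # total expected heterozygosity
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)   # mean within-population value
    if h_t == 0.0:
        return 0.0, 0.0  # monomorphic locus: F_ST is undefined, reported as 0 here
    return (h_t - h_s) / h_t, h_t


# Hypothetical allele frequencies in two populations.
fst, h_t = fst_and_heterozygosity([0.2, 0.8])
print(f"F_ST = {fst:.2f}, H_T = {h_t:.2f}")
```

    Plotting such (H_T, F_ST) pairs for every locus gives the kind of scatter in which outlier loci can be inspected as candidates for selection.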

  10. Atlantic salmon populations reveal adaptive divergence of immune related genes - a duplicated genome under selection.

    PubMed

    Kjærner-Semb, Erik; Ayllon, Fernando; Furmanek, Tomasz; Wennevik, Vidar; Dahle, Geir; Niemelä, Eero; Ozerov, Mikhail; Vähä, Juha-Pekka; Glover, Kevin A; Rubin, Carl J; Wargelius, Anna; Edvardsen, Rolf B

    2016-08-11

    Populations of Atlantic salmon display highly significant genetic differences with unresolved molecular basis. These differences may result from separate postglacial colonization patterns, diversifying natural selection and adaptation, or a combination. Adaptation could be influenced or even facilitated by the recent whole genome duplication in the salmonid lineage which resulted in a partly tetraploid species with duplicated genes and regions. In order to elucidate the genes and genomic regions underlying the genetic differences, we conducted a genome wide association study using whole genome resequencing data from eight populations from Northern and Southern Norway. From a total of ~4.5 million sequencing-derived SNPs, more than 10% showed significant differentiation between populations from these two regions and ten selective sweeps on chromosomes 5, 10, 11, 13-15, 21, 24 and 25 were identified. These comprised 59 genes, of which 15 had one or more differentiated missense mutations. Our analysis showed that most sweeps have paralogous regions in the partially tetraploid genome, each lacking the high number of significant SNPs found in the sweeps. The most significant sweep was found on Chr 25 and carried several missense mutations in the antiviral mx genes, suggesting that these populations have experienced differing viral pressures. Interestingly, the second most significant sweep, found on Chr 5, contains two genes involved in the NF-κB pathway (nkap and nkrf), which is also a known pathogen target that controls a large number of processes in animals. Our results show that natural selection acting on immune related genes has contributed to genetic divergence between salmon populations in Norway. The differences between populations may have been facilitated by the plasticity of the salmon genome. The observed signatures of selection in duplicated genomic regions suggest that the recently duplicated genome has provided raw material for evolutionary adaptation.

  11. Going where traditional markers have not gone before: utility of and promise for RAD sequencing in marine invertebrate phylogeography and population genomics

    PubMed Central

    Reitzel, A.M.; Herrera, S.; Layden, M.J.; Martindale, M.Q.; Shank, T.M.

    2013-01-01

    Characterization of large numbers of single nucleotide polymorphisms (SNPs) throughout a genome has the power to refine the understanding of population demographic history and to identify genomic regions under selection in natural populations. To this end, population genomic approaches that harness the power of next-generation sequencing to understand the ecology and evolution of marine invertebrates represent a boon to test long-standing questions in marine biology and conservation. We employed restriction-site-associated DNA sequencing (RAD-seq) to identify SNPs in natural populations of the sea anemone Nematostella vectensis, an emerging cnidarian model with a broad geographic range in estuarine habitats in North and South America, and portions of England. We identified hundreds of SNP-containing tags in thousands of RAD loci from 30 barcoded individuals inhabiting four locations from Nova Scotia to South Carolina. Population genomic analyses using high-confidence SNPs resulted in a highly-resolved phylogeography, a result not achieved in previous studies using traditional markers. Plots of locus-specific FST against heterozygosity suggest that a majority of polymorphic sites are neutral, with a smaller proportion suggesting evidence for balancing selection. Loci inferred to be under balancing selection were mapped to the genome, where 90% were located in gene bodies, indicating potential targets of selection. Results from analyses with and without a reference genome supported similar conclusions, further supporting RAD-seq as a method that can be efficiently applied to species lacking existing genomic resources. We discuss the utility of RAD-seq approaches in burgeoning Nematostella research as well as in other cnidarian species, particularly corals, to determine phylogeographic relationships of populations and identify regions of the genome undergoing selection. PMID:23473066

  12. The role of parasite-driven selection in shaping landscape genomic structure in red grouse (Lagopus lagopus scotica).

    PubMed

    Wenzel, Marius A; Douglas, Alex; James, Marianne C; Redpath, Steve M; Piertney, Stuart B

    2016-01-01

    Landscape genomics promises to provide novel insights into how neutral and adaptive processes shape genome-wide variation within and among populations. However, there has been little emphasis on examining whether individual-based phenotype-genotype relationships derived from approaches such as genome-wide association (GWAS) manifest themselves as a population-level signature of selection in a landscape context. The two may prove irreconcilable as individual-level patterns become diluted by high levels of gene flow and complex phenotypic or environmental heterogeneity. We illustrate this issue with a case study that examines the role of the highly prevalent gastrointestinal nematode Trichostrongylus tenuis in shaping genomic signatures of selection in red grouse (Lagopus lagopus scotica). Individual-level GWAS involving 384 SNPs has previously identified five SNPs that explain variation in T. tenuis burden. Here, we examine whether these same SNPs display population-level relationships between T. tenuis burden and genetic structure across a small-scale landscape of 21 sites with heterogeneous parasite pressure. Moreover, we identify adaptive SNPs showing signatures of directional selection using F(ST) outlier analysis and relate population- and individual-level patterns of multilocus neutral and adaptive genetic structure to T. tenuis burden. The five candidate SNPs for parasite-driven selection were neither associated with T. tenuis burden on a population level, nor under directional selection. Similarly, there was no evidence of parasite-driven selection in SNPs identified as candidates for directional selection. We discuss these results in the context of red grouse ecology and highlight the broader consequences for the utility of landscape genomics approaches for identifying signatures of selection. © 2015 John Wiley & Sons Ltd.

  13. Recent advances in understanding the role of nutrition in human genome evolution.

    PubMed

    Ye, Kaixiong; Gu, Zhenglong

    2011-11-01

    Dietary transitions in human history have been suggested to play important roles in the evolution of mankind. Genetic variations caused by adaptation to diet during human evolution could have important health consequences in current society. The advance of sequencing technologies and the rapid accumulation of genome information provide an unprecedented opportunity to comprehensively characterize genetic variations in human populations and unravel the genetic basis of human evolution. Series of selection detection methods, based on various theoretical models and exploiting different aspects of selection signatures, have been developed. Their applications at the species and population levels have respectively led to the identification of human specific selection events that distinguish human from nonhuman primates and local adaptation events that contribute to human diversity. Scrutiny of candidate genes has revealed paradigms of adaptations to specific nutritional components and genome-wide selection scans have verified the prevalence of diet-related selection events and provided many more candidates awaiting further investigation. Understanding the role of diet in human evolution is fundamental for the development of evidence-based, genome-informed nutritional practices in the era of personal genomics.

  14. Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data

    PubMed Central

    Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong

    2018-01-01

    Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.

  15. Experimental evidence supports a sex-specific selective sieve in mitochondrial genome evolution.

    PubMed

    Innocenti, Paolo; Morrow, Edward H; Dowling, Damian K

    2011-05-13

    Mitochondria are maternally transmitted; hence, their genome can only make a direct and adaptive response to selection through females, whereas males represent an evolutionary dead end. In theory, this creates a sex-specific selective sieve, enabling deleterious mutations to accumulate in mitochondrial genomes if they exert male-specific effects. We tested this hypothesis, expressing five mitochondrial variants alongside a standard nuclear genome in Drosophila melanogaster, and found striking sexual asymmetry in patterns of nuclear gene expression. Mitochondrial polymorphism had few effects on nuclear gene expression in females but major effects in males, modifying nearly 10% of transcripts. These were mostly male-biased in expression, with enrichment hotspots in the testes and accessory glands. Our results suggest an evolutionary mechanism that results in mitochondrial genomes harboring male-specific mutation loads.

  16. The relationship between runs of homozygosity and inbreeding in Jersey cattle under selection

    USDA-ARS's Scientific Manuscript database

    Inbreeding is often an inevitable outcome of strong directional artificial selection but it reduces fitness in a population with increased frequency of recessive deleterious alleles. Runs of homozygosity (ROH) representing genomic autozygosity that occur from mating between selected and genomically ...

  17. A privacy-preserving solution for compressed storage and selective retrieval of genomic data.

    PubMed

    Huang, Zhicong; Ayday, Erman; Lin, Huang; Aiyar, Raeka S; Molyneaux, Adam; Xu, Zhenyu; Fellay, Jacques; Steinmetz, Lars M; Hubaux, Jean-Pierre

    2016-12-01

    In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients' complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data. © 2016 Huang et al.; Published by Cold Spring Harbor Laboratory Press.
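
    The storage figures quoted above can be put on a common scale with a short Python calculation. The 18% and 34% savings relative to BAM come from the abstract; the 100 GB BAM size and the function name are hypothetical and serve only as a worked example.

```python
def size_after_saving(bam_size_gb: float, fraction_saved: float) -> float:
    """Size of a file that uses `fraction_saved` less storage than the BAM."""
    return bam_size_gb * (1.0 - fraction_saved)


BAM_GB = 100.0                               # hypothetical BAM size
secram_gb = size_after_saving(BAM_GB, 0.18)  # SECRAM: 18% less than BAM
cram_gb = size_after_saving(BAM_GB, 0.34)    # CRAM: 34% less than BAM

print(f"SECRAM: {secram_gb:.0f} GB, CRAM: {cram_gb:.0f} GB")
print(f"SECRAM / CRAM size ratio: {secram_gb / cram_gb:.2f}")
```

    Under these figures, a SECRAM file is smaller than the BAM but roughly a quarter larger than the corresponding unencrypted CRAM file, which is the price paid for encryption and selective retrieval.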

  18. A privacy-preserving solution for compressed storage and selective retrieval of genomic data

    PubMed Central

    Huang, Zhicong; Ayday, Erman; Lin, Huang; Aiyar, Raeka S.; Molyneaux, Adam; Xu, Zhenyu; Hubaux, Jean-Pierre

    2016-01-01

    In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients’ complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data. PMID:27789525

  19. Three chromosomal rearrangements promote genomic divergence between migratory and stationary ecotypes of Atlantic cod.

    PubMed

    Berg, Paul R; Star, Bastiaan; Pampoulie, Christophe; Sodeland, Marte; Barth, Julia M I; Knutsen, Halvor; Jakobsen, Kjetill S; Jentoft, Sissel

    2016-03-17

    Identification of genome-wide patterns of divergence provides insight on how genomes are influenced by selection and can reveal the potential for local adaptation in spatially structured populations. In Atlantic cod - historically a major marine resource - Northeast-Arctic- and Norwegian coastal cod are recognized by fundamental differences in migratory and non-migratory behavior, respectively. However, the genomic architecture underlying such behavioral ecotypes is unclear. Here, we have analyzed more than 8,000 polymorphic SNPs distributed throughout all 23 linkage groups and show that loci putatively under selection are localized within three distinct genomic regions, each several megabases long, covering approximately 4% of the Atlantic cod genome. These regions likely represent genomic inversions. The frequencies of these distinct regions differ markedly between the ecotypes, which spawn in the vicinity of each other, in contrast to the low level of divergence in the rest of the genome. The observed patterns strongly suggest that these chromosomal rearrangements are instrumental in local adaptation and separation of Atlantic cod populations, leaving footprints of large genomic regions under selection. Our findings demonstrate the power of using genomic information in further understanding the population dynamics and defining management units in one of the world's most economically important marine resources.

  20. Inferring Selective Constraint from Population Genomic Data Suggests Recent Regulatory Turnover in the Human Brain

    PubMed Central

    Schrider, Daniel R.; Kern, Andrew D.

    2015-01-01

    The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. While that is so, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods. PMID:26590212

  1. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer)

    PubMed Central

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili

    2017-01-01

    Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we present a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. PMID:28922794

  2. Genomic selection in sugar beet breeding populations

    PubMed Central

    2013-01-01

    Background: Genomic selection exploits dense genome-wide marker data to predict breeding values. In this study we used a large sugar beet population of 924 lines representing different germplasm types present in breeding populations: unselected segregating families and diverse lines from more advanced stages of selection. All lines have been intensively phenotyped in multi-location field trials for six agronomically important traits and genotyped with 677 SNP markers. Results: We used ridge regression best linear unbiased prediction in combination with fivefold cross-validation and obtained high prediction accuracies for all except one trait. In addition, we investigated whether a calibration developed based on a training population composed of diverse lines is suited to predict the phenotypic performance within families. Our results show that the prediction accuracy is lower than that obtained within the diverse set of lines, but comparable to that obtained by cross-validation within the respective families. Conclusions: The results presented in this study suggest that a training population derived from intensively phenotyped and genotyped diverse lines from a breeding program does hold potential to build up robust calibration models for genomic selection. Taken together, our results indicate that genomic selection is a valuable tool and can thus complement the genomics toolbox in sugar beet breeding. PMID:24047500
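
    The evaluation scheme named in this abstract, ridge regression combined with fivefold cross-validation, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the marker coding (0/1/2), the shrinkage parameter `lam`, the data sizes, and the use of the Pearson correlation between predicted and observed phenotypes as "prediction accuracy" are all assumptions made for the example.

```python
import random


def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def ridge_fit(X, y, lam):
    """Return (marker effects, marker means, phenotype mean) from centered data."""
    n, p = len(X), len(X[0])
    mu = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Z = [[row[j] - means[j] for j in range(p)] for row in X]
    # Normal equations with ridge shrinkage: (Z'Z + lam * I) beta = Z'(y - mu)
    ZtZ = [[sum(z[j] * z[k] for z in Z) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    Zty = [sum(z[j] * (yi - mu) for z, yi in zip(Z, y)) for j in range(p)]
    return solve(ZtZ, Zty), means, mu


def ridge_predict(model, X):
    beta, means, mu = model
    return [mu + sum(b * (x - m) for b, x, m in zip(beta, row, means)) for row in X]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def cross_validate(X, y, k=5, lam=1.0, seed=0):
    """Return one accuracy (predicted vs. observed correlation) per fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    accs = []
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        model = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        pred = ridge_predict(model, [X[i] for i in test])
        accs.append(pearson(pred, [y[i] for i in test]))
    return accs


# Synthetic example data only; nothing here comes from the sugar beet study.
rng = random.Random(1)
n_lines, n_markers = 60, 8
X = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_lines)]
effects = [rng.gauss(0, 1) for _ in range(n_markers)]
y = [sum(e * x for e, x in zip(effects, row)) + rng.gauss(0, 0.5) for row in X]

accuracies = cross_validate(X, y)
print([round(a, 2) for a in accuracies])
```

    Each line is held out exactly once, so the five correlations together describe how well effects estimated on four fifths of the population predict the remaining fifth.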

  3. Genomic data for 78 chickens from 14 populations

    PubMed Central

    Li, Diyan; Che, Tiandong; Chen, Binlong; Tian, Shilin; Zhou, Xuming; Zhang, Guolong; Li, Miao; Gaur, Uma; Li, Yan; Luo, Majing; Zhang, Long; Xu, Zhongxian; Zhao, Xiaoling; Yin, Huadong; Wang, Yan; Jin, Long; Tang, Qianzi; Xu, Huailiang; Yang, Mingyao; Zhou, Rongjia; Li, Ruiqiang

    2017-01-01

    Background: Since the domestication of the red jungle fowls (Gallus gallus; dating back to ∼10 000 B.P.) in Asia, domestic chickens (Gallus gallus domesticus) have been subjected to the combined effects of natural selection and human-driven artificial selection; this has resulted in marked phenotypic diversity in a number of traits, including behavior, body composition, egg production, and skin color. Population genomic variations through diversifying selection have not been fully investigated. Findings: The whole genomes of 78 domestic chickens were sequenced to an average of 18-fold coverage for each bird. By combining this data with publicly available genomes of five wild red jungle fowls and eight Xishuangbanna game fowls, we conducted a comprehensive comparative genomics analysis of 91 chickens from 17 populations. After aligning ∼21.30 gigabases (Gb) of high-quality data from each individual to the reference chicken genome, we identified ∼6.44 million (M) single nucleotide polymorphisms (SNPs) for each population. These SNPs included 1.10 M novel SNPs in 17 populations that were absent in the current chicken dbSNP (Build 145) entries. Conclusions: The current data is important for population genetics and further studies in chickens and will serve as a valuable resource for investigating diversifying selection and candidate genes for selective breeding in chickens. PMID:28431039

  4. Healthcare provider education to support integration of pharmacogenomics in practice: the eMERGE Network experience

    PubMed Central

    Rohrer Vitek, Carolyn R; Abul-Husn, Noura S; Connolly, John J; Hartzler, Andrea L; Kitchner, Terrie; Peterson, Josh F; Rasmussen, Luke V; Smith, Maureen E; Stallings, Sarah; Williams, Marc S; Wolf, Wendy A; Prows, Cynthia A

    2017-01-01

    Ten organizations within the Electronic Medical Records and Genomics Network developed programs to implement pharmacogenomic sequencing and clinical decision support into clinical settings. Recognizing the importance of informed prescribers, a variety of strategies were used to incorporate provider education to support implementation. Education experiences with pharmacogenomics are described within the context of each organization's prior involvement, including the scope and scale of implementation specific to their Electronic Medical Records and Genomics projects. We describe common and distinct education strategies, provide exemplars and share challenges. Lessons learned inform future perspectives. Future pharmacogenomics clinical implementation initiatives need to include funding toward implementing provider education and evaluating outcomes. PMID:28639489

  5. Practical Approaches for Detecting Selection in Microbial Genomes.

    PubMed

    Hedge, Jessica; Wilson, Daniel J

    2016-02-01

    Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

  6. A Genomic Selection Index Applied to Simulated and Real Data

    PubMed Central

    Ceron-Rojas, J. Jesus; Crossa, José; Arief, Vivi N.; Basford, Kaye; Rutkoski, Jessica; Jarquín, Diego; Alvarado, Gregorio; Beyene, Yoseph; Semagn, Kassa; DeLacy, Ian

    2015-01-01

    A genomic selection index (GSI) is a linear combination of genomic estimated breeding values that uses genomic markers to predict the net genetic merit and select parents from a nonphenotyped testing population. Some authors have proposed a GSI; however, they have not used simulated or real data to validate the GSI theory and have not explained how to estimate the GSI selection response and the GSI expected genetic gain per selection cycle for the unobserved traits after the first selection cycle to obtain information about the genetic gains in each subsequent selection cycle. In this paper, we develop the theory of a GSI and apply it to two simulated and four real data sets with four traits. Also, we numerically compare its efficiency with that of the phenotypic selection index (PSI) by using the ratio of the GSI response over the PSI response, and the PSI and GSI expected genetic gain per selection cycle for observed and unobserved traits, respectively. In addition, we used the Technow inequality to compare GSI vs. PSI efficiency. Results from the simulated data were confirmed by the real data, indicating that GSI was more efficient than PSI per unit of time. PMID:26290571
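
    The abstract defines a GSI as a linear combination of genomic estimated breeding values (GEBVs). A minimal sketch of that definition is given below; the candidate names, GEBV values, and the trait weights (treated here as economic weights) are hypothetical and chosen only to make the ranking step concrete.

```python
# Hypothetical GEBVs for three candidates and two traits.
GEBV = {
    "line_A": [1.2, -0.3],
    "line_B": [0.4, 0.9],
    "line_C": [-0.5, 0.1],
}
WEIGHTS = [1.0, 2.0]  # hypothetical per-trait weights


def gsi(gebvs, weights):
    """Index value: weighted sum of one candidate's GEBVs."""
    return sum(w * g for w, g in zip(weights, gebvs))


def rank_candidates(gebv_table, weights):
    """Return candidate names sorted by index value (best first) and the scores."""
    scores = {name: gsi(vals, weights) for name, vals in gebv_table.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


ranking, scores = rank_candidates(GEBV, WEIGHTS)
print(ranking)
```

    Because the index needs only GEBVs, candidates can be ranked without phenotyping them, which is what allows selection in a nonphenotyped testing population.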

  7. A genomic scan for selection reveals candidates for genes involved in the evolution of cultivated sunflower (Helianthus annuus).

    PubMed

    Chapman, Mark A; Pashley, Catherine H; Wenzler, Jessica; Hvala, John; Tang, Shunxue; Knapp, Steven J; Burke, John M

    2008-11-01

    Genomic scans for selection are a useful tool for identifying genes underlying phenotypic transitions. In this article, we describe the results of a genome scan designed to identify candidates for genes targeted by selection during the evolution of cultivated sunflower. This work involved screening 492 loci derived from ESTs on a large panel of wild, primitive (i.e., landrace), and improved sunflower (Helianthus annuus) lines. This sampling strategy allowed us to identify candidates for selectively important genes and investigate the likely timing of selection. Thirty-six genes showed evidence of selection during either domestication or improvement based on multiple criteria, and a sequence-based test of selection on a subset of these loci confirmed this result. In view of what is known about the structure of linkage disequilibrium across the sunflower genome, these genes are themselves likely to have been targeted by selection, rather than being merely linked to the actual targets. While the selection candidates showed a broad range of putative functions, they were enriched for genes involved in amino acid synthesis and protein catabolism. Given that a similar pattern has been detected in maize (Zea mays), this finding suggests that selection on amino acid composition may be a general feature of the evolution of crop plants. In terms of genomic locations, the selection candidates were significantly clustered near quantitative trait loci (QTL) that contribute to phenotypic differences between wild and cultivated sunflower, and specific instances of QTL colocalization provide some clues as to the roles that these genes may have played during sunflower evolution.

  8. Cancer Genomic Resources and Present Needs in the Latin American Region.

    PubMed

    Torres, Ángela; Oliver, Javier; Frecha, Cecilia; Montealegre, Ana Lorena; Quezada-Urbán, Rosalía; Díaz-Velásquez, Clara Estela; Vaca-Paniagua, Felipe; Perdomo, Sandra

    2017-01-01

    In Latin America (LA), cancer is the second leading cause of death, and little is known about the capacities and needs for the development of research in the field of cancer genomics. In order to evaluate the current capacity for and development of cancer genomics in LA, we collected the available information on genomics, including the number of next-generation sequencing (NGS) platforms, the number of cancer research institutions and research groups, publications in the last 10 years, educational programs, and related national cancer control policies. Currently, there are 221 NGS platforms and 118 research groups in LA developing cancer genomics projects. A total of 272 articles in the field of cancer genetics/genomics were published by authors affiliated with Latin American institutions. Educational programs in genomics are scarce and almost exclusively at the graduate level, and only a few concern cancer. Only 14 countries have national cancer control plans, but all of them consider secondary prevention strategies for early diagnosis, opportune treatment, and decreasing mortality, where genomic analyses could be implemented. Despite recent advances in introducing knowledge about cancer genomics and its application to LA, the region lacks development of integrated genomic research projects, improved use of NGS platforms, implementation of associated educational programs, and health policies that could have an impact on cancer care. © 2017 S. Karger AG, Basel.

  9. Prostate cancer molecular profiling: the Achilles heel for the implementation of precision medicine.

    PubMed

    Oliveira-Barros, Eliane Gouvêa; Nicolau-Neto, Pedro; Da Costa, Nathalia Meireles; Pinto, Luís Felipe Ribeiro; Palumbo, Antonio; Nasciutti, Luiz Eurico

    2017-11-01

    Cancer has been mainly treated by traditional therapeutic approaches which do not consider the human genetic diversity and present limitations, probably as a consequence of a poor knowledge of both patient's genetic background and tumor biology. Due to genome project conclusion and large-scale gene analyses emergence, the therapeutic management of several prevalent and aggressive tumors has dramatically improved and represents the closest examples of a precision medicine intervention in this field. Nonetheless, prostate cancer (PCa) remains as a challenge to personalized medicine implementation, probably due to its notorious heterogeneous molecular profile. Cancer treatment personalized approaches rely on the premise that a well-defined panorama of tumor molecular alterations can help selecting new and specific therapeutic targets for its treatment and potentially discriminate tumors which behave differentially. Lately, molecular and genetic studies have been investigating PCa basis, revealing multiple recurrent genomic alterations that include mutations, DNA copy-number variations, rearrangements, and gene fusions, among others. In addition to the increment on PCa molecular biology knowledge, mapping the molecular alterations pattern of this neoplasia, especially the differences existent between tumors displaying distinct behaviors, could represent a great improvement concerning the identification of new targets, personalized medicine, and patients' management and prognosis. © 2017 International Federation for Cell Biology.

  10. Potential contribution of genomics and biotechnology in animal production

    USDA-ARS's Scientific Manuscript database

    The overall objective of the book chapter is to define the potential contribution of genomics in livestock production in Latin American countries. A brief description on what is genomics, genome-wide association studies (GWAS), and genomic selection (GS) is provided. Genomics has been rapidly adopte...

  11. Comparing genomic selection and marker-assisted selection for Fusarium head blight resistance in wheat (Triticum aestivum L.)

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) and marker-assisted selection (MAS) rely on marker-trait associations and are both routinely used for breeding purposes. Although similar, these two approaches differ in their applications and how markers are used to estimate breeding values. In this study, GS and MAS were com...

  12. Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient.

    PubMed

    Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I

    2017-01-01

    This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals ($A_{22}^{-1}$) by a vector ($q$). The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix ($A^{-1}$) including genotyped animals and their ancestors. The elements of $A^{-1}$ were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. Implementation of $A_{22}^{-1}q$ was by a series of sparse matrix-vector multiplications. Diagonal elements of $A_{22}^{-1}$, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation of $A_{22}^{-1}q$ was compared with explicit inversion of $A_{22}$ with 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 s, 3 min, and 5 min, respectively, for setting up. Only <1 s was required for the multiplication in each PCG iteration for any data sets. When the equations in ssGBLUP are solved with the PCG algorithm, $A_{22}^{-1}$ is no longer a limiting factor in the computations.
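
    The decomposition described in this abstract can be illustrated on a toy pedigree. Writing the relationship matrix for genotyped animals as A22 (notation assumed here) and partitioning the inverse of the full relationship matrix into ancestor (1) and genotyped (2) blocks, the standard block-inverse identity gives A22^{-1} q = A^{22} q - A^{21} (A^{11})^{-1} A^{12} q, where the superscripted blocks are taken from the sparse inverse. The sketch below checks that identity with exact fractions; the five-animal pedigree, the choice of genotyped animals, and all function names are hypothetical, the rules used for the inverse ignore inbreeding, and a dense solver stands in for the sparse operations of the paper.

```python
from fractions import Fraction as F

# Hypothetical pedigree: index = animal, value = (sire, dam); None = unknown parent.
PED = [(None, None), (None, None), (0, 1), (0, 1), (2, None)]
GENO = [2, 3, 4]  # genotyped animals (assumed)
ANC = [0, 1]      # ungenotyped ancestors


def a_matrix(ped):
    """Numerator relationship matrix by the tabular method (parents listed first)."""
    n = len(ped)
    A = [[F(0)] * n for _ in range(n)]
    for i, (s, d) in enumerate(ped):
        for j in range(i):
            val = sum((A[j][p] for p in (s, d) if p is not None), F(0)) / 2
            A[i][j] = A[j][i] = val
        both = s is not None and d is not None
        A[i][i] = F(1) + (A[s][d] / 2 if both else F(0))
    return A


def a_inverse(ped):
    """Inverse of A built directly from the pedigree (no inbred parents assumed)."""
    n = len(ped)
    Ai = [[F(0)] * n for _ in range(n)]
    for i, parents in enumerate(ped):
        known = [p for p in parents if p is not None]
        alpha = {0: F(1), 1: F(4, 3), 2: F(2)}[len(known)]
        Ai[i][i] += alpha
        for p in known:
            Ai[i][p] -= alpha / 2
            Ai[p][i] -= alpha / 2
            for r in known:
                Ai[p][r] += alpha / 4
    return Ai


def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def sub(M, rows, cols):
    return [[M[r][c] for c in cols] for r in rows]


def matvec(M, v):
    return [sum((m * x for m, x in zip(row, v)), F(0)) for row in M]


def a22_inv_times(Ai, q):
    """A22^{-1} q from blocks of A^{-1}, without ever forming A22."""
    t = matvec(sub(Ai, ANC, GENO), q)   # A^12 q
    t = solve(sub(Ai, ANC, ANC), t)     # (A^11)^{-1} A^12 q
    t = matvec(sub(Ai, GENO, ANC), t)   # A^21 (A^11)^{-1} A^12 q
    return [x - y for x, y in zip(matvec(sub(Ai, GENO, GENO), q), t)]


q = [F(1), F(2), F(3)]
direct = solve(sub(a_matrix(PED), GENO, GENO), q)  # uses the dense A22 explicitly
blockwise = a22_inv_times(a_inverse(PED), q)       # uses only blocks of A^{-1}
print([str(v) for v in blockwise], direct == blockwise)
```

    The blockwise route touches only matrix-vector products and one solve with the ancestor block, which is why it scales to pedigrees where forming and inverting the dense genotyped block would not fit in memory.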

  13. Accuracy and training population design for genomic selection in elite north american oats

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) is a method to estimate the breeding values of individuals by using markers throughout the genome. We evaluated the accuracies of GS using data from five traits on 446 oat lines genotyped with 1005 Diversity Array Technology (DArT) markers and two GS methods (RR-BLUP and Bayes...

  14. Navigating the Interface Between Landscape Genetics and Landscape Genomics.

    PubMed

    Storfer, Andrew; Patton, Austin; Fraik, Alexandra K

    2018-01-01

    As next-generation sequencing data become increasingly available for non-model organisms, a shift has occurred in the focus of studies of the geographic distribution of genetic variation. Whereas landscape genetics studies primarily focus on testing the effects of landscape variables on gene flow and genetic population structure, landscape genomics studies focus on detecting candidate genes under selection that indicate possible local adaptation. Navigating the transition between landscape genomics and landscape genetics can be challenging. The number of molecular markers analyzed has shifted from what used to be a few dozen loci to thousands of loci and even full genomes. Although genome scale data can be separated into sets of neutral loci for analyses of gene flow and population structure and putative loci under selection for inference of local adaptation, there are inherent differences in the questions that are addressed in the two study frameworks. We discuss these differences and their implications for study design, marker choice and downstream analysis methods. Similar to the rapid proliferation of analysis methods in the early development of landscape genetics, new analytical methods for detection of selection in landscape genomics studies are burgeoning. We focus on genome scan methods for detection of selection, and in particular, outlier differentiation methods and genetic-environment association tests because they are the most widely used. Use of genome scan methods requires an understanding of the potential mismatches between the biology of a species and assumptions inherent in analytical methods used, which can lead to high false positive rates of detected loci under selection. Key to choosing appropriate genome scan methods is an understanding of the underlying demographic structure of study populations, and such data can be obtained using neutral loci from the generated genome-wide data or prior knowledge of a species' phylogeographic history. 
    To this end, we summarize recent simulation studies that test the power and accuracy of genome scan methods under a variety of demographic scenarios and sampling designs. We conclude with a discussion of additional considerations for future method development, and a summary of methods that show promise for landscape genomics studies but are not yet widely used.

  15. Navigating the Interface Between Landscape Genetics and Landscape Genomics

    PubMed Central

    Storfer, Andrew; Patton, Austin; Fraik, Alexandra K.

    2018-01-01

    As next-generation sequencing data become increasingly available for non-model organisms, a shift has occurred in the focus of studies of the geographic distribution of genetic variation. Whereas landscape genetics studies primarily focus on testing the effects of landscape variables on gene flow and genetic population structure, landscape genomics studies focus on detecting candidate genes under selection that indicate possible local adaptation. Navigating the transition between landscape genomics and landscape genetics can be challenging. The number of molecular markers analyzed has shifted from what used to be a few dozen loci to thousands of loci and even full genomes. Although genome scale data can be separated into sets of neutral loci for analyses of gene flow and population structure and putative loci under selection for inference of local adaptation, there are inherent differences in the questions that are addressed in the two study frameworks. We discuss these differences and their implications for study design, marker choice and downstream analysis methods. Similar to the rapid proliferation of analysis methods in the early development of landscape genetics, new analytical methods for detection of selection in landscape genomics studies are burgeoning. We focus on genome scan methods for detection of selection, and in particular, outlier differentiation methods and genetic-environment association tests because they are the most widely used. Use of genome scan methods requires an understanding of the potential mismatches between the biology of a species and assumptions inherent in analytical methods used, which can lead to high false positive rates of detected loci under selection. Key to choosing appropriate genome scan methods is an understanding of the underlying demographic structure of study populations, and such data can be obtained using neutral loci from the generated genome-wide data or prior knowledge of a species' phylogeographic history. 
    To this end, we summarize recent simulation studies that test the power and accuracy of genome scan methods under a variety of demographic scenarios and sampling designs. We conclude with a discussion of additional considerations for future method development, and a summary of methods that show promise for landscape genomics studies but are not yet widely used. PMID:29593776

  16. The IGNITE network: a model for genomic medicine implementation and research.

    PubMed

    Weitzel, Kristin Wiisanen; Alexander, Madeline; Bernhardt, Barbara A; Calman, Neil; Carey, David J; Cavallari, Larisa H; Field, Julie R; Hauser, Diane; Junkins, Heather A; Levin, Phillip A; Levy, Kenneth; Madden, Ebony B; Manolio, Teri A; Odgis, Jacqueline; Orlando, Lori A; Pyeritz, Reed; Wu, R Ryanne; Shuldiner, Alan R; Bottinger, Erwin P; Denny, Joshua C; Dexter, Paul R; Flockhart, David A; Horowitz, Carol R; Johnson, Julie A; Kimmel, Stephen E; Levy, Mia A; Pollin, Toni I; Ginsburg, Geoffrey S

    2016-01-05

    Patients, clinicians, researchers and payers are seeking to understand the value of using genomic information (as reflected by genotyping, sequencing, family history or other data) to inform clinical decision-making. However, challenges exist to widespread clinical implementation of genomic medicine, a prerequisite for developing evidence of its real-world utility. To address these challenges, the National Institutes of Health-funded IGNITE (Implementing GeNomics In pracTicE; www.ignite-genomics.org ) Network, comprised of six projects and a coordinating center, was established in 2013 to support the development, investigation and dissemination of genomic medicine practice models that seamlessly integrate genomic data into the electronic health record and that deploy tools for point of care decision making. IGNITE site projects are aligned in their purpose of testing these models, but individual projects vary in scope and design, including exploring genetic markers for disease risk prediction and prevention, developing tools for using family history data, incorporating pharmacogenomic data into clinical care, refining disease diagnosis using sequence-based mutation discovery, and creating novel educational approaches. This paper describes the IGNITE Network and member projects, including network structure, collaborative initiatives, clinical decision support strategies, methods for return of genomic test results, and educational initiatives for patients and providers. Clinical and outcomes data from individual sites and network-wide projects are anticipated to begin being published over the next few years. 
    The IGNITE Network is an innovative series of projects and pilot demonstrations aiming to enhance translation of validated actionable genomic information into clinical settings and develop and use measures of outcome in response to genome-based clinical interventions using a pragmatic framework to provide early data and proofs of concept on the utility of these interventions. Through these efforts and collaboration with other stakeholders, IGNITE is poised to have a significant impact on the acceleration of genomic information into medical practice.

  17. Genomic selection in forage breeding: designing an estimation population

    USDA-ARS's Scientific Manuscript database

    The benefits of genomic selection to livestock, crops and forest tree breeding can be extended to forage grasses and legumes. The main benefits expected are increased selection accuracy and reduced costs per unit of genotype evaluated and breeding cycle length. Aiming at designing a training populat...

  18. Substitution rate and natural selection in parvovirus B19

    PubMed Central

    Stamenković, Gorana G.; Ćirković, Valentina S.; Šiljić, Marina M.; Blagojević, Jelena V.; Knežević, Aleksandra M.; Joksić, Ivana D.; Stanojević, Maja P.

    2016-01-01

    The aim of this study was to estimate substitution rate and imprints of natural selection on parvovirus B19 genotype 1. Studied datasets included 137 near complete coding B19 genomes (positions 665 to 4851) for phylogenetic and substitution rate analysis and 146 and 214 partial genomes for selection analyses in open reading frames ORF1 and ORF2, respectively, collected 1973–2012 and including 9 newly sequenced isolates from Serbia. Phylogenetic clustering assigned the majority of studied isolates to G1A. Nucleotide substitution rate for total coding DNA was 1.03 (0.6–1.27) × 10^-4 substitutions/site/year, with higher values for analyzed genome partitions. In spite of the highest evolutionary rate, VP2 codons were found to be under purifying selection with rare episodic positive selection, whereas codons under diversifying selection were found in the unique part of VP1, known to contain B19 immune epitopes important in persistent infection. Analyses of overlapping gene regions identified nucleotide positions under opposite selective pressure in different ORFs, suggesting complex evolutionary mechanisms of nucleotide changes in B19 viral genomes. PMID:27775080

  19. Selection of core animals in the Algorithm for Proven and Young using a simulation model.

    PubMed

    Bradford, H L; Pocrnić, I; Fragomeni, B O; Lourenco, D A L; Misztal, I

    2017-12-01

    The Algorithm for Proven and Young (APY) enables the implementation of single-step genomic BLUP (ssGBLUP) in large, genotyped populations by separating genotyped animals into core and non-core subsets and creating a computationally efficient inverse for the genomic relationship matrix (G). As APY became the choice for large-scale genomic evaluations in BLUP-based methods, a common question is how to choose the animals in the core subset. We compared several core definitions to answer this question. Simulations comprised a moderately heritable trait for 95,010 animals and 50,000 genotypes for animals across five generations. Genotypes consisted of 25,500 SNP distributed across 15 chromosomes. Genotyping errors and missing pedigree were also mimicked. Core animals were defined based on individual generations, equal representation across generations, and at random. For a sufficiently large core size, core definitions had the same accuracies and biases, even if the core animals had imperfect genotypes. When genotyped animals had unknown parents, accuracy and bias were significantly better (p ≤ .05) for random and across generation core definitions. © 2017 The Authors. Journal of Animal Breeding and Genetics Published by Blackwell Verlag GmbH.
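The core/non-core split described in this abstract can be illustrated with a small NumPy sketch. It assumes the commonly cited APY form, in which only the core block of G is inverted directly and each non-core animal contributes a single diagonal term. The function name `apy_inverse` and its arguments are hypothetical and not taken from the paper or from any evaluation software.

```python
import numpy as np

def apy_inverse(G, core):
    """Approximate inverse of G that inverts only the core block (illustrative sketch)."""
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])   # the only dense inverse needed
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                                # regression of non-core on core

    # Diagonal conditional variances: m_j = g_jj - g_jc Gcc^{-1} g_cj
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m
    Ginv[np.ix_(noncore, core)] = (-P / m).T
    Ginv[noncore, noncore] = 1.0 / m                 # diagonal non-core block
    return Ginv
```

With a single non-core animal the diagonal term is the exact Schur complement, so the sketch reproduces the ordinary inverse. With more non-core animals it is an approximation whose quality depends on the choice of core, which is the question the study examines.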

  20. Technical note: an R package for fitting sparse neural networks with application in animal breeding.

    PubMed

    Wang, Yangfan; Mi, Xue; Rosa, Guilherme J M; Chen, Zhihui; Lin, Ping; Wang, Shi; Bao, Zhenmin

    2018-05-04

    Neural networks (NNs) have emerged as a new tool for genomic selection (GS) in animal breeding. However, the properties of NN used in GS for the prediction of phenotypic outcomes are not well characterized due to the problem of over-parameterization of NN and difficulties in using whole-genome marker sets as high-dimensional NN input. In this note, we have developed an R package called snnR that finds an optimal sparse structure of a NN by minimizing the square error subject to a penalty on the L1-norm of the parameters (weights and biases), therefore solving the problem of over-parameterization in NN. We have also tested some models fitted in the snnR package to demonstrate their feasibility and effectiveness to be used in several cases as examples. In comparison of snnR to the R package brnn (the Bayesian regularized single layer NNs), with both using the entries of a genotype matrix or a genomic relationship matrix as inputs, snnR has greatly improved the computational efficiency and the prediction ability for the GS in animal breeding because snnR implements a sparse NN with many hidden layers.

  1. A methodological overview on molecular preimplantation genetic diagnosis and screening: a genomic future?

    PubMed

    Vendrell, Xavier; Bautista-Llácer, Rosa

    2012-12-01

    The genetic diagnosis and screening of preimplantation embryos generated by assisted reproduction technology has been consolidated in the prenatal care framework. The rapid evolution of DNA technologies is tending to molecular approaches. Our intention is to present a detailed methodological view, showing different diagnostic strategies based on molecular techniques that are currently applied in preimplantation genetic diagnosis. The amount of DNA from one single, or a few cells, obtained by embryo biopsy is a limiting factor for the molecular analysis. In this sense, genetic laboratories have developed molecular protocols considering this restrictive condition. Nevertheless, the development of whole-genome amplification methods has allowed preimplantation genetic diagnosis for two or more indications simultaneously, like the selection of histocompatible embryos plus detection of monogenic diseases or aneuploidies. Moreover, molecular techniques have permitted preimplantation genetic screening to progress, by implementing microarray-based comparative genome hybridization. Finally, a future view of the embryo-genetics field based on molecular advances is proposed. The normalization, cost-effectiveness analysis, and new technological tools are the next topics for preimplantation genetic diagnosis and screening. Concomitantly, these additions to assisted reproduction technologies could have a positive effect on the schedules of preimplantation studies.

  2. Beliefs about genetic influences on eating behaviors: Characteristics and associations with weight management confidence.

    PubMed

    Persky, Susan; Bouhlal, Sofia; Goldring, Megan R; McBride, Colleen M

    2017-08-01

    The development of precision approaches for customized health interventions is a promising application of genomic discovery. To optimize such weight management interventions, target audiences will need to be engaged in research and implementation efforts. Investigation into approaches that engage these audiences will be required to ensure that genomic information, particularly with respect to genomic influences on endophenotypes like eating behavior, is understood and accepted, and not associated with unintended adverse outcomes. We took steps to characterize healthy individuals' beliefs about genetic influences on eating behavior. Data were collected via online survey from 261 participants selected at random from a database. Respondents infrequently spontaneously identified eating behavior-related factors as running in families. However, those who perceived themselves as overweight and perceived a family history of overweight were more likely to attribute eating behavior to genetics on closed-ended assessments, β=0.252, p=0.039. Genetic attributions for eating behaviors were associated with lower confidence in ability to control eating and weight, β=-0.119, p=0.035. These exploratory findings shed light on beliefs about genetic influences on eating, a behavioral trait (rather than a disease). This investigation can inform future health intervention efforts. Published by Elsevier Ltd.

  3. Genomic and metagenomic technologies to explore the antibiotic resistance mobilome.

    PubMed

    Martínez, José L; Coque, Teresa M; Lanza, Val F; de la Cruz, Fernando; Baquero, Fernando

    2017-01-01

    Antibiotic resistance is a relevant problem for human health that requires global approaches to establish a deep understanding of the processes of acquisition, stabilization, and spread of resistance among human bacterial pathogens. Since natural (nonclinical) ecosystems are reservoirs of resistance genes, a health-integrated study of the epidemiology of antibiotic resistance requires the exploration of such ecosystems with the aim of determining the role they may play in the selection, evolution, and spread of antibiotic resistance genes, involving the so-called resistance mobilome. High-throughput sequencing techniques allow an unprecedented opportunity to describe the genetic composition of a given microbiome without the need to subculture the organisms present inside. However, bioinformatic methods for analyzing this bulk of data, mainly with respect to binning each resistance gene with the organism hosting it, are still in their infancy. Here, we discuss how current genomic methodologies can serve to analyze the resistance mobilome and its linkage with different bacterial genomes and metagenomes. In addition, we describe the drawbacks of current methodologies for analyzing the resistance mobilome, mainly in cases of complex microbiotas, and discuss the possibility of implementing novel tools to improve our current metagenomic toolbox. © 2016 New York Academy of Sciences.

  4. Selective whole genome amplification for resequencing target microbial species from complex natural samples.

    PubMed

    Leichty, Aaron R; Brisson, Dustin

    2014-10-01

    Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which is ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
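The idea of priming with motifs that are common in the target genome but rare in the background can be sketched as a k-mer ranking. This is only an illustration under simple assumptions: exact k-mer matches on one strand, a pseudocount of 1 for the background, and hypothetical function names. It is not the authors' primer-selection procedure.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def rank_selective_motifs(target, background, k, top=5):
    """Rank k-mers by per-base frequency in the target relative to the background."""
    t = kmer_counts(target, k)
    b = kmer_counts(background, k)
    scores = {
        motif: (count / len(target)) / ((b[motif] + 1) / len(background))
        for motif, count in t.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

A real design would also need to consider both strands, primer melting temperature, and the spacing of binding sites along the target genome, none of which is modelled here.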

  5. Sex-linked genomic variation and its relationship to avian plumage dichromatism and sexual selection.

    PubMed

    Huang, Huateng; Rabosky, Daniel L

    2015-09-16

    Sexual dichromatism is the tendency for sexes to differ in color pattern and represents a striking form of within-species morphological variation. Conspicuous intersexual differences in avian plumage are generally thought to result from Darwinian sexual selection, to the extent that dichromatism is often treated as a surrogate for the intensity of sexual selection in phylogenetic comparative studies. Intense sexual selection is predicted to leave a footprint on genetic evolution by reducing the relative genetic diversity on sex chromosome to that on the autosomes. In this study, we test the association between plumage dichromatism and sex-linked genetic diversity using eight species pairs with contrasting levels of dichromatism. We estimated Z-linked and autosomal genetic diversity for these non-model avian species using restriction-site associated (RAD) loci that covered ~3 % of the genome. We find that monochromatic birds consistently have reduced sex-linked genomic variation relative to phylogenetically-paired dichromatic species and this pattern is robust to mutational biases. Our results are consistent with several interpretations. If present-day sexual selection is stronger in dichromatic birds, our results suggest that its impact on sex-linked genomic variation is offset by other processes that lead to proportionately lower Z-linked variation in monochromatic species. We discuss possible factors that may contribute to this discrepancy between phenotypes and genomic variation. Conversely, it is possible that present-day sexual selection -- as measured by the variance in male reproductive success -- is stronger in the set of monochromatic taxa we have examined, potentially reflecting the importance of song, behavior and other non-plumage associated traits as targets of sexual selection. 
    This counterintuitive finding suggests that the relationship between genomic variation and sexual selection is complex and highlights the need for a more comprehensive survey of genomic variation in avian taxa that vary markedly in social and genetic mating systems.

  6. Aquaculture genomics, genetics and breeding in the United States: current status, challenges, and priorities for future research.

    PubMed

    Abdelrahman, Hisham; ElHady, Mohamed; Alcivar-Warren, Acacia; Allen, Standish; Al-Tobasei, Rafet; Bao, Lisui; Beck, Ben; Blackburn, Harvey; Bosworth, Brian; Buchanan, John; Chappell, Jesse; Daniels, William; Dong, Sheng; Dunham, Rex; Durland, Evan; Elaswad, Ahmed; Gomez-Chiarri, Marta; Gosh, Kamal; Guo, Ximing; Hackett, Perry; Hanson, Terry; Hedgecock, Dennis; Howard, Tiffany; Holland, Leigh; Jackson, Molly; Jin, Yulin; Khalil, Karim; Kocher, Thomas; Leeds, Tim; Li, Ning; Lindsey, Lauren; Liu, Shikai; Liu, Zhanjiang; Martin, Kyle; Novriadi, Romi; Odin, Ramjie; Palti, Yniv; Peatman, Eric; Proestou, Dina; Qin, Guyu; Reading, Benjamin; Rexroad, Caird; Roberts, Steven; Salem, Mohamed; Severin, Andrew; Shi, Huitong; Shoemaker, Craig; Stiles, Sheila; Tan, Suxu; Tang, Kathy F J; Thongda, Wilawan; Tiersch, Terrence; Tomasso, Joseph; Prabowo, Wendy Tri; Vallejo, Roger; van der Steen, Hein; Vo, Khoi; Waldbieser, Geoff; Wang, Hanping; Wang, Xiaozhu; Xiang, Jianhai; Yang, Yujia; Yant, Roger; Yuan, Zihao; Zeng, Qifan; Zhou, Tao

    2017-02-20

    Advancing the production efficiency and profitability of aquaculture is dependent upon the ability to utilize a diverse array of genetic resources. The ultimate goals of aquaculture genomics, genetics and breeding research are to enhance aquaculture production efficiency, sustainability, product quality, and profitability in support of the commercial sector and for the benefit of consumers. In order to achieve these goals, it is important to understand the genomic structure and organization of aquaculture species, and their genomic and phenomic variations, as well as the genetic basis of traits and their interrelationships. In addition, it is also important to understand the mechanisms of regulation and evolutionary conservation at the levels of genome, transcriptome, proteome, epigenome, and systems biology. With genomic information and information between the genomes and phenomes, technologies for marker/causal mutation-assisted selection, genome selection, and genome editing can be developed for applications in aquaculture. A set of genomic tools and resources must be made available including reference genome sequences and their annotations (including coding and non-coding regulatory elements), genome-wide polymorphic markers, efficient genotyping platforms, high-density and high-resolution linkage maps, and transcriptome resources including non-coding transcripts. Genomic and genetic control of important performance and production traits, such as disease resistance, feed conversion efficiency, growth rate, processing yield, behaviour, reproductive characteristics, and tolerance to environmental stressors like low dissolved oxygen, high or low water temperature and salinity, must be understood. QTL need to be identified, validated across strains, lines and populations, and their mechanisms of control understood. Causal gene(s) need to be identified. 
    Genetic and epigenetic regulation of important aquaculture traits needs to be determined, and technologies for marker-assisted selection, causal gene/mutation-assisted selection, genome selection, and genome editing using CRISPR and other technologies must be developed, demonstrated to be applicable, and applied in aquaculture industries. Major progress has been made in aquaculture genomics for dozens of fish and shellfish species including the development of genetic linkage maps, physical maps, microarrays, single nucleotide polymorphism (SNP) arrays, transcriptome databases and various stages of genome reference sequences. This paper provides a general review of the current status, challenges and future research needs of aquaculture genomics, genetics, and breeding, with a focus on major aquaculture species in the United States: catfish, rainbow trout, Atlantic salmon, tilapia, striped bass, oysters, and shrimp. While the overall research priorities and the practical goals are similar across various aquaculture species, the current status in each species should dictate the next priority areas within the species. This paper is an output of the USDA Workshop for Aquaculture Genomics, Genetics, and Breeding held in late March 2016 in Auburn, Alabama, with participants from all parts of the United States.

  7. Cow genotyping strategies for genomic selection in a small dairy cattle population.

    PubMed

    Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A

    2017-01-01

    This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets including 197 de-regressed proofs of genotyped bulls, with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlation between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better compared with the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds. Copyright © 2017 American Dairy Science Association. 
    Published by Elsevier Inc. All rights reserved.

  8. Design and implementation of the cacao genome database

    USDA-ARS's Scientific Manuscript database

    The Cacao Genome Database (CGD, www.cacaogenomedb.org) is being developed to provide a comprehensive data mining resource of genomic, genetic and breeding data for Theobroma cacao. Designed using Chado and a collection of Drupal modules, known as Tripal, CGD currently contains the genetically anchor...

  9. GEAR: genomic enrichment analysis of regional DNA copy number changes.

    PubMed

    Kim, Tae-Min; Jung, Yu-Chae; Rhyu, Mun-Gan; Jung, Myeong Ho; Chung, Yeun-Jun

    2008-02-01

    We developed an algorithm named GEAR (genomic enrichment analysis of regional DNA copy number changes) for functional interpretation of genome-wide DNA copy number changes identified by array-based comparative genomic hybridization. GEAR selects two types of chromosomal alterations with potential biological relevance, i.e. recurrent and phenotype-specific alterations. Then it performs functional enrichment analysis using a priori selected functional gene sets to identify primary and clinical genomic signatures. The genomic signatures identified by GEAR represent functionally coordinated genomic changes, which can provide clues on the underlying molecular mechanisms related to the phenotypes of interest. GEAR can help the identification of key molecular functions that are activated or repressed in the tumor genomes leading to the improved understanding on the tumor biology. GEAR software is available with online manual in the website, http://www.systemsbiology.co.kr/GEAR/.

  10. Genome-wide comparative diversity uncovers multiple targets of selection for improvement in hexaploid wheat landraces and cultivars.

    PubMed

    Cavanagh, Colin R; Chao, Shiaoman; Wang, Shichen; Huang, Bevan Emma; Stephen, Stuart; Kiani, Seifollah; Forrest, Kerrie; Saintenac, Cyrille; Brown-Guedira, Gina L; Akhunova, Alina; See, Deven; Bai, Guihua; Pumphrey, Michael; Tomar, Luxmi; Wong, Debbie; Kong, Stephan; Reynolds, Matthew; da Silva, Marta Lopez; Bockelman, Harold; Talbert, Luther; Anderson, James A; Dreisigacker, Susanne; Baenziger, Stephen; Carter, Arron; Korzun, Viktor; Morrell, Peter Laurent; Dubcovsky, Jorge; Morell, Matthew K; Sorrells, Mark E; Hayden, Matthew J; Akhunov, Eduard

    2013-05-14

    Domesticated crops experience strong human-mediated selection aimed at developing high-yielding varieties adapted to local conditions. To detect regions of the wheat genome subject to selection during improvement, we developed a high-throughput array to interrogate 9,000 gene-associated single-nucleotide polymorphisms (SNP) in a worldwide sample of 2,994 accessions of hexaploid wheat including landraces and modern cultivars. Using a SNP-based diversity map we characterized the impact of crop improvement on genomic and geographic patterns of genetic diversity. We found evidence of a small population bottleneck and extensive use of ancestral variation often traceable to founders of cultivars from diverse geographic regions. Analyzing genetic differentiation among populations and the extent of haplotype sharing, we identified allelic variants subjected to selection during improvement. Selective sweeps were found around genes involved in the regulation of flowering time and phenology. An introgression of a wild relative-derived gene conferring resistance to a fungal pathogen was detected by haplotype-based analysis. Comparing selective sweeps identified in different populations, we show that selection likely acts on distinct targets or multiple functionally equivalent alleles in different portions of the geographic range of wheat. The majority of the selected alleles were present at low frequency in local populations, suggesting either weak selection pressure or temporal variation in the targets of directional selection during breeding probably associated with changing agricultural practices or environmental conditions. The developed SNP chip and map of genetic variation provide a resource for advancing wheat breeding and supporting future population genomic and genome-wide association studies in wheat.

  11. Genome-wide comparative diversity uncovers multiple targets of selection for improvement in hexaploid wheat landraces and cultivars

    PubMed Central

    Cavanagh, Colin R.; Chao, Shiaoman; Wang, Shichen; Huang, Bevan Emma; Stephen, Stuart; Kiani, Seifollah; Forrest, Kerrie; Saintenac, Cyrille; Brown-Guedira, Gina L.; Akhunova, Alina; See, Deven; Bai, Guihua; Pumphrey, Michael; Tomar, Luxmi; Wong, Debbie; Kong, Stephan; Reynolds, Matthew; da Silva, Marta Lopez; Bockelman, Harold; Talbert, Luther; Anderson, James A.; Dreisigacker, Susanne; Baenziger, Stephen; Carter, Arron; Korzun, Viktor; Morrell, Peter Laurent; Dubcovsky, Jorge; Morell, Matthew K.; Sorrells, Mark E.; Hayden, Matthew J.; Akhunov, Eduard

    2013-01-01

    Domesticated crops experience strong human-mediated selection aimed at developing high-yielding varieties adapted to local conditions. To detect regions of the wheat genome subject to selection during improvement, we developed a high-throughput array to interrogate 9,000 gene-associated single-nucleotide polymorphisms (SNP) in a worldwide sample of 2,994 accessions of hexaploid wheat including landraces and modern cultivars. Using a SNP-based diversity map we characterized the impact of crop improvement on genomic and geographic patterns of genetic diversity. We found evidence of a small population bottleneck and extensive use of ancestral variation often traceable to founders of cultivars from diverse geographic regions. Analyzing genetic differentiation among populations and the extent of haplotype sharing, we identified allelic variants subjected to selection during improvement. Selective sweeps were found around genes involved in the regulation of flowering time and phenology. An introgression of a wild relative-derived gene conferring resistance to a fungal pathogen was detected by haplotype-based analysis. Comparing selective sweeps identified in different populations, we show that selection likely acts on distinct targets or multiple functionally equivalent alleles in different portions of the geographic range of wheat. The majority of the selected alleles were present at low frequency in local populations, suggesting either weak selection pressure or temporal variation in the targets of directional selection during breeding probably associated with changing agricultural practices or environmental conditions. The developed SNP chip and map of genetic variation provide a resource for advancing wheat breeding and supporting future population genomic and genome-wide association studies in wheat. PMID:23630259

  12. Rapid cycling genomic selection in a multiparental tropical maize population

    USDA-ARS's Scientific Manuscript database

    Genomic selection (GS) increases genetic gain by reducing the length of the selection cycle, as has been exemplified in maize using rapid cycling recombination of biparental populations. However, no results of GS applied to maize multi-parental populations have been reported so far. This study is th...

  13. Genome-Wide Prediction of the Performance of Three-Way Hybrids in Barley.

    PubMed

    Li, Zuo; Philipp, Norman; Spiller, Monika; Stiewe, Gunther; Reif, Jochen C; Zhao, Yusheng

    2017-03-01

    Predicting the grain yield performance of three-way hybrids is challenging. Three-way crosses are relevant for hybrid breeding in barley (Hordeum vulgare L.) and maize (Zea mays L.) adapted to East Africa. The main goal of our study was to implement and evaluate genome-wide prediction approaches of the performance of three-way hybrids using data of single-cross hybrids for a scenario in which parental lines of the three-way hybrids originate from three genetically distinct subpopulations. We extended the ridge regression best linear unbiased prediction (RRBLUP) and devised a genomic selection model allowing for subpopulation-specific marker effects (GSA-RRBLUP: general and subpopulation-specific additive RRBLUP). Using an empirical barley data set, we showed that applying GSA-RRBLUP tripled the prediction ability of three-way hybrids from 0.095 to 0.308 compared with RRBLUP, modeling one additive effect for all three subpopulations. The experimental findings were further substantiated with computer simulations. Our results emphasize the potential of GSA-RRBLUP to improve genome-wide hybrid prediction of three-way hybrids for scenarios of genetically diverse parental populations. Because of the advantages of the GSA-RRBLUP model in dealing with hybrids from different parental populations, it may also be a promising approach to boost the prediction ability for hybrid breeding programs based on genetically diverse heterotic groups. Copyright © 2017 Crop Science Society of America.
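A rough sketch of the two ideas named in this abstract follows: ridge-regression estimation of marker effects, and a design matrix that carries subpopulation-specific marker columns alongside the general ones. It is a simplified illustration, not the authors' GSA-RRBLUP model. The function names, the fixed penalty `lam`, and the per-individual `groups` labels are all assumptions made for the example.

```python
import numpy as np

def rrblup(Z, y, lam):
    """Ridge estimate of marker effects: u = (Zc'Zc + lam*I)^-1 Zc'(y - mean(y))."""
    mu = y.mean()
    Zc = Z - Z.mean(axis=0)                  # centre marker codes
    lhs = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    u = np.linalg.solve(lhs, Zc.T @ (y - mu))
    return mu, u

def add_group_specific_columns(Z, groups):
    """Append one copy of Z per group, zeroed outside that group (hypothetical helper)."""
    labels = np.unique(groups)
    blocks = [Z] + [Z * (groups == g)[:, None] for g in labels]
    return np.hstack(blocks)
```

Fitting `rrblup` on the augmented matrix gives one shared effect per marker plus a deviation per subpopulation. In practice the penalty would be derived from variance components rather than fixed by hand.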

  14. Genome-enabled selection doubles the accuracy of predicted breeding values for bacterial cold water disease resistance compared to traditional family-based selection in rainbow trout aquaculture

    USDA-ARS's Scientific Manuscript database

    We have shown previously that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  15. Genetic signatures of natural selection in a model invasive ascidian

    PubMed Central

    Lin, Yaping; Chen, Yiyong; Yi, Changho; Fong, Jonathan J.; Kim, Won; Rius, Marc; Zhan, Aibin

    2017-01-01

    Invasive species represent promising models to study species’ responses to rapidly changing environments. Although local adaptation frequently occurs during contemporary range expansion, the associated genetic signatures at both population and genomic levels remain largely unknown. Here, we use genome-wide gene-associated microsatellites to investigate genetic signatures of natural selection in a model invasive ascidian, Ciona robusta. Population genetic analyses of 150 individuals sampled in Korea, New Zealand, South Africa and Spain showed significant genetic differentiation among populations. Based on outlier tests, we found a high incidence of signatures of directional selection at 19 loci. Hitchhiking mapping analyses identified 12 directional selective sweep regions, and all selective sweep windows on chromosomes were narrow (~8.9 kb). Further analyses identified 132 candidate genes under selection. When we compared our genetic data and six crucial environmental variables, 16 putatively selected loci showed significant correlation with these environmental variables. This suggests that the local environmental conditions have left significant signatures of selection at both population and genomic levels. Finally, we identified “plastic” genomic regions and genes that are promising regions to investigate evolutionary responses to rapid environmental change in C. robusta. PMID:28266616

  16. Genome-wide comparative analysis of codon usage bias and codon context patterns among cyanobacterial genomes.

    PubMed

    Prabha, Ratna; Singh, Dhananjaya P; Sinha, Swati; Ahmad, Khurshid; Rai, Anil

    2017-04-01

    With the increasing accumulation of genomic sequence information of prokaryotes, the study of codon usage bias has gained renewed attention. The purpose of this study was to examine codon selection pattern within and across cyanobacterial species belonging to diverse taxonomic orders and habitats. We performed detailed comparative analysis of cyanobacterial genomes with respect to codon bias. Our analysis reflects that in cyanobacterial genomes, A- and/or T-ending codons were used predominantly in the genes whereas G- and/or C-ending codons were largely avoided. Variation in the codon context usage of cyanobacterial genes corresponded to the clustering of cyanobacteria as per their GC content. Analysis of codon adaptation index (CAI) and synonymous codon usage order (SCUO) revealed that majority of genes are associated with low codon bias. Codon selection pattern in cyanobacterial genomes reflected compositional constraints as major influencing factor. It is also identified that although, mutational constraint may play some role in affecting codon usage bias in cyanobacteria, compositional constraint in terms of genomic GC composition coupled with environmental factors affected codon selection pattern in cyanobacterial genomes. Copyright © 2016 Elsevier B.V. All rights reserved.
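The A/T-ending versus G/C-ending codon preference discussed in this abstract is often summarised by the GC content at third codon positions. The sketch below computes that fraction for a single coding sequence. It assumes the sequence is in frame from its first base, and the function name is hypothetical. It does not compute CAI or SCUO.

```python
def third_position_gc(cds):
    """Fraction of complete codons in an in-frame coding sequence that end in G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3         # ignore a trailing partial codon
    thirds = cds[2:usable:3]                 # third base of each complete codon
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

A value below 0.5 indicates that A- or T-ending codons predominate in that gene, which is the pattern the abstract reports for cyanobacterial genomes.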

  17. Single-Molecule FISH Reveals Non-selective Packaging of Rift Valley Fever Virus Genome Segments

    PubMed Central

    Wichgers Schreur, Paul J.; Kortekaas, Jeroen

    2016-01-01

    The bunyavirus genome comprises a small (S), medium (M), and large (L) RNA segment of negative polarity. Although genome segmentation confers evolutionary advantages by enabling genome reassortment events with related viruses, genome segmentation also complicates genome replication and packaging. Accumulating evidence suggests that genomes of viruses with eight or more genome segments are incorporated into virions by highly selective processes. Remarkably, little is known about the genome packaging process of the tri-segmented bunyaviruses. Here, we evaluated, by single-molecule RNA fluorescence in situ hybridization (FISH), the intracellular spatio-temporal distribution and replication kinetics of the Rift Valley fever virus (RVFV) genome and determined the segment composition of mature virions. The results reveal that the RVFV genome segments start to replicate near the site of infection before spreading and replicating throughout the cytoplasm followed by translocation to the virion assembly site at the Golgi network. Although the average intracellular S, M and L genome segments approached a 1:1:1 ratio, major differences in genome segment ratios were observed among cells. We also observed a significant number of cells lacking evidence of M-segment replication. Analysis of two-segmented replicons and four-segmented viruses subsequently confirmed the previous notion that Golgi recruitment is mediated by the Gn glycoprotein. The absence of colocalization of the different segments in the cytoplasm and the successful rescue of a tri-segmented variant with a codon-shuffled M-segment suggested that inter-segment interactions are unlikely to drive the copackaging of the different segments into a single virion. The latter was confirmed by direct visualization of RNPs inside mature virions, which showed that the majority of virions lack one or more genome segments. Altogether, this study suggests that RVFV genome packaging is a non-selective process. PMID:27548280

  18. On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types

    PubMed Central

    2014-01-01

    Background. Integrating and analyzing heterogeneous genome-scale data is a huge algorithmic challenge for modern systems biology. Bipartite graphs can be useful for representing relationships across pairs of disparate data types, with the interpretation of these relationships accomplished through an enumeration of maximal bicliques. Most previously-known techniques are generally ill-suited to this foundational task, because they are relatively inefficient and without effective scaling. In this paper, a powerful new algorithm is described that produces all maximal bicliques in a bipartite graph. Unlike most previous approaches, the new method neither places undue restrictions on its input nor inflates the problem size. Efficiency is achieved through an innovative exploitation of bipartite graph structure, and through computational reductions that rapidly eliminate non-maximal candidates from the search space. An iterative selection of vertices for consideration based on non-decreasing common neighborhood sizes boosts efficiency and leads to more balanced recursion trees. Results. The new technique is implemented and compared to previously published approaches from graph theory and data mining. Formal time and space bounds are derived. Experiments are performed on both random graphs and graphs constructed from functional genomics data. It is shown that the new method substantially outperforms the best previous alternatives. Conclusions. The new method is streamlined, efficient, and particularly well-suited to the study of huge and diverse biological data. A robust implementation has been incorporated into GeneWeaver, an online tool for integrating and analyzing functional genomics experiments, available at http://geneweaver.org. The enormous increase in scalability it provides empowers users to study complex and previously unassailable gene-set associations between genes and their biological functions in a hierarchical fashion and on a genome-wide scale. This practical computational resource is adaptable to almost any applications environment in which bipartite graphs can be used to model relationships between pairs of heterogeneous entities. PMID:24731198
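
    To make the notion of a maximal biclique concrete, the following is a minimal sketch of one simple way to enumerate them. It is not the algorithm described in the abstract and has none of its pruning or vertex-ordering optimizations; it only collects every distinct intersection of left-vertex neighbourhoods and pairs each one with all left vertices adjacent to it. The function name `maximal_bicliques` and the toy gene/term labels are assumptions made for illustration, and the approach can take exponential time on large inputs.

```python
def maximal_bicliques(adj):
    """Enumerate maximal bicliques of a bipartite graph (illustrative only).

    adj maps each left vertex to the set of right vertices it is linked to.
    Returns a set of (left_set, right_set) pairs of frozensets, both non-empty.
    """
    # Step 1: collect every distinct non-empty intersection of neighbourhoods.
    closed = set()
    for nbrs in adj.values():
        nbrs = frozenset(nbrs)
        new = {nbrs} | {c & nbrs for c in closed}
        closed |= {c for c in new if c}

    # Step 2: pair each right set with every left vertex adjacent to all of it.
    result = set()
    for right in closed:
        left = frozenset(u for u, nbrs in adj.items() if right <= set(nbrs))
        result.add((left, right))
    return result


# Hypothetical toy data: genes on the left, annotation terms on the right.
toy = {
    "g1": {"a", "b"},
    "g2": {"a", "b", "c"},
    "g3": {"b", "c"},
}
for left, right in sorted(maximal_bicliques(toy), key=lambda p: sorted(p[1])):
    print(sorted(left), sorted(right))
```

    Each printed pair is a set of genes together with the full set of terms they all share, such that neither side can be enlarged without breaking the all-to-all relationship.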

  19. Genomics and Biochemistry of Saccharomyces cerevisiae Wine Yeast Strains.

    PubMed

    Eldarov, M A; Kishkovskaia, S A; Tanaschuk, T N; Mardanov, A V

    2016-12-01

    Saccharomyces yeasts have been used for millennia for the production of beer, wine, bread, and other fermented products. Long-term "unconscious" selection and domestication led to the selection of hundreds of strains with desired production traits having significant phenotypic and genetic differences from their wild ancestors. This review summarizes the results of recent research in deciphering the genomes of wine Saccharomyces strains, the use of comparative genomics methods to study the mechanisms of yeast genome evolution under conditions of artificial selection, and the use of genomic and postgenomic approaches to identify the molecular nature of the important characteristics of commercial wine strains of Saccharomyces. Data on the metagenomics of microbial communities of grapes and wine and on the dynamics of yeast and bacterial flora in the course of winemaking are summarized briefly. A separate section is devoted to an overview of the physiological, genetic, and biochemical features of sherry yeast strains used to produce biologically aged wines. The goal of the review is to convince the reader of the efficacy of new genomic and postgenomic technologies as tools for developing strategies for targeted selection and creation of new strains using "classical" and modern techniques for improving winemaking technology.

  20. Genomic regions with a history of divergent selection affect fitness of hybrids between two butterfly species.

    PubMed

    Gompert, Zachariah; Lucas, Lauren K; Nice, Chris C; Fordyce, James A; Forister, Matthew L; Buerkle, C Alex

    2012-07-01

    Speciation is the process by which reproductively isolated lineages arise, and is one of the fundamental means by which the diversity of life increases. Whereas numerous studies have documented an association between ecological divergence and reproductive isolation, relatively little is known about the role of natural selection in genome divergence during the process of speciation. Here, we use genome-wide DNA sequences and Bayesian models to test the hypothesis that loci under divergent selection between two butterfly species (Lycaeides idas and L. melissa) also affect fitness in an admixed population. Locus-specific measures of genetic differentiation between L. idas and L. melissa and genomic introgression in hybrids varied across the genome. The most differentiated genetic regions were characterized by elevated L. idas ancestry in the admixed population, which occurs in L. idas-like habitat, consistent with the hypothesis that local adaptation contributes to speciation. Moreover, locus-specific measures of genetic differentiation (a metric of divergent selection) were positively associated with extreme genomic introgression (a metric of hybrid fitness). Interestingly, concordance of differentiation and introgression was only partial. We discuss multiple, complementary explanations for this partial concordance. © 2012 The Author(s).

  1. CRISPR-Cas9-Based Genome Editing of Human Induced Pluripotent Stem Cells.

    PubMed

    Giacalone, Joseph C; Sharma, Tasneem P; Burnight, Erin R; Fingert, John F; Mullins, Robert F; Stone, Edwin M; Tucker, Budd A

    2018-02-28

    Human induced pluripotent stem cells (hiPSCs) are the ideal cell source for autologous cell replacement. However, for patients with Mendelian diseases, genetic correction of the original disease-causing mutation is likely required prior to cellular differentiation and transplantation. The emergence of the CRISPR-Cas9 system has revolutionized the field of genome editing. By introducing inexpensive reagents that are relatively straightforward to design and validate, it is now possible to correct genetic variants or insert desired sequences at any location within the genome. CRISPR-based genome editing of patient-specific iPSCs shows great promise for future autologous cell replacement therapies. One caveat, however, is that hiPSCs are notoriously difficult to transfect, and optimized experimental design considerations are often necessary. This unit describes design strategies and methods for efficient CRISPR-based genome editing of patient-specific iPSCs. Additionally, it details a flexible approach that utilizes positive selection to generate clones with a desired genomic modification, Cre-lox recombination to remove the integrated selection cassette, and negative selection to eliminate residual hiPSCs with intact selection cassettes. © 2018 by John Wiley & Sons, Inc.

  2. Different selective pressures lead to different genomic outcomes as newly-formed hybrid yeasts evolve.

    PubMed

    Piotrowski, Jeff S; Nagarajan, Saisubramanian; Kroll, Evgueny; Stanbery, Alison; Chiotti, Kami E; Kruckeberg, Arthur L; Dunn, Barbara; Sherlock, Gavin; Rosenzweig, Frank

    2012-04-02

    Interspecific hybridization occurs in every eukaryotic kingdom. While hybrid progeny are frequently at a selective disadvantage, in some instances their increased genome size and complexity may result in greater stress resistance than their ancestors, which can be adaptively advantageous at the edges of their ancestors' ranges. While this phenomenon has been repeatedly documented in the field, the response of hybrid populations to long-term selection has not often been explored in the lab. To fill this knowledge gap we crossed the two most distantly related members of the Saccharomyces sensu stricto group, S. cerevisiae and S. uvarum, and established a mixed population of homoploid and aneuploid hybrids to study how different types of selection impact hybrid genome structure. As temperature was raised incrementally from 31°C to 46.5°C over 500 generations of continuous culture, selection favored loss of the S. uvarum genome, although the kinetics of genome loss differed among independent replicates. Temperature-selected isolates exhibited greater inherent and induced thermal tolerance than parental species and founding hybrids, and also exhibited ethanol resistance. In contrast, as exogenous ethanol was increased from 0% to 14% over 500 generations of continuous culture, selection favored euploid S. cerevisiae x S. uvarum hybrids. Ethanol-selected isolates were more ethanol tolerant than S. uvarum and one of the founding hybrids, but did not exhibit resistance to temperature stress. Relative to parental and founding hybrids, temperature-selected strains showed heritable differences in cell wall structure in the forms of increased resistance to zymolyase digestion and Micafungin, which targets cell wall biosynthesis. This is the first study to show experimentally that the genomic fate of newly-formed interspecific hybrids depends on the type of selection they encounter during the course of evolution, underscoring the importance of the ecological theatre in determining the outcome of the evolutionary play.

  3. Changes in Malaria Parasite Drug Resistance in an Endemic Population Over a 25-Year Period With Resulting Genomic Evidence of Selection

    PubMed Central

    Nwakanma, Davis C.; Duffy, Craig W.; Amambua-Ngwa, Alfred; Oriero, Eniyou C.; Bojang, Kalifa A.; Pinder, Margaret; Drakeley, Chris J.; Sutherland, Colin J.; Milligan, Paul J.; MacInnis, Bronwyn; Kwiatkowski, Dominic P.; Clark, Taane G.; Greenwood, Brian M.; Conway, David J.

    2014-01-01

    Background. Analysis of genome-wide polymorphism in many organisms has potential to identify genes under recent selection. However, data on historical allele frequency changes are rarely available for direct confirmation. Methods. We genotyped single nucleotide polymorphisms (SNPs) in 4 Plasmodium falciparum drug resistance genes in 668 archived parasite-positive blood samples of a Gambian population between 1984 and 2008. This covered a period before antimalarial resistance was detected locally, through subsequent failure of multiple drugs until introduction of artemisinin combination therapy. We separately performed genome-wide sequence analysis of 52 clinical isolates from 2008 to prospect for loci under recent directional selection. Results. Resistance alleles increased from very low frequencies, peaking in 2000 for chloroquine resistance-associated crt and mdr1 genes and at the end of the survey period for dhfr and dhps genes respectively associated with pyrimethamine and sulfadoxine resistance. Temporal changes fit a model incorporating likely selection coefficients over the period. Three of the drug resistance loci were in the top 4 regions under strong selection implicated by the genome-wide analysis. Conclusions. Genome-wide polymorphism analysis of an endemic population sample robustly identifies loci with detailed documentation of recent selection, demonstrating power to prospectively detect emerging drug resistance genes. PMID:24265439
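
    The abstract does not state which selection model was fitted to the temporal data. As a rough illustration of how a selection coefficient translates into an allele frequency trajectory, the sketch below assumes a deterministic haploid model in which the resistance allele has relative fitness 1 + s, so that its odds are multiplied by (1 + s) each generation. The starting frequency, the values of s, and the function names are hypothetical and are not taken from the study.

```python
def next_frequency(p, s):
    """One generation of haploid selection; the allele has relative fitness 1 + s."""
    return p * (1.0 + s) / (1.0 + s * p)


def trajectory(p0, s_by_generation):
    """Allele frequencies over time, allowing s to change between generations."""
    freqs = [p0]
    for s in s_by_generation:
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# Hypothetical scenario: positive selection while a drug is in use (s = 0.1),
# followed by a fitness cost once the drug is withdrawn (s = -0.05).
freqs = trajectory(0.01, [0.1] * 50 + [-0.05] * 30)
print(round(freqs[50], 3), round(freqs[-1], 3))
```

    A model of this kind produces the rise-then-fall pattern described for alleles whose frequency peaked during the survey period; fitting it to observed counts would additionally require a sampling model, which is omitted here.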

  4. Signatures of Diversifying Selection in European Pig Breeds

    PubMed Central

    Wilkinson, Samantha; Lu, Zen H.; Megens, Hendrik-Jan; Archibald, Alan L.; Haley, Chris; Jackson, Ian J.; Groenen, Martien A. M.; Crooijmans, Richard P. M. A.; Ogden, Rob; Wiener, Pamela

    2013-01-01

    Following domestication, livestock breeds have experienced intense selection pressures for the development of desirable traits. This has resulted in a large diversity of breeds that display variation in many phenotypic traits, such as coat colour, muscle composition, early maturity, growth rate, body size, reproduction, and behaviour. To better understand the relationship between genomic composition and phenotypic diversity arising from breed development, the genomes of 13 traditional and commercial European pig breeds were scanned for signatures of diversifying selection using the Porcine60K SNP chip, applying a between-population (differentiation) approach. Signatures of diversifying selection between breeds were found in genomic regions associated with traits related to breed standard criteria, such as coat colour and ear morphology. Amino acid differences in the EDNRB gene appear to be associated with one of these signatures, and variation in the KITLG gene may be associated with another. Other selection signals were found in genomic regions including QTLs and genes associated with production traits such as reproduction, growth, and fat deposition. Some selection signatures were associated with regions showing evidence of introgression from Asian breeds. When the European breeds were compared with wild boar, genomic regions with high levels of differentiation harboured genes related to bone formation, growth, and fat deposition. PMID:23637623

  5. Genomic signature of natural and anthropogenic stress in wild populations of the waterflea Daphnia magna: validation in space, time and experimental evolution.

    PubMed

    Orsini, Luisa; Spanier, Katina I; De Meester, Luc

    2012-05-01

    Natural populations are confronted with multiple selection pressures resulting in a mosaic of environmental stressors at the landscape level. Identifying the genetic underpinning of adaptation to these complex selection environments and assigning causes of natural selection within multidimensional selection regimes in the wild is challenging. The water flea Daphnia is a renowned ecological model system with its well-documented ecology, the possibility to analyse subfossil dormant egg banks and the short generation time allowing an experimental evolution approach. Capitalizing on the strengths of this model system, we here link candidate genome regions to three selection pressures, known to induce micro-evolutionary responses in Daphnia magna: fish predation, parasitism and land use. Using a genome scan approach in space, time and experimental evolution trials, we provide solid evidence of selection at the genome level under well-characterized environmental gradients in the wild and identify candidate genes linked to the three environmental stressors. Our study reveals differential selection at the genome level in Daphnia populations and provides evidence for repeatable patterns of local adaptation in a geographic mosaic of environmental stressors fuelled by standing genetic variation. Our results imply high evolutionary potential of local populations, which is relevant to understand the dynamics of trait changes in natural populations and their impact on community and ecosystem responses through eco-evolutionary feedbacks. © 2012 Blackwell Publishing Ltd.

  6. The impact of selection, gene flow and demographic history on heterogeneous genomic divergence: three-spine sticklebacks in divergent environments.

    PubMed

    Ferchaud, Anne-Laure; Hansen, Michael M

    2016-01-01

    Heterogeneous genomic divergence between populations may reflect selection, but should also be seen in conjunction with gene flow and drift, particularly population bottlenecks. Marine and freshwater three-spine stickleback (Gasterosteus aculeatus) populations often exhibit different lateral armour plate morphs. Moreover, strikingly parallel genomic footprints across different marine-freshwater population pairs are interpreted as parallel evolution and gene reuse. Nevertheless, in some geographic regions like the North Sea and Baltic Sea, different patterns are observed. Freshwater populations in coastal regions are often dominated by marine morphs, suggesting that gene flow overwhelms selection, and genomic parallelism may also be less pronounced. We used RAD sequencing for analysing 28 888 SNPs in two marine and seven freshwater populations in Denmark, Europe. Freshwater populations represented a variety of environments: river populations accessible to gene flow from marine sticklebacks and large and small isolated lakes with and without fish predators. Sticklebacks in an accessible river environment showed minimal morphological and genomewide divergence from marine populations, supporting the hypothesis of gene flow overriding selection. Allele frequency spectra suggested bottlenecks in all freshwater populations, and particularly two small lake populations. However, genomic footprints ascribed to selection could nevertheless be identified. No genomic regions were consistent freshwater-marine outliers, and parallelism was much lower than in other comparable studies. Two genomic regions previously described to be under divergent selection in freshwater and marine populations were outliers between different freshwater populations. We ascribe these patterns to stronger environmental heterogeneity among freshwater populations in our study as compared to most other studies, although the demographic history involving bottlenecks should also be considered in the interpretation of results. © 2015 John Wiley & Sons Ltd.

  7. Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora

    PubMed Central

    Steige, Kim A.; Laenen, Benjamin; Reimegård, Johan; Slotte, Tanja

    2017-01-01

    Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species. PMID:28096395

  8. Accounting for Genotype-by-Environment Interactions and Residual Genetic Variation in Genomic Selection for Water-Soluble Carbohydrate Concentration in Wheat.

    PubMed

    Ovenden, Ben; Milgate, Andrew; Wade, Len J; Rebetzke, Greg J; Holland, James B

    2018-05-31

    Abiotic stress tolerance traits are often complex and recalcitrant targets for conventional breeding improvement in many crop species. This study evaluated the potential of genomic selection to predict water-soluble carbohydrate concentration (WSCC), an important drought tolerance trait, in wheat under field conditions. A panel of 358 varieties and breeding lines constrained for maturity was evaluated under rainfed and irrigated treatments across two locations and two years. Whole-genome marker profiles and factor analytic mixed models were used to generate genomic estimated breeding values (GEBVs) for specific environments and environment groups. Additive genetic variance was smaller than residual genetic variance for WSCC, such that genotypic values were dominated by residual genetic effects rather than additive breeding values. As a result, GEBVs were not accurate predictors of genotypic values of the extant lines, but GEBVs should be reliable selection criteria to choose parents for intermating to produce new populations. The accuracy of GEBVs for untested lines was sufficient to increase predicted genetic gain from genomic selection per unit time compared to phenotypic selection if the breeding cycle is reduced by half by the use of GEBVs in off-season generations. Further, genomic prediction accuracy depended on having phenotypic data from environments with strong correlations with target production environments to build prediction models. By combining high-density marker genotypes, stress-managed field evaluations, and mixed models that model simultaneously covariances among genotypes and covariances of complex trait performance between pairs of environments, we were able to train models with good accuracy to facilitate genetic gain from genomic selection. Copyright © 2018 Ovenden et al.
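
    The claim about gain per unit time can be illustrated with the standard breeder's equation, in which expected response per year is selection intensity times selection accuracy times additive genetic standard deviation, divided by cycle length. The sketch below is a worked example under that textbook formula only; the accuracies and cycle lengths are hypothetical placeholders, not values reported in the study.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation."""
    return intensity * accuracy * sigma_a / cycle_years


def break_even_accuracy(accuracy_ps, cycle_ps, cycle_gs):
    """Genomic accuracy above which genomic selection beats phenotypic selection,
    assuming equal selection intensity and additive variance in both schemes."""
    return accuracy_ps * cycle_gs / cycle_ps


# Hypothetical numbers: phenotypic selection with accuracy 0.6 on a 4-year cycle,
# genomic selection with accuracy 0.4 on a 2-year cycle (cycle halved).
ps = gain_per_year(intensity=1.0, accuracy=0.6, sigma_a=1.0, cycle_years=4.0)
gs = gain_per_year(intensity=1.0, accuracy=0.4, sigma_a=1.0, cycle_years=2.0)
print(ps, gs, break_even_accuracy(0.6, 4.0, 2.0))
```

    With the cycle halved, genomic selection needs only more than half the accuracy of phenotypic selection to deliver more gain per year, which is the logic behind the abstract's conditional statement.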

  9. Whole genome sequencing of turbot (Scophthalmus maximus; Pleuronectiformes): a fish adapted to demersal life

    PubMed Central

    Figueras, Antonio; Robledo, Diego; Corvelo, André; Hermida, Miguel; Pereiro, Patricia; Rubiolo, Juan A.; Gómez-Garrido, Jèssica; Carreté, Laia; Bello, Xabier; Gut, Marta; Gut, Ivo Glynne; Marcet-Houben, Marina; Forn-Cuní, Gabriel; Galán, Beatriz; García, José Luis; Abal-Fabeiro, José Luis; Pardo, Belen G.; Taboada, Xoana; Fernández, Carlos; Vlasova, Anna; Hermoso-Pulido, Antonio; Guigó, Roderic; Álvarez-Dios, José Antonio; Gómez-Tato, Antonio; Viñas, Ana; Maside, Xulio; Gabaldón, Toni; Novoa, Beatriz; Bouza, Carmen; Alioto, Tyler; Martínez, Paulino

    2016-01-01

    The turbot is a flatfish (Pleuronectiformes) with increasing commercial value, which has prompted active genomic research aimed at more efficient selection. Here we present the sequence and annotation of the turbot genome, which represents a milestone for both boosting breeding programmes and ascertaining the origin and diversification of flatfish. We compare the turbot genome with model fish genomes to investigate teleost chromosome evolution. We observe a conserved macrosyntenic pattern within Percomorpha and identify large syntenic blocks within the turbot genome related to the teleost genome duplication. We identify gene family expansions and positive selection of genes associated with vision and metabolism of membrane lipids, which suggests adaptation to demersal lifestyle and to cold temperatures, respectively. Our data indicate a quick evolution and diversification of flatfish to adapt to benthic life and provide clues for understanding their controversial origin. Moreover, we investigate the genomic architecture of growth, sex determination and disease resistance, key traits for understanding local adaptation and boosting turbot production, by mapping candidate genes and previously reported quantitative trait loci. The genomic architecture of these productive traits has allowed the identification of candidate genes and enriched pathways that may represent useful information for future marker-assisted selection in turbot. PMID:26951068

  10. Whole genome sequencing of turbot (Scophthalmus maximus; Pleuronectiformes): a fish adapted to demersal life.

    PubMed

    Figueras, Antonio; Robledo, Diego; Corvelo, André; Hermida, Miguel; Pereiro, Patricia; Rubiolo, Juan A; Gómez-Garrido, Jèssica; Carreté, Laia; Bello, Xabier; Gut, Marta; Gut, Ivo Glynne; Marcet-Houben, Marina; Forn-Cuní, Gabriel; Galán, Beatriz; García, José Luis; Abal-Fabeiro, José Luis; Pardo, Belen G; Taboada, Xoana; Fernández, Carlos; Vlasova, Anna; Hermoso-Pulido, Antonio; Guigó, Roderic; Álvarez-Dios, José Antonio; Gómez-Tato, Antonio; Viñas, Ana; Maside, Xulio; Gabaldón, Toni; Novoa, Beatriz; Bouza, Carmen; Alioto, Tyler; Martínez, Paulino

    2016-06-01

    The turbot is a flatfish (Pleuronectiformes) with increasing commercial value, which has prompted active genomic research aimed at more efficient selection. Here we present the sequence and annotation of the turbot genome, which represents a milestone for both boosting breeding programmes and ascertaining the origin and diversification of flatfish. We compare the turbot genome with model fish genomes to investigate teleost chromosome evolution. We observe a conserved macrosyntenic pattern within Percomorpha and identify large syntenic blocks within the turbot genome related to the teleost genome duplication. We identify gene family expansions and positive selection of genes associated with vision and metabolism of membrane lipids, which suggests adaptation to demersal lifestyle and to cold temperatures, respectively. Our data indicate a quick evolution and diversification of flatfish to adapt to benthic life and provide clues for understanding their controversial origin. Moreover, we investigate the genomic architecture of growth, sex determination and disease resistance, key traits for understanding local adaptation and boosting turbot production, by mapping candidate genes and previously reported quantitative trait loci. The genomic architecture of these productive traits has allowed the identification of candidate genes and enriched pathways that may represent useful information for future marker-assisted selection in turbot. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  11. Parasitism drives host genome evolution: Insights from the Pasteuria ramosa-Daphnia magna system.

    PubMed

    Bourgeois, Yann; Roulin, Anne C; Müller, Kristina; Ebert, Dieter

    2017-04-01

    Because parasitism is thought to play a major role in shaping host genomes, it has been predicted that genomic regions associated with resistance to parasites should stand out in genome scans, revealing signals of selection above the genomic background. To test whether parasitism is indeed such a major factor in host evolution and to better understand host-parasite interaction at the molecular level, we studied genome-wide polymorphisms in 97 genotypes of the planktonic crustacean Daphnia magna originating from three localities across Europe. Daphnia magna is known to coevolve with the bacterial pathogen Pasteuria ramosa for which host genotypes (clonal lines) are either resistant or susceptible. Using association mapping, we identified two genomic regions involved in resistance to P. ramosa, one of which was already known from a previous QTL analysis. We then performed a naïve genome scan to test for signatures of positive selection and found that the two regions identified with the association mapping further stood out as outliers. Several other regions with evidence for selection were also found, but no link between these regions and phenotypic variation could be established. Our results are consistent with the hypothesis that parasitism is driving host genome evolution. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.

  12. Speed congenics: accelerated genome recovery using genetic markers.

    PubMed

    Visscher, P M

    1999-08-01

    Genetic markers throughout the genome can be used to speed up 'recovery' of the recipient genome in the backcrossing phase of the construction of a congenic strain. The prediction of the genomic proportion during backcrossing depends on the assumptions regarding the distribution of chromosome segments, the population structure, the marker spacing and the selection strategy. In this study simulation was used to investigate the rate of recovery of the recipient genome for a mouse, Drosophila and Arabidopsis genome. It was shown that an incorrect assumption of a binomial distribution of chromosome segments, and failing to take account of a reduction in variance in genomic proportion due to selection, can lead to a downward bias of up to two generations in the estimation of the number of generations required for the formation of a congenic strain.
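
    The kind of simulation the abstract refers to can be sketched as follows. This is not the author's simulation: it assumes a single chromosome, evenly spaced markers, crossovers without interference (Haldane map function), and a selection rule that keeps the backcross offspring with the fewest donor alleles among those still carrying an optional target marker. All parameter names and default values are hypothetical.

```python
import math
import random


def make_gamete(mosaic, recomb_frac, rng):
    """Gamete from a parent with one pure recipient homolog and one `mosaic`
    homolog (True = donor allele at that marker)."""
    on_mosaic = rng.random() < 0.5           # homolog the gamete starts on
    gamete = []
    for i, allele in enumerate(mosaic):
        if i > 0 and rng.random() < recomb_frac:
            on_mosaic = not on_mosaic         # crossover between adjacent markers
        gamete.append(allele if on_mosaic else False)
    return gamete


def recipient_proportion(mosaic):
    """Diploid recipient proportion; the other homolog is entirely recipient."""
    return 1.0 - 0.5 * sum(mosaic) / len(mosaic)


def simulate_backcross(generations, offspring_per_gen, n_markers=50,
                       chrom_morgans=1.0, target=None, seed=0):
    """Return the recipient genome proportion after each backcross generation."""
    rng = random.Random(seed)
    d = chrom_morgans / (n_markers - 1)       # map distance between markers
    r = 0.5 * (1.0 - math.exp(-2.0 * d))      # Haldane map function
    mosaic = [True] * n_markers               # F1: donor homolog is all donor
    history = []
    for _ in range(generations):
        candidates = []
        while len(candidates) < offspring_per_gen:
            g = make_gamete(mosaic, r, rng)
            if target is None or g[target]:   # keep carriers of the target only
                candidates.append(g)
        mosaic = min(candidates, key=sum)     # select fewest donor alleles
        history.append(recipient_proportion(mosaic))
    return history


print(simulate_backcross(generations=4, offspring_per_gen=1, seed=1))
print(simulate_backcross(generations=4, offspring_per_gen=20, target=25, seed=1))
```

    Because linked markers are inherited in segments, the spread of recipient proportion among offspring is wider than a binomial model with independent loci would predict, and selecting the best offspring each generation narrows it again; both effects bear on the bias discussed in the abstract.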

  13. Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies.

    PubMed

    Mina, Marco; Raynaud, Franck; Tavernari, Daniele; Battistello, Elena; Sungalee, Stephanie; Saghafinia, Sadegh; Laessle, Titouan; Sanchez-Vega, Francisco; Schultz, Nikolaus; Oricchio, Elisa; Ciriello, Giovanni

    2017-08-14

    Cancer evolves through the emergence and selection of molecular alterations. Cancer genome profiling has revealed that specific events are more or less likely to be co-selected, suggesting that the selection of one event depends on the others. However, the nature of these evolutionary dependencies and their impact remain unclear. Here, we designed SELECT, an algorithmic approach to systematically identify evolutionary dependencies from alteration patterns. By analyzing 6,456 genomes from multiple tumor types, we constructed a map of oncogenic dependencies associated with cellular pathways, transcriptional readouts, and therapeutic response. Finally, modeling of cancer evolution shows that alteration dependencies emerge only under conditional selection. These results provide a framework for the design of strategies to predict cancer progression and therapeutic response. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. A Universal Next-Generation Sequencing Protocol To Generate Noninfectious Barcoded cDNA Libraries from High-Containment RNA Viruses

    PubMed Central

    Moser, Lindsey A.; Ramirez-Carvajal, Lisbeth; Puri, Vinita; Pauszek, Steven J.; Matthews, Krystal; Dilley, Kari A.; Mullan, Clancy; McGraw, Jennifer; Khayat, Michael; Beeri, Karen; Yee, Anthony; Dugan, Vivien; Heise, Mark T.; Frieman, Matthew B.; Rodriguez, Luis L.; Bernard, Kristen A.; Wentworth, David E.

    2016-01-01

    ABSTRACT Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach provides full-length genomic sequence information not only from high-titer virion preparations but it can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA. IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). This eliminates the burden of testing all processed samples derived from high-consequence pathogens prior to transfer from high-containment laboratories to lower-containment facilities for sequencing. Our established protocol can be scaled up for high-throughput sequencing of hundreds of samples simultaneously, which can dramatically reduce the cost and effort required for NGS library construction. NGS data from this SOP can provide complete genome coverage from viral stocks and can also detect virus-specific reads from limited starting material. Our data suggest that the procedure can be implemented and easily validated by institutional biosafety committees across research laboratories. PMID:27822536

  15. Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U.S. Holsteins.

    PubMed

    He, Jun; Xu, Jiaqi; Wu, Xiao-Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D; Spangler, Matthew L

    2018-04-01

    SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies, and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications on the design of LD SNP chips for imputation-enabled genomic prediction.
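
    The panel-design factors named in this abstract (even spacing and higher minor allele frequency) and the idea of an imputation accuracy can be illustrated with a minimal sketch. The binning rule, the function names, and the use of simple genotype concordance as the accuracy measure are assumptions made for illustration; the abstract does not state how the eight panels were built or how accuracy was scored.

```python
def select_ld_panel(snps, n_bins, chrom_length):
    """Pick one SNP per equal-width bin, preferring the highest MAF in each bin.

    snps: list of (name, position_bp, maf) tuples for one chromosome.
    This binning rule is a hypothetical stand-in for an "evenly spaced,
    high-MAF" selection strategy.
    """
    bin_width = chrom_length / n_bins
    best = {}
    for name, pos, maf in snps:
        b = min(int(pos / bin_width), n_bins - 1)  # clamp the last position
        if b not in best or maf > best[b][2]:
            best[b] = (name, pos, maf)
    return [best[b][0] for b in sorted(best)]


def concordance(true_genotypes, imputed_genotypes):
    """Share of genotype calls (coded 0/1/2) that were imputed correctly."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype vectors must have the same length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


# Toy data, not taken from the study.
toy_snps = [
    ("s1", 100, 0.10),
    ("s2", 400, 0.45),
    ("s3", 1200, 0.30),
    ("s4", 1900, 0.05),
    ("s5", 2500, 0.25),
]
panel = select_ld_panel(toy_snps, n_bins=3, chrom_length=3000)
print(panel)                                   # ['s2', 's3', 's5']
print(concordance([0, 1, 2, 1], [0, 1, 2, 2]))  # 0.75
```

    A trait-informed panel, as the abstract describes, would additionally rank SNPs within each bin by their estimated association with the target traits; that step is omitted here.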

  16. Increased genomic prediction accuracy in wheat breeding through spatial adjustment of field trial data.

    PubMed

    Lado, Bettina; Matus, Ivan; Rodríguez, Alejandra; Inostroza, Luis; Poland, Jesse; Belzile, François; del Pozo, Alejandro; Quincke, Martín; Castro, Marina; von Zitzewitz, Jarislav

    2013-12-09

    In crop breeding, the interest of predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed-model using moving-means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models.
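
    The moving-means covariate mentioned in this abstract can be sketched as follows. The window shape (a square neighbourhood of radius 1, excluding the plot itself) and the function name are assumptions for illustration; the abstract does not give the window actually used. The resulting value would enter a mixed model as a covariate for each plot.

```python
def moving_mean_covariate(field, radius=1):
    """Mean of neighbouring plots around each plot in a rectangular field.

    field: list of rows, each a list of plot values (for example, yield).
    The plot itself is excluded from its own neighbourhood mean.
    Assumes the field contains at least two plots.
    """
    n_rows = len(field)
    n_cols = len(field[0])
    covariate = []
    for r in range(n_rows):
        row_out = []
        for c in range(n_cols):
            neighbours = [
                field[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (i, j) != (r, c)
            ]
            row_out.append(sum(neighbours) / len(neighbours))
        covariate.append(row_out)
    return covariate


# Toy 3 x 3 field, not data from the study.
toy_field = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
cov = moving_mean_covariate(toy_field)
print(cov[1][1])  # 5.0, the mean of the eight plots around the centre
```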

  17. Highly polygenic architecture of antidepressant treatment response: Comparative analysis of SSRI and NRI treatment in an animal model of depression.

    PubMed

    Malki, Karim; Tosto, Maria Grazia; Mouriño-Talín, Héctor; Rodríguez-Lorenzo, Sabela; Pain, Oliver; Jumhaboy, Irfan; Liu, Tina; Parpas, Panos; Newman, Stuart; Malykh, Artem; Carboni, Lucia; Uher, Rudolf; McGuffin, Peter; Schalkwyk, Leonard C; Bryson, Kevin; Herbster, Mark

    2017-04-01

    Response to antidepressant (AD) treatment may be a more polygenic trait than previously hypothesized, with many genetic variants interacting in yet unclear ways. In this study we used methods that can automatically learn to detect patterns of statistical regularity from a sparsely distributed signal across hippocampal transcriptome measurements in a large-scale animal pharmacogenomic study to uncover genomic variations associated with AD. The study used four inbred mouse strains of both sexes, two drug treatments, and a control group (escitalopram, nortriptyline, and saline). Multi-class and binary classification using Machine Learning (ML) and regularization algorithms using iterative and univariate feature selection methods, including InfoGain, mRMR, ANOVA, and Chi Square, were used to uncover genomic markers associated with AD response. Relevant genes were selected based on Jaccard distance and carried forward for gene-network analysis. Linear association methods uncovered only one gene associated with drug treatment response. The implementation of ML algorithms, together with feature reduction methods, revealed a set of 204 genes associated with SSRI and 241 genes associated with NRI response. Although only 10% of genes overlapped across the two drugs, network analysis shows that both drugs modulated the CREB pathway, through different molecular mechanisms. Through careful implementation and optimisations, the algorithms detected a weak signal used to predict whether an animal was treated with nortriptyline (77%) or escitalopram (67%) on an independent testing set. The results from this study indicate that the molecular signature of AD treatment may include a much broader range of genomic markers than previously hypothesized, suggesting that response to medication may be as complex as the pathology. The search for biomarkers of antidepressant treatment response could therefore consider a higher number of genetic markers and their interactions. 
Through predominately different molecular targets and mechanisms of action, the two drugs modulate the same Creb1 pathway which plays a key role in neurotrophic responses and in inflammatory processes. © 2016 The Authors. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics Published by Wiley Periodicals, Inc.
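
    The abstract says genes were selected based on Jaccard distance. A minimal sketch of that measure on two gene sets is given below; the gene labels are hypothetical placeholders, and how the study applied the distance (for example, across feature-selection methods or cross-validation folds) is not stated in the abstract.

```python
def jaccard_distance(set_a, set_b):
    """Jaccard distance: 1 - |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical gene lists returned by two feature-selection runs.
run_1 = {"g1", "g2", "g3"}
run_2 = {"g1", "g3", "g4", "g5"}
print(round(jaccard_distance(run_1, run_2), 3))  # 0.6 (2 shared of 5 total)
```

    A distance near 0 means two selections largely agree; a distance near 1 means they share few genes.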

  18. Dissection of genomic correlation matrices using multivariate factor analysis in dairy and dual-purpose cattle breeds

    USDA-ARS's Scientific Manuscript database

    SNP effects estimated in genomic selection programs allow for the prediction of direct genomic values (DGV) both at genome-wide and chromosomal level. As a consequence, genome-wide (G_GW) or chromosomal (G_CHR) correlation matrices between genomic predictions for different traits can be calculated. ...

  19. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories.

    PubMed

    Sorbolini, Silvia; Marras, Gabriele; Gaspa, Giustino; Dimauro, Corrado; Cellesi, Massimo; Valentini, Alessio; Macciotta, Nicolò Pp

    2015-06-23

    Domestication and selection are processes that alter the pattern of within- and between-population genetic variability. They can be investigated at the genomic level by tracing the so-called selection signatures. Recently, sequence polymorphisms at the genome-wide level have been investigated in a wide range of animals. A common approach to detect selection signatures is to compare breeds that have been selected for different breeding goals (i.e. dairy and beef cattle). However, genetic variations in different breeds with similar production aptitudes and similar phenotypes can be related to differences in their selection history. In this study, we investigated selection signatures between two Italian beef cattle breeds, Piemontese and Marchigiana, using genotyping data that was obtained with the Illumina BovineSNP50 BeadChip. The comparison was based on the fixation index (Fst), combined with a locally weighted scatterplot smoothing (LOWESS) regression and a control chart approach. In addition, analyses of Fst were carried out to confirm candidate genes. In particular, data were processed using the varLD method, which compares the regional variation of linkage disequilibrium between populations. Genome scans confirmed the presence of selective sweeps in the genomic regions that harbour candidate genes that are known to affect productive traits in cattle such as DGAT1, ABCG2, CAPN3, MSTN and FTO. In addition, several new putative candidate genes (for example ALAS1, ABCB8, ACADS and SOD1) were detected. This study provided evidence on the different selection histories of two cattle breeds and the usefulness of genomic scans to detect selective sweeps even in cattle breeds that are bred for similar production aptitudes.
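
    The fixation index used for the breed comparison can be illustrated with a minimal sketch for one biallelic SNP and two equally weighted populations, using Fst = (H_T - H_S) / H_T. This simple heterozygosity-based form is an assumption for illustration; the abstract does not say which Fst estimator was used, and the LOWESS smoothing and control chart steps are not reproduced here.

```python
def fst_biallelic(p1, p2):
    """Fst for one biallelic locus from allele frequencies in two populations.

    H_S is the mean expected heterozygosity within populations and H_T the
    expected heterozygosity of the pooled (equally weighted) population.
    Returns 0.0 when the locus is monomorphic overall (H_T == 0).
    """
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t


# Toy allele frequencies, not values from the study.
print(fst_biallelic(0.5, 0.5))            # 0.0, no differentiation
print(fst_biallelic(1.0, 0.0))            # 1.0, alternate alleles fixed
print(round(fst_biallelic(0.8, 0.2), 2))  # 0.36
```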

  20. Selective Amplification of the Genome Surrounding Key Placental Genes in Trophoblast Giant Cells.

    PubMed

    Hannibal, Roberta L; Baker, Julie C

    2016-01-25

    While most cells maintain a diploid state, polyploid cells exist in many organisms and are particularly prevalent within the mammalian placenta [1], where they can generate more than 900 copies of the genome [2]. Polyploidy is thought to be an efficient method of increasing the content of the genome by avoiding the costly and slow process of cytokinesis [1, 3, 4]. Polyploidy can also affect gene regulation by amplifying a subset of genomic regions required for specific cellular function [1, 3, 4]. This mechanism is found in the fruit fly Drosophila melanogaster, where polyploid ovarian follicle cells amplify genomic regions containing chorion genes, which facilitate secretion of eggshell proteins [5]. Here, we report that genomic amplification also occurs in mammals at selective regions of the genome in parietal trophoblast giant cells (p-TGCs) of the mouse placenta. Using whole-genome sequencing (WGS) and digital droplet PCR (ddPCR) of mouse p-TGCs, we identified five amplified regions, each containing a gene family known to be involved in mammalian placentation: the prolactins (two clusters), serpins, cathepsins, and the natural killer (NK)/C-type lectin (CLEC) complex [6-12]. We report here the first description of amplification at selective genomic regions in mammals and present evidence that this is an important mode of genome regulation in placental TGCs. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Recent Coselection in Human Populations Revealed by Protein–Protein Interaction Network

    PubMed Central

    Qian, Wei; Zhou, Hang; Tang, Kun

    2015-01-01

    Genome-wide scans for signals of natural selection in human populations have identified a large number of candidate loci that underlie local adaptations. This is surprising given the relatively short evolutionary time since the divergence of the human population. One hypothesis that has not been formally examined is whether and how the recent human evolution may have been shaped by coselection in the context of complex molecular interactome. In this study, genome-wide signals of selection were scanned in East Asians, Europeans, and Africans using 1000 Genomes data, and subsequently mapped onto the protein–protein interaction (PPI) network. We found that the candidate genes of recent positive selection localized significantly closer to each other on the PPI network than expected, revealing substantial clustering of selected genes. Furthermore, gene pairs of shorter PPI network distances showed higher similarities of their recent evolutionary paths than those further apart. Last, subnetworks enriched with recent coselection signals were identified, which are substantially overrepresented in biological pathways related to signal transduction, neurogenesis, and immune function. These results provide the first genome-wide evidence for association of recent selection signals with the PPI network, shedding light on the potential mechanisms of recent coselection in the human genome. PMID:25532814
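
    The notion of candidate genes lying "closer to each other on the PPI network" can be sketched as a mean pairwise shortest-path distance computed by breadth-first search. The toy graph, the node labels, and the function names are hypothetical; the abstract does not describe the distance measure or the null model in detail.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(graph, source, target):
    """Number of edges on a shortest path, or None if target is unreachable.

    graph: dict mapping each node to a list of its neighbours (undirected).
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def mean_pairwise_distance(graph, genes):
    """Mean shortest-path length over all connected pairs of the given genes."""
    lengths = [shortest_path_length(graph, a, b) for a, b in combinations(genes, 2)]
    reachable = [d for d in lengths if d is not None]
    return sum(reachable) / len(reachable) if reachable else None


# Toy network: a simple chain A - B - C - D - E.
toy_ppi = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
print(mean_pairwise_distance(toy_ppi, ["A", "B", "C"]))  # 1.3333333333333333
print(mean_pairwise_distance(toy_ppi, ["A", "E"]))       # 4.0
```

    To judge whether an observed mean distance is smaller than expected, one option is to compare it with the same statistic for many randomly drawn gene sets of equal size.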

  2. Diseases and Molecular Diagnostics: A Step Closer to Precision Medicine.

    PubMed

    Dwivedi, Shailendra; Purohit, Purvi; Misra, Radhieka; Pareek, Puneet; Goel, Apul; Khattri, Sanjay; Pant, Kamlesh Kumar; Misra, Sanjeev; Sharma, Praveen

    2017-10-01

    The recent advent of molecular technologies, together with a multidisciplinary interplay of several fields, led to the development of genomics, which concentrates on the detection of pathogenic events at the genome level. Structural and functional genomics approaches have now pinpointed the technical challenges in exploring disease-related genes, recognizing their structural alterations, and elucidating gene function. Various promising technologies and diagnostic applications of structural genomics are currently building a large database of disease genes, genetic alterations, etc., by mutation scanning and DNA chip technology. Functional genomics, in turn, explores expression genetics (hybridization-, PCR- and sequence-based technologies), two-hybrid technology, and next-generation sequencing with bioinformatics and computational biology. Advances in microarray "chip" technology have allowed the parallel analysis of gene expression patterns of thousands of genes simultaneously. Sequence information collected from the genomes of many individuals is leading to the rapid discovery of single nucleotide polymorphisms (SNPs). Further advances in genetic engineering have also revolutionized immunoassay biotechnology via the engineering of antibody-encoding genes and phage display technology. Biotechnology plays an important role in the development of diagnostic assays in response to an outbreak or critical disease response need. However, there is also a need to pinpoint various obstacles and issues related to the commercialization and widespread dispersal of genetic knowledge derived from the exploitation of the biotechnology industry and the development and marketing of diagnostic services. Implementation of genetic criteria for patient selection and individual assessment of the risks and benefits of treatment emerges as a major challenge to the pharmaceutical industry. 
Thus, this field is revolutionizing the current era and may open new vistas in disease management.

  3. A 100-Year Review: Methods and impact of genetic selection in dairy cattle-From daughter-dam comparisons to deep learning algorithms.

    PubMed

    Weigel, K A; VanRaden, P M; Norman, H D; Grosu, H

    2017-12-01

    In the early 1900s, breed society herdbooks had been established and milk-recording programs were in their infancy. Farmers wanted to improve the productivity of their cattle, but the foundations of population genetics, quantitative genetics, and animal breeding had not been laid. Early animal breeders struggled to identify genetically superior families using performance records that were influenced by local environmental conditions and herd-specific management practices. Daughter-dam comparisons were used for more than 30 yr and, although genetic progress was minimal, the attention given to performance recording, genetic theory, and statistical methods paid off in future years. Contemporary (herdmate) comparison methods allowed more accurate accounting for environmental factors and genetic progress began to accelerate when these methods were coupled with artificial insemination and progeny testing. Advances in computing facilitated the implementation of mixed linear models that used pedigree and performance data optimally and enabled accurate selection decisions. Sequencing of the bovine genome led to a revolution in dairy cattle breeding, and the pace of scientific discovery and genetic progress accelerated rapidly. Pedigree-based models have given way to whole-genome prediction, and Bayesian regression models and machine learning algorithms have joined mixed linear models in the toolbox of modern animal breeders. Future developments will likely include elucidation of the mechanisms of genetic inheritance and epigenetic modification in key biological pathways, and genomic data will be used with data from on-farm sensors to facilitate precision management on modern dairy farms. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  4. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits

    PubMed Central

    Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. 
Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits (by genomic selection or MAS) and forage yield. PMID:28068350

  5. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

    PubMed

    Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. 
Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits (by genomic selection or MAS) and forage yield.

  6. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer).

    PubMed

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili; Liu, Bao; Li, Lin-Feng

    2017-09-01

    Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Identifying footprints of directional and balancing selection in marine and freshwater three-spined stickleback (Gasterosteus aculeatus) populations.

    PubMed

    Mäkinen, H S; Cano, J M; Merilä, J

    2008-08-01

    Natural selection is expected to leave an imprint on the neutral polymorphisms at the adjacent genomic regions of a selected gene. While directional selection tends to reduce within-population genetic diversity and increase among-population differentiation, the reverse is expected under balancing selection. To identify targets of natural selection in the three-spined stickleback (Gasterosteus aculeatus) genome, 103 microsatellite and two indel markers including expressed sequence tags (EST) and quantitative trait loci (QTL)-associated loci, were genotyped in four freshwater and three marine populations. The results indicated that a high proportion of loci (14.7%) might be affected by balancing selection and a lower proportion (2.8%) by directional selection. The strongest signatures of directional selection were detected in a microsatellite locus and two indel markers located in the intronic regions of the Eda-gene coding for the number of lateral plates. Yet, other microsatellite loci previously found to be informative in QTL-mapping studies revealed no signatures of selection. Two novel microsatellite loci (Stn12 and Stn90) located in chromosomes I and VIII, respectively, showed signals of directional selection and might be linked to genomic regions containing gene(s) important for adaptive divergence. Although the coverage of the total genomic content was relatively low, the predominance of balancing selection signals is in agreement with the contention that balancing, rather than directional selection is the predominant mode of selection in the wild.

  8. Can we use genetic and genomic approaches to identify candidate animals for targeted selective treatment.

    PubMed

    Laurenson, Yan C S M; Kyriazakis, Ilias; Bishop, Stephen C

    2013-10-18

    Estimated breeding values (EBV) for faecal egg count (FEC) and genetic markers for host resistance to nematodes may be used to identify resistant animals for selective breeding programmes. Similarly, targeted selective treatment (TST) requires the ability to identify the animals that will benefit most from anthelmintic treatment. A mathematical model was used to combine the concepts and evaluate the potential of using genetic-based methods to identify animals for a TST regime. EBVs obtained by genomic prediction were predicted to be the best determinant criterion for TST in terms of the impact on average empty body weight and average FEC, whereas pedigree-based EBVs for FEC were predicted to be marginally worse than using phenotypic FEC as a determinant criterion. Whilst each method has financial implications, if the identification of host resistance is incorporated into wider genomic selection indices or selective breeding programmes, then genetic or genomic information may plausibly be included in TST regimes. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Integration of genomic information into sport horse breeding programs for optimization of accuracy of selection.

    PubMed

    Haberland, A M; König von Borstel, U; Simianer, H; König, S

    2012-09-01

    Reliable selection criteria are required for young riding horses to increase genetic gain by increasing accuracy of selection and decreasing generation intervals. In this study, selection strategies incorporating genomic breeding values (GEBVs) were evaluated. Relevant stages of selection in sport horse breeding programs were analyzed by applying selection index theory. Results in terms of accuracies of indices (r(TI)) and relative selection response indicated that information on single nucleotide polymorphism (SNP) genotypes considerably increases the accuracy of breeding values estimated for young horses without own or progeny performance. In a first scenario, the correlation between the breeding value estimated from the SNP genotype and the true breeding value (= accuracy of GEBV) was fixed to a relatively low value of r(mg) = 0.5. For a low heritability trait (h² = 0.15), and an index for a young horse based only on information from both parents, additional genomic information doubles r(TI) from 0.27 to 0.54. When the conventional information source 'own performance' is included in the aforementioned index, additional SNP information increases r(TI) by 40%. Thus, particularly with regard to traits of low heritability, genomic information can provide a tool for well-founded selection decisions early in life. In a further approach, different sources of breeding values (e.g. GEBV and estimated breeding values (EBVs) from different countries) were combined into an overall index when altering accuracies of EBVs and correlations between traits. In summary, we showed that genomic selection strategies have the potential to contribute to a substantial reduction in generation intervals in horse breeding programs.
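
    The index accuracies quoted in this abstract can be approximated with a small selection-index calculation, r(TI) = sqrt(b'g / var_a) with b = P^-1 g. The covariance structure below is an assumption made for illustration and is not given in the abstract: the parental information is taken to be one own-performance record on each of two unrelated parents, and the candidate's marker score is treated as a trait with variance r(mg)² · var_a that has the same covariance with the true breeding value.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def index_accuracy(P, g, var_a):
    """Accuracy of the index: correlation between index and true breeding value."""
    b = solve(P, g)
    return math.sqrt(sum(bi * gi for bi, gi in zip(b, g)) / var_a)


h2, r_mg = 0.15, 0.5          # values quoted in the abstract
var_p = 1.0                   # phenotypic variance, standardised (assumption)
var_a = h2 * var_p            # additive genetic variance
var_q = r_mg ** 2 * var_a     # variance of the marker score (assumption)

# Index 1: phenotypes of sire and dam only.
P_parents = [[var_p, 0.0], [0.0, var_p]]
g_parents = [0.5 * var_a, 0.5 * var_a]
r_parents = index_accuracy(P_parents, g_parents, var_a)

# Index 2: parental phenotypes plus the candidate's own marker score.
P_genomic = [
    [var_p, 0.0, 0.5 * var_q],
    [0.0, var_p, 0.5 * var_q],
    [0.5 * var_q, 0.5 * var_q, var_q],
]
g_genomic = [0.5 * var_a, 0.5 * var_a, var_q]
r_genomic = index_accuracy(P_genomic, g_genomic, var_a)

print(f"{r_parents:.2f} {r_genomic:.2f}")  # 0.27 0.54
```

    Under these assumptions the sketch gives values close to the 0.27 and 0.54 reported in the abstract, which suggests, but does not confirm, that the study used a comparable set-up.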

  10. Genome research elucidating environmental adaptation: Dark-fly project as a case study.

    PubMed

    Fuse, Naoyuki

    2017-08-01

    Organisms have the capacity to adapt to diverse environments, and environmental adaptation is a substantial driving force of evolution. Recent progress of genome science has addressed the genetic mechanisms underlying environmental adaptation. Whole genome sequencing has identified adaptive genes selected under particular environments. Genome editing technology enables us to directly test the role(s) of a gene in environmental adaptation. Genome science has also shed light on a unique organism, Dark-fly, which has been reared long-term in the dark. We determined the whole genome sequence of Dark-fly and reenacted environmental selections of the Dark-fly genome to identify the genes related to dark-adaptation. Here I will give an overview of current progress in genome science and summarize our study using Dark-fly, as a case study for environmental adaptation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Genome-wide comparative diversity uncovers multiple targets of selection for improvement in hexaploid wheat landrace and cultivars

    USDA-ARS's Scientific Manuscript database

    Domesticated crops have experienced strong human-driven selection aimed at the development of improved varieties adapted to local conditions. To detect regions of the wheat genome subject to selection during improvement, we developed a high-throughput array to interrogate 9,000 gene-associated DNA m...

  12. A Comparison Between Genotyping-by-sequencing and Array-based Scoring of SNPs for Genomic Prediction Accuracy in Winter Wheat

    USDA-ARS's Scientific Manuscript database

    The utilization of DNA molecular markers in plant breeding to maximize selection response via marker assisted selection (MAS) and genomic selection (GS) has the potential to revolutionize plant breeding. A key factor affecting GS applicability is the choice of molecular marker platform. Genotyping-...

  13. No genetic adaptation of the Mediterranean keystone shrub Cistus ladanifer in response to experimental fire and extreme drought.

    PubMed

    Torres, Iván; Parra, Antonio; Moreno, José M; Durka, Walter

    2018-01-01

    In Mediterranean ecosystems, climate change is projected to increase fire danger and summer drought, thus reducing post-fire recruitment of obligate seeder species, and possibly affecting the population genetic structure. We performed a genome-wide genetic marker study, using AFLP markers, on individuals from one Central Spain population of the obligate post-fire seeder Cistus ladanifer L. that established after experimental fire and survived during four subsequent years under simulated drought implemented with a rainout shelter system. We explored the effects of the treatments on marker diversity, spatial genetic structure and presence of outlier loci suggestive of selection. We found no effect of fire or drought on any of the genetic diversity metrics. Analysis of Molecular Variance showed very low genetic differentiation among treatments. Neither fire nor drought altered the small-scale spatial genetic structure of the population. Only one locus was significantly associated with the fire treatment, but inconsistently across outlier detection methods. Neither fire nor drought is likely to affect the genetic makeup of emerging C. ladanifer, despite reduced recruitment caused by drought. The lack of genetic change suggests that reduced recruitment is a random, non-selective process with no genome-wide consequences on this keystone, drought- and fire-tolerant Mediterranean species.

  14. Complete genome sequence of Flavobacterium psychrophilum strain CSF259-93 used to select rainbow trout for increased genetic resistance against bacterial cold water disease

    USDA-ARS's Scientific Manuscript database

    The genome sequence of Flavobacterium psychrophilum strain CSF259-93, isolated from rainbow trout (Oncorhynchus mykiss), consists of a single circular genome of 2,900,735 bp and 2,701 predicted open reading frames (ORFs). Strain CSF259-93 has been used to select a line of rainbow trout with increase...

  15. Convergent genomic signatures of domestication in sheep and goats.

    PubMed

    Alberto, Florian J; Boyer, Frédéric; Orozco-terWengel, Pablo; Streeter, Ian; Servin, Bertrand; de Villemereuil, Pierre; Benjelloun, Badr; Librado, Pablo; Biscarini, Filippo; Colli, Licia; Barbato, Mario; Zamani, Wahid; Alberti, Adriana; Engelen, Stefan; Stella, Alessandra; Joost, Stéphane; Ajmone-Marsan, Paolo; Negrini, Riccardo; Orlando, Ludovic; Rezaei, Hamid Reza; Naderi, Saeid; Clarke, Laura; Flicek, Paul; Wincker, Patrick; Coissac, Eric; Kijas, James; Tosser-Klopp, Gwenola; Chikhi, Abdelkader; Bruford, Michael W; Taberlet, Pierre; Pompanon, François

    2018-03-06

    The evolutionary basis of domestication has been a longstanding question and its genetic architecture is becoming more tractable as more domestic species become genome-enabled. Before becoming established worldwide, sheep and goats were domesticated in the Fertile Crescent 10,500 years before present (YBP) where their wild relatives remain. Here we sequence the genomes of wild Asiatic mouflon and Bezoar ibex in the sheep and goat domestication center and compare their genomes with those of domestics from local, traditional, and improved breeds. Among the genomic regions carrying selective sweeps differentiating domestic breeds from wild populations, which are associated, among others, with genes involved in nervous system, immunity and productivity traits, 20 are common to Capra and Ovis. The patterns of selection vary between species, suggesting that while common targets of selection related to domestication and improvement exist, different solutions have arisen to achieve similar phenotypic end-points within these closely related livestock species.

  16. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery.

    PubMed

    Hickey, John M; Chiurugwi, Tinashe; Mackay, Ian; Powell, Wayne

    2017-08-30

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying approach to deliver innovative 'step changes' in the rate of genetic gain at scale.

  17. Genotype Specification Language.

    PubMed

    Wilson, Erin H; Sagawa, Shiori; Weis, James W; Schubert, Max G; Bissell, Michael; Hawthorne, Brian; Reeves, Christopher D; Dean, Jed; Platt, Darren

    2016-06-17

    We describe here the Genotype Specification Language (GSL), a language that facilitates the rapid design of large and complex DNA constructs used to engineer genomes. The GSL compiler implements a high-level language based on traditional genetic notation, as well as a set of low-level DNA manipulation primitives. The language allows facile incorporation of parts from a library of cloned DNA constructs and from the "natural" library of parts in fully sequenced and annotated genomes. GSL was designed to engage genetic engineers in their native language while providing a framework for higher level abstract tooling. To this end we define four language levels, Level 0 (literal DNA sequence) through Level 3, with increasing abstraction of part selection and construction paths. GSL targets an intermediate language based on DNA slices that translates efficiently into a wide range of final output formats, such as FASTA and GenBank, and includes formats that specify instructions and materials such as oligonucleotide primers to allow the physical construction of the GSL designs by individual strain engineers or an automated DNA assembly core facility.

  18. Mass Spectrometry Based Ultrasensitive DNA Methylation Profiling Using Target Fragmentation Assay.

    PubMed

    Lin, Xiang-Cheng; Zhang, Ting; Liu, Lan; Tang, Hao; Yu, Ru-Qin; Jiang, Jian-Hui

    2016-01-19

    Efficient tools for profiling DNA methylation in specific genes are essential for epigenetics and clinical diagnostics. Current DNA methylation profiling techniques have been limited by inconvenient implementation, requirements of specific reagents, and inferior accuracy in quantifying methylation degree. We develop a novel mass spectrometry method, target fragmentation assay (TFA), which enables methylation profiling in specific sequences. This method combines selective capture of DNA target from restricted cleavage of genomic DNA using magnetic separation with MS detection of the nonenzymatic hydrolysates of target DNA. This method is shown to be highly sensitive with a detection limit as low as 0.056 amol, allowing direct profiling of methylation using genomic DNA without preamplification. Moreover, this method offers a unique advantage in accurately determining DNA methylation level. The clinical applicability was demonstrated by DNA methylation analysis using prostate tissue samples, implying the potential of this method as a useful tool for DNA methylation profiling in early detection of related diseases.

  19. Multivariate Methods for Meta-Analysis of Genetic Association Studies.

    PubMed

    Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G

    2018-01-01

    Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we present in brief univariate methods for meta-analysis and we then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.

  20. Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences.

    PubMed

    Colonna, Vincenza; Ayub, Qasim; Chen, Yuan; Pagani, Luca; Luisi, Pierre; Pybus, Marc; Garrison, Erik; Xue, Yali; Tyler-Smith, Chris; Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2014-06-30

    Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.

  1. Candidate loci involved in domestication and improvement detected by a published 90K wheat SNP array

    PubMed Central

    Gao, Lifeng; Zhao, Guangyao; Huang, Dawei; Jia, Jizeng

    2017-01-01

    Selection is one of the most important forces in crop evolution. Common wheat is a major world food crop and a typical allopolyploid with a huge and complex genome. We applied four approaches to detect loci selected in wheat during domestication and improvement. A total of 7,984 candidate loci were detected, accounting for 23.3% of all 34,317 SNPs analysed, a much higher proportion than estimated in previous reports. We constructed a first generation wheat selection map which revealed the following new insights on genome-wide selection: (1) diversifying selection acted by increasing, decreasing or not affecting gene frequencies; (2) the number of loci under selection during domestication was much higher than that during improvement; (3) the contribution to wheat improvement by the D sub-genome was relatively small due to the bottleneck of hexaploidisation and diversity can be expanded by using synthetic wheat and introgression lines; and (4) clustered selection regions occur throughout the wheat genome, including the centromere regions. This study will not only help future wheat breeding and evolutionary studies, but will also accelerate study of other crops, especially polyploids. PMID:28327671

  2. A selective sweep of >8 Mb on chromosome 26 in the Boxer genome.

    PubMed

    Quilez, Javier; Short, Andrea D; Martínez, Verónica; Kennedy, Lorna J; Ollier, William; Sanchez, Armand; Altet, Laura; Francino, Olga

    2011-07-01

    Modern dog breeds display traits that are either breed-specific or shared by a few breeds as a result of genetic bottlenecks during the breed creation process and artificial selection for breed standards. Selective sweeps in the genome result from strong selection and can be detected as a reduction or elimination of polymorphism in a given region of the genome. Extended regions of homozygosity, indicative of selective sweeps, were identified in a genome-wide scan dataset of 25 Boxers from the United Kingdom genotyped at ~20,000 single-nucleotide polymorphisms (SNPs). These regions were further examined in a second dataset of Boxers collected from a different geographical location and genotyped using higher density SNP arrays (~170,000 SNPs). A selective sweep previously associated with canine brachycephaly was detected on chromosome 1. A novel selective sweep of over 8 Mb was observed on chromosome 26 in Boxer and for a shorter region in English and French bulldogs. It was absent in 171 samples from eight other dog breeds and 7 Iberian wolf samples. A region of extended increased heterozygosity on chromosome 9 overlapped with a previously reported copy number variant (CNV) which was polymorphic in multiple dog breeds. A selective sweep of more than 8 Mb on chromosome 26 was identified in the Boxer genome. This sweep is likely caused by strong artificial selection for a trait of interest and could have inadvertently led to undesired health implications for this breed. Furthermore, we provide supporting evidence for two previously described regions: a selective sweep on chromosome 1 associated with canine brachycephaly and a CNV on chromosome 9 polymorphic in multiple dog breeds.
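
    The abstract above describes detecting selective sweeps as extended regions of homozygosity. A minimal sketch of that idea for a single individual follows; the function name, the genotype encoding (two-letter strings) and the `min_snps` threshold are illustrative assumptions, not the authors' pipeline, which scanned many dogs at once.

```python
from typing import List, Tuple

def homozygous_runs(
    genotypes: List[str], positions: List[int], min_snps: int = 3
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of consecutive
    homozygous SNPs that is at least `min_snps` long (hypothetical cutoff)."""
    runs = []
    start = None  # index of the first SNP in the current run, if any
    for i, g in enumerate(genotypes):
        is_hom = len(g) == 2 and g[0] == g[1]
        if is_hom and start is None:
            start = i
        elif not is_hom and start is not None:
            if i - start >= min_snps:
                runs.append((positions[start], positions[i - 1], i - start))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1], len(genotypes) - start))
    return runs

# Toy data: eight SNPs on one chromosome (made-up positions and calls).
genotypes = ["AA", "AG", "CC", "TT", "GG", "AA", "CT", "GG"]
positions = [100, 200, 300, 400, 500, 600, 700, 800]
runs = homozygous_runs(genotypes, positions)
print(runs)
```

    In a real scan the runs would be intersected across individuals of a breed, since a sweep shows up as a region that is homozygous in nearly all of them.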

  3. Assessing the expected response to genomic selection of individuals and families in Eucalyptus breeding with an additive-dominant model.

    PubMed

    Resende, R T; Resende, M D V; Silva, F F; Azevedo, C F; Takahashi, E K; Silva-Junior, O B; Grattapaglia, D

    2017-10-01

    We report a genomic selection (GS) study of growth and wood quality traits in an outbred F2 hybrid Eucalyptus population (n=768) using high-density single-nucleotide polymorphism (SNP) genotyping. Going beyond previous reports in forest trees, models were developed for different selection targets, namely, families, individuals within families and individuals across the entire population using a genomic model including dominance. To provide a more breeder-intelligible assessment of the performance of GS we calculated the expected response as the percentage gain over the population average expected genetic value (EGV) for different proportions of genomically selected individuals, using a rigorous cross-validation (CV) scheme that removed relatedness between training and validation sets. Predictive abilities (PAs) were 0.40-0.57 for individual selection and 0.56-0.75 for family selection. PAs under an additive+dominance model improved predictions by 5 to 14% for growth depending on the selection target, but no improvement was seen for wood traits. The good performance of GS with no relatedness in CV suggested that our average SNP density (~25 kb) captured some short-range linkage disequilibrium. Truncation GS successfully selected individuals with an average EGV significantly higher than the population average. Response to GS on a per year basis was ~100% more efficient than by phenotypic selection and more so with higher selection intensities. These results contribute further experimental data supporting the positive prospects of GS in forest trees. Because generation times are long, traits are complex and costs of DNA genotyping are plummeting, genomic prediction has good prospects for adoption in tree breeding practice.
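
    The expected response described above (percentage gain in mean EGV of the genomically selected fraction over the population mean) can be sketched as follows. The function name, the variable names and the toy values are assumptions for illustration; the abstract does not give the authors' exact computation.

```python
def percent_gain(gebv, egv, proportion):
    """Rank individuals by genomic prediction (gebv), keep the top
    `proportion`, and report the percentage gain of their mean expected
    genetic value (egv) over the population mean."""
    n_sel = max(1, round(len(egv) * proportion))
    # Indices sorted from highest to lowest genomic prediction.
    order = sorted(range(len(gebv)), key=lambda i: gebv[i], reverse=True)
    selected = order[:n_sel]
    pop_mean = sum(egv) / len(egv)
    sel_mean = sum(egv[i] for i in selected) / n_sel
    return 100.0 * (sel_mean - pop_mean) / pop_mean

# Toy data: eight individuals (made-up predictions and genetic values).
gebv = [0.1, 0.9, -0.3, 0.7, -0.8, 0.2, -0.1, 0.0]
egv = [10.0, 12.0, 8.0, 14.0, 6.0, 11.0, 9.0, 10.0]
gain = percent_gain(gebv, egv, 0.25)
print(f"gain over population mean: {gain:.1f}%")
```

    Repeating the call over a range of proportions gives the gain as a function of selection intensity, which is the breeder-facing summary the abstract refers to.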

  4. Detection of genomic signatures of recent selection in commercial broiler chickens.

    PubMed

    Fu, Weixuan; Lee, William R; Abasht, Behnam

    2016-08-26

    Identification of the genomic signatures of recent selection may help uncover causal polymorphisms controlling traits relevant to recent decades of selective breeding in livestock. In this study, we aimed at detecting signatures of recent selection in commercial broiler chickens using genotype information from single nucleotide polymorphisms (SNPs). A total of 565 chickens from five commercial purebred lines, including three broiler sire (male) lines and two broiler dam (female) lines, were genotyped using the 60K SNP Illumina iSelect chicken array. To detect genomic signatures of recent selection, we applied two methods based on population comparison, cross-population extended haplotype homozygosity (XP-EHH) and cross-population composite likelihood ratio (XP-CLR), and further analyzed the results to find genomic regions under recent selection in multiple purebred lines. A total of 321 candidate selection regions spanning approximately 1.45% of the chicken genome in each line were detected by consensus of results of both XP-EHH and XP-CLR methods. To minimize false discovery due to genetic drift, only 42 of the candidate selection regions that were shared by 2 or more purebred lines were considered as high-confidence selection regions in the study. Of these 42 regions, 20 were 50 kb or less while 4 regions were larger than 0.5 Mb. In total, 91 genes could be found in the 42 regions, among which 19 regions contained only 1 or 2 genes, and 9 regions were located at gene deserts. Our results provide a genome-wide scan of recent selection signatures in five purebred lines of commercial broiler chickens. We found several candidate genes for recent selection in multiple lines, such as SOX6 (Sex Determining Region Y-Box 6) and cTR (Thyroid hormone receptor beta). These genes may have been under recent selection due to their essential roles in growth, development and reproduction in chickens. Furthermore, our results suggest that in some candidate regions, the same or opposite alleles have been under recent selection in multiple lines. Most of the candidate genes in the selection regions are novel, and as such they should be of great interest for future research into the genetic architecture of traits relevant to modern broiler breeding.
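
    The filtering step described above, keeping only candidate regions shared by two or more purebred lines, amounts to an interval-overlap test. The sketch below illustrates it under assumed inputs: the line names, coordinates and the rule that any overlap counts as "shared" are hypothetical, and the upstream XP-EHH/XP-CLR consensus is not modelled.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) regions share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]

def shared_regions(regions_by_line, min_lines=2):
    """Keep (line, region) pairs whose region overlaps a candidate region
    in at least `min_lines` lines, counting the region's own line."""
    kept = []
    for line, regions in regions_by_line.items():
        for region in regions:
            n_lines = sum(
                any(overlaps(region, other) for other in other_regions)
                for other_regions in regions_by_line.values()
            )
            if n_lines >= min_lines:
                kept.append((line, region))
    return kept

# Toy candidate regions per line (made-up names and coordinates).
regions_by_line = {
    "sire_A": [("chr1", 100, 200), ("chr2", 500, 600)],
    "sire_B": [("chr1", 150, 250)],
    "dam_A": [("chr3", 10, 50)],
}
high_confidence = shared_regions(regions_by_line)
print(high_confidence)
```

    Requiring support from more than one line is a guard against drift: a region that looks extreme in only one line is more likely to be a chance event.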

  5. Genomic data as the “hitchhiker's guide” to cattle adaptation: tracking the milestones of past selection in the bovine genome

    PubMed Central

    Utsunomiya, Yuri T.; Pérez O'Brien, Ana M.; Sonstegard, Tad S.; Sölkner, Johann; Garcia, José F.

    2015-01-01

    The bovine species have witnessed and played a major role in the drastic socio-economic changes that shaped our culture over the last 10,000 years. During this journey, cattle “hitchhiked” on human development and colonized the world, facing strong selective pressures such as dramatic environmental changes and disease challenge. Consequently, hundreds of specialized cattle breeds emerged and spread around the globe, making up a rich spectrum of genomic resources. Their DNA still carries the scars left from adapting to this wide range of conditions, and we are now empowered with data and analytical tools to track the milestones of past selection in their genomes. In this review paper, we provide a summary of the reconstructed demographic events that shaped cattle diversity, offer a critical synthesis of popular methodologies applied to the search for signatures of selection (SS) in genomic data, and give examples of recent SS studies in cattle. Then, we outline the potential and challenges of the application of SS analysis in cattle, and discuss the future directions in this field. PMID:25713583

  6. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics

    PubMed Central

    Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.

    2015-01-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  7. Characterizing the genetic differences between two distinct migrant groups from Indo-European and Dravidian speaking populations in India.

    PubMed

    Ali, Mohammad; Liu, Xuanyao; Pillai, Esakimuthu Nisha; Chen, Peng; Khor, Chiea-Chuen; Ong, Rick Twee-Hee; Teo, Yik-Ying

    2014-07-22

    India is home to many ethnically and linguistically diverse populations. It is hypothesized that a history of invasions by people from Persia and Central Asia, who are referred to as Aryans in Hindu Holy Scriptures, had a defining role in shaping the Indian population canvas. A shift in spoken languages from Dravidian languages to Indo-European languages around 1500 B.C. is central to the Aryan Invasion Theory. Here we investigate the genetic differences between two sub-populations of India consisting of: (1) the Indo-European language speaking Gujarati Indians with genome-wide data from the International HapMap Project; and (2) the Dravidian language speaking Tamil Indians with genome-wide data from the Singapore Genome Variation Project. We implemented three population genetics measures to identify genomic regions that are significantly differentiated between the two Indian populations originating from the north and south of India. These measures singled out genomic regions with: (i) SNPs exhibiting significant variation in allele frequencies in the two Indian populations; and (ii) differential signals of positive natural selection as quantified by the integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH). One of the regions that emerged spans the SLC24A5 gene that has been functionally shown to affect skin pigmentation, with a higher degree of genetic sharing between Gujarati Indians and Europeans. Our finding points to gene flow from Europe to north India that provides an explanation for the lighter skin tones present in North Indians in comparison to South Indians.

  8. Invited review: Inbreeding in the genomics era: Inbreeding, inbreeding depression, and management of genomic variability.

    PubMed

    Howard, Jeremy T; Pryce, Jennie E; Baes, Christine; Maltecca, Christian

    2017-08-01

    Traditionally, pedigree-based relationship coefficients have been used to manage the inbreeding and degree of inbreeding depression that exists within a population. The widespread incorporation of genomic information in dairy cattle genetic evaluations allows for the opportunity to develop and implement methods to manage populations at the genomic level. As a result, the realized proportion of the genome that 2 individuals share can be more accurately estimated instead of using pedigree information to estimate the expected proportion of shared alleles. Furthermore, genomic information allows genome-wide relationship or inbreeding estimates to be augmented to characterize relationships for specific regions of the genome. Region-specific stretches can be used to more effectively manage areas of low genetic diversity or areas that, when homozygous, result in reduced performance across economically important traits. The use of region-specific metrics should allow breeders to more precisely manage the trade-off between the genetic value of the progeny and undesirable side effects associated with inbreeding. Methods tailored toward more effectively identifying regions affected by inbreeding and their associated use to manage the genome at the herd level, however, still need to be developed. We have reviewed topics related to inbreeding, measures of relatedness, genetic diversity and methods to manage populations at the genomic level, and we discuss future challenges related to managing populations through implementing genomic methods at the herd and population levels. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  9. Thinking too positive? Revisiting current methods of population genetic selection inference.

    PubMed

    Bank, Claudia; Ewing, Gregory B; Ferrer-Admettla, Anna; Foll, Matthieu; Jensen, Jeffrey D

    2014-12-01

    In the age of next-generation sequencing, the availability of increasing amounts and improved quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. However, alternative forces, such as demography and background selection (BGS), obscure the footprints of positive selection that we would like to identify. In this review, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (i) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (ii) that genomic information from multiple time points will enhance the power of inference, and (iii) that results from experimental evolution should be utilized to better inform population genomic studies. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. Public attitudes to the promotion of genomic crop studies in Japan: correlations between genomic literacy, trust, and favourable attitude.

    PubMed

    Ishiyama, Izumi; Tanzawa, Tetsuro; Watanabe, Maiko; Maeda, Tadahiko; Muto, Kaori; Tamakoshi, Akiko; Nagai, Akiko; Yamagata, Zentaro

    2012-05-01

    This study aimed to assess public attitudes in Japan to the promotion of genomic selection in crop studies and to examine associated factors. We analysed data from a nationwide opinion survey. A total of 4,000 people were selected from the Japanese general population by a stratified two-phase sampling method, and 2,171 people participated by post; this survey asked about the pros and cons of crop-related genomic studies promotion, examined people's scientific literacy in genomics, and investigated factors thought to be related to genomic literacy and attitude. The relationships were examined using logistic regression models stratified by gender. Survey results showed that 50.0% of respondents approved of the promotion of crop-related genomic studies, while 6.7% disapproved. No correlation was found between literacy and attitude towards promotion. Trust in experts, belief in science, an interest in genomic studies and willingness to purchase new products correlated with a positive attitude towards crop-related genomic studies.

  11. Multilocus approaches for the measurement of selection on correlated genetic loci.

    PubMed

    Gompert, Zachariah; Egan, Scott P; Barrett, Rowan D H; Feder, Jeffrey L; Nosil, Patrik

    2017-01-01

    The study of ecological speciation is inherently linked to the study of selection. Methods for estimating phenotypic selection within a generation based on associations between trait values and fitness (e.g. survival) of individuals are established. These methods attempt to disentangle selection acting directly on a trait from indirect selection caused by correlations with other traits via multivariate statistical approaches (i.e. inference of selection gradients). The estimation of selection on genotypic or genomic variation could also benefit from disentangling direct and indirect selection on genetic loci. However, achieving this goal is difficult with genomic data because the number of potentially correlated genetic loci (p) is very large relative to the number of individuals sampled (n). In other words, the number of model parameters exceeds the number of observations (p ≫ n). We present simulations examining the utility of whole-genome regression approaches (i.e. Bayesian sparse linear mixed models) for quantifying direct selection in cases where p ≫ n. Such models have been used for genome-wide association mapping and are common in artificial breeding. Our results show they hold promise for studies of natural selection in the wild and thus of ecological speciation. But we also demonstrate important limitations to the approach and discuss study designs required for more robust inferences. © 2016 John Wiley & Sons Ltd.

  12. Design and implementation of a CORBA-based genome mapping system prototype.

    PubMed

    Hu, J; Mungall, C; Nicholson, D; Archibald, A L

    1998-01-01

    CORBA (Common Object Request Broker Architecture), as an open standard, is considered to be a good solution for the development and deployment of applications in distributed heterogeneous environments. This technology can be applied in the bioinformatics area to enhance utilization, management and interoperation between biological resources. This paper investigates issues in developing CORBA applications for genome mapping information systems in the Internet environment with emphasis on database connectivity and graphical user interfaces. The design and implementation of a CORBA prototype for an animal genome mapping database are described. The prototype demonstration is available via: http://www.ri.bbsrc.ac.uk/ark_corba/. jian.hu@bbsrc.ac.uk

  13. Sparse partial least squares regression for simultaneous dimension reduction and variable selection

    PubMed Central

    Chun, Hyonho; Keleş, Sündüz

    2010-01-01

    Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611
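
    The abstract above says sparse partial least squares produces sparse linear combinations of the original predictors but does not spell out the algorithm. The sketch below shows one way such sparsity can be imposed for a univariate response, assuming the first direction vector is obtained by soft-thresholding X^T y at a fraction `eta` of its largest absolute entry; the function names and the parameter `eta` are illustrative assumptions rather than the authors' implementation.

```python
import math

def soft(v, cutoff):
    """Soft-threshold a scalar: shrink towards zero by `cutoff`."""
    if abs(v) <= cutoff:
        return 0.0
    return v - cutoff if v > 0 else v + cutoff

def sparse_direction(X, y, eta):
    """First direction vector: soft-threshold z = X^T y at eta * max|z|,
    then rescale to unit length (all zeros if nothing survives)."""
    n, p = len(X), len(X[0])
    z = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    cutoff = eta * max(abs(v) for v in z)
    w = [soft(v, cutoff) for v in z]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w] if norm > 0 else w

# Toy data: the response tracks the first predictor only (made-up values).
X = [
    [1.0, 0.0, 0.1],
    [-1.0, 0.2, 0.0],
    [1.0, -0.1, 0.1],
    [-1.0, 0.0, -0.1],
]
y = [2.0, -2.0, 2.0, -2.0]
w = sparse_direction(X, y, eta=0.5)
print(w)
```

    With `eta = 0` no entries are zeroed and the direction is the ordinary first PLS weight vector; larger values of `eta` drop weakly associated predictors, which is how variable selection and dimension reduction happen in one step.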

  14. Neuropsychology 3.0: Evidence-Based Science and Practice

    PubMed Central

    Bilder, Robert M.

    2011-01-01

    Neuropsychology is poised for transformations of its concepts and methods, leveraging advances in neuroimaging, the human genome project, psychometric theory, and information technologies. It is argued that a paradigm shift towards evidence-based science and practice can be enabled by innovations, including: (1) formal definition of neuropsychological concepts and tasks in cognitive ontologies; (2) creation of collaborative neuropsychological knowledgebases; and (3) design of web-based assessment methods that permit free development, large-sample implementation, and dynamic refinement of neuropsychological tests and the constructs these aim to assess. This article considers these opportunities, highlights selected obstacles, and offers suggestions for stepwise progress towards these goals. PMID:21092355

  15. Military genomics: a perspective on the successes and challenges of genomic medicine in the Armed Services.

    PubMed

    De Castro, Mauricio J; Turner, Clesson E

    2017-11-01

    We describe the impact genomics has on the health and readiness of the military service member, highlight several examples of the current and future plans for genomic medicine within the military, discuss challenges to implementation and provide recommendations to address some of those challenges. Published 2017. This article is a U.S. Government work and is in the public domain in the USA. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

  16. Increased Genomic Prediction Accuracy in Wheat Breeding Through Spatial Adjustment of Field Trial Data

    PubMed Central

    Lado, Bettina; Matus, Ivan; Rodríguez, Alejandra; Inostroza, Luis; Poland, Jesse; Belzile, François; del Pozo, Alejandro; Quincke, Martín; Castro, Marina; von Zitzewitz, Jarislav

    2013-01-01

    In crop breeding, interest in predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed model using moving means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models. PMID:24082033
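
    The moving-means covariate mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the window size or the treatment of field edges, so the `radius` parameter, the exclusion of the focal plot, and the function name `moving_mean_covariate` are all assumptions made for illustration, not the authors' implementation.

```python
def moving_mean_covariate(grid, radius=1):
    """Return, for every plot in a rectangular field grid, the mean phenotype
    of its neighbouring plots within `radius` rows/columns.

    The focal plot itself is excluded (an assumption), so the covariate
    reflects the local field trend rather than the plot's own value.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    cov = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [
                grid[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (i, j) != (r, c)
            ]
            # A 1x1 grid has no neighbours; fall back to the plot's own value.
            cov[r][c] = sum(neighbours) / len(neighbours) if neighbours else grid[r][c]
    return cov


# Hypothetical 3x3 block of plot yields (arbitrary units).
field = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
covariate = moving_mean_covariate(field)
```

    In a spatial adjustment of this kind, the resulting covariate would then enter the mixed model as a fixed regression term alongside the genotype effects.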

  17. Fundamental differences in diversity and genomic population structure between Atlantic and Pacific Prochlorococcus.

    PubMed

    Kashtan, Nadav; Roggensack, Sara E; Berta-Thompson, Jessie W; Grinberg, Maor; Stepanauskas, Ramunas; Chisholm, Sallie W

    2017-09-01

    The Atlantic and Pacific Oceans represent different biogeochemical regimes in which the abundant marine cyanobacterium Prochlorococcus thrives. We have shown that Prochlorococcus populations in the Atlantic are composed of hundreds of genomically, and likely ecologically, distinct coexisting subpopulations with distinct genomic backbones. Here we ask if differences in the ecology and selection pressures between the Atlantic and Pacific are reflected in the diversity and genomic composition of their indigenous Prochlorococcus populations. We applied large-scale single-cell genomics and compared the cell-by-cell genomic composition of wild populations of co-occurring cells from samples from Station ALOHA off Hawaii, and from Bermuda Atlantic Time Series Station off Bermuda. We reveal fundamental differences in diversity and genomic structure of populations between the sites. The Pacific populations are more diverse than those in the Atlantic, composed of significantly more coexisting subpopulations and lacking dominant subpopulations. Prochlorococcus from the two sites seem to be composed of mostly non-overlapping distinct sets of subpopulations with different genomic backbones-likely reflecting different sets of ocean-specific micro-niches. Furthermore, phylogenetically closely related strains carry ocean-associated nutrient acquisition genes likely reflecting differences in major selection pressures between the oceans. This differential selection, along with geographic separation, clearly has a significant role in shaping these populations.

  18. Molecular karyotyping by array CGH in a Russian cohort of children with intellectual disability, autism, epilepsy and congenital anomalies

    PubMed Central

    2012-01-01

    Background Array comparative genomic hybridization (CGH) has been repeatedly shown to be a successful tool for the identification of genomic variations in a clinical population. During the last decade, the implementation of array CGH has resulted in the identification of new causative submicroscopic chromosome imbalances and copy number variations (CNVs) in neuropsychiatric (neurobehavioral) diseases. Currently, array-CGH-based technologies have become an integral part of molecular diagnosis and research in individuals with neuropsychiatric disorders and children with intellectual disability (mental retardation) and congenital anomalies. Here, we introduce the Russian cohort of children with intellectual disability, autism, epilepsy and congenital anomalies analyzed by BAC array CGH and a novel bioinformatic strategy. Results Among 54 individuals highly selected according to clinical criteria and molecular and cytogenetic data (from 2426 patients evaluated cytogenetically and molecularly between November 2007 and May 2012), chromosomal imbalances were detected in 26 individuals (48%). In two patients (4%), a previously undescribed condition was observed. The latter has been designated as meiotic (constitutional) genomic instability resulting in multiple submicroscopic rearrangements (including CNVs). Using a bioinformatic strategy, we were able to identify clinically relevant CNVs in 15 individuals (28%). Selected cases were confirmed by molecular cytogenetic and molecular genetic methods. Eight out of 26 chromosomal imbalances (31%) have not been previously reported. Among them, three cases involved the co-occurrence of subtle chromosome 9 and 21 deletions. Conclusions We conducted an array CGH study of Russian patients suffering from intellectual disability, autism, epilepsy and congenital anomalies. In total, phenotypic manifestations of clinically relevant genomic variations were found to result from genomic rearrangements affecting 1247 disease-causing and pathway-involved genes. Obviously, only a much smaller subset of them are true candidates for intellectual disability, autism or epilepsy. The success of our preliminary array CGH and bioinformatic study allows us to expand the cohort. According to the available literature, this is the first comprehensive array CGH evaluation of a Russian cohort of children with neuropsychiatric disorders and congenital anomalies. PMID:23272938

  19. Cloud-based interactive analytics for terabytes of genomic variants data

    PubMed Central

    Pan, Cuiping; McInnes, Gregory; Deflaux, Nicole; Snyder, Michael; Bingham, Jonathan; Datta, Somalee; Tsao, Philip S

    2017-01-01

    Abstract Motivation Large scale genomic sequencing is now widely used to decipher questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. With the quantity and diversity these data harbor, a robust and scalable data handling and analysis solution is desired. Results We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate such Big Data computing paradigms can provide orders of magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information. Availability and implementation Our analysis framework is implemented in Google Cloud Platform and BigQuery. Codes are available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs. Contact cuiping@stanford.edu or ptsao@stanford.edu Supplementary information Supplementary data are available at Bioinformatics online. PMID:28961771

  20. Population genomics of the inbred Scandinavian wolf.

    PubMed

    Hagenblad, Jenny; Olsson, Maria; Parker, Heidi G; Ostrander, Elaine A; Ellegren, Hans

    2009-04-01

    The Scandinavian wolf population represents one of the genetically most well-characterized examples of a severely bottlenecked natural population (with only two founders), and of how the addition of new genetic material (one immigrant) can at least temporarily provide a 'genetic rescue'. However, inbreeding depression has been observed in this population and in the absence of additional immigrants, its long-term viability is questioned. To study the effects of inbreeding and selection on genomic diversity, we performed a genomic scan with approximately 250 microsatellite markers distributed across all autosomes and the X chromosome. We found linkage disequilibrium (LD) that extended up to distances of 50 Mb, exceeding that of most outbreeding species studied thus far. LD was particularly pronounced on the X chromosome. Overall levels of observed genomic heterozygosity did not deviate significantly from simulations based on known population history, giving no support for a general selection for heterozygotes. However, we found evidence supporting balancing selection at a number of loci and also evidence suggesting directional selection at other loci. For markers on chromosome 23, the signal of selection was particularly strong, indicating that purifying selection against deleterious alleles may have occurred even in this very small population. These data suggest that population genomics allows the exploration of the effects of neutral and non-neutral evolution on a finer scale than what has previously been possible.

  1. Mating programs including genomic relationships and dominance effects

    USDA-ARS's Scientific Manuscript database

    Breed associations, artificial-insemination organizations, and on-farm software providers need new computerized mating programs for genomic selection so that genomic inbreeding could be minimized by comparing genotypes of potential mates. Efficient methods for transferring elements of the genomic re...

  2. Standards of Practice: Applying Genetics and Genomics Resources to Oncology.

    PubMed

    Kerber, Alice S; Ledbetter, Nancy J

    2017-04-01

    Knowledge about genetics and genomics and its application to oncology care is rapidly expanding and evolving. As a result, oncology nurses at all levels must develop and maintain their knowledge of genetics and genomics, as well as be aware of resources to guide practice. This article focuses on implementation of the standards described in the updated Genetics/Genomics Nursing: Scope and Standards of Practice by the basic practitioner.

  3. Scalable and cost-effective NGS genotyping in the cloud.

    PubMed

    Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P

    2015-10-15

    While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the 10's of dollars. We take a step towards addressing this challenge, by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations to produce clinical turn-around of whole genome analysis optimization and workflow management including strategic batching of individual genomes and efficient cluster resource configuration.

  4. Environmental Adaptation Contributes to Gene Polymorphism across the Arabidopsis thaliana Genome

    PubMed Central

    Lee, Cheng-Ruei

    2012-01-01

    The level of within-species polymorphism differs greatly among genes in a genome. Many genomic studies have investigated the relationship between gene polymorphism and factors such as recombination rate or expression pattern. However, the polymorphism of a gene is affected not only by its physical properties or functional constraints but also by natural selection on organisms in their environments. Specifically, if functionally divergent alleles enable adaptation to different environments, locus-specific polymorphism may be maintained by spatially heterogeneous natural selection. To test this hypothesis and estimate the extent to which environmental selection shapes the pattern of genome-wide polymorphism, we define the "environmental relevance" of a gene as the proportion of genetic variation explained by environmental factors, after controlling for population structure. We found substantial effects of environmental relevance on patterns of polymorphism among genes. In addition, the correlation between environmental relevance and gene polymorphism is positive, consistent with the expectation that balancing selection among heterogeneous environments maintains genetic variation at ecologically important genes. Comparison of the gene ontology annotations shows that genes with high environmental relevance are enriched in unknown function categories. These results suggest an important role for environmental factors in shaping genome-wide patterns of polymorphism and indicate another direction of genomic study. PMID:22798389

  5. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia

    PubMed Central

    Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya

    2013-01-01

    Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative for MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected at harvest time, black spot resistance and the number of spurs and two of the associations were closely linked to known loci. Genome-wide predictions for GS were accurate at the highest level (0.75) in harvest time, at medium levels (0.38–0.61) in resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs and at low levels (<0.2) in all soluble solid content and vigor of tree. Results suggest the potential of GWAS and GS for use in future breeding programs in Japanese pear. PMID:23641189

  6. A general heuristic for genome rearrangement problems.

    PubMed

    Dias, Ulisses; Galvão, Gustavo Rodrigues; Lintzmayer, Carla Négri; Dias, Zanoni

    2014-06-01

    In this paper, we present a general heuristic for several problems in the genome rearrangement field. Our heuristic does not solve any problem directly, it is rather used to improve the solutions provided by any non-optimal algorithm that solve them. Therefore, we have implemented several algorithms described in the literature and several algorithms developed by ourselves. As a whole, we implemented 23 algorithms for 9 well known problems in the genome rearrangement field. A total of 13 algorithms were implemented for problems that use the notions of prefix and suffix operations. In addition, we worked on 5 algorithms for the classic problem of sorting by transposition and we conclude the experiments by presenting results for 3 approximation algorithms for the sorting by reversals and transpositions problem and 2 approximation algorithms for the sorting by reversals problem. Another algorithm with better approximation ratio can be found for the last genome rearrangement problem, but it is purely theoretical with no practical implementation. The algorithms we implemented in addition to our heuristic lead to the best practical results in each case. In particular, we were able to improve results on the sorting by transpositions problem, which is a very special case because many efforts have been made to generate algorithms with good results in practice and some of these algorithms provide results that equal the optimum solutions in many cases. Our source codes and benchmarks are freely available upon request from the authors so that it will be easier to compare new approaches against our results.

  7. Identification of Medically Actionable Secondary Findings in the 1000 Genomes

    PubMed Central

    Olfson, Emily; Cottrell, Catherine E.; Davidson, Nicholas O.; Gurnett, Christina A.; Heusel, Jonathan W.; Stitziel, Nathan O.; Chen, Li-Shiun; Hartz, Sarah; Nagarajan, Rakesh; Saccone, Nancy L.; Bierut, Laura J.

    2015-01-01

    The American College of Medical Genetics and Genomics (ACMG) recommends that clinical sequencing laboratories return secondary findings in 56 genes associated with medically actionable conditions. Our goal was to apply a systematic, stringent approach consistent with clinical standards to estimate the prevalence of pathogenic variants associated with such conditions using a diverse sequencing reference sample. Candidate variants in the 56 ACMG genes were selected from Phase 1 of the 1000 Genomes dataset, which contains sequencing information on 1,092 unrelated individuals from across the world. These variants were filtered using the Human Gene Mutation Database (HGMD) Professional version and defined parameters, appraised through literature review, and examined by a clinical laboratory specialist and expert physician. Over 70,000 genetic variants were extracted from the 56 genes, and filtering identified 237 variants annotated as disease causing by HGMD Professional. Literature review and expert evaluation determined that 7 of these variants were pathogenic or likely pathogenic. Furthermore, 5 additional truncating variants not listed as disease causing in HGMD Professional were identified as likely pathogenic. These 12 secondary findings are associated with diseases that could inform medical follow-up, including cancer predisposition syndromes, cardiac conditions, and familial hypercholesterolemia. The majority of the identified medically actionable findings were in individuals from the European (5/379) and Americas (4/181) ancestry groups, with fewer findings in Asian (2/286) and African (1/246) ancestry groups. Our results suggest that medically relevant secondary findings can be identified in approximately 1% (12/1092) of individuals in a diverse reference sample. As clinical sequencing laboratories continue to implement the ACMG recommendations, our results highlight that at least a small number of potentially important secondary findings can be selected for return. Our results also confirm that understudied populations will not reap proportionate benefits of genomic medicine, highlighting the need for continued research efforts on genetic diseases in these populations. PMID:26332594
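
    The per-group counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above. The variable and function names are illustrative, and no confidence intervals are computed because the abstract reports none.

```python
# (findings, individuals) per ancestry group, as reported in the abstract.
counts = {
    "European": (5, 379),
    "Americas": (4, 181),
    "Asian": (2, 286),
    "African": (1, 246),
}


def prevalence(findings, n_individuals):
    """Proportion of individuals carrying a reportable secondary finding."""
    return findings / n_individuals


by_group = {group: prevalence(f, n) for group, (f, n) in counts.items()}
total_findings = sum(f for f, _ in counts.values())
total_n = sum(n for _, n in counts.values())
overall = prevalence(total_findings, total_n)

for group, value in by_group.items():
    print(f"{group}: {100 * value:.2f}%")
print(f"Overall: {total_findings}/{total_n} = {100 * overall:.2f}%")
```

    The group denominators sum to the 1,092 individuals of the Phase 1 sample and the findings sum to 12, which matches the "approximately 1%" figure stated in the abstract.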

  8. Scaling up the 454 Titanium Library Construction and Pooling of Barcoded Libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Phung, Wilson; Hack, Christopher; Shapiro, Harris

    2009-03-23

    We have been developing a high throughput 454 library construction process at the Joint Genome Institute to meet the needs of de novo sequencing a large number of microbial and eukaryote genomes, EST, and metagenome projects. We have been focusing efforts in three areas: (1) modifying the current process to allow the construction of 454 standard libraries on a 96-well format; (2) developing a robotic platform to perform the 454 library construction; and (3) designing molecular barcodes to allow pooling and sorting of many different samples. In the development of a high throughput process to scale up the number of libraries by adapting the process to a 96-well plate format, the key process change involves the replacement of gel electrophoresis for size selection with Solid Phase Reversible Immobilization (SPRI) beads. Although the standard deviation of the insert sizes increases, the overall quality sequence and distribution of the reads in the genome has not changed. The manual process of constructing 454 shotgun libraries on 96-well plates is a time-consuming, labor-intensive, and ergonomically hazardous process; we have been experimenting to program a BioMek robot to perform the library construction. This will not only enable library construction to be completed in a single day, but will also minimize any ergonomic risk. In addition, we have implemented a set of molecular barcodes (AKA Multiple Identifiers or MID) and a pooling process that allows us to sequence many targets simultaneously. Here we will present the testing of pooling a set of selected fosmids derived from the endomycorrhizal fungus Glomus intraradices. By combining the robotic library construction process and the use of molecular barcodes, it is now possible to sequence hundreds of fosmids that represent a minimal tiling path of this genome. Here we present the progress and the challenges of developing these scaled-up processes.
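
    The pooling and sorting of barcoded samples described in this abstract relies on assigning each read back to its source library. A minimal sketch of that sorting step follows. The barcode sequences, sample names, and the exact-prefix matching rule are hypothetical choices for illustration. The abstract does not describe the actual MID sequences or any mismatch tolerance.

```python
def demultiplex(reads, barcodes):
    """Sort pooled reads into per-sample bins by exact barcode prefix.

    `barcodes` maps a sample name to its barcode sequence. Reads that carry
    no known barcode go into the "unassigned" bin. The barcode is trimmed
    from every assigned read.
    """
    pools = {sample: [] for sample in barcodes}
    pools["unassigned"] = []
    for read in reads:
        for sample, tag in barcodes.items():
            if read.startswith(tag):
                pools[sample].append(read[len(tag):])
                break
        else:
            # No barcode matched this read.
            pools["unassigned"].append(read)
    return pools


# Hypothetical 4-bp barcodes; real identifiers are typically longer.
example_barcodes = {"fosmid_A": "ACGT", "fosmid_B": "TGCA"}
example_reads = ["ACGTGGGA", "TGCATTTC", "CCCCAAAA"]
pools = demultiplex(example_reads, example_barcodes)
```

    Exact matching keeps the sketch short. A production pipeline would usually tolerate a limited number of sequencing errors in the barcode.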

  9. Peptide biomarkers used for the selective breeding of a complex polygenic trait in honey bees.

    PubMed

    Guarna, M Marta; Hoover, Shelley E; Huxter, Elizabeth; Higo, Heather; Moon, Kyung-Mee; Domanski, Dominik; Bixby, Miriam E F; Melathopoulos, Andony P; Ibrahim, Abdullah; Peirson, Michael; Desai, Suresh; Micholson, Derek; White, Rick; Borchers, Christoph H; Currie, Robert W; Pernal, Stephen F; Foster, Leonard J

    2017-08-21

    We present a novel way to select for highly polygenic traits. For millennia, humans have used observable phenotypes to selectively breed stronger or more productive livestock and crops. Selection on genotype, using single-nucleotide polymorphisms (SNPs) and genome profiling, is also now applied broadly in livestock breeding programs; however, selection on protein/peptide or mRNA expression markers has not yet been proven useful. Here we demonstrate the utility of protein markers to select for disease-resistant hygienic behavior in the European honey bee (Apis mellifera L.). Robust, mechanistically-linked protein expression markers, by integrating cis- and trans- effects from many genomic loci, may overcome limitations of genomic markers to allow for selection. After three generations of selection, the resulting marker-selected stock outperformed an unselected benchmark stock in terms of hygienic behavior, and had improved survival when challenged with a bacterial disease or a parasitic mite, similar to bees selected using a phenotype-based assessment for this trait. This is the first demonstration of the efficacy of protein markers for industrial selective breeding in any agricultural species, plant or animal.

  10. Genome re-sequencing reveals the history of apple and supports a two-stage model for fruit enlargement

    USDA-ARS's Scientific Manuscript database

    Human selection has reshaped crop genomes. Here we report an apple genome variation map generated through genome sequencing of 117 diverse accessions. A comprehensive model of apple speciation and domestication along the Silk Road was proposed based on evidence from diverse genomic analyses. Cultiva...

  11. A genome-wide scan for signatures of selection in Chinese indigenous and commercial pig breeds.

    PubMed

    Yang, Songbai; Li, Xiuling; Li, Kui; Fan, Bin; Tang, Zhonglin

    2014-01-15

    Modern breeding and artificial selection play critical roles in pig domestication and shape the genetic variation of different breeds. China has many indigenous pig breeds with various characteristics in morphology and production performance that differ from those of foreign commercial pig breeds. However, the signatures of selection on genes implicated in economic traits between Chinese indigenous and commercial pigs remain poorly understood. We identified footprints of positive selection at the whole genome level, comprising 44,652 SNPs genotyped in six Chinese indigenous pig breeds, one developed breed and two commercial breeds. An empirical genome-wide distribution of Fst (F-statistics) was constructed based on estimations of Fst for each SNP across these nine breeds. We detected selection at the genome level using the High-Fst outlier method and found that 81 candidate genes show strong evidence of positive selection. Furthermore, the results of network analyses showed that the genes that displayed evidence of positive selection were mainly involved in the development of tissues and organs, and the immune response. In addition, we calculated the pairwise Fst between Chinese indigenous and commercial breeds (CHN VS EURO) and between Northern and Southern Chinese indigenous breeds (Northern VS Southern). The IGF1R and ESR1 genes showed evidence of positive selection in the CHN VS EURO and Northern VS Southern groups, respectively. In this study, we first identified the genomic regions that showed evidence of selection between Chinese indigenous and commercial pig breeds using the High-Fst outlier method. These regions were found to be involved in the development of tissues and organs, the immune response, growth and litter size. The results of this study provide new insights into understanding the genetic variation and domestication in pigs.
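
    The per-SNP Fst and outlier scan described in this abstract can be sketched as follows. The abstract does not say which Fst estimator or which cutoff was used, so this example assumes Wright's heterozygosity-based definition with equally weighted populations and a hypothetical `top_fraction` threshold. The function names are illustrative.

```python
def fst_per_snp(freqs):
    """Wright's Fst for one biallelic SNP from per-population allele
    frequencies, weighting populations equally: (H_T - H_S) / H_T."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # SNP is monomorphic across all populations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population
    return (h_t - h_s) / h_t


def high_fst_outliers(fst_values, top_fraction=0.01):
    """Indices of SNPs in the upper tail of the empirical Fst distribution."""
    n_keep = max(1, int(len(fst_values) * top_fraction))
    ranked = sorted(range(len(fst_values)), key=lambda i: fst_values[i], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical allele frequencies for four SNPs in two breeds.
snp_freqs = [[0.5, 0.5], [0.0, 1.0], [0.2, 0.8], [0.4, 0.6]]
fst_values = [fst_per_snp(f) for f in snp_freqs]
outliers = high_fst_outliers(fst_values, top_fraction=0.25)
```

    Genes overlapping or flanking the outlier SNPs would then be taken forward as candidates for positive selection.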

  12. A genome-wide scan for signatures of selection in Chinese indigenous and commercial pig breeds

    PubMed Central

    2014-01-01

    Background Modern breeding and artificial selection play critical roles in pig domestication and shape the genetic variation of different breeds. China has many indigenous pig breeds with various characteristics in morphology and production performance that differ from those of foreign commercial pig breeds. However, the signatures of selection on genes implicated in economic traits between Chinese indigenous and commercial pigs remain poorly understood. Results We identified footprints of positive selection at the whole genome level, comprising 44,652 SNPs genotyped in six Chinese indigenous pig breeds, one developed breed and two commercial breeds. An empirical genome-wide distribution of Fst (F-statistics) was constructed based on estimations of Fst for each SNP across these nine breeds. We detected selection at the genome level using the High-Fst outlier method and found that 81 candidate genes show strong evidence of positive selection. Furthermore, the results of network analyses showed that the genes that displayed evidence of positive selection were mainly involved in the development of tissues and organs, and the immune response. In addition, we calculated the pairwise Fst between Chinese indigenous and commercial breeds (CHN VS EURO) and between Northern and Southern Chinese indigenous breeds (Northern VS Southern). The IGF1R and ESR1 genes showed evidence of positive selection in the CHN VS EURO and Northern VS Southern groups, respectively. Conclusions In this study, we first identified the genomic regions that showed evidence of selection between Chinese indigenous and commercial pig breeds using the High-Fst outlier method. These regions were found to be involved in the development of tissues and organs, the immune response, growth and litter size. The results of this study provide new insights into understanding the genetic variation and domestication in pigs. PMID:24422716

  13. Human pigmentation genes under environmental selection

    PubMed Central

    2012-01-01

    Genome-wide association studies and comparative genomics have established major loci and specific polymorphisms affecting human skin, hair and eye color. Environmental changes have had an impact on selected pigmentation genes as populations have expanded into different regions of the globe. PMID:23110848

  14. Overview of genomic selection in dairy cattle populations

    USDA-ARS's Scientific Manuscript database

    Genomic selection is most successful for traits recorded over many years in large populations. Holstein breeders have reference populations >10,000 proven bulls via cooperation among major countries, and countries with smaller Holstein populations can contribute additional bulls. Scandinavian red da...

  15. Primer in Genetics and Genomics, Article 2-Advancing Nursing Research With Genomic Approaches.

    PubMed

    Lee, Hyunhwa; Gill, Jessica; Barr, Taura; Yun, Sijung; Kim, Hyungsuk

    2017-03-01

    Nurses investigate reasons for variable patient symptoms and responses to treatments to inform how best to improve outcomes. Genomics has the potential to guide nursing research exploring contributions to individual variability. This article is meant to serve as an introduction to the novel methods available through genomics for addressing this critical issue and includes a review of methodological considerations for selected genomic approaches. This review presents essential concepts in genetics and genomics that will allow readers to identify upcoming trends in genomics nursing research and improve research practice. It introduces general principles of genomic research and provides an overview of the research process. It also highlights selected nursing studies that serve as clinical examples of the use of genomic technologies. Finally, the authors provide suggestions about how to apply genomic technology in nursing research along with directions for future research. Using genomic approaches in nursing research can advance the understanding of the complex pathophysiology of disease susceptibility and different patient responses to interventions. Nurses should be incorporating genomics into education, clinical practice, and research as the influence of genomics in health-care research and practice continues to grow. Nurses are also well placed to translate genomic discoveries into improved methods for patient assessment and intervention.

  16. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing

    PubMed Central

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-01-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat’s selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome. PMID:27989103
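
    The nucleotide diversity comparison reported in this abstract rests on a simple statistic: the average number of pairwise differences per site among sampled sequences. The sketch below is a minimal illustration on short aligned strings. The example sequences and the function name are hypothetical, and genome-scale analyses normally compute the statistic in sliding windows from variant calls rather than from full alignments.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned,
    equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


# Hypothetical aligned haplotypes from two populations.
pi_diverse = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
pi_uniform = nucleotide_diversity(["AAAA", "AAAA", "AAAA"])
```

    A lower value in one population than in another, as reported here for the native Korean goats relative to the crossbred goats, indicates less sequence variation among its sampled individuals.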

  17. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing.

    PubMed

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-12-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat's selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome.

  18. Positive selection on sociobiological traits in invasive fire ants.

    PubMed

    Privman, Eyal; Cohen, Pnina; Cohanim, Amir B; Riba-Grognuz, Oksana; Shoemaker, DeWayne; Keller, Laurent

    2018-06-19

    The fire ant Solenopsis invicta and its close relatives are highly invasive. Enhanced social cooperation may facilitate invasiveness in these and other invasive ant species. We investigated whether invasiveness in Solenopsis fire ants was accompanied by positive selection on sociobiological traits by applying a phylogenomics approach to infer ancient selection, and a population genomics approach to infer recent and ongoing selection in both native and introduced S. invicta populations. A combination of whole-genome sequencing of 40 haploid males and reduced-representation genomic sequencing of 112 diploid workers identified 1,758,116 and 169,682 polymorphic markers, respectively. The resulting high-resolution maps of genomic polymorphism provide high inference power to test for positive selection. Our analyses provide evidence of positive selection on putative ion channel genes, which are implicated in neurological functions, and on vitellogenin, which is a key regulator of development and caste determination. Furthermore, molecular functions implicated in pheromonal signaling have experienced recent positive selection. Genes with signatures of positive selection were significantly more often those over-expressed in workers compared with queens and males, suggesting that worker traits are under stronger selection than queen and male traits. These results provide insights into selection pressures and ongoing adaptation in an invasive social insect and support the hypothesis that sociobiological traits are under more positive selection than traits related to non-social traits in such invasive species. This article is protected by copyright. All rights reserved.

  19. PRESAGE: PRivacy-preserving gEnetic testing via SoftwAre Guard Extension.

    PubMed

    Chen, Feng; Wang, Chenghong; Dai, Wenrui; Jiang, Xiaoqian; Mohammed, Noman; Al Aziz, Md Momin; Sadat, Md Nazmus; Sahinalp, Cenk; Lauter, Kristin; Wang, Shuang

    2017-07-26

    Advances in DNA sequencing technologies have prompted a wide range of genomic applications to improve healthcare and facilitate biomedical research. However, privacy and security concerns have emerged as a challenge for utilizing cloud computing to handle sensitive genomic data. We present one of the first implementations of Software Guard Extension (SGX) based securely outsourced genetic testing framework, which leverages multiple cryptographic protocols and minimal perfect hash scheme to enable efficient and secure data storage and computation outsourcing. We compared the performance of the proposed PRESAGE framework with the state-of-the-art homomorphic encryption scheme, as well as the plaintext implementation. The experimental results demonstrated significant performance over the homomorphic encryption methods and a small computational overhead in comparison to plaintext implementation. The proposed PRESAGE provides an alternative solution for secure and efficient genomic data outsourcing in an untrusted cloud by using a hybrid framework that combines secure hardware and multiple crypto protocols.

  20. BioQ: tracing experimental origins in public genomic databases using a novel data provenance model

    PubMed Central

    Saccone, Scott F.; Quan, Jiaxi; Jones, Peter L.

    2012-01-01

    Motivation: Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. Results: We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. Availability and implementation: BioQ is freely available to the public at http://bioq.saclab.net Contact: ssaccone@wustl.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22426342

  1. Potential benefits of genomic selection on genetic gain of small ruminant breeding programs.

    PubMed

    Shumbusho, F; Raoul, J; Astruc, J M; Palhiere, I; Elsen, J M

    2013-08-01

    In conventional small ruminant breeding programs, only pedigree and phenotype records are used to make selection decisions but prospects of including genomic information are now under consideration. The objective of this study was to assess the potential benefits of genomic selection on the genetic gain in French sheep and goat breeding designs of today. Traditional and genomic scenarios were modeled with deterministic methods for 3 breeding programs. The models included decisional variables related to male selection candidates, progeny testing capacity, and economic weights that were optimized to maximize annual genetic gain (AGG) of i) a meat sheep breeding program that improved a meat trait of heritability (h²) = 0.30 and a maternal trait of h² = 0.09 and ii) dairy sheep and goat breeding programs that improved a milk trait of h² = 0.30. Values of ±0.20 of genetic correlation between meat and maternal traits were considered to study their effects on AGG. The Bulmer effect was accounted for and the results presented here are the averages of AGG after 10 generations of selection. Results showed that current traditional breeding programs provide an AGG of 0.095 genetic standard deviation (σa) for meat and 0.061 σa for maternal trait in meat breed and 0.147 σa and 0.120 σa in sheep and goat dairy breeds, respectively. By optimizing decisional variables, the AGG with traditional selection methods increased to 0.139 σa for meat and 0.096 σa for maternal traits in meat breeding programs and to 0.174 σa and 0.183 σa in dairy sheep and goat breeding programs, respectively. With a medium-sized reference population (nref) of 2,000 individuals, the best genomic scenarios gave an AGG that was 17.9% greater than with traditional selection methods with optimized values of decisional variables for combined meat and maternal traits in meat sheep, 51.7% in dairy sheep, and 26.2% in dairy goats. The superiority of genomic schemes increased with the size of the reference population and genomic selection gave the best results when nref > 1,000 individuals for dairy breeds and nref > 2,000 individuals for meat breed. Genetic correlation between meat and maternal traits had a large impact on the genetic gain of both traits. Changes in AGG due to correlation were greatest for low heritable maternal traits. As a general rule, AGG was increased both by optimizing selection designs and including genomic information.

  2. Theory of microbial genome evolution

    NASA Astrophysics Data System (ADS)

    Koonin, Eugene

    Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% 'ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.

  3. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome. PMID:18983662

  4. Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing.

    PubMed

    Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian

    2011-01-01

    The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.

  5. Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing

    PubMed Central

    Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian

    2011-01-01

    Background: The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results: We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. Conclusions: Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928

  6. Accurate genomic predictions for BCWD resistance in rainbow trout are achieved using low-density SNP panels: Evidence that long-range LD is a major contributing factor.

    PubMed

    Vallejo, Roger L; Silva, Rafael M O; Evenhuis, Jason P; Gao, Guangtu; Liu, Sixin; Parsons, James E; Martin, Kyle E; Wiens, Gregory D; Lourenco, Daniela A L; Leeds, Timothy D; Palti, Yniv

    2018-06-05

    Previously, accurate genomic predictions for bacterial cold water disease (BCWD) resistance in rainbow trout were obtained using a medium-density single nucleotide polymorphism (SNP) array. Here, the impact of lower-density SNP panels on the accuracy of genomic predictions was investigated in a commercial rainbow trout breeding population. Using progeny performance data, the accuracy of genomic breeding values (GEBV) using 35K, 10K, 3K, 1K, 500, 300 and 200 SNP panels as well as a panel with 70 quantitative trait loci (QTL)-flanking SNP was compared. The GEBVs were estimated using the Bayesian method BayesB, single-step GBLUP (ssGBLUP) and weighted ssGBLUP (wssGBLUP). The accuracy of GEBVs remained high despite the sharp reductions in SNP density, and even with 500 SNP accuracy was higher than the pedigree-based prediction (0.50-0.56 versus 0.36). Furthermore, the prediction accuracy with the 70 QTL-flanking SNP (0.65-0.72) was similar to the panel with 35K SNP (0.65-0.71). Genomewide linkage disequilibrium (LD) analysis revealed strong LD (r² ≥ 0.25) spanning on average over 1 Mb across the rainbow trout genome. This long-range LD likely contributed to the accurate genomic predictions with the low-density SNP panels. Population structure analysis supported the hypothesis that long-range LD in this population may be caused by admixture. Results suggest that lower-cost, low-density SNP panels can be used for implementing genomic selection for BCWD resistance in rainbow trout breeding programs. © 2018 The Authors. This article is a U.S. Government work and is in the public domain in the USA. Journal of Animal Breeding and Genetics published by Blackwell Verlag GmbH.
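
    As a rough illustration of the r² ≥ 0.25 criterion mentioned in this abstract, the sketch below estimates pairwise LD as the squared Pearson correlation between two genotype dosage vectors (0/1/2 copies of an allele). This genotype-based estimate is a common approximation, not necessarily the estimator the authors used; the function names and the default threshold argument are assumptions made for this example.

```python
import numpy as np


def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    if a.std() == 0 or b.std() == 0:
        return float("nan")  # a monomorphic SNP has no defined correlation
    r = np.corrcoef(a, b)[0, 1]
    return float(r * r)


def is_strong_ld(snp_a, snp_b, threshold=0.25):
    """True when the estimated r^2 meets the (assumed) strong-LD threshold."""
    return ld_r2(snp_a, snp_b) >= threshold
```

    Calling is_strong_ld on every SNP pair within a distance window would give a simple picture of how far strong LD extends along a chromosome.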

  7. The genomic applications in practice and prevention network.

    PubMed

    Khoury, Muin J; Feero, W Gregory; Reyes, Michele; Citrin, Toby; Freedman, Andrew; Leonard, Debra; Burke, Wylie; Coates, Ralph; Croyle, Robert T; Edwards, Karen; Kardia, Sharon; McBride, Colleen; Manolio, Teri; Randhawa, Gurvaneet; Rasooly, Rebekah; St Pierre, Jeannette; Terry, Sharon

    2009-07-01

    The authors describe the rationale and initial development of a new collaborative initiative, the Genomic Applications in Practice and Prevention Network. The network convened by the Centers for Disease Control and Prevention and the National Institutes of Health includes multiple stakeholders from academia, government, health care, public health, industry and consumers. The premise of Genomic Applications in Practice and Prevention Network is that there is an unaddressed chasm between gene discoveries and demonstration of their clinical validity and utility. This chasm is due to the lack of readily accessible information about the utility of most genomic applications and the lack of necessary knowledge by consumers and providers to implement what is known. The mission of Genomic Applications in Practice and Prevention Network is to accelerate and streamline the effective integration of validated genomic knowledge into the practice of medicine and public health, by empowering and sponsoring research, evaluating research findings, and disseminating high quality information on candidate genomic applications in practice and prevention. Genomic Applications in Practice and Prevention Network will develop a process that links ongoing collection of information on candidate genomic applications to four crucial domains: (1) knowledge synthesis and dissemination for new and existing technologies, and the identification of knowledge gaps, (2) a robust evidence-based recommendation development process, (3) translation research to evaluate validity, utility and impact in the real world and how to disseminate and implement recommended genomic applications, and (4) programs to enhance practice, education, and surveillance.

  8. A Selective Review of Group Selection in High-Dimensional Models

    PubMed Central

    Huang, Jian; Breheny, Patrick; Ma, Shuangge

    2013-01-01

    Grouping structures arise naturally in many statistical modeling problems. Several methods have been proposed for variable selection that respect grouping structure in variables. Examples include the group LASSO and several concave group selection methods. In this article, we give a selective review of group selection concerning methodological developments, theoretical properties and computational algorithms. We pay particular attention to group selection methods involving concave penalties. We address both group selection and bi-level selection methods. We describe several applications of these methods in nonparametric additive models, semiparametric regression, seemingly unrelated regressions, genomic data analysis and genome wide association studies. We also highlight some issues that require further study. PMID:24174707
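
    The core step of the group LASSO mentioned in this abstract can be sketched as a group-wise soft-thresholding (proximal) operation: each group of coefficients is shrunk toward zero as a block and is set exactly to zero when its Euclidean norm does not exceed the penalty level. The sketch assumes non-overlapping groups and omits the group-size weights that many formulations include; the function name and arguments are illustrative, not taken from the review.

```python
import numpy as np


def group_soft_threshold(beta, groups, lam):
    """Apply the group-LASSO proximal step to each group of coefficients.

    beta   : 1-D sequence of coefficients
    groups : list of index lists, one per non-overlapping group
    lam    : penalty level (a hypothetical tuning value)
    """
    beta = np.asarray(beta, dtype=float)
    out = np.zeros_like(beta)
    for idx in groups:
        block = beta[idx]
        norm = np.linalg.norm(block)
        if norm > lam:
            # Shrink the whole group by a common factor.
            out[idx] = (1.0 - lam / norm) * block
        # Otherwise the group stays at zero, i.e. it is dropped from the model.
    return out
```

    Because whole groups are kept or dropped together, this step selects variables at the group level rather than one coefficient at a time.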

  9. Methods to address poultry robustness and welfare issues through breeding and associated ethical considerations

    PubMed Central

    Muir, William M.; Cheng, Heng-Wei; Croney, Candace

    2014-01-01

    As consumers and society in general become more aware of ethical and moral dilemmas associated with intensive rearing systems, pressure is put on the animal and poultry industries to adopt alternative forms of housing. This presents challenges especially regarding managing competitive social interactions between animals. However, selective breeding programs are rapidly advancing, enhanced by both genomics and new quantitative genetic theory that offer potential solutions by improving adaptation of the bird to existing and proposed production environments. The outcomes of adaptation could lead to improvement of animal welfare by increasing fitness of the animal for the given environments, which might lead to increased contentment and decreased distress of birds in those systems. Genomic selection, based on dense genetic markers, will allow for more rapid improvement of traits that are expensive or difficult to measure, or have a low heritability, such as pecking, cannibalism, robustness, mortality, leg score, bone strength, disease resistance, and thus has the potential to address many poultry welfare concerns. Recently, selection programs that include social effects, known as associative or indirect genetic effects (IGEs), have received much attention. Group, kin, multi-level, and multi-trait selection including IGEs have all been shown to be highly effective in reducing mortality while increasing productivity of poultry layers and to reduce or eliminate the need for beak trimming. Multi-level selection was shown to increase robustness as indicated by the greater ability of birds to cope with stressors. Kin selection has been shown to be easy to implement and to improve both productivity and animal well-being. Management practices and rearing conditions employed for domestic animal production will continue to change based on ethical and scientific results. However, the animal breeding tools necessary to provide an animal that is best adapted to these changing conditions are readily available and should be used, which will ultimately lead to the best possible outcomes for all impacted. PMID:25505483

  10. Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture

    USDA-ARS's Scientific Manuscript database

    Previously we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  11. A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods

    PubMed Central

    Ratcliffe, B; El-Dien, O G; Klápště, J; Porth, I; Chen, C; Jaquish, B; El-Kassaby, Y A

    2015-01-01

    Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3–40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31–0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04–0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated. PMID:26126540
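
    Two of the steps named in this abstract, imputing missing genotypes with the marker mean and estimating marker effects by ridge regression, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: rrBLUP normally derives the shrinkage parameter from estimated variance components, whereas here lam is simply supplied as an assumed value, and the function names are invented for the example.

```python
import numpy as np


def impute_with_mean(genotypes):
    """Replace missing calls (NaN) in each SNP column with that column's mean."""
    g = np.array(genotypes, dtype=float)
    col_means = np.nanmean(g, axis=0)
    missing = np.isnan(g)
    # Look up the column mean for every missing cell.
    g[missing] = np.take(col_means, np.where(missing)[1])
    return g


def ridge_marker_effects(genotypes, phenotypes, lam):
    """Solve (X'X + lam * I) b = X'y on column-centred genotypes."""
    x = genotypes - genotypes.mean(axis=0)
    y = np.asarray(phenotypes, dtype=float)
    y = y - y.mean()
    p = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(p), x.T @ y)
```

    Predicted breeding values for new trees would then be their centred genotypes multiplied by the estimated effects. Predictive accuracy is usually reported as the correlation between such predictions and observed values in a validation set.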

  12. A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods.

    PubMed

    Ratcliffe, B; El-Dien, O G; Klápště, J; Porth, I; Chen, C; Jaquish, B; El-Kassaby, Y A

    2015-12-01

    Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3-40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31-0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04-0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated.

  13. Conditions for the Evolution of Gene Clusters in Bacterial Genomes

    PubMed Central

    Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.

    2010-01-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
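
    For readers unfamiliar with the Moran process mentioned in this abstract, the sketch below simulates the textbook version: a fixed-size population in which, at each step, one individual reproduces with probability proportional to fitness and one individual chosen uniformly at random dies. It tracks a single advantageous type only and does not model gene order or genome inversions, so it is a generic illustration rather than the authors' model; all names and parameters are assumptions.

```python
import random


def fixation_probability(r, n):
    """Textbook fixation probability of one mutant of relative fitness r among n."""
    if r == 1.0:
        return 1.0 / n
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


def mutant_fixes(n, r, rng):
    """Run one Moran process from a single mutant; True if the mutant fixes."""
    mutants = 1
    while 0 < mutants < n:
        # The reproducing individual is chosen in proportion to fitness.
        p_mutant_birth = r * mutants / (r * mutants + (n - mutants))
        birth_is_mutant = rng.random() < p_mutant_birth
        # The dying individual is chosen uniformly from the whole population.
        death_is_mutant = rng.random() < mutants / n
        mutants += int(birth_is_mutant) - int(death_is_mutant)
    return mutants == n
```

    Averaging mutant_fixes over many runs with a seeded random.Random instance should approach fixation_probability for the same n and r, which is a quick check that the simulation behaves as intended.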

  14. Multimedia Presentations on the Human Genome: Implementation and Assessment of a Teaching Program for the Introduction to Genome Science Using a Poster and Animations

    ERIC Educational Resources Information Center

    Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto

    2008-01-01

    Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from…

  15. No evidence that sex and transposable elements drive genome size variation in evening primroses.

    PubMed

    Ågren, J Arvid; Greiner, Stephan; Johnson, Marc T J; Wright, Stephen I

    2015-04-01

    Genome size varies dramatically across species, but despite an abundance of attention there is little agreement on the relative contributions of selective and neutral processes in governing this variation. The rate of sex can potentially play an important role in genome size evolution because of its effect on the efficacy of selection and transmission of transposable elements (TEs). Here, we used a phylogenetic comparative approach and whole genome sequencing to investigate the contribution of sex and TE content to genome size variation in the evening primrose (Oenothera) genus. We determined genome size using flow cytometry for 30 species that vary in genetic system and find that variation in sexual/asexual reproduction cannot explain the almost twofold variation in genome size. Moreover, using whole genome sequences of three species of varying genome sizes and reproductive system, we found that genome size was not associated with TE abundance; instead the larger genomes had a higher abundance of simple sequence repeats. Although it has long been clear that sexual reproduction may affect various aspects of genome evolution in general and TE evolution in particular, it does not appear to have played a major role in genome size evolution in the evening primroses. © 2015 The Author(s).

  16. A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.

    PubMed

    Wang, Tingting; Chen, Yi-Ping Phoebe; Bowman, Phil J; Goddard, Michael E; Hayes, Ben J

    2016-09-21

    Bayesian mixture models in which the effects of SNPs are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC iterations without the requirement for burn-in. To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breed that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC; however, computational time was reduced to as little as 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Wellcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR.
    The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform as well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, while running up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets.

  17. Relative extended haplotype homozygosity signals across breeds reveal dairy and beef specific signatures of selection.

    PubMed

    Bomba, Lorenzo; Nicolazzi, Ezequiel L; Milanesi, Marco; Negrini, Riccardo; Mancini, Giordano; Biscarini, Filippo; Stella, Alessandra; Valentini, Alessio; Ajmone-Marsan, Paolo

    2015-04-02

    A number of methods are available to scan a genome for selection signatures by evaluating patterns of diversity within and between breeds. Among these, "extended haplotype homozygosity" (EHH) is a reliable approach to detect genome regions under recent selective pressure. The objective of this study was to use this approach to identify regions that are under recent positive selection and shared by the most representative Italian dairy and beef cattle breeds. A total of 3220 animals from Italian Holstein (2179), Italian Brown (775), Simmental (493), Marchigiana (485) and Piedmontese (379) breeds were genotyped with the Illumina BovineSNP50 BeadChip v.1. After standard quality control procedures, genotypes were phased and core haplotypes were identified. The decay of linkage disequilibrium (LD) for each core haplotype was assessed by measuring the EHH. Since accurate estimates of local recombination rates were not available, relative EHH (rEHH) was calculated for each core haplotype. Genomic regions that carry frequent core haplotypes and with significant rEHH values were considered as candidates for recent positive selection. Candidate regions were aligned across breeds to identify signals shared by dairy or beef cattle breeds. Overall, 82 and 87 common regions were detected among dairy and beef cattle breeds, respectively. Bioinformatic analysis identified 244 and 232 genes in these common genomic regions. Gene annotation and pathway analysis showed that these genes are involved in molecular functions that are biologically related to milk or meat production. Our results suggest that a multi-breed approach can lead to the identification of genomic signatures in breeds of cattle that are selected for the same production goal and thus to the localisation of genomic regions of interest in dairy and beef production.
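
    The EHH statistic named in this abstract can be sketched in a few lines of Python. The version below follows the usual definition (the probability that two randomly chosen chromosomes carrying the same core haplotype are identical over the whole interval from the core to a given marker); it is an illustration, not the software used in the study. Haplotypes are assumed to be phased and given as equal-length strings of alleles, and the function and parameter names are hypothetical. The relative statistic (rEHH) additionally compares this value with that of the other core haplotypes and is not implemented here.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core, core_start, extend_end):
    """EHH of one core haplotype, extended to the right up to index extend_end (exclusive)."""
    core_end = core_start + len(core)
    # Keep only chromosomes that carry the core haplotype of interest.
    carriers = [h for h in haplotypes if h[core_start:core_end] == core]
    n = len(carriers)
    if n < 2:
        return 0.0  # no pairs to compare
    # Group carriers by their full extended haplotype and count identical pairs.
    classes = Counter(h[core_start:extend_end] for h in carriers)
    return sum(comb(k, 2) for k in classes.values()) / comb(n, 2)
```

    Evaluating `ehh` at increasing values of `extend_end` traces the decay of homozygosity with distance from the core.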

  18. Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes

    PubMed Central

    Deschamps, Matthieu; Laval, Guillaume; Fagny, Maud; Itan, Yuval; Abel, Laurent; Casanova, Jean-Laurent; Patin, Etienne; Quintana-Murci, Lluis

    2016-01-01

    Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000–13,000 years, the period at which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change—containing variation acquired from archaic hominins or adaptive variants in specific populations—improving our understanding of the relative biological importance of innate immunity pathways in natural conditions. PMID:26748513

  19. Variable selection models for genomic selection using whole-genome sequence data and singular value decomposition.

    PubMed

    Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen

    2017-12-27

    Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability [Formula: see text] and no effect with probability (1 - [Formula: see text]). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. 
    For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
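
    The idea of a posterior probability that a marker has a non-zero effect can be illustrated with a generic two-component (spike-and-slab) calculation in Python. This is a textbook application of Bayes' rule and is not claimed to be the formula used in the paper: it assumes a single marker whose effect estimate has a known prediction error variance `pev`, a normal prior with variance `slab_variance` for non-zero effects, and a prior probability `prior_nonzero`. All names are illustrative.

```python
from math import exp, pi, sqrt

def normal_pdf(x, variance):
    """Density at x of a normal distribution with mean 0 and the given variance."""
    return exp(-x * x / (2.0 * variance)) / sqrt(2.0 * pi * variance)

def posterior_inclusion_probability(effect_estimate, pev, slab_variance, prior_nonzero):
    """P(effect != 0 | estimate) under a two-component mixture prior."""
    # Non-zero effect: the estimate varies with both the true effect and the estimation error.
    slab = prior_nonzero * normal_pdf(effect_estimate, slab_variance + pev)
    # Zero effect: the estimate reflects estimation error only.
    spike = (1.0 - prior_nonzero) * normal_pdf(effect_estimate, pev)
    return slab / (slab + spike)
```

    Estimates that are large relative to their error variance push the probability towards 1, while estimates near zero favour the zero-effect component.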

  20. Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies.

    PubMed

    Calus, M P L; de Haas, Y; Veerkamp, R F

    2013-10-01

    Genomic selection holds the promise to be particularly beneficial for traits that are difficult or expensive to measure, such that access to phenotypes on large daughter groups of bulls is limited. Instead, cow reference populations can be generated, potentially supplemented with existing information from the same or (highly) correlated traits available on bull reference populations. The objective of this study, therefore, was to develop a model to perform genomic predictions and genome-wide association studies based on a combined cow and bull reference data set, with the accuracy of the phenotypes differing between the cow and bull genomic selection reference populations. The developed bivariate Bayesian stochastic search variable selection model allowed for an unbalanced design by imputing residuals in the residual updating scheme for all missing records. The performance of this model is demonstrated on a real data example, where the analyzed trait, being milk fat or protein yield, was either measured only on a cow or a bull reference population, or recorded on both. Our results were that the developed bivariate Bayesian stochastic search variable selection model was able to analyze 2 traits, even though animals had measurements on only 1 of 2 traits. The Bayesian stochastic search variable selection model yielded consistently higher accuracy for fat yield compared with a model without variable selection, both for the univariate and bivariate analyses, whereas the accuracy of both models was very similar for protein yield. The bivariate model identified several additional quantitative trait loci peaks compared with the single-trait models on either trait. In addition, the bivariate models showed a marginal increase in accuracy of genomic predictions for the cow traits (0.01-0.05), although a greater increase in accuracy is expected as the size of the bull population increases. 
    Our results emphasize that the chosen values of the priors in Bayesian genomic prediction models are especially important in small data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  1. A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities.

    PubMed

    Mbogning, Cyprien; Perdry, Hervé; Toussile, Wilson; Broët, Philippe

    2014-01-01

    Dissecting the genomic spectrum of clinical disease entities is a challenging task. Recursive partitioning (or classification trees) methods provide powerful tools for exploring complex interplay among genomic factors, with respect to a main factor, that can reveal hidden genomic patterns. To take confounding variables into account, the partially linear tree-based regression (PLTR) model has been recently published. It combines regression models and tree-based methodology. It is however computationally burdensome and not well suited for situations for which a large number of exploratory variables is expected. We developed a novel procedure that represents an alternative to the original PLTR procedure, and considered different selection criteria. A simulation study with different scenarios has been performed to compare the performances of the proposed procedure to the original PLTR strategy. The proposed procedure with a Bayesian Information Criterion (BIC) achieved good performances to detect the hidden structure as compared to the original procedure. The novel procedure was used for analyzing patterns of copy-number alterations in lung adenocarcinomas, with respect to Kirsten Rat Sarcoma Viral Oncogene Homolog gene (KRAS) mutation status, while controlling for a cohort effect. Results highlight two subgroups of pure or nearly pure wild-type KRAS tumors with particular copy-number alteration patterns. The proposed procedure with a BIC criterion represents a powerful and practical alternative to the original procedure. Our procedure performs well in a general framework and is simple to implement.
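
    The role of the Bayesian Information Criterion as a selection criterion can be shown with a short Python sketch. It uses the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations and L the maximised likelihood; how the candidate trees and their likelihoods are produced in the PLTR procedure is not covered, and the candidate names and values used in any call are made up for illustration.

```python
from math import log

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * log(n_obs) - 2.0 * log_likelihood

def select_by_bic(candidates, n_obs):
    """candidates maps a model name to (log_likelihood, n_params); returns the best name."""
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))
```

    Because the penalty grows with ln(n), a larger tree is retained only when its gain in log-likelihood outweighs the extra parameters.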

  2. ChIP-seq.

    PubMed

    Kim, Tae Hoon; Dekker, Job

    2018-05-01

    Owing to its digital nature, ChIP-seq has become the standard method for genome-wide ChIP analysis. Using next-generation sequencing platforms (notably the Illumina Genome Analyzer), millions of short sequence reads can be obtained. The densities of recovered ChIP sequence reads along the genome are used to determine the binding sites of the protein. Although a relatively small amount of ChIP DNA is required for ChIP-seq, the current sequencing platforms still require amplification of the ChIP DNA by ligation-mediated PCR (LM-PCR). This protocol, which involves linker ligation followed by size selection, is the standard ChIP-seq protocol using an Illumina Genome Analyzer. The size-selected ChIP DNA is amplified by LM-PCR and size-selected for the second time. The purified ChIP DNA is then loaded into the Genome Analyzer. The ChIP DNA can also be processed in parallel for ChIP-chip results. © 2018 Cold Spring Harbor Laboratory Press.
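
    The statement that read densities along the genome are used to locate binding sites can be made concrete with a minimal Python sketch. It simply counts read start positions in fixed-width bins and reports bins above a threshold; real peak callers also model background and control samples, which is omitted here. The bin size, threshold and function names are assumptions chosen for the example.

```python
from collections import Counter

def read_density(read_starts, bin_size):
    """Count reads per genomic bin; bin i covers [i*bin_size, (i+1)*bin_size)."""
    return Counter(pos // bin_size for pos in read_starts)

def enriched_bins(density, min_reads):
    """Indices of bins holding at least min_reads reads, in genomic order."""
    return sorted(b for b, n in density.items() if n >= min_reads)
```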

  3. Scribl: an HTML5 Canvas-based graphics library for visualizing genomic data over the web

    PubMed Central

    Miller, Chase A.; Anthony, Jon; Meyer, Michelle M.; Marth, Gabor

    2013-01-01

    Motivation: High-throughput biological research requires simultaneous visualization as well as analysis of genomic data, e.g. read alignments, variant calls and genomic annotations. Traditionally, such integrative analysis required desktop applications operating on locally stored data. Many current terabyte-size datasets generated by large public consortia projects, however, are already only feasibly stored at specialist genome analysis centers. As even small laboratories can afford very large datasets, local storage and analysis are becoming increasingly limiting, and it is likely that most such datasets will soon be stored remotely, e.g. in the cloud. These developments will require web-based tools that enable users to access, analyze and view vast remotely stored data with a level of sophistication and interactivity that approximates desktop applications. As rapidly dropping cost enables researchers to collect data intended to answer questions in very specialized contexts, developers must also provide software libraries that empower users to implement customized data analyses and data views for their particular application. Such specialized, yet lightweight, applications would empower scientists to better answer specific biological questions than possible with general-purpose genome browsers currently available. Results: Using recent advances in core web technologies (HTML5), we developed Scribl, a flexible genomic visualization library specifically targeting coordinate-based data such as genomic features, DNA sequence and genetic variants. Scribl simplifies the development of sophisticated web-based graphical tools that approach the dynamism and interactivity of desktop applications. Availability and implementation: Software is freely available online at http://chmille4.github.com/Scribl/ and is implemented in JavaScript with all modern browsers supported. 
    Contact: gabor.marth@bc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23172864

  4. Expanding RN Scope of Knowledge-Genetics/Genomics: The New Frontier.

    PubMed

    Rogers, Margaret A; Lizer, Shannon; Doughty, Andrea; Hayden, Beth; Klein, Colleen J

    Introducing a new competency into nursing practice requires the support of nursing leadership. A knowledge survey was used to assess nurses' knowledge following a yearlong genomics education initiative. Findings indicate that nurses benefit from repeated exposure to genetics-related content. Recommendations from this study include development and implementation of strategies that can be used to prepare nurses at all levels for the application of genetics and genomics. Clinical nurses with knowledge of genetics will be able to implement evidence-based interventions to manage acute and chronic illnesses. These nurses will then be able to engage patients more fully, thereby helping them to understand the relationship of genetics to healthy outcomes.

  5. Genomic analysis and selected molecular pathways in rare cancers

    NASA Astrophysics Data System (ADS)

    Liu, Stephen V.; Lenkiewicz, Elizabeth; Evers, Lisa; Holley, Tara; Kiefer, Jeffrey; Ruiz, Christian; Glatz, Katharina; Bubendorf, Lukas; Demeure, Michael J.; Eng, Cathy; Ramanathan, Ramesh K.; Von Hoff, Daniel D.; Barrett, Michael T.

    2012-12-01

    It is widely accepted that many cancers arise as a result of an acquired genomic instability and the subsequent evolution of tumor cells with variable patterns of selected and background aberrations. The presence and behaviors of distinct neoplastic cell populations within a patient's tumor may underlie multiple clinical phenotypes in cancers. A goal of many current cancer genome studies is the identification of recurring selected driver events that can be advanced for the development of personalized therapies. Unfortunately, in the majority of rare tumors, this type of analysis can be particularly challenging. Large series of specimens for analysis are simply not available, allowing recurring patterns to remain hidden. In this paper, we highlight the use of DNA content-based flow sorting to identify and isolate DNA-diploid and DNA-aneuploid populations from tumor biopsies as a strategy to comprehensively study the genomic composition and behaviors of individual cancers in a series of rare solid tumors: intrahepatic cholangiocarcinoma, anal carcinoma, adrenal leiomyosarcoma, and pancreatic neuroendocrine tumors. We propose that the identification of highly selected genomic events in distinct tumor populations within each tumor can identify candidate driver events that can facilitate the development of novel, personalized treatment strategies for patients with cancer.

  6. Vertebrate codon bias indicates a highly GC-rich ancestral genome.

    PubMed

    Nabiyouni, Maryam; Prakash, Ashwin; Fedorov, Alexei

    2013-04-25

    Two factors are thought to have contributed to the origin of codon usage bias in eukaryotes: 1) genome-wide mutational forces that shape overall GC-content and create context-dependent nucleotide bias, and 2) positive selection for codons that maximize efficient and accurate translation. Particularly in vertebrates, these two explanations contradict each other and cloud the origin of codon bias in the taxon. On the one hand, mutational forces fail to explain GC-richness (~60%) of third codon positions, given the GC-poor overall genomic composition among vertebrates (~40%). On the other hand, positive selection cannot easily explain strict regularities in codon preferences. Large-scale bioinformatic assessment, of nucleotide composition of coding and non-coding sequences in vertebrates and other taxa, suggests a simple possible resolution for this contradiction. Specifically, we propose that the last common vertebrate ancestor had a GC-rich genome (~65% GC). The data suggest that whole-genome mutational bias is the major driving force for generating codon bias. As the bias becomes prominent, it begins to affect translation and can result in positive selection for optimal codons. The positive selection can, in turn, significantly modulate codon preferences. Copyright © 2013 Elsevier B.V. All rights reserved.
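
    The contrast drawn in this abstract between overall GC content and GC content at third codon positions (often called GC3) is easy to compute. The Python sketch below assumes an in-frame coding sequence given as an upper- or lower-case DNA string; the function names and the example sequence are illustrative only.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)

def gc3_fraction(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_fraction(cds[:usable][2::3])
```

    For the toy sequence `ATGGCCAAA`, four of nine bases are G or C overall, while two of the three third-position bases are, which mirrors in miniature the GC3 excess the abstract describes.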

  7. Genomic analysis and selected molecular pathways in rare cancers.

    PubMed

    Liu, Stephen V; Lenkiewicz, Elizabeth; Evers, Lisa; Holley, Tara; Kiefer, Jeffrey; Ruiz, Christian; Glatz, Katharina; Bubendorf, Lukas; Demeure, Michael J; Eng, Cathy; Ramanathan, Ramesh K; Von Hoff, Daniel D; Barrett, Michael T

    2012-12-01

    It is widely accepted that many cancers arise as a result of an acquired genomic instability and the subsequent evolution of tumor cells with variable patterns of selected and background aberrations. The presence and behaviors of distinct neoplastic cell populations within a patient's tumor may underlie multiple clinical phenotypes in cancers. A goal of many current cancer genome studies is the identification of recurring selected driver events that can be advanced for the development of personalized therapies. Unfortunately, in the majority of rare tumors, this type of analysis can be particularly challenging. Large series of specimens for analysis are simply not available, allowing recurring patterns to remain hidden. In this paper, we highlight the use of DNA content-based flow sorting to identify and isolate DNA-diploid and DNA-aneuploid populations from tumor biopsies as a strategy to comprehensively study the genomic composition and behaviors of individual cancers in a series of rare solid tumors: intrahepatic cholangiocarcinoma, anal carcinoma, adrenal leiomyosarcoma, and pancreatic neuroendocrine tumors. We propose that the identification of highly selected genomic events in distinct tumor populations within each tumor can identify candidate driver events that can facilitate the development of novel, personalized treatment strategies for patients with cancer.

  8. Rapid Evolutionary Rates and Unique Genomic Signatures Discovered in the First Reference Genome for the Southern Ocean Salp, Salpa thompsoni (Urochordata, Thaliacea)

    PubMed Central

    Jue, Nathaniel K.; Batta-Lona, Paola G.; Trusiak, Sarah; Obergfell, Craig; Bucklin, Ann; O’Neill, Michael J.; O’Neill, Rachel J.

    2016-01-01

    A preliminary genome sequence has been assembled for the Southern Ocean salp, Salpa thompsoni (Urochordata, Thaliacea). Despite the ecological importance of this species in Antarctic pelagic food webs and its potential role as an indicator of changing Southern Ocean ecosystems in response to climate change, no genomic resources are available for S. thompsoni or any closely related urochordate species. Using a multiple-platform, multiple-individual approach, we have produced a 318,767,936-bp genome sequence, covering >50% of the estimated 602 Mb (±173 Mb) genome size for S. thompsoni. Using a nonredundant set of predicted proteins, >50% (16,823) of sequences showed significant homology to known proteins and ∼38% (12,151) of the total protein predictions were associated with Gene Ontology functional information. We have generated 109,958 SNP variant and 9,782 indel predictions for this species, serving as a resource for future phylogenomic and population genetic studies. Comparing the salp genome to available assemblies for four other urochordates, Botryllus schlosseri, Ciona intestinalis, Ciona savignyi and Oikopleura dioica, we found that S. thompsoni shares the previously estimated rapid rates of evolution for these species. High mutation rates are thus independent of genome size, suggesting that rates of evolution >1.5 times that observed for vertebrates are a broad taxonomic characteristic of urochordates. Tests for positive selection implemented in PAML revealed a small number of genes with sites undergoing rapid evolution, including genes involved in ribosome biogenesis and metabolic and immune process that may be reflective of both adaptation to polar, planktonic environments as well as the complex life history of the salps. Finally, we performed an initial survey of small RNAs, revealing the presence of known, conserved miRNAs, as well as novel miRNA genes; unique piRNAs; and mature miRNA signatures for varying developmental stages. 
    Collectively, these resources provide a genomic foundation supporting S. thompsoni as a model species for further examination of the exceptional rates and patterns of genomic evolution shown by urochordates. Additionally, genomic data will allow for the development of molecular indicators of key life history events and processes and afford new understandings and predictions of impacts of climate change on this key species of Antarctic pelagic ecosystems. PMID:27624472

  9. Comparative functional pan-genome analyses to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon metabolism in the genus Mycobacterium.

    PubMed

    Kweon, Ohgew; Kim, Seong-Jae; Blom, Jochen; Kim, Sung-Kwan; Kim, Bong-Soo; Baek, Dong-Heon; Park, Su Inn; Sutherland, John B; Cerniglia, Carl E

    2015-02-14

    The bacterial genus Mycobacterium is of great interest in the medical and biotechnological fields. Despite a flood of genome sequencing and functional genomics data, significant gaps in knowledge between genome and phenome seriously hinder efforts toward the treatment of mycobacterial diseases and practical biotechnological applications. In this study, we propose the use of systematic, comparative functional pan-genomic analysis to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon (PAH) metabolism in the genus Mycobacterium. Phylogenetic, phenotypic, and genomic information for 27 completely genome-sequenced mycobacteria was systematically integrated to reconstruct a mycobacterial phenotype network (MPN) with a pan-genomic concept at a network level. In the MPN, mycobacterial phenotypes show typical scale-free relationships. PAH degradation is an isolated phenotype with the lowest connection degree, consistent with phylogenetic and environmental isolation of PAH degraders. A series of functional pan-genomic analyses provide conserved and unique types of genomic evidence for strong epistatic and pleiotropic impacts on evolutionary trajectories of the PAH-degrading phenotype. Under strong natural selection, the detailed gene gain/loss patterns from horizontal gene transfer (HGT)/deletion events hypothesize a plausible evolutionary path, an epistasis-based birth and pleiotropy-dependent death, for PAH metabolism in the genus Mycobacterium. This study generated a practical mycobacterial compendium of phenotypic and genomic changes, focusing on the PAH-degrading phenotype, with a pan-genomic perspective of the evolutionary events and the environmental challenges. 
    Our findings suggest that when selection acts on PAH metabolism, only a small fraction of possible trajectories is likely to be observed, owing mainly to a combination of the ambiguous phenotypic effects of PAHs and the corresponding pleiotropy- and epistasis-dependent evolutionary adaptation. Evolutionary constraints on the selection of trajectories, like those seen in PAH-degrading phenotypes, are likely to apply to the evolution of other phenotypes in the genus Mycobacterium.

  10. Avian Disease and Oncology Laboratory (ADOL) research update

    USDA-ARS's Scientific Manuscript database

    GENOMICS To meet the growing demands of consumers, the poultry industry will need to continue to improve methods of selection in breeding programs for production and associated traits. One possible solution is genome-wide marker-assisted selection (GWMAS). In brief, evenly-spaced genetic markers s...

  11. A genome-wide scan for selection signatures in Nelore cattle

    USDA-ARS's Scientific Manuscript database

    Brazilian Nelore cattle have been selected for growth traits over more than four decades. In recent years, reproductive and meat quality traits have become more important because of increasing consumption, exports and consumer demand. The identification of genomic regions altered by artificial selec...

  12. Genotype imputation in a tropical crossbred dairy cattle population

    USDA-ARS's Scientific Manuscript database

    The application of new tools, such as genomic selection and genotype imputation, still presents challenges in crossbred populations because relationships of causal variants with markers may vary across breeds. In order to make genomic selection more cost effective, cheap low density chips are often ...

  13. Selecting sequence variants to improve genomic predictions for dairy cattle

    USDA-ARS's Scientific Manuscript database

    Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...

  14. Relaxation of selective constraint on dog mitochondrial DNA following domestication.

    PubMed

    Björnerfeldt, Susanne; Webster, Matthew T; Vilà, Carles

    2006-08-01

    The domestication of dogs caused a dramatic change in their way of life compared with that of their ancestor, the gray wolf. We hypothesize that this new life style changed the selective forces that acted upon the species, which in turn had an effect on the dog's genome. We sequenced the complete mitochondrial DNA genome in 14 dogs, six wolves, and three coyotes. Here we show that dogs have accumulated nonsynonymous changes in mitochondrial genes at a faster rate than wolves, leading to elevated levels of variation in their proteins. This suggests that a major consequence of domestication in dogs was a general relaxation of selective constraint on their mitochondrial genome. If this change also affected other parts of the dog genome, it could have facilitated the generation of novel functional genetic diversity. This diversity could thus have contributed raw material upon which artificial selection has shaped modern breeds and may therefore be an important source of the extreme phenotypic variation present in modern-day dogs.

  15. A high-density SNP genetic linkage map for the silver-lipped pearl oyster, Pinctada maxima: a valuable resource for gene localisation and marker-assisted selection.

    PubMed

    Jones, David B; Jerry, Dean R; Khatkar, Mehar S; Raadsma, Herman W; Zenger, Kyall R

    2013-11-20

    The silver-lipped pearl oyster, Pinctada maxima, is an important tropical aquaculture species extensively farmed for the highly sought "South Sea" pearls. Traditional breeding programs have been initiated for this species in order to select for improved pearl quality, but many economic traits under selection are complex, polygenic and confounded with environmental factors, limiting the accuracy of selection. The incorporation of a marker-assisted selection (MAS) breeding approach would greatly benefit pearl breeding programs by allowing the direct selection of genes responsible for pearl quality. However, before MAS can be incorporated, substantial genomic resources such as genetic linkage maps need to be generated. The construction of a high-density genetic linkage map for P. maxima is not only essential for unravelling the genomic architecture of complex pearl quality traits, but also provides indispensable information on the genome structure of pearl oysters. A total of 1,189 informative genome-wide single nucleotide polymorphisms (SNPs) were incorporated into linkage map construction. The final linkage map consisted of 887 SNPs in 14 linkage groups, spans a total genetic distance of 831.7 centimorgans (cM), and covers an estimated 96% of the P. maxima genome. Assessment of sex-specific recombination across all linkage groups revealed limited overall heterochiasmy between the sexes (i.e. 1.15:1 F/M map length ratio). However, there were pronounced localised differences throughout the linkage groups, whereby male recombination was suppressed near the centromeres compared to female recombination, but inflated towards telomeric regions. Mean values of LD for adjacent SNP pairs suggest that a higher density of markers will be required for powerful genome-wide association studies. Finally, numerous nacre biomineralization genes were localised providing novel positional information for these genes. 
    This high-density SNP genetic map is the first comprehensive linkage map for any pearl oyster species. It provides an essential genomic tool facilitating studies investigating the genomic architecture of complex trait variation and identifying quantitative trait loci for economically important traits useful in genetic selection programs within the P. maxima pearling industry. Furthermore, this map provides a foundation for further research aiming to improve our understanding of the dynamic process of biomineralization, and pearl oyster evolution and synteny.

  16. WheatGenome.info: A Resource for Wheat Genomics Resource.

    PubMed

    Lai, Kaitao

    2016-01-01

    An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ .

  17. Identification of Genomic Regions Associated with Phenotypic Variation between Dog Breeds using Selection Mapping

    PubMed Central

    Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Åke; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.

    2011-01-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease. PMID:22022279

  18. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    PubMed

    Vaysse, Amaury; Ratnakumar, Abhirami; Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H; Hansen, Mark S T; Lawley, Cindy T; Karlsson, Elinor K; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Ake; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T

    2011-10-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.

  19. Comparison of Marker-Based Genomic Estimated Breeding Values and Phenotypic Evaluation for Selection of Bacterial Spot Resistance in Tomato.

    PubMed

    Liabeuf, Debora; Sim, Sung-Chur; Francis, David M

    2018-03-01

    Bacterial spot affects tomato crops (Solanum lycopersicum) grown under humid conditions. Major genes and quantitative trait loci (QTL) for resistance have been described, and multiple loci from diverse sources need to be combined to improve disease control. We investigated genomic selection (GS) prediction models for resistance to Xanthomonas euvesicatoria and experimentally evaluated the accuracy of these models. The training population consisted of 109 families combining resistance from four sources and directionally selected from a population of 1,100 individuals. The families were evaluated on a plot basis in replicated inoculated trials and genotyped with single nucleotide polymorphisms (SNP). We compared the prediction ability of models developed with 14 to 387 SNP. Genomic estimated breeding values (GEBV) were derived using Bayesian least absolute shrinkage and selection operator regression (BL) and ridge regression (RR). Evaluations were based on leave-one-out cross validation and on empirical observations in replicated field trials using the next generation of inbred progeny and a hybrid population resulting from selections in the training population. Prediction ability was evaluated based on correlations between GEBV and phenotypes (r_g), percentage of coselection between genomic and phenotypic selection, and relative efficiency of selection (r_g/r_p). Results were similar with BL and RR models. Models using only markers previously identified as significantly associated with resistance but weighted based on GEBV and mixed models with markers associated with resistance treated as fixed effects and markers distributed in the genome treated as random effects offered greater accuracy and a high percentage of coselection. The accuracy of these models to predict the performance of progeny and hybrids exceeded the accuracy of phenotypic selection.
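
    The ridge-regression and leave-one-out steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy genotypes, the simulated effects, the shrinkage value lam=1.0, and the function names ridge_marker_effects and loo_gebv are all assumptions made for illustration.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge estimate of marker effects: solve (X'X + lam*I) b = X'y.

    X and y are expected to be centred by the caller.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def loo_gebv(X, y, lam=1.0):
    """Leave-one-out GEBV: each line is predicted by a model fitted without it."""
    n = X.shape[0]
    gebv = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i              # boolean mask dropping line i
        x_mean = X[train].mean(axis=0)
        y_mean = y[train].mean()
        b = ridge_marker_effects(X[train] - x_mean, y[train] - y_mean, lam)
        gebv[i] = y_mean + (X[i] - x_mean) @ b
    return gebv

# Hypothetical toy data: 30 families, 20 SNPs coded 0/1/2, 5 SNPs with real effects.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(30, 20)).astype(float)
true_b = np.zeros(20)
true_b[:5] = [1.0, -0.8, 0.6, 0.5, -0.4]
y = X @ true_b + rng.normal(0.0, 0.5, size=30)

gebv = loo_gebv(X, y, lam=1.0)
r_g = np.corrcoef(gebv, y)[0, 1]   # prediction ability: correlation of GEBV with phenotype
print(f"leave-one-out prediction ability r_g = {r_g:.2f}")
```

    In practice the shrinkage parameter would be estimated from the data (for example from variance components) rather than fixed by hand as it is here.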

  20. Hidden genetic variation in the germline genome of Tetrahymena thermophila.

    PubMed

    Dimond, K L; Zufall, R A

    2016-06-01

    Genome architecture varies greatly among eukaryotes. This diversity may profoundly affect the origin and maintenance of genetic variation within a population. Ciliates are microbial eukaryotes with unusual genome features, such as the separation of germline and somatic genomes within a single cell and amitotic division. These features have previously been proposed to increase the rate of molecular evolution in these species. Here, we assessed the fitness effects of genetic variation in the two genomes of natural isolates of the ciliate Tetrahymena thermophila. We find more extensive genetic variation in fitness in the transcriptionally silent germline genome than in the expressed somatic genome. Surprisingly, this variation is not primarily deleterious, but has both beneficial and deleterious effects. We conclude that Tetrahymena genome architecture allows for the maintenance of genetic variation that would otherwise be eliminated by selection. We consider the effect of selection on the two genomes and the impacts of reproductive strategies and the mechanism of sex determination on the structure of this variation. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.

  1. Analysis of base and codon usage by rubella virus.

    PubMed

    Zhou, Yumei; Chen, Xianfeng; Ushijima, Hiroshi; Frey, Teryl K

    2012-05-01

    Rubella virus (RUBV), a small, plus-strand RNA virus that is an important human pathogen, has the unique feature that the GC content of its genome (70%) is the highest (by 20%) among RNA viruses. To determine the effect of this GC content on genomic evolution, base and codon usage were analyzed across viruses from eight diverse genotypes of RUBV. Despite differences in frequency of codon use, the favored codons in the RUBV genome matched those in the human genome for 18 of the 20 amino acids, indicating adaptation to the host. Although usage patterns were conserved in corresponding genes in the diverse genotypes, within-genome comparison revealed that both base and codon usages varied regionally, particularly in the hypervariable region (HVR) of the P150 replicase gene. While directional mutation pressure was predominant in determining base and codon usage within most of the genome (with the strongest tendency being towards C's at third codon positions), natural selection was predominant in the HVR region. The GC content of this region was the highest in the genome (>80%), and it was not clear if selection at the nucleotide level accompanied selection at the amino acid level. Dinucleotide frequency analysis of the RUBV genome revealed that TpA usage was lower than expected, similar to mammalian genes; however, CpG usage was not suppressed, and TpG usage was not enhanced, as is the case in mammalian genes.
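
    The base-usage quantities discussed in this abstract (overall GC content and base usage at third codon positions) can be computed as in the sketch below. The coding sequence is a made-up example, not RUBV sequence, and the function names are hypothetical.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def third_position_counts(cds):
    """Count the bases found at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3           # ignore a trailing partial codon
    return Counter(cds[i + 2] for i in range(0, usable, 3))

def gc3(cds):
    """Fraction of third codon positions occupied by G or C."""
    counts = third_position_counts(cds)
    return (counts["G"] + counts["C"]) / sum(counts.values())

# Made-up coding sequence: ATG GCC GAC CTC GGC TAA
cds = "ATGGCCGACCTCGGCTAA"
print(f"GC  = {gc_content(cds):.3f}")
print(f"GC3 = {gc3(cds):.3f}")
print(third_position_counts(cds))
```

    Comparing GC3 with overall GC content is one simple way to see a bias towards C or G at third codon positions, where most changes are synonymous.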

  2. A Gene-Oriented Haplotype Comparison Reveals Recently Selected Genomic Regions in Temperate and Tropical Maize Germplasm

    PubMed Central

    Zhang, Jie; Li, Yongxiang; Zheng, Jun; Zhang, Hongwei; Yang, Xiaohong; Wang, Jianhua; Wang, Guoying

    2017-01-01

    The extensive genetic variation present in maize (Zea mays) germplasm makes it possible to detect signatures of positive artificial selection that occurred during temperate and tropical maize improvement. Here we report an analysis of 532,815 polymorphisms from a maize association panel consisting of 368 diverse temperate and tropical inbred lines. We developed a gene-oriented approach adapting exonic polymorphisms to identify recently selected alleles by comparing haplotypes across the maize genome. This analysis revealed evidence of selection for more than 1100 genomic regions during recent improvement, and included regulatory genes and key genes with visible mutant phenotypes. We find that selected candidate target genes in temperate maize are enriched in biosynthetic processes, and further examination of these candidates highlights two cases, sucrose flux and oil storage, in which multiple genes in a common pathway can be cooperatively selected. Finally, based on available parallel gene expression data, we hypothesize that some genes were selected for regulatory variations, resulting in altered gene expression. PMID:28099470

  3. Tracking footprints of artificial selection in the dog genome.

    PubMed

    Akey, Joshua M; Ruhe, Alison L; Akey, Dayna T; Wong, Aaron K; Connelly, Caitlin F; Madeoy, Jennifer; Nicholas, Thomas J; Neff, Mark W

    2010-01-19

    The size, shape, and behavior of the modern domesticated dog has been sculpted by artificial selection for at least 14,000 years. The genetic substrates of selective breeding, however, remain largely unknown. Here, we describe a genome-wide scan for selection in 275 dogs from 10 phenotypically diverse breeds that were genotyped for over 21,000 autosomal SNPs. We identified 155 genomic regions that possess strong signatures of recent selection and contain candidate genes for phenotypes that vary most conspicuously among breeds, including size, coat color and texture, behavior, skeletal morphology, and physiology. In addition, we demonstrate a significant association between HAS2 and skin wrinkling in the Shar-Pei, and provide evidence that regulatory evolution has played a prominent role in the phenotypic diversification of modern dog breeds. Our results provide a first-generation map of selection in the dog, illustrate how such maps can rapidly inform the genetic basis of canine phenotypic variation, and provide a framework for delineating the mechanistic basis of how artificial selection promotes rapid and pronounced phenotypic evolution.

  4. Automated ensemble assembly and validation of microbial genomes.

    PubMed

    Koren, Sergey; Treangen, Todd J; Hill, Christopher M; Pop, Mihai; Phillippy, Adam M

    2014-05-03

    The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. 
    Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs.

  5. PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

    PubMed Central

    Sarika; Arora, Vasu; Iquebal, M. A.; Rai, Anil; Kumar, Dinesh

    2013-01-01

    Molecular markers play a significant role in crop improvement for desirable characteristics, such as high yield and disease resistance, that will benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a legume recently sequenced by a global consortium led by ICRISAT (Hyderabad, India) and has been analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for the pigeonpea genome, based on chromosome-wise as well as location-wise search of primers. A total of 123 387 Short Tandem Repeats (STRs) were extracted from the publicly available pigeonpea genome using the MIcroSAtellite tool (MISA). The database is an online relational database based on ‘three-tier architecture’ that catalogues microsatellite information in MySQL, with a user-friendly interface developed using PHP. Searches for STRs may be customized by limiting their location on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further extended with Primer3 for designing primers for selected markers with left and right flanking regions of up to 500 bp. This will enable researchers to select markers of choice at desired intervals over the chromosome. Furthermore, one can use individual STRs of a targeted region of a chromosome to narrow down the location of a gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, marker search based on the characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/ PMID:23396298
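
    The idea of extracting STRs from a sequence can be sketched with a regular expression. This is an illustrative stand-in and does not reproduce MISA's algorithm or its default thresholds; the function name find_strs, the unit-length range, and the minimum repeat count are assumptions.

```python
import re

def find_strs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so ACACACAC is reported as unit AC rather than unit ACAC. A homopolymer
    run such as AAAAAAAA would be reported with unit AA.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Made-up sequence: a dinucleotide and a trinucleotide repeat with short flanks.
seq = "TTG" + "AC" * 5 + "CC" + "TAG" * 4 + "CC"
for start, unit, count in find_strs(seq):
    print(f"start={start} unit={unit} repeats={count}")
# start=3 unit=AC repeats=5
# start=15 unit=TAG repeats=4
```

    The reported start positions are 0-based; flanking sequence for primer design (up to 500 bp in the database described above) would be taken on either side of each hit.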

  6. PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome.

    PubMed

    Sarika; Arora, Vasu; Iquebal, M A; Rai, Anil; Kumar, Dinesh

    2013-01-01

    Molecular markers play a significant role in crop improvement for desirable characteristics, such as high yield and disease resistance, that will benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a legume recently sequenced by a global consortium led by ICRISAT (Hyderabad, India) and has been analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for the pigeonpea genome, based on chromosome-wise as well as location-wise search of primers. A total of 123 387 Short Tandem Repeats (STRs) were extracted from the publicly available pigeonpea genome using the MIcroSAtellite tool (MISA). The database is an online relational database based on 'three-tier architecture' that catalogues microsatellite information in MySQL, with a user-friendly interface developed using PHP. Searches for STRs may be customized by limiting their location on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further extended with Primer3 for designing primers for selected markers with left and right flanking regions of up to 500 bp. This will enable researchers to select markers of choice at desired intervals over the chromosome. Furthermore, one can use individual STRs of a targeted region of a chromosome to narrow down the location of a gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, marker search based on the characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/

  7. Extent of Linkage Disequilibrium and Effective Population Size in Four South African Sanga Cattle Breeds.

    PubMed

    Makina, Sithembile O; Taylor, Jeremy F; van Marle-Köster, Este; Muchadeyi, Farai C; Makgahlela, Mahlako L; MacNeil, Michael D; Maiwashe, Azwihangwisi

    2015-01-01

    Knowledge on the extent of linkage disequilibrium (LD) in livestock populations is essential to determine the minimum distance between markers required for effective coverage when conducting genome-wide association studies (GWAS). This study evaluated the extent of LD, persistence of allelic phase and effective population size (Ne) for four Sanga cattle breeds in South Africa including the Afrikaner (n = 44), Nguni (n = 54), Drakensberger (n = 47), and Bonsmara breeds (n = 46), using Angus (n = 31) and Holstein (n = 29) as reference populations. We found that moderate LD extends up to inter-marker distances of 40-60 kb in Angus (0.21) and Holstein (0.21) and up to 100 kb in Afrikaner (0.20). This suggests that genomic selection and association studies performed within these breeds using an average inter-marker r² ≥ 0.20 would require about 30,000-50,000 SNPs. However, r² ≥ 0.20 extended only up to 10-20 kb in the Nguni and Drakensberger and 20-40 kb in the Bonsmara indicating that 75,000 to 150,000 SNPs would be necessary for GWAS in these breeds. Correlation between alleles at contiguous loci indicated that phase was not strongly preserved between breeds. This suggests the need for breed-specific reference populations in which a much greater density of markers should be scored to identify breed specific haplotypes which may then be imputed into multi-breed commercial populations. Analysis of effective population size based on the extent of LD, revealed Ne = 95 (Nguni), Ne = 87 (Drakensberger), Ne = 77 (Bonsmara), and Ne = 41 (Afrikaner). Results of this study form the basis for implementation of genomic selection programs in the Sanga breeds of South Africa.
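
    The link between the distance over which useful LD persists and the number of SNPs required is simple arithmetic: genome length divided by the target marker spacing. The sketch below assumes a genome length of 3 Gb, which the abstract does not state. Under that assumption, spacings of 60-100 kb give 30,000-50,000 markers and spacings of 20-40 kb give 75,000-150,000, matching the ranges quoted above. The function name markers_needed is hypothetical.

```python
def markers_needed(genome_bp, spacing_bp):
    """Number of evenly spaced markers needed to cover a genome at a given spacing."""
    return genome_bp // spacing_bp

GENOME_BP = 3_000_000_000   # assumption: roughly 3 Gb, not stated in the abstract

for spacing_kb in (100, 60, 40, 20, 10):
    n = markers_needed(GENOME_BP, spacing_kb * 1_000)
    print(f"{spacing_kb:>3} kb spacing -> {n:>7,} SNPs")
# 100 kb spacing ->  30,000 SNPs
#  60 kb spacing ->  50,000 SNPs
#  40 kb spacing ->  75,000 SNPs
#  20 kb spacing -> 150,000 SNPs
#  10 kb spacing -> 300,000 SNPs
```

    Real arrays need more markers than this lower bound, because SNPs are not perfectly evenly spaced and some are uninformative in a given breed.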

  8. Extent of Linkage Disequilibrium and Effective Population Size in Four South African Sanga Cattle Breeds

    PubMed Central

    Makina, Sithembile O.; Taylor, Jeremy F.; van Marle-Köster, Este; Muchadeyi, Farai C.; Makgahlela, Mahlako L.; MacNeil, Michael D.; Maiwashe, Azwihangwisi

    2015-01-01

    Knowledge on the extent of linkage disequilibrium (LD) in livestock populations is essential to determine the minimum distance between markers required for effective coverage when conducting genome-wide association studies (GWAS). This study evaluated the extent of LD, persistence of allelic phase and effective population size (Ne) for four Sanga cattle breeds in South Africa including the Afrikaner (n = 44), Nguni (n = 54), Drakensberger (n = 47), and Bonsmara breeds (n = 46), using Angus (n = 31) and Holstein (n = 29) as reference populations. We found that moderate LD extends up to inter-marker distances of 40–60 kb in Angus (0.21) and Holstein (0.21) and up to 100 kb in Afrikaner (0.20). This suggests that genomic selection and association studies performed within these breeds using an average inter-marker r² ≥ 0.20 would require about 30,000–50,000 SNPs. However, r² ≥ 0.20 extended only up to 10–20 kb in the Nguni and Drakensberger and 20–40 kb in the Bonsmara indicating that 75,000 to 150,000 SNPs would be necessary for GWAS in these breeds. Correlation between alleles at contiguous loci indicated that phase was not strongly preserved between breeds. This suggests the need for breed-specific reference populations in which a much greater density of markers should be scored to identify breed specific haplotypes which may then be imputed into multi-breed commercial populations. Analysis of effective population size based on the extent of LD, revealed Ne = 95 (Nguni), Ne = 87 (Drakensberger), Ne = 77 (Bonsmara), and Ne = 41 (Afrikaner). Results of this study form the basis for implementation of genomic selection programs in the Sanga breeds of South Africa. PMID:26648975

  9. Genome-wide analysis reveals signatures of selection for important traits in domestic sheep from different ecoregions.

    PubMed

    Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin

    2016-11-03

    Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.
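
    Pooled heterozygosity, mentioned in this abstract, is commonly computed per genomic window from summed major- and minor-allele read counts as Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)². The sketch below uses that common definition; whether the authors used exactly this form, and the window contents shown, are assumptions. The function name pooled_heterozygosity is hypothetical.

```python
def pooled_heterozygosity(allele_counts):
    """Pooled heterozygosity Hp for one window.

    allele_counts: iterable of (count_allele_1, count_allele_2) per SNP.
    For each SNP the larger count is treated as the major allele.
    """
    allele_counts = list(allele_counts)
    n_major = sum(max(a, b) for a, b in allele_counts)
    n_minor = sum(min(a, b) for a, b in allele_counts)
    total = n_major + n_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * n_major * n_minor / total ** 2

# Made-up read counts for two windows of three SNPs each.
diverse_window = [(10, 10), (12, 8), (9, 11)]   # both alleles common
swept_window = [(20, 0), (19, 1), (20, 0)]      # one allele nearly fixed

print(f"Hp diverse = {pooled_heterozygosity(diverse_window):.3f}")  # 0.495
print(f"Hp swept   = {pooled_heterozygosity(swept_window):.3f}")    # 0.033
```

    Windows whose Hp falls far below the genome-wide distribution are the candidates for selection; scans of this kind typically standardise Hp to a Z-score across all windows before applying a cut-off.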

  10. Recent horizontal transfer of mellifera subfamily mariner transposons into insect lineages representing four different orders shows that selection acts only during horizontal transfer.

    PubMed

    Lampe, David J; Witherspoon, David J; Soto-Adames, Felipe N; Robertson, Hugh M

    2003-04-01

    We report the isolation and sequencing of genomic copies of mariner transposons involved in recent horizontal transfers into the genomes of the European earwig, Forficula auricularia; the European honey bee, Apis mellifera; the Mediterranean fruit fly, Ceratitis capitata; and a blister beetle, Epicauta funebris, insects from four different orders. These elements are in the mellifera subfamily and are the second documented example of full-length mariner elements involved in this kind of phenomenon. We applied maximum likelihood methods to the coding sequences and determined that the copies in each genome were evolving neutrally, whereas reconstructed ancestral coding sequences appeared to be under selection, which strengthens our previous hypothesis that the primary selective constraint on mariner sequence evolution is the act of horizontal transfer between genomes.

  11. Genomic Selection in Plant Breeding: Methods, Models, and Perspectives.

    PubMed

    Crossa, José; Pérez-Rodríguez, Paulino; Cuevas, Jaime; Montesinos-López, Osval; Jarquín, Diego; de Los Campos, Gustavo; Burgueño, Juan; González-Camacho, Juan M; Pérez-Elizalde, Sergio; Beyene, Yoseph; Dreisigacker, Susanne; Singh, Ravi; Zhang, Xuecai; Gowda, Manje; Roorkiwal, Manish; Rutkoski, Jessica; Varshney, Rajeev K

    2017-11-01

    Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Relating hybrid advantage and genome replacement in unisexual salamanders.

    PubMed

    Charney, Noah D

    2012-05-01

    Unisexual vertebrates are model systems for understanding the evolution of sex. Many predominantly clonal lineages allow occasional genetic recombination, which may be sufficient to avoid the accumulation of deleterious mutations and parasites. Introgression of paternal DNA into an all-female lineage represents a one-way flow of genetic material. Over many generations, this could result in complete replacement of the unisexual genomes by those of the donor species. The process of genome replacement may be counteracted by contemporary dispersal or by positive selection on hybrid nuclear genomes in ecotones. I present a conceptual model that relates nuclear genome replacement, positive selection on hybrids and biogeography in unisexual systems. I execute an individual-based simulation of the fate of hybrid genotypes in contact with a single host species. I parameterize these models for unisexual salamanders in the Ambystoma genus, for which the frequency of genome replacement has been a source of ongoing debate. I find that, if genome replacement occurs at a rate greater than 1/10,000 in Ambystoma, then there must be compensating positive selection in order to maintain observed levels of hybrid nuclei. Future researchers studying unisexual systems may use this framework as a guide to evaluating the hybrid superiority hypothesis.

  13. ABACAS: algorithm-based automatic contiguation of assembled sequences

    PubMed Central

    Assefa, Samuel; Keane, Thomas M.; Otto, Thomas D.; Newbold, Chris; Berriman, Matthew

    2009-01-01

    Summary: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net Contact: sa4@sanger.ac.uk PMID:19497936
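    The order-and-orientate step described above can be illustrated with a minimal Python sketch. It is not the ABACAS implementation (which is written in Perl); it assumes each contig already has a single best alignment to the reference, supplied as a hypothetical `Hit` record with a reference start coordinate and a strand. The names `Hit`, `order_and_orient`, and the toy sequences are illustrative only.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical best alignment of one contig against the reference."""
    contig: str      # contig identifier
    ref_start: int   # leftmost reference coordinate covered by the alignment
    strand: str      # "+" if the contig matches the reference strand, "-" otherwise


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def order_and_orient(contigs: Dict[str, str], hits: List[Hit]) -> List[Tuple[str, str]]:
    """Order contigs by reference position and flip those aligned to the minus strand."""
    layout = []
    for hit in sorted(hits, key=lambda h: h.ref_start):
        seq = contigs[hit.contig]
        if hit.strand == "-":
            seq = reverse_complement(seq)
        layout.append((hit.contig, seq))
    return layout


# Toy input: three short contigs and their assumed alignment positions.
contigs = {"c1": "AAGT", "c2": "CCCG", "c3": "TTAC"}
hits = [Hit("c1", 500, "+"), Hit("c2", 10, "-"), Hit("c3", 200, "+")]
layout = order_and_orient(contigs, hits)
```

    In this toy case the layout starts with c2 (reverse-complemented), followed by c3 and c1. A real pipeline would also have to handle contigs with no alignment, overlapping contigs, and gap sizes, none of which are modelled here.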

  14. Ecological genomics in Xanthomonas: the nature of genetic adaptation with homologous recombination and host shifts.

    PubMed

    Huang, Chao-Li; Pu, Pei-Hua; Huang, Hao-Jen; Sung, Huang-Mo; Liaw, Hung-Jiun; Chen, Yi-Min; Chen, Chien-Ming; Huang, Ming-Ban; Osada, Naoki; Gojobori, Takashi; Pai, Tun-Wen; Chen, Yu-Tin; Hwang, Chi-Chuan; Chiang, Tzen-Yuh

    2015-03-15

    Comparative genomics provides insights into the diversification of bacterial species. Bacterial speciation usually takes place with lasting homologous recombination, which not only acts as a cohering force between diverging lineages but brings advantageous alleles favored by natural selection, and results in ecologically distinct species, e.g., frequent host shift in Xanthomonas pathogenic to various plants. Using whole-genome sequences, we examined the genetic divergence in Xanthomonas campestris that infected Brassicaceae, and X. citri, pathogenic to a wider host range. Genetic differentiation between two incipient races of X. citri pv. mangiferaeindicae was attributable to a DNA fragment introduced by phages. In contrast to most portions of the genome that had nearly equivalent levels of genetic divergence between subspecies as a result of the accumulation of point mutations, 10% of the core genome involved in homologous recombination contributed to the diversification in Xanthomonas, as revealed by the correlation between homologous recombination and genomic divergence. Interestingly, 179 genes were under positive selection; 98 (54.7%) of these genes were involved in homologous recombination, indicating that foreign genetic fragments may have caused the adaptive diversification, especially in lineages with nutritional transitions. Homologous recombination may have provided genetic material for natural selection, and host shifts likely triggered ecological adaptation in Xanthomonas. To a certain extent, we observed that positive selection nevertheless contributed to ecological divergence beyond host shifting. Altogether, mediated by lasting gene flow, species formation in Xanthomonas was likely governed by natural selection that played a key role in helping the deviating populations to explore novel niches (hosts) or respond to environmental cues, subsequently triggering species diversification.

  15. Dynamics of Dark-Fly Genome Under Environmental Selections.

    PubMed

    Izutsu, Minako; Toyoda, Atsushi; Fujiyama, Asao; Agata, Kiyokazu; Fuse, Naoyuki

    2015-12-04

    Environmental adaptation is one of the most fundamental features of organisms. Modern genome science has identified some genes associated with adaptive traits of organisms, and has provided insights into environmental adaptation and evolution. However, how genes contribute to adaptive traits and how traits are selected under an environment in the course of evolution remain mostly unclear. To approach these issues, we utilize "Dark-fly", a Drosophila melanogaster line maintained in constant dark conditions for more than 60 years. Our previous analysis identified 220,000 single nucleotide polymorphisms (SNPs) in the Dark-fly genome, but did not clarify which SNPs of Dark-fly are truly adaptive for living in the dark. We found here that Dark-fly dominated over the wild-type fly in a mixed population under dark conditions, and based on this domination we designed an experiment for genome reselection to identify adaptive genes of Dark-fly. For this experiment, large mixed populations of Dark-fly and the wild-type fly were maintained in light conditions or in dark conditions, and the frequencies of Dark-fly SNPs were compared between these populations across the whole genome. We thereby detected condition-dependent selections toward approximately 6% of the genome. In addition, we observed the time-course trajectory of SNP frequency in the mixed populations through generations 0, 22, and 49, which resulted in notable categorization of the selected SNPs into three types with different combinations of positive and negative selections. Our data provided a list of about 100 strong candidate genes associated with the adaptive traits of Dark-fly.
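    The comparison of Dark-fly SNP frequencies between dark-reared and light-reared populations can be sketched as follows. This is a simplified illustration, not the authors' analysis: it assumes per-SNP read counts are available for each population, and it flags a SNP when the frequency difference exceeds a hypothetical cut-off `min_diff`. All names and numbers in the example are invented for illustration.

```python
from typing import Dict, Tuple

# SNP id -> (reads carrying the Dark-fly allele, total reads at that site)
Counts = Dict[str, Tuple[int, int]]


def allele_frequency(allele_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the Dark-fly allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return allele_reads / total_reads


def condition_dependent_snps(dark: Counts, light: Counts,
                             min_diff: float = 0.2) -> Dict[str, float]:
    """Return SNPs whose frequency differs between conditions by at least min_diff.

    The value is (dark frequency - light frequency): positive when the
    Dark-fly allele is more common in the dark-reared population.
    min_diff is an arbitrary illustrative threshold, not a published one.
    """
    flagged = {}
    for snp in dark.keys() & light.keys():
        diff = allele_frequency(*dark[snp]) - allele_frequency(*light[snp])
        if abs(diff) >= min_diff:
            flagged[snp] = diff
    return flagged


# Invented toy counts for three SNPs in each population.
dark = {"snp1": (90, 100), "snp2": (50, 100), "snp3": (20, 100)}
light = {"snp1": (40, 100), "snp2": (55, 100), "snp3": (60, 100)}
result = condition_dependent_snps(dark, light)
```

    Here snp1 is enriched in the dark and snp3 in the light, while snp2 is not flagged. A fixed threshold ignores sampling noise; a real analysis would apply a statistical test and use replicate populations.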

  16. Dynamics of Dark-Fly Genome Under Environmental Selections

    PubMed Central

    Izutsu, Minako; Toyoda, Atsushi; Fujiyama, Asao; Agata, Kiyokazu; Fuse, Naoyuki

    2015-01-01

    Environmental adaptation is one of the most fundamental features of organisms. Modern genome science has identified some genes associated with adaptive traits of organisms, and has provided insights into environmental adaptation and evolution. However, how genes contribute to adaptive traits and how traits are selected under an environment in the course of evolution remain mostly unclear. To approach these issues, we utilize “Dark-fly”, a Drosophila melanogaster line maintained in constant dark conditions for more than 60 years. Our previous analysis identified 220,000 single nucleotide polymorphisms (SNPs) in the Dark-fly genome, but did not clarify which SNPs of Dark-fly are truly adaptive for living in the dark. We found here that Dark-fly dominated over the wild-type fly in a mixed population under dark conditions, and based on this domination we designed an experiment for genome reselection to identify adaptive genes of Dark-fly. For this experiment, large mixed populations of Dark-fly and the wild-type fly were maintained in light conditions or in dark conditions, and the frequencies of Dark-fly SNPs were compared between these populations across the whole genome. We thereby detected condition-dependent selections toward approximately 6% of the genome. In addition, we observed the time-course trajectory of SNP frequency in the mixed populations through generations 0, 22, and 49, which resulted in notable categorization of the selected SNPs into three types with different combinations of positive and negative selections. Our data provided a list of about 100 strong candidate genes associated with the adaptive traits of Dark-fly. PMID:26637434

  17. Perspectives on Genetic and Genomic Technologies in an Academic Medical Center: The Duke Experience

    PubMed Central

    Katsanis, Sara Huston; Minear, Mollie A.; Vorderstrasse, Allison; Yang, Nancy; Reeves, Jason W.; Rakhra-Burris, Tejinder; Cook-Deegan, Robert; Ginsburg, Geoffrey S.; Simmons, Leigh Ann

    2015-01-01

    In this age of personalized medicine, genetic and genomic testing is expected to become instrumental in health care delivery, but little is known about its actual implementation in clinical practice. Methods. We surveyed Duke faculty and healthcare providers to examine the extent of genetic and genomic testing adoption. We assessed providers’ use of genetic and genomic testing options and indications in clinical practice, providers’ awareness of pharmacogenetic applications, and providers’ opinions on returning research-generated genetic test results to participants. Most clinician respondents currently use family history routinely in their clinical practice, but only 18 percent of clinicians use pharmacogenetics. Only two respondents correctly identified the number of drug package inserts with pharmacogenetic indications. We also found strong support for the return of genetic research results to participants. Our results demonstrate that while Duke healthcare providers are enthusiastic about genomic technologies, use of genomic tools outside of research has been limited. Respondents favor return of research-based genetic results to participants, but clinicians lack knowledge about pharmacogenetic applications. We identified challenges faced by this institution when implementing genetic and genomic testing into patient care that should inform a policy and education agenda to improve provider support and clinician-researcher partnerships. PMID:25854543

  18. Draft Genome Sequences of Several Fungal Strains Selected for Exposure to Microgravity at the International Space Station

    DOE PAGES

    Singh, Nitin K.; Blachowicz, Adriana; Romsdahl, Jillian; ...

    2017-04-13

    Presented here are the whole-genome sequences of eight fungal strains that were selected for exposure to microgravity at the International Space Station. These baseline sequences will help to understand the observed production of novel bioactive compounds.

  19. Draft Genome Sequences of Several Fungal Strains Selected for Exposure to Microgravity at the International Space Station

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, Nitin K.; Blachowicz, Adriana; Romsdahl, Jillian

    Presented here are the whole-genome sequences of eight fungal strains that were selected for exposure to microgravity at the International Space Station. These baseline sequences will help to understand the observed production of novel bioactive compounds.

  20. Integrated genomic approaches to enhance genetic resistance in chickens

    USDA-ARS's Scientific Manuscript database

    The chicken has led the way amongst agricultural animal species in infectious disease control and, in particular, selection for genetic resistance. The generation of the chicken genome sequence and the availability of other empowering tools and resources greatly enhance the ability to select for enh...
