The YeastGenome app: the Saccharomyces Genome Database at your fingertips.
Wong, Edith D; Karra, Kalpana; Hitz, Benjamin C; Hong, Eurie L; Cherry, J Michael
2013-01-01
The Saccharomyces Genome Database (SGD) is a scientific database that provides researchers with high-quality curated data about the genes and gene products of Saccharomyces cerevisiae. To provide instant and easy access to this information on mobile devices, we have developed YeastGenome, a native application for the Apple iPhone and iPad. YeastGenome can be used to quickly find basic information about S. cerevisiae genes and chromosomal features regardless of internet connectivity. With or without network access, you can view basic information and Gene Ontology annotations about a gene of interest by searching gene names and gene descriptions or by browsing the database within the app to find the gene of interest. With internet access, the app provides more detailed information about the gene, including mutant phenotypes, references and protein and genetic interactions, and offers hyperlinks to SGD pages and genome browser views for further detail. SGD provides online help that describes basic ways to navigate the mobile version of SGD, highlights key features and answers frequently asked questions related to the app. The app is available from iTunes (http://itunes.com/apps/yeastgenome). The YeastGenome app is provided freely as a service to our community, as part of SGD's mission to provide free and open access to all its data and annotations.
Informational laws of genome structures
Bonnici, Vincenzo; Manca, Vincenzo
2016-01-01
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
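As a rough illustration of the quantities this abstract mentions, the Python sketch below computes a word length k = lg2(n), read here as the base-2 logarithm, and the empirical Shannon entropy of the k-mers of a sequence. The function names are hypothetical, the rounding of k to an integer is an assumption (the abstract does not say how non-integer values are handled), and the paper's actual indexes may be defined differently.

```python
import math
from collections import Counter


def kmer_length(n: int) -> int:
    """Word length k = log2(n) for a genome of length n.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return max(1, round(math.log2(n)))


def kmer_entropy(genome: str, k: int) -> float:
    """Empirical Shannon entropy (in bits) of the overlapping k-mers of a sequence."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


genome = "ACGT" * 4            # toy sequence of length 16, not real data
k = kmer_length(len(genome))   # log2(16) = 4
print(k, kmer_entropy(genome, k))
```

On a real genome the same two calls would be applied to the full sequence; comparing the result with the entropy of a shuffled sequence of the same length is one simple way to approximate the "comparison with random genomes" that the abstract refers to.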
Informational laws of genome structures
NASA Astrophysics Data System (ADS)
Bonnici, Vincenzo; Manca, Vincenzo
2016-06-01
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
USDA-ARS's Scientific Manuscript database
In the midst of this genomics era, major plant genome databases are collecting massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While basic browsing and sear...
Genome projects and the functional-genomic era.
Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans
2005-12-01
The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.
Discovery of the "RNA continent" through a contrarian's research strategy.
Hayashizaki, Yoshihide
2011-01-01
The International Human Genome Sequencing Consortium completed the decoding of the human genome sequence in 2003. Readers will be aware of the paradigm shift which has occurred since then in the field of life science research. At last, mankind has been able to focus on a complete picture of the full extent of the genome, on which is recorded the basic information that controls all life. Meanwhile, another genome project, centered on Japan and known as the mouse genome encyclopedia project, was progressing with participation from around the world. Led by our research group at RIKEN, it was a full-length cDNA project which aimed to decode the whole RNA (transcriptome) using the mouse as a model. The basic information that controls all life is recorded on the genome, but in order to obtain a complete picture of this extensive information, the decoding of the genome alone is far from sufficient. These two genome projects established that the number of letters in the genome, which is the blueprint of life, is finite, that the number of RNA molecules derived from it is also finite, and that the number of protein molecules derived from the RNA is probably finite too. A massive number of combinations is still involved, but we are now able to understand one section of the network formed by these data. Once an object of study has been understood to be finite, establishing an image of the whole is certain to lead us to an understanding of the whole. Omics is an approach that views the information controlling life as finite and seeks to assemble and analyze it as a whole. Here, I would like to present our transcriptome research while making reference to our unique research strategy.
Proteogenomics | Office of Cancer Clinical Proteomics Research
Proteogenomics, or the integration of proteomics with genomics and transcriptomics, is an emerging approach that promises to advance basic, translational and clinical research. By combining genomic and proteomic information, leading scientists are gaining new insights due to a more complete and unified understanding of complex biological processes.
Heinen, Christopher D
2016-02-01
We have currently entered a genomic era of cancer research which may soon lead to a genomic era of cancer treatment. Patient DNA sequencing information may lead to a personalized approach to managing an individual's cancer as well as future cancer risk. The success of this approach, however, begins not necessarily in the clinician's office, but rather at the laboratory bench of the basic scientist. The basic scientist plays a critical role since the DNA sequencing information is of limited use unless one knows the function of the gene that is altered and the manner by which a sequence alteration affects that function. The role of basic science research in aiding the clinical management of a disease is perhaps best exemplified by considering the case of Lynch syndrome, a hereditary disease that predisposes patients to colorectal and other cancers. This review will examine how the diagnosis, treatment and even prevention of Lynch syndrome-associated cancers has benefitted from extensive basic science research on the DNA mismatch repair genes whose alteration underlies this condition. Copyright © 2015 Elsevier B.V. All rights reserved.
Faculty Performance on the Genomic Nursing Concept Inventory.
Read, Catherine Y; Ward, Linda D
2016-01-01
To use the newly developed Genomic Nursing Concept Inventory (GNCI) to evaluate faculty understanding of foundational genomic concepts, explore relative areas of strength and weakness, and compare the results with those of a student sample. An anonymous online survey instrument consisting of demographic or background items and the 31 multiple-choice questions that make up the GNCI was completed by 495 nursing faculty from across the United States in the fall of 2014. Total GNCI score and scores on four subcategories (genome basics, mutations, inheritance, genomic health) were calculated. Relationships between demographic or background variables and total GNCI score were explored. The mean score on the GNCI was 14.93 (SD = 5.31), or 48% correct; topical category scores were highest on the inheritance and genomic health items (59% and 58% correct, respectively), moderate on the mutations items (54% correct), and lowest on the genome basics items (33% correct). These results are strikingly similar to those of a recent study of nursing students. Factors associated with a higher total score on the GNCI included higher self-rated proficiency with genetic/genomic content, having a doctoral degree, having taken a genetics course for academic credit or continuing education, and having taught either a stand-alone genetic/genomic course or lecture content as part of a nursing or related course. Self-rated proficiency with genetic/genomic content was fair or poor (70%), with only 7% rating their proficiency as very good or excellent. Faculty knowledge of foundational genomic concepts is similar to that of the students they teach and weakest in the areas related to basic science information. Genomics is increasingly relevant in all areas of clinical nursing practice, and the faculty charged with educating the next generation of nurses must understand foundational concepts. Faculty need to be proactive in seeking out relevant educational programs that include basic genetic/genomic concepts. © 2015 Sigma Theta Tau International.
Toward a Comprehensive Genomic Analysis of Cancer - TCGA
The National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) convened a "Toward a Comprehensive Genomic Analysis of Cancer" workshop in Washington, D.C. This workshop brought together physicians, basic scientists and other members of the U.S. and international cancer communities to assist in outlining the most effective strategies for the development of a successful project. Information about this workshop is reported in the Executive Summary.
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.
Hallin, Peter F; Ussery, David W
2004-12-12
Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, and DNA compositions like base skews, oligo skews and repeats at the local and global level, are just a few of the analyses that are presented on the CBS Genome Atlas Web page. Complex analysis, changing methods and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results amount to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web based user interface which is dynamically linked to the Genome Atlas Database can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/. This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.
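To make the "most basic information" and "base skews" in this abstract concrete, here is a minimal Python sketch of AT content and GC skew, both globally and in sliding windows. It assumes the conventional definition of GC skew, (G - C) / (G + C); the abstract does not state the exact formulas or window sizes used by the CBS Genome Atlas, so the function names and parameters are hypothetical.

```python
def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return at / (at + gc) if at + gc else 0.0


def gc_skew(seq: str) -> float:
    """GC skew (G - C) / (G + C); 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC skew in sliding windows, a simple local-level view of base skew."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


toy = "ATGGGCAT"  # toy sequence, not real data
print(at_content(toy), gc_skew(toy), windowed_gc_skew(toy, window=4, step=4))
```

Values like these are the kind of per-genome results that a database of the sort described would store in a table keyed by genome, so that they can be recomputed and synchronized whenever a method changes.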
The future of microarray technology: networking the genome search.
D'Ambrosio, C; Gatta, L; Bonini, S
2005-10-01
In recent years microarray technology has been increasingly used in both basic and clinical research, providing substantial information for a better understanding of genome-environment interactions responsible for diseases, as well as for their diagnosis and treatment. However, in genomic research using microarray technology there are several unresolved issues, including scientific, ethical and legal issues. Networks of excellence like GA(2)LEN may represent the best approach for teaching, cost reduction, data repositories, and functional studies implementation.
Oncogenomics and the development of new cancer therapies.
Strausberg, Robert L; Simpson, Andrew J G; Old, Lloyd J; Riggins, Gregory J
2004-05-27
Scientists have sequenced the human genome and identified most of its genes. Now it is time to use these genomic data, and the high-throughput technology developed to generate them, to tackle major health problems such as cancer. To accelerate our understanding of this disease and to produce targeted therapies, further basic mutational and functional genomic information is required. A systematic and coordinated approach, with the results freely available, should speed up progress. This will best be accomplished through an international academic and pharmaceutical oncogenomics initiative.
MIPS: analysis and annotation of proteins from whole genomes in 2005
Mewes, H. W.; Frishman, D.; Mayer, K. F. X.; Münsterkötter, M.; Noubibou, O.; Pagel, P.; Rattei, T.; Oesterheld, M.; Ruepp, A.; Stümpflen, V.
2006-01-01
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system are maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information enabling the representation of complex information such as functional modules or networks Genome Research Environment System, (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein–protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de). PMID:16381839
MIPS: analysis and annotation of proteins from whole genomes in 2005.
Mewes, H W; Frishman, D; Mayer, K F X; Münsterkötter, M; Noubibou, O; Pagel, P; Rattei, T; Oesterheld, M; Ruepp, A; Stümpflen, V
2006-01-01
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system are maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information enabling the representation of complex information such as functional modules or networks Genome Research Environment System, (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein-protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de).
MaizeGDB: enabling access to basic, translational, and applied research information
USDA-ARS's Scientific Manuscript database
MaizeGDB is the Maize Genetics and Genomics Database (available online at http://www.maizegdb.org). The MaizeGDB project is not simply an online database and website but rather an information service to maize researchers that supports customized data access and analysis needs to individual research...
CRISPR-Cas encoding of a digital movie into the genomes of a population of living bacteria.
Shipman, Seth L; Nivala, Jeff; Macklis, Jeffrey D; Church, George M
2017-07-20
DNA is an excellent medium for archiving data. Recent efforts have illustrated the potential for information storage in DNA using synthesized oligonucleotides assembled in vitro. A relatively unexplored avenue of information storage in DNA is the ability to write information into the genome of a living cell by the addition of nucleotides over time. Using the Cas1-Cas2 integrase, the CRISPR-Cas microbial immune system stores the nucleotide content of invading viruses to confer adaptive immunity. When harnessed, this system has the potential to write arbitrary information into the genome. Here we use the CRISPR-Cas system to encode the pixel values of black and white images and a short movie into the genomes of a population of living bacteria. In doing so, we push the technical limits of this information storage system and optimize strategies to minimize those limitations. We also uncover underlying principles of the CRISPR-Cas adaptation system, including sequence determinants of spacer acquisition that are relevant for understanding both the basic biology of bacterial adaptation and its technological applications. This work demonstrates that this system can capture and stably store practical amounts of real data within the genomes of populations of living cells.
A Proposed Genus Boundary for the Prokaryotes Based on Genomic Insights
Qin, Qi-Long; Xie, Bin-Bin; Zhang, Xi-Ying; Chen, Xiu-Lan; Zhou, Bai-Cheng; Zhou, Jizhong; Oren, Aharon
2014-01-01
Genomic information has already been applied to prokaryotic species definition and classification. However, the contribution of the genome sequence to prokaryotic genus delimitation has been less studied. To gain insights into genus definition for the prokaryotes, we attempted to reveal the genus-level genomic differences in the current prokaryotic classification system and to delineate the boundary of a genus on the basis of genomic information. The average nucleotide sequence identity between two genomes can be used for prokaryotic species delineation, but it is not suitable for genus demarcation. We used the percentage of conserved proteins (POCP) between two strains to estimate their evolutionary and phenotypic distance. A comprehensive genomic survey indicated that the POCP can serve as a robust genomic index for establishing the genus boundary for prokaryotic groups. Basically, two species belonging to the same genus would share at least half of their proteins. In a specific lineage, the genus and family/order ranks showed slight or no overlap in terms of POCP values. A prokaryotic genus can be defined as a group of species with all pairwise POCP values higher than 50%. Integration of whole-genome data into the current taxonomy system can provide comprehensive information for prokaryotic genus definition and delimitation. PMID:24706738
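The genus criterion in this abstract can be sketched in a few lines of Python. The abstract does not spell out how POCP is computed; the sketch assumes the definition POCP = (C1 + C2) / (T1 + T2) × 100, where C1 and C2 are the numbers of conserved proteins in the two genomes and T1 and T2 are their total protein counts. How conserved proteins are identified (for example by sequence similarity search) is outside the sketch, and all names and counts below are hypothetical.

```python
def pocp(conserved_1: int, conserved_2: int, total_1: int, total_2: int) -> float:
    """Percentage of conserved proteins between two genomes (assumed definition)."""
    return 100.0 * (conserved_1 + conserved_2) / (total_1 + total_2)


def same_genus(pairwise_pocp: dict[tuple[str, str], float],
               threshold: float = 50.0) -> bool:
    """True if every pairwise POCP value is higher than the threshold.

    The 50% default follows the boundary stated in the abstract.
    """
    return all(value > threshold for value in pairwise_pocp.values())


# Made-up protein counts for three hypothetical strains.
values = {
    ("strain_a", "strain_b"): pocp(2500, 2400, 4000, 4200),
    ("strain_a", "strain_c"): pocp(2600, 2550, 4000, 4100),
    ("strain_b", "strain_c"): pocp(2700, 2650, 4200, 4100),
}
print(values, same_genus(values))
```

Because the rule requires all pairwise values to exceed the threshold, a single pair at or below 50% is enough for the group to fail the test in this sketch.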
GenColors: annotation and comparative genomics of prokaryotes made easy.
Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen
2007-01-01
GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes but, compared with the dedicated browsers, has reduced genome comparison functionality. As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F. X.
2004-01-01
Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db. PMID:14681437
Basics and applications of genome editing technology.
Yamamoto, Takashi; Sakamoto, Naoaki
2016-01-01
Genome editing with programmable site-specific nucleases is an emerging technology that enables the manipulation of targeted genes in many organisms and cell lines. Since the development of the CRISPR-Cas9 system in 2012, genome editing has rapidly become an indispensable technology for all life science researchers, applicable in various fields. In this seminar, we will introduce the basics of genome editing and focus on the recent development of genome editing tools and technologies for the modification of various organisms and discuss future directions of the genome editing research field, from basic to medical applications.
Keinath, Melissa C.; Timoshevskiy, Vladimir A.; Timoshevskaya, Nataliya Y.; Tsonis, Panagiotis A.; Voss, S. Randal; Smith, Jeramiah J.
2015-01-01
Vertebrates exhibit substantial diversity in genome size, and some of the largest genomes exist in species that uniquely inform diverse areas of basic and biomedical research. For example, the salamander Ambystoma mexicanum (the Mexican axolotl) is a model organism for studies of regeneration, development and genome evolution, yet its genome is ~10× larger than the human genome. As part of a hierarchical approach toward improving genome resources for the species, we generated 600 Gb of shotgun sequence data and developed methods for sequencing individual laser-captured chromosomes. Based on these data, we estimate that the A. mexicanum genome is ~32 Gb. Notably, as much as 19 Gb of the A. mexicanum genome can potentially be considered single copy, which presumably reflects the evolutionary diversification of mobile elements that accumulated during an ancient episode of genome expansion. Chromosome-targeted sequencing permitted the development of assemblies within the constraints of modern computational platforms, allowed us to place 2062 genes on the two smallest A. mexicanum chromosomes and resolves key events in the history of vertebrate genome evolution. Our analyses show that the capture and sequencing of individual chromosomes is likely to provide valuable information for the systematic sequencing, assembly and scaffolding of large genomes. PMID:26553646
Keinath, Melissa C; Timoshevskiy, Vladimir A; Timoshevskaya, Nataliya Y; Tsonis, Panagiotis A; Voss, S Randal; Smith, Jeramiah J
2015-11-10
Vertebrates exhibit substantial diversity in genome size, and some of the largest genomes exist in species that uniquely inform diverse areas of basic and biomedical research. For example, the salamander Ambystoma mexicanum (the Mexican axolotl) is a model organism for studies of regeneration, development and genome evolution, yet its genome is ~10× larger than the human genome. As part of a hierarchical approach toward improving genome resources for the species, we generated 600 Gb of shotgun sequence data and developed methods for sequencing individual laser-captured chromosomes. Based on these data, we estimate that the A. mexicanum genome is ~32 Gb. Notably, as much as 19 Gb of the A. mexicanum genome can potentially be considered single copy, which presumably reflects the evolutionary diversification of mobile elements that accumulated during an ancient episode of genome expansion. Chromosome-targeted sequencing permitted the development of assemblies within the constraints of modern computational platforms, allowed us to place 2062 genes on the two smallest A. mexicanum chromosomes and resolves key events in the history of vertebrate genome evolution. Our analyses show that the capture and sequencing of individual chromosomes is likely to provide valuable information for the systematic sequencing, assembly and scaffolding of large genomes.
Winterfeld, Grit; Becher, Hannes; Voshell, Stephanie; Hilu, Khidir; Röser, Martin
2018-01-01
Karyotype characteristics can provide valuable information on genome evolution and speciation, in particular in taxa with varying basic chromosome numbers and ploidy levels. Due to its worldwide distribution, remarkable variability in morphological traits and the fact that ploidy change plays a key role in its evolution, the canary grass genus Phalaris (Poaceae) is an excellent study system to investigate the role of chromosomal changes in species diversification and expansion. Phalaris comprises diploid species with two basic chromosome numbers of x = 6 and 7 as well as polyploids based on x = 7. To identify distinct karyotype structures and to trace chromosome evolution within the genus, we apply fluorescence in situ hybridisation (FISH) of 5S and 45S rDNA probes in four diploid and four tetraploid Phalaris species of both basic numbers. The data agree with a dysploid reduction from x = 7 to x = 6 as the result of reciprocal translocations between three chromosomes of an ancestor with a diploid chromosome complement of 2n = 14. We recognize three different genomes in the genus: (1) the exclusively Mediterranean genome A based on x = 6, (2) the cosmopolitan genome B based on x = 7 and (3) a genome C based on x = 7 and with a distribution in the Mediterranean and the Middle East. Both auto- and allopolyploidy of genomes B and C are suggested for the formation of tetraploids. The chromosomal divergence observed in Phalaris can be explained by the occurrence of dysploidy, the emergence of three different genomes, and the chromosome rearrangements accompanied by karyotype change and polyploidization. Mapping the recognized karyotypes on the existing phylogenetic tree suggests that genomes A and C are restricted to sections Phalaris and Bulbophalaris, respectively, while genome B occurs across all taxa with x = 7. PMID:29462207
Genetics and Genomics in Oncology Nursing: What Does Every Nurse Need to Know?
Eggert, Julie
2017-03-01
In addition to the need for basic education about genetics/genomics, other suggested approaches include awareness campaigns, continuing education courses, policy review, and onsite clinical development. These alternative learning strategies encourage oncology nurses across the continuum of care, from the bedside/seatside to oncology nurse research, to integrate genomics into all levels of practice and research in the specialty of oncology nursing. All nurses are warriors in the fight against cancer. The goal of this article is to identify genomic information that oncology nurses, at all levels of care, need to know and use as tools in the war against cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
RPAN: rice pan-genome browser for ∼3000 rice genomes.
Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun
2017-01-25
A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
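The definition of a pan-genome given above (the union of the gene sets of all individuals, with presence/absence variations recorded per gene) can be illustrated with a minimal Python sketch. The accession names, gene IDs, and the function `build_pan_genome` are hypothetical and chosen only for illustration; this is not RPAN's data model or API.

```python
from typing import Dict, Set, Tuple


def build_pan_genome(
    gene_sets: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Dict[str, bool]]]:
    """Return the pan-genome and a presence/absence (PAV) table.

    gene_sets maps an accession name to the set of gene IDs found in it.
    """
    # Pan-genome: union of the gene sets of all accessions.
    pan = set().union(*gene_sets.values())
    # PAV table: for each gene, whether it is present in each accession.
    pav = {
        gene: {acc: gene in genes for acc, genes in gene_sets.items()}
        for gene in sorted(pan)
    }
    return pan, pav


# Hypothetical toy data: three accessions with overlapping gene content.
accessions = {
    "acc1": {"geneA", "geneB", "geneC"},
    "acc2": {"geneA", "geneC", "geneD"},
    "acc3": {"geneA", "geneB"},
}
pan, pav = build_pan_genome(accessions)
for gene, row in pav.items():
    print(gene, row)
```

In this toy input, `geneD` is present only in `acc2`, which is the kind of gene that a single reference genome could miss.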
Packaging of HCV-RNA into lentiviral vector
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caval, Vincent; Piver, Eric; Service de Biochimie et Biologie Moleculaire, CHRU de Tours
2011-11-04
Highlights: Description of HCV-RNA Core-D1 interactions. In vivo evaluation of the packaging of HCV genome. Determination of the role of the three basic sub-domains of D1. Heterologous system involving HIV-1 vector particles to mobilise HCV genome. Full length mobilisation of HCV genome and HCV-receptor-independent entry. Abstract: The advent of infectious molecular clones of Hepatitis C virus (HCV) has unlocked the understanding of HCV life cycle. However, packaging of the genomic RNA, which is crucial to generate infectious viral particles, remains poorly understood. Molecular interactions of the domain 1 (D1) of HCV Core protein and HCV RNA have been described in vitro. Since compaction of genetic information within HCV genome has hampered conventional mutational approach to study packaging in vivo, we developed a novel heterologous system to evaluate the interactions between HCV RNA and Core D1. For this, we took advantage of the recruitment of Vpr fusion-proteins into HIV-1 particles. By fusing HCV Core D1 to Vpr we were able to package and transfer a HCV subgenomic replicon into a HIV-1 based lentiviral vector. We next examined how deletion mutants of basic sub-domains of Core D1 influenced HCV RNA recruitment. The results emphasized the crucial role of the first and third basic regions of D1 in packaging. Interestingly, the system described here allowed us to mobilise full-length JFH1 genome in CD81 defective cells, which are normally refractory to HCV infection. This finding paves the way to an evaluation of the replication capability of HCV in various cell types.
Recombinant transfer in the basic genome of E. coli
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dixit, Purushottam; Studier, F. William; Pang, Tin Yau; ...
2015-07-07
An approximation to the ~4-Mbp basic genome shared by 32 strains of E. coli representing six evolutionary groups has been derived and analyzed computationally. A multiple-alignment of the 32 complete genome sequences was filtered to remove mobile elements and identify the most reliable ~90% of the aligned length of each of the resulting 496 basic-genome pairs. Patterns of single bp mutations (SNPs) in aligned pairs distinguish clonally inherited regions from regions where either genome has acquired DNA fragments from diverged genomes by homologous recombination since their last common ancestor. Such recombinant transfer is pervasive across the basic genome, mostly between genomes in the same evolutionary group, and generates many unique mosaic patterns. The six least-diverged genome-pairs have one or two recombinant transfers of length ~40–115 kbp (and few if any other transfers), each containing one or more gene clusters known to confer strong selective advantage in some environments. Moderately diverged genome pairs (0.4–1% SNPs) show mosaic patterns of interspersed clonal and recombinant regions of varying lengths throughout the basic genome, whereas more highly diverged pairs within an evolutionary group or pairs between evolutionary groups having >1.3% SNPs have few clonal matches longer than a few kbp. Many recombinant transfers appear to incorporate fragments of the entering DNA produced by restriction systems of the recipient cell. A simple computational model can closely fit the data. As a result, most recombinant transfers seem likely to be due to generalized transduction by co-evolving populations of phages, which could efficiently distribute variability throughout bacterial genomes.
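The figure of 496 genome pairs follows from choosing 2 of the 32 strains, and the per-pair SNP percentages can be pictured as a simple mismatch rate over aligned positions. The Python sketch below shows both under stated assumptions: the toy sequences, the gap character `-`, and the function `snp_percent` are hypothetical, and the code is not the filtering or recombination analysis used in the study.

```python
from itertools import combinations
from math import comb


def snp_percent(seq_a: str, seq_b: str) -> float:
    """Percentage of mismatching positions between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Number of unordered pairs among 32 genomes: 32 * 31 / 2.
n_pairs = comb(32, 2)
print(n_pairs)

# Hypothetical toy alignment of three sequences.
aligned = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "ACGAACG-AC",
}
pairwise = {
    (x, y): snp_percent(aligned[x], aligned[y])
    for x, y in combinations(sorted(aligned), 2)
}
for pair, pct in pairwise.items():
    print(pair, round(pct, 2))
```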
Information resources at the National Center for Biotechnology Information.
Woodsmall, R M; Benson, D A
1993-01-01
The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequence databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583
Greiner, Stephan; Wang, Xi; Herrmann, Reinhold G; Rauwolf, Uwe; Mayer, Klaus; Haberer, Georg; Meurer, Jörg
2008-09-01
A unique combination of genetic features and a rich stock of information make the flowering plant genus Oenothera an appealing model to explore the molecular basis of speciation processes including nucleus-organelle coevolution. From representative species, we have recently reported complete nucleotide sequences of the 5 basic and genetically distinguishable plastid chromosomes of subsection Oenothera (I-V). In nature, Oenothera plastid genomes are associated with 6 distinct, either homozygous or heterozygous, diploid nuclear genotypes of the 3 basic genomes A, B, or C. Artificially produced plastome-genome combinations that do not occur naturally often display interspecific plastome-genome incompatibility (PGI). In this study, we compare formal genetic data available from all 30 plastome-genome combinations with sequence differences between the plastomes to uncover potential determinants for interspecific PGI. Consistent with an active role in speciation, a remarkable number of genes have high Ka/Ks ratios. Different from the Solanacean cybrid model Atropa/tobacco, RNA editing seems not to be relevant for PGIs in Oenothera. However, predominantly sequence polymorphisms in intergenic segments are proposed as possible sources for PGI. A single locus, the bidirectional promoter region between psbB and clpP, is suggested to contribute to compartmental PGI in the interspecific AB hybrid containing plastome I (AB-I), consistent with its perturbed photosystem II activity.
Middelton, L A; Peters, K F
2001-10-01
The information gained from the Human Genome Project and related genetic research will undoubtedly create significant changes in healthcare practice. It is becoming increasingly clear that nurses in all areas of clinical practice will require a fundamental understanding of basic genetics. This article provides the oncology nurse with an overview of basic genetic concepts, including inheritance patterns of single gene conditions, pedigree construction, chromosome aberrations, and the multifactorial basis underlying the common diseases of adulthood. Normal gene structure and function are introduced and the biochemistry of genetic errors is described.
Emerging issues in public health genomics
Roberts, J. Scott
2014-01-01
This review highlights emerging areas of interest in public health genomics. First, recent advances in newborn screening (NBS) are described, with a focus on practice and policy implications of current and future efforts to expand NBS programs (e.g., via next-generation sequencing). Next, research findings from the rapidly progressing field of epigenetics and epigenomics are detailed, highlighting ways in which our emerging understanding in these areas could guide future intervention and research efforts in public health. We close by considering various ethical, legal and social issues posed by recent developments in public health genomics; these include policies to regulate access to personal genomic information; the need to enhance genetic literacy in both health professionals and the public; and challenges in ensuring that the benefits (and burdens) from genomic discoveries and applications are equitably distributed. Needs for future genomics research that integrates across basic and social sciences are also noted. PMID:25184533
DOE R&D Accomplishments Database
1990-04-01
The Human Genome Initiative is a worldwide research effort with the goal of analyzing the structure of human DNA and determining the location of the estimated 100,000 human genes. In parallel with this effort, the DNA of a set of model organisms will be studied to provide the comparative information necessary for understanding the functioning of the human genome. The information generated by the human genome project is expected to be the source book for biomedical science in the 21st century and will be of immense benefit to the field of medicine. It will help us to understand and eventually treat many of the more than 4000 genetic diseases that affect mankind, as well as the many multifactorial diseases in which genetic predisposition plays an important role. A centrally coordinated project focused on specific objectives is believed to be the most efficient and least expensive way of obtaining this information. The basic data produced will be collected in electronic databases that will make the information readily accessible in convenient form to all who need it. This report describes the plans for the U.S. human genome project and updates those originally prepared by the Office of Technology Assessment (OTA) and the National Research Council (NRC) in 1988. In the intervening two years, improvements in technology for almost every aspect of genomics research have taken place. As a result, more specific goals can now be set for the project.
ISOL@: an Italian SOLAnaceae genomics resource.
Chiusano, Maria Luisa; D'Agostino, Nunzio; Traini, Alessandra; Licciardello, Concetta; Raimondo, Enrico; Aversano, Mario; Frusciante, Luigi; Monti, Luigi
2008-03-26
Present-day '-omics' technologies produce overwhelming amounts of data which include genome sequences, information on gene expression (transcripts and proteins) and on cell metabolic status. These data represent multiple aspects of a biological system and need to be investigated as a whole to shed light on the mechanisms which underpin the system functionality. The gathering and convergence of data generated by high-throughput technologies, the effective integration of different data-sources and the analysis of the information content based on comparative approaches are key methods for meaningful biological interpretations. In the frame of the International Solanaceae Genome Project, we propose here ISOLA, an Italian SOLAnaceae genomics resource. ISOLA (available at http://biosrv.cab.unina.it/isola) represents a trial platform and it is conceived as a multi-level computational environment. ISOLA currently consists of two main levels: the genome and the expression level. The cornerstone of the genome level is represented by the Solanum lycopersicum genome draft sequences generated by the International Tomato Genome Sequencing Consortium. Instead, the basic element of the expression level is the transcriptome information from different Solanaceae species, mainly in the form of species-specific comprehensive collections of Expressed Sequence Tags (ESTs). The cross-talk between the genome and the expression levels is based on data source sharing and on tools that enhance data quality, that extract information content from the levels' under parts and produce value-added biological knowledge. ISOLA is the result of a bioinformatics effort that addresses the challenges of the post-genomics era. It is designed to exploit '-omics' data based on effective integration to acquire biological knowledge and to approach a systems biology view. Beyond providing experimental biologists with a preliminary annotation of the tomato genome, this effort aims to produce a trial computational environment where different aspects and details are maintained as they are relevant for the analysis of the organization, the functionality and the evolution of the Solanaceae family.
2014-01-01
Background: Leptotrombidium pallidum and Leptotrombidium scutellare are the major vector mites for Orientia tsutsugamushi, the causative agent of scrub typhus. Before these organisms can be subjected to whole-genome sequencing, it is necessary to estimate their genome sizes to obtain basic information for establishing the strategies that should be used for genome sequencing and assembly. Method: The genome sizes of L. pallidum and L. scutellare were estimated by a method based on quantitative real-time PCR. In addition, a k-mer analysis of the whole-genome sequences obtained through Illumina sequencing was conducted to verify the mutual compatibility and reliability of the results. Results: The genome sizes estimated using qPCR were 191 ± 7 Mb for L. pallidum and 262 ± 13 Mb for L. scutellare. The k-mer analysis-based genome lengths were estimated to be 175 Mb for L. pallidum and 286 Mb for L. scutellare. The estimates from these two independent methods were mutually complementary and within a similar range to those of other Acariform mites. Conclusions: The estimation method based on qPCR appears to be a useful alternative when the standard methods, such as flow cytometry, are impractical. The relatively small estimated genome sizes should facilitate whole-genome analysis, which could contribute to our understanding of Arachnida genome evolution and provide key information for scrub typhus prevention and mite vector competence. PMID:24947244
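The abstract does not say how its k-mer analysis was carried out. One commonly used approach, sketched below in Python as an assumption rather than as the authors' method, divides the total number of counted k-mers by the depth at the main peak of the k-mer multiplicity histogram. The function names, the `min_depth` error filter, and the idealised toy reads (ten error-free full-length copies of a random 1,000-bp sequence) are all hypothetical.

```python
import random
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer occurring in the reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts: Counter, min_depth: int = 2) -> float:
    """Estimate genome size as (total k-mers) / (peak k-mer depth).

    k-mers seen fewer than min_depth times are treated as likely
    sequencing errors and ignored.
    """
    kept = [depth for depth in kmer_counts.values() if depth >= min_depth]
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # Histogram: how many distinct k-mers occur at each depth.
    histogram = Counter(kept)
    peak_depth = max(histogram, key=histogram.get)
    return sum(kept) / peak_depth


# Idealised toy data: a random 1,000-bp "genome" sequenced ten times
# without errors, as full-length reads.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(1000))
reads = [genome] * 10

estimate = estimate_genome_size(count_kmers(reads, k=15))
print(estimate)
```

With real reads the estimate depends on the choice of k, on sequencing errors, and on repeats and heterozygosity, which is one reason to cross-check it against an independent method such as qPCR, as the study does.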
A comprehensive crop genome research project: the Superhybrid Rice Genome Project in China.
Yu, Jun; Wong, Gane Ka-Shu; Liu, Siqi; Wang, Jian; Yang, Huanming
2007-06-29
In May 2000, the Beijing Institute of Genomics formally announced the launch of a comprehensive crop genome research project on rice genomics, the Chinese Superhybrid Rice Genome Project. SRGP is not simply a sequencing project targeted to a single rice (Oryza sativa L.) genome, but a full-swing research effort with an ultimate goal of providing inclusive basic genomic information and molecular tools not only to understand biology of the rice, both as an important crop species and a model organism of cereals, but also to focus on a popular superhybrid rice landrace, LYP9. We have completed the first phase of SRGP and provide the rice research community with a finished genome sequence of an indica variety, 93-11 (the paternal cultivar of LYP9), together with ample data on subspecific (between subspecies) polymorphisms, transcriptomes and proteomes, useful for within-species comparative studies. In the second phase, we have acquired the genome sequence of the maternal cultivar, PA64S, together with the detailed catalogues of genes uniquely expressed in the parental cultivars and the hybrid as well as allele-specific markers that distinguish parental alleles. Although SRGP in China is not an open-ended research programme, it has been designed to pave a way for future plant genomics research and application, such as to interrogate fundamentals of plant biology, including genome duplication, polyploidy and hybrid vigour, as well as to provide genetic tools for crop breeding and to carry along a social burden-leading a fight against the world's hunger. It began with genomics, the newly developed and industry-scale research field, and from the world's most populous country. In this review, we summarize our scientific goals and noteworthy discoveries that exploit new territories of systematic investigations on basic and applied biology of rice and other major cereal crops.
AID to overcome the limitations of genomic information by introducing somatic DNA alterations.
Honjo, Tasuku; Muramatsu, Masamichi; Nagaoka, Hitoshi; Kinoshita, Kazuo; Shinkura, Reiko
2006-05-01
The immune system has adopted somatic DNA alterations to overcome the limitations of the genomic information. Activation induced cytidine deaminase (AID) is an essential enzyme to regulate class switch recombination (CSR), somatic hypermutation (SHM) and gene conversion (GC) of the immunoglobulin gene. AID is known to be required for DNA cleavage of S regions in CSR and V regions in SHM. However, its molecular mechanism is a focus of extensive debate. RNA editing hypothesis postulates that AID edits yet unknown mRNA, to generate specific endonucleases for CSR and SHM. By contrast, DNA deamination hypothesis assumes that AID deaminates cytosine in DNA, followed by DNA cleavage by base excision repair enzymes. We summarize the basic knowledge for molecular mechanisms for CSR and SHM and then discuss the importance of AID not only in the immune regulation but also in the genome instability.
Tu, Jianfeng; Yang, Ying; Yang, Fuhe; Xing, Xiumei
2017-03-01
Peking duck (Anas platyrhynchos) and Muscovy duck (Cairina moschata) are two types of domestic ducks and the most popular meat breeds in the world. In this study, we sequenced and compared the complete mitochondrial genomes of both breeds. To investigate the phylogeny of both breeds within Anseriformes, the concatenated sequences of 12 protein-coding genes were used for phylogenetic analysis. The result was consistent with most previous morphological and molecular studies. The complete mitochondrial genome sequences of both breeds will be useful for phylogenetics and will serve as basic data for breeding and genetics.
Architecture of the human regulatory network derived from ENCODE data.
Gerstein, Mark B; Kundaje, Anshul; Hariharan, Manoj; Landt, Stephen G; Yan, Koon-Kiu; Cheng, Chao; Mu, Xinmeng Jasmine; Khurana, Ekta; Rozowsky, Joel; Alexander, Roger; Min, Renqiang; Alves, Pedro; Abyzov, Alexej; Addleman, Nick; Bhardwaj, Nitin; Boyle, Alan P; Cayting, Philip; Charos, Alexandra; Chen, David Z; Cheng, Yong; Clarke, Declan; Eastman, Catharine; Euskirchen, Ghia; Frietze, Seth; Fu, Yao; Gertz, Jason; Grubert, Fabian; Harmanci, Arif; Jain, Preti; Kasowski, Maya; Lacroute, Phil; Leng, Jing Jane; Lian, Jin; Monahan, Hannah; O'Geen, Henriette; Ouyang, Zhengqing; Partridge, E Christopher; Patacsil, Dorrelyn; Pauli, Florencia; Raha, Debasish; Ramirez, Lucia; Reddy, Timothy E; Reed, Brian; Shi, Minyi; Slifer, Teri; Wang, Jing; Wu, Linfeng; Yang, Xinqiong; Yip, Kevin Y; Zilberman-Schapira, Gili; Batzoglou, Serafim; Sidow, Arend; Farnham, Peggy J; Myers, Richard M; Weissman, Sherman M; Snyder, Michael
2012-09-06
Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments. We found the combinatorial, co-association of transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the transcription factor binding into a hierarchy and integrated it with other genomic information (for example, microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
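A feed-forward loop, one of the network motifs mentioned above, is a triple of nodes in which A regulates B, B regulates C, and A also regulates C directly. The Python sketch below counts such triples in a small directed graph. The edge list and node names are hypothetical, and the brute-force enumeration is meant only to illustrate the motif definition, not the enrichment analysis performed on the ENCODE network.

```python
from itertools import permutations
from typing import Iterable, Tuple


def count_feed_forward_loops(edges: Iterable[Tuple[str, str]]) -> int:
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    # Brute force over all ordered triples of distinct nodes: O(n^3),
    # acceptable for a toy network but not for a genome-scale one.
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


# Hypothetical toy regulatory network: (regulator, target) pairs.
toy_network = [
    ("TF1", "TF2"),
    ("TF2", "gene1"),
    ("TF1", "gene1"),  # closes the loop TF1 -> TF2 -> gene1
    ("TF2", "gene2"),
    ("TF3", "gene2"),
]
n_ffl = count_feed_forward_loops(toy_network)
print(n_ffl)
```

Judging whether a motif is enriched would additionally require comparing this count with counts from randomised networks, which the sketch does not attempt.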
DroSpeGe: rapid access database for new Drosophila species genomes.
Gilbert, Donald G
2007-01-01
The Drosophila species comparative genome database DroSpeGe (http://insects.eugenes.org/DroSpeGe/) has provided genome researchers with rapid, usable access to 12 new and old Drosophila genomes since its inception in 2004. Scientists can use, with minimal computing expertise, the wealth of new genome information for developing new insights into insect evolution. New genome assemblies provided by several sequencing centers have been annotated with known model organism gene homologies and gene predictions to provide basic comparative data. TeraGrid supplies the shared cyberinfrastructure for the primary computations. This genome database includes homologies to Drosophila melanogaster and eight other eukaryote model genomes, and gene predictions from several groups. BLAST searches of the newest assemblies are integrated with genome maps. GBrowse maps provide detailed views of cross-species aligned genomes. BioMart provides for data mining of annotations and sequences. Common chromosome maps identify major synteny among species. Potential gain and loss of genes is suggested by Gene Ontology groupings for genes of the new species. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
Your Genes, Your Choices: Exploring the Issues Raised by Genetic Research
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, C.
1999-05-31
Your Genes, Your Choices provides accurate information about the ethical, legal, and social implications of the Human Genome Project and genetic research in an easy-to-read style and format. Each chapter in the book begins with a brief vignette, which introduces an issue within a human story, and raises a question for the reader to think about as the basic science and information are presented in the rest of the chapter.
The Human Ageing Genomic Resources: online databases and tools for biogerontologists
de Magalhães, João Pedro; Budovsky, Arie; Lehmann, Gilad; Costa, Joana; Li, Yang; Fraifeld, Vadim; Church, George M.
2009-01-01
Summary Ageing is a complex, challenging phenomenon that will require multiple, interdisciplinary approaches to unravel its puzzles. To assist basic research on ageing, we developed the Human Ageing Genomic Resources (HAGR). This work provides an overview of the databases and tools in HAGR and describes how the gerontology research community can employ them. Several recent changes and improvements to HAGR are also presented. The two centrepieces in HAGR are GenAge and AnAge. GenAge is a gene database featuring genes associated with ageing and longevity in model organisms, a curated database of genes potentially associated with human ageing, and a list of genes tested for their association with human longevity. A myriad of biological data and information is included for hundreds of genes, making GenAge a reference for research that reflects our current understanding of the genetic basis of ageing. GenAge can also serve as a platform for the systems biology of ageing, and tools for the visualization of protein-protein interactions are also included. AnAge is a database of ageing in animals, featuring over 4,000 species, primarily assembled as a resource for comparative and evolutionary studies of ageing. Longevity records, developmental and reproductive traits, taxonomic information, basic metabolic characteristics, and key observations related to ageing are included in AnAge. Software is also available to aid researchers in the form of Perl modules to automate numerous tasks and as an SPSS script to analyse demographic mortality data. The Human Ageing Genomic Resources are available online at http://genomics.senescence.info. PMID:18986374
Directional genomic hybridization for chromosomal inversion discovery and detection.
Ray, F Andrew; Zimmerman, Erin; Robinson, Bruce; Cornforth, Michael N; Bedford, Joel S; Goodwin, Edwin H; Bailey, Susan M
2013-04-01
Chromosomal rearrangements are a source of structural variation within the genome that figure prominently in human disease, where the importance of translocations and deletions is well recognized. In principle, inversions (reversals in the orientation of DNA sequences within a chromosome) should have similar detrimental potential. However, the study of inversions has been hampered by traditional approaches used for their detection, which are not particularly robust. Even with significant advances in whole genome approaches, changes in the absolute orientation of DNA remain difficult to detect routinely. Consequently, our understanding of inversions is still surprisingly limited, as is our appreciation for their frequency and involvement in human disease. Here, we introduce the directional genomic hybridization methodology of chromatid painting, a whole new way of looking at structural features of the genome that can be employed with high resolution on a cell-by-cell basis, and demonstrate its basic capabilities for genome-wide discovery and targeted detection of inversions. Bioinformatics enabled development of sequence- and strand-specific directional probe sets, which when coupled with single-stranded hybridization, greatly improved the resolution and ease of inversion detection. We highlight examples of the far-ranging applicability of this cytogenomics-based approach, which include confirmation of the alignment of the human genome database and evidence that individuals themselves share similar sequence directionality, as well as use in comparative and evolutionary studies for any species whose genome has been sequenced. In addition to applications related to basic mechanistic studies, the information obtainable with strand-specific hybridization strategies may ultimately enable novel gene discovery, thereby benefitting the diagnosis and treatment of a variety of human disease states and disorders including cancer, autism, and idiopathic infertility.
Strategies for the acquisition of transcriptional and epigenetic information in single cells.
Li, Guang; Dzilic, Elda; Flores, Nick; Shieh, Alice; Wu, Sean M
2017-03-01
As the basic unit of living organisms, each single cell has unique molecular signatures and functions. Our ability to uncover the transcriptional and epigenetic signature of single cells has been hampered by the lack of tools to explore this area of research. The advent of microfluidic single cell technology along with single cell genome-wide DNA amplification methods has greatly improved our understanding of the expression variation in single cells. Transcriptional expression profiling by multiplex qPCR or genome-wide RNA sequencing has enabled us to examine gene expression in single cells in different tissues. With the new tools, new cellular heterogeneity, novel marker genes, unique subpopulations, and the spatial location of each single cell can be identified successfully. Epigenetic modifications for each single cell can also be obtained via similar methods. Based on single cell genome sequencing, single cell epigenetic information including histone modifications, DNA methylation, and chromatin accessibility has been explored and has provided valuable insights regarding gene regulation and disease prognosis. In this article, we review the development of strategies to obtain single cell transcriptional and epigenetic data. Furthermore, we discuss ways in which single cell studies may help to provide greater understanding of the mechanisms of basic cardiovascular biology that will eventually lead to improvement in our ability to diagnose disease and develop new therapies.
[Precision medicine: new opportunities and challenges for molecular epidemiology].
Song, Jing; Hu, Yonghua
2016-04-01
Since the completion of the Human Genome Project in 2003 and the announcement of the Precision Medicine Initiative by U.S. President Barack Obama in January 2015, human beings have initially completed the "three steps" of "genomics to biology, genomics to health as well as genomics to society". As a new inter-discipline, the emergence and development of precision medicine have relied on the support and promotion from biological science, basic medicine, clinical medicine, epidemiology, statistics, sociology and information science, etc. Meanwhile, molecular epidemiology, as a cross-discipline of epidemiology and molecular biology, is considered to be the core force promoting precision medicine. This article is based on the characteristics and research progress of medicine and molecular epidemiology respectively, focusing on the contribution and significance of molecular epidemiology to precision medicine, and exploring the possible opportunities and challenges in the future.
Taylor, Christina M.; Mitreva, Makedonka
2011-01-01
A vast majority of the burden from neglected tropical diseases results from helminth infections (nematodes and platyhelminthes). Parasitic helminthes infect over 2 billion people, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges to improve control of parasitic helminth infections are multi-fold and no single category of approaches will meet them all. New information such as helminth genomics, functional genomics and proteomics coupled with innovative bioinformatic approaches provides fundamental molecular information about these parasites, accelerating both basic research and the development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention), built by integrating functional, structural and comparative genomic data from plant, animal and human helminthes, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html. PMID:21760913
Evolution of neuronal signalling: transmitters and receptors.
Hoyle, Charles H V
2011-11-16
Evolution is a dynamic process during which the genome should not be regarded as a static entity. Molecular and morphological information yield insights into the evolution of species and their phylogenetic relationships, and molecular information in particular provides insight into the evolution of signalling processes. Many signalling systems have their origin in primitive, even unicellular, organisms. Through time, and as organismal complexity increased, certain molecules were employed as intercellular signal molecules. In the autonomic nervous system the basic unit of chemical transmission is a ligand and its cognate receptor. The general mechanisms underlying evolution of signal molecules and their cognate receptors have their basis in the alteration of the genome. In the past this has occurred in large-scale events, represented by two or more doublings of the whole genome, or large segments of the genome, early in the deuterostome lineage, after the emergence of urochordates and cephalochordates, and before the emergence of vertebrates. These duplications were followed by extensive remodelling involving subsequent small-scale changes, ranging from point mutations to exon duplication. Concurrent with these processes was multiple gene loss so that the modern genome contains roughly the same number of genes as in early deuterostomes despite the large-scale genomic duplications. In this review, the principles that underlie evolution that have led to large and small families of autonomic neurotransmitters and their receptors are discussed, with emphasis on G protein-coupled receptors. Copyright © 2010 Elsevier B.V. All rights reserved.
Lijun Liu; Matthew S. Zinkgraf; H. Earl Petzold; Eric P. Beers; Vladimir Filkov; Andrew Groover
2014-01-01
The class I KNOX homeodomain transcription factor ARBORKNOX1 (ARK1) is a key regulator of vascular cambium maintenance and cell differentiation in Populus. Currently, basic information is lacking concerning the distribution, functional characteristics, and evolution of ARK1 binding in the Populus genome.
Microscopy Images as Interactive Tools in Cell Modeling and Cell Biology Education
ERIC Educational Resources Information Center
Araujo-Jorge, Tania C.; Cardona, Tania S.; Mendes, Claudia L. S.; Henriques-Pons, Andrea; Meirelles, Rosane M. S.; Coutinho, Claudia M. L. M.; Aguiar, Luiz Edmundo V.; Meirelles, Maria de Nazareth L.; de Castro, Solange L.; Barbosa, Helene S.; Luz, Mauricio R. M. P.
2004-01-01
The advent of genomics, proteomics, and microarray technology has brought much excitement to science, both in teaching and in learning. The public is eager to know about the processes of life. In the present context of the explosive growth of scientific information, a major challenge of modern cell biology is to popularize basic concepts of…
HAL: a hierarchical format for storing and analyzing multiple genome alignments.
Hickey, Glenn; Paten, Benedict; Earl, Dent; Zerbino, Daniel; Haussler, David
2013-05-15
Large multiple genome alignments and inferred ancestral genomes are ideal resources for comparative studies of molecular evolution, and advances in sequencing and computing technology are making them increasingly obtainable. These structures can provide a rich understanding of the genetic relationships between all subsets of species they contain. Current formats for storing genomic alignments, such as XMFA and MAF, are all indexed or ordered using a single reference genome, however, which limits the information that can be queried with respect to other species and clades. This loss of information grows with the number of species under comparison, as well as their phylogenetic distance. We present HAL, a compressed, graph-based hierarchical alignment format for storing multiple genome alignments and ancestral reconstructions. HAL graphs are indexed on all genomes they contain. Furthermore, they are organized phylogenetically, which allows for modular and parallel access to arbitrary subclades without fragmentation because of rearrangements that have occurred in other lineages. HAL graphs can be created or read with a comprehensive C++ API. A set of tools is also provided to perform basic operations, such as importing and exporting data, identifying mutations and coordinate mapping (liftover). All documentation and source code for the HAL API and tools are freely available at http://github.com/glennhickey/hal. Contact: hickey@soe.ucsc.edu or haussler@soe.ucsc.edu. Supplementary data are available at Bioinformatics online.
Pang, Jiaohui; Cheng, Qiqun; Sun, Dandan; Zhang, Heng; Jin, Shaofei
2016-09-01
Yellowfin tuna (Thunnus albacares) is one of the most economically important fishes in the world. In the present study, we determined the complete mitochondrial DNA sequence and organization of T. albacares. The entire mitochondrial genome is a circular molecule of 16,528 bp in length, which encodes 37 genes in all. These genes comprise 13 protein-coding genes (ATP6 and 8, COI-III, Cytb, ND1-6 and 4L), 22 transfer RNA genes (tRNAs), and 2 ribosomal RNA genes (12S and 16S rRNAs). The complete mitochondrial genome sequence of T. albacares can provide basic information for studies on the molecular taxonomy and conservation genetics of teleost fishes.
Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H
1999-01-01
We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits. PMID:10388831
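The locus counts quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up for this example, and the counts derived from the percentages are rounded estimates, not figures reported by the authors.

```python
# Locus counts by probe type, as stated in the abstract.
loci_by_probe = {
    "cDNA": 1156,
    "random genomic clone": 545,
    "SSR": 16,
    "isozyme": 14,
    "anonymous clone": 5,
}

# The per-class counts should add up to the 1736 loci of the map.
total_loci = sum(loci_by_probe.values())

# Rounded estimates derived from the stated percentages (assumption:
# the percentages apply to the full set of 1736 loci).
sequenced_loci = round(0.56 * total_loci)
loci_with_function = round(0.66 * sequenced_loci)

print(total_loci, sequenced_loci, loci_with_function)
```

The five probe classes sum to the stated map size, which suggests the classes are treated as non-overlapping in the abstract.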
The Blueprint of a Minimal Cell: MiniBacillus
Reuß, Daniel R.; Commichau, Fabian M.; Gundlach, Jan; Zhu, Bingyao
2016-01-01
Bacillus subtilis is one of the best-studied organisms. Due to the broad knowledge and annotation and the well-developed genetic system, this bacterium is an excellent starting point for genome minimization with the aim of constructing a minimal cell. We have analyzed the genome of B. subtilis and selected all genes that are required to allow life in complex medium at 37°C. This selection is based on the known information on essential genes and functions as well as on gene and protein expression data and gene conservation. The list presented here includes 523 and 119 genes coding for proteins and RNAs, respectively. These proteins and RNAs are required for the basic functions of life in information processing (replication and chromosome maintenance, transcription, translation, protein folding, and secretion), metabolism, cell division, and the integrity of the minimal cell. The completeness of the selected metabolic pathways, reactions, and enzymes was verified by the development of a model of metabolism of the minimal cell. A comparison of the MiniBacillus genome to the recently reported designed minimal genome of Mycoplasma mycoides JCVI-syn3.0 indicates excellent agreement in the information-processing pathways, whereas each species has a metabolism that reflects specific evolution and adaptation. The blueprint of MiniBacillus presented here serves as the starting point for a successive reduction of the B. subtilis genome. PMID:27681641
Insurance and genetic testing: where are we now?
Ostrer, H; Allen, W; Crandall, L A; Moseley, R E; Dewar, M A; Nye, D; McCrary, S V
1993-01-01
Basic research will spur development of genetic tests that are capable of presymptomatic prediction of disease, disability, and premature death in presently asymptomatic individuals. Concerns have been expressed about potential harms related to the use of genetic test results, especially loss of confidentiality, eugenics, and discrimination. Existing laws and administrative policies may not be sufficient to assure that genetic information is used fairly. To provide factual information and conceptual principles upon which sound social policy can be based, the Human Genome Initiative established an Ethical, Legal, and Social Issues Program. Among the first areas to be identified as a priority for study was insurance. This paper provides a review of life, health, and disability insurance systems, including basic principles, risk classification, and market and regulatory issues, and examines the potential impact of genetic information on the insurance industry. PMID:8447322
Pourabed, Ehsan; Ghane Golmohamadi, Farzan; Soleymani Monfared, Peyman; Razavi, Seyed Morteza; Shobbar, Zahra-Sadat
2015-01-01
The basic leucine zipper (bZIP) family is one of the largest and most diverse transcription factor families in eukaryotes, participating in many essential plant processes. We identified 141 bZIP proteins encoded by 89 genes from the Hordeum vulgare genome. HvbZIPs were classified into 11 groups based on their DNA-binding motif. Amino acid sequence alignment of the HvbZIP basic-hinge regions revealed some highly conserved residues within each group. The leucine zipper heptads were analyzed to predict their dimerization properties. Thirty-four conserved motifs were identified outside the bZIP domain. Phylogenetic analysis indicated that major diversification within the bZIP family predated the monocot/dicot divergence, although intra-species duplication and parallel evolution seem to have occurred afterward. Localization of HvbZIPs on the barley chromosomes revealed that different groups have been distributed on seven chromosomes of barley. Six types of intron pattern were detected within the basic-hinge regions. Most of the detected cis-elements in the promoter and UTR sequences were involved in seed development or abiotic stress response. Microarray data analysis revealed differential expression patterns of HvbZIPs in response to ABA treatment, drought, and cold stresses and during barley grain development and germination. This information would be helpful for functional characterization of bZIP transcription factors in barley.
Poland, Jesse
2015-04-01
The revolution of inexpensive sequencing has ushered in an unprecedented age of genomics. The promise of using this technology to accelerate plant breeding is being realized with a vision of genomics-assisted breeding that will lead to rapid genetic gain for expensive and difficult traits. The reality is now that robust phenotypic data are an increasingly limiting resource to complement the current wealth of genomic information. While genomics has been hailed as the discipline to fundamentally change the scope of plant breeding, a more symbiotic relationship is likely to emerge. In the context of developing and evaluating large populations needed for functional genomics, none excel in this area more than plant breeders. While genetic studies have long relied on dedicated, well-structured populations, the resources dedicated to these populations in the context of readily available, inexpensive genotyping are making this philosophy less tractable relative to directly focusing functional genomics on material in breeding programs. By shifting effort for basic genomic studies from dedicated structured populations to capturing the entire scope of genetic determinants in breeding lines, we can move towards not only furthering our understanding of functional genomics in plants, but also rapidly improving crops for increased food security, availability and nutrition. Copyright © 2015 Elsevier Ltd. All rights reserved.
Toledo, Rodrigo A.; Sekiya, Tomoko; Longuini, Viviane C.; Coutinho, Flavia L.; Lourenço, Delmar M.; Toledo, Sergio P. A.
2012-01-01
The finished version of the human genome sequence was completed in 2003, and this event initiated a revolution in medical practice, which is usually referred to as the age of genomic or personalized medicine. Genomic medicine aims to be predictive, personalized, preventive, and also participative (4Ps). It offers a new approach to several pathological conditions, although its impact so far has been more evident in mendelian diseases. This article briefly reviews the potential advantages of this approach, and also some issues that may arise in the attempt to apply the accumulated knowledge from genomic medicine to clinical practice in emerging countries. The advantages of applying genomic medicine into clinical practice are obvious, enabling prediction, prevention, and early diagnosis and treatment of several genetic disorders. However, there are also some issues, such as those related to: (a) the need for approval of a law equivalent to the Genetic Information Nondiscrimination Act, which was approved in 2008 in the USA; (b) the need for private and public funding for genetics and genomics; (c) the need for development of innovative healthcare systems that may substantially cut costs (e.g. costs of periodic medical follow-up); (d) the need for new graduate and postgraduate curricula in which genomic medicine is emphasized; and (e) the need to adequately inform the population and possible consumers of genetic testing, with reference to the basic aspects of genomic medicine. PMID:22584698
Getting a head start: the importance of personal genetics education in high schools.
Kung, Johnny T; Gelbart, Marnie E
2012-03-01
With advances in sequencing technology, widespread and affordable genome sequencing will soon be a reality. However, studies suggest that "genetic literacy" of the general public is inadequate to prepare our society for this unprecedented access to our genetic information. As the current generation of high school students will come of age in an era when personal genetic information is increasingly utilized in health care, it is of vital importance to ensure these students understand the genetic concepts necessary to make informed medical decisions. These concepts include not only basic scientific knowledge, but also considerations of the ethical, legal, and social issues that will arise in the age of personal genomics. In this article, we review the current state of genetics education, highlight issues that we believe need to be addressed in a comprehensive genetics education curriculum, and describe our education efforts at the Harvard Medical School-based Personal Genetics Education Project.
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.
Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki
2014-09-01
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
Applications of statistical physics and information theory to the analysis of DNA sequences
NASA Astrophysics Data System (ADS)
Grosse, Ivo
2000-10-01
DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
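The mutual information function mentioned in the abstract above can be illustrated with a short sketch. This is a generic plug-in estimator of the mutual information between nucleotides k positions apart, written for illustration only; it is not the author's code, and the function names `mutual_information` and `mi_profile` are assumptions of this example.

```python
import math
from collections import Counter
from typing import List


def mutual_information(seq: str, k: int) -> float:
    """Estimate the mutual information (in bits) between symbols k apart."""
    # All pairs (symbol at i, symbol at i + k) observed in the sequence.
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


def mi_profile(seq: str, max_k: int) -> List[float]:
    """Mutual information for every distance k = 1 .. max_k."""
    return [mutual_information(seq, k) for k in range(1, max_k + 1)]
```

Calling `mi_profile(sequence, 12)` on a long stretch of DNA and comparing the values at k = 3, 6, 9 with their neighbours is one simple way to look for the period-three pattern the abstract describes. Short sequences give noisy, upward-biased estimates, so any such comparison needs long inputs.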
Genome of Drosophila suzukii, the Spotted Wing Drosophila
Chiu, Joanna C.; Jiang, Xuanting; Zhao, Li; Hamm, Christopher A.; Cridland, Julie M.; Saelao, Perot; Hamby, Kelly A.; Lee, Ernest K.; Kwok, Rosanna S.; Zhang, Guojie; Zalom, Frank G.; Walton, Vaughn M.; Begun, David J.
2013-01-01
Drosophila suzukii Matsumura (spotted wing drosophila) has recently become a serious pest of a wide variety of fruit crops in the United States as well as in Europe, leading to substantial yearly crop losses. To enable basic and applied research of this important pest, we sequenced the D. suzukii genome to obtain a high-quality reference sequence. Here, we discuss the basic properties of the genome and transcriptome and describe patterns of genome evolution in D. suzukii and its close relatives. Our analyses and genome annotations are presented in a web portal, SpottedWingFlyBase, to facilitate public access. PMID:24142924
Immunogenetics of the Elephant Seal
NASA Technical Reports Server (NTRS)
Garza, John Carlos
1999-01-01
The goals of this cooperative agreement fall into three categories: 1) A basic description of immunogenetic variation in the northern elephant seal genome; 2) A basic genetic map of the northern elephant seal genome; 3) Microevolutionary forces in the northern elephant seal genome. The results described in this report were acquired using funds from this cooperative agreement together with funds from a National Science Foundation Dissertation Improvement Grant.
Genome-wide identification and analysis of the chicken basic helix-loop-helix factors.
Liu, Wu-Yi; Zhao, Chun-Jiang
2010-01-01
Members of the basic helix-loop-helix (bHLH) family of transcription factors play important roles in a wide range of developmental processes. In this study, we conducted a genome-wide survey using the chicken (Gallus gallus) genomic database, and identified 104 bHLH sequences belonging to 42 gene families in an effort to characterize the chicken bHLH transcription factor family. Phylogenetic analyses revealed that chicken has 50, 21, 15, 4, 8, and 3 bHLH members in groups A, B, C, D, E, and F, respectively, while three members belonging to none of these groups were classified as "orphans". A comparison between chicken and human bHLH repertoires suggested that both organisms have a number of lineage-specific bHLH members in their proteomes. Chromosome distribution patterns and phylogenetic analyses strongly suggest that the bHLH members should have arisen through gene duplication at an early date. Gene Ontology (GO) enrichment statistics showed 51 top GO annotations of biological processes counted in the frequency. The present study deepens our understanding of the chicken bHLH transcription factor family and provides much useful information for further studies using chicken as a model system.
PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses
Purcell, Shaun; Neale, Benjamin; Todd-Brown, Kathe; Thomas, Lori; Ferreira, Manuel A. R.; Bender, David; Maller, Julian; Sklar, Pamela; de Bakker, Paul I. W.; Daly, Mark J.; Sham, Pak C.
2007-01-01
Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis. PMID:17701901
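The identity-by-state (IBS) idea mentioned in the abstract above can be made concrete with a small sketch. It assumes genotypes coded as the number of copies of one allele (0, 1 or 2) with `None` for missing calls, and it uses the common definition of IBS as the mean proportion of alleles shared per locus. The function name `ibs_similarity` is hypothetical, and this is not PLINK's own implementation.

```python
from typing import Optional, Sequence


def ibs_similarity(g1: Sequence[Optional[int]],
                   g2: Sequence[Optional[int]]) -> float:
    """Mean proportion of alleles shared identical by state (0.0 to 1.0)."""
    shared = 0
    counted = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip loci where either genotype is missing
        # Two diploid genotypes share 2, 1 or 0 alleles by state.
        shared += 2 - abs(a - b)
        counted += 1
    if counted == 0:
        raise ValueError("no loci with both genotypes present")
    return shared / (2 * counted)


# Toy example with four loci; the last one is missing in the first sample.
print(ibs_similarity([0, 1, 2, None], [0, 2, 0, 1]))
```

Computing this value for every pair of individuals gives a similarity matrix of the kind that can feed clustering for population stratification. Estimating identity by descent from IBS requires allele frequencies and a statistical model, which this sketch does not attempt.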
Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F.; Espinoza-Rojas, Daniela A.; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G.; Figueroa, Jaime E.; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J.
2017-01-01
Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remains lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established.
The obtained data suggest that these genes could be directly associated with inter-genogroup differences in pathogenesis and host-pathogen interactions, information that could be useful in designing novel strategies for diagnosing and controlling P. salmonis infection. PMID:29164068
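The pan-genome and core-genome counts in this abstract can be read as set operations over per-strain gene-family sets. The following is a minimal sketch of that idea, not the authors' pipeline: the strain names, the gene-family labels, and the definition of "exclusive" used here (present in every strain of one group and in no strain outside it) are assumptions made for illustration.

```python
def pan_and_core(strain_genes):
    """Return (pan_genome, core_genome) for a mapping strain -> set of gene families."""
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)          # families seen in at least one strain
    core = set.intersection(*gene_sets)    # families seen in every strain
    return pan, core


def exclusive_to_group(strain_genes, group):
    """Families shared by all strains in `group` and absent from all other strains.

    This definition of "exclusive" is an assumption, not taken from the abstract.
    """
    inside = [set(strain_genes[s]) for s in group]
    outside = [set(g) for s, g in strain_genes.items() if s not in group]
    shared = set.intersection(*inside)
    elsewhere = set().union(*outside) if outside else set()
    return shared - elsewhere


# Hypothetical toy data: two LF-like strains and one EM-like strain.
strains = {
    "LF_a": {"g1", "g2", "g3"},
    "LF_b": {"g1", "g2", "g4"},
    "EM_a": {"g1", "g5"},
}

pan, core = pan_and_core(strains)
lf_only = exclusive_to_group(strains, ["LF_a", "LF_b"])
print(len(pan), len(core), sorted(lf_only))
```

In a real analysis the gene families would come from an ortholog-clustering step, and the result depends heavily on how that clustering is done.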
Genomes as geography: using GIS technology to build interactive genome feature maps
Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J
2006-01-01
Background Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. 
Conclusion Mapping tools developed specifically for geographic data can be exploited to display, explore and interact with genome data. The approach we describe here is organism independent and is equally useful for linear and circular chromosomes. One of the unique capabilities of GenoSIS compared to existing genome browsers is the capacity to generate genome feature maps dynamically in response to complex attribute and spatial queries. PMID:16984652
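The combined attribute and spatial queries described for GenoSIS can be illustrated with a small interval-overlap filter. This is only a sketch of the query concept in plain Python. It does not use ArcGIS or any GenoSIS API, and the `Feature` class, its fields, and the sample features are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    kind: str   # attribute used for filtering, e.g. "gene" or "marker"


def overlaps(feature, chrom, start, end):
    """True if the feature shares at least one base with chrom:start-end."""
    return feature.chrom == chrom and feature.start <= end and start <= feature.end


def query(features, chrom, start, end, kind=None):
    """Spatial query (overlap with a region), optionally combined with an attribute filter."""
    return [
        f for f in features
        if overlaps(f, chrom, start, end) and (kind is None or f.kind == kind)
    ]


# Hypothetical features on two chromosomes.
features = [
    Feature("geneA", "chr1", 100, 500, "gene"),
    Feature("markerB", "chr1", 450, 460, "marker"),
    Feature("geneC", "chr1", 900, 1200, "gene"),
    Feature("geneD", "chr2", 100, 500, "gene"),
]

genes_in_region = [f.name for f in query(features, "chr1", 400, 1000, kind="gene")]
all_in_region = [f.name for f in query(features, "chr1", 400, 1000)]
print(genes_in_region, all_in_region)
```

A linear scan is enough for a toy list. A GIS or a genome database would normally use a spatial index for the same kind of query.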
Genomics and privacy: implications of the new reality of closed data for the field.
Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark
2011-12-01
Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches-for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. 
However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums. © 2011 Greenbaum et al.
NASA Astrophysics Data System (ADS)
Agung, Muhammad Budi; Budiarsa, I. Made; Suwastika, I. Nengah
2017-02-01
Cocoa beans are one of Indonesia's main export commodities, but production still suffers from yield degradation due to pathogen and disease attack. Developing robust cacao plants that are genetically resistant to pathogen and disease attack is an ideal solution to this problem. The aim of this study was to identify, through in silico analysis, Theobroma cacao genes in the cacao genome database that are homologous to genes involved in the response to pathogen and disease attack in other plants. The basic information survey and gene identification were performed in GenBank and The Arabidopsis Information Resource database. The in silico analysis comprised protein BLAST, a homology test of each gene's protein candidates, and identification of homologous genes in the Cacao Genome Database using the "Theobroma cacao cv. Matina 1-6 v1.1" genome as the data source. The analysis identified Thecc1EG011959t1 (EDS1), Thecc1EG006803t1 (EDS5), Thecc1EG013842t1 (ICS1), and Thecc1EG015614t1 (BG_PPAP) in the Cacao Genome Database as Theobroma cacao genes homologous to plant resistance genes; these genes are highly likely to have functions similar to those of their respective homologues.
Minimum Information about a Genotyping Experiment (MIGEN)
Huang, Jie; Mirel, Daniel; Pugh, Elizabeth; Xing, Chao; Robinson, Peter N.; Pertsemlidis, Alexander; Ding, LiangHao; Kozlitina, Julia; Maher, Joseph; Rios, Jonathan; Story, Michael; Marthandan, Nishanth; Scheuermann, Richard H.
2011-01-01
Genotyping experiments are widely used in clinical and basic research laboratories to identify associations between genetic variations and normal/abnormal phenotypes. Genotyping assay techniques vary from single genomic regions that are interrogated using PCR reactions to high throughput assays examining genome-wide sequence and structural variation. The resulting genotype data may include millions of markers for thousands of individuals, requiring various statistical, modeling or other data analysis methodologies to interpret the results. To date, there are no standards for reporting genotyping experiments. Here we present the Minimum Information about a Genotyping Experiment (MIGen) standard, defining the minimum information required for reporting genotyping experiments. The MIGen standard covers experimental design, subject description, genotyping procedure, quality control and data analysis. MIGen is a registered project under MIBBI (Minimum Information for Biological and Biomedical Investigations) and is being developed by an interdisciplinary group of experts in basic biomedical science, clinical science, biostatistics and bioinformatics. To accommodate the wide variety of techniques and methodologies applied in current and future genotyping experiments, MIGen leverages foundational concepts from the Ontology for Biomedical Investigations (OBI) for the description of the various types of planned processes and implements a hierarchical document structure. The adoption of MIGen by the research community will facilitate consistent genotyping data interpretation and independent data validation. MIGen can also serve as a framework for the development of data models for capturing and storing genotyping results and experiment metadata in a structured way, to facilitate the exchange of metadata. PMID:22180825
Genomics Portals: integrative web-platform for mining genomics data.
Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario
2010-01-13
A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data
2010-01-01
Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Generation of Infectious Poliovirus with Altered Genetic Information from Cloned cDNA.
Bujaki, Erika
2016-01-01
The effect of specific genetic alterations on virus biology and phenotype can be studied by a great number of available assays. The following method describes the basic protocol to generate infectious poliovirus with altered genetic information from cloned cDNA in cultured cells. The example explained here involves generation of a recombinant poliovirus genome by simply replacing a portion of the 5' noncoding region with a synthetic gene by restriction cloning. The vector containing the full length poliovirus genome and the insert DNA with the known mutation(s) are cleaved for directional cloning, then ligated and transformed into competent bacteria. The recombinant plasmid DNA is then propagated in bacteria and transcribed to RNA in vitro before RNA transfection of cultured cells is performed. Finally, viral particles are recovered from the cell culture.
Producing genome structure populations with the dynamic and automated PGS software.
Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank
2018-05-01
Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.
Musunuru, Kiran; Arora, Pankaj; Cooke, John P; Ferguson, Jane F; Hershberger, Ray E; Hickey, Kathleen T; Lee, Jin-Moo; Lima, João A C; Loscalzo, Joseph; Pereira, Naveen L; Russell, Mark W; Shah, Svati H; Sheikh, Farah; Wang, Thomas J; MacRae, Calum A
2018-06-01
The completion of the Human Genome Project has unleashed a wealth of human genomics information, but it remains unclear how best to implement this information for the benefit of patients. The standard approach of biomedical research, with researchers pursuing advances in knowledge in the laboratory and, separately, clinicians translating research findings into the clinic as much as decades later, will need to give way to new interdisciplinary models for research in genomic medicine. These models should include scientists and clinicians actively working as teams to study patients and populations recruited in clinical settings and communities to make genomics discoveries-through the combined efforts of data scientists, clinical researchers, epidemiologists, and basic scientists-and to rapidly apply these discoveries in the clinic for the prediction, prevention, diagnosis, prognosis, and treatment of cardiovascular diseases and stroke. The highly publicized US Precision Medicine Initiative, also known as All of Us, is a large-scale program funded by the US National Institutes of Health that will energize these efforts, but several ongoing studies such as the UK Biobank Initiative; the Million Veteran Program; the Electronic Medical Records and Genomics Network; the Kaiser Permanente Research Program on Genes, Environment and Health; and the DiscovEHR collaboration are already providing exemplary models of this kind of interdisciplinary work. In this statement, we outline the opportunities and challenges in broadly implementing new interdisciplinary models in academic medical centers and community settings and bringing the promise of genomics to fruition. © 2018 American Heart Association, Inc.
Yabe, Shiori; Hara, Takashi; Ueno, Mariko; Enoki, Hiroyuki; Kimura, Tatsuro; Nishimura, Satoru; Yasui, Yasuo; Ohsawa, Ryo; Iwata, Hiroyoshi
2014-01-01
For genetic studies and genomics-assisted breeding, particularly of minor crops, a genotyping system that does not require a priori genomic information is preferable. Here, we demonstrated the potential of a novel array-based genotyping system for the rapid construction of high-density linkage map and quantitative trait loci (QTL) mapping. By using the system, we successfully constructed an accurate, high-density linkage map for common buckwheat (Fagopyrum esculentum Moench); the map was composed of 756 loci and included 8,884 markers. The number of linkage groups converged to eight, which is the basic number of chromosomes in common buckwheat. The sizes of the linkage groups of the P1 and P2 maps were 773.8 and 800.4 cM, respectively. The average interval between adjacent loci was 2.13 cM. The linkage map constructed here will be useful for the analysis of other common buckwheat populations. We also performed QTL mapping for main stem length and detected four QTL. It took 37 days to process 178 samples from DNA extraction to genotyping, indicating the system enables genotyping of genome-wide markers for a few hundred buckwheat plants before the plants mature. The novel system will be useful for genomics-assisted breeding in minor crops without a priori genomic information. PMID:25914583
Yabe, Shiori; Hara, Takashi; Ueno, Mariko; Enoki, Hiroyuki; Kimura, Tatsuro; Nishimura, Satoru; Yasui, Yasuo; Ohsawa, Ryo; Iwata, Hiroyoshi
2014-12-01
For genetic studies and genomics-assisted breeding, particularly of minor crops, a genotyping system that does not require a priori genomic information is preferable. Here, we demonstrated the potential of a novel array-based genotyping system for the rapid construction of high-density linkage map and quantitative trait loci (QTL) mapping. By using the system, we successfully constructed an accurate, high-density linkage map for common buckwheat (Fagopyrum esculentum Moench); the map was composed of 756 loci and included 8,884 markers. The number of linkage groups converged to eight, which is the basic number of chromosomes in common buckwheat. The sizes of the linkage groups of the P1 and P2 maps were 773.8 and 800.4 cM, respectively. The average interval between adjacent loci was 2.13 cM. The linkage map constructed here will be useful for the analysis of other common buckwheat populations. We also performed QTL mapping for main stem length and detected four QTL. It took 37 days to process 178 samples from DNA extraction to genotyping, indicating the system enables genotyping of genome-wide markers for a few hundred buckwheat plants before the plants mature. The novel system will be useful for genomics-assisted breeding in minor crops without a priori genomic information.
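The abstract does not say how the 2.13 cM average interval was computed. One reading that reproduces it, offered here only as an assumption, is that the 756 loci are spread over the two parental maps (P1 and P2) with eight linkage groups each, so the number of adjacent-locus intervals is the locus count minus the number of linkage groups. The arithmetic can be checked as follows.

```python
# Figures taken from the abstract.
p1_length_cm = 773.8
p2_length_cm = 800.4
total_loci = 756
linkage_groups_per_map = 8

# Assumption (not stated in the abstract): loci are counted across both
# parental maps, and each linkage group with n loci contributes n - 1 intervals.
number_of_maps = 2
intervals = total_loci - linkage_groups_per_map * number_of_maps

average_interval_cm = (p1_length_cm + p2_length_cm) / intervals
print(intervals, round(average_interval_cm, 2))
```

Under this reading there are 740 intervals over 1,574.2 cM, which rounds to the reported 2.13 cM. Other ways of counting loci could give the same rounded value, so this is a consistency check rather than a reconstruction of the authors' method.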
Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D.
Matsuzaki, Motomichi; Misumi, Osami; Shin-I, Tadasu; Maruyama, Shinichiro; Takahara, Manabu; Miyagishima, Shin-Ya; Mori, Toshiyuki; Nishida, Keiji; Yagisawa, Fumi; Nishida, Keishin; Yoshida, Yamato; Nishimura, Yoshiki; Nakao, Shunsuke; Kobayashi, Tamaki; Momoyama, Yu; Higashiyama, Tetsuya; Minoda, Ayumi; Sano, Masako; Nomoto, Hisayo; Oishi, Kazuko; Hayashi, Hiroko; Ohta, Fumiko; Nishizaka, Satoko; Haga, Shinobu; Miura, Sachiko; Morishita, Tomomi; Kabeya, Yukihiro; Terasawa, Kimihiro; Suzuki, Yutaka; Ishii, Yasuyuki; Asakawa, Shuichi; Takano, Hiroyoshi; Ohta, Niji; Kuroiwa, Haruko; Tanaka, Kan; Shimizu, Nobuyoshi; Sugano, Sumio; Sato, Naoki; Nozaki, Hisayoshi; Ogasawara, Naotake; Kohara, Yuji; Kuroiwa, Tsuneyoshi
2004-04-08
Small, compact genomes of ultrasmall unicellular algae provide information on the basic and essential genes that support the lives of photosynthetic eukaryotes, including higher plants. Here we report the 16,520,305-base-pair sequence of the 20 chromosomes of the unicellular red alga Cyanidioschyzon merolae 10D as the first complete algal genome. We identified 5,331 genes in total, of which at least 86.3% were expressed. Unique characteristics of this genomic structure include: a lack of introns in all but 26 genes; only three copies of ribosomal DNA units that maintain the nucleolus; and two dynamin genes that are involved only in the division of mitochondria and plastids. The conserved mosaic origin of Calvin cycle enzymes in this red alga and in green plants supports the hypothesis of the existence of single primary plastid endosymbiosis. The lack of a myosin gene, in addition to the unexpressed actin gene, suggests a simpler system of cytokinesis. These results indicate that the C. merolae genome provides a model system with a simple gene composition for studying the origin, evolution and fundamental mechanisms of eukaryotic cells.
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
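The two uses named in this abstract, generating a sequence library subset and storing similarity search results for later analysis, can be sketched with an in-memory SQLite database. The schema, table names, accessions, and sequences below are hypothetical and are not those of seqdb_demo. The sketch only shows the general pattern.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            acc   TEXT PRIMARY KEY,
            taxon TEXT NOT NULL,
            seq   TEXT NOT NULL
        );
        CREATE TABLE hit (
            query_acc   TEXT NOT NULL,
            subject_acc TEXT NOT NULL REFERENCES protein(acc),
            evalue      REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [("P1", "E. coli", "MKT"), ("P2", "E. coli", "MAL"), ("P3", "S. cerevisiae", "MSE")],
    )
    conn.executemany(
        "INSERT INTO hit VALUES (?, ?, ?)",
        [("Q1", "P1", 1e-30), ("Q1", "P3", 1e-5), ("Q2", "P2", 0.5)],
    )
    return conn


def library_subset(conn, taxon):
    """Return a FASTA-formatted sequence library restricted to one taxon."""
    rows = conn.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc", (taxon,)
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


def hits_per_taxon(conn, max_evalue):
    """Count stored search hits at or below an E-value threshold, grouped by subject taxon."""
    return conn.execute(
        """
        SELECT p.taxon, COUNT(*)
        FROM hit AS h JOIN protein AS p ON p.acc = h.subject_acc
        WHERE h.evalue <= ?
        GROUP BY p.taxon
        ORDER BY p.taxon
        """,
        (max_evalue,),
    ).fetchall()


conn = build_demo_db()
print(library_subset(conn, "E. coli"))
print(hits_per_taxon(conn, 1e-3))
```

Searching the smaller subset rather than the full library is what the abstract credits with improving statistical significance, because fewer unrelated sequences are compared.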
Wang, Zhihui; Cheng, Ke; Wan, Liyun; Yan, Liying; Jiang, Huifang; Liu, Shengyi; Lei, Yong; Liao, Boshou
2015-12-10
Plant bZIP proteins characteristically harbor a highly conserved bZIP domain with two structural features: a DNA-binding basic region and a leucine (Leu) zipper dimerization region. They have been shown to be diverse transcriptional regulators, playing crucial roles in plant development, physiological processes, and biotic/abiotic stress responses. Despite the availability of six completely sequenced legume genomes, a comprehensive investigation of bZIP family members in legumes has yet to be presented. In this study, we identified 428 bZIP genes encoding 585 distinct proteins in six legumes, Glycine max, Medicago truncatula, Phaseolus vulgaris, Cicer arietinum, Cajanus cajan, and Lotus japonicus. The legume bZIP genes were categorized into 11 groups according to their phylogenetic relationships with genes from Arabidopsis. Four kinds of intron patterns (a-d) within the basic and hinge regions were defined and additional conserved motifs were identified, both presenting high group specificity and supporting the group classification. We predicted the DNA-binding patterns and the dimerization properties, based on the characteristic features in the basic and hinge regions and the Leu zipper, respectively, which indicated that some highly conserved amino acid residues existed across each major group. The chromosome distribution and analysis for WGD-derived duplicated blocks revealed that the legume bZIP genes have expanded mainly by segmental duplication rather than tandem duplication. Expression data further revealed that the legume bZIP genes were expressed constitutively or in an organ-specific, development-dependent manner playing roles in multiple seed developmental stages and tissues. We also detected several key legume bZIP genes involved in drought- and salt-responses by comparing fold changes of expression values in drought-stressed or salt-stressed roots and leaves. 
In summary, this genome-wide identification, characterization and expression analysis of legume bZIP genes provides valuable information for understanding the molecular functions and evolution of the legume bZIP transcription factor family, and highlights potential legume bZIP genes involved in regulating tissue development and abiotic stress responses.
Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.
Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue
2013-03-01
The human gut microbial ecosystem (HGME) exerts an important influence on human health. In recent research, meta-genomics has provided deep insights into the HGME in terms of the gene contents, metabolic processes and genome constitutions of the meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes, regardless of their genome origins, by considering the co-evolution properties of genes. By analyzing these coupled genes, we showed that some basic properties of the HGME are significantly associated with each other, and further constructed a protein interaction map of the human gut meta-genome to discover functional modules that may relate to essential metabolic processes. Compared with other studies, our method offers a new way to extract basic functional elements from meta-genome systems and to investigate complex microbial environments by associating their biological traits with the co-evolutionary fingerprints encoded in them.
Rapid identification of kidney cyst mutations by whole exome sequencing in zebrafish
Ryan, Sean; Willer, Jason; Marjoram, Lindsay; Bagwell, Jennifer; Mankiewicz, Jamie; Leshchiner, Ignaty; Goessling, Wolfram; Bagnat, Michel; Katsanis, Nicholas
2013-01-01
Forward genetic approaches in zebrafish have provided invaluable information about developmental processes. However, the relative difficulty of mapping and isolating mutations has limited the number of new genetic screens. Recent improvements in the annotation of the zebrafish genome coupled to a reduction in sequencing costs prompted the development of whole genome and RNA sequencing approaches for gene discovery. Here we describe a whole exome sequencing (WES) approach that allows rapid and cost-effective identification of mutations. We used our WES methodology to isolate four mutations that cause kidney cysts; we identified novel alleles in two ciliary genes as well as two novel mutants. The WES approach described here does not require specialized infrastructure or training and is therefore widely accessible. This methodology should thus help facilitate genetic screens and expedite the identification of mutants that can inform basic biological processes and the causality of genetic disorders in humans. PMID:24130329
Genetic and Genomic Toolbox of Zea mays
Nannas, Natalie J.; Dawe, R. Kelly
2015-01-01
Maize has a long history of genetic and genomic tool development and is considered one of the most accessible higher plant systems. With a fully sequenced genome, a suite of cytogenetic tools, methods for both forward and reverse genetics, and characterized phenotype markers, maize is amenable to studying questions beyond plant biology. Major discoveries in the areas of transposons, imprinting, and chromosome biology came from work in maize. Moving forward in the post-genomic era, this classic model system will continue to be at the forefront of basic biological study. In this review, we outline the basics of working with maize and describe its rich genetic toolbox. PMID:25740912
A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.
Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu
2018-05-09
Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.
Abu-Elmagd, Muhammad; Assidi, Mourad; Dallol, Ashraf; Buhmeida, Abdelbaset; Pushparaj, Peter Natesan; Kalamegam, Gauthaman; Al-Hamzi, Emad; Shay, Jerry W; Scherer, Stephen W; Agarwal, Ashok; Budowle, Bruce; Gari, Mamdooh; Chaudhary, Adeel; Abuzenadah, Adel; Al-Qahtani, Mohammed
2016-10-17
The Third International Genomic Medicine Conference (3rd IGMC) was organised by the Centre of Excellence in Genomic Medicine Research (CEGMR) at the King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia (KSA). This conference is a continuation of a series of meetings, which began with the first International Genomic Medicine Conference (1st IGMC, 2011) followed by the second International Genomic Medicine Conference (2nd IGMC, 2013). The 3rd IGMC meeting presented a timely opportunity to bring together scientists from across the world to discuss and exchange recent advances in the field of genomics and genetics in general, as well as practical information on using these new technologies in different basic and clinical applications. The meeting undoubtedly inspired young male and female Saudi researchers, who attended the conference in large numbers, as evidenced by the oversubscribed oral and poster presentations. The conference also witnessed the launch of the first content for npj Genomic Medicine, a high-quality new journal established by CEGMR in partnership with Springer Nature and published as part of the Nature Partner Journal series. Here, we present a brief summary report of the 2-day meeting including highlights from the oral presentations, poster presentations, workshops, poster prize-winners and comments from the distinguished scientists.
Fungal genome sequencing: basic biology to biotechnology.
Sharma, Krishna Kant
2016-08-01
The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.
Sandusky, George E; Teheny, Katie Heinz; Esterman, Mike; Hanson, Jeff; Williams, Stephen D
2007-01-01
The success of molecular research and its applications in both the clinical and basic research arenas is strongly dependent on the collection, handling, storage, and quality control of fresh human tissue samples. This tissue bank was set up to bank fresh surgically obtained human tissue using a Clinical Annotated Tissue Database (CATD) in order to capture the associated patient clinical data and demographics using a one-way patient encryption scheme to protect patient identification. In this study, we determined that high quality of tissue samples is imperative for both genomic and proteomic molecular research. This paper also contains a brief compilation of the literature involved in the patient ethics, patient informed consent, patient de-identification, tissue collection, processing, and storage as well as basic molecular research generated from the tissue bank using good clinical practices. The current applicable rules, regulations, and guidelines for handling human tissues are briefly discussed. More than 6,610 cancer patients have been consented (97% of those that were contacted by the consenter) and 16,800 tissue specimens have been banked from these patients in 9 years. All samples collected in the bank were QC'd by a pathologist. Approximately 1,550 tissue samples have been requested for use in basic, clinical, and/or biomarker cancer research studies. Each tissue aliquot removed from the bank for a research study was evaluated by a second H&E; if the samples passed the QC, they were submitted for genomic and proteomic molecular analysis/study. Approximately 75% of samples evaluated were of high histologic quality and used for research studies. Since 2003, we changed the patient informed consent to allow the tissue bank to gather more patient clinical follow-up information. Ninety-two percent of the patients (1,865 patients) signed the new informed consent form and agreed to be re-contacted for follow-up information on their disease state. In addition, eighty-five percent of patients (1,584) agreed to be re-contacted to provide a biological fluid sample to be used for biomarker research.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research.
Manolio, Teri A; Fowler, Douglas M; Starita, Lea M; Haendel, Melissa A; MacArthur, Daniel G; Biesecker, Leslie G; Worthey, Elizabeth; Chisholm, Rex L; Green, Eric D; Jacob, Howard J; McLeod, Howard L; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S; Cooper, Gregory M; Cox, Nancy J; Herman, Gail E; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A; Nussbaum, Robert L; Ordovas, Jose M; Ramos, Erin M; Robinson, Peter N; Rubinstein, Wendy S; Seidman, Christine; Stranger, Barbara E; Wang, Haoyi; Westerfield, Monte; Bult, Carol
2017-03-23
Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. Published by Elsevier Inc.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research
Manolio, Teri A.; Fowler, Douglas M.; Starita, Lea M.; Haendel, Melissa A.; MacArthur, Daniel G.; Biesecker, Leslie G.; Worthey, Elizabeth; Chisholm, Rex L.; Green, Eric D.; Jacob, Howard J.; McLeod, Howard L.; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S.; Cooper, Gregory M.; Cox, Nancy J.; Herman, Gail E.; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A.; Nussbaum, Robert L.; Ordovas, Jose M.; Ramos, Erin M.; Robinson, Peter N.; Rubinstein, Wendy S.; Seidman, Christine; Stranger, Barbara E.; Wang, Haoyi; Westerfield, Monte; Bult, Carol
2017-01-01
Summary Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. PMID:28340351
A genome-wide survey on basic helix-loop-helix transcription factors in giant panda.
Dang, Chunwang; Wang, Yong; Zhang, Debao; Yao, Qin; Chen, Keping
2011-01-01
The giant panda (Ailuropoda melanoleuca) is a critically endangered mammalian species. Studies on functions of regulatory proteins involved in developmental processes would facilitate understanding of specific behavior in giant panda. The basic helix-loop-helix (bHLH) proteins play essential roles in a wide range of developmental processes in higher organisms. bHLH family members have been identified in over 20 organisms, including fruit fly, zebrafish, mouse and human. Our present study identified 107 bHLH family members being encoded in giant panda genome. Phylogenetic analyses revealed that they belong to 44 bHLH families with 46, 25, 15, 4, 11 and 3 members in group A, B, C, D, E and F, respectively, while the remaining 3 members were assigned into "orphan". Compared to mouse, the giant panda does not encode seven bHLH proteins namely Beta3a, Mesp2, Sclerax, S-Myc, Hes5 (or Hes6), EBF4 and Orphan 1. These results provide useful background information for future studies on structure and function of bHLH proteins in the regulation of giant panda development.
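The group sizes quoted in the abstract can be checked against the stated total of 107 members. The short sketch below uses only numbers from the abstract; the variable names are illustrative.

```python
# Group sizes reported in the abstract (groups A-F) plus the "orphan" members.
group_members = {"A": 46, "B": 25, "C": 15, "D": 4, "E": 11, "F": 3}
orphans = 3

grouped = sum(group_members.values())  # members assigned to groups A-F
total = grouped + orphans              # all identified bHLH members
print(grouped, total)                  # 104 107
```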
Genome Editing Redefines Precision Medicine in the Cardiovascular Field
Lahm, Harald; Dreßen, Martina; Lange, Rüdiger; Wu, Sean M.; Krane, Markus
2018-01-01
Genome editing is a powerful tool to study the function of specific genes and proteins important for development or disease. Recent technologies, especially CRISPR/Cas9 which is characterized by convenient handling and high precision, revolutionized the field of genome editing. Such tools have enormous potential for basic science as well as for regenerative medicine. Nevertheless, there are still several hurdles that have to be overcome, but patient-tailored therapies, termed precision medicine, seem to be within reach. In this review, we focus on the achievements and limitations of genome editing in the cardiovascular field. We explore different areas of cardiac research and highlight the most important developments: (1) the potential of genome editing in human pluripotent stem cells in basic research for disease modelling, drug screening, or reprogramming approaches and (2) the potential and remaining challenges of genome editing for regenerative therapies. Finally, we discuss social and ethical implications of these new technologies. PMID:29731778
Organization and integration of biomedical knowledge with concept maps for key peroxisomal pathways.
Willemsen, A M; Jansen, G A; Komen, J C; van Hooff, S; Waterham, H R; Brites, P M T; Wanders, R J A; van Kampen, A H C
2008-08-15
One important area of clinical genomics research involves the elucidation of molecular mechanisms underlying (complex) disorders which eventually may lead to new diagnostic or drug targets. To further advance this area of clinical genomics one of the main challenges is the acquisition and integration of data, information and expert knowledge for specific biomedical domains and diseases. Currently the required information is not very well organized but scattered over biological and biomedical databases, basic text books, scientific literature and experts' minds and may be highly specific, heterogeneous, complex and voluminous. We present a new framework to construct knowledge bases with concept maps for presentation of information and the web ontology language OWL for the representation of information. We demonstrate this framework through the construction of a peroxisomal knowledge base, which focuses on four key peroxisomal pathways and several related genetic disorders. All 155 concept maps in our knowledge base are linked to at least one other concept map, which allows the visualization of one big network of related pieces of information. The peroxisome knowledge base is available from www.bioinformaticslaboratory.nl (Support-->Web applications). Supplementary data is available from www.bioinformaticslaboratory.nl (Research-->Output--> Publications--> KB_SuppInfo)
Maggi, Elaine; Montagna, Cristina
2015-12-01
The American Association for Cancer Research (AACR) Precision Medicine Series "Integrating Clinical Genomics and Cancer Therapy" took place June 13-16, 2015 in Salt Lake City, Utah. The conference was co-chaired by Charles L. Sawyers from Memorial Sloan Kettering Cancer Center in New York, Elaine R. Mardis from Washington University School of Medicine in St. Louis, and Arul M. Chinnaiyan from University of Michigan in Ann Arbor. About 500 clinicians, basic science investigators, bioinformaticians, and postdoctoral fellows joined together to discuss the current state of Clinical Genomics and the advances and challenges of integrating Next Generation Sequencing (NGS) technologies into clinical practice. The plenary sessions and panel discussions covered current platforms and sequencing approaches adopted for NGS assays of cancer genome at several national and international institutions, different approaches used to map and classify targetable sequence variants, and how information acquired with the sequencing of the cancer genome is used to guide treatment options. While challenges still exist from a technological perspective, it emerged that there exists considerable need for the development of tools to aid the identification of the therapy most suitable based on the mutational profile of the somatic cancer genome. The process to match patients to ongoing clinical trials is still complex. In addition, the need for centralized data repositories, preferably linked to well annotated clinical records, that aid sharing of sequencing information is central to begin understanding the contribution of variants of unknown significance to tumor etiology and response to therapy. Here we summarize the highlights of this stimulating four-day conference with a major emphasis on the open problems that the clinical genomics community is currently facing and the tools most needed for advancing this field. Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Identification of the "A" genome of finger millet using chloroplast DNA.
Hilu, K W
1988-01-01
Finger millet (Eleusine coracana subsp. coracana), an important cereal in East Africa and India, is a tetraploid species with unknown genomic components. A recent cytogenetic study confirmed the direct origin of this millet from the tetraploid E. coracana subsp. africana but questioned Eleusine indica as a genomic donor. Chloroplast (ct) DNA sequence analysis using restriction fragment pattern was used to examine the phylogenetic relationships between E. coracana subsp. coracana (domesticated finger millet), E. coracana subsp. africana (wild finger millet), and E. indica. Eleusine tristachya was included since it is the only other annual diploid species in the genus with a basic chromosome number of x = 9 like finger millet. Eight of the ten restriction endonucleases used had 16 to over 30 restriction sites per genome and were informative. E. coracana subsp. coracana and subsp. africana and E. indica were identical in all the restriction sites surveyed, while the ct genome of E. tristachya differed consistently by at least one mutational event for each restriction enzyme surveyed. This random survey of the ct genomes of these species points out E. indica as one of the genome donors (maternal genome donor) of domesticated finger millet contrary to a previous cytogenetic study. The data also substantiate E. coracana subsp. africana as the progenitor of domesticated finger millet. The disparity between the cytogenetic and the molecular approaches is discussed in light of the problems associated with chromosome pairing and polyploidy.
Inda, Márcia A; van Batenburg, Marinus F; Roos, Marco; Belloum, Adam S Z; Vasunin, Dmitry; Wibisono, Adianto; van Kampen, Antoine H C; Breit, Timo M
2008-08-08
Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perform in silico experimentation conveniently with this genomics data, biologists need tools to process and compare datasets routinely and explore the obtained results interactively. The complexity of such experimentation requires these tools to be based on an e-Science approach, hence generic, modular, and reusable. A virtual laboratory environment with workflows, workflow management systems, and Grid computation are therefore essential. Here we apply an e-Science approach to develop SigWin-detector, a workflow-based tool that can detect significantly enriched windows of (genomic) features in a (DNA) sequence in a fast and reproducible way. For proof-of-principle, we utilize a biological use case to detect regions of increased and decreased gene expression (RIDGEs and anti-RIDGEs) in human transcriptome maps. We improved the original method for RIDGE detection by replacing the costly step of estimation by random sampling with a faster analytical formula for computing the distribution of the null hypothesis being tested and by developing a new algorithm for computing moving medians. SigWin-detector was developed using the WS-VLAM workflow management system and consists of several reusable modules that are linked together in a basic workflow. The configuration of this basic workflow can be adapted to satisfy the requirements of the specific in silico experiment. As we show with the results from analyses in the biological use case on RIDGEs, SigWin-detector is an efficient and reusable Grid-based tool for discovering windows enriched for features of a particular type in any sequence of values. 
Thus, SigWin-detector provides the proof-of-principle for the modular e-Science based concept of integrative bioinformatics experimentation.
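As a rough illustration of the moving-median step mentioned in the abstract, the sketch below computes the median of every sliding window over a sequence of values. It is not SigWin-detector's own algorithm, which the abstract does not describe; the function name `moving_median` and the sorted-list approach are assumptions chosen for brevity.

```python
from bisect import bisect_left, insort

def moving_median(values, window):
    """Return the median of each length-`window` slice of `values`."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    current = sorted(values[:window])  # sorted copy of the first window
    mid = window // 2
    medians = []
    for i in range(window, len(values) + 1):
        if window % 2:
            medians.append(current[mid])
        else:
            medians.append((current[mid - 1] + current[mid]) / 2)
        if i == len(values):
            break
        # Slide by one: drop the oldest value, insert the newest, keep order.
        del current[bisect_left(current, values[i - window])]
        insort(current, values[i])
    return medians

print(moving_median([1, 5, 2, 8, 3], 3))  # [2, 5, 3]
```

Keeping the window sorted avoids re-sorting at every position; each slide costs one binary search plus a list insertion and deletion.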
Genome-wide analysis of replication timing by next-generation sequencing with E/L Repli-seq.
Marchal, Claire; Sasaki, Takayo; Vera, Daniel; Wilson, Korey; Sima, Jiao; Rivera-Mulia, Juan Carlos; Trevilla-García, Claudia; Nogues, Coralin; Nafie, Ebtesam; Gilbert, David M
2018-05-01
This protocol is an extension to: Nat. Protoc. 6, 870-895 (2014); doi:10.1038/nprot.2011.328; published online 02 June 2011. Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early- and late-replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and subnuclear position. Moreover, RT is regulated during development and is altered in diseases. Here, we describe E/L Repli-seq, an extension of our Repli-chip protocol. E/L Repli-seq is a rapid, robust and relatively inexpensive protocol for analyzing RT by next-generation sequencing (NGS), allowing genome-wide assessment of how cellular processes are linked to RT. Briefly, cells are pulse-labeled with BrdU, and early and late S-phase fractions are sorted by flow cytometry. Labeled nascent DNA is immunoprecipitated from both fractions and sequenced. Data processing leads to a single bedGraph file containing the ratio of nascent DNA from early versus late S-phase fractions. The results are comparable to those of Repli-chip, with the additional benefits of genome-wide sequence information and an increased dynamic range. We also provide computational pipelines for downstream analyses, for parsing phased genomes using single-nucleotide polymorphisms (SNPs) to analyze RT allelic asynchrony, and for direct comparison to Repli-chip data. This protocol can be performed in up to 3 d before sequencing, and requires basic cellular and molecular biology skills, as well as a basic understanding of Unix and R.
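The final data-processing step, a bedGraph of early versus late signal, can be sketched as follows. The abstract only states that the file holds the early/late ratio; the log2 transform, the pseudocount, the library-size scaling, and the function name `early_late_bedgraph` are assumptions made for illustration and may differ from the published pipeline.

```python
import math

def early_late_bedgraph(bins, early, late, pseudocount=1.0):
    """Build bedGraph lines of log2(early/late) for matching genomic bins.

    bins  : list of (chrom, start, end) tuples
    early : read counts per bin from the early S-phase fraction
    late  : read counts per bin from the late S-phase fraction
    """
    if not (len(bins) == len(early) == len(late)):
        raise ValueError("bins, early and late must have equal length")
    # Scale each fraction by its own total so sequencing depth cancels out.
    e_total = sum(early) + pseudocount * len(early)
    l_total = sum(late) + pseudocount * len(late)
    lines = []
    for (chrom, start, end), e, l in zip(bins, early, late):
        ratio = ((e + pseudocount) / e_total) / ((l + pseudocount) / l_total)
        lines.append(f"{chrom}\t{start}\t{end}\t{math.log2(ratio):.4f}")
    return lines

# Hypothetical counts: the first bin replicates early, the second late.
bins = [("chr1", 0, 50000), ("chr1", 50000, 100000)]
lines = early_late_bedgraph(bins, early=[30, 10], late=[10, 30])
```

Positive values then mark early-replicating bins and negative values late-replicating ones.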
Analysis of Existing International Policy Evidence in Public Health Genomics: Mapping Exercise
Syurina, Elena V.; in den Bäumen, Tobias Schulte; Feron, Frans J.M.; Brand, Angela
2012-01-01
Background In the last decades we have seen a constant growth in the fields of science related to the use of genome-based health information. However, there is a gap between basic science research and the Public Health everyday practice. For a successful introduction of genome-based technologies policy actions on the international level are needed. This work represents the initial stage of the PHGEN II (Public Health Genomics European Network II) project. In order to prepare a base for bridging genomics and Public Health, an inventory study of the existing legislative base dealing with controversies of genome-based knowledge was conducted. The work results in the mapping of the most and the least legislatively covered areas and some preliminary conclusions about the existing gaps. Design and Methods The collection of the evidence-based policies was done through the PHGEN II project. The mapping covered the meta-level (international, European general guidelines). The expert opinion of the partners of the project was required to reflect on and grade the collected evidence. Results An analysis of the evidence was made by the area of coverage: using the list of important policy areas for successful introduction of genome-based technologies into Public Health and the Public Health Genomics Wheel (originally Public Health Wheel developed by Institute of Medicine). Conclusions Severe inequalities in coverage of important issues of Public Health Genomics were found. The most attention was paid to clinical utility and clinical validity of the screening and the protection of human subjects. Important areas such as trade agreements, Public Health Genomics literacy, insurance issues, behaviour modification in response to genomics results etc. were paid less attention to. For the successful adoption of new technologies on the Public Health level the focus should be not only on the translation to clinical practice, but the translation from bench to Public Health policy and back. 
Coherent and consistent coverage of all aspects of the translation of genome-based information and technologies is of utmost importance. PMID:25170444
Genetics in the 21st Century: Implications for patients, consumers and citizens
Roberts, Jonathan; Middleton, Anna
2018-01-01
The first human genome project, completed in 2003, uncovered the genetic building blocks of humankind. Painstakingly cataloguing the basic constituents of our DNA (‘genome sequencing’) took ten years, over three billion dollars and was a multinational collaboration. Since then, our ability to sequence genomes has been finessed so much that by 2018 it is possible to explore the 20,000 or so human genes for under £1000, in a matter of days. Such testing offers clues to our past, present and future health, as well as information about how we respond to medications so that truly ‘personalised medicine’ is now moving closer to a reality. Such a ‘genomic era’ is likely to have some level of impact on an increasingly large number of us, even if we are not directly using healthcare services ourselves. We explore how advancements in genetics are likely to be experienced by people, as patients, consumers and citizens; and urge policy makers to take stock of the pervasive nature of the technology as well as the human response to it. PMID:29259772
Ellis, L B; Hershberger, C D; Wackett, L P
1999-01-01
The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://www.labmed.umn.edu/umbbd/index.html) first became available on the web in 1995 to provide information on microbial biocatalytic reactions of, and biodegradation pathways for, organic chemical compounds, especially those produced by man. Its goal is to become a representative database of biodegradation, spanning the diversity of known microbial metabolic routes, organic functional groups, and environmental conditions under which biodegradation occurs. The database can be used to enhance understanding of basic biochemistry, biocatalysis leading to speciality chemical manufacture, and biodegradation of environmental pollutants. It is also a resource for functional genomics, since it contains information on enzymes and genes involved in specialized metabolism not found in intermediary metabolism databases, and thus can assist in assigning functions to genes homologous to such less common genes. With information on >400 reactions and compounds, it is poised to become a resource for prediction of microbial biodegradation pathways for compounds it does not contain, a process complementary to predicting the functions of new classes of microbial genes. PMID:9847233
A genomewide survey of basic helix–loop–helix factors in Drosophila
Moore, Adrian W.; Barbel, Sandra; Jan, Lily Yeh; Jan, Yuh Nung
2000-01-01
The basic helix–loop–helix (bHLH) transcription factors play important roles in the specification of tissue type during the development of animals. We have used the information contained in the recently published genomic sequence of Drosophila melanogaster to identify 12 additional bHLH proteins. By sequence analysis we have assigned these proteins to families defined by Atonal, Hairy-Enhancer of Split, Hand, p48, Mesp, MYC/USF, and the bHLH-Per, Arnt, Sim (PAS) domain. In addition, one single protein represents a unique family of bHLH proteins. mRNA in situ analysis demonstrates that the genes encoding these proteins are expressed in several tissue types but are particularly concentrated in the developing nervous system and mesoderm. PMID:10973473
Machine learning for Big Data analytics in plants.
Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng
2014-12-01
Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences. Copyright © 2014 Elsevier Ltd. All rights reserved.
Niemiec, Emilia; Borry, Pascal; Pinxten, Wim; Howard, Heidi Carmen
2016-12-01
Whole exome sequencing (WES) and whole genome sequencing (WGS) have become increasingly available in the research and clinical settings and are now also being offered by direct-to-consumer (DTC) genetic testing (GT) companies. This offer can be perceived as amplifying the already identified concerns regarding adequacy of informed consent (IC) for both WES/WGS and the DTC GT context. We performed a qualitative content analysis of Websites of four companies offering WES/WGS DTC regarding the following elements of IC: pre-test counseling, benefits and risks, and incidental findings (IFs). The analysis revealed concerns, including the potential lack of pre-test counseling in three of the companies studied, missing relevant information in the risks and benefits sections, and potentially misleading information for consumers. Regarding IFs, only one company, which provides opportunistic screening, provides basic information about their management. In conclusion, some of the information (and related practices) present on the companies' Web pages salient to the consent process are not adequate in reference to recommendations for IC for WGS or WES in the clinical context. Requisite resources should be allocated to ensure that commercial companies are offering high-throughput sequencing under responsible conditions, including an adequate consent process. © 2016 WILEY PERIODICALS, INC.
Govindaraj, Mahalingam
2015-01-01
The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of science research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, keeps expanding in recent years. The databases and associated web portals provide at a minimum a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with crop plant databases utilization in advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement program is still being achieved widely; otherwise, the end for sequencing is not far away. PMID:25874133
Sequencing and comparative analyses of the genomes of zoysiagrasses
Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei
2016-01-01
Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196
Sequencing and comparative analyses of the genomes of zoysiagrasses.
Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei
2016-04-01
Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica 'Kyoto', Z. japonica 'Miyagi' and Z. matrella 'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Meeting Report: Genomics in the Undergraduate Curriculum--Rocket Science or Basic Science?
ERIC Educational Resources Information Center
Campbell, A. Malcolm
2002-01-01
At the 102nd annual meeting of the American Society for Microbiology (ASM) in Salt Lake City, Utah, members of the Genome Consortium for Active Teaching and faculty from around the world gathered to discuss educational genomics. The focus of the gathering was a series of presentations by faculty who have successfully incorporated genomics and…
The various aspects of genetic and epigenetic toxicology: testing methods and clinical applications.
Ren, Ning; Atyah, Manar; Chen, Wan-Yong; Zhou, Chen-Hao
2017-05-22
Genotoxicity refers to the ability of harmful substances to damage genetic information in cells. Being exposed to chemical and biological agents can result in genomic instabilities and/or epigenetic alterations, which translate into a variety of diseases, cancer included. This concise review discusses, from both a genetic and epigenetic point of view, the current detection methods of different agents' genotoxicity, along with their basic and clinical relation to human cancer, chemotherapy, germ cells and stem cells.
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout
Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo
2015-01-01
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
Identification of the "a" Genome of Finger Millet Using Chloroplast DNA
Hilu, K. W.
1988-01-01
Finger millet (Eleusine coracana subsp. coracana), an important cereal in East Africa and India, is a tetraploid species with unknown genomic components. A recent cytogenetic study confirmed the direct origin of this millet from the tetraploid E. coracana subsp. africana but questioned Eleusine indica as a genomic donor. Chloroplast (ct) DNA sequence analysis using restriction fragment pattern was used to examine the phylogenetic relationships between E. coracana subsp. coracana (domesticated finger millet), E. coracana subsp. africana (wild finger millet), and E. indica. Eleusine tristachya was included since it is the only other annual diploid species in the genus with a basic chromosome number of x = 9 like finger millet. Eight of the ten restriction endonucleases used had 16 to over 30 restriction sites per genome and were informative. E. coracana subsp. coracana and subsp. africana and E. indica were identical in all the restriction sites surveyed, while the ct genome of E. tristachya differed consistently by at least one mutational event for each restriction enzyme surveyed. This random survey of the ct genomes of these species points to E. indica as one of the genome donors (the maternal genome donor) of domesticated finger millet, contrary to a previous cytogenetic study. The data also substantiate E. coracana subsp. africana as the progenitor of domesticated finger millet. The disparity between the cytogenetic and the molecular approaches is discussed in light of the problems associated with chromosome pairing and polyploidy. PMID:8608927
Structural Bioinformatics of the Interactome
Petrey, Donald; Honig, Barry
2014-01-01
The last decade has seen a dramatic expansion in the number and range of techniques available to obtain genome-wide information, and to analyze this information so as to infer both the function of individual molecules and how they interact to modulate the behavior of biological systems. Here we review these techniques, focusing on the construction of physical protein-protein interaction networks, and highlighting approaches that incorporate protein structure, which is becoming an increasingly important component of systems-level computational techniques. We also discuss how network analyses are being applied to enhance the basic understanding of biological systems and their dysregulation, and how they are being applied in drug development. PMID:24895853
Cancer biology and genomics: translating discoveries, transforming pathology.
Ladanyi, Marc; Hogendoorn, Pancras C W
2011-01-01
Advances in our understanding of cancer biology and discoveries emerging from cancer genomics are being translated into real clinical benefits for patients with cancer. The 2011 Journal of Pathology Annual Review Issue provides a snapshot of recent rapid progress on multiple fronts in the war on cancer or, more precisely, the wars on cancers. Indeed, perhaps the most notable recent shift is reflected by the sharp increase in understanding the biology of multiple specific cancers and using these new insights to inform rationally targeted therapies, with often striking successes. These recent developments, as reviewed in this issue, show how the long-term investments in basic cancer research are finally beginning to bear fruit. Copyright © 2010 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
[Manipulation of the human genome: ethics and law].
Goulart, Maria Carolina Vaz; Iano, Flávia Godoy; Silva, Paulo Maurício; Sales-Peres, Silvia Helena de Carvalho; Sales-Peres, Arsênio
2010-06-01
Molecular biology has given geneticists the basic tools to probe the molecular mechanisms that influence different diseases. Researchers carry a scientific and moral responsibility here: they should anticipate the moral consequences of the commercial application of genetic tests, since such tests affect not only individuals and their families but the entire population. It is also necessary to reflect on how information from the human genome will be used, for good or ill. The objective of this review was to bring together data on the ethical application of molecular biology and to link it with the rights of human beings. The literature shows that the Human Genome Project has opened several possibilities, such as the identification of genes associated with diseases with synergistic properties, but also, at times, the modification of behavior through genetic intervention in humans, bringing social benefit or harm. The big challenge is to decide what humanity wants from this giant leap.
Postdoctoral Fellows | Center for Cancer Research
The Oncogenomics section of the Genetics Branch is a multidisciplinary and interdisciplinary translational research programmatic effort with the goal of utilizing genomics to develop novel immunotherapies for cancer. Our group is applying high throughput applied genomics methods including single cell RNAseq, single cell TCR sequencing, DNA sequencing, CRISPR/Cas9, bioinformatics combined with T cell based therapeutics to identify and develop novel immunotherapeutics for human cancer. We work with other investigators within the intramural program as well as industrial and pharmaceutical partners to rapidly translate our findings to the clinic. The program takes advantage of the uniqueness of the National Cancer Institute, (NCI), Center for Cancer Research (CCR) intramural program in that it spans high-risk basic discovery research in immunology, genomics and tumor biology, through preclinical translational research, to paradigm-shifting clinical trials. The position is available immediately. The appointment duration is up to 5 years. Stipends are commensurate with education and experience. Additional information can be found on Dr. Khan’s profile page: https://ccr.cancer.gov/Genetics-Branch/javed-khan
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.
Zeng, Victor; Extavour, Cassandra G
2012-01-01
The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information.
We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
Franek, Michal; Suchánková, Jana; Sehnalová, Petra; Krejčí, Jana; Legartová, Soňa; Kozubek, Stanislav; Večeřa, Josef; Sorokin, Dmitry V; Bártová, Eva
2016-04-01
Studies on fixed samples or genome-wide analyses of nuclear processes are useful for generating snapshots of a cell population at a particular time point. However, these experimental approaches do not provide information at the single-cell level. Genome-wide studies cannot assess variability between individual cells that are cultured in vitro or originate from different pathological stages. Immunohistochemistry and immunofluorescence are fundamental experimental approaches in clinical laboratories and are also widely used in basic research. However, the fixation procedure may generate artifacts and prevents monitoring of the dynamics of nuclear processes. Therefore, live-cell imaging is critical for studying the kinetics of basic nuclear events, such as DNA replication, transcription, splicing, and DNA repair. This review is focused on the advanced microscopy analyses of the cells, with a particular focus on live cells. We note some methodological innovations and new options for microscope systems that can also be used to study tissue sections. Cornerstone methods for the biophysical research of living cells, such as fluorescence recovery after photobleaching and fluorescence resonance energy transfer, are also discussed, as are studies on the effects of radiation at the individual cellular level.
Delivery of genomic medicine for common chronic adult diseases: a systematic review.
Scheuner, Maren T; Sieverding, Pauline; Shekelle, Paul G
2008-03-19
The greatest public health benefit of advances in understanding the human genome may be realized for common chronic diseases such as cardiovascular disease, diabetes mellitus, and cancer. Attempts to integrate such knowledge into clinical practice are still in the early stages, and as a result, many questions surround the current state of this translation. To synthesize current information on genetic health services for common adult-onset conditions by examining studies that have addressed the outcomes, consumer information needs, delivery, and challenges in integrating these services. MEDLINE articles published between January 2000 and February 2008. Original research articles and systematic reviews dealing with common chronic adult-onset conditions were reviewed. A total of 3371 citations were reviewed, 170 articles retrieved, and 68 articles included in the analysis. Data were independently extracted by one reviewer and checked by another with disagreement resolved by consensus. Variables assessed included study design and 4 key areas: outcomes of genomic medicine, consumer information needs, delivery of genomic medicine, and challenges and barriers to integration of genomic medicine. Sixty-eight articles contributed data to the synthesis: 5 systematic reviews, 8 experimental studies, 35 surveys, 7 pre/post studies, 3 observational studies, and 10 qualitative reports. Three systematic reviews, 4 experimental studies, and 9 additional studies reported on outcomes of genetic services. Generally there were modest positive effects on psychological outcomes such as worry and anxiety, behavioral outcomes have shown mixed results, and clinical outcomes were less well studied. One systematic review, 1 randomized controlled trial, and 14 other studies assessed consumer information needs and found in general that genetics knowledge was reported to be low but that attitudes were generally positive. 
Three randomized controlled trials and 13 other studies assessed how genomic medicine is delivered and newer models of delivery. One systematic review and 19 other studies assessed barriers; the most consistent finding was the self-assessed inadequacy of the primary care workforce to deliver genetic services. Additional identified barriers included lack of oversight of genetic testing and concerns about privacy and discrimination. Many gaps in knowledge about organization, clinician, and patient needs must be filled to translate basic and clinical science advances in genomics of common chronic diseases into practice.
Understanding Genomic Knowledge in Rural Appalachia: The West Virginia Genome Community Project.
Mallow, Jennifer A; Theeke, Laurie A; Crawford, Patricia; Prendergast, Elizabeth; Conner, Chuck; Richards, Tony; McKown, Barbara; Bush, Donna; Reed, Donald; Stabler, Meagan E; Zhang, Jianjun; Dino, Geri; Barr, Taura L
Rural communities have limited knowledge about genetics and genomics and are also underrepresented in genomic education initiatives. The purpose of this project was to assess genomic and epigenetic knowledge and beliefs in rural West Virginia. A total of 93 participants from three communities participated in focus groups and 68 participants completed a demographic survey. The age of the respondents ranged from 21 to 81 years. Most respondents had a household income of less than $40,000, were female, were married, had completed at least a high school diploma/GED or some college education, and worked either part-time or full-time. A Community Based Participatory Research process with focus groups and demographic questionnaires was used. Most participants had a basic understanding of genetics and epigenetics, but not genomics. Participants reported not knowing much of their family history and that their elders did not discuss such information. When such conversations did occur, it was only during times of crisis or an illness event. Mental health and substance abuse are topics that are not discussed with family in this rural population. Most of the efforts surrounding genetic/genomic understanding have focused on urban populations. This project is the first of its kind in West Virginia and has begun to lay the much-needed infrastructure for developing educational initiatives and extending genomic research projects into our rural Appalachian communities. By empowering the public with education regarding the influential role that genetics, genomics, and epigenetics have on their health, we can begin to tackle the complex task of initiating behavior changes that will promote the health and well-being of individuals, families and communities.
Negi, Pooja; Rai, Archana N.; Suprasanna, Penna
2016-01-01
The recognition of a positive correlation between organism genome size and its transposable element (TE) content represents a key discovery of the field of genome biology. Considerable evidence accumulated since then suggests the involvement of TEs in genome structure, evolution and function. The global genome reorganization brought about by transposon activity might play an adaptive/regulatory role in the host response to environmental challenges, reminiscent of McClintock's original ‘Controlling Element’ hypothesis. This regulatory aspect of TEs is also garnering support in light of recent evidence, which projects TEs as “distributed genomic control modules.” According to this view, TEs are capable of actively reprogramming host gene circuits and ultimately fine-tuning the host response to specific environmental stimuli. Moreover, the stress-induced changes in epigenetic status of TE activity may allow TEs to propagate their stress responsive elements to host genes; the resulting genome fluidity can permit phenotypic plasticity and adaptation to stress. Given their predominant presence in plant genomes, nested organization in the genic regions and potential regulatory role in stress response, TEs hold unexplored potential for crop improvement programs. This review intends to present the current information about the roles played by TEs in plant genome organization, evolution, and function and highlight the regulatory mechanisms in plant stress responses. We will also briefly discuss the connection between TE activity, host epigenetic response and phenotypic plasticity as a critical link for traversing the translational bridge from a purely basic study of TEs to the applied field of stress adaptation and crop improvement. PMID:27777577
Astolfi, P A; Salamini, F; Sgaramella, V
2010-09-01
Theoretical and experimental evidence supports the hypothesis that the genomes and the epigenomes may be different in the somatic cells of complex organisms. In the genome, the differences range from single base substitutions to chromosome number; in the epigenome, they entail multiple postsynthetic modifications of the chromatin. Somatic genome variations (SGV) may accumulate during development in response both to genetic programs, which may differ from tissue to tissue, and to environmental stimuli, which are often undetected and generally irreproducible. SGV may jeopardize physiological cellular functions, but also create novel coding and regulatory sequences, to be exposed to intraorganismal Darwinian selection. Genomes acknowledged as comparatively poor in genes, such as humans', could thus increase their pristine informational endowment. A better understanding of SGV will contribute to basic issues such as the "nature vs nurture" dualism and the inheritance of acquired characters. On the applied side, they may explain the low yield of cloning via somatic cell nuclear transfer, provide clues to some of the problems associated with transdifferentiation, and interfere with individual DNA analysis. SGV may be unique in the different cell types and in the different developmental stages, and thus explain the several hundred gaps persisting in the human genomes "completed" so far. They may compound the variations associated with our epigenomes and make of each of us an "(epi)genomic" mosaic. An ensuing paradigm is the possibility that a single genome (the ephemeral one assembled at fertilization) has the capacity to generate several different brains in response to different environments.
Villela, Luciana Cristine Vasques; Alves, Anderson Luis; Varela, Eduardo Sousa; Yamagishi, Michel Eduardo Beleza; Giachetto, Poliana Fernanda; da Silva, Naiara Milagres Augusto; Ponzetto, Josi Margarete; Paiva, Samuel Rezende; Caetano, Alexandre Rodrigues
2017-02-01
The cachara (Pseudoplatystoma reticulatum) is a Neotropical freshwater catfish from family Pimelodidae (Siluriformes) native to Brazil. The species is of relative economic importance for local aquaculture production and basic biological information is under development to help boost efforts to domesticate and raise the species in commercial systems. The complete cachara mitochondrial genome was obtained by assembling Illumina RNA-seq data from pooled samples. The full mitogenome was found to be 16,576 bp in length, showing the same basic structure, order, and genetic organization observed in other Pimelodidae, with 13 protein-coding genes, 2 rRNA genes, 22 tRNAs, and a control region. Observed base composition was 24.63% T, 28.47% C, 31.45% A, and 15.44% G. With the exception of NAD6 and eight tRNAs, all of the observed mitochondrial genes were found to be coded on the H strand. A total of 107 SNPs were identified in P. reticulatum mtDNA, 67 of which were located in coding regions. Of these SNPs, 10 result in amino acid changes. Analysis of the obtained sequence with 94 publicly available full Siluriformes mitogenomes resulted in a phylogenetic tree that generally agreed with available phylogenetic proposals for the order. The first report of the complete Pseudoplatystoma reticulatum mitochondrial genome sequence revealed general gene organization, structure, content, and order similar to most vertebrates. Specific sequence and content features were observed and may have functional attributes which are now available for further investigation.
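A base composition such as the one reported above (24.63% T, 28.47% C, 31.45% A, 15.44% G, which sums to 99.99% after rounding) is a simple per-base frequency count. The following is a minimal Python sketch of that calculation, not the authors' pipeline; the function name `base_composition` and the toy sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among the unambiguous bases.

    Hypothetical helper for illustration: ambiguous symbols (e.g. N) are
    ignored, so the four percentages always sum to 100 up to float error.
    """
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Toy input, not real mitogenome data.
toy_sequence = "AACCGGTT"
composition = base_composition(toy_sequence)
```

Applied to a full mitogenome string read from a FASTA file, the same function would yield percentages comparable to those quoted in the abstract.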
Genome-wide identification and evolution of the PIN-FORMED (PIN) gene family in Glycine max.
Liu, Yuan; Wei, Haichao
2017-07-01
Soybean (Glycine max) is one of the most important crop plants. Wild and cultivated soybean varieties have significant differences worth further investigation, such as plant morphology, seed size, and seed coat development; these characters may be related to auxin biology. The PIN gene family encodes essential transport proteins in cell-to-cell auxin transport, but little research on soybean PIN genes (GmPIN genes) has been done, especially with respect to the evolution and differences between wild and cultivated soybean. In this study, we retrieved 23 GmPIN genes from the latest updated G. max genome database; six GmPIN protein sequences were changed compared with the previous database. Based on the Plant Genome Duplication Database, 18 GmPIN genes have been involved in segment duplication. Three pairs of GmPIN genes arose after the second soybean genome duplication, and six occurred after the first genome duplication. The duplicated GmPIN genes retained similar expression patterns. All the duplicated GmPIN genes experienced purifying selection (Ka/Ks < 1) to prevent accumulation of non-synonymous mutations and thus remained more similar. In addition, we also focused on the artificial selection of the soybean PIN genes. Five artificially selected GmPIN genes were identified by comparing the genome sequence of 17 wild and 14 cultivated soybean varieties. Our research provides useful and comprehensive basic information for understanding GmPIN genes.
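The purifying-selection criterion cited above rests on the ratio of the non-synonymous substitution rate (Ka) to the synonymous rate (Ks): a ratio below 1 is conventionally read as purifying selection, near 1 as neutral evolution, and above 1 as positive selection. A minimal Python sketch of that classification follows; the function name `selection_regime`, the tolerance and the example values are assumptions for illustration, and estimating Ka and Ks from aligned sequences is a separate step not shown here.

```python
def selection_regime(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a gene pair by its Ka/Ks ratio (hypothetical helper).

    ka: non-synonymous substitutions per non-synonymous site
    ks: synonymous substitutions per synonymous site
    """
    if ka < 0 or ks <= 0:
        raise ValueError("ka must be >= 0 and ks must be > 0")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Made-up values for one duplicated gene pair, not data from the study.
regime = selection_regime(ka=0.05, ks=0.5)
```

In practice a statistical test on the ratio is preferable to a hard threshold, since estimates close to 1 carry substantial uncertainty.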
Essential elements of personalized medicine.
Burke, Wylie; Brown Trinidad, Susan; Press, Nancy A
2014-02-01
Genomic information has been promoted as the basis for "personalized" health care. We considered the benefits provided by genomic testing in context of the concept of personalized medicine. We evaluated current and potential uses of genomic testing in health care, using prostate cancer as an example, and considered their implications for individualizing or otherwise improving health care. Personalized medicine is most accurately seen as a comprehensive effort to tailor health care to the individual, spanning multiple dimensions. While genomic tests will offer many potential opportunities to improve the delivery of care, including the potential for genomic research to offer opportunities to improve prostate cancer screening and treatment, such advances do not in themselves constitute a paradigm shift in the delivery of health care. Rather, personalized medicine is based on a partnership between clinician and patient that utilizes shared decision making to determine the best health care options among the available choices, weighing the patient's personal values and preferences together with clinical findings. This approach is particularly important for difficult clinical decisions involving uncertainty and trade-offs, such as those involved in prostate cancer screening and management. The delivery of personalized medicine also requires adequate health care access and assurance that basic health needs have been met. Substantial research investment will be needed to identify how genomic tests can contribute to this effort. © 2014 Published by Elsevier Inc.
Advances in Genomics of Entomopathogenic Fungi.
Wang, J B; St Leger, R J; Wang, C
2016-01-01
Fungi are the commonest pathogens of insects and crucial regulators of insect populations. The rapid advance of genome technologies has revolutionized our understanding of entomopathogenic fungi with multiple Metarhizium spp. sequenced, as well as Beauveria bassiana, Cordyceps militaris, and Ophiocordyceps sinensis among others. Phylogenomic analysis suggests that the ancestors of many of these fungi were plant endophytes or pathogens, with entomopathogenicity being an acquired characteristic. These fungi now occupy a wide range of habitats and hosts, and their genomes have provided a wealth of information on the evolution of virulence-related characteristics, as well as the protein families and genomic structure associated with ecological and econutritional heterogeneity, genome evolution, and host range diversification. In particular, their evolutionary transition from plant pathogens or endophytes to insect pathogens provides a novel perspective on how new functional mechanisms important for host switching and virulence are acquired. Importantly, genomic resources have helped make entomopathogenic fungi ideal model systems for answering basic questions in parasitology, entomology, and speciation. At the same time, identifying the selective forces that act upon entomopathogen fitness traits could underpin both the development of new mycoinsecticides and further our understanding of the natural roles of these fungi in nature. These roles frequently include mutualistic relationships with plants. Genomics has also facilitated the rapid identification of genes encoding biologically useful molecules, with implications for the development of pharmaceuticals and the use of these fungi as bioreactors. Copyright © 2016 Elsevier Inc. All rights reserved.
Genome Diversity and Evolution in the Budding Yeasts (Saccharomycotina)
Dujon, Bernard A.; Louis, Edward J.
2017-01-01
Considerable progress in our understanding of yeast genomes and their evolution has been made over the last decade with the sequencing, analysis, and comparisons of numerous species, strains, or isolates of diverse origins. The role played by yeasts in natural environments as well as in artificial manufactures, combined with the importance of some species as model experimental systems sustained this effort. At the same time, their enormous evolutionary diversity (there are yeast species in every subphylum of Dikarya) sparked curiosity but necessitated further efforts to obtain appropriate reference genomes. Today, yeast genomes have been very informative about basic mechanisms of evolution, speciation, hybridization, domestication, as well as about the molecular machineries underlying them. They are also irreplaceable to investigate in detail the complex relationship between genotypes and phenotypes with both theoretical and practical implications. This review examines these questions at two distinct levels offered by the broad evolutionary range of yeasts: inside the best-studied Saccharomyces species complex, and across the entire and diversified subphylum of Saccharomycotina. While obviously revealing evolutionary histories at different scales, data converge to a remarkably coherent picture in which one can estimate the relative importance of intrinsic genome dynamics, including gene birth and loss, vs. horizontal genetic accidents in the making of populations. The facility with which novel yeast genomes can now be studied, combined with the already numerous available reference genomes, offer privileged perspectives to further examine these fundamental biological questions using yeasts both as eukaryotic models and as fungi of practical importance. PMID:28592505
SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models
Aziz, Ramy K.; Devoid, Scott; Disz, Terrence; Edwards, Robert A.; Henry, Christopher S.; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Stevens, Rick L.; Vonstein, Veronika; Xia, Fangfang
2012-01-01
The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. 
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users. PMID:23110173
Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data
Oetjens, Matthew T.; Brown-Gentry, Kristin; Goodloe, Robert; Dilks, Holli H.; Crawford, Dana C.
2016-01-01
Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry have become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data are now widespread, these data are not ubiquitous as several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic-based study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans linked to demographic, health, and lifestyle data conducted by the Centers for Disease Control and Prevention. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants between 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We, as the Epidemiologic Architecture for Genes Linked to Environment study, genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control such as estimation of genetic ancestry to control for population stratification in NHANES sans genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, we demonstrate that markers with ancestry information can be identified to estimate global ancestry.
Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so for ethnicity. Overall, the strategies outlined here for a large epidemiologic study can be applied to other datasets that are accessible for genotype–phenotype studies but lack genome-wide data. PMID:27200085
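The idea of picking ancestry-informative markers from a small SNP panel can be illustrated with a minimal Python sketch. The abstract does not say which statistic the authors used, so the ranking below, the largest between-group difference in allele frequency (often called delta), is a swapped-in assumption. The SNP names, group labels (NHW, NHB, MA) and frequencies are hypothetical values made up for illustration.

```python
from typing import Dict, List, Tuple


def allele_freq_delta(freqs: Dict[str, float]) -> float:
    """Largest pairwise difference in allele frequency across groups."""
    values = list(freqs.values())
    return max(values) - min(values)


def rank_markers(panel: Dict[str, Dict[str, float]], top_n: int) -> List[Tuple[str, float]]:
    """Rank SNPs by delta (descending) and keep the top_n most informative."""
    scored = [(snp, allele_freq_delta(freqs)) for snp, freqs in panel.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Hypothetical allele frequencies per self-identified group (not real NHANES data).
panel = {
    "snp_a": {"NHW": 0.90, "NHB": 0.10, "MA": 0.55},
    "snp_b": {"NHW": 0.48, "NHB": 0.52, "MA": 0.50},
    "snp_c": {"NHW": 0.30, "NHB": 0.75, "MA": 0.35},
}

top = rank_markers(panel, top_n=2)
for snp, delta in top:
    print(f"{snp}\tdelta={delta:.2f}")
```

A marker such as the hypothetical snp_b, whose frequency is nearly identical in every group, carries almost no ancestry information and is dropped by this ranking.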
Shuster, Michèle
2011-01-01
In recognition of the entry into the era of personalized medicine, a new set of genetics and genomics competencies for nurses was introduced in 2006. Since then, there have been a number of reports about the critical importance of these competencies for nursing practices and about the challenges of addressing these competencies in the preservice (basic science) nursing curriculum. At least one suggestion has been made to infuse genetics and genomics throughout the basic science curriculum for prenursing students. Based on this call and a review of the competencies, this study sought to assess the impact of incorporation of genetics and genomics content into a prenursing microbiology course. Broadly, two areas that address the competencies were incorporated into the course: 1) the biological basis and implications of genetic diversity and 2) the technological aspects of assessing genetic diversity in bacteria and viruses. These areas address how genetics and genomics contribute to healthcare, including diagnostics and selection of treatment. Analysis of learning gains suggests that genetics and genomics content can be learned as effectively as microbiology content in this setting. Future studies are needed to explore the most effective ways to introduce genetics and genomics technology into the prenursing curriculum. PMID:21633070
Hughes, Kevin S; Ambinder, Edward P; Hess, Gregory P; Yu, Peter Paul; Bernstam, Elmer V; Routbort, Mark J; Clemenceau, Jean Rene; Hamm, John T; Febbo, Phillip G; Domchek, Susan M; Chen, James L; Warner, Jeremy L
2017-09-20
At the ASCO Data Standards and Interoperability Summit held in May 2016, it was unanimously decided that four areas of current oncology clinical practice have serious, unmet health information technology needs. The following areas of need were identified: 1) omics and precision oncology, 2) advancing interoperability, 3) patient engagement, and 4) value-based oncology. To begin to address these issues, ASCO convened two complementary workshops: the Omics and Precision Oncology Workshop in October 2016 and the Advancing Interoperability Workshop in December 2016. A common goal was to address the complexity, enormity, and rapidly changing nature of genomic information, which existing electronic health records are ill equipped to manage. The subject matter experts invited to the Omics and Precision Oncology Workgroup were tasked with the responsibility of determining a specific, limited need that could be addressed by a software application (app) in the short-term future, using currently available genomic knowledge bases. Hence, the scope of this workshop was to determine the basic functionality of one app that could serve as a test case for app development. The goal of the second workshop, described separately, was to identify the specifications for such an app. This approach was chosen both to facilitate the development of a useful app and to help ASCO and oncologists better understand the mechanics, difficulties, and gaps in genomic clinical decision support tool development. In this article, we discuss the key challenges and recommendations identified by the workshop participants. Our hope is to narrow the gap between the practicing oncologist and ongoing national efforts to provide precision oncology and value-based care to cancer patients.
DNA Repair and Genome Maintenance in Bacillus subtilis
Lenhart, Justin S.; Schroeder, Jeremy W.; Walsh, Brian W.
2012-01-01
Summary: From microbes to multicellular eukaryotic organisms, all cells contain pathways responsible for genome maintenance. DNA replication allows for the faithful duplication of the genome, whereas DNA repair pathways preserve DNA integrity in response to damage originating from endogenous and exogenous sources. The basic pathways important for DNA replication and repair are often conserved throughout biology. In bacteria, high-fidelity repair is balanced with low-fidelity repair and mutagenesis. Such a balance is important for maintaining viability while providing an opportunity for the advantageous selection of mutations when faced with a changing environment. Over the last decade, studies of DNA repair pathways in bacteria have demonstrated considerable differences between Gram-positive and Gram-negative organisms. Here we review and discuss the DNA repair, genome maintenance, and DNA damage checkpoint pathways of the Gram-positive bacterium Bacillus subtilis. We present their molecular mechanisms and compare the functions and regulation of several pathways with known information on other organisms. We also discuss DNA repair during different growth phases and the developmental program of sporulation. In summary, we present a review of the function, regulation, and molecular mechanisms of DNA repair and mutagenesis in Gram-positive bacteria, with a strong emphasis on B. subtilis. PMID:22933559
Standards of Practice: Applying Genetics and Genomics Resources to Oncology.
Kerber, Alice S; Ledbetter, Nancy J
2017-04-01
Knowledge about genetics and genomics and its application to oncology care is rapidly expanding and evolving. As a result, oncology nurses at all levels must develop and maintain their knowledge of genetics and genomics, as well as be aware of resources to guide practice. This article focuses on implementation of the standards described in the updated Genetics/Genomics Nursing: Scope and Standards of Practice by the basic practitioner.
Cancer Pharmacogenomics: Integrating Discoveries in Basic, Clinical and Population Sciences to Advance Predictive Cancer Care, a 2010 workshop sponsored by the Epidemiology and Genomics Research Program.
Bogenpohl, James W; Mignogna, Kristin M; Smith, Maren L; Miles, Michael F
2017-01-01
Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce nonbiased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA, and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. 
Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x environmental interactions affecting brain functioning in health and disease.
Bogenpohl, James W.; Mignogna, Kristin M.; Smith, Maren L.; Miles, Michael F.
2016-01-01
Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce non-biased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. 
Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x environmental interactions affecting brain functioning in health and disease. PMID:27933543
Standard Mutation Nomenclature in Molecular Diagnostics
Ogino, Shuji; Gulley, Margaret L.; den Dunnen, Johan T.; Wilson, Robert B.
2007-01-01
To translate basic research findings into clinical practice, it is essential that information about mutations and variations in the human genome is communicated easily and unequivocally. Unfortunately, there has been much confusion regarding the description of genetic sequence variants. This is largely because research articles that first report novel sequence variants do not often use standard nomenclature, and the final genomic sequence is compiled over many separate entries. In this article, we discuss issues crucial to clear communication, using examples of genes that are commonly assayed in clinical laboratories. Although molecular diagnostics is a dynamic field, this should not inhibit the need for and movement toward consensus nomenclature for accurate reporting among laboratories. Our aim is to alert laboratory scientists and other health care professionals to the important issues and provide a foundation for further discussions that will ultimately lead to solutions. PMID:17251329
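To make concrete what an unambiguous variant description looks like, here is a minimal Python sketch that parses a simple nucleotide substitution. It assumes an HGVS-style notation such as c.76A>T, which the abstract itself does not name. The function and class names are hypothetical, and only single-base substitutions at plain integer positions are handled.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    ref_type: str  # "c" = coding DNA position, "g" = genomic position
    position: int
    ref: str       # reference base
    alt: str       # variant base


# Matches descriptions such as "c.76A>T" or "g.1234G>C" and nothing looser.
_PATTERN = re.compile(r"^(c|g)\.(\d+)([ACGT])>([ACGT])$")


def parse_substitution(text: str) -> Optional[Substitution]:
    """Return the parsed substitution, or None if the description is non-standard."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    ref_type, position, ref, alt = match.groups()
    if ref == alt:
        return None  # a "substitution" to the same base is not a variant
    return Substitution(ref_type, int(position), ref, alt)


print(parse_substitution("c.76A>T"))
print(parse_substitution("76A-T"))
```

A strict pattern of this kind rejects informal descriptions like 76A-T, which is the sort of ambiguity that a consensus nomenclature is meant to remove.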
Yamamoto, Takashi
Programmable site-specific nuclease-mediated genome editing is an emerging biotechnology for precise manipulation of target genes. With genome editing, both gene knockout and gene knockin are possible in various organisms and cultured cells. CRISPR-Cas9, which was developed in 2012, is a convenient and efficient programmable site-specific nuclease, and its use is spreading rapidly around the world. Introducing genome editing technology is therefore important for the progress of life science research.
Genomics Community Resources | Informatics Technology for Cancer Research (ITCR)
To facilitate genomic research and the dissemination of its products, National Human Genome Research Institute (NHGRI) supports genomic resources that are crucial for basic research, disease studies, model organism studies, and other biomedical research. Awards under this FOA will support the development and distribution of genomic resources that will be valuable for the broad research community, using cost-effective approaches. Such resources include (but are not limited to) databases and informatics resources (such as human and model organism databases, ontologies, and analysi
Diagnostic devices for isothermal nucleic acid amplification.
Chang, Chia-Chen; Chen, Chien-Cheng; Wei, Shih-Chung; Lu, Hui-Hsin; Liang, Yang-Hung; Lin, Chii-Wann
2012-01-01
Since the development of the polymerase chain reaction (PCR) technique, genomic information has been retrievable from lesser amounts of DNA than previously possible. PCR-based amplifications require high-precision instruments to perform temperature cycling reactions; further, they are cumbersome for routine clinical use. However, the use of isothermal approaches can eliminate many complications associated with thermocycling. The application of diagnostic devices for isothermal DNA amplification has recently been studied extensively. In this paper, we describe the basic concepts of several isothermal amplification approaches and review recent progress in diagnostic device development.
Diagnostic Devices for Isothermal Nucleic Acid Amplification
Chang, Chia-Chen; Chen, Chien-Cheng; Wei, Shih-Chung; Lu, Hui-Hsin; Liang, Yang-Hung; Lin, Chii-Wann
2012-01-01
Since the development of the polymerase chain reaction (PCR) technique, genomic information has been retrievable from lesser amounts of DNA than previously possible. PCR-based amplifications require high-precision instruments to perform temperature cycling reactions; further, they are cumbersome for routine clinical use. However, the use of isothermal approaches can eliminate many complications associated with thermocycling. The application of diagnostic devices for isothermal DNA amplification has recently been studied extensively. In this paper, we describe the basic concepts of several isothermal amplification approaches and review recent progress in diagnostic device development. PMID:22969402
Genome sequence of Clostridium tunisiense TJ, isolated from drain sediment from a pesticide factory.
Sun, Lili; Wang, Yu; Yu, Chunyan; Zhao, Yongqin; Gan, Yinbo
2012-12-01
Clostridium tunisiense is a Gram-positive, obligate anaerobe that was first isolated in an anaerobic environment under eutrophication. Here we report the first genome sequence of Clostridium tunisiense TJ, isolated from drain sediment of a pesticide factory in Tianjin, China. The genome is of great importance for both basic and applied research.
2012-01-01
Background: The molecular mechanisms altered by the traditional mutation and screening approach during the improvement of antibiotic-producing microorganisms are still poorly understood, although this information is essential to design rational strategies for industrial strain improvement. In this study, we applied comparative genomics to identify all genetic changes occurring during the development of an erythromycin overproducer obtained using the traditional mutate-and-screen method. Results: Compared with the parental Saccharopolyspora erythraea NRRL 2338, the genome of the overproducing strain presents 117 deletion, 78 insertion and 12 transposition sites, with 71 insertion/deletion sites mapping within coding sequences (CDSs) and generating frame-shift mutations. Single nucleotide variations are present in 144 CDSs. Overall, the genomic variations affect 227 proteins of the overproducing strain and a considerable number of mutations alter genes of key enzymes in the central carbon and nitrogen metabolism and in the biosynthesis of secondary metabolites, resulting in the redirection of common precursors toward erythromycin biosynthesis. Interestingly, several mutations inactivate genes coding for proteins that play fundamental roles in basic transcription and translation machineries including the transcription anti-termination factor NusB and the transcription elongation factor Efp. These mutations, along with those affecting genes coding for pleiotropic or pathway-specific regulators, affect the global expression profile as demonstrated by a comparative analysis of the parental and overproducer expression profiles. Genomic data, finally, suggest that the mutate-and-screen process might have been accelerated by mutations in DNA repair genes. Conclusions: This study helps to clarify the mechanisms underlying antibiotic overproduction, providing valuable information about new possible molecular targets for rational strain improvement. PMID:22401291
Nguyen, Thao T.B.; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J.; Blair, David; Laha, Thewarach; Sripa, Banchob
2015-01-01
Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. Several conventional molecular markers have been used for identification and genetic diversity studies; however, no information about microsatellites of this liver fluke has been published so far. We here report microsatellite characterization and marker development for genetic diversity study in C. sinensis using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥ 12 base pairs) were identified from the genome database of C. sinensis, with the hexa-nucleotide motif being the most abundant (51%) followed by penta-nucleotide (18.3%) and tri-nucleotide (12.7%). The tetra-nucleotide, di-nucleotide and mononucleotide motifs accounted for 9.75%, 7.63% and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of the 547 Mb whole genome, and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri-, and tetra-nucleotide motifs, the predominant repeat numbers are six (28%), four (45%) and three (76%), respectively. The ATC repeat is the most abundant microsatellite, followed by AT, AAT and AC, respectively. Among the 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and Opisthorchis viverrini. Seven out of 24 loci were heterozygous, with observed heterozygosity ranging from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information on C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for genetic study of C. sinensis. PMID:25782682
Nguyen, Thao T B; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J; Blair, David; Laha, Thewarach; Sripa, Banchob
2015-06-01
Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. There are several conventional molecular markers that have been used for identification and genetic diversity; however, no information about microsatellites of this liver fluke has been published so far. We here report microsatellite characterization and marker development for a genetic diversity study in C. sinensis, using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥12 base pairs) were identified from a genome database of C. sinensis, with the hexanucleotide motif being the most abundant (51%) followed by pentanucleotide (18.3%) and trinucleotide (12.7%). The tetranucleotide, dinucleotide, and mononucleotide motifs accounted for 9.75, 7.63, and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of the 547 Mb whole genome, and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri-, and tetranucleotide motifs, the predominant repeat numbers are six (28%), four (45%), and three (76%), respectively. The ATC repeat is the most abundant microsatellite, followed by AT, AAT, and AC, respectively. Among the 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and Opisthorchis viverrini. Seven out of 24 loci were heterozygous, with observed heterozygosity ranging from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information on C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for the genetic study of C. sinensis.
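A genome-wide microsatellite search of the kind described can be sketched in a few lines of Python. The abstract does not name the software or the exact search criteria, so this regular-expression scan is only an assumed stand-in: it reports perfect tandem repeats with motifs of 1 to 6 bases and a total length of at least 12 bp. The demo sequence and the function name are hypothetical. The last lines recompute the reported density from the two figures given in the abstract.

```python
import re
from math import ceil
from typing import List, Tuple

MIN_TOTAL_LENGTH = 12  # minimum repeat length in bp, as stated in the abstract


def find_microsatellites(seq: str, max_motif: int = 6) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) for perfect tandem repeats of >= 12 bp."""
    seq = seq.upper()
    hits: List[Tuple[int, int, str]] = []
    for k in range(1, max_motif + 1):
        min_copies = ceil(MIN_TOTAL_LENGTH / k)
        # One captured motif of length k followed by at least min_copies - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            start, end = match.span()
            # Skip repeats already reported with a shorter motif (e.g. poly-A seen as "AA").
            if any(start < e and s < end for s, e, _ in hits):
                continue
            hits.append((start, end, match.group(1)))
    return sorted(hits)


# Hypothetical demo sequence containing an AT, an ATC and a poly-A repeat.
demo = "GGC" + "AT" * 7 + "CCGTA" + "ATC" * 5 + "TTG" + "A" * 12 + "C"
found = find_microsatellites(demo)
for start, end, motif in found:
    print(start, end, motif, end - start)

# Density check using the abstract's figures: 547 Mb genome, 256,990 microsatellites.
bp_per_microsatellite = 547_000_000 / 256_990
print(f"one microsatellite per {bp_per_microsatellite / 1000:.2f} kb")
```

Scanning shorter motifs first means a run such as ATATAT... is reported once as a dinucleotide repeat and not again as a tetra- or hexanucleotide repeat. Real tools may also tolerate imperfect repeats, which this sketch does not.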
Kumar, Akash; Dougherty, Max; Findlay, Gregory M; Geisheker, Madeleine; Klein, Jason; Lazar, John; Machkovech, Heather; Resnick, Jesse; Resnick, Rebecca; Salter, Alexander I; Talebi-Liasi, Faezeh; Arakawa, Christopher; Baudin, Jacob; Bogaard, Andrew; Salesky, Rebecca; Zhou, Qian; Smith, Kelly; Clark, John I; Shendure, Jay; Horwitz, Marshall S
2014-01-01
Even in cases where there is no obvious family history of disease, genome sequencing may contribute to clinical diagnosis and management. Clinical application of the genome has not yet become routine, however, in part because physicians are still learning how best to utilize such information. As an educational research exercise performed in conjunction with our medical school human anatomy course, we explored the potential utility of determining the whole genome sequence of a patient who had died following a clinical diagnosis of idiopathic pulmonary fibrosis (IPF). Medical students performed dissection and whole genome sequencing of the cadaver. Gross and microscopic findings were more consistent with the fibrosing variant of nonspecific interstitial pneumonia (NSIP), as opposed to IPF per se. Variants in genes causing Mendelian disorders predisposing to IPF were not detected. However, whole genome sequencing identified several common variants associated with IPF, including a single nucleotide polymorphism (SNP), rs35705950, located in the promoter region of the gene encoding mucin glycoprotein MUC5B. The MUC5B promoter polymorphism was recently found to markedly elevate risk for IPF, though a particular association with NSIP has not been previously reported, nor has its contribution to disease risk previously been evaluated in the genome-wide context of all genetic variants. We did not identify additional predicted functional variants in a region of linkage disequilibrium (LD) adjacent to MUC5B, nor did we discover other likely risk-contributing variants elsewhere in the genome. Whole genome sequencing thus corroborates the association of rs35705950 with MUC5B dysregulation and interstitial lung disease. This novel exercise additionally served a unique mission in bridging clinical and basic science education.
Understanding Genomic Knowledge in Rural Appalachia: The West Virginia Genome Community Project
Mallow, Jennifer A.; Theeke, Laurie A.; Crawford, Patricia; Prendergast, Elizabeth; Conner, Chuck; Richards, Tony; McKown, Barbara; Bush, Donna; Reed, Donald; Stabler, Meagan E.; Zhang, Jianjun; Dino, Geri; Barr, Taura L.
2016-01-01
Purpose: Rural communities have limited knowledge about genetics and genomics and are also underrepresented in genomic education initiatives. The purpose of this project was to assess genomic and epigenetic knowledge and beliefs in rural West Virginia. Sample: A total of 93 participants from three communities participated in focus groups and 68 participants completed a demographic survey. The age of the respondents ranged from 21 to 81 years. Most respondents were female, were married, had a household income of less than $40,000, had completed at least a HS/GED or some college education, and worked either part-time or full-time. Method: A Community Based Participatory Research process with focus groups and demographic questionnaires was used. Findings: Most participants had a basic understanding of genetics and epigenetics, but not genomics. Participants reported not knowing much of their family history and that their elders did not discuss such information. If the conversations occurred, it was only during times of crisis or an illness event. Mental health and substance abuse are topics that are not discussed with family in this rural population. Conclusions: Most of the efforts surrounding genetic/genomic understanding have focused on urban populations. This project is the first of its kind in West Virginia and has begun to lay the much-needed infrastructure for developing educational initiatives and extending genomic research projects into our rural Appalachian communities. By empowering the public with education regarding the influential role that genetics, genomics, and epigenetics have in their health, we can begin to tackle the complex task of initiating behavior changes that will promote the health and well-being of individuals, families and communities. PMID:27212895
Lew, Jocelyne M; Kapopoulou, Adamandia; Jones, Louis M; Cole, Stewart T
2011-01-01
TubercuList (http://tuberculist.epfl.ch/), the relational database that presents genome-derived information about H37Rv, the paradigm strain of Mycobacterium tuberculosis, has been active for ten years and now presents its twentieth release. Here, we describe some of the recent changes that have resulted from manual annotation with information from the scientific literature. Through manual curation, TubercuList strives to provide current gene-based information and is thus distinguished from other online sources of genome sequence data for M. tuberculosis. New, mostly small, genes have been discovered and the coordinates of some existing coding sequences have been changed when bioinformatics or experimental data suggest that this is required. Nucleotides that are polymorphic between different sources of H37Rv are annotated and gene essentiality data have been updated. A host of functional information has been gleaned from the literature and many new activities of proteins and RNAs have been included. To facilitate basic and translational research, TubercuList also provides links to other specialized databases that present diverse datasets such as 3D-structures, expression profiles, drug development criteria and drug resistance information, in addition to direct access to PubMed articles pertinent to particular genes. TubercuList has been and remains a highly valuable tool for the tuberculosis research community with >75,000 visitors per month. Copyright © 2010 Elsevier Ltd. All rights reserved.
Whole-genome sequencing for comparative genomics and de novo genome assembly.
Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C
2015-01-01
Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).
Long-read sequencing data analysis for yeasts.
Yue, Jia-Xing; Liti, Gianni
2018-06-01
Long-read sequencing technologies have become increasingly popular due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast Saccharomyces cerevisiae has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here, we present a modular computational framework named long-read sequencing data analysis for yeasts (LRSDAY), the first one-stop solution that streamlines this process. Starting from the raw sequencing reads, LRSDAY can produce chromosome-level genome assembly and comprehensive genome annotation in a highly automated manner with minimal manual intervention, which is not possible using any alternative tool available to date. The annotated genomic features include centromeres, protein-coding genes, tRNAs, transposable elements (TEs), and telomere-associated elements. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable to virtually any eukaryotic organism. When applying LRSDAY to an S. cerevisiae strain, it takes ∼41 h to generate a complete and well-annotated genome from ∼100× Pacific Biosciences (PacBio) running the basic workflow with four threads. Basic experience working within the Linux command-line environment is recommended for carrying out the analysis using LRSDAY.
[The ENCODE project and functional genomics studies].
Ding, Nan; Qu, Hongzhu; Fang, Xiangdong
2014-03-01
Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it has identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated the database for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatics technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.
Wang, Xu-Hua; Wang, Yong; Liu, A-Ke; Liu, Xiao-Ting; Zhou, Yang; Yao, Qin; Chen, Ke-Ping
2015-04-01
The basic helix-loop-helix (bHLH) domain is a highly conserved amino acid motif that defines a group of DNA-binding transcription factors. bHLH proteins play essential regulatory roles in a variety of biological processes in animals, plants, and fungi. The domestic dog, Canis lupus familiaris, is a good model organism for genetic, physiological, and behavioral studies. In this study, we identified 115 putative bHLH genes in the dog genome. Based on a phylogenetic analysis, 51, 26, 14, 4, 12, and 4 dog bHLH genes were assigned to six separate groups (A-F); four bHLH genes were categorized as "orphans". Within-group evolutionary relationships inferred from the phylogenetic analysis were consistent with positional conservation, other conserved domains flanking the bHLH motif, and highly conserved intron/exon patterns in other vertebrates. Our analytical results confirmed the GenBank annotations of 89 dog bHLH proteins and provided information that could be used to update the annotations of the remaining 26 dog bHLH proteins. These data will provide good references for further studies on the structures and regulatory functions of bHLH proteins in the growth and development of dogs, which may help in understanding the mechanisms that underlie the physical and behavioral differences between dogs and wolves.
Genomics and Public Health Research: Can the State Allow Access to Genomic Databases?
Cousineau, J; Girard, N; Monardes, C; Leroux, T; Jean, M Stanton
2012-01-01
Because many diseases are multifactorial disorders, the scientific progress in genomics and genetics should be taken into consideration in public health research. In this context, genomic databases will constitute an important source of information. Consequently, it is important to identify and characterize the State’s role and authority on matters related to public health, in order to verify whether it has access to such databases while engaging in public health genomic research. We first consider the evolution of the concept of public health, as well as its core functions, using a comparative approach (e.g. WHO, PAHO, CDC and the Canadian province of Quebec). Following an analysis of relevant Quebec legislation, the precautionary principle is examined as a possible avenue to justify State access to and use of genomic databases for research purposes. Finally, we consider the influenza pandemic plans developed by WHO, Canada, and Quebec as examples of key tools framing the public health decision-making process. We observed that, in Quebec, State powers in public health are not well adapted to the expansion of genomics research. We propose that the scope of the concept of research in public health should be clear and include the following characteristics: a commitment to the health and well-being of the population and to their determinants; the inclusion of both applied research and basic research; and an appropriate model of governance (authorization, follow-up, consent, etc.). We also suggest that the strategic approach version of the precautionary principle could guide collective choices in these matters. PMID:23113174
Szałaj, Przemysław; Tang, Zhonghui; Michalski, Paul; Pietal, Michal J; Luo, Oscar J; Sadowski, Michał; Li, Xingwang; Radew, Kamen; Ruan, Yijun; Plewczynski, Dariusz
2016-12-01
ChIA-PET is a high-throughput mapping technology that reveals long-range chromatin interactions and provides insights into the basic principles of spatial genome organization and gene regulation mediated by specific protein factors. Recently, we showed that a single ChIA-PET experiment provides information at all genomic scales of interest, from the high-resolution locations of binding sites and enriched chromatin interactions mediated by specific protein factors, to the low resolution of nonenriched interactions that reflect topological neighborhoods of higher-order chromosome folding. This multilevel nature of ChIA-PET data offers an opportunity to use multiscale 3D models to study structural-functional relationships at multiple length scales, but doing so requires a structural modeling platform. Here, we report the development of 3D-GNOME (3-Dimensional Genome Modeling Engine), a complete computational pipeline for 3D simulation using ChIA-PET data. 3D-GNOME consists of three integrated components: a graph-distance-based heat map normalization tool, a 3D modeling platform, and an interactive 3D visualization tool. Using ChIA-PET and Hi-C data derived from human B-lymphocytes, we demonstrate the effectiveness of 3D-GNOME in building 3D genome models at multiple levels, including the entire genome, individual chromosomes, and specific segments at megabase (Mb) and kilobase (kb) resolutions of single average and ensemble structures. Further incorporation of CTCF-motif orientation and high-resolution looping patterns in 3D simulation provided additional reliability of potential biologically plausible topological structures. © 2016 Szałaj et al.; Published by Cold Spring Harbor Laboratory Press.
Basic leucine zipper domain transcription factors: the vanguards in plant immunity.
Noman, Ali; Liu, Zhiqin; Aqeel, Muhammad; Zainab, Madiha; Khan, Muhammad Ifnan; Hussain, Ansar; Ashraf, Muhammad Furqan; Li, Xia; Weng, Yahong; He, Shuilin
2017-12-01
Regulation of spatio-temporal expression patterns of stress tolerance associated plant genes is an essential component of the stress responses. Eukaryotes assign a large amount of their genome to transcription with multiple transcription factors (TFs). Often, these transcription factors fit into outsized gene groups which, in several cases, exclusively belong to plants. Basic leucine zipper domain (bZIP) transcription factors regulate vital processes in plants and animals. In plants, bZIPs are implicated in numerous fundamental processes like seed development, energy balance, and responses to abiotic or biotic stresses. Systematic analysis of the information obtained over the last two decades disclosed a constitutive role of bZIPs against biotic stress. bZIP TFs are vital players in plant innate immunity due to their ability to regulate genes associated with PAMP-triggered immunity, effector-triggered immunity, and hormonal signaling networks. Expression analysis of studied bZIP genes suggests that exploration and functional characterization of novel bZIP TFs in planta is helpful in improving crop resistance against pathogens and environmental stresses. Our review focuses on major advancements in bZIP TFs and plant responses against different pathogens. The integration of genomics information with the functional studies provides new insights into the regulation of plant defense mechanisms and engineering crops with improved resistance to invading pathogens. Conclusively, succinct functions of bZIPs as positive or negative regulator mediate resistance to the plant pathogens and lay a foundation for understanding associated genes and TFs regulating different pathways. Moreover, bZIP TFs may offer a comprehensive transgenic gizmo for engineering disease resistance in plant breeding programs.
2012-01-01
Background Amazona vittata is a critically endangered Puerto Rican endemic bird, the only surviving native parrot species in the United States territory, and the first parrot in the large Neotropical genus Amazona, to be studied on a genomic scale. Findings In a unique community-based funded project, DNA from an A. vittata female was sequenced using a HiSeq Illumina platform, resulting in a total of ~42.5 billion nucleotide bases. This provided approximately 26.89x average coverage depth at the completion of this funding phase. Filtering followed by assembly resulted in 259,423 contigs (N50 = 6,983 bp, longest = 75,003 bp), which was further scaffolded into 148,255 fragments (N50 = 19,470, longest = 206,462 bp). This provided ~76% coverage of the genome based on an estimated size of 1.58 Gb. The assembled scaffolds allowed basic genomic annotation and comparative analyses with other available avian whole-genome sequences. Conclusions The current data represents the first genomic information from and work carried out with a unique source of funding. This analysis further provides a means for directed training of young researchers in genetic and bioinformatics analyses and will facilitate progress towards a full assembly and annotation of the Puerto Rican parrot genome. It also adds extensive genomic data to a new branch of the avian tree, making it useful for comparative analyses with other avian species. Ultimately, the knowledge acquired from these data will contribute to an improved understanding of the overall population health of this species and aid in ongoing and future conservation efforts. PMID:23587420
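The assembly statistics quoted above (N50 and average coverage depth) follow from simple arithmetic. The following is a minimal sketch, not code from the study: the function names and the toy contig lengths are hypothetical, and only the 42.5 billion bases and the 1.58 Gb genome size are taken from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths supplied")


def coverage_depth(total_bases, genome_size):
    """Average coverage depth = sequenced bases / estimated genome size."""
    return total_bases / genome_size


# Toy contig lengths (hypothetical; not the parrot assembly).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))

# Figures quoted in the abstract: ~42.5 billion bases, 1.58 Gb genome.
print(round(coverage_depth(42.5e9, 1.58e9), 1))
```

With the abstract's figures the depth comes out near 26.9x, which agrees with the reported ~26.89x average coverage.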
The Human Genome Initiative: First Steps.
ERIC Educational Resources Information Center
Newman, Alan R.
1990-01-01
Described is the basic biology involved in mapping chromosomes as presented at a symposium at a recent meeting of the American Chemical Association which focused on the Human Genome Initiative. Different types of gene maps and techniques used to produce gene maps are discussed. (CW)
Education through Fiction: Acquiring Opinion-Forming Skills in the Context of Genomics
ERIC Educational Resources Information Center
Knippels, Marie-Christine P. J.; Severiens, Sabine E.; Klop, Tanja
2009-01-01
The present study examined the outcomes of a newly designed four-lesson science module on opinion-forming in the context of genomics in upper secondary education. The lesson plan aims to foster 16-year-old students' opinion-forming skills in the context of genomics and to test the effect of the use of fiction in the module. The basic hypothesis…
Primer on Molecular Genetics; DOE Human Genome Program
DOE R&D Accomplishments Database
1992-04-01
This report is taken from the April 1992 draft of the DOE Human Genome 1991--1992 Program Report, which is expected to be published in May 1992. The primer is intended to be an introduction to basic principles of molecular genetics pertaining to the genome project. The material contained herein is not final and may be incomplete. Techniques of genetic mapping and DNA sequencing are described.
Pediatric Genomic Data Inventory (PGDI) Overview
Pediatric cancer is a genetic disease that can largely differ from similar malignancies in an adult population. To fuel new discoveries and treatments specific to pediatric oncologies, the NCI Office of Cancer Genomics has developed a dynamic resource known as the Pediatric Genomic Data Inventory to allow investigators to more easily locate genomic datasets. This resource lists known ongoing and completed sequencing projects of pediatric cancer cohorts from the United States and other countries, along with some basic details and reference metadata.
Maize - GO annotation methods, evaluation, and review (Maize-GAMER)
USDA-ARS's Scientific Manuscript database
Making a genome sequence accessible and useful involves three basic steps: genome assembly, structural annotation, and functional annotation. The quality of data generated at each step influences the accuracy of inferences that can be made, with high-quality analyses produce better datasets resultin...
McCullough, Laurence B.; Slashinski, Melody J.; McGuire, Amy L.; Street, Richard L.; Eng, Christine M.; Gibbs, Richard A.; Parsons, D. Williams; Plon, Sharon E.
2016-01-01
Background Some anticipate that physicians and parents will be ill-prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. Procedure As part of the Baylor Advancing Sequencing in Childhood Cancer Care (BASIC3) study, we conducted semi-structured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision-making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Results Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about the impact on parents. For parents, there is an urgency to protect their child's health, and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Conclusion Our data do not support concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, i.e., leave physicians or parents ill-prepared or unprepared to make responsible decisions about patient care. PMID:26505993
MitoNuc: a database of nuclear genes coding for mitochondrial proteins. Update 2002.
Attimonelli, Marcella; Catalano, Domenico; Gissi, Carmela; Grillo, Giorgio; Licciulli, Flavio; Liuni, Sabino; Santamaria, Monica; Pesole, Graziano; Saccone, Cecilia
2002-01-01
Mitochondria, besides their central role in energy metabolism, have recently been found to be involved in a number of basic processes of cell life and to contribute to the pathogenesis of many degenerative diseases. All functions of mitochondria depend on the interaction of nuclear and organelle genomes. Mitochondrial genomes have been extensively sequenced and analysed, and data have been collected in several specialised databases. In order to collect information on nuclear-coded mitochondrial proteins we developed MitoNuc, a database containing detailed information on sequenced nuclear genes coding for mitochondrial proteins in Metazoa. The MitoNuc database can be retrieved through SRS and is available via the web site http://bighost.area.ba.cnr.it/mitochondriome, where other mitochondrial databases developed by our group, the complete list of the sequenced mitochondrial genomes, links to other mitochondrial sites and related information are available. The MitoAln database, related to MitoNuc in the previous release and reporting the multiple alignments of the relevant homologous protein coding regions, is no longer supported in the present release. In order to keep the links among entries in MitoNuc from homologous proteins, a new field in the database has been defined: the cluster identifier, an alphanumeric code used to identify each cluster of homologous proteins. A comment field derived from the corresponding SWISS-PROT entry has been introduced; this reports clinical data related to dysfunction of the protein. The logic scheme of the MitoNuc database has been implemented in the ORACLE DBMS. This will allow end-users to retrieve data through a friendly interface that will soon be implemented.
CMG-biotools, a free workbench for basic comparative microbial genomics.
Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David
2013-01-01
Today, there are more than a hundred times as many sequenced prokaryotic genomes as there were in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source code for all programs is provided under a GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analyses without much training.
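As an illustration of what "basic DNA and codon statistics" can mean in practice, here is a minimal sketch in plain Python. It is not taken from CMG-biotools; the function names and the toy coding sequence are hypothetical, and the sketch assumes an in-frame sequence over the letters A, C, G and T.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_counts(cds):
    """Count non-overlapping codons, reading in frame from position 0.
    Trailing bases that do not fill a complete codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Toy coding sequence (hypothetical): ATG GCC GCC TAA
toy_cds = "ATGGCCGCCTAA"
print(f"GC content: {gc_content(toy_cds):.3f}")
print(codon_counts(toy_cds).most_common())
```

Comparing such per-genome summaries across strains is one simple way to look at diversity before moving on to heavier analyses such as proteome comparisons.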
2010-01-01
Background Shared-usage high throughput screening (HTS) facilities are becoming more common in academe as large-scale small molecule and genome-scale RNAi screening strategies are adopted for basic research purposes. These shared facilities require a unique informatics infrastructure that must not only provide access to and analysis of screening data, but must also manage the administrative and technical challenges associated with conducting numerous, interleaved screening efforts run by multiple independent research groups. Results We have developed Screensaver, a free, open source, web-based lab information management system (LIMS), to address the informatics needs of our small molecule and RNAi screening facility. Screensaver supports the storage and comparison of screening data sets, as well as the management of information about screens, screeners, libraries, and laboratory work requests. To our knowledge, Screensaver is one of the first applications to support the storage and analysis of data from both genome-scale RNAi screening projects and small molecule screening projects. Conclusions The informatics and administrative needs of an HTS facility may be best managed by a single, integrated, web-accessible application such as Screensaver. Screensaver has proven useful in meeting the requirements of the ICCB-Longwood/NSRB Screening Facility at Harvard Medical School, and has provided similar benefits to other HTS facilities. PMID:20482787
Generation of comprehensive thoracic oncology database--tool for translational research.
Surati, Mosmi; Robinson, Matthew; Nandi, Suvobroto; Faoro, Leonardo; Demchuk, Carley; Kanteti, Rajani; Ferguson, Benjamin; Gangadhar, Tara; Hensing, Thomas; Hasina, Rifat; Husain, Aliya; Ferguson, Mark; Karrison, Theodore; Salgia, Ravi
2011-01-22
The Thoracic Oncology Program Database Project was created to serve as a comprehensive, verified, and accessible repository for well-annotated cancer specimens and clinical data to be available to researchers within the Thoracic Oncology Research Program. This database also captures a large volume of genomic and proteomic data obtained from various tumor tissue studies. A team of clinical and basic science researchers, a biostatistician, and a bioinformatics expert was convened to design the database. Variables of interest were clearly defined and their descriptions were written within a standard operating manual to ensure consistency of data annotation. Using a protocol for prospective tissue banking and another protocol for retrospective banking, tumor and normal tissue samples from patients consented to these protocols were collected. Clinical information such as demographics, cancer characterization, and treatment plans for these patients were abstracted and entered into an Access database. Proteomic and genomic data have been included in the database and have been linked to clinical information for patients described within the database. The data from each table were linked using the relationships function in Microsoft Access to allow the database manager to connect clinical and laboratory information during a query. The queried data can then be exported for statistical analysis and hypothesis generation.
De novo sequencing and analysis of the transcriptome of Panax ginseng in the leaf-expansion period.
Liu, Shichao; Wang, Siming; Liu, Meichen; Yang, Fei; Zhang, Hui; Liu, Shiyang; Wang, Qun; Zhao, Yu
2016-08-01
Panax ginseng, a traditional Chinese medicine, is used worldwide for its variety of health benefits and its treatment efficacy. However, it is difficult to cultivate due to its vulnerability to environmental stresses. The present study provided the first report, to the best of our knowledge, of transcriptome analysis of ginseng at the leaf‑expansion stage. Using the Illumina sequencing platform, >40,000,000 high‑quality paired‑end reads were obtained and assembled into 100,533 unique sequences. When the sequences were searched against the publicly available National Center for Biotechnology Information protein database using The Basic Local Alignment Search Tool, 61,599 sequences exhibited similarity to known proteins. Functional annotation and classification, including use of the Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases, revealed that the activated genes in ginseng were predominantly ribonuclease‑like storage genes, environmental stress genes, pathogenesis-related genes and other antioxidant genes. A number of candidate genes in environmental stress‑associated pathways were also identified. These novel data provide useful information on the growth and development stages of ginseng, and serve as an important public information platform for further understanding of the molecular mechanisms and functional genomics of ginseng.
Tolopko, Andrew N; Sullivan, John P; Erickson, Sean D; Wrobel, David; Chiang, Su L; Rudnicki, Katrina; Rudnicki, Stewart; Nale, Jennifer; Selfors, Laura M; Greenhouse, Dara; Muhlich, Jeremy L; Shamu, Caroline E
2010-05-18
Shared-usage high throughput screening (HTS) facilities are becoming more common in academe as large-scale small molecule and genome-scale RNAi screening strategies are adopted for basic research purposes. These shared facilities require a unique informatics infrastructure that must not only provide access to and analysis of screening data, but must also manage the administrative and technical challenges associated with conducting numerous, interleaved screening efforts run by multiple independent research groups. We have developed Screensaver, a free, open source, web-based lab information management system (LIMS), to address the informatics needs of our small molecule and RNAi screening facility. Screensaver supports the storage and comparison of screening data sets, as well as the management of information about screens, screeners, libraries, and laboratory work requests. To our knowledge, Screensaver is one of the first applications to support the storage and analysis of data from both genome-scale RNAi screening projects and small molecule screening projects. The informatics and administrative needs of an HTS facility may be best managed by a single, integrated, web-accessible application such as Screensaver. Screensaver has proven useful in meeting the requirements of the ICCB-Longwood/NSRB Screening Facility at Harvard Medical School, and has provided similar benefits to other HTS facilities.
Quantifying on- and off-target genome editing.
Hendel, Ayal; Fine, Eli J; Bao, Gang; Porteus, Matthew H
2015-02-01
Genome editing with engineered nucleases is a rapidly growing field thanks to transformative technologies that allow researchers to precisely alter genomes for numerous applications including basic research, biotechnology, and human gene therapy. While the ability to make precise and controlled changes at specified sites throughout the genome has grown tremendously in recent years, we still lack a comprehensive and standardized battery of assays for measuring the different genome editing outcomes created at endogenous genomic loci. Here we review the existing assays for quantifying on- and off-target genome editing and describe their utility in advancing the technology. We also highlight unmet assay needs for quantifying on- and off-target genome editing outcomes and discuss their importance for the genome editing field. Copyright © 2014 Elsevier Ltd. All rights reserved.
Probability, statistics, and computational science.
Beerenwinkel, Niko; Siebourg, Juliane
2012-01-01
In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
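One of the efficient inference algorithms commonly used with hidden Markov models is the forward algorithm, which computes the likelihood of an observed sequence by summing over all hidden state paths. The sketch below is an illustration only and is not taken from the chapter: the two-state model, its state names and all probabilities are hypothetical.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Forward algorithm: probability of the observed sequence under an HMM,
    summed over all possible hidden state paths."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model: AT-rich versus GC-rich regions.
states = ("AT_rich", "GC_rich")
start_p = {"AT_rich": 0.5, "GC_rich": 0.5}
trans_p = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
emit_p = {
    "AT_rich": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "GC_rich": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
}

print(forward("GCGA", states, start_p, trans_p, emit_p))
```

The running time grows linearly with sequence length and quadratically with the number of states, which is why this recursion is preferred over enumerating state paths explicitly. For long sequences, a practical implementation would work in log space or rescale alpha at each step to avoid numerical underflow.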
Full-Genome Sequence of a Reassortant H1N2 Influenza A Virus Isolated from Pigs in Brazil.
Schmidt, Candice; Cibulski, Samuel Paulo; Muterle Varela, Ana Paula; Mengue Scheffer, Camila; Wendlant, Adrieli; Quoos Mayer, Fabiana; Lopes de Almeida, Laura; Franco, Ana Cláudia; Roehe, Paulo Michel
2014-12-18
In this study, the full-genome sequence of a reassortant H1N2 swine influenza virus is reported. The isolate has the hemagglutinin (HA) and neuraminidase (NA) genes from human lineage (H1-δ cluster and N2), and the internal genes (polymerase basic 1 [PB1], polymerase basic 2 [PB2], polymerase acidic [PA], nucleoprotein [NP], matrix [M], and nonstructural [NS]) are derived from human 2009 pandemic H1N1 (H1N1pdm09) virus. Copyright © 2014 Schmidt et al.
Systems Biology Approaches for Understanding Genome Architecture.
Sewitz, Sven; Lipkow, Karen
2016-01-01
The linear and three-dimensional arrangement and composition of chromatin in eukaryotic genomes underlies the mechanisms directing gene regulation. Understanding this organization requires the integration of many data types and experimental results. Here we describe the approach of integrating genome-wide protein-DNA binding data to determine chromatin states. To investigate spatial aspects of genome organization, we present a detailed description of how to run stochastic simulations of protein movements within a simulated nucleus in 3D. This systems level approach enables the development of novel questions aimed at understanding the basic mechanisms that regulate genome dynamics.
Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin
2016-08-09
Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve more slowly than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K-12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance progress in elucidating transcription regulation mechanisms, and thus benefit the genomic research community and prokaryotic genome researchers in particular.
ERIC Educational Resources Information Center
Bello, Julia; Butler, Charles; Radavich, Rosanne; York, Alan; Oseto, Christian; Orvis, Kathryn; Pittendrigh, Barry R.
2007-01-01
Although members of the general public have often heard of the terms "genetic engineering" and, more recently, genomics, they typically have little to no knowledge about these topics, and in some cases are confused about basic concepts in these areas. There is currently a need for teaching models to explain concepts behind genomics.…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas
We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.
Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; ...
2017-08-08
Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.
Coutelle, C; Speer, A; Grade, K; Rosenthal, A; Hunger, H D
1989-01-01
The introduction of molecular human genetics has become a paradigm for the application of genetic engineering in medicine. The main principles of this technology are the isolation of molecular probes, their application in hybridization reactions, specific gene amplification by the polymerase chain reaction, and DNA sequencing reactions. These methods are used for the analysis of monogenic diseases by linkage studies and the elucidation of the molecular defect causing these conditions, respectively. They are also the basis for genomic diagnosis of monogenic diseases, introduced into the health care system of the GDR by a national project on Duchenne/Becker muscular dystrophy, Cystic Fibrosis and Phenylketonuria. The rapid development of basic research on the molecular analysis of the human genome and genomic diagnosis indicates that human molecular genetics is becoming a decisive basic discipline of modern medicine.
Off-target Effects in CRISPR/Cas9-mediated Genome Engineering
Zhang, Xiao-Hui; Tee, Louis Y; Wang, Xiao-Gang; Huang, Qun-Shan; Yang, Shi-Hua
2015-01-01
CRISPR/Cas9 is a versatile genome-editing technology that is widely used for studying the functionality of genetic elements, creating genetically modified organisms, and preclinical research on genetic disorders. However, the high frequency of off-target activity (≥50%), that is, RGEN (RNA-guided endonuclease)-induced mutations at sites other than the intended on-target site, is one major concern, especially for therapeutic and clinical applications. Here, we review the basic mechanisms underlying off-target cutting in the CRISPR/Cas9 system, methods for detecting off-target mutations, and strategies for minimizing off-target cleavage. Improved specificity in the CRISPR/Cas9 system will provide solid genotype–phenotype correlations, and thus enable faithful interpretation of genome-editing data, which will certainly facilitate the basic and clinical application of this technology. PMID:26575098
DOE Office of Scientific and Technical Information (OSTI.GOV)
Studier, F.W.; Daegelen, P.; Lenski, R. E.
2009-12-01
Each difference between the genome sequences of Escherichia coli B strains REL606 and BL21(DE3) can be interpreted in light of known laboratory manipulations plus a gene conversion between ribosomal RNA operons. Two treatments with 1-methyl-3-nitro-1-nitrosoguanidine in the REL606 lineage produced at least 93 single-base-pair mutations (~90% GC-to-AT transitions) and 3 single-base-pair GC deletions. Two UV treatments in the BL21(DE3) lineage produced only 4 single-base-pair mutations but 16 large deletions. P1 transductions from K-12 into the two B lineages produced 317 single-base-pair differences and 9 insertions or deletions, reflecting differences between B DNA in BL21(DE3) and integrated restriction fragments of K-12 DNA inherited by REL606. Two sites showed selective enrichment of spontaneous mutations. No unselected spontaneous single-base-pair mutations were evident. The genome sequences revealed that a progenitor of REL606 had been misidentified, explaining initially perplexing differences. Limited sequencing of other B strains defined characteristic properties of B and allowed assembly of the inferred genome of the ancestral B of Delbrueck and Luria. Comparison of the B and K-12 genomes shows that more than half of the 3793 proteins of their basic genomes are predicted to be identical, although ~310 appear to be functional in either B or K-12 but not in both. The ancestral basic genome appears to have had ~4039 coding sequences occupying ~4.0 Mbp. Repeated horizontal transfer from diverged Escherichia coli genomes and homologous recombination may explain the observed variable distribution of single-base-pair differences. Fifteen sites are occupied by phage-related elements, but only six by comparable elements at the same site. More than 50 sites are occupied by IS elements in both B and K, 16 in common, and likely founding IS elements are identified. A signature of widespread cryptic phage P4-type mobile elements was identified. Complex deletions (dense clusters of small deletions and substitutions) apparently removed nonessential genes from ~30 sites in the basic genomes.
Design and implementation of a database for Brucella melitensis genome annotation.
De Hertogh, Benoît; Lahlimi, Leïla; Lambert, Christophe; Letesson, Jean-Jacques; Depiereux, Eric
2008-03-18
The genome sequences of three Brucella biovars and of some species close to Brucella sp. have become available, leading to new relationship analyses. Moreover, the automatic genome annotation of the pathogenic bacterium Brucella melitensis has been manually corrected by a consortium of experts, leading to 899 modifications of start site predictions among the 3198 open reading frames (ORFs) examined. This new annotation, coupled with the results of automatic annotation tools applied to the complete genome sequence of B. melitensis (including BLASTs to 9 genomes close to Brucella), provides numerous data sets related to predicted functions, biochemical properties and phylogenetic comparisons. To make these results available, alphaPAGe, a functional auto-updatable database of the corrected genome sequence of B. melitensis, has been built using the entity-relationship (ER) approach and a multi-purpose database structure. A friendly graphical user interface has been designed, and users can retrieve different kinds of information through three levels of queries: (1) the basic search uses the classical keywords or sequence identifiers; (2) the original advanced search engine allows numerous criteria to be combined (by using logical operators): (a) keywords (textual comparison) related to the pCDS's function, family domains and cellular localization; (b) physico-chemical characteristics (numerical comparison) such as isoelectric point or molecular weight, and structural criteria such as the nucleic length or the number of transmembrane helices (TMH); (c) similarity scores with Escherichia coli and 10 species phylogenetically close to B. melitensis; (3) complex queries can be performed by using a SQL field, which allows all queries respecting the database's structure. The database is publicly available through a Web server at the following url: http://www.fundp.ac.be/urbm/bioinfo/aPAGe.
de Andrade, Roberto R S; Vaslin, Maite F S
2014-03-07
Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. To bypass the need for command-line skills and basic bioinformatics knowledge, we developed mapping software with a graphical interface for assembling viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. The program compares each sequenced read in a library with a chosen reference genome. Reads with a Hamming distance smaller than or equal to the allowed number of mismatches are selected as positives and used to assemble a long nucleotide genome sequence. To validate the software, distinct analyses using NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. The SearchSmallRNA program was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it should be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/.
2014-01-01
Background: Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. Methods: To bypass the need for command-line skills and basic bioinformatics knowledge, we developed mapping software with a graphical interface for assembling viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. Results: The program compares each sequenced read in a library with a chosen reference genome. Reads with a Hamming distance smaller than or equal to the allowed number of mismatches are selected as positives and used to assemble a long nucleotide genome sequence. To validate the software, distinct analyses using NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. Conclusions: The SearchSmallRNA program was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it should be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. Availability and implementation: SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/. PMID:24607237
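The read-selection step described in the SearchSmallRNA abstract (keep reads whose Hamming distance to the reference is within an allowed number of mismatches) can be illustrated with a minimal sketch. This is not the SearchSmallRNA code, which is written in Java; the function names `hamming`, `map_reads` and `coverage`, the brute-force window scan, and the toy sequences are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def map_reads(reads, reference, max_mismatches=1):
    """Return (position, read) for every reference window within the limit.

    Brute-force scan of every window; assumed here only for clarity.
    """
    hits = []
    for read in reads:
        k = len(read)
        for pos in range(len(reference) - k + 1):
            if hamming(read, reference[pos:pos + k]) <= max_mismatches:
                hits.append((pos, read))
    return hits


def coverage(hits, ref_len):
    """Per-base count of mapped reads, a crude basis for a consensus."""
    depth = [0] * ref_len
    for pos, read in hits:
        for i in range(pos, pos + len(read)):
            depth[i] += 1
    return depth


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"          # toy reference, not real viral data
    reads = ["ACGT", "TTAG", "GGGG"]    # toy small-RNA reads
    hits = map_reads(reads, reference, max_mismatches=0)
    print(hits)
    print(coverage(hits, len(reference)))
```

Positions with zero depth in the coverage vector are the gaps a real assembler would have to report or fill; a production tool would also index the reference rather than scan every window.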
USDA-ARS's Scientific Manuscript database
The scientific presentations at the First International Brachypodium Conference (abstracts available at www.brachy2013.unimore.it) are evidence of the widespread adoption of Brachypodium as a model system. Furthermore, the wide range of topics presented (genome evolution, roots, abiotic and biotic s...
The whole genome sequence assembly of the soybean aphid, Aphis glycines
USDA-ARS's Scientific Manuscript database
Aphids are emerging as model organisms for both basic and applied research. Of the 5,000 estimated species, only two aphids have published whole genome sequences: the pea aphid Acyrthosiphon pisum, and the Russian wheat aphid, Diuraphis noxia. The soybean aphid (Aphis glycines) is an extreme special...
De Rocquigny, H; Gabus, C; Vincent, A; Fournié-Zaluski, M C; Roques, B; Darlix, J L
1992-01-01
The nucleocapsid (NC) of human immunodeficiency virus type 1 consists of a large number of NC protein molecules, probably wrapping the dimeric RNA genome within the virion inner core. NC protein is a gag-encoded product that contains two zinc fingers flanked by basic residues. In human immunodeficiency virus type 1 virions, NCp15 is ultimately processed into NCp7 and p6 proteins. During virion assembly the retroviral NC protein is necessary for core formation and genomic RNA encapsidation, which are essential for virus infectivity. In vitro NCp15 activates viral RNA dimerization, a process most probably linked in vivo to genomic RNA packaging, and replication primer tRNA(Lys,3) annealing to the initiation site of reverse transcription. To characterize the domains of human immunodeficiency virus type 1 NC protein necessary for its various functions, the 72-amino acid NCp7 and several derived peptides were synthesized in a pure form. We show here that synthetic NCp7 with or without the two zinc fingers has the RNA annealing activities of NCp15. Further deletions of the N-terminal 12 and C-terminal 8 amino acids, leading to a 27-residue peptide lacking the finger domains, have little or no effect on NC protein activity in vitro. However, deletion of short sequences containing basic residues flanking the first finger leads to a complete loss of NC protein activity. It is proposed that the basic residues and the zinc fingers cooperate to select and package the genomic RNA in vivo. Inhibition of the viral RNA binding and annealing activities associated with the basic residues flanking the first zinc finger of NC protein could therefore be used as a model for the design of antiviral agents. PMID:1631144
Microeconomic principles explain an optimal genome size in bacteria.
Ranea, Juan A G; Grant, Alastair; Thornton, Janet M; Orengo, Christine A
2005-01-01
Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest there are other evolutionary factors limiting their genome expansion. To clarify these restrictions on size, we studied those protein families contributing most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral 'molecular technology' to optimize their reproductive efficiency. The same microeconomics principles that define the optimum size in a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).
Lee, Ciaran M; Zhu, Haibao; Davis, Timothy H; Deshmukh, Harshahardhan; Bao, Gang
2017-01-01
The CRISPR/Cas9 system is a powerful tool for precision genome editing. The ability to accurately modify genomic DNA in situ with single nucleotide precision opens up new possibilities for not only basic research but also biotechnology applications and clinical translation. In this chapter, we outline the procedures for design, screening, and validation of CRISPR/Cas9 systems for targeted modification of coding sequences in the human genome and how to perform genome editing in induced pluripotent stem cells with high efficiency and specificity.
Imaging and the new biology: What's wrong with this picture?
NASA Astrophysics Data System (ADS)
Vannier, Michael W.
2004-05-01
The Human Genome has been defined, giving us one part of the equation that stems from the central dogma of molecular biology. Despite this awesome scientific achievement, the correspondence between genomics and imaging is weak, since we cannot predict an organism's phenotype from even perfect knowledge of its genetic complement. Biological knowledge comes in several forms, and the genome is perhaps the best known and most completely understood type. Imaging creates another form of biological information, providing the ability to study morphology, growth and development, metabolic processes, and diseases in vitro and in vivo at many levels of scale. The principal challenge in biomedical imaging for the future lies in the need to reconcile the data provided by one or multiple modalities with other forms of biological knowledge, most importantly the genome, proteome, physiome, and other "-ome's." To date, the imaging science community has not set a high priority on the unification of their results with genomics, proteomics, and physiological functions in most published work. Images are relatively isolated from other forms of biological data, impairing our ability to conceive and address many fundamental questions in research and clinical practice. This presentation will explain the challenge of biological knowledge integration in basic research and clinical applications from the standpoint of imaging and image processing. The impediments to progress, isolation of the imaging community, and mainstream of new and future biological science will be identified, so the critical and immediate need for change can be highlighted.
Genome build information is an essential part of genomic track files.
Kanduri, Chakravarthi; Domanska, Diana; Hovig, Eivind; Sandve, Geir Kjetil
2017-09-14
Genomic locations are represented as coordinates on a specific genome build version, but the build information is frequently missing when coordinates are provided. We show that this information is essential to correctly interpret and analyse the genomic intervals contained in genomic track files. Although not a substitute for best practices, we also provide a tool to predict the genome build version of genomic track files.
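Why the build matters can be shown with a small sketch. The abstract does not say how the authors' tool predicts the build, so the heuristic below (reject any build whose chromosome is shorter than an interval's end coordinate) is an assumption for illustration only. The name `compatible_builds` is hypothetical, and the chromosome lengths are the commonly cited values for hg19 and hg38, which should be checked against official chromosome-size files before any real use.

```python
# Chromosome lengths in base pairs for two human builds. Only chr1 and chr2
# are listed to keep the sketch short; verify against official size files.
CHROM_SIZES = {
    "hg19": {"chr1": 249_250_621, "chr2": 243_199_373},
    "hg38": {"chr1": 248_956_422, "chr2": 242_193_529},
}


def compatible_builds(intervals, chrom_sizes=CHROM_SIZES):
    """Return the builds in which every interval fits inside its chromosome.

    intervals: iterable of (chrom, start, end) tuples, 0-based half-open.
    A build is rejected if a chromosome is unknown or an interval overruns it.
    """
    intervals = list(intervals)
    ok = []
    for build, sizes in chrom_sizes.items():
        if all(
            chrom in sizes and 0 <= start < end <= sizes[chrom]
            for chrom, start, end in intervals
        ):
            ok.append(build)
    return ok


if __name__ == "__main__":
    # An interval near the end of chr1 fits hg19 but overruns hg38.
    print(compatible_builds([("chr1", 249_000_000, 249_100_000)]))
    # A small interval is valid on both builds, so the file alone is ambiguous.
    print(compatible_builds([("chr1", 100, 200)]))
```

The second call shows the limit of this kind of check: most intervals are valid coordinates on several builds, which is why recording the build explicitly in the track file is the safer practice.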
What the papers say: Text mining for genomics and systems biology
2010-01-01
Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining -- the automated extraction of information from (electronically) published sources -- could potentially fulfil an important role -- but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward. PMID:21106487
A Third Approach to Gene Prediction Suggests Thousands of Additional Human Transcribed Regions
Glusman, Gustavo; Qin, Shizhen; El-Gewely, M. Raafat; Siegel, Andrew F; Roach, Jared C; Hood, Leroy; Smit, Arian F. A
2006-01-01
The identification and characterization of the complete ensemble of genes is a main goal of deciphering the digital information stored in the human genome. Many algorithms for computational gene prediction have been described, ultimately derived from two basic concepts: (1) modeling gene structure and (2) recognizing sequence similarity. Successful hybrid methods combining these two concepts have also been developed. We present a third orthogonal approach to gene prediction, based on detecting the genomic signatures of transcription, accumulated over evolutionary time. We discuss four algorithms based on this third concept: Greens and CHOWDER, which quantify mutational strand biases caused by transcription-coupled DNA repair, and ROAST and PASTA, which are based on strand-specific selection against polyadenylation signals. We combined these algorithms into an integrated method called FEAST, which we used to predict the location and orientation of thousands of putative transcription units not overlapping known genes. Many of the newly predicted transcriptional units do not appear to code for proteins. The new algorithms are particularly apt at detecting genes with long introns and lacking sequence conservation. They therefore complement existing gene prediction methods and will help identify functional transcripts within many apparent “genomic deserts.” PMID:16543943
Pharmacogenomics in cardiovascular clinical trials.
Shah, R; Darne, B; Atar, D; Abadie, E; Adams, K F; Zannad, F
2004-12-01
Genomics - having quickly emerged as the central discipline in basic science and biomedical research - is poised to take the center stage in clinical medicine as well over the next few decades. Although there is no specific regulatory guideline on the application of pharmacogenetics to drug development, some recommendations are already included in several published guidelines on drug development. The patients more likely to provide the most valuable information on the specific contribution of a given gene or its variant are those who fail to respond to a drug ('therapeutic failures') and those who develop toxicity to the drug. However, before drawing definite conclusions on subgroups following pharmacogenomic analyses, one must be aware of disease classification, data collection, and how much is known about the disease process. It seems reasonable to collect genomic DNA from all patients enrolled in clinical drug trials (along with appropriate consent to permit pharmacogenetic studies) for the purpose of post hoc analyses. One exception to post hoc genomic analysis is when patients with a specific genotype are excluded from randomization into a clinical trial. Physicians will need to understand the concept of genetic variability, its interactions with the environment (e.g. drug-drug or drug-disease interactions), and its implication for patient care.
Jeltsch, Albert
2018-01-01
Genome targeting of restriction enzymes and DNA methyltransferases has many important applications including genome and epigenome editing. 15–20 years ago, my group was involved in the development of approaches for programmable genome targeting, aiming to connect enzymes with an oligodeoxynucleotide (ODN), which could form a sequence-specific triple helix at the genomic target site. Importantly, the target site of such an enzyme-ODN conjugate could be varied simply by altering the ODN sequence, promising great practical value. However, this approach faced many problems, including the preparation and purification of the enzyme-ODN conjugates, their efficient delivery into cells, slow kinetics of triple helix formation and the requirement for a poly-purine target site sequence. Hence, for several years genome and epigenome editing approaches were mainly based on zinc fingers and TAL proteins as targeting devices. More recently, CRISPR/Cas systems were discovered, which use a bound RNA for genome targeting that forms an RNA/DNA duplex with one DNA strand of the target site. These systems combine all potential advantages of the once imagined enzyme-ODN conjugates and avoid all of their main disadvantages. Consequently, the application of CRISPR/Cas in genome and epigenome editing has exploded in recent years. We can draw two important conclusions from this example of research history. First, evolution is still a better bioengineer than humans and, whenever tested in parallel, natural solutions outcompete engineered ones. Second, CRISPR/Cas systems were discovered in pure, curiosity-driven, basic research, highlighting that it is basic, bottom-up research that paves the way for fundamental innovation. PMID:29434619
From Mendel to the Human Genome Project: The Implications for Nurse Education.
ERIC Educational Resources Information Center
Burton, Hilary; Stewart, Alison
2003-01-01
The Human Genome Project is bringing new opportunities to predict and prevent diseases. Although pediatric nurses are the closest to these developments, most nurses will encounter genetic aspects of practice and must understand the basic science and its ethical, legal, and social dimensions. (Includes commentary by Peter Birchenall.) (SK)
USDA-ARS's Scientific Manuscript database
Modern day genomics holds the promise of solving the complexities of basic plant sciences, and of catalyzing practical advances in plant breeding. While contiguous, "base perfect" deep sequencing is a key module of any genome project, recent advances in parallel next generation sequencing technologi...
Comparative Genomics in Homo sapiens.
Oti, Martin; Sammeth, Michael
2018-01-01
Genomes can be compared at different levels of divergence, either between species or within species. Within species genomes can be compared between different subpopulations, such as human subpopulations from different continents. Investigating the genomic differences between different human subpopulations is important when studying complex diseases that are affected by many genetic variants, as the variants involved can differ between populations. The 1000 Genomes Project collected genome-scale variation data for 2504 human individuals from 26 different populations, enabling a systematic comparison of variation between human subpopulations. In this chapter, we present step-by-step a basic protocol for the identification of population-specific variants employing the 1000 Genomes data. These variants are subsequently further investigated for those that affect the proteome or RNA splice sites, to investigate potentially biologically relevant differences between the populations.
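The idea of a population-specific variant can be made concrete with a short sketch. It does not reproduce the chapter's protocol, which is not given in the abstract. The function name `population_specific`, the frequency thresholds, and the input layout (variant identifier mapped to per-population allele frequencies) are all assumptions chosen for illustration.

```python
def population_specific(variants, target, min_af=0.05, max_other_af=0.0):
    """Return IDs of variants seen in `target` but not in other populations.

    variants: dict mapping variant ID -> dict of population -> allele frequency.
    min_af: minimum frequency required in the target population (assumed).
    max_other_af: maximum frequency tolerated in any other population (assumed).
    """
    selected = []
    for variant_id, freqs in variants.items():
        others = [af for pop, af in freqs.items() if pop != target]
        in_target = freqs.get(target, 0.0) >= min_af
        absent_elsewhere = all(af <= max_other_af for af in others)
        if in_target and absent_elsewhere:
            selected.append(variant_id)
    return selected


if __name__ == "__main__":
    # Made-up frequencies for three continental groups; not real data.
    demo = {
        "var_a": {"AFR": 0.12, "EUR": 0.00, "EAS": 0.00},
        "var_b": {"AFR": 0.30, "EUR": 0.25, "EAS": 0.20},
        "var_c": {"AFR": 0.01, "EUR": 0.00, "EAS": 0.00},
    }
    print(population_specific(demo, "AFR"))
```

In practice the frequencies would be derived from genotype calls per population, and the selected variants would then be annotated for effects on protein-coding sequence or splice sites, as the abstract describes.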
Investigating Genomic Mechanisms of Treatment Resistance in Castration Resistant Prostate Cancer
2015-05-01
and genomically profiled. Figure 3 shows data from a series of cell-line experiments showing that PC3 prostate cancer cells are recoverable and...coursework until the second-half of the grant period. I am enrolled in the UCSF Biomedical Sciences Graduate Program class BMS 255: Genetics: Basic... Genetics and Genomics. This class is set to start in January 2016. Given a large number of clinical, teaching, and research duties I will plan to enroll
Reconstruction of a composite comparative map composed of ten legume genomes.
Lee, Chaeyoung; Yu, Dongwoon; Choi, Hong-Kyu; Kim, Ryan W
2017-01-01
The Fabaceae (legume family) is the third largest and the second most agriculturally important among flowering plant groups. In this study, we report the reconstruction of a composite comparative map composed of ten legume genomes, including seven species from the galegoid clade (Medicago truncatula, Medicago sativa, Lens culinaris, Pisum sativum, Lotus japonicus, Cicer arietinum, Vicia faba) and three species from the phaseoloid clade (Vigna radiata, Phaseolus vulgaris, Glycine max). To accomplish this comparison, a total of 209 cross-species gene-derived markers were employed. The comparative analysis resulted in a single extensive genetic/genomic network composed of 93 chromosomes or linkage groups, from which 110 synteny blocks and other evolutionary events (e.g., 13 inversions) were identified. This comparative map also allowed us to deduce several large-scale evolutionary events, such as chromosome fusion/fission, which might explain differences in chromosome numbers among the compared species or between the two clades. As a result, the useful properties of cross-species genic markers were re-verified as an efficient tool for cross-species translation of genomic information, and similar approaches, combined with a high-throughput bioinformatic marker design program, should be effective for applying the knowledge of trait-associated genes to other important crop species for breeding purposes. Here, we provide a basic comparative framework for the ten legume species, and expect it to be usefully applied towards crop improvement in legume breeding.
Samollow, Paul B; Kammerer, Candace M; Mahaney, Susan M; Schneider, Jennifer L; Westenberger, Scott J; VandeBerg, John L; Robinson, Edward S
2004-01-01
The gray, short-tailed opossum, Monodelphis domestica, is the most extensively used, laboratory-bred marsupial resource for basic biologic and biomedical research worldwide. To enhance the research utility of this species, we are building a linkage map, using both anonymous markers and functional gene loci, that will enable the localization of quantitative trait loci (QTL) and provide comparative information regarding the evolution of mammalian and other vertebrate genomes. The current map is composed of 83 loci distributed among eight autosomal linkage groups and the X chromosome. The autosomal linkage groups appear to encompass a very large portion of the genome, yet span a sex-average distance of only 633.0 cM, making this the most compact linkage map known among vertebrates. Most surprising, the male map is much larger than the female map (884.6 cM vs. 443.1 cM), a pattern contrary to that in eutherian mammals and other vertebrates. The finding of genome-wide reduction in female recombination in M. domestica, coupled with recombination data from two other, distantly related marsupial species, suggests that reduced female recombination might be a widespread metatherian attribute. We discuss possible explanations for reduced female recombination in marsupials as a consequence of the metatherian characteristic of determinate paternal X chromosome inactivation. PMID:15020427
PCR Amplification Strategies towards full-length HIV-1 Genome sequencing.
Liu, Chao Chun; Ji, Hezhao
2018-06-26
The advent of next-generation sequencing has enabled greater resolution of viral diversity and improved the feasibility of full viral genome sequencing, allowing routine HIV-1 full genome sequencing in both research and diagnostic settings. Regardless of the sequencing platform selected, successful PCR amplification of the HIV-1 genome is essential for sequencing template preparation. As such, full HIV-1 genome amplification is a crucial step for successful and reliable downstream sequencing. Here we review existing PCR protocols leading to HIV-1 full genome sequencing. In addition to discussing basic considerations of relevant PCR design, we review the advantages as well as the pitfalls of published protocols. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Kim, Changkug; Park, Dongsuk; Seol, Youngjoo; Hahn, Jangho
2011-01-01
The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.
Methyl jasmonate as a vital substance in plants.
Cheong, Jong-Joo; Choi, Yang Do
2003-07-01
The plant floral scent methyl jasmonate (MeJA) has been identified as a vital cellular regulator that mediates diverse developmental processes and defense responses against biotic and abiotic stresses. The pleiotropic effects of MeJA have raised numerous questions about its regulation for biogenesis and mode of action. Characterization of the gene encoding jasmonic acid carboxyl methyltransferase has provided basic information on the role(s) of this phytohormone in gene-activation control and systemic long-distance signaling. Recent approaches using functional genomics and bioinformatics have identified a whole set of MeJA-responsive genes, and provide insights into how plants use volatile signals to withstand diverse and variable environments.
Functional regression method for whole genome eQTL epistasis analysis with sequencing data.
Xu, Kelin; Jin, Li; Xiong, Momiao
2017-05-18
Epistasis plays an essential role in understanding regulation mechanisms and is an essential component of the genetic architecture of gene expression. However, interaction analysis of gene expression remains fundamentally unexplored due to great computational challenges and limited data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position-level read count curves. A single number for measuring gene expression, which is widely used for microarray-measured gene expression analysis, is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses, where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors, where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairs of SNPs, the FRGM takes a gene as the basic unit for epistasis analysis: it tests for the interaction of all possible pairs of genes and uses all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project.
The numbers of pairs of significantly interacting genes after Bonferroni correction identified using FRGM, RPKM and DESeq were 16,2361, 260 and 51, respectively, from the 350 European samples. The proposed FRGM for epistasis analysis of RNA-seq can capture isoform and position-level information and will have a broad application. Both simulations and real data analysis highlight the potential for the FRGM to be a good choice of the epistatic analysis with sequencing data.
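The Bonferroni correction mentioned above scales with the number of gene pairs tested. The following is a minimal sketch of that arithmetic only; the gene count and the significance level are hypothetical values chosen for illustration and are not taken from the study.

```python
from math import comb

def pairwise_bonferroni_threshold(n_genes, alpha=0.05):
    """Return (per-test threshold, number of tests) when every
    unordered pair of genes is tested once for interaction."""
    n_tests = comb(n_genes, 2)      # n * (n - 1) / 2 unordered pairs
    return alpha / n_tests, n_tests

# Hypothetical example: 1,000 genes (an assumption, not a figure from the abstract).
threshold, n_tests = pairwise_bonferroni_threshold(1000)
print(n_tests, threshold)
```

Because the number of pairs grows quadratically with the number of genes, the per-test threshold becomes very strict, which is one reason a gene-level unit of analysis is attractive compared with testing all SNP pairs.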
CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics
Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David
2013-01-01
Background Today, there are more than a hundred times as many sequenced prokaryotic genomes as were present in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. Results The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. Conclusion This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analysis without much training. PMID:23577086
Conference summary: Navigating the Sea of Genomic Data, October 28-29, 2015.
Pihlstrom, Bruce L; Barnett, Michael L
2016-03-01
The rapid pace of biomedical discoveries in the past few years has resulted in substantial advances in our ability to diagnose, treat, and prevent a wide variety of diseases. The sequencing of the human genome offered the possibility of understanding the etiology, pathogenesis, and risk of developing disease from a genetic perspective and has resulted, for example, in the development of genomic-based diagnostic or risk-assessment tests for a number of medical and dental conditions. To assess the scientific evidence underlying such tests and determine whether they may be useful in clinical practice, practitioners need to have a basic understanding of the state of the science of genomics and genetic testing. To assist practitioners in understanding the science of genomics, the American Dental Association and the Task Force on Design and Analysis in Oral Health Research co-sponsored a landmark conference, Navigating the Sea of Genomic Data, held October 28-29, 2015, at the American Dental Association headquarters building in Chicago, IL. The purpose of this conference was to review the basics of genomic science, promote sound design and analysis of genomic studies of oral diseases, and provide a basis or "framework" to guide practitioners in assessing new developments in genomics and genetic tests for oral diseases. Presentations at this conference were made by 9 world-renowned scientists who discussed a wide range of topics involving genomic science, genetic testing for rare mendelian single gene disorders, and genetic testing for assessing the risk of experiencing common complex diseases. This article summarizes the key points and concepts presented by the speakers. It is essential for oral health care professionals to have a fundamental understanding of genomic science so that they can evaluate new advances in this field and the use of genetic testing for the benefit of their patients. Copyright © 2016 American Dental Association. Published by Elsevier Inc.
All rights reserved.
Systems Approaches to Biology and Disease Enable Translational Systems Medicine
Hood, Leroy; Tian, Qiang
2012-01-01
The development and application of systems strategies to biology and disease are transforming medical research and clinical practice at an unprecedented rate. In the foreseeable future, clinicians, medical researchers, and ultimately the consumers and patients will be increasingly equipped with a deluge of personal health information, e.g., whole genome sequences, molecular profiling of diseased tissues, and periodic multi-analyte blood testing of biomarker panels for disease and wellness. The convergence of these practices will enable accurate prediction of disease susceptibility and early diagnosis for actionable preventive schema and personalized treatment regimes tailored to each individual. It will also entail proactive participation from all major stakeholders in the health care system. We are at the dawn of predictive, preventive, personalized, and participatory (P4) medicine, the full implementation of which requires marrying basic and clinical research through advanced systems thinking and the employment of high-throughput technologies in genomics, proteomics, nanofluidics, single-cell analysis, and computation strategies in a highly orchestrated discipline we termed translational systems medicine. PMID:23084773
Finding similar nucleotide sequences using network BLAST searches.
Ladunga, Istvan
2009-06-01
The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge.
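The remark that parsed BLAST results can be integrated into analysis pipelines can be illustrated with a minimal sketch. It assumes the search results were saved in the standard 12-column tabular layout (`-outfmt 6` in BLAST+); the function name, the E-value cutoff, and the two sample rows are invented for illustration.

```python
import csv
import io

# Column names of the default BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def parse_blast_tabular(handle, max_evalue=1e-5):
    """Read tab-separated BLAST hits and keep those at or below max_evalue."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue                      # skip blank and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])    # numeric fields used for filtering
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits

# Two made-up hits: one strong, one weak.
SAMPLE = (
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-100\t360\n"
    "query1\tsubjB\t75.00\t40\t10\t0\t5\t44\t10\t49\t0.5\t30.1\n"
)
strong_hits = parse_blast_tabular(io.StringIO(SAMPLE))
print([h["sseqid"] for h in strong_hits])
```

With the illustrative cutoff, only the first hit is retained; the weak hit would remain available as a hint for further analysis if the cutoff were relaxed.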
USDA-ARS's Scientific Manuscript database
We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the minimum information about any (x) sequence (MIxS). The standards are the minimum information about a single amplified genome (MISAG) and the ...
Kim, ChangKug; Park, DongSuk; Seol, YoungJoo; Hahn, JangHo
2011-01-01
The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage. PMID:21887015
Zhu, Chuankun; Tong, Jingou; Yu, Xiaomu; Guo, Wenjie
2015-08-01
Comparative mapping provides an efficient method to connect genomes of non-model and model fishes. In this study, we used flanking sequences of the 659 microsatellites on a genetic map of bighead carp (Aristichthys nobilis) to comprehensively study syntenic relationships between bighead carp and nine model and non-model fishes. Of the five model and two food fishes with whole genome data, Cyprinus carpio showed the highest rate of positive BLAST hits (95.3 %) with bighead carp map, followed by Danio rerio (70.9 %), Oreochromis niloticus (21.7 %), Tetraodon nigroviridis (6.4 %), Gasterosteus aculeatus (5.2 %), Oryzias latipes (4.7 %) and Fugu rubripes (3.5 %). Chromosomal syntenic analyses showed that inversion was the basic chromosomal rearrangement during genomic evolution of cyprinids, and the extent of inversions and translocations was found to be positively correlated with evolutionary relationships among fishes studied. Among the five investigated cyprinids, linkage groups (LGs) of bighead carp, Hypophthalmichthys molitrix and Ctenopharyngodon idella exhibited a one-to-one relationship. Besides, LG 9 of bighead carp and homologous LGs of silver carp and grass carp all corresponded to the chromosomes 10 and 22 of zebrafish, suggesting that chromosomal fission may have occurred in the ancestor of zebrafish. On the other hand, LGs of bighead carp and common carp showed an approximate one-to-two relationship with extensive translocations, confirming the occurrence of a 4th whole genome duplication in common carp. This study provides insights into the understanding of genome evolution among cyprinids and would aid in transferring positional and functional information of genes from model fish like zebrafish to non-model fish like bighead carp.
Mao, Meng; Yang, Xiushuai; Poff, Kirsten; Bennett, Gordon
2017-06-01
Insect species in the Auchenorrhyncha suborder (Hemiptera) maintain ancient obligate symbioses with bacteria that provide essential amino acids (EAAs) deficient in their plant-sap diets. Molecular studies have revealed that two complementary symbiont lineages, "Candidatus Sulcia muelleri" and a betaproteobacterium ("Ca. Zinderia insecticola" in spittlebugs [Cercopoidea] and "Ca. Nasuia deltocephalinicola" in leafhoppers [Cicadellidae]) may have persisted in the suborder since its origin ∼300 Ma. However, investigation of how this pair has co-evolved on a genomic level is limited to only a few host lineages. We sequenced the complete genomes of Sulcia and a betaproteobacterium from the treehopper, Entylia carinata (Membracidae: ENCA), as the first representative from this species-rich group. It also offers the opportunity to compare symbiont evolution across a major insect group, the Membracoidea (leafhoppers + treehoppers). Genomic analyses show that the betaproteobacteria in ENCA is a member of the Nasuia lineage. Both symbionts have larger genomes (Sulcia = 218 kb and Nasuia = 144 kb) than related lineages in Deltocephalinae leafhoppers, retaining genes involved in basic cellular functions and information processing. Nasuia-ENCA further exhibits few unique gene losses, suggesting that its parent lineage in the common ancestor to the Membracoidea was already highly reduced. Sulcia-ENCA has lost the abilities to synthesize menaquinone cofactor and to complete the synthesis of the branched-chain EAAs. Both capabilities are conserved in other Sulcia lineages sequenced from across the Auchenorrhyncha. Finally, metagenomic sequencing recovered the partial genome of an Arsenophonus symbiont, although it infects only 20% of individuals indicating a facultative role. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yang, Xiushuai; Poff, Kirsten; Bennett, Gordon
2017-01-01
Insect species in the Auchenorrhyncha suborder (Hemiptera) maintain ancient obligate symbioses with bacteria that provide essential amino acids (EAAs) deficient in their plant-sap diets. Molecular studies have revealed that two complementary symbiont lineages, “Candidatus Sulcia muelleri” and a betaproteobacterium (“Ca. Zinderia insecticola” in spittlebugs [Cercopoidea] and “Ca. Nasuia deltocephalinicola” in leafhoppers [Cicadellidae]) may have persisted in the suborder since its origin ∼300 Ma. However, investigation of how this pair has co-evolved on a genomic level is limited to only a few host lineages. We sequenced the complete genomes of Sulcia and a betaproteobacterium from the treehopper, Entylia carinata (Membracidae: ENCA), as the first representative from this species-rich group. It also offers the opportunity to compare symbiont evolution across a major insect group, the Membracoidea (leafhoppers + treehoppers). Genomic analyses show that the betaproteobacteria in ENCA is a member of the Nasuia lineage. Both symbionts have larger genomes (Sulcia = 218 kb and Nasuia = 144 kb) than related lineages in Deltocephalinae leafhoppers, retaining genes involved in basic cellular functions and information processing. Nasuia-ENCA further exhibits few unique gene losses, suggesting that its parent lineage in the common ancestor to the Membracoidea was already highly reduced. Sulcia-ENCA has lost the abilities to synthesize menaquinone cofactor and to complete the synthesis of the branched-chain EAAs. Both capabilities are conserved in other Sulcia lineages sequenced from across the Auchenorrhyncha. Finally, metagenomic sequencing recovered the partial genome of an Arsenophonus symbiont, although it infects only 20% of individuals indicating a facultative role. PMID:28854637
Mennigen, Jan A; Zhang, Dapeng
2016-12-01
Rainbow trout represent an important teleost research model and aquaculture species. As such, rainbow trout are employed in diverse areas of biological research, including basic biological disciplines such as comparative physiology, toxicology, and, since rainbow trout have undergone both teleost- and salmonid-specific rounds of genome duplication, molecular evolution. In recent years, microRNAs (miRNAs, small non-protein coding RNAs) have emerged as important posttranscriptional regulators of gene expression in animals. Given the increasingly recognized importance of miRNAs as an additional layer in the regulation of gene expression and hence biological function, recent efforts using RNA- and genome sequencing approaches have resulted in the creation of several resources for the construction of a comprehensive repertoire of rainbow trout miRNAs and isomiRs (variant miRNA sequences that all appear to derive from the same gene but vary in sequence due to post-transcriptional processing). Importantly, through the recent publication of the rainbow trout genome (Berthelot et al., 2014), mRNA 3'UTR information has become available, allowing for the first time the genome-wide prediction of miRNA-target RNA relationships in this species. We here report the creation of the microtrout database, a comprehensive resource for rainbow trout miRNA and annotated 3'UTRs. The comprehensive database was used to implement an algorithm to predict genome-wide rainbow trout-specific miRNA-mRNA target relationships, generating an improved predictive framework over previously published approaches. This work will serve as a useful framework and sequence resource to experimentally address the role of miRNAs in several research areas using the rainbow trout model, examples of which are discussed. Copyright © 2016 Elsevier Inc. All rights reserved.
SOPanG: online text searching over a pan-genome.
Cislak, Aleksander; Grabowski, Szymon; Holub, Jan
2018-06-22
The many thousands of high-quality genomes available nowadays imply a shift from single genome to pan-genomic analyses. A basic algorithmic building block for such a scenario is online search over a collection of similar texts, a problem with surprisingly few solutions presented so far. We present SOPanG, a simple tool for exact pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome. Thanks to bit-parallelism, it achieves pattern matching speeds above 400MB/s, more than an order of magnitude higher than that of other software. SOPanG is available for free from: https://github.com/MrAlexSee/sopang. Supplementary data are available at Bioinformatics online.
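To make the elastic-degenerate string model concrete, the sketch below performs exact pattern matching over a sequence of segments, each holding alternative variants (possibly the empty string). This is a plain set-based illustration, not SOPanG's bit-parallel algorithm; the function name, the list-of-lists representation, and the choice to report the indices of segments in which a match ends are all assumptions made for the example.

```python
def ed_search(segments, pattern):
    """Return indices of segments in which some occurrence of pattern ends.

    segments: list of lists of strings; one variant is chosen per segment.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    active = set()   # prefix lengths of pattern matched up to the last segment boundary
    hits = []
    for idx, variants in enumerate(segments):
        next_active = set()
        found = False
        for v in variants:
            # Extend partial matches carried over from earlier segments.
            for j in active:
                need = m - j
                if len(v) >= need:
                    if v[:need] == pattern[j:]:
                        found = True
                elif pattern[j:j + len(v)] == v:
                    next_active.add(j + len(v))   # an empty variant keeps j unchanged
            # Start new matches at every position inside this variant.
            for s in range(len(v)):
                tail = v[s:]
                if len(tail) >= m:
                    if tail[:m] == pattern:
                        found = True
                elif pattern.startswith(tail):
                    next_active.add(len(tail))
        if found:
            hits.append(idx)
        active = next_active
    return hits

# Toy pan-genome: AC, then one of G / T / (deletion), then TA.
toy = [["AC"], ["G", "T", ""], ["TA"]]
print(ed_search(toy, "CTA"))
```

The toy example encodes the three sequences ACGTA, ACTTA, and ACTA in a single structure, so a pattern such as CTA is found through the path that takes the empty variant in the middle segment.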
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas
The number of genomes from uncultivated microbes will soon surpass the number of isolate genomes in public databases (Hugenholtz, Skarshewski, & Parks, 2016). Technological advancements in high-throughput sequencing and assembly, including single-cell genomics and the computational extraction of genomes from metagenomes (GFMs), are largely responsible. Here we propose community standards for reporting the Minimum Information about a Single-Cell Genome (MIxS-SCG) and Minimum Information about Genomes extracted From Metagenomes (MIxS-GFM) specific for Bacteria and Archaea. The standards have been developed in the context of the International Genomics Standards Consortium (GSC) community (Field et al., 2014) and can be viewed as a supplement to other GSC checklists including the Minimum Information about a Genome Sequence (MIGS), Minimum information about a Metagenomic Sequence(s) (MIMS) (Field et al., 2008) and Minimum Information about a Marker Gene Sequence (MIMARKS) (P. Yilmaz et al., 2011). Community-wide acceptance of MIxS-SCG and MIxS-GFM for Bacteria and Archaea will enable broad comparative analyses of genomes from the majority of taxa that remain uncultivated, improving our understanding of microbial function, ecology, and evolution.
USDA-ARS's Scientific Manuscript database
Basic leucine zipper (bZIP) genes are known to play dominant roles in plant response to development signals, as well as abiotic or biotic stress stimuli. Fifty bZIP genes across the woodland strawberry (Fragaria vesca) genome were identified and analyzed. They can be divided into 10 clades according...
CRISPR/Cas system for yeast genome engineering: advances and applications
Stovicek, Vratislav; Holkenbrink, Carina
2017-01-01
The methods based on the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system have quickly gained popularity for genome editing and transcriptional regulation in many organisms, including yeast. This review aims to provide a comprehensive overview of CRISPR application for different yeast species: from basic principles and genetic design to applications. PMID:28505256
Multiple Myeloma Genomics: A Systematic Review.
Weaver, Casey J; Tariman, Joseph D
2017-08-01
This integrative review describes the genomic variants that have been found to be associated with poor prognosis in patients diagnosed with multiple myeloma (MM). Second, it identifies MM genetic and genomic changes using next-generation sequencing, specifically whole-genome sequencing or exome sequencing. A search for peer-reviewed articles through PubMed, EBSCOhost, and DePaul WorldCat Libraries Worldwide yielded 33 articles that were included in the final analysis. The most commonly reported genetic changes were KRAS, NRAS, TP53, FAM46C, BRAF, DIS3, ATM, and CCND1. These genetic changes play a role in the pathogenesis of MM, prognostication, and therapeutic targets for novel therapies. MM genetics and genomics are expanding rapidly; oncology nurse clinicians must have basic competencies in genetics and genomics to help patients understand the complexities of genetic and genomic alterations and be able to refer patients to appropriate genomic professionals if needed. Copyright © 2017 Elsevier Inc. All rights reserved.
Stade, Björn; Seelow, Dominik; Thomsen, Ingo; Krawczak, Michael; Franke, Andre
2014-01-01
Next Generation Sequencing (NGS) of whole exomes or genomes is increasingly being used in human genetic research and diagnostics. Sharing NGS data with third parties can help physicians and researchers to identify causative or predisposing mutations for a specific sample of interest more efficiently. In many cases, however, the exchange of such data may collide with data privacy regulations. GrabBlur is a newly developed tool to aggregate and share NGS-derived single nucleotide variant (SNV) data in a public database, keeping individual samples unidentifiable. In contrast to other currently existing SNV databases, GrabBlur includes phenotypic information and contact details of the submitter of a given database entry. By means of GrabBlur human geneticists can securely and easily share SNV data from resequencing projects. GrabBlur can ease the interpretation of SNV data by offering basic annotations, genotype frequencies and in particular phenotypic information - given that this information was shared - for the SNV of interest. GrabBlur facilitates the combination of phenotypic and NGS data (VCF files) via a local interface or command line operations. Data submissions may include HPO (Human Phenotype Ontology) terms, other trait descriptions, NGS technology information and the identity of the submitter. Most of this information is optional and its provision at the discretion of the submitter. Upon initial intake, GrabBlur merges and aggregates all sample-specific data. If a certain SNV is rare, the sample-specific information is replaced with the submitter identity. Generally, all data in GrabBlur are highly aggregated so that they can be shared with others while ensuring maximum privacy. Thus, it is impossible to reconstruct complete exomes or genomes from the database or to re-identify single individuals. 
After the individual information has been sufficiently "blurred", the data can be uploaded into a publicly accessible domain where aggregated genotypes are provided alongside phenotypic information. A web interface allows querying the database and the extraction of gene-wise SNV information. If an interesting SNV is found, the interrogator can get in contact with the submitter to exchange further information on the carrier and clarify, for example, whether the latter's phenotype matches with phenotype of their own patient.
A dictionary based informational genome analysis
2012-01-01
Background In the post-genomic era, several methods of computational genomics are emerging to understand how information as a whole is structured within genomes. The literature of the last five years reports several alignment-free methods, which have arisen as alternative metrics for the dissimilarity of biological sequences. Among these, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying the methodology outlined here carried out some computations on genomic data. We computed informational indexes and built genomic dictionaries of different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, and index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of keen interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many lines of investigation, and exported to other contexts of computational genomics, for example as a basis for discrimination of genomic pathologies. PMID:22985068
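The notion of a genomic dictionary with empirical k-mer frequencies can be illustrated with a short sketch. The function names and the toy sequence are hypothetical, and Shannon entropy is used here only as one plausible example of an informational index; the abstract does not specify which indexes the authors computed.

```python
from collections import Counter
from math import log2

def kmer_dictionary(genome, k):
    """Count every overlapping k-mer (word of length k) in the sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def empirical_entropy(counts):
    """Shannon entropy (in bits) of the empirical k-mer frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

toy_genome = "ACGTACGTAC"                 # toy sequence, not real data
d2 = kmer_dictionary(toy_genome, 2)
print(len(d2), d2.most_common(1), round(empirical_entropy(d2), 3))
```

The dictionary size, the multiplicity of each word, and the entropy of the frequency distribution are all derived from the same counting pass, which is what makes this kind of alignment-free analysis cheap to apply to whole genomes.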
Building international genomics collaboration for global health security
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.
Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.
Building international genomics collaboration for global health security
Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; ...
2015-12-07
Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.
Dostálková, Alžběta; Kaufman, Filip; Křížová, Ivana; Kultová, Anna; Strohalmová, Karolína; Hadravová, Romana; Ruml, Tomáš; Rumlová, Michaela
2018-05-15
In addition to specific RNA-binding zinc finger domains, the retroviral Gag polyprotein contains clusters of basic amino acid residues that are thought to support Gag-viral genomic RNA (gRNA) interactions. One of these clusters is the basic K16NK18EK20 region, located upstream of the first zinc finger of the Mason-Pfizer monkey virus (M-PMV) nucleocapsid (NC) protein. To investigate the role of this basic region in the M-PMV life cycle, we used a combination of in vivo and in vitro methods to study a series of mutants in which the overall charge of this region was more positive (RNRER), more negative (AEAEA), or neutral (AAAAA). The mutations markedly affected gRNA incorporation and the onset of reverse transcription. The introduction of a more negative charge (AEAEA) significantly reduced the incorporation of M-PMV gRNA into nascent particles. Moreover, the assembly of immature particles of the AEAEA Gag mutant was relocated from the perinuclear region to the plasma membrane. In contrast, an enhancement of the basicity of this region of M-PMV NC (RNRER) caused a substantially more efficient incorporation of gRNA, subsequently resulting in an increase in M-PMV RNRER infectivity. Nevertheless, despite the larger amount of gRNA packaged by the RNRER mutant, the onset of reverse transcription was delayed in comparison to that of the wild type. Our data clearly show the requirement for certain positively charged amino acid residues upstream of the first zinc finger for proper gRNA incorporation, assembly of immature particles, and proceeding of reverse transcription. IMPORTANCE We identified a short sequence within the Gag polyprotein that, together with the zinc finger domains and the previously identified RKK motif, contributes to the packaging of genomic RNA (gRNA) of Mason-Pfizer monkey virus (M-PMV).
Importantly, in addition to gRNA incorporation, this basic region (KNKEK) at the N terminus of the nucleocapsid protein is crucial for the onset of reverse transcription. Mutations that change the positive charge of the region to a negative one significantly reduced specific gRNA packaging. The assembly of immature particles of this mutant was reoriented from the perinuclear region to the plasma membrane. On the contrary, an enhancement of the basic character of this region increased both the efficiency of gRNA packaging and the infectivity of the virus. However, the onset of reverse transcription was delayed even in this mutant. In summary, the basic region in M-PMV Gag plays a key role in the packaging of genomic RNA and, consequently, in assembly and reverse transcription. Copyright © 2018 American Society for Microbiology.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives
Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu
2017-01-01
With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart
2017-04-24
High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays, CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes.
This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.
Govin, Jerome; Gaucher, Jonathan; Ferro, Myriam; Debernardi, Alexandra; Garin, Jerome; Khochbin, Saadi; Rousseaux, Sophie
2012-01-01
After meiosis, during the final stages of spermatogenesis, the haploid male genome undergoes major structural changes, resulting in a shift from a nucleosome-based genome organization to the sperm-specific, highly compacted nucleoprotamine structure. Recent data support the idea that region-specific programming of the haploid male genome is of high importance for the post-fertilization events and for successful embryo development. Although these events constitute a unique and essential step in reproduction, the mechanisms by which they occur have remained completely obscure and the factors involved have mostly remained uncharacterized. Here, we sought a strategy to significantly increase our understanding of proteins controlling the haploid male genome reprogramming, based on the identification of proteins in two specific pools: those with the potential to bind nucleic acids (basic proteins) and proteins capable of binding basic proteins (acidic proteins). For the identification of acidic proteins, we developed an approach involving a transition-protein (TP)-based chromatography, which has the advantage of retaining not only acidic proteins due to the charge interactions, but also potential TP-interacting factors. A second strategy, based on an in-depth bioinformatic analysis of the identified proteins, was then applied to pinpoint, within the lists obtained, factors expressed in male germ cells that are relevant to post-meiotic genome organization. This approach reveals a functional network of DNA-packaging proteins and their putative chaperones and sheds new light on the way the critical transitions in genome organization could take place. This work also points to a new area of research in male infertility and sperm quality assessments.
Sperber, Göran; Lövgren, Anders; Eriksson, Nils-Einar; Benachenhou, Farid; Blomberg, Jonas
2009-01-01
Background: The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several hundred million years. A better understanding of the structure and function of these sequences can have profound biological and medical consequences. Methods: RetroTector© (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. Results: ROL was implemented under the Batchelor web interface (A Lövgren et al). It accepts sequences (5 to 10 000 kilobases) by GenBank accession number, file, or FASTA cut-and-paste. Up to ten submissions can be made simultaneously, allowing batch analysis of up to 100 Megabases. Jobs are shown in an IP-number specific list. Results are text files and can be viewed with the program RetroTectorViewer.jar (at the same site), which has the full graphical capabilities of the basic ReTe program. A detailed analysis of any retroviral sequences found in the submitted sequence is presented graphically and is exportable in standard formats. With the current server, a full analysis of a 1 Megabase sequence takes 10 minutes. It is possible to mask nonretroviral repetitive sequences in the submitted sequence, using host genome specific "brooms", which increase specificity.
Discussion: Proviral sequences can be hard to recognize, especially if the integration occurred many millions of years ago. Precise delineation of LTR, gag, pro, pol and env can be difficult, requiring manual work. ROL is a way of simplifying these tasks. Conclusion: ROL provides (1) annotation and presentation of known retroviral sequences, and (2) detection of proviral chains in unknown genomic sequences, with up to 100 Mbase per submission. PMID:19534753
Sperber, Göran; Lövgren, Anders; Eriksson, Nils-Einar; Benachenhou, Farid; Blomberg, Jonas
2009-06-16
The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several hundred million years. A better understanding of the structure and function of these sequences can have profound biological and medical consequences. RetroTector (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. ROL (http://www.fysiologi.neuro.uu.se/jbgs/) was implemented under the Batchelor web interface (A Lövgren et al). It accepts sequences (5 to 10,000 kilobases) by GenBank accession number, file, or FASTA cut-and-paste. Up to ten submissions can be done simultaneously, allowing batch analysis of
Axelsen, Jacob Bock; Yan, Koon-Kiu; Maslov, Sergei
2007-01-01
Background: The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins. Results: We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities p generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form ~ p^(-γ), with the value of the exponent γ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent α ≈ 0.33. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate. Conclusion: We separately measure the short-term ("raw") duplication and deletion rates r*_dup and r*_del, which include gene copies that will be removed soon after the duplication event, and their dramatically reduced long-term counterparts r_dup and r_del. The high deletion rate among recently duplicated proteins is consistent with a scenario in which they didn't have enough time to significantly change their functional roles and thus are to a large degree disposable. Systematic trends of each of the four duplication/deletion rates with the total number of genes in the genome were analyzed. All but the deletion rate of recent duplicates r*_del were shown to systematically increase with N_genes.
Abnormally flat shapes of sequence identity histograms observed for yeast and human are consistent with lineages leading to these organisms undergoing one or more whole-genome duplications. This interpretation is corroborated by our analysis of the genome of Paramecium tetraurelia where the p^(-4) profile of the histogram is gradually restored by the successive removal of paralogs generated in its four known whole-genome duplication events. PMID:18039386
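The power-law form of the sequence-identity histogram described in this abstract can be illustrated with a small fitting sketch. The code below estimates the exponent γ by ordinary least squares on the log-log histogram. This is an assumed, simple fitting procedure chosen for illustration and is not stated to be the authors' method; the function name and the synthetic input are hypothetical.

```python
import math

def fit_power_law_exponent(bin_centres, densities):
    """Estimate gamma in density ~ p**(-gamma) from histogram points.

    Assumption: a plain least-squares line through (log p, log density);
    empty bins (density 0) are skipped because their log is undefined.
    """
    points = [(math.log(p), math.log(d))
              for p, d in zip(bin_centres, densities) if p > 0 and d > 0]
    if len(points) < 2:
        raise ValueError("need at least two non-empty bins")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("bin centres must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # The slope of the log-log line is -gamma.
    return -sxy / sxx

# Synthetic histogram that follows p**(-4) exactly (illustrative data only).
centres = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
densities = [2.0 * p ** -4 for p in centres]
gamma = fit_power_law_exponent(centres, densities)
```

On real paralog data the histogram would come from pairwise alignments, and the range of identities over which the power law is fitted would need to be chosen with care.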
It's Not Your Grandmother's Genetics Anymore!
ERIC Educational Resources Information Center
Smith, Mike U.
2014-01-01
Genetics is perhaps the most rapidly growing field of science today. Recent findings such as those of the Human Genome Project have led to new understandings of basic genetic phenomena and even to increased confusion about some basic genetic ideas, such as the nature of the gene. These developments directly influence how we should teach genetics.…
Marques, Isabel; Shiposha, Valeriia; López-Alvarez, Diana; Manzaneda, Antonio J; Hernandez, Pilar; Olonova, Marina; Catalán, Pilar
2017-06-15
Brachypodium distachyon (Poaceae), an annual Mediterranean Aluminum (Al)-sensitive grass, is currently being used as a model species to provide new information on cereals and biofuel crops. The plant has a short life cycle and one of the smallest genomes among the grasses, making it well suited to experimental manipulation. Its genome has been fully sequenced and several genomic resources are being developed to elucidate key traits and gene functions. A reliable germplasm collection that reflects the natural diversity of this species is therefore needed for all these genomic resources. However, despite being a model plant, we still know very little about its genetic diversity. As a first step to overcome this gap, we used nuclear Simple Sequence Repeats (nSSR) to study the patterns of genetic diversity and population structure of B. distachyon in 14 populations sampled across the Iberian Peninsula (Spain), one of its best known areas. We found very low levels of genetic diversity, allelic number and heterozygosity in B. distachyon, congruent with a highly selfing system. Our results indicate the existence of at least three genetic clusters providing additional evidence for the existence of a significant genetic structure in the Iberian Peninsula and supporting this geographical area as an important genetic reservoir. Several hotspots of genetic diversity were detected and populations growing on basic soils were significantly more diverse than those growing in acidic soils. A partial Mantel test confirmed a statistically significant Isolation-By-Distance (IBD) among all studied populations, as well as a statistically significant Isolation-By-Environment (IBE) revealing the presence of environmental-driven isolation as one explanation for the genetic patterns found in the Iberian Peninsula.
The finding of higher genetic diversity in eastern Iberian populations occurring in basic soils suggests that these populations can be better adapted than those occurring in western areas of the Iberian Peninsula where the soils are more acidic and accumulate toxic Al ions. This suggests that the western Iberian acidic soils might prevent the establishment of Al-sensitive B. distachyon populations, potentially causing the existence of more genetically depauperate individuals.
Genome-wide network of regulatory genes for construction of a chordate embryo.
Shoguchi, Eiichi; Hamaguchi, Makoto; Satoh, Nori
2008-04-15
Animal development is controlled by gene regulation networks that are composed of sequence-specific transcription factors (TF) and cell signaling molecules (ST). Although housekeeping genes have been reported to show clustering in the animal genomes, whether the genes comprising a given regulatory network are physically clustered on a chromosome is uncertain. We examined this question in the present study. Ascidians are the closest living relatives of vertebrates, and their tadpole-type larva represents the basic body plan of chordates. The Ciona intestinalis genome contains 390 core TF genes and 119 major ST genes. Previous gene disruption assays led to the formulation of a basic chordate embryonic blueprint, based on over 3000 genetic interactions among 79 zygotic regulatory genes. Here, we mapped the regulatory genes, including all 79 regulatory genes, on the 14 pairs of Ciona chromosomes by fluorescent in situ hybridization (FISH). Chromosomal localization of upstream and downstream regulatory genes demonstrates that the components of coherent developmental gene networks are evenly distributed over the 14 chromosomes. Thus, this study provides the first comprehensive evidence that the physical clustering of regulatory genes, or their target genes, is not relevant for the genome-wide control of gene expression during development.
Nervous system regulation of the cancer genome
Cole, Steven W.
2012-01-01
Genomics-based analyses have provided deep insight into the basic biology of cancer and are now clarifying the molecular pathways by which psychological and social factors can regulate tumor cell gene expression and genome evolution. This review summarizes basic and clinical research on neural and endocrine regulation of the cancer genome and its interactions with the surrounding tumor microenvironment, including the specific types of genes subject to neural and endocrine regulation, the signal transduction pathways that mediate such effects, and therapeutic approaches that might be deployed to mitigate their impact. Beta-adrenergic signaling from the sympathetic nervous system has been found to up-regulate a diverse array of genes that contribute to tumor progression and metastasis, whereas glucocorticoid-regulated genes can inhibit DNA repair and promote cancer cell survival and resistance to chemotherapy. Relationships between socio-environmental risk factors, neural and endocrine signaling to the tumor microenvironment, and transcriptional responses by cancer cells and surrounding stromal cells are providing new mechanistic insights into the social epidemiology of cancer, new therapeutic approaches for protecting the health of cancer patients, and new molecular biomarkers for assessing the impact of behavioral and pharmacologic interventions. PMID:23207104
Batzir, Nurit Assia; Tovin, Adi; Hendel, Ayal
2017-06-01
Genome editing with engineered nucleases is a rapidly growing field thanks to transformative technologies that allow researchers to precisely alter genomes for numerous applications including basic research, biotechnology, and human gene therapy. The genome editing process relies on creating a site-specific DNA double-strand break (DSB) by engineered nucleases and then allowing the cell's repair machinery to repair the break such that precise changes are made to the DNA sequence. The recent development of CRISPR-Cas systems as easily accessible and programmable tools for genome editing accelerates the progress towards using genome editing as a new approach to human therapeutics. Here we review how genome editing using engineered nucleases works and how different genome editing outcomes can be used as a tool set for treating human diseases. We then review the major challenges of therapeutic genome editing and we discuss how its potential enhancement through CRISPR guide RNA and Cas9 protein modifications could resolve some of these challenges. Copyright© of YS Medical Media ltd.
Li, Fengmei; Liu, Wuyi
2017-06-01
The basic helix-loop-helix (bHLH) transcription factors (TFs) form a huge superfamily and play crucial roles in many essential developmental, genetic, and physiological-biochemical processes of eukaryotes. In total, 109 putative bHLH TFs were identified and categorized successfully in the genomic databases of cattle, Bos taurus, after removing redundant sequences and merging genetic isoforms. Through phylogenetic analyses, 105 proteins among these bHLH TFs were classified into 44 families with 46, 25, 14, 3, 13, and 4 members in the high-order groups A, B, C, D, E, and F, respectively. The remaining 4 bHLH proteins were sorted out as 'orphans.' Next, these 109 putative bHLH proteins were further characterized as significantly enriched in 524 Gene Ontology (GO) annotations (corrected P value ≤ 0.05) and 21 pathways (corrected P value ≤ 0.05) that had been mapped by the web server KOBAS 2.0. Furthermore, 95 bHLH proteins were further screened and analyzed together with two uncharacterized proteins in the STRING online database to reconstruct the protein-protein interaction network of cattle bHLH TFs. Ultimately, 89 bHLH proteins were fully mapped in a network with 67 biological processes, 13 molecular functions, 5 KEGG pathways, 12 PFAM protein domains, and 25 INTERPRO classified protein domains and features. These results provide much useful information and a good reference for further functional investigations and research on cattle bHLH TFs.
OryzaGenome: Genome Diversity Database of Wild Oryza Species.
Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi-Xuan; Han, Bin; Kurata, Nori
2016-01-01
The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Bourdon-Lacombe, Julie A; Moffat, Ivy D; Deveau, Michelle; Husain, Mainul; Auerbach, Scott; Krewski, Daniel; Thomas, Russell S; Bushel, Pierre R; Williams, Andrew; Yauk, Carole L
2015-07-01
Toxicogenomics promises to be an important part of future human health risk assessment of environmental chemicals. The application of gene expression profiles (e.g., for hazard identification, chemical prioritization, chemical grouping, mode of action discovery, and quantitative analysis of response) is growing in the literature, but their use in formal risk assessment by regulatory agencies is relatively infrequent. Although additional validations for specific applications are required, gene expression data can be of immediate use for increasing confidence in chemical evaluations. We believe that a primary reason for the current lack of integration is the limited practical guidance available for risk assessment specialists with limited experience in genomics. The present manuscript provides basic information on gene expression profiling, along with guidance on evaluating the quality of genomic experiments and data, and interpretation of results presented in the form of heat maps, pathway analyses and other common approaches. Moreover, potential ways to integrate information from gene expression experiments into current risk assessment are presented using published studies as examples. The primary objective of this work is to facilitate integration of gene expression data into human health risk assessments of environmental chemicals. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Information Commons for Rice (IC4R)
2016-01-01
Rice is the most important staple food for a large part of the world's human population and also a key model organism for plant research. Here, we present Information Commons for Rice (IC4R; http://ic4r.org), a rice knowledgebase featuring adoption of an extensible and sustainable architecture that integrates multiple omics data through community-contributed modules. Each module is developed and maintained by different committed groups, deals with data collection, processing and visualization, and delivers data on-demand via web services. In the current version, IC4R incorporates a variety of rice data through multiple committed modules, including genome-wide expression profiles derived entirely from RNA-Seq data, genomic variations obtained from re-sequencing data of thousands of rice varieties, plant homologous genes covering multiple diverse plant species, post-translational modifications, rice-related literature and gene annotations contributed by the rice research community. Unlike extant related databases, IC4R is designed for scalability and sustainability and thus also features collaborative integration of rice data and low costs for database update and maintenance. Future directions of IC4R include incorporation of other omics data and association of multiple omics data with agronomically important traits, with the aim of building IC4R into a valuable knowledgebase for both basic and translational research in rice. PMID:26519466
Communicating Genetic and Genomic Information: Health Literacy and Numeracy Considerations
Lea, D.H.; Kaphingst, K.A.; Bowen, D.; Lipkus, I.; Hadley, D.W.
2011-01-01
Genomic research is transforming our understanding of the role of genes in health and disease. These advances, and their application to common diseases that affect large segments of the general population, suggest that researchers and practitioners in public health genomics will increasingly be called upon to translate genomic information to individuals with varying levels of health literacy and numeracy. This paper discusses the current state of research regarding public understanding of genetics and genomics, the influence of health literacy and numeracy on genetic communication, and behavioral responses to genetic and genomic information. The existing research suggests that members of the general public have some familiarity with genetic and genomic terms but have gaps in understanding of underlying concepts. Findings from the limited research base to date indicate that health literacy affects understanding of print and oral communications about genetic and genomic information. Numeracy is also likely to be an important predictor of being able to understand and apply this information, although little research has been conducted in this area to date. In addition, although some research has examined behavior change in response to the receipt of information about genetic risk for familial disorders and genomic susceptibility to common, complex diseases, the effects of health literacy and numeracy on these responses have not been examined. Potential areas in which additional research is needed are identified and practical suggestions for presenting numeric risk information are outlined. Public health genomics researchers and practitioners are uniquely positioned to engage in research that explores how different audiences react to and use genomic risk information. PMID:20407217
Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N
2014-09-01
Broad genome-wide testing is increasingly finding its way to the public through the online direct-to-consumer marketing of so-called personal genome tests. Personal genome tests estimate genetic susceptibilities to multiple diseases and other phenotypic traits simultaneously. Providers commonly make use of Terms of Service agreements rather than informed consent procedures. However, to protect consumers from the potential physical, psychological and social harms associated with personal genome testing and to promote autonomous decision-making with regard to the testing offer, we argue that current practices of information provision are insufficient and that there is a place--and a need--for informed consent in personal genome testing, also when it is offered commercially. The increasing quantity, complexity and diversity of most testing offers, however, pose challenges for information provision and informed consent. Both specific and generic models for informed consent fail to meet its moral aims when applied to personal genome testing. Consumers should be enabled to know the limitations, risks and implications of personal genome testing and should be given control over the genetic information they do or do not wish to obtain. We present the outline of a new model for informed consent which can meet both the norm of providing sufficient information and the norm of providing understandable information. The model can be used for personal genome testing, but will also be applicable to other, future forms of broad genetic testing or screening in commercial and clinical settings. © 2012 John Wiley & Sons Ltd.
Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok
2005-09-09
Theoretical proteome analysis, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, shows a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome.
Sievers, Aaron; Bosiek, Katharina; Bisch, Marc; Dreessen, Chris; Riedel, Jascha; Froß, Patrick; Hausmann, Michael; Hildenbrand, Georg
2017-01-01
In genome analysis, k-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve k-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local k-mer spectra (frequency distribution of k-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤ k ≤ 4) on relatively small viral genomes of Papillomaviridae and Herpesviridae, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in Papillomaviridae and Herpesviridae formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the k-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be potentially useful for further taxonomic research. Furthermore, unknown k-mer correlations in the genomes of Human Herpesviruses (HHVs), which are probably of major biological function, are found and described. Using the chromosomes of a chimpanzee and human that are currently known, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. 
Based on these results, we suggest k-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard k-mer analysis. PMID:28422050
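The idea of a position-resolved k-mer spectrum can be illustrated with a short sketch. This is not the authors' implementation: the sliding-window scheme, the function names `local_kmer_spectra` and `pearson`, and the `window` and `step` parameters are assumptions made for illustration. Each window yields a frequency vector over all 4^k k-mers, and two windows are compared with a basic vector correlation.

```python
from collections import Counter
from itertools import product
from math import sqrt


def local_kmer_spectra(seq, k, window, step):
    """Return (start, spectrum) pairs for sliding windows over seq.

    Each spectrum is a list of relative frequencies over all 4**k
    k-mers in lexicographic order (alphabet ACGT). Hypothetical helper.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    spectra = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        counts = Counter(chunk[i:i + k] for i in range(len(chunk) - k + 1))
        # Only k-mers made of A, C, G, T contribute to the total.
        total = sum(counts[m] for m in kmers)
        spectra.append(
            (start, [counts[m] / total if total else 0.0 for m in kmers])
        )
    return spectra


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


if __name__ == "__main__":
    spectra = local_kmer_spectra("ACGT" * 10, k=2, window=20, step=20)
    (_, first), (_, second) = spectra
    print(pearson(first, second))
```

Plotting the correlation of each window against a reference window along the sequence would give a simple genomic map of the kind the abstract describes.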
Reconstruction of a Bacterial Genome from DNA Cassettes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christopher Dupont; John Glass; Laura Sheahan
2011-12-31
This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5 Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.
Forster, Samuel C; Browne, Hilary P; Kumar, Nitin; Hunt, Martin; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Lawley, Trevor D
2016-01-04
The Human Pan-Microbe Communities (HPMC) database (http://www.hpmcd.org/) provides a manually curated, searchable, metagenomic resource to facilitate investigation of human gastrointestinal microbiota. Over the past decade, the application of metagenome sequencing to elucidate the microbial composition and functional capacity present in the human microbiome has revolutionized many concepts in our basic biology. When sufficient high quality reference genomes are available, whole genome metagenomic sequencing can provide direct biological insights and high-resolution classification. The HPMC database provides species level, standardized phylogenetic classification of over 1800 human gastrointestinal metagenomic samples. This is achieved by combining a manually curated list of bacterial genomes from human faecal samples with over 21000 additional reference genomes representing bacteria, viruses, archaea and fungi with manually curated species classification and enhanced sample metadata annotation. A user-friendly, web-based interface provides the ability to search for (i) microbial groups associated with health or disease state, (ii) health or disease states and community structure associated with a microbial group, (iii) the enrichment of a microbial gene or sequence and (iv) enrichment of a functional annotation. The HPMC database enables detailed analysis of human microbial communities and supports research from basic microbiology and immunology to therapeutic development in human health and disease. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Catalan, Pilar; Chalhoub, Boulos; Chochois, Vincent; Garvin, David F; Hasterok, Robert; Manzaneda, Antonio J; Mur, Luis A J; Pecchioni, Nicola; Rasmussen, Søren K; Vogel, John P; Voxeur, Aline
2014-07-01
The scientific presentations at the First International Brachypodium Conference (abstracts available at http://www.brachy2013.unimore.it) are evidence of the widespread adoption of Brachypodium distachyon as a model system. Furthermore, the wide range of topics presented (genome evolution, roots, abiotic and biotic stress, comparative genomics, natural diversity, and cell walls) demonstrates that the Brachypodium research community has achieved a critical mass of tools and has transitioned from resource development to addressing biological questions, particularly those unique to grasses. Copyright © 2014 Elsevier Ltd. All rights reserved.
Genomic instability and bystander effects: a paradigm shift in radiation biology?
NASA Technical Reports Server (NTRS)
Morgan, William F.
2002-01-01
A basic paradigm in radiobiology is that, following exposure to ionizing radiation, the deposition of energy in the cell nucleus and the resulting damage to DNA, the principal target, are responsible for the radiation's deleterious biological effects. Findings in two rapidly expanding fields of research--radiation-induced genomic instability and bystander effects--have caused us to reevaluate these central tenets. In this article, the potential influence of induced genomic instability and bystander effects on cellular injury after exposure to low-level radiation will be reviewed.
CRISPR-based technologies for the manipulation of eukaryotic genomes
Komor, Alexis C.; Badran, Ahmed H.; Liu, David R.
2016-01-01
The CRISPR-Cas9 RNA-guided DNA endonuclease has contributed to an explosion of advances in the life sciences that have grown from the ability to edit genomes within living cells. In this review we summarize CRISPR-based technologies that enable mammalian genome editing and their various applications. We describe recent developments that extend the generality, DNA specificity, product selectivity, and fundamental capabilities of natural CRISPR systems, and some of the remarkable advancements in basic research, biotechnology, and therapeutics development that these developments have facilitated. PMID:27866654
Internet Versus Virtual Reality Settings for Genomics Information Provision.
Persky, Susan; Kistler, William D; Klein, William M P; Ferrer, Rebecca A
2018-06-22
Current models of genomic information provision will be unable to handle large-scale clinical integration of genomic information, as may occur in primary care settings. Therefore, adoption of digital tools for genetic and genomic information provision is anticipated, primarily using Internet-based, distributed approaches. The emerging consumer communication platform of virtual reality (VR) is another potential intermediate approach between face-to-face and distributed Internet platforms to engage in genomics education and information provision. This exploratory study assessed whether provision of genomics information about body weight in a simulated, VR-based consultation (relative to a distributed, Internet platform) would be associated with differences in health behavior-related attitudes and beliefs, and interpersonal reactions to the avatar-physician. We also assessed whether outcomes differed depending upon whether genomic versus lifestyle-oriented information was conveyed. There were significant differences between communication platforms for all health behavior-oriented outcomes. Following communication in the VR setting, participants reported greater self-efficacy, dietary behavioral intentions, and exercise behavioral intentions than in the Internet-based setting. There were no differences in trust of the physician by setting, and no interaction between setting effects and the content of the information. This study was a first attempt to examine the potential capabilities of a VR-based communication setting for conveying genomic content in the context of weight management. There may be benefits to use of VR settings for communication about genomics, as well as more traditional health information, when it comes to influencing the attitudes and beliefs that underlie healthy lifestyle behaviors.
Identification of functional elements and regulatory circuits by Drosophila modENCODE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roy, Sushmita; Ernst, Jason; Kharchenko, Peter V.
2010-12-22
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology.
The functions of approximately 40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1). Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3' untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.
Targeted mutagenesis using zinc-finger nucleases in perennial fruit trees.
Peer, Reut; Rivlin, Gil; Golobovitch, Sara; Lapidot, Moshe; Gal-On, Amit; Vainstein, Alexander; Tzfira, Tzvi; Flaishman, Moshe A
2015-04-01
Targeting a gene in apple or fig with ZFN, introduced by transient or stable transformation, should allow genome editing with high precision to advance basic science and breeding programs. Genome editing is a powerful tool for precise gene manipulation in any organism; it has recently been shown to be of great value for annual plants. Classical breeding strategies using conventional cross-breeding and induced mutations have played an important role in the development of new cultivars in fruit trees. However, fruit-tree breeding is a lengthy process with many limitations. Efficient and widely applied methods for targeted modification of fruit-tree genomes are not yet available. In this study, transgenic apple and fig lines carrying a zinc-finger nuclease (ZFN) under the control of a heat-shock promoter were developed. Editing of a mutated uidA gene, following expression of the ZFN genes by heat shock, was confirmed by GUS staining and PCR product sequencing. Finally, whole plants with a repaired uidA gene due to deletion of a stop codon were regenerated. The ZFN-mediated gene modifications were stable and passed on to regenerants from ZFN-treated tissue cultures. This is the first demonstration of efficient and precise genome editing, using ZFN at a specific genomic locus, in two different perennial fruit trees: apple and fig. We conclude that targeting a gene in apple or fig with a ZFN introduced by transient or stable transformation should allow knockout of a gene of interest. Using this technology for genome editing allows for marker gene-independent and antibiotic selection-free genome engineering with high precision in fruit trees to advance basic science as well as nontransgenic breeding programs.
Federal Research and Development Funding: FY2011
2011-03-25
malignancies, and will undertake complete genome sequencing and analysis of 300 autism spectrum disorder cases. In support of the National Nanotechnology...clinical trials by 2016. NIH’s HIV/AIDS research portfolio, covering the spectrum from basic viral research to vaccine development trials, would...cancer, heart disease, and autism, particularly over $1 billion in research applying the technology produced by the Human Genome Project.42 Table 9
An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.
Shao, Mingfu; Lin, Yu; Moret, Bernard M E
2015-05-01
Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.
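The linear-time case mentioned above (no duplicate genes) can be sketched for the simplest setting. The code below is not the ILP method of the article; it is a minimal illustration that assumes both genomes consist only of circular chromosomes over the same gene set, where the DCJ distance equals the number of genes minus the number of cycles in the adjacency graph. The function names and the signed-integer gene encoding are choices made for this sketch.

```python
def _adjacencies(genome):
    """Map each gene extremity to its neighbour in a circular genome.

    genome: list of circular chromosomes, each a list of signed ints
    (a negative sign means reversed orientation). Extremities are
    (gene, "t") for tail and (gene, "h") for head.
    """
    partner = {}
    for chrom in genome:
        for i, a in enumerate(chrom):
            b = chrom[(i + 1) % len(chrom)]
            right = (abs(a), "h" if a > 0 else "t")  # right end of a
            left = (abs(b), "t" if b > 0 else "h")   # left end of b
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance_circular(genome_a, genome_b):
    """DCJ distance n - c for circular genomes without duplicate genes."""
    adj_a = _adjacencies(genome_a)
    adj_b = _adjacencies(genome_b)
    if adj_a.keys() != adj_b.keys():
        raise ValueError("genomes must contain the same genes, each once")
    n = len(adj_a) // 2  # two extremities per gene
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating adjacencies of A and of B.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return n - cycles


if __name__ == "__main__":
    print(dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]]))
```

Each extremity is visited once, so the running time is linear in the number of genes. With duplicate genes the correspondence between extremities is no longer unique, which is where the hardness discussed in the abstract comes from.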
Enabling responsible public genomics.
Conley, John M; Doerr, Adam K; Vorhaus, Daniel B
2010-01-01
As scientific understandings of genetics advance, researchers require increasingly rich datasets that combine genomic data from large numbers of individuals with medical and other personal information. Linking individuals' genetic data and personal information precludes anonymity and produces medically significant information--a result not contemplated by the established legal and ethical conventions governing human genomic research. To pursue the next generation of human genomic research and commerce in a responsible fashion, scientists, lawyers, and regulators must address substantial new issues, including researchers' duties with respect to clinically significant data, the challenges to privacy presented by genomic data, the boundary between genomic research and commerce, and the practice of medicine. This Article presents a new model for understanding and addressing these new challenges--a "public genomics" premised on the idea that ethically, legally, and socially responsible genomics research requires openness, not privacy, as its organizing principle. Responsible public genomics combines the data contributed by informed and fully consenting information altruists and the research potential of rich datasets in a genomic commons that is freely and globally available. This Article examines the risks and benefits of this public genomics model in the context of an ambitious genetic research project currently under way--the Personal Genome Project. This Article also (i) demonstrates that large-scale genomic projects are desirable, (ii) evaluates the risks and challenges presented by public genomics research, and (iii) determines that the current legal and regulatory regimes restrict beneficial and responsible scientific inquiry while failing to adequately protect participants. The Article concludes by proposing a modified normative and legal framework that embraces and enables a future of responsible public genomics.
Holden, Brian J; Pinney, John W; Lovell, Simon C; Amoutzias, Grigoris D; Robertson, David L
2007-01-01
Background Alternative representations of biochemical networks emphasise different aspects of the data and contribute to the understanding of complex biological systems. In this study we present a variety of automated methods for visualisation of a protein-protein interaction network, using the basic helix-loop-helix (bHLH) family of transcription factors as an example. Results Network representations that arrange nodes (proteins) according to either continuous or discrete information are investigated, revealing the existence of protein sub-families and the retention of interactions following gene duplication events. Methods of network visualisation in conjunction with a phylogenetic tree are presented, highlighting the evolutionary relationships between proteins, and clarifying the context of network hubs and interaction clusters. Finally, an optimisation technique is used to create a three-dimensional layout of the phylogenetic tree upon which the protein-protein interactions may be projected. Conclusion We show that by incorporating secondary genomic, functional or phylogenetic information into network visualisation, it is possible to move beyond simple layout algorithms based on network topology towards more biologically meaningful representations. These new visualisations can give structure to complex networks and will greatly help in interpreting their evolutionary origins and functional implications. Three open source software packages (InterView, TVi and OptiMage) implementing our methods are available. PMID:17683601
Metadata to Describe Genomic Information.
Delgado, Jaime; Naro, Daniel; Llorente, Silvia; Gelpí, Josep Lluís; Royo, Romina
2018-01-01
Interoperable metadata is key for the management of genomic information. We propose a flexible approach that we contribute to the standardization by ISO/IEC of a new format for efficient and secure compressed storage and transmission of genomic information.
An integrated clinical and genomic information system for cancer precision medicine.
Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk
2018-04-20
Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.
Information-optimal genome assembly via sparse read-overlap graphs.
Shomorony, Ilan; Kim, Samuel H; Courtade, Thomas A; Tse, David N C
2016-09-01
In the context of third-generation long-read sequencing technologies, read-overlap-based approaches are expected to play a central role in the assembly step. A fundamental challenge in assembling from a read-overlap graph is that the true sequence corresponds to a Hamiltonian path on the graph, and, under most formulations, the assembly problem becomes NP-hard, restricting practical approaches to heuristics. In this work, we avoid this seemingly fundamental barrier by first setting the computational complexity issue aside, and seeking an algorithm that targets information limits. In particular, we consider a basic feasibility question: when does the set of reads contain enough information to allow unambiguous reconstruction of the true sequence? Based on insights from this information feasibility question, we present an algorithm, the Not-So-Greedy algorithm, to construct a sparse read-overlap graph. Unlike most other assembly algorithms, Not-So-Greedy comes with a performance guarantee: whenever information feasibility conditions are satisfied, the algorithm reduces the assembly problem to an Eulerian path problem on the resulting graph, which can thus be solved in linear time. In practice, this theoretical guarantee translates into assemblies of higher quality. Evaluations on both simulated reads from real genomes and a PacBio Escherichia coli K12 dataset demonstrate that Not-So-Greedy compares favorably with standard string graph approaches in terms of accuracy of the resulting read-overlap graph and contig N50. Available at github.com/samhykim/nsg. Contact: courtade@eecs.berkeley.edu or dntse@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
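Contig N50, one of the evaluation metrics named above, is easy to compute from a list of contig lengths. The sketch below uses the common definition (the largest length L such that contigs of length at least L cover at least half of the total assembly); the function name `n50` is a choice made here and is not taken from the Not-So-Greedy software.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    contig taken is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 requires at least one contig")


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why the abstract reports it alongside the accuracy of the read-overlap graph.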
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren
2011-01-01
Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php PMID:21558151
Altering Genomic Integrity: Heavy Metal Exposure Promotes Transposable Element-Mediated Damage.
Morales, Maria E; Servant, Geraldine; Ade, Catherine; Roy-Engel, Astrid M
2015-07-01
Maintenance of genomic integrity is critical for cellular homeostasis and survival. The active transposable elements (TEs), composed primarily of three mobile element lineages (LINE-1, Alu, and SVA), comprise approximately 30% of the mass of the human genome. For the past 2 decades, studies have shown that TEs significantly contribute to genetic instability and that TE-caused damage is associated with genetic diseases and cancer. Different environmental exposures, including several heavy metals, influence how TEs interact with their host genome, increasing their negative impact. This mini-review provides some basic knowledge on TEs, their contribution to disease, and an overview of the current knowledge on how heavy metals influence TE-mediated damage.
Milan, David J; Lubitz, Steven A; Kääb, Stefan; Ellinor, Patrick T
2010-08-01
Genome-wide association studies have been increasingly used to study the genetics of complex human diseases. Within the field of cardiac electrophysiology, this technique has been applied to conditions such as atrial fibrillation, and several electrocardiographic parameters including the QT interval. While these studies have identified multiple genomic regions associated with each trait, questions remain, including the best way to explore the pathophysiology of each association and the potential for clinical utility. This review will summarize recent genome-wide association study results within cardiac electrophysiology and discuss their broader implications in basic science and clinical medicine. Copyright 2010 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.
Sterol Synthesis in Diverse Bacteria.
Wei, Jeremy H; Yin, Xinchi; Welander, Paula V
2016-01-01
Sterols are essential components of eukaryotic cells whose biosynthesis and function have been studied extensively. Sterols are also recognized as the diagenetic precursors of steranes preserved in sedimentary rocks where they can function as geological proxies for eukaryotic organisms and/or aerobic metabolisms and environments. However, production of these lipids is not restricted to the eukaryotic domain as a few bacterial species also synthesize sterols. Phylogenomic studies have identified genes encoding homologs of sterol biosynthesis proteins in the genomes of several additional species, indicating that sterol production may be more widespread in the bacterial domain than previously thought. Although the occurrence of sterol synthesis genes in a genome indicates the potential for sterol production, it provides neither conclusive evidence of sterol synthesis nor information about the composition and abundance of basic and modified sterols that are actually being produced. Here, we coupled bioinformatics with lipid analyses to investigate the scope of bacterial sterol production. We identified oxidosqualene cyclase (Osc), which catalyzes the initial cyclization of oxidosqualene to the basic sterol structure, in 34 bacterial genomes from five phyla (Bacteroidetes, Cyanobacteria, Planctomycetes, Proteobacteria, and Verrucomicrobia) and in 176 metagenomes. Our data indicate that bacterial sterol synthesis likely occurs in diverse organisms and environments and also provide evidence that there are as yet uncultured groups of bacterial sterol producers. Phylogenetic analysis of bacterial and eukaryotic Osc sequences confirmed a complex evolutionary history of sterol synthesis in this domain. Finally, we characterized the lipids produced by Osc-containing bacteria and found that we could generally predict the ability to synthesize sterols. However, predicting the final modified sterol based on our current knowledge of sterol synthesis was difficult.
Some bacteria produced demethylated and saturated sterol products even though they lacked homologs of the eukaryotic proteins required for these modifications emphasizing that several aspects of bacterial sterol synthesis are still completely unknown.
Hartzler, Andrea; McCarty, Catherine A.; Rasmussen, Luke V.; Williams, Marc S.; Brilliant, Murray; Bowton, Erica A.; Clayton, Ellen Wright; Faucett, William A.; Ferryman, Kadija; Field, Julie R.; Fullerton, Stephanie M.; Horowitz, Carol R.; Koenig, Barbara A.; McCormick, Jennifer B.; Ralston, James D.; Sanderson, Saskia C.; Smith, Maureen E.; Trinidad, Susan Brown
2014-01-01
Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine. PMID:24030437
Scanning the human genome at kilobase resolution.
Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming
2008-05-01
Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of frequently cutting restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from the two ends of a given DNA fragment to form a ditag that represents the fragment; (3) application of the 454 sequencing system to obtain a comprehensive collection of ditag sequences; (4) determination of the genomic origin of ditags by mapping them to reference ditags derived from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assemblies, to compare genome similarity and variation in normal populations, and to identify genomic abnormalities, including insertion, inversion, deletion, translocation, and amplification, in pathological genomes such as cancer genomes.
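The in silico part of the ditag idea (steps 1, 2 and 4 above) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequence, the tag length and the choice to place the cut at the start of each recognition site are all assumptions made for illustration.

```python
def digest(genome: str, site: str) -> list[str]:
    """Split a sequence at every occurrence of a recognition site.

    Assumption: the cut is placed immediately before the site, so each
    fragment (except possibly the first) starts with the site itself.
    """
    fragments = []
    start = 0
    pos = genome.find(site, 1)
    while pos != -1:
        fragments.append(genome[start:pos])
        start = pos
        pos = genome.find(site, pos + 1)
    fragments.append(genome[start:])
    return fragments


def ditag(fragment: str, tag_len: int):
    """Join the first and last tag_len bases of a fragment into one ditag.

    Fragments too short to yield two non-overlapping tags return None.
    """
    if len(fragment) < 2 * tag_len:
        return None
    return fragment[:tag_len] + fragment[-tag_len:]


def reference_ditags(genome: str, site: str, tag_len: int) -> dict:
    """Map each reference ditag to the (start, end) coordinates of its fragment(s)."""
    index = {}
    offset = 0
    for fragment in digest(genome, site):
        tag = ditag(fragment, tag_len)
        if tag is not None:
            index.setdefault(tag, []).append((offset, offset + len(fragment)))
        offset += len(fragment)
    return index


# Toy example (hypothetical sequence, 4-base site, 2-base tags).
toy_genome = "AAGATCTTTTGATCCCCCGATCGG"
toy_index = reference_ditags(toy_genome, "GATC", 2)
```

An experimentally collected ditag would then be looked up in the index to recover the genomic origin of its fragment; a ditag that is absent from the index, or that maps to several coordinates, would need further inspection.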
Linking the potato genome to the conserved ortholog set (COS) markers
2013-01-01
Background Conserved ortholog set (COS) markers are an important functional genomics resource that has greatly improved orthology detection in Asterid species. A comprehensive list of these markers is available at Sol Genomics Network (http://solgenomics.net/) and many of these have been placed on the genetic maps of a number of solanaceous species. Results We amplified over 300 COS markers from eight potato accessions involving two diploid landraces of Solanum tuberosum Andigenum group (formerly classified as S. goniocalyx, S. phureja), and a dihaploid clone derived from a modern tetraploid cultivar of S. tuberosum and the wild species S. berthaultii, S. chomatophilum, and S. paucissectum. By BLASTn (Basic Local Alignment Search Tool of the NCBI, National Center for Biotechnology Information) algorithm we mapped the DNA sequences of these markers into the potato genome sequence. Additionally, we mapped a subset of these markers genetically in potato and present a comparison between the physical and genetic locations of these markers in potato and in comparison with the genetic location in tomato. We found that most of the COS markers are single-copy in the reference genome of potato and that the genetic location in tomato and physical location in potato sequence are mostly in agreement. However, we did find some COS markers that are present in multiple copies and those that map in unexpected locations. Sequence comparisons between species show that some of these markers may be paralogs. Conclusions The sequence-based physical map becomes helpful in identification of markers for traits of interest thereby reducing the number of markers to be tested for applications like marker assisted selection, diversity, and phylogenetic studies. PMID:23758607
Li, Ting; Liu, Bo; Chen, Chih Ying; Yang, Bing
2016-05-20
Over the last decades, considerable effort has gone into advancing genome editing technology because of its promising role in both basic and synthetic biology. A breakthrough came in recent years with the advent of sequence-specific endonucleases, especially zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)-guided nucleases (e.g., Cas9). In higher eukaryotic organisms, site-directed mutagenesis can usually be achieved through non-homologous end-joining (NHEJ) repair of the DNA double-strand breaks (DSBs) caused by the exogenously applied nucleases. However, site-specific gene replacement, or genuine genome editing, through homologous recombination (HR) repair of DSBs remains a challenge. As a proof of concept for gene replacement through TALEN-based HR in rice (Oryza sativa), we successfully produced double point mutations in the rice acetolactate synthase gene (OsALS) and generated herbicide-resistant rice lines by using TALENs and donor DNA carrying the desired mutations. After ballistic delivery of the TALEN construct and donor DNA into rice calli, nine HR events with different genotypes of OsALS were obtained in the T0 generation at an efficiency of 1.4%-6.3% from three experiments. The HR-mediated gene edits were heritable to the T1 progeny. The edited T1 plants were as morphologically normal as the control plants while displaying strong herbicide resistance. The results demonstrate the feasibility of TALEN-mediated genome editing in rice and provide useful information for further genome editing by other nuclease-based genome editing platforms. Copyright © 2016 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Next Generation Sequence Analysis and Computational Genomics Using Graphical Pipeline Workflows
Torri, Federica; Dinov, Ivo D.; Zamanyan, Alen; Hobel, Sam; Genco, Alex; Petrosyan, Petros; Clark, Andrew P.; Liu, Zhizhong; Eggert, Paul; Pierce, Jonathan; Knowles, James A.; Ames, Joseph; Kesselman, Carl; Toga, Arthur W.; Potkin, Steven G.; Vawter, Marquis P.; Macciardi, Fabio
2012-01-01
Whole-genome and exome sequencing have already proven to be essential and powerful methods to identify genes responsible for simple Mendelian inherited disorders. These methods can be applied to complex disorders as well, and have been adopted as one of the current mainstream approaches in population genetics. These achievements have been made possible by next generation sequencing (NGS) technologies, which require substantial bioinformatics resources to analyze the dense and complex sequence data. The huge analytical burden of data from genome sequencing might be seen as a bottleneck slowing the publication of NGS papers at this time, especially in psychiatric genetics. We review the existing methods for processing NGS data, to place into context the rationale for the design of a computational resource. We describe our method, the Graphical Pipeline for Computational Genomics (GPCG), to perform the computational steps required to analyze NGS data. The GPCG implements flexible workflows for basic sequence alignment, sequence data quality control, single nucleotide polymorphism analysis, copy number variant identification, annotation, and visualization of results. These workflows cover all the analytical steps required for NGS data, from processing the raw reads to variant calling and annotation. The current version of the pipeline is freely available at http://pipeline.loni.ucla.edu. These applications of NGS analysis may gain clinical utility in the near future (e.g., identifying miRNA signatures in diseases) when the bioinformatics approach is made feasible. Taken together, the annotation tools and strategies that have been developed to retrieve information and test hypotheses about the functional role of variants present in the human genome will help to pinpoint the genetic risk factors for psychiatric disorders. PMID:23139896
2011-01-01
Background Green plant leaves have always fascinated biologists as hosts for photosynthesis and providers of basic energy to many food webs. Today, comprehensive databases of gene expression data enable us to apply increasingly advanced computational methods for reverse-engineering the regulatory network of leaves, and to begin to understand the gene interactions underlying complex emergent properties related to stress-response and development. These new systems biology methods are now also being applied to organisms such as Populus, a woody perennial tree, in order to understand the specific characteristics of these species. Results We present a systems biology model of the regulatory network of Populus leaves. The network is reverse-engineered from promoter information and expression profiles of leaf-specific genes measured over a large set of conditions related to stress and development. The network model incorporates interactions between regulators, such as synergistic and competitive relationships, by evaluating increasingly complex regulatory mechanisms, and is therefore able to identify new regulators of leaf development not found by traditional genomics methods based on pair-wise expression similarity. The approach is shown to explain available gene function information and to provide robust prediction of expression levels in new data. We also use the predictive capability of the model to identify condition-specific regulation as well as conserved regulation between Populus and Arabidopsis. Conclusions We outline a computationally inferred model of the regulatory network of Populus leaves, and show how treating genes as interacting, rather than individual, entities identifies new regulators compared to traditional genomics analysis. Although systems biology models should be used with care considering the complexity of regulatory programs and the limitations of current genomics data, methods describing interactions can provide hypotheses about the underlying cause of emergent properties and are needed if we are to identify target genes other than those constituting the "low hanging fruit" of genomic analysis. PMID:21232107
MIPS: analysis and annotation of proteins from whole genomes
Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.
2004-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
Genome-Wide Analysis of bZIP-Encoding Genes in Maize
Wei, Kaifa; Chen, Juan; Wang, Yanmei; Chen, Yanhui; Chen, Shaoxiang; Lin, Yina; Pan, Si; Zhong, Xiaojun; Xie, Daoxin
2012-01-01
In plants, basic leucine zipper (bZIP) proteins regulate numerous biological processes such as seed maturation, flower and vascular development, stress signalling and pathogen defence. We have carried out a genome-wide identification and analysis of the 125 bZIP genes in the maize genome, which encode 170 distinct bZIP proteins. This family can be divided into 11 groups according to the phylogenetic relationships among the maize bZIP proteins and those in Arabidopsis and rice. Six kinds of intron patterns (a–f) within the basic and hinge regions are defined. Additional conserved motifs have been identified and show group specificity. Detailed three-dimensional structure analysis was carried out to display the sequence conservation and potential distribution of the bZIP domain. Further, we predict the DNA-binding pattern and the dimerization property on the basis of the characteristic features in the basic and hinge regions and the leucine zipper, respectively, which strongly supports our classification and helps to define 26 distinct subfamilies. The chromosome distribution and the genetic analysis reveal that 58 ZmbZIP genes are located in segmentally duplicated regions of the maize genome, suggesting that segmental chromosomal duplications contributed greatly to the expansion of the maize bZIP family. Across 60 different developmental stages of 11 organs, three apparent clusters emerge, representing three distinct expression patterns among the ZmbZIP gene family in maize development. Similar but slightly different expression patterns of bZIPs in two inbred lines indicate that 22 detected ZmbZIP genes might be involved in drought stress. Based on expression profiles and Pearson's correlation coefficient analysis, 13 pairs and 143 pairs of ZmbZIP genes show strongly negative and positive correlations, respectively, in the four distinct fungal infections. PMID:23103471
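The pairwise correlation screen mentioned at the end of this abstract can be sketched in a few lines of Python. This is an illustrative sketch only: the gene names, expression values and the 0.9 cutoff are hypothetical, and the abstract does not state which threshold or software the authors used.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(profiles, threshold):
    """Split gene pairs into strongly positive and strongly negative sets.

    profiles maps a gene name to its expression values across conditions;
    threshold is an assumed cutoff on |r|.
    """
    positive, negative = [], []
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r >= threshold:
            positive.append((gene_a, gene_b))
        elif r <= -threshold:
            negative.append((gene_a, gene_b))
    return positive, negative


# Hypothetical expression profiles across four conditions.
example_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
example_positive, example_negative = correlated_pairs(example_profiles, 0.9)
```

With only four conditions per profile, as in a comparison across four infections, correlation estimates are noisy, so any cutoff of this kind should be treated as a screening heuristic.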
[Genetic and epigenetic news in gerontology].
Baranov, V S; Glotov, O S; Baranova, E V
2014-01-01
This overview presents the most conspicuous recent findings in aging studies. It covers new data from genome-wide association studies (GWAS) in a large cohort of centenarians, a recently found mutation that protects against Alzheimer disease, the discovery of the hypothalamus as a command center of human aging, important data on the negative effect of common antioxidants in the treatment of lung cancer, and new data on the anti-aging and anticancer effects of common drugs such as rapamycin and metformin. A substantial part of the review is devoted to the epigenetic problems of senescence and the possible impact of basic epigenetic mechanisms (methylation of DNA and histone proteins, DNA heterochromatization) on the regulation of gene expression, long-term genome reprogramming during early childhood, and transgenerational transmission of epigenetic traits. The review stresses the need to move from molecular studies of the dormant human genome (the anatomy of the human genome) to the genome in action (the dynamic genome), with special emphasis on epigenetic medicine.
Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements
Liu, Pengfei; Erez, Ayelet; Sreenath Nagamani, Sandesh C.; Dhar, Shweta U.; Kołodziejska, Katarzyna E.; Dharmadhikari, Avinash V.; Cooper, M. Lance; Wiszniewska, Joanna; Zhang, Feng; Withers, Marjorie A.; Bacino, Carlos A.; Campos-Acevedo, Luis Daniel; Delgado, Mauricio R.; Freedenberg, Debra; Garnica, Adolfo; Grebe, Theresa A.; Hernández-Almaguer, Dolores; Immken, LaDonna; Lalani, Seema R.; McLean, Scott D.; Northrup, Hope; Scaglia, Fernando; Strathearn, Lane; Trapane, Pamela; Kang, Sung-Hae L.; Patel, Ankita; Cheung, Sau Wai; Hastings, P. J.; Stankiewicz, Paweł; Lupski, James R.; Bi, Weimin
2011-01-01
SUMMARY Complex genomic rearrangements (CGR) consisting of two or more breakpoint junctions have been observed in genomic disorders. Recently, a chromosome catastrophe phenomenon termed chromothripsis, in which numerous genomic rearrangements are apparently acquired in one single catastrophic event, was described in multiple cancers. Here we show that constitutionally acquired CGRs share similarities with cancer chromothripsis. In the 17 CGR cases investigated we observed localization and multiple copy number changes including deletions, duplications and/or triplications, as well as extensive translocations and inversions. Genomic rearrangements involved varied in size and complexities; in one case, array comparative genomic hybridization revealed 18 copy number changes. Breakpoint sequencing identified characteristic features, including small templated insertions at breakpoints and microhomology at breakpoint junctions, which have been attributed to replicative processes. The resemblance between CGR and chromothripsis suggests similar mechanistic underpinnings. Such chromosome catastrophic events appear to reflect basic DNA metabolism operative throughout an organism’s life cycle. PMID:21925314
Comparative analysis of chloroplast genomes of the genus Citrus and its close relatives.
Liu, Xiaogang; Wu, Hongkun; Luo, Yan; Xi, Wanpeng; Zhou, Zhiqin
2017-01-01
The genus Citrus and its close relatives are economically and nutritionally important fruit trees. However, the huge controversy over the phylogeny of key wild species, as well as the genetic relationship between the cultivated species and their putative wild progenitors, remains unresolved. Comparative analyses of chloroplast (cp) genomes have been useful in resolving various phylogenetic issues. Thus far, the cp genomes of only two Citrus species have been sequenced. In this study, we sequenced six complete cp genomes, four belonging to the genus Citrus, and two belonging to the genera Fortunella and Poncirus, respectively. These newly sequenced genomes together with the two publicly available were used for comparative analyses of the genus Citrus and its close relatives. All eight cp genomes share similar basic structure, gene order and gene content. Phylogenetic analyses supported the monophyly of the three genera in the order Sapindales within the major clade Malvidae.
CyanoBase: the cyanobacteria genome database update 2010
Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu
2010-01-01
CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly. PMID:19880388
Jordan, Daniel M; Do, Ron
2018-04-11
While sequence-based genetic tests have long been available for specific loci, especially for Mendelian disease, the rapidly falling costs of genome-wide genotyping arrays, whole-exome sequencing, and whole-genome sequencing are moving us toward a future where full genomic information might inform the prognosis and treatment of a variety of diseases, including complex disease. Similarly, the availability of large populations with full genomic information has enabled new insights about the etiology and genetic architecture of complex disease. Insights from the latest generation of genomic studies suggest that our categorization of diseases as complex may conceal a wide spectrum of genetic architectures and causal mechanisms that ranges from Mendelian forms of complex disease to complex regulatory structures underlying Mendelian disease. Here, we review these insights, along with advances in the prediction of disease risk and outcomes from full genomic information. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 19 is August 31, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Povey, Sue; Al Aqeel, Aida I; Cambon-Thomsen, Anne; Dalgleish, Raymond; den Dunnen, Johan T; Firth, Helen V; Greenblatt, Marc S; Barash, Carol Isaacson; Parker, Michael; Patrinos, George P; Savige, Judith; Sobrido, Maria-Jesus; Winship, Ingrid; Cotton, Richard GH
2010-01-01
More than 1,000 Web-based locus-specific variation databases (LSDBs) are listed on the Website of the Human Genetic Variation Society (HGVS). These individual efforts, which often relate phenotype to genotype, are a valuable source of information for clinicians, patients, and their families, as well as for basic research. The initiators of the Human Variome Project recently recognized that having access to some of the immense resources of unpublished information already present in diagnostic laboratories would provide critical data to help manage genetic disorders. However, there are significant ethical issues involved in sharing these data worldwide. An international working group presents second-generation guidelines addressing ethical issues relating to the curation of human LSDBs that provide information via a Web-based interface. It is intended that these should help current and future curators and may also inform the future decisions of ethics committees and legislators. These guidelines have been reviewed by the Ethics Committee of the Human Genome Organization (HUGO). Hum Mutat 31:–6, 2010. © 2010 Wiley-Liss, Inc. PMID:20683926
Genetic, genomic, and molecular tools for studying the protoploid yeast, L. waltii
Di Rienzi, Sara C.; Lindstrom, Kimberly C.; Lancaster, Ragina; Rolczynski, Lisa; Raghuraman, M. K.; Brewer, Bonita J.
2011-01-01
Sequencing of the yeast Kluyveromyces waltii (recently renamed Lachancea waltii) provided evidence of a whole genome duplication event in the lineage leading to the well-studied Saccharomyces cerevisiae. While comparative genomic analyses of these yeasts have proven to be extremely instructive in modeling the loss or maintenance of gene duplicates, experimental tests of the ramifications following such genome alterations remain difficult. To transform L. waltii from an organism of the computational comparative genomic literature into an organism of the functional comparative genomic literature, we have developed genetic, molecular and genomic tools for working with L. waltii. In particular, we have characterized basic properties of L. waltii (growth, ploidy, molecular karyotype, mating type and the sexual cycle), developed transformation, cell cycle arrest and synchronization protocols, and have created centromeric and non-centromeric vectors as well as a genome browser for L. waltii. We hope that these tools will be used by the community to follow up on the ideas generated by sequence data and lead to a greater understanding of eukaryotic biology and genome evolution. PMID:21246627
The UCSC Genome Browser database: extensions and updates 2013.
Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Kuhn, Robert M; Wong, Matthew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Raney, Brian J; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Lee, Brian T; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Dreszer, Timothy R; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James
2013-01-01
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
Wu, Jing; Chen, Jibao; Wang, Lanfen; Wang, Shumin
2017-01-01
WRKY transcription factors play a key role in drought stress. However, the characteristics of the WRKY gene family in the common bean (Phaseolus vulgaris L.) are unknown. In this study, we identified 88 complete WRKY proteins from the draft genome sequence of the "G19833" common bean. The predicted genes were non-randomly distributed across all chromosomes. Basic information, amino acid motifs, phylogenetic tree and the expression patterns of PvWRKY genes were analyzed, and the proteins were classified into groups 1, 2, and 3. Group 2 was further divided into five subgroups: 2a, 2b, 2c, 2d, and 2e. Finally, we detected 19 WRKY genes that were responsive to drought stress using qRT-PCR; 11 were down-regulated, and 8 were up-regulated under drought stress. This study comprehensively examines WRKY proteins in the common bean, a model food legume, and it provides a foundation for the functional characterization of the WRKY family and opportunities for understanding the mechanisms of drought stress tolerance in this plant. PMID:28386267
Spitsbergen, Jan M.; Kent, Michael L.
2007-01-01
The zebrafish (Danio rerio) is now the pre-eminent vertebrate model system for clarification of the roles of specific genes and signaling pathways in development. The zebrafish genome will be completely sequenced within the next 1–2 years. Together with the substantial historical database regarding basic developmental biology, toxicology, and gene transfer, the rich foundation of molecular genetic and genomic data makes zebrafish a powerful model system for clarifying mechanisms in toxicity. In contrast to the highly advanced knowledge base on molecular developmental genetics in zebrafish, our database regarding infectious and noninfectious diseases and pathologic lesions in zebrafish lags far behind the information available on most other domestic mammalian and avian species, particularly rodents. Currently, minimal data are available regarding spontaneous neoplasm rates or spontaneous aging lesions in any of the commonly used wild-type or mutant lines of zebrafish. Therefore, to fully utilize the potential of zebrafish as an animal model for understanding human development, disease, and toxicology we must greatly advance our knowledge on zebrafish diseases and pathology. PMID:12597434
Walking through the statistical black boxes of plant breeding.
Xavier, Alencar; Muir, William M; Craig, Bruce; Rainey, Katy Martin
2016-10-01
The main statistical procedures in plant breeding are based on Gaussian process and can be computed through mixed linear models. Intelligent decision making relies on our ability to extract useful information from data to help us achieve our goals more efficiently. Many plant breeders and geneticists perform statistical analyses without understanding the underlying assumptions of the methods or their strengths and pitfalls. In other words, they treat these statistical methods (software and programs) like black boxes. Black boxes represent complex pieces of machinery with contents that are not fully understood by the user. The user sees the inputs and outputs without knowing how the outputs are generated. By providing a general background on statistical methodologies, this review aims (1) to introduce basic concepts of machine learning and its applications to plant breeding; (2) to link classical selection theory to current statistical approaches; (3) to show how to solve mixed models and extend their application to pedigree-based and genomic-based prediction; and (4) to clarify how the algorithms of genome-wide association studies work, including their assumptions and limitations.
Genome elimination: translating basic research into a future tool for plant breeding.
Comai, Luca
2014-06-01
During the course of our history, humankind has been through different periods of agricultural improvement aimed at enhancing our food supply and the performance of food crops. In recent years, it has become apparent that future crop improvement efforts will require new approaches to address the local challenges of farmers while empowering discovery across industry and academia. New plant breeding approaches are needed to meet this challenge to help feed a growing world population. Here I discuss how a basic research discovery is being translated into a potential future tool for plant breeding, and share the story of researcher Simon Chan, who recognized the potential application of this new approach--genome elimination--for the breeding of staple food crops in Africa and South America.
Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok
2005-01-01
Background Theoretical proteome analysis, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, shows a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. Results The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. Conclusion The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome. PMID:16150155
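As a rough illustration of how a theoretical pI of the kind described in the abstract above can be obtained from sequence alone, the following minimal sketch sums Henderson-Hasselbalch charges over the ionizable groups and bisects for the pH at which the net charge is zero. It is not the authors' method: the pKa table is one illustrative set (published tables differ slightly), and the function names net_charge and theoretical_pi are hypothetical.

```python
# Illustrative pKa values (an assumption; different published tables give
# slightly different numbers, which shifts the computed pI).
PKA_POSITIVE = {"nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = ["nterm"] + [aa for aa in seq if aa in PKA_POSITIVE]
    negative = ["cterm"] + [aa for aa in seq if aa in PKA_NEGATIVE]
    # Fraction of each basic group that is protonated (+1 charge).
    charge = sum(1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[g])) for g in positive)
    # Fraction of each acidic group that is deprotonated (-1 charge).
    charge -= sum(1.0 / (1.0 + 10 ** (PKA_NEGATIVE[g] - ph)) for g in negative)
    return charge


def theoretical_pi(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect for the pH where the net charge crosses zero."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for peptide in ("KKKKAG", "DDDDAG"):
        print(peptide, round(theoretical_pi(peptide), 2))
```

Because only a handful of residue types carry charge, sequences rich in lysine or arginine land in the basic region and those rich in aspartate or glutamate in the acidic region, which is consistent with the abstract's statement that the multimodal shape follows from the allowed combinations of charged amino acids.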
The biology of cancer: what do oncology nurses really need to know.
Eggert, Julie
2011-02-01
To describe the impact of genetics and genomics on the biology of cancer and the implications for patient care. Pubmed; CINAHL. Cancer research in genetics/genomics has identified new mechanisms influencing personalized risk assessment/management, early detection, cancer treatment, and long-term screening/surveillance. Understanding the basics of genetics/genomics on the biology of cancer will facilitate patient education and care delivery, including the administration and monitoring of genetically targeted therapies whose toxicities may in part be mediated by the molecular pathways targeted by the specific agent. Copyright © 2011 Elsevier Inc. All rights reserved.
CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes.
Komor, Alexis C; Badran, Ahmed H; Liu, David R
2017-01-12
The CRISPR-Cas9 RNA-guided DNA endonuclease has contributed to an explosion of advances in the life sciences that have grown from the ability to edit genomes within living cells. In this Review, we summarize CRISPR-based technologies that enable mammalian genome editing and their various applications. We describe recent developments that extend the generality, DNA specificity, product selectivity, and fundamental capabilities of natural CRISPR systems, and we highlight some of the remarkable advancements in basic research, biotechnology, and therapeutics science that these developments have facilitated. Copyright © 2017 Elsevier Inc. All rights reserved.
Genetics and Genomics of Endometriosis
Hansen, Keith A.; Eyster, Kathleen M.
2015-01-01
Endometriosis is a common cause of morbidity in women with an unknown etiology. Studies have demonstrated the familial nature of endometriosis and suggest that inheritance occurs in a polygenic/multifactorial fashion. Studies have attempted to define the gene or genes responsible for endometriosis through association or linkage studies with candidate genes or DNA mapping technology. A number of genomics studies have demonstrated significant alterations in gene expression in endometriosis. A more thorough understanding of the genetics and genomics of endometriosis will facilitate understanding the basic biology of the disease and open new inroads to diagnosis and treatment of this enigmatic condition. PMID:20436317
Ethical, legal, and social issues in the translation of genomics into health care.
Badzek, Laurie; Henaghan, Mark; Turner, Martha; Monsen, Rita
2013-03-01
The rapid continuous feed of new information from scientific discoveries related to the human genome makes translation and incorporation of information into the clinical setting difficult and creates ethical, legal, and social challenges for providers. This article overviews some of the legal and ethical foundations that guide our response to current complex issues in health care associated with the impact of scientific discoveries related to the human genome. Overlapping ethical, legal, and social implications impact nurses and other healthcare professionals as they seek to identify and translate into practice important information related to new genomic scientific knowledge. Ethical and legal foundations such as professional codes, human dignity, and human rights provide the framework for understanding highly complex genomic issues. Ethical, legal, and social concerns of the health provider in the translation of genomic knowledge into practice including minimizing harms, maximizing benefits, transparency, confidentiality, and informed consent are described. Additionally, nursing professional competencies related to ethical, legal, and social issues in the translation of genomics into health care are discussed. Ethical, legal, and social considerations in new genomic discovery necessitate that healthcare professionals have knowledge and competence to respond to complex genomic issues and provide appropriate information and care to patients, families, and communities. Understanding the ethical, legal, and social issues in the translation of genomic information into practice is essential to provide patients, families, and communities with competent, safe, effective health care. © 2013 Sigma Theta Tau International.
Kaphingst, Kimberly A; Stafford, Jewel D; McGowan, Lucy D'Agostino; Seo, Joann; Lachance, Christina R; Goodman, Melody S
2015-02-01
Few studies have examined how individuals respond to genomic risk information for common, chronic diseases. This randomized study examined differences in responses by type of genomic information (genetic test/family history) and disease condition (diabetes/heart disease), and by race/ethnicity in a medically underserved population. 1,057 English-speaking adults completed a survey containing 1 of 4 vignettes (2-by-2 randomized design). Differences in dependent variables (i.e., interest in receiving genomic assessment, discussing with doctor or family, changing health habits) by experimental condition and race/ethnicity were examined using chi-squared tests and multivariable regression analysis. No significant differences were found in dependent variables by type of genomic information or disease condition. In multivariable models, Hispanics were more interested in receiving a genomic assessment than Whites (OR = 1.93; p < .0001); respondents with marginal (OR = 1.54; p = .005) or limited (OR = 1.85; p = .009) health literacy had greater interest than those with adequate health literacy. Blacks (OR = 1.78; p = .001) and Hispanics (OR = 1.85; p = .001) had greater interest in discussing information with family than Whites. Non-Hispanic Blacks (OR = 1.45; p = .04) had greater interest in discussing genomic information with a doctor than Whites. Blacks (β = -0.41; p < .001) and Hispanics (β = -0.25; p = .033) intended to change fewer health habits than Whites; health literacy was negatively associated with number of health habits participants intended to change. Findings suggest that race/ethnicity may affect responses to genomic risk information. Additional research could examine how cognitive representations of this information differ across racial/ethnic groups. Health literacy is also critical to consider in developing approaches to communicating genomic information.
Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers
Wajid, Bilal; Serpedin, Erchin
2012-01-01
In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also briefly discuss the direction in which the area is headed, along with core issues of software simplicity. PMID:22768980
Harnessing Whole Genome Sequencing in Medical Mycology.
Cuomo, Christina A
2017-01-01
Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.
[Ethical considerations in genomic cohort study].
Choi, Eun Kyung; Kim, Ock-Joo
2007-03-01
During the last decade, genomic cohort study has been developed in many countries by linking health data and genetic data in stored samples. Genomic cohort study is expected to find key genetic components that contribute to common diseases, thereby promising great advance in genome medicine. While many countries endeavor to build biobank systems, biobank-based genome research has raised important ethical concerns including genetic privacy, confidentiality, discrimination, and informed consent. Informed consent for biobank poses an important question: whether true informed consent is possible in population-based genomic cohort research where the nature of future studies is unforeseeable when consent is obtained. Due to the sensitive character of genetic information, protecting privacy and keeping confidentiality become important topics. To minimize ethical problems and achieve scientific goals to its maximum degree, each country strives to build population-based genomic cohort research project, by organizing public consultation, trying public and expert consensus in research, and providing safeguards to protect privacy and confidentiality.
The business value and cost-effectiveness of genomic medicine.
Crawford, James M; Aspinall, Mara G
2012-05-01
Genomic medicine offers the promise of more effective diagnosis and treatment of human diseases. Genome sequencing early in the course of disease may enable more timely and informed intervention, with reduced healthcare costs and improved long-term outcomes. However, genomic medicine strains current models for demonstrating value, challenging efforts to achieve fair payment for services delivered, both for laboratory diagnostics and for use of molecular information in clinical management. Current models of healthcare reform stipulate that care must be delivered at equal or lower cost, with better patient and population outcomes. To achieve demonstrated value, genomic medicine must overcome many uncertainties: the clinical relevance of genomic variation; potential variation in technical performance and/or computational analysis; management of massive information sets; and must have available clinical interventions that can be informed by genomic analysis, so as to attain more favorable cost management of healthcare delivery and demonstrate improvements in cost-effectiveness.
Can all heritable biology really be reduced to a single dimension?
Babbitt, Gregory A; Coppola, Erin E; Alawad, Mohammed A; Hudson, André O
2016-03-10
A long-held presupposition in the field of bioinformatics holds that genetic, and now even epigenetic 'information' can be abstracted from the physicochemical details of the macromolecular polymers in which it resides. It is perhaps rather ironic that this basic conjecture originated upon the first observations of DNA structure itself. This static model of DNA led very quickly to the conclusion that only the nucleobase sequence itself is rich enough in molecular complexity to replicate a complex biology. This idea has been pervasive throughout genomic science, higher education and popular culture ever since; to the point that most of us would accept it unquestioningly as fact. What is more alarming is that this conjecture is driving a significant portion of the technological development in modern genomics towards methods strongly rooted in DNA sequencing, thereby reducing a dynamic multi-dimensional biology into single-dimensional forms of data. Evidence countering this central tenet of bioinformatics has been quietly mounting over many decades, prompting some to propose that the genome must be studied from the perspective of its molecular reality, rather than as a body of information to be represented symbolically. Here, we explore the epistemological boundary between bioinformatics and molecular biology, and warn against an 'overtly' bioinformatic perspective. We review a selection of new bioinformatic methods that move beyond sequence-based approaches to include consideration of databased three dimensional structures. However, we also note that these hybrid methods still ignore the most important element of gene function when attempting to improve outcomes; the fourth dimension of molecular dynamics over time. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Yu, Youjian; Liang, Ying; Lv, Meiling; Wu, Jian; Lu, Gang; Cao, Jiashu
2014-01-01
Polygalacturonase (PG, EC3.2.1.15), one of the hydrolytic enzymes associated with the modification of pectin network in plant cell wall, has an important role in various cell-separation processes that are essential for plant development. PGs are encoded by a large gene family in plants. However, information on this gene family in plant development remains limited. In the present study, 53 and 62 putative members of the PG gene family in cucumber and watermelon genomes, respectively, were identified by genome-wide search to explore the composition, structure, and evolution of the PG family in Cucurbitaceae crops. The results showed that tandem duplication could be an important factor that contributes to the expansion of the PG genes in the two crops. The phylogenetic and evolutionary analyses suggested that PGs could be classified into seven clades, and that the exon/intron structures and intron phases were conserved within but divergent between clades. At least 24 ancestral PGs were detected in the common ancestor of Arabidopsis and Cucumis sativus. Expression profile analysis by quantitative real-time polymerase chain reaction demonstrated that most CsPGs exhibit specific or high expression pattern in one of the organs/tissues. The 16 CsPGs associated with fruit development could be divided into three subsets based on their specific expression patterns and the cis-elements of fruit-specific, endosperm/seed-specific, and ethylene-responsive exhibited in their promoter regions. Our comparative analysis provided some basic information on the PG gene family, which would be valuable for further functional analysis of the PG genes during plant development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Genomic reassortment of influenza A virus in North American swine, 1998–2011
Detmer, Susan E.; Wentworth, David E.; Tan, Yi; Schwartzbard, Aaron; Halpin, Rebecca A.; Stockwell, Timothy B.; Lin, Xudong; Vincent, Amy L.; Gramer, Marie R.; Holmes, Edward C.
2012-01-01
Revealing the frequency and determinants of reassortment among RNA genome segments is fundamental to understanding basic aspects of the biology and evolution of the influenza virus. To estimate the extent of genomic reassortment in influenza viruses circulating in North American swine, we performed a phylogenetic analysis of 139 whole-genome viral sequences sampled during 1998–2011 and representing seven antigenically distinct viral lineages. The highest amounts of reassortment were detected between the H3 and the internal gene segments (PB2, PB1, PA, NP, M and NS), while the lowest reassortment frequencies were observed among the H1γ, H1pdm and neuraminidase segments, particularly N1. Less reassortment was observed among specific haemagglutinin–neuraminidase combinations that were more prevalent in swine, suggesting that some genome constellations may be evolutionarily more stable. PMID:22993190
2013-01-01
Background Multiple laboratories now offer clinical whole genome sequencing (WGS). We anticipate WGS becoming routinely used in research and clinical practice. Many institutions are exploring how best to educate geneticists and other professionals about WGS. Providing students in WGS courses with the option to analyze their own genome sequence is one strategy that might enhance students’ engagement and motivation to learn about personal genomics. However, if this option is presented to students, it is vital they make informed decisions, do not feel pressured into analyzing their own genomes by their course directors or peers, and feel free to analyze a third-party genome if they prefer. We therefore developed a 26-hour introductory genomics course in part to help students make informed decisions about whether to receive personal WGS data in a subsequent advanced genomics course. In the advanced course, they had the option to receive their own personal genome data, or an anonymous genome, at no financial cost to them. Our primary aims were to examine whether students made informed decisions regarding analyzing their personal genomes, and whether there was evidence that the introductory course enabled the students to make a more informed decision. Methods This was a longitudinal cohort study in which students (N = 19) completed questionnaires assessing their intentions, informed decision-making, attitudes and knowledge before (T1) and after (T2) the introductory course, and before the advanced course (T3). Informed decision-making was assessed using the Decisional Conflict Scale. Results At the start of the introductory course (T1), most (17/19) students intended to receive their personal WGS data in the subsequent course, but many expressed conflict around this decision. Decisional conflict decreased after the introductory course (T2) indicating there was an increase in informed decision-making, and did not change before the advanced course (T3). 
This suggests that it was the introductory course content rather than simply time passing that had the effect. In the advanced course, all (19/19) students opted to receive their personal WGS data. No changes in technical knowledge of genomics were observed. Overall attitudes towards WGS were broadly positive. Conclusions Providing students with intensive introductory education about WGS may help them make informed decisions about whether or not to work with their personal WGS data in an educational setting. PMID:24373383
An Integrated Molecular Database on Indian Insects.
Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil
2018-01-01
MOlecular Database on Indian Insects (MODII) is an online database linking several databases like Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach for collecting information about phenomic and genomic information of agriculturally important insects. This insect resource database is available online for free at http://cib.res.in/.
Simionato, Elena; Ledent, Valérie; Richards, Gemma; Thomas-Chollier, Morgane; Kerner, Pierre; Coornaert, David; Degnan, Bernard M; Vervoort, Michel
2007-01-01
Background Molecular and genetic analyses conducted in model organisms such as Drosophila and vertebrates, have provided a wealth of information about how networks of transcription factors control the proper development of these species. Much less is known, however, about the evolutionary origin of these elaborated networks and their large-scale evolution. Here we report the first evolutionary analysis of a whole superfamily of transcription factors, the basic helix-loop-helix (bHLH) proteins, at the scale of the whole metazoan kingdom. Results We identified in silico the putative full complement of bHLH genes in the sequenced genomes of 12 different species representative of the main metazoan lineages, including three non-bilaterian metazoans, the cnidarians Nematostella vectensis and Hydra magnipapillata and the demosponge Amphimedon queenslandica. We have performed extensive phylogenetic analyses of the 695 identified bHLHs, which has allowed us to allocate most of these bHLHs to defined evolutionary conserved groups of orthology. Conclusion Three main features in the history of the bHLH gene superfamily can be inferred from these analyses: (i) an initial diversification of the bHLHs has occurred in the pre-Cambrian, prior to metazoan cladogenesis; (ii) a second expansion of the bHLH superfamily occurred early in metazoan evolution before bilaterians and cnidarians diverged; and (iii) the bHLH complement during the evolution of the bilaterians has been remarkably stable. We suggest that these features may be extended to other developmental gene families and reflect a general trend in the evolution of the developmental gene repertoires of metazoans. PMID:17335570
Genomic prediction using phenotypes from pedigreed lines with no marker data
USDA-ARS's Scientific Manuscript database
Until now genomic prediction in plant breeding has only used information from individuals that have been genotyped. In practice, information from non-genotyped relatives of genotyped individuals can be used to improve the genomic prediction accuracy. Single-step genomic prediction integrates all the...
Selfish genetic elements, genetic conflict, and evolutionary innovation.
Werren, John H
2011-06-28
Genomes are vulnerable to selfish genetic elements (SGEs), which enhance their own transmission relative to the rest of an individual's genome but are neutral or harmful to the individual as a whole. As a result, genetic conflict occurs between SGEs and other genetic elements in the genome. There is growing evidence that SGEs, and the resulting genetic conflict, are an important motor for evolutionary change and innovation. In this review, the kinds of SGEs and their evolutionary consequences are described, including how these elements shape basic biological features, such as genome structure and gene regulation, evolution of new genes, origin of new species, and mechanisms of sex determination and development. The dynamics of SGEs are also considered, including possible "evolutionary functions" of SGEs.
Selfish genetic elements, genetic conflict, and evolutionary innovation
Werren, John H.
2011-01-01
Genomes are vulnerable to selfish genetic elements (SGEs), which enhance their own transmission relative to the rest of an individual's genome but are neutral or harmful to the individual as a whole. As a result, genetic conflict occurs between SGEs and other genetic elements in the genome. There is growing evidence that SGEs, and the resulting genetic conflict, are an important motor for evolutionary change and innovation. In this review, the kinds of SGEs and their evolutionary consequences are described, including how these elements shape basic biological features, such as genome structure and gene regulation, evolution of new genes, origin of new species, and mechanisms of sex determination and development. The dynamics of SGEs are also considered, including possible “evolutionary functions” of SGEs. PMID:21690392
Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto
2008-11-01
Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from Mendel's peas not only because many students do not understand that plants are organisms, but also because human biology contains important social and health issues. Therefore, we have developed a teaching program for the introduction to genome science, whose subjects are focused on the human genome. This program comprises mixed multimedia presentations: a large poster with illustrations and text on the human genome (a human genome map for every home), and animations on the basics of genome science. We implemented and assessed this program at four high schools. Our results indicate that students felt that they learned about the human genome from the program and some increases in students' understanding were observed with longer exposure to the mixed multimedia presentations. Copyright © 2008 International Union of Biochemistry and Molecular Biology, Inc.
Tabachnick, Walter J
2003-09-01
The completion of the Anopheles gambiae Giles genome sequencing project is a milestone toward developing more effective strategies in reducing the impact of malaria and other vector-borne diseases. The successes in developing transgenic approaches using mosquitoes have provided another essential new tool for further progress in basic vector genetics and the goal of disease control. The use of transgenic approaches to develop refractory mosquitoes is also possible. The ability to use genome sequence to identify genes, and transgenic approaches to construct refractory mosquitoes, has provided the opportunity that with the future development of an appropriate genetic drive system, refractory transgenes can be released into vector populations leading to nontransmitting mosquitoes, that is, An. gambiae populations incapable of transmitting malaria. This compelling strategy will be very difficult to achieve and will require a broad substantial research program for success. The fundamental information that is required on genome structure, gene function and environmental effects on genetic expression is largely unknown. The ability to predict gene effects on phenotype is rudimentary, particularly in natural populations. As a result, the release of a refractory transgene into natural mosquito populations is imprecise and there is little ability to predict unintended consequences. The new genetic tools at hand provide opportunities to address an array of important issues, many of which can have immediate impact on the effectiveness of a host of strategies to control vector-borne disease. Transgenic release approaches represent only one strategy that should be pursued. A balanced research program is required.
Flow cytogenetics and chromosome sorting.
Cram, L S
1990-06-01
This review of flow cytogenetics and chromosome sorting provides an overview of general information in the field and describes recent developments in more detail. From the early developments of chromosome analysis involving single parameter or one color analysis to the latest developments in slit scanning of single chromosomes in a flow stream, the field has progressed rapidly and most importantly has served as an important enabling technology for the human genome project. Technological innovations that advanced flow cytogenetics are described and referenced. Applications in basic cell biology, molecular biology, and clinical investigations are presented. The necessary characteristics for large number chromosome sorting are highlighted. References to recent review articles are provided as a starting point for locating individual references that provide more detail. Specific references are provided for recent developments.
Dengue Virus Genome Uncoating Requires Ubiquitination.
Byk, Laura A; Iglesias, Néstor G; De Maio, Federico A; Gebhard, Leopoldo G; Rossi, Mario; Gamarnik, Andrea V
2016-06-28
The process of genome release or uncoating after viral entry is one of the least-studied steps in the flavivirus life cycle. Flaviviruses are mainly arthropod-borne viruses, including emerging and reemerging pathogens such as dengue, Zika, and West Nile viruses. Currently, dengue virus is one of the most significant human viral pathogens transmitted by mosquitoes and is responsible for about 390 million infections every year around the world. Here, we examined for the first time molecular aspects of dengue virus genome uncoating. We followed the fate of the capsid protein and RNA genome early during infection and found that capsid is degraded after viral internalization by the host ubiquitin-proteasome system. However, proteasome activity and capsid degradation were not necessary to free the genome for initial viral translation. Unexpectedly, genome uncoating was blocked by inhibiting ubiquitination. Using different assays to bypass entry and evaluate the first rounds of viral translation, a narrow window of time during infection that requires ubiquitination but not proteasome activity was identified. In this regard, ubiquitin E1-activating enzyme inhibition was sufficient to stabilize the incoming viral genome in the cytoplasm of infected cells, causing its retention in either endosomes or nucleocapsids. Our data support a model in which dengue virus genome uncoating requires a nondegradative ubiquitination step, providing new insights into this crucial but understudied viral process. Dengue is the most significant arthropod-borne viral infection in humans. Although the number of cases increases every year, there are no approved therapeutics available for the treatment of dengue infection, and many basic aspects of the viral biology remain elusive. After entry, the viral membrane must fuse with the endosomal membrane to deliver the viral genome into the cytoplasm for translation and replication. 
A great deal of information has been obtained in the last decade regarding molecular aspects of the fusion step, but little is known about the events that follow this process, which leads to viral RNA release from the nucleocapsid. Here, we investigated the fate of nucleocapsid components (capsid protein and viral genome) during the infection process and found that capsid is degraded by the ubiquitin-proteasome system. However, in contrast to that observed for other RNA and DNA viruses, dengue virus capsid degradation was not responsible for genome uncoating. Interestingly, we found that dengue virus genome release requires a nondegradative ubiquitination step. These results provide the first insights into dengue virus uncoating and present new opportunities for antiviral intervention. Copyright © 2016 Byk et al.
Marton, Ira; Honig, Arik; Omid, Ayelet; De Costa, Noam; Marhevka, Elena; Cohen, Barry; Zuker, Amir; Vainstein, Alexander
2013-01-01
Researchers and biotechnologists require methods to accurately modify the genome of higher eukaryotic cells. Such modifications include, but are not limited to, site-specific mutagenesis, site-specific insertion of foreign DNA, and replacement and deletion of native sequences. Accurate genome modifications in plant species have been rather limited, with only a handful of plant species and genes being modified through the use of early genome-editing techniques. The development of rare-cutting restriction enzymes as a tool for the induction of site-specific genomic double-strand breaks and their introduction as a reliable tool for genome modification in animals, animal cells and human cell lines have paved the way for the adaptation of rare-cutting restriction enzymes to genome editing in plant cells. Indeed, the number of plant species and genes which have been successfully edited using zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and engineered homing endonucleases is on the rise. In our review, we discuss the basics of rare-cutting restriction enzyme-mediated genome-editing technology with an emphasis on its application in plant species.
5-years later - have faculty integrated medical genetics into nurse practitioner curriculum?
Maradiegue, Ann H; Edwards, Quannetta T; Seibert, Diane
2013-10-31
Many genetic/genomic educational opportunities are available to assist nursing faculty in their knowledge and understanding of genetics/genomics. This study was conducted to assess advanced practice nursing faculty members' current knowledge of medical genetics/genomics, their integration of genetics/genomics content into advanced practice nursing curricula, any prior formal training/education in genetics/genomics, and their comfort level in teaching genetics/genomics content. A secondary aim was to conduct a comparative analysis of the 2010 data to a previous study conducted in 2005, to determine changes that have taken place during that time period. During a national nurse practitioner faculty conference, 85 nurse practitioner faculty voluntarily completed surveys. Approximately 70% of the 2010 faculty felt comfortable teaching basic genetic/genomic concepts compared to 50% in 2005. However, there continue to be education gaps in the genetic/genomic content taught to advanced practice nursing students. If nurses are going to be crucial members of the health-care team, they must achieve the requisite competencies to deliver the increasingly complex care patients require.
Review of general algorithmic features for genome assemblers for next generation sequencers.
Wajid, Bilal; Serpedin, Erchin
2012-04-01
In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also briefly discuss the direction in which the area is headed, along with core issues of software simplicity. Copyright © 2012 Beijing Institute of Genomics, Chinese Academy of Sciences. Published by Elsevier Ltd. All rights reserved.
BeetleBase in 2010: Revisions to Provide Comprehensive Genomic Information for Tribolium castaneum
USDA-ARS's Scientific Manuscript database
BeetleBase (http://www.beetlebase.org) has been updated to provide more comprehensive genomic information for the red flour beetle Tribolium castaneum. The database contains genomic sequence scaffolds mapped to 10 linkage groups (genome assembly release Tcas_3.0), genetic linkage maps, the official ...
Dressler, Lynn G
2013-01-01
The provision of personalized genomic medicine presents significant policy challenges, such as ensuring equitable patient access to testing, preparing clinicians to manage genomic results, justifying test reimbursement, sharing genomic information for patient care, and protecting patients against misuse of genetic information.
Barnard, Annette-Christi; Nijhof, Ard M.; Fick, Wilma; Stutzer, Christian; Maritz-Olivier, Christine
2012-01-01
The availability of genome sequencing data in combination with knowledge of expressed genes via transcriptome and proteome data has greatly advanced our understanding of arthropod vectors of disease. Not only have we gained insight into vector biology, but also into their respective vector-pathogen interactions. By combining the strengths of postgenomic databases and reverse genetic approaches such as RNAi, the numbers of available drug and vaccine targets, as well as the number of transgenes for subsequent transgenic or paratransgenic approaches, have expanded. These are now paving the way for in-field control strategies of vectors and their pathogens. Addressing basic scientific questions, such as understanding the basic components of the vector RNAi machinery, is vital, as this allows for the transfer of basic RNAi machinery components into RNAi-deficient vectors, thereby expanding the genetic toolbox of these RNAi-deficient vectors and pathogens. In this review, we focus on the current knowledge of arthropod vector RNAi machinery and the impact of RNAi on understanding vector biology and vector-pathogen interactions for which vector genomic data is available on VectorBase. PMID:24705082
Covell, David G
2015-01-01
Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change, developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells find a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.
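The "positive predictive power" reported above is, under its usual definition, the fraction of samples predicted to respond that actually do respond: TP / (TP + FP). A minimal Python sketch of that calculation follows. The function name and the toy labels are hypothetical and purely illustrative; they are not CGP or CCLE data, and the abstract does not describe how the study computed its value.

```python
def positive_predictive_value(predicted, actual):
    """Return TP / (TP + FP) for two equal-length sequences of booleans.

    predicted[i] is True when the biomarker predicts a drug response,
    actual[i] is True when a response was observed.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Made-up toy labels (assumption for illustration only).
predicted = [True, True, True, True, False, False, True, False, True]
actual = [True, True, False, True, False, True, True, False, True]

ppv = positive_predictive_value(predicted, actual)
print(f"PPV = {ppv:.2f}")
```

In this toy set, six samples are predicted to respond and five of them do, so the value is 5/6.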
Jonas, Elisabeth; de Koning, Dirk-Jan
2015-01-01
Genomic selection is a promising development in agriculture, aiming at improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection (GS) in breeding programs is highly desirable not only for the common good, but also for the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating GS into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken, and fish. It outlines tasks to help in understanding possible consequences when applying genomic information in breeding scenarios.
PMID:25750652
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.
Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song
2013-01-01
Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide foundation for future genetic and genomic studies of this and related species.
proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.
Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer
2017-01-04
The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Xia, Wei; Mason, Annaliese S.; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru
2013-01-01
Background: Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. Methodology/Principal Findings: To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Conclusions/Significance: Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species. PMID:23555859
Fan, Haikuo; Xiao, Yong; Yang, Yaodong; Xia, Wei; Mason, Annaliese S; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru
2013-01-01
Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species.
Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong
2015-01-01
Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Human Genome Sequencing in Health and Disease
Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.
2013-01-01
Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320
Efficient Breeding by Genomic Mating.
Akdemir, Deniz; Sánchez, Julio I
2016-01-01
Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we propose an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem, and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that the current approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.
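For contrast with genomic mating, the truncation selection that the abstract describes as the basis of existing methods can be sketched in a few lines of Python: candidates are ranked by an estimated breeding value and only the top few are kept as parents. The function name, candidate IDs and breeding values below are hypothetical; this is not the authors' genetic algorithm, which additionally weighs risk and coefficient of ancestry when pairing parents.

```python
def truncation_select(breeding_values, n_keep):
    """Return the IDs of the n_keep candidates with the highest estimated breeding value.

    breeding_values maps a candidate ID to its estimated breeding value.
    """
    if not 1 <= n_keep <= len(breeding_values):
        raise ValueError("n_keep must be between 1 and the number of candidates")
    # Rank candidate IDs from highest to lowest estimated breeding value.
    ranked = sorted(breeding_values, key=breeding_values.get, reverse=True)
    return ranked[:n_keep]


# Made-up estimated breeding values (assumption for illustration only).
ebv = {"P1": 1.8, "P2": 0.4, "P3": 2.3, "P4": -0.6, "P5": 1.1}

selected = truncation_select(ebv, n_keep=2)
print(selected)  # ['P3', 'P1']
```

Truncation selection ranks individuals in isolation, whereas the mating approach described above evaluates which pairs of genotypes should be crossed.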
Valle Mansilla, José Ignacio
2011-01-01
Biomedical researchers now often ask subjects to donate samples to be deposited in biobanks. This is not only of interest to researchers; patients and society as a whole can also benefit from the improvements in diagnosis, treatment, and prevention that the advent of genomic medicine portends. However, there is a growing debate regarding the social and ethical implications of creating biobanks and using stored human tissue samples for genomic research. Our aim was to identify factors related to both scientists' and patients' preferences regarding the sort of information to convey to subjects about the results of the study and the risks related to genomic research. The method used was a survey addressed to 204 scientists and 279 donors from the U.S. and Spain. In this sample, researchers had already published genomic epidemiology studies, and research subjects had actually volunteered to donate a human sample for genomic research. Patients supported their right to know individual results from future genomic research more frequently than scientists did. These differences were statistically significant after adjusting for the opportunity to receive genetic research results from the research in which they had previously participated and for their perception of risks regarding genetic information compared to other clinical data. A slight majority of researchers supported informing participants about individual genomic results only if the reliability and clinical validity of the information had been established. Men were more likely than women to believe that patients should be informed of research results even if these conditions were not met. Among patients, almost half would always prefer to be informed about individual results from future genomic research.
The three main factors associated with greater support for unrestricted access to individual results were: being from the US, having previously been offered individual information and considering genomic data more sensitive than other personal medical data. Moreover, the patient's disease, educational level and country of origin were factors associated with the perception of risks related to genomic information. In conclusion, it is essential to clarify the criteria required to establish when individual results from genomic research should be offered to participants.
The Carnegie Department of Embryology at 100: Looking Forward.
Spradling, Allan C
2016-01-01
Biological research has a realistic chance within the next 50 years of discovering the basic mechanisms by which metazoan genomes encode the complex morphological structures and capabilities that characterize life as we know it. However, achieving those goals is now threatened by researchers who advocate an end to basic research on nonmammalian organisms. For the sake of society, medicine, and the science of biology, the focus of biomedical research should place more emphasis on basic studies guided by the underlying evolutionary commonality of all major animals, as manifested in their genes, pathways, cells, and organs. © 2016 Elsevier Inc. All rights reserved.
An insight into cyanobacterial genomics--a perspective.
Lakshmi, Palaniswamy Thanga Velan
2007-05-20
At the turn of the millennium, cyanobacteria deserve a review to understand their past, present and future. Post-genomic research, which encompasses functional genomics, structural genomics, transcriptomics, pharmacogenomics, proteomics and metabolomics, allows a systematic, wide-ranging approach to the study of biological systems. By exploiting genomic and associated protein information through computational analyses, the fledgling information generated by biotechnological analyses could be extrapolated to fill the lacuna of scarce information on cyanobacteria. To that end, this paper attempts to highlight the available perspectives and to encourage researchers to concentrate on the field of cyanobacterial informatics.
Progress in Understanding and Sequencing the Genome of Brassica rapa
Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo
2008-01-01
Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassiceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissions, resulted in the present-day “diploid” Brassica species with variation in chromosome number and phenotype. Triplicated genomic segments of B. rapa are collinear to those of A. thaliana with InDels. The genome triplication has led to an approximately 1.7-fold increase in the B. rapa gene number compared to that of A. thaliana. Repetitive DNA of B. rapa has also been extensively amplified and has diverged from that of A. thaliana. For its whole-genome sequencing, the Brassica rapa Genome Sequencing Project (BrGSP) consortium has developed suitable genomic resources and constructed genetic and physical maps. Ten chromosomes of B. rapa are being allocated to BrGSP consortium participants, and each chromosome will be sequenced by a BAC-by-BAC approach. Genome sequencing of B. rapa will offer a new perspective for plant biology and evolution in the context of polyploidization. PMID:18288250
Suckiel, Sabrina A; Linderman, Michael D; Sanderson, Saskia C; Diaz, George A; Wasserstein, Melissa; Kasarskis, Andrew; Schadt, Eric E; Zinberg, Randi E
2016-10-01
Personal genome sequencing is increasingly utilized by healthy individuals for predispositional screening and other applications. However, little is known about the impact of 'genomic counseling' on informed decision-making in this context. Our primary aim was to compare measures of participants' informed decision-making before and after genomic counseling in the HealthSeq project, a longitudinal cohort study of individuals receiving personal results from whole genome sequencing (WGS). Our secondary aims were to assess the impact of the counseling on WGS knowledge and concerns, and to explore participants' satisfaction with the counseling. Questionnaires were administered to participants (n = 35) before and after their pre-test genomic counseling appointment. Informed decision-making was measured using the Decisional Conflict Scale (DCS) and the Satisfaction with Decision Scale (SDS). DCS scores decreased after genomic counseling (mean: 11.34 before vs. 5.94 after; z = -4.34, p < 0.001, r = 0.52), and SDS scores increased (mean: 27.91 vs. 29.06 respectively; z = 2.91, p = 0.004, r = 0.35). Satisfaction with counseling was high (mean (SD) = 26.91 (2.68), on a scale where 6 = low and 30 = high satisfaction). HealthSeq participants felt that their decision regarding receiving personal results from WGS was more informed after genomic counseling. Further research comparing the impact of different genomic counseling models is needed.
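The effect sizes reported above are consistent with the common conversion r = |z| / sqrt(N), if N is taken as the total number of observations (35 participants measured twice, so N = 70). Both the formula and that choice of N are assumptions made here; the abstract does not say how r was computed. A minimal Python check of the arithmetic:

```python
import math


def effect_size_r(z, n_observations):
    """Effect size r = |z| / sqrt(N) for a z-based test statistic."""
    return abs(z) / math.sqrt(n_observations)


n_participants = 35
# Assumption: N counts both the "before" and the "after" observation.
n_observations = 2 * n_participants

dcs_r = effect_size_r(-4.34, n_observations)  # Decisional Conflict Scale
sds_r = effect_size_r(2.91, n_observations)   # Satisfaction with Decision Scale

print(round(dcs_r, 2), round(sds_r, 2))  # prints: 0.52 0.35
```

Under this assumption the computed values match the reported r = 0.52 and r = 0.35.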
The Power of CRISPR-Cas9-Induced Genome Editing to Speed Up Plant Breeding
Wang, Wenqin; Le, Hien T. T.
2016-01-01
Genome editing with engineered nucleases enabling site-directed sequence modifications bears a great potential for advanced plant breeding and crop protection. Remarkably, the RNA-guided endonuclease technology (RGEN) based on the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein 9 (Cas9) is an extremely powerful and easy tool that revolutionizes both basic research and plant breeding. Here, we review the major technical advances and recent applications of the CRISPR-Cas9 system for manipulation of model and crop plant genomes. We also discuss the future prospects of this technology in molecular plant breeding. PMID:28097123
Weitzel, Jeffrey N.; Blazer, Kathleen R.; MacDonald, Deborah J.; Culver, Julie O.; Offit, Kenneth
2012-01-01
Scientific and technologic advances are revolutionizing our approach to genetic cancer risk assessment, cancer screening and prevention, and targeted therapy, fulfilling the promise of personalized medicine. In this monograph we review the evolution of scientific discovery in cancer genetics and genomics, and describe current approaches, benefits and barriers to the translation of this information to the practice of preventive medicine. Summaries of known hereditary cancer syndromes and highly penetrant genes are provided and contrasted with recently-discovered genomic variants associated with modest increases in cancer risk. We describe the scope of knowledge, tools, and expertise required for the translation of complex genetic and genomic test information into clinical practice. The challenges of genomic counseling include the need for genetics and genomics professional education and multidisciplinary team training, the need for evidence-based information regarding the clinical utility of testing for genomic variants, the potential dangers posed by premature marketing of first-generation genomic profiles, and the need for new clinical models to improve access to and responsible communication of complex disease-risk information. We conclude that given the experiences and lessons learned in the genetics era, the multidisciplinary model of genetic cancer risk assessment and management will serve as a solid foundation to support the integration of personalized genomic information into the practice of cancer medicine. PMID:21858794
Recent Progress in Genome Editing Approaches for Inherited Cardiovascular Diseases.
Kaur, Balpreet; Perea-Gil, Isaac; Karakikes, Ioannis
2018-06-02
This review describes the recent progress in nuclease-based therapeutic applications for inherited heart diseases in vitro, highlights the development of the most recent genome editing technologies and discusses the associated challenges for clinical translation. Inherited cardiovascular disorders are passed from generation to generation. Over the past decade, considerable progress has been made in understanding the genetic basis of inherited heart diseases. The timely emergence of genome editing technologies using engineered programmable nucleases has revolutionized the basic research of inherited cardiovascular diseases and holds great promise for the development of targeted therapies. The genome editing toolbox is rapidly expanding, and new tools have been recently added that significantly expand the capabilities of engineered nucleases. Newer classes of versatile engineered nucleases, such as the "base editors," have been recently developed, offering the potential for efficient and precise therapeutic manipulation of the human genome.
Chelomina, G N
2017-01-01
The review summarizes the results of first genomic and transcriptomic investigations of the liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda). The studies mark the dawn of the genomic era for opisthorchiids, which cause severe hepatobiliary diseases in humans and animals. Their results aided in understanding the molecular mechanisms of adaptation to parasitism, parasite survival in mammalian biliary tracts, and genome dynamics in the individual development and the development of parasite-host relationships. Special attention is paid to the achievements in studying the codon usage bias and the roles of mobile genetic elements (MGEs) and small interfering RNAs (siRNAs). Interspecific comparisons at the genomic and transcriptomic levels revealed molecular differences, which may contribute to understanding the specialized niches and physiological needs of the respective species. The studies in C. sinensis provide a basis for further basic and applied research in liver flukes and, in particular, the development of efficient means to prevent, diagnose, and treat clonorchiasis.
Mirabello, Lisa; Clarke, Megan A; Nelson, Chase W; Dean, Michael; Wentzensen, Nicolas; Yeager, Meredith; Cullen, Michael; Boland, Joseph F; Schiffman, Mark; Burk, Robert D
2018-02-13
Of the ~60 human papillomavirus (HPV) genotypes that infect the cervicovaginal epithelium, only 12-13 "high-risk" types are well-established as causing cervical cancer, with HPV16 accounting for over half of all cases worldwide. While HPV16 is the most important carcinogenic type, variants of HPV16 can differ in their carcinogenicity by 10-fold or more in epidemiologic studies. Strong genotype-phenotype associations embedded in the small 8-kb HPV16 genome motivate molecular studies to understand the underlying molecular mechanisms. Understanding the mechanisms of HPV genomic findings is complicated by the linkage of HPV genome variants. A panel of experts in various disciplines gathered on 21 November 2016 to discuss the interdisciplinary science of HPV oncogenesis. Here, we summarize the discussion of the complexity of the viral-host interaction and highlight important next steps for selected applied basic laboratory studies guided by epidemiological genomic findings.
Mirabello, Lisa; Clarke, Megan A.; Nelson, Chase W.; Dean, Michael; Wentzensen, Nicolas; Yeager, Meredith; Cullen, Michael; Boland, Joseph F.; Schiffman, Mark
2018-01-01
Of the ~60 human papillomavirus (HPV) genotypes that infect the cervicovaginal epithelium, only 12–13 “high-risk” types are well-established as causing cervical cancer, with HPV16 accounting for over half of all cases worldwide. While HPV16 is the most important carcinogenic type, variants of HPV16 can differ in their carcinogenicity by 10-fold or more in epidemiologic studies. Strong genotype-phenotype associations embedded in the small 8-kb HPV16 genome motivate molecular studies to understand the underlying molecular mechanisms. Understanding the mechanisms of HPV genomic findings is complicated by the linkage of HPV genome variants. A panel of experts in various disciplines gathered on 21 November 2016 to discuss the interdisciplinary science of HPV oncogenesis. Here, we summarize the discussion of the complexity of the viral–host interaction and highlight important next steps for selected applied basic laboratory studies guided by epidemiological genomic findings. PMID:29438321
Using the Saccharomyces Genome Database (SGD) for analysis of genomic information
Skrzypek, Marek S.; Hirschman, Jodi
2011-01-01
Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739
Integrating genomics into undergraduate nursing education.
Daack-Hirsch, Sandra; Dieter, Carla; Quinn Griffin, Mary T
2011-09-01
To prepare the next generation of nurses, faculty are now faced with the challenge of incorporating genomics into curricula. Here we discuss how to meet this challenge. Steps to initiate curricular changes to include genomics are presented along with a discussion on creating a genomic curriculum thread versus a standalone course. Ideas for use of print material and technology on genomic topics are also presented. Information is based on review of the literature and curriculum change efforts by the authors. In recognition of advances in genomics, the nursing profession is increasing an emphasis on the integration of genomics into professional practice and educational standards. Incorporating genomics into nurses' practices begins with changes in our undergraduate curricula. Information given in didactic courses should be reinforced in clinical practica, and Internet-based tools such as WebQuest, Second Life, and wikis offer attractive, up-to-date platforms to deliver this now crucial content. To provide information that may assist faculty to prepare the next generation of nurses to practice using genomics. © 2011 Sigma Theta Tau International.
Lee, Ju Seok; Chen, Junghuei; Deaton, Russell; Kim, Jin-Woo
2014-01-01
Genetic material extracted from in situ microbial communities has high promise as an indicator of biological system status. However, the challenge is to access genomic information from all organisms at the population or community scale to monitor the biosystem's state. Hence, there is a need for a better diagnostic tool that provides a holistic view of a biosystem's genomic status. Here, we introduce an in vitro methodology for genomic pattern classification of biological samples that taps large amounts of genetic information from all genes present and uses that information to detect changes in genomic patterns and classify them. We developed a biosensing protocol, termed Biological Memory, that has in vitro computational capabilities to "learn" and "store" genomic sequence information directly from genomic samples without knowledge of their explicit sequences, and that discovers differences in vitro between previously unknown inputs and learned memory molecules. The Memory protocol was designed and optimized based upon (1) common in vitro recombinant DNA operations using 20-base random probes, including polymerization, nuclease digestion, and magnetic bead separation, to capture a snapshot of the genomic state of a biological sample as a DNA memory and (2) the thermal stability of DNA duplexes between new input and the memory to detect similarities and differences. For efficient read out, a microarray was used as an output method. When the microarray-based Memory protocol was implemented to test its capability and sensitivity using genomic DNA from two model bacterial strains, i.e., Escherichia coli K12 and Bacillus subtilis, results indicate that the Memory protocol can "learn" input DNA, "recall" similar DNA, differentiate between dissimilar DNA, and detect relatively small concentration differences in samples. 
This study demonstrated not only the in vitro information processing capabilities of DNA, but also its promise as a genomic pattern classifier that could access information from all organisms in a biological system without explicit genomic information. The Memory protocol has high potential for many applications, including in situ biomonitoring of ecosystems, screening for diseases, biosensing of pathological features in water and food supplies, and non-biological information processing of memory devices, among many others.
Moyer, Tyler C; Holland, Andrew J
2015-01-01
The ability to rapidly and specifically modify the genome of mammalian cells has been a long-term goal of biomedical researchers. Recently, the clustered, regularly interspaced, short palindromic repeats (CRISPR)/Cas9 system from bacteria has been exploited for genome engineering in human cells. The CRISPR system directs the RNA-guided Cas9 nuclease to a specific genomic locus to induce a DNA double-strand break that may be subsequently repaired by homology-directed repair using an exogenous DNA repair template. Here we describe a protocol using CRISPR/Cas9 to achieve bi-allelic insertion of a point mutation in human cells. Using this method, homozygous clonal cell lines can be constructed in 5-6 weeks. This method can also be adapted to insert larger DNA elements, such as fluorescent proteins and degrons, at defined genomic locations. CRISPR/Cas9 genome engineering offers exciting applications in both basic science and translational research. Copyright © 2015 Elsevier Inc. All rights reserved.
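As a rough illustration of how a Cas9 target locus is chosen, the sketch below scans a DNA string for 20-nucleotide protospacers followed directly by an NGG PAM, the motif recognized by the commonly used Streptococcus pyogenes Cas9. It is not the authors' protocol: the function name `find_cas9_targets` and the toy sequence are hypothetical, only the forward strand is searched, and real guide design also weighs off-target effects and distance to the intended edit.

```python
import re


def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each forward-strand NGG site.

    A lookahead is used so that overlapping candidate sites are all found.
    """
    seq = seq.upper()
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical): a 20-nt stretch followed by the PAM "AGG".
toy = "ATGCATGCATGCATGCATGC" + "AGG" + "TT"
for start, protospacer, pam in find_cas9_targets(toy):
    print(start, protospacer, pam)
# prints: 0 ATGCATGCATGCATGCATGC AGG
```

To cover both strands, the same function could be run on the reverse complement of the sequence.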
The impact of next-generation sequencing on genomics
Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa
2011-01-01
This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. However, the massive volume of data produced by NGS also presents a significant challenge for data storage, analysis, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle challenges that previous genomic technologies left unsolved, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and will likely change the field for years to come. PMID:21477781
The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide
Liolios, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Kyrpides, Nikos C.
2006-01-01
The Genomes On Line Database (GOLD) is a web resource for comprehensive access to information regarding complete and ongoing genome sequencing projects worldwide. The database currently incorporates information on over 1500 sequencing projects, of which 294 have been completed and the data deposited in the public databases. GOLD v.2 has been expanded to provide information related to organism properties such as phenotype, ecotype and disease. Furthermore, project relevance and availability information is now included. GOLD is available online and is also mirrored at the Institute of Molecular Biology and Biotechnology, Crete, Greece. PMID:16381880
Merlo, Domenico F; Filiberti, Rosangela; Kobernus, Michael; Bartonova, Alena; Gamulin, Marija; Ferencic, Zeljko; Dusinska, Maria; Fucic, Aleksandra
2012-06-28
Development of graphical/visual presentations of cancer etiology caused by environmental stressors is a process that requires combining the complex biological interactions between xenobiotics in living and occupational environments with genes (gene-environment interaction) and genomic and non-genomic based disease-specific mechanisms in living organisms. Traditionally, presentation of causal relationships includes the statistical association between exposure to one xenobiotic and the disease, corrected for the effect of potential confounders. Within the FP6 project HENVINET, we aimed at considering together all known agents and mechanisms involved in the development of selected cancer types. Selection of cancer types for causal diagrams was based on the corpus of available data and reported relative risk (RR). In constructing causal diagrams, the complexity of the interactions between xenobiotics was considered a priority in the interpretation of cancer risk. Additionally, gene-environment interactions were incorporated, such as polymorphisms in genes for repair and for phase I and II enzymes involved in the metabolism of xenobiotics and their elimination. Information on possible age or gender susceptibility is also included. Diagrams are user friendly thanks to multistep access to information packages and the possibility of referring to related literature and a glossary of terms. Diagrams cover both chemical and physical agents (ionizing and non-ionizing radiation) and provide basic information on the strength of the association between type of exposure and cancer risk reported by human studies and supported by mechanistic studies. Causal diagrams developed within the HENVINET project represent a valuable source of information for professionals working in the field of environmental health and epidemiology, and educational material for students.
Cancer risk results from a complex interaction of environmental exposures with inherited gene polymorphisms, genetic burden collected during development and the non-genomic capacity to respond to environmental insults. In order to adopt effective preventive measures and the associated regulatory actions, a comprehensive investigation of cancer etiology is crucial. Variations and fluctuations of cancer incidence in human populations do not necessarily reflect environmental pollution policies or the population distribution of polymorphisms of genes known to be associated with increased cancer risk. Tools that may be used in such comprehensive research, including molecular biology applied to field studies, require a methodological shift from the reductionism that has been used until recently as a basic axiom in the interpretation of data. The complexity of the interactions between cells, genes and the environment, i.e. the resonance of living matter with the environment, can be synthesized by systems biology. Within the HENVINET project this philosophy was followed in order to develop interactive causal diagrams for the investigation of cancers with possible etiology in environmental exposure. Causal diagrams represent integrated knowledge and a seed tool for their future development and for the development of similar diagrams for other environmentally related diseases such as asthma or sterility. In this paper the development and application of causal diagrams for cancer are presented and discussed.
Cheering for Team Science | Office of Cancer Genomics
As a graduate student, my PhD thesis focused on the function of a single human gene, within a genome of some 20,000 genes. Although this sometimes made my work seem insignificant, I was reminded of how important one small piece of a large puzzle can be when I discovered all the ways the gene knockout cells were disadvantaged. Studying the basic biology of our cells made me appreciate the beautiful complexity of human biology.
Full-Genome Analysis of Avian Influenza A(H5N1) Virus from a Human, North America, 2013
Pabbaraju, Kanti; Tellier, Raymond; Wong, Sallene; Li, Yan; Bastien, Nathalie; Tang, Julian W.; Drews, Steven J.; Jang, Yunho; Davis, C. Todd; Tipples, Graham A.
2014-01-01
Full-genome analysis was conducted on the first isolate of a highly pathogenic avian influenza A(H5N1) virus from a human in North America. The virus has a hemagglutinin gene of clade 2.3.2.1c and is a reassortant with an H9N2 subtype lineage polymerase basic 2 gene. No mutations conferring resistance to adamantanes or neuraminidase inhibitors were found. PMID:24755439
Origin and Diversification of Basic-Helix-Loop-Helix Proteins in Plants
Pires, Nuno; Dolan, Liam
2010-01-01
Basic helix-loop-helix (bHLH) proteins are a class of transcription factors found throughout eukaryotic organisms. Classification of the complete sets of bHLH proteins in the sequenced genomes of Arabidopsis thaliana and Oryza sativa (rice) has defined the diversity of these proteins among flowering plants. However, the evolutionary relationships of different plant bHLH groups and the diversity of bHLH proteins in more ancestral groups of plants are currently unknown. In this study, we use whole-genome sequences from nine species of land plants and algae to define the relationships between these proteins in plants. We show that few (less than 5) bHLH proteins are encoded in the genomes of chlorophytes and red algae. In contrast, many bHLH proteins (100–170) are encoded in the genomes of land plants (embryophytes). Phylogenetic analyses suggest that plant bHLH proteins are monophyletic and constitute 26 subfamilies. Twenty of these subfamilies existed in the common ancestors of extant mosses and vascular plants, whereas six further subfamilies evolved among the vascular plants. In addition to the conserved bHLH domains, most subfamilies are characterized by the presence of highly conserved short amino acid motifs. We conclude that much of the diversity of plant bHLH proteins was established in early land plants, over 440 million years ago. PMID:19942615
32 CFR Appendix A to Part 275 - Obtaining Basic Identifying Account Information
Code of Federal Regulations, 2010 CFR
2010-07-01
... 32 National Defense 2 2010-07-01 2010-07-01 false Obtaining Basic Identifying Account Information... Information A. A DoD law enforcement office may issue a formal written request for basic identifying account... only the above specified basic identifying information concerning a customer's account. C. A format for...
Interpretation of Genomic Data Questions and Answers
Simon, Richard
2008-01-01
Using a question and answer format we describe important aspects of using genomic technologies in cancer research. The main challenges are not managing the mass of data, but rather the design, analysis and accurate reporting of studies that result in increased biological knowledge and medical utility. Many analysis issues address the use of expression microarrays but are also applicable to other whole genome assays. Microarray based clinical investigations have generated both unrealistic hyperbole and excessive skepticism. Genomic technologies are tremendously powerful and will play instrumental roles in elucidating the mechanisms of oncogenesis and in developing an era of predictive medicine in which treatments are tailored to individual tumors. Achieving these goals involves challenges in re-thinking many paradigms for the conduct of basic and clinical cancer research and for the organization of interdisciplinary collaboration. PMID:18582627
Genome Engineering with TALE and CRISPR Systems in Neuroscience
Lee, Han B.; Sundberg, Brynn N.; Sigafoos, Ashley N.; Clark, Karl J.
2016-01-01
Recent advancement in genome engineering technology is changing the landscape of biological research and providing neuroscientists with an opportunity to develop new methodologies to ask critical research questions. This advancement is highlighted by the increased use of programmable DNA-binding agents (PDBAs) such as transcription activator-like effector (TALE) and RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated (Cas) systems. These PDBAs fused or co-expressed with various effector domains allow precise modification of genomic sequences and gene expression levels. These technologies mirror and extend beyond classic gene targeting methods contributing to the development of novel tools for basic and clinical neuroscience. In this Review, we discuss the recent development in genome engineering and potential applications of this technology in the field of neuroscience. PMID:27092173
Guerra, Daniel J.
2011-01-01
Autism spectrum disorders (ASDs) have become increasingly common in recent years. The discovery of single-nucleotide polymorphisms and accompanying copy number variations within the genome has increased our understanding of the architecture of the disease. These genetic and genomic alterations coupled with epigenetic phenomena have pointed to a neuroimmunopathological mechanism for ASD. Model animal studies, developmental biology, and affective neuroscience laid a foundation for dissecting the neural pathways impacted by these disease-generating mechanisms. The goal of current autism research is directed toward a systems biological approach to find the most basic genetic and environmental causes to this severe developmental disease. It is hoped that future genomic and neuroimmunological research will be directed toward finding the road toward prevention, treatment, and cure of ASD. PMID:22937247
USDA-ARS's Scientific Manuscript database
This chapter covers the use of wild beets in sugar beet improvement, including the basic botany of the species, its distribution; geographical locations of genetic diversity; morphology; cytology and karyotype; genome size; taxonomic position; agricultural status (model plant/weeds/invasive species/...
Welch, Brandon M.; Rodriguez Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku
2014-01-01
Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine. PMID:25411644
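The component interaction described above can be sketched in Python. This is only an illustration of the idea that a controller orchestrates independently managed stores while genome data stays outside the EHR; every class name, method, and identifier below (for example `CDSController.evaluate`, "variant-A", "drug-X") is hypothetical and does not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class GenomeDatabase:
    """Holds patient variants, stored separately from the EHR."""
    variants_by_patient: dict = field(default_factory=dict)

    def variants_for(self, patient_id):
        return self.variants_by_patient.get(patient_id, set())


@dataclass
class VariantKnowledgeBase:
    """Maps a variant identifier to an interpretation label."""
    interpretations: dict = field(default_factory=dict)

    def interpret(self, variant_id):
        return self.interpretations.get(variant_id)


@dataclass
class CDSKnowledgeBase:
    """Maps (interpretation, clinical order) to advice text."""
    rules: dict = field(default_factory=dict)

    def advice_for(self, interpretation, order):
        return self.rules.get((interpretation, order))


@dataclass
class CDSController:
    """Orchestrates the other components on behalf of the EHR."""
    genome_db: GenomeDatabase
    variant_kb: VariantKnowledgeBase
    cds_kb: CDSKnowledgeBase

    def evaluate(self, patient_id, order):
        alerts = []
        for variant in sorted(self.genome_db.variants_for(patient_id)):
            interpretation = self.variant_kb.interpret(variant)
            advice = self.cds_kb.advice_for(interpretation, order)
            if advice:
                alerts.append(advice)
        return alerts


# Hypothetical data: the EHR would call evaluate() when an order is placed.
controller = CDSController(
    GenomeDatabase({"patient-1": {"variant-A"}}),
    VariantKnowledgeBase({"variant-A": "reduced-metabolism"}),
    CDSKnowledgeBase({("reduced-metabolism", "drug-X"): "Consider dose review"}),
)
print(controller.evaluate("patient-1", "drug-X"))  # prints: ['Consider dose review']
```

Keeping each store behind its own small interface mirrors the service-oriented design: any component could be replaced by a remote service without changing the controller logic.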
The minimum information about a genome sequence (MIGS) specification
Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; dePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; Gil, Ingio San; Wilson, Gareth; Wipat, Anil
2008-01-01
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases. PMID:18464787
McGuire, Amy L; Fisher, Rebecca; Cusenza, Paul; Hudson, Kathy; Rothstein, Mark A; McGraw, Deven; Matteson, Stephen; Glaser, John; Henley, Douglas E
2008-07-01
As clinical genetics evolves, and we embark down the path toward more personalized and effective health care, the amount, detail, and complexity of genetic/genomic test information within the electronic health record will increase. This information should be appropriately protected to secure the trust of patients and to support interoperable electronic health information exchange. This article discusses characteristics of genetic/genomic test information, including predictive capability, immutability, and uniqueness, which should be considered when developing policies about information protection. Issues related to "genetic exceptionalism"; i.e., whether genetic/genomic test information should be treated differently from other medical information for purposes of data access and permissible use, are also considered. These discussions can help guide policy that will facilitate the biological and clinical resource development to support the introduction of this information into health care.
Generation of non-genomic oligonucleotide tag sequences for RNA template-specific PCR
Pinto, Fernando Lopes; Svensson, Håkan; Lindblad, Peter
2006-01-01
Background In order to overcome genomic DNA contamination in transcriptional studies, reverse template-specific polymerase chain reaction, a modification of reverse transcriptase polymerase chain reaction, is used. The possibility of using tags whose sequences are not found in the genome further improves reverse specific polymerase chain reaction experiments. Given the absence of software available to produce genome suitable tags, a simple tool to fulfill such need was developed. Results The program was developed in Perl, with separate use of the basic local alignment search tool, making the tool platform independent (known to run on Windows XP and Linux). In order to test the performance of the generated tags, several molecular experiments were performed. The results show that Tagenerator is capable of generating tags with good priming properties, which will deliberately not result in PCR amplification of genomic DNA. Conclusion The program Tagenerator is capable of generating tag sequences that combine genome absence with good priming properties for RT-PCR based experiments, circumventing the effects of genomic DNA contamination in an RNA sample. PMID:16820068
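The core idea of a non-genomic tag can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tagenerator implementation: the abstract says the tool is written in Perl and uses BLAST, whereas this sketch swaps in a simple shared-k-mer filter. The function names, the 12-nt tag length, the 8-nt seed length and the 40-60% GC window are all hypothetical choices.

```python
import random

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def generate_tag(genome, length=12, seed_len=8, gc_range=(0.4, 0.6),
                 rng=None, max_tries=10000):
    """Draw random oligos until one shares no seed_len-mer with either
    strand of the genome and has a GC fraction inside gc_range.
    Returns None if no candidate passes within max_tries draws."""
    rng = rng or random.Random()
    genome = genome.upper()
    # Index both strands, since a primer can anneal to either one.
    forbidden = kmers(genome, seed_len) | kmers(reverse_complement(genome), seed_len)
    for _ in range(max_tries):
        tag = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_range[0] <= gc_fraction(tag) <= gc_range[1]:
            continue  # crude stand-in for "good priming properties"
        if kmers(tag, seed_len) & forbidden:
            continue  # shares a seed with the genome, so reject
        return tag
    return None


if __name__ == "__main__":
    toy_genome = "ACGT" * 50  # toy sequence, not real data
    print(generate_tag(toy_genome, rng=random.Random(0)))
```

An exact k-mer filter is much weaker than an alignment search, so a real workflow would still need an alignment-based check and proper primer-design criteria.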
Shelton, Ann K; Freeman, Bradley D; Fish, Anne F; Bachman, Jean A; Richardson, Lloyd I
2015-03-01
Many research studies conducted today in critical care have a genomics component. Patients' surrogates asked to authorize participation in genomics research for a loved one in the intensive care unit may not be prepared to make informed decisions about a patient's participation in the research. This study examined the effectiveness of a new, computer-based education module on surrogates' understanding of the process of informed consent for genomics research. A pilot study was conducted with visitors in the waiting rooms of 2 intensive care units in a Midwestern tertiary care medical center. Visitors were randomly assigned to the experimental (education module plus a sample genomics consent form; n = 65) or the control (sample genomics consent form only; n = 69) group. Participants later completed a test on informed genomics consent. Understanding of the process of informed consent was greater (P = .001) in the experimental group than in the control group. Specifically, compared with the control group, the experimental group had a greater understanding of 8 of 13 elements of informed consent: intended benefits of research (P = .02), definition of surrogate consenter (P = .001), withdrawal from the study (P = .001), explanation of risk (P = .002), purpose of the institutional review board (P = .001), definition of substituted judgment (P = .03), compensation for harm (P = .001), and alternative treatments (P = .004). Computer-based education modules may be an important addition to conventional approaches for obtaining informed consent in the intensive care unit. Preparing patients' family members who may consider serving as surrogate consenters is critical to facilitating genomics research in critical care. ©2015 American Association of Critical-Care Nurses.
Genomic Data Commons and Genomic Cloud Pilots - Google Hangout
Join us for a live, moderated discussion about two NCI efforts to expand access to cancer genomics data: the Genomic Data Commons and Genomic Cloud Pilots. NCI subject matter experts will include Louis M. Staudt, M.D., Ph.D., Director, Center for Cancer Genomics, and Warren Kibbe, Ph.D., Director, NCI Center for Biomedical Informatics and Information Technology. The discussion will be moderated by Anthony Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics and Information Technology. We welcome your questions before and during the Hangout on Twitter using the hashtag #AskNCI.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muchero, Wellington; Labbe, Jessy L; Priya, Ranjan
2014-01-01
To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large scale SNP discovery and identification of large scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized traits.
Welch, Brandon M; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku
2014-01-01
Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time.
INDIGO – INtegrated Data Warehouse of MIcrobial GenOmes with Examples from the Red Sea Extremophiles
Alam, Intikhab; Antunes, André; Kamau, Allan Anthony; Ba alawi, Wail; Kalkatawi, Manal; Stingl, Ulrich; Bajic, Vladimir B.
2013-01-01
Background The next generation sequencing technologies substantially increased the throughput of microbial genome sequencing. To functionally annotate newly sequenced microbial genomes, a variety of experimental and computational methods are used. Integration of information from different sources is a powerful approach to enhance such annotation. Functional analysis of microbial genomes, necessary for downstream experiments, crucially depends on this annotation but it is hampered by the current lack of suitable information integration and exploration systems for microbial genomes. Results We developed a data warehouse system (INDIGO) that enables the integration of annotations for exploration and analysis of newly sequenced microbial genomes. INDIGO offers an opportunity to construct complex queries and combine annotations from multiple sources starting from genomic sequence to protein domain, gene ontology and pathway levels. This data warehouse is aimed at being populated with information from genomes of pure cultures and uncultured single cells of Red Sea bacteria and Archaea. Currently, INDIGO contains information from Salinisphaera shabanensis, Haloplasma contractile, and Halorhabdus tiamatea - extremophiles isolated from deep-sea anoxic brine lakes of the Red Sea. We provide examples of utilizing the system to gain new insights into specific aspects on the unique lifestyle and adaptations of these organisms to extreme environments. Conclusions We developed a data warehouse system, INDIGO, which enables comprehensive integration of information from various resources to be used for annotation, exploration and analysis of microbial genomes. It will be regularly updated and extended with new genomes. It is aimed to serve as a resource dedicated to the Red Sea microbes. In addition, through INDIGO, we provide our Automatic Annotation of Microbial Genomes (AAMG) pipeline. The INDIGO web server is freely available at http://www.cbrc.kaust.edu.sa/indigo. PMID:24324765
Fumoto, Masaki; Miyazaki, Satoru; Sugawara, Hideaki
2002-01-01
Genome Information Broker (GIB) is a powerful tool for the study of comparative genomics. GIB allows users to retrieve and display partial and/or whole genome sequences together with the relevant biological annotation. GIB has accumulated all the completed microbial genomes and has recently been expanded to include Arabidopsis thaliana genome data from DDBJ/EMBL/GenBank. In the near future, hundreds of genome sequences will be determined. In order to handle such large volumes of data, we have enhanced the GIB architecture by using XML, CORBA and distributed RDBs. We introduce the new GIB here. GIB is freely accessible at http://gib.genes.nig.ac.jp/. PMID:11752256
Standards for Clinical Grade Genomic Databases.
Yohe, Sophia L; Carter, Alexis B; Pfeifer, John D; Crawford, James M; Cushman-Vokoun, Allison; Caughron, Samuel; Leonard, Debra G B
2015-11-01
Next-generation sequencing performed in a clinical environment must meet clinical standards, which requires reproducibility of all aspects of the testing. Clinical-grade genomic databases (CGGDs) are required to classify a variant and to assist in the professional interpretation of clinical next-generation sequencing. Applying quality laboratory standards to the reference databases used for sequence-variant interpretation presents a new challenge for validation and curation. To define CGGD and the categories of information contained in CGGDs and to frame recommendations for the structure and use of these databases in clinical patient care. Members of the College of American Pathologists Personalized Health Care Committee reviewed the literature and existing state of genomic databases and developed a framework for guiding CGGD development in the future. Clinical-grade genomic databases may provide different types of information. This work group defined 3 layers of information in CGGDs: clinical genomic variant repositories, genomic medical data repositories, and genomic medicine evidence databases. The layers are differentiated by the types of genomic and medical information contained and the utility in assisting with clinical interpretation of genomic variants. Clinical-grade genomic databases must meet specific standards regarding submission, curation, and retrieval of data, as well as the maintenance of privacy and security. These organizing principles for CGGDs should serve as a foundation for future development of specific standards that support the use of such databases for patient care.
Coverage of genomic medicine: information gap between lay public and scientists.
Sugawara, Yuya; Narimatsu, Hiroto; Fukao, Akira
2012-01-01
The sharing of information between the lay public and medical professionals is crucial to the conduct of personalized medicine using genomic information in the near future. Mass media, such as newspapers, can play an important role in disseminating scientific information. However, studies on the role of newspaper coverage of genome-related articles are highly limited. We investigated the coverage of genomic medicine in five major Japanese newspapers (Asahi, Mainichi, Yomiuri, Sankei, and Nikkei) using Nikkei Telecom and articles in scientific journals in PubMed from 1995 to 2009. The number of genome-related articles in all five newspapers temporarily increased in 2000, and began continuously decreasing thereafter from 2001 to 2009. Conversely, there was a continuous increasing trend in the number of genome-related articles in PubMed during this period. The numbers of genome-related articles among the five major newspapers from 1995 to 2009 were significantly different (P = 0.002). Commentaries, research articles, and articles about companies were the most frequent in 2001 and 2003, when the number of genome-related articles transiently increased in the five newspapers. This study highlights the significant gap between newspaper coverage and scientific articles in scientific journals.
Integrating population genetics and conservation biology in the era of genomics.
Ouborg, N Joop
2010-02-23
As one of the final activities of the ESF-CONGEN Networking programme, a conference entitled 'Integrating Population Genetics and Conservation Biology' was held at Trondheim, Norway, from 23 to 26 May 2009. Conference speakers and poster presenters gave a display of the state-of-the-art developments in the field of conservation genetics. Over the five-year running period of the successful ESF-CONGEN Networking programme, much progress has been made in theoretical approaches, basic research on inbreeding depression and other genetic processes associated with habitat fragmentation and conservation issues, and with applying principles of conservation genetics in the conservation of many species. Future perspectives were also discussed in the conference, and it was concluded that conservation genetics is evolving into conservation genomics, while at the same time basic and applied research on threatened species and populations from a population genetic point of view continues to be emphasized.
De Rocquigny, H; Ficheux, D; Gabus, C; Allain, B; Fournie-Zaluski, M C; Darlix, J L; Roques, B P
1993-02-25
The 56 amino acid nucleocapsid protein (NCp10) of Moloney Murine Leukemia Virus, contains a CysX2CysX4HisX4Cys zinc finger flanked by basic residues. In vitro NCp10 promotes genomic RNA dimerization, a process most probably linked to genomic RNA packaging, and replication primer tRNA(Pro) annealing to the initiation site of reverse transcription. To characterize the amino-acid sequences involved in the various functions of NCp10, we have synthesized by solid phase method the native protein and a series of derived peptides shortened at the N- or C-terminus with or without the zinc finger domain. In the latter case, the two parts of the protein were linked by a Glycine - Glycine spacer. The in vitro studies of these peptides show that nucleic acid annealing activities of NCp10 do not require a zinc finger but are critically dependent on the presence of specific sequences located on each side of the CCHC domain and containing proline and basic residues. Thus, deletion of 11R or 49PRPQT, of the fully active 29 residue peptide 11RQGGERRRSQLDRDGGKKPRGPRGPRPQT53 leads to a complete loss of NCp10 activity. Therefore it is proposed that in NCp10, the zinc finger directs the spatial recognition of the target RNAs by the basic domains surrounding the zinc finger.
HpBase: A genome database of a sea urchin, Hemicentrotus pulcherrimus.
Kinjo, Sonoko; Kiyomoto, Masato; Yamamoto, Takashi; Ikeo, Kazuho; Yaguchi, Shunsuke
2018-04-01
To understand the mystery of life, it is important to accumulate genomic information for various organisms because the whole genome encodes the commands for all the genes. Since the genome of Strongylocentrotus purpuratus was sequenced in 2006 as the first sequenced genome in echinoderms, the genomic resources of other North American sea urchins have gradually been accumulated, but no sea urchin genomes are available in other areas, where many scientists have used the local species and reported important results. In this manuscript, we report a draft genome of the sea urchin Hemicentrotus pulcherrimus because this species has a long history as the target of developmental and cell biology in East Asia. The genome of H. pulcherrimus was assembled into 16,251 scaffold sequences with an N50 length of 143 kbp, and approximately 25,000 genes were identified in the genome. The size of the genome and the sequencing coverage were estimated to be approximately 800 Mbp and 100×, respectively. To provide these data and information of annotation, we constructed a database, HpBase (http://cell-innovation.nig.ac.jp/Hpul/). In HpBase, gene searches, genome browsing, and blast searches are available. In addition, HpBase includes the "recipes" for experiments from each lab using H. pulcherrimus. These recipes will continue to be updated according to the circumstances of individual scientists and can be powerful tools for experimental biologists and for the community. HpBase is a suitable dataset for evolutionary, developmental, and cell biologists to compare H. pulcherrimus genomic information with that of other species and to isolate gene information. © 2018 Japanese Society of Developmental Biologists.
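For readers unfamiliar with the N50 statistic quoted for the scaffold assembly, the following Python sketch shows the usual definition: the length of the scaffold at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions; they are not the HpBase data or the authors' code.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are visited from longest to shortest; the N50 is the length
    at which the cumulative sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer test for "at least half"
            return length


if __name__ == "__main__":
    toy_scaffolds = [100, 200, 300, 400, 500]  # made-up lengths, total 1500
    print(n50(toy_scaffolds))  # 500 + 400 = 900 >= 750, so this prints 400
```

A larger N50 means that more of the assembly sits in long scaffolds, which is why it is reported alongside the scaffold count.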
Precision medicine for psychopharmacology: a general introduction.
Shin, Cheolmin; Han, Changsu; Pae, Chi-Un; Patkar, Ashwin A
2016-07-01
Precision medicine is an emerging medical model that can provide accurate diagnoses and tailored therapeutic strategies for patients based on data pertaining to genes, microbiomes, environment, family history and lifestyle. Here, we provide basic information about precision medicine and newly introduced concepts, such as the precision medicine ecosystem and big data processing, and omics technologies including pharmacogenomics, pharmacometabolomics, pharmacoproteomics, pharmacoepigenomics, connectomics and exposomics. The authors review the current state of omics in psychiatry and the future direction of psychopharmacology as it moves towards precision medicine. Expert commentary: Advances in precision medicine have been facilitated by achievements in multiple fields, including large-scale biological databases, powerful methods for characterizing patients (such as genomics, proteomics, metabolomics, diverse cellular assays, and even social networks and mobile health technologies), and computer-based tools for analyzing large amounts of data.
All about the Human Genome Project (HGP)
... CSER), and Genome Sequencing Informatics Tools (GS-IT) Comparative Genomics Background information prepared for the media on ... other species to the human sequence. Background on Comparative Genomic Analysis New Process to Prioritize Animal Genomes ...
Family genome browser: visualizing genomes with pedigree information.
Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong
2015-07-15
Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes at both the individual level and the variant level effectively, through integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
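One of the pedigree-aware analyses listed above, detection of de novo mutations, can be illustrated with a minimal Python sketch on a parent-child trio. It assumes clean, unphased genotype calls represented as pairs of alleles, and it is not the FGB implementation; the function names, variant identifiers and genotypes are hypothetical.

```python
def is_mendelian_consistent(child, father, mother):
    """True if the child's two alleles can be explained by inheriting
    one allele from each parent (genotypes are 2-tuples of alleles)."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def candidate_de_novo(trio_genotypes):
    """Return IDs of variants where the child carries an allele that
    neither parent carries.

    trio_genotypes maps a variant ID to (child, father, mother).
    """
    hits = []
    for variant_id, (child, father, mother) in trio_genotypes.items():
        parental_alleles = set(father) | set(mother)
        if any(allele not in parental_alleles for allele in child):
            hits.append(variant_id)
    return hits


if __name__ == "__main__":
    # Toy trio; positions and alleles are invented for illustration.
    trio = {
        "chr1:1000": (("A", "G"), ("A", "A"), ("A", "A")),  # G is new in the child
        "chr1:2000": (("C", "T"), ("C", "T"), ("C", "C")),  # ordinary inheritance
        "chr2:500": (("G", "G"), ("G", "T"), ("T", "T")),   # inconsistent, no new allele
    }
    print(candidate_de_novo(trio))  # prints ['chr1:1000']
    print(is_mendelian_consistent(*trio["chr2:500"]))  # prints False
```

The third toy variant shows why the two checks differ: a Mendelian inconsistency without a novel allele more often points to a genotyping error or a sample mix-up than to a new mutation, so real pipelines also weigh read depth and genotype quality.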
Effects of informed consent for individual genome sequencing on relevant knowledge.
Kaphingst, K A; Facio, F M; Cheng, M-R; Brooks, S; Eidem, H; Linn, A; Biesecker, B B; Biesecker, L G
2012-11-01
Increasing availability of individual genomic information suggests that patients will need knowledge about genome sequencing to make informed decisions, but prior research is limited. In this study, we examined genome sequencing knowledge before and after informed consent among 311 participants enrolled in the ClinSeq™ sequencing study. An exploratory factor analysis of knowledge items yielded two factors (sequencing limitations knowledge; sequencing benefits knowledge). In multivariable analysis, high pre-consent sequencing limitations knowledge scores were significantly related to education [odds ratio (OR): 8.7, 95% confidence interval (CI): 2.45-31.10 for post-graduate education, and OR: 3.9; 95% CI: 1.05, 14.61 for college degree compared with less than college degree] and race/ethnicity (OR: 2.4, 95% CI: 1.09, 5.38 for non-Hispanic Whites compared with other racial/ethnic groups). Mean values increased significantly between pre- and post-consent for the sequencing limitations knowledge subscale (6.9-7.7, p < 0.0001) and sequencing benefits knowledge subscale (7.0-7.5, p < 0.0001); increase in knowledge did not differ by sociodemographic characteristics. This study highlights gaps in genome sequencing knowledge and underscores the need to target educational efforts toward participants with less education or from minority racial/ethnic groups. The informed consent process improved genome sequencing knowledge. Future studies could examine how genome sequencing knowledge influences informed decision making. © 2012 John Wiley & Sons A/S.
Public preferences for communicating personal genomic risk information: a focus group study.
Smit, Amelia K; Keogh, Louise A; Hersch, Jolyn; Newson, Ainsley J; Butow, Phyllis; Williams, Gabrielle; Cust, Anne E
2016-12-01
Personalized genomic risk information has the potential to motivate behaviour change and promote population health, but the success of this will depend upon effective risk communication strategies. To determine preferences for different graphical and written risk communication formats, and the delivery of genomic risk information including the mode of communication and the role of health professionals. Focus groups, transcribed and analysed thematically. Thirty-four participants from the public. Participants were provided with, and invited to discuss, a hypothetical scenario giving an individual's personalized genomic risk of melanoma displayed in several graphical formats. Participants preferred risk formats that were familiar and easy to understand, such as a 'double pie chart' and '100 person diagram' (pictograph). The 100 person diagram was considered persuasive because it humanized and personalized the risk information. People described the pie chart format as resembling bank data and food (such as cake and pizza). Participants thought that email, web-based platforms and postal mail were viable options for communicating genomic risk information. However, they felt that it was important that a health professional (either a genetic counsellor or 'informed' general practitioner) be available for discussion at the time of receiving the risk information, to minimize potential negative emotional responses and misunderstanding. Face-to-face or telephone delivery was preferred for delivery of high-risk results. These public preferences for communication strategies for genomic risk information will help to guide translation of genome-based knowledge into improved population health. © 2015 The Authors. Health Expectations. Published by John Wiley & Sons Ltd.
Sugimoto, Naoko; Iwaki, Tomoko; Chardwiriyapreecha, Soracom; Shimazu, Masamitsu; Sekito, Takayuki; Takegawa, Kaoru; Kakinuma, Yoshimi
2010-01-01
A recent study filling the gap in the genome sequence in the left arm of chromosome 2 of Schizosaccharomyces pombe revealed a homolog of budding yeast Vba2p, a vacuolar transporter of basic amino acids. GFP-tagged Vba2p in fission yeast was localized to the vacuolar membrane. Upon disruption of vba2, the uptake of several amino acids, including lysine, histidine, and arginine, was impaired. A transient increase in lysine uptake under nitrogen starvation was lowered by this mutation. These findings suggest that Vba2p is involved in basic amino acid transport in S. pombe under diverse conditions.
Antimicrobial peptide-like genes in Nasonia vitripennis: a genomic perspective
2010-01-01
Background Antimicrobial peptides (AMPs) are an essential component of innate immunity which can rapidly respond to diverse microbial pathogens. Insects, as a rich source of AMPs, attract great attention from scientists both for understanding the basic biology of the immune system and for finding molecular templates for anti-infective drug design. Although a large number of AMPs have been identified from different insect species, little information about these peptides is available from parasitic insects. Results By using integrated computational approaches to systematically mine the genome of the Hymenopteran parasitic wasp Nasonia vitripennis, we establish the first AMP repertoire whose members exhibit extensive sequence and structural diversity and can be distinguished into multiple molecular types, including insect and fungal defensin-like peptides (DLPs) with the cysteine-stabilized α-helical and β-sheet (CSαβ) fold; Pro- or Gly-rich abaecins and hymenoptaecins; horseshoe crab tachystatin-type AMPs with the inhibitor cystine knot (ICK) fold; and a linear α-helical peptide. The inducible expression patterns of seven N. vitripennis AMP genes were verified, and two representative peptides were synthesized and functionally identified to be antibacterial. In comparison with Apis mellifera (Hymenoptera) and several non-Hymenopteran model insects, N. vitripennis has evolved a complex antimicrobial immune system with more genes and larger protein precursors. Three classical strategies that are likely responsible for the complexity increase have been recognized: 1) Gene duplication; 2) Exon duplication; and 3) Exon-shuffling. Conclusion The present study established the N. vitripennis peptidome associated with antimicrobial immunity by using a combined computational and experimental strategy. As the first AMP repertoire of a parasitic wasp, our results offer a basic platform for further studying the immunological and evolutionary significance of these newly discovered AMP-like genes in this class of insects. PMID:20302637
Buseh, Aaron G; Stevens, Patricia E; Millon-Underwood, Sandra; Townsend, Leolia; Kelber, Sheryl T
2013-10-01
There is limited information about what African Americans think about biobanks and the ethical questions surrounding them. Likewise, there is a gap in capacity to successfully enroll African Americans as biobank donors. The purposes of this community-based participatory study were to: (a) explore African Americans' perspectives on genetics/genomic research, (b) understand facilitators and barriers to participation in such studies, and (c) enlist their ideas about how to attract and sustain engagement of African Americans in genetics initiatives. As the first phase in a mixed methods study, we conducted four focus groups with 21 African American community leaders in one US Midwest city. The sample consisted of executive directors of community organizations and prominent community activists. Data were analyzed thematically. Skepticism about biomedical research and lack of trust characterized discussions about biomedical research and biobanks. The Tuskegee Untreated Syphilis Study and the Henrietta Lacks case influenced their desire to protect their community from harm and exploitation. Connections between genetics and family history made genetics/genomics research personal, pitting intrusion into private affairs against solutions. Participants also expressed concerns about ethical issues involved in genomics research, calling attention to how research had previously been conducted in their community. Participants hoped personalized medicine might bring health benefits to their people and proposed African American communities have a "seat at the table." They called for basic respect, authentic collaboration, bidirectional education, transparency and prerogative, and meaningful benefits and remuneration. 
Key to building trust and overcoming African Americans' trepidation and resistance to participation in biobanks are early and persistent engagement with the community, partnerships with community stakeholders to map research priorities, ethical conduct of research, and a guarantee of equitable distribution of benefits from genomics discoveries.
Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses
Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.
2004-01-01
The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.
2018-01-01
The basic helix-loop-helix (bHLH) proteins represent a key group of transcription factors implicated in numerous eukaryotic developmental and signal transduction processes. Characterization of bHLHs from model species such as humans, fruit flies, nematodes and plants has yielded important information on their functions and evolutionary origin. However, relatively little is known about bHLHs in non-model organisms despite the availability of a vast number of high-throughput sequencing datasets, which now enable previously intractable genome-wide and cross-species analyses. We extensively searched for bHLHs in 126 crustacean species represented across major Crustacea taxa and identified 3777 putative bHLH orthologues. We have also included seven whole-genome datasets representative of major arthropod lineages to obtain a more accurate prediction of the full bHLH gene complement. With focus on important food crop species from Decapoda, we further defined higher-order groupings and have successfully recapitulated previous observations in other animals. Importantly, we also observed evidence for lineage-specific bHLH expansions in two basal crustaceans (branchiopod and copepod), suggesting a mode of evolution through gene duplication as an adaptation to changing environments. In-depth analysis on bHLH-PAS members confirms the phenomenon coined as ‘modular evolution’ (independently evolved domains) typically seen in multidomain proteins. With the amphipod Parhyale hawaiensis as the exception, our analyses have focused on crustacean transcriptome datasets. Hence, there is a clear requirement for future analyses on whole-genome sequences to overcome potential limitations associated with transcriptome mining. Nonetheless, the present work will serve as a key resource for future mechanistic and biochemical studies on bHLHs in economically important crustacean food crop species. PMID:29657824
Three-dimensional optical coherence tomography of the embryonic murine cardiovascular system
NASA Astrophysics Data System (ADS)
Luo, Wei; Marks, Daniel L.; Ralston, Tyler S.; Boppart, Stephen A.
2006-03-01
Optical coherence tomography (OCT) is an emerging high-resolution real-time biomedical imaging technology that has potential as a novel investigational tool in developmental biology and functional genomics. In this study, murine embryos and embryonic hearts are visualized with an OCT system capable of 2-µm axial and 15-µm lateral resolution and with real-time acquisition rates. We present, to our knowledge, the first sets of high-resolution 2- and 3-D OCT images that reveal the internal structures of the mammalian (murine) embryo (E10.5) and embryonic (E14.5 and E17.5) cardiovascular system. Strong correlations are observed between OCT images and corresponding hematoxylin- and eosin-stained histological sections. Real-time in vivo embryonic (E10.5) heart activity is captured by spectral-domain optical coherence tomography, processed, and displayed at a continuous rate of five frames per second. With the ability to obtain not only high-resolution anatomical data but also functional information during cardiovascular development, the OCT technology has the potential to visualize and quantify changes in murine development and in congenital and induced heart disease, as well as enable a wide range of basic in vitro and in vivo research studies in functional genomics.
High level of microsynteny and purifying selection affect the evolution of WRKY family in Gramineae.
Jin, Jing; Kong, Jingjing; Qiu, Jianle; Zhu, Huasheng; Peng, Yuancheng; Jiang, Haiyang
2016-01-01
The WRKY gene family, which encodes proteins in the regulation processes of diverse developmental stages, is one of the largest families of transcription factors in higher plants. In this study, by searching for interspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found 35 chromosomal segments of subgroup I genes of WRKY family (WRKY I) in four Gramineae species (Brachypodium, rice, sorghum, and maize) formed eight orthologous groups. After a stepwise gene-by-gene reciprocal comparison of all the protein sequences in the WRKY I gene flanking areas, highly conserved regions of microsynteny were found in the four Gramineae species. Most gene pairs showed conserved orientation within syntenic genome regions. Furthermore, tandem duplication events played the leading role in gene expansion. Eventually, environmental selection pressure analysis indicated strong purifying selection for the WRKY I genes in Gramineae, which may have been followed by gene loss and rearrangement. The results presented in this study provide basic information of Gramineae WRKY I genes and form the foundation for future functional studies of these genes. High level of microsynteny in the four grass species provides further evidence that a large-scale genome duplication event predated speciation.
Morimoto, Tomomi; Arii, Jun; Akashi, Hiroomi; Kawaguchi, Yasushi
2009-03-01
Information on sites in HSV genomes at which foreign gene(s) can be inserted without disrupting viral genes or affecting properties of the parental virus is important for basic research on HSV and development of HSV-based vectors for human therapy. The intergenic region between HSV-1 UL3 and UL4 genes has been reported to satisfy the requirements for such an insertion site. The UL3 and UL4 genes are oriented toward the intergenic region and, therefore, insertion of a foreign gene(s) into the region between the UL3 and UL4 polyadenylation signals should not disrupt any viral genes or transcriptional units. HSV-1 and HSV-2 each have more than 10 additional regions structurally similar to the intergenic region between UL3 and UL4. In the studies reported here, it has been demonstrated that insertion of a reporter gene expression cassette into several of the HSV-1 and HSV-2 intergenic regions has no effect on viral growth in cell culture or virulence in mice, suggesting that these multiple intergenic regions may be suitable HSV sites for insertion of foreign genes.
Barakate, Abdellah; Stephens, Jennifer
2016-01-01
Modern omics platforms have made the determination of susceptibility/resistance genes feasible in any species, generating huge numbers of potential targets for crop protection. However, the efforts to validate these targets have been hampered by the lack of a fast, precise, and efficient gene targeting system in plants. Now, the repurposing of the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system has solved this problem. CRISPR/Cas9 is the latest synthetic endonuclease that has revolutionized basic research by allowing facile genome editing in prokaryotes and eukaryotes. Gene knockout is now feasible at an unprecedented efficiency with the possibility of multiplexing several targets and even genome-wide mutagenesis screening. In a short time, this powerful tool has been engineered for an array of applications beyond gene editing. Here, we briefly describe the CRISPR/Cas9 system, its recent improvements and applications in gene manipulation and single DNA/RNA molecule analysis. We summarize a few recent tests targeting plant pathogens and discuss further potential applications in pest control and plant–pathogen interactions that will inform plant breeding for crop protection. PMID:27313592
Status and opportunities for genomics research with rainbow trout
Thorgaard, G.H.; Bailey, G.S.; Williams, D.; Buhler, D.R.; Kaattari, S.L.; Ristow, S.S.; Hansen, J.D.; Winton, J.R.; Bartholomew, J.L.; Nagler, J.J.; Walsh, P.J.; Vijayan, M.M.; Devlin, R.H.; Hardy, R.W.; Overturf, K.E.; Young, W.P.; Robison, B.D.; Rexroad, C.; Palti, Y.
2002-01-01
The rainbow trout (Oncorhynchus mykiss) is one of the most widely studied of model fish species. Extensive basic biological information has been collected for this species, which, because of its large size relative to other model fish species, is particularly suitable for studies requiring ample quantities of specific cells and tissue types. Rainbow trout have been widely utilized for research in carcinogenesis, toxicology, comparative immunology, disease ecology, physiology and nutrition. They are distinctive in having evolved from a relatively recent tetraploid event, resulting in a high incidence of duplicated genes. Natural populations are available and have been well characterized for chromosomal, protein, molecular and quantitative genetic variation. Their ease of culture and their experimental and aquacultural significance have led to the development of clonal lines and the widespread application of transgenic technology to this species. Numerous microsatellites have been isolated and two relatively detailed genetic maps have been developed. Extensive sequencing of expressed sequence tags has begun and four BAC libraries have been developed. The development and analysis of additional genomic sequence data will provide distinctive opportunities to address problems in areas such as evolution of the immune system and duplicate genes. © 2002 Elsevier Science Inc. All rights reserved.
Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M
2005-08-01
The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. (c) 2005 Wiley-Liss, Inc.
Construction of a minimal genome as a chassis for synthetic biology.
Sung, Bong Hyun; Choe, Donghui; Kim, Sun Chang; Cho, Byung-Kwan
2016-11-30
Microbial diversity and complexity pose challenges in understanding the voluminous genetic information produced from whole-genome sequences, bioinformatics and high-throughput '-omics' research. These challenges can be overcome by a core blueprint of a genome drawn with a minimal gene set, which is essential for life. Systems biology and large-scale gene inactivation studies have estimated the number of essential genes to be ∼300-500 in many microbial genomes. On the basis of the essential gene set information, minimal-genome strains have been generated using sophisticated genome engineering techniques, such as genome reduction and chemical genome synthesis. Current size-reduced genomes are not perfect minimal genomes, but chemically synthesized genomes have just been constructed. Some minimal genomes provide various desirable functions for bioindustry, such as improved genome stability, increased transformation efficacy and improved production of biomaterials. The minimal genome as a chassis genome for synthetic biology can be used to construct custom-designed genomes for various practical and industrial applications. © 2016 The Author(s). published by Portland Press Limited on behalf of the Biochemical Society.
Telenti, Amalio; Ayday, Erman; Hubaux, Jean Pierre
2014-01-01
The storage of greater numbers of exomes or genomes raises the question of loss of privacy for the individual and for families if genomic data are not properly protected. Access to genome data may result from a personal decision to disclose, or from gaps in protection. In either case, revealing genome data has consequences beyond the individual, as it compromises the privacy of family members. Increasing availability of genome data linked or linkable to metadata through online social networks and services adds one additional layer of complexity to the protection of genome privacy. The field of computer science and information technology offers solutions to secure genomic data so that individuals, medical personnel or researchers can access only the subset of genomic information required for healthcare or dedicated studies. PMID:25254097
Schmidt, Martin; Van Bel, Michiel; Woloszynska, Magdalena; Slabbinck, Bram; Martens, Cindy; De Block, Marc; Coppens, Frederik; Van Lijsebettens, Mieke
2017-07-06
Cytosine methylation in plant genomes is important for the regulation of gene transcription and transposon activity. Genome-wide methylomes are studied upon mutation of the DNA methyltransferases, adaptation to environmental stresses or during development. However, from basic biology to breeding programs, there is a need to monitor multiple samples to determine transgenerational methylation inheritance or differential cytosine methylation. Methylome data obtained by sodium hydrogen sulfite (bisulfite)-conversion and next-generation sequencing (NGS) provide genome-wide information on cytosine methylation. However, a profiling method that detects cytosine methylation state dispersed over the genome would allow high-throughput analysis of multiple plant samples with distinct epigenetic signatures. We use specific restriction endonucleases to enrich for cytosine coverage in a bisulfite and NGS-based profiling method, which was compared to whole-genome bisulfite sequencing of the same plant material. We established an effective methylome profiling method in plants, termed plant-reduced representation bisulfite sequencing (plant-RRBS), using optimized double restriction endonuclease digestion, fragment end repair, adapter ligation, followed by bisulfite conversion, PCR amplification and NGS. We report a performant laboratory protocol and a straightforward bioinformatics data analysis pipeline for plant-RRBS, applicable for any reference-sequenced plant species. As a proof of concept, methylome profiling was performed using an Oryza sativa ssp. indica pure breeding line and a derived epigenetically altered line (epiline). Plant-RRBS detects methylation levels at tens of millions of cytosine positions deduced from bisulfite conversion in multiple samples. To evaluate the method, the coverage of cytosine positions, the intra-line similarity and the differential cytosine methylation levels between the pure breeding line and the epiline were determined. 
Plant-RRBS reproducibly covers commonly up to one fourth of the cytosine positions in the rice genome when using MspI-DpnII within a group of five biological replicates of a line. The method predominantly detects cytosine methylation in putative promoter regions and not-annotated regions in rice. Plant-RRBS offers high-throughput and broad, genome-dispersed methylation detection by effective read number generation obtained from reproducibly covered genome fractions using optimized endonuclease combinations, facilitating comparative analyses of multi-sample studies for cytosine methylation and transgenerational stability in experimental material and plant breeding populations.
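The double-digest step described above can be explored in silico before any bench work. The following is a minimal sketch, not the authors' pipeline: it assumes an uppercase sequence string, uses the standard recognition sites of MspI (C^CGG) and DpnII (^GATC), and treats a fragment as "covered" if its length falls inside a hypothetical size-selection window (`min_len`, `max_len`). It ignores read length, so it overestimates what sequencing of fragment ends would actually reach. All function names are illustrative.

```python
import re

# Recognition site and cut offset for each enzyme: the cut falls immediately
# before the base at this offset within the site (MspI: C^CGG, DpnII: ^GATC).
ENZYMES = {"MspI": ("CCGG", 1), "DpnII": ("GATC", 0)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return sorted cut coordinates for all enzymes on an uppercase sequence."""
    cuts = set()
    for site, offset in enzymes.values():
        # A zero-width lookahead finds overlapping occurrences as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq, enzymes=ENZYMES):
    """Split the sequence into fragments at every cut position."""
    inner = [c for c in cut_positions(seq, enzymes) if 0 < c < len(seq)]
    bounds = [0] + inner + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def covered_cytosine_fraction(seq, min_len, max_len, enzymes=ENZYMES):
    """Fraction of cytosines (both strands, so C plus G on the top strand)
    that lie in fragments passing the assumed size-selection window."""
    total = seq.count("C") + seq.count("G")
    if total == 0:
        return 0.0
    covered = sum(
        frag.count("C") + frag.count("G")
        for frag in digest(seq, enzymes)
        if min_len <= len(frag) <= max_len
    )
    return covered / total
```

For a toy input such as `"AAAACCGGTTTTGATCAAAA"`, `digest` yields three fragments, and varying the size window shows how enzyme choice and size selection trade off against the fraction of cytosines retained.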
From Genetics to Genomics: A Short Introduction for Pediatric Neurologists.
Neubauer, Bernd A; Lemke, Johannes R
2016-01-01
It is estimated that in humans approximately 50% of all 22500 genes are needed for the development and maintenance of the nervous system. The introduction of high-throughput technology in genetic analysis has therefore major implications, not only for the investigation of specific disease entities but also for the diagnostic workup of single individuals with neurologic disorders of genetic origin. A short primer for clinicians is presented, addressing aspects of current developments in medical genomics. Significant findings of the last years are exemplified in an educational manner to provide a basic understanding of disease mechanisms that were unraveled by recent genomic analysis. Georg Thieme Verlag KG Stuttgart · New York.
Castillo-Quan, Jorge I; Pérez-Osorio, Julia M
2009-01-01
The establishment of medical genomics in Mexico offers the possibility to study in a more comprehensive manner the etiological factors of different diseases, providing a global view of the interaction between the genome and the environment. Nutrition is recognized as a significant determinant in several diseases, yet its interaction with polymorphisms, and in general with the genome, has not been properly addressed. Mexico has a high prevalence of polymorphisms of the methylenetetrahydrofolate reductase gene, and in both clinical and basic studies this has been associated with an increased susceptibility to developing Alzheimer's disease. We propose a potential nutrigenomic approach for the study of Alzheimer's disease in Mexico.
The plastid genomes of flowering plants.
Ruhlman, Tracey A; Jansen, Robert K
2014-01-01
The plastid genome (plastome) has proved a valuable source of data for evaluating evolutionary relationships among angiosperms. Through basic and applied approaches, plastid transformation technology offers the potential to understand and improve plant productivity, providing food, fiber, energy and medicines to meet the needs of a burgeoning global population. The growing genomic resources available to both phylogenetic and biotechnological investigations are allowing novel insights and expanding the scope of plastome research to encompass new species. In this chapter we present an overview of some of the seminal and contemporary research that has contributed to our current understanding of plastome evolution and attempt to highlight the relationship between evolutionary mechanisms and tools of plastid genetic engineering.
GRIL: genome rearrangement and inversion locator.
Darling, Aaron E; Mau, Bob; Blattner, Frederick R; Perna, Nicole T
2004-01-01
GRIL is a tool to automatically identify collinear regions in a set of bacterial-size genome sequences. GRIL uses three basic steps. First, regions of high sequence identity are located. Second, some of these regions are filtered based on user-specified criteria. Finally, the remaining regions of sequence identity are used to define significant collinear regions among the sequences. By locating collinear regions of sequence, GRIL provides a basis for multiple genome alignment using current alignment systems. GRIL also provides a basis for using current inversion distance tools to infer phylogeny. GRIL is implemented in C++ and runs on any x86-based Linux or Windows platform. It is available from http://asap.ahabs.wisc.edu/gril
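The three steps named in the abstract (locate matches, filter them, define collinear regions) can be illustrated with a toy two-genome version. This is a sketch under stated assumptions and not GRIL's actual algorithm: the `Match` type, the thresholds `min_length`, `min_identity` and `max_gap`, and the greedy grouping rule are all hypothetical, the matches are assumed to have been found already by some aligner, and only forward-strand matches are handled, so inversions are not detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """A region of high sequence identity shared by genomes A and B."""
    start_a: int
    start_b: int
    length: int
    identity: float  # fraction of identical positions, 0..1


def filter_matches(matches, min_length=50, min_identity=0.9):
    """Step 2: drop matches that fail user-specified criteria."""
    return [m for m in matches
            if m.length >= min_length and m.identity >= min_identity]


def collinear_blocks(matches, max_gap=1000):
    """Step 3: greedily chain matches that keep the same order in both
    genomes and are separated by at most max_gap bases in each."""
    blocks = []
    for m in sorted(matches, key=lambda x: x.start_a):
        if blocks:
            prev = blocks[-1][-1]
            gap_a = m.start_a - (prev.start_a + prev.length)
            gap_b = m.start_b - (prev.start_b + prev.length)
            if 0 <= gap_a <= max_gap and 0 <= gap_b <= max_gap:
                blocks[-1].append(m)
                continue
        blocks.append([m])  # order or distance broken: start a new block
    return blocks
```

Each returned block is a list of matches that could be handed to an alignment program as one collinear region; a rearrangement between the genomes shows up as a block boundary.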
Genome Sequence of the Electrogenic Petroleum-Degrading Thalassospira sp. Strain HJ
Kiseleva, Larisa; Garushyants, Sofya K.; Briliute, Justina; Simpson, David J. W.; Goryanin, Igor
2015-01-01
We present the draft genome of the petroleum-degrading Thalassospira sp. strain HJ, isolated from tidal marine sediment. Knowledge of this genomic information will inform studies on electrogenesis and means to degrade environmental organic contaminants, including compounds found in petroleum. PMID:25977412
Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C
2008-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/
Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C.
2008-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/ PMID:17981842
Persky, Susan; Ferrer, Rebecca A.; Klein, William M. P.
2016-01-01
It is crucial to examine patient reactions to genomics-informed approaches to weight management within a clinical context, and understand the influence of patient characteristics (here, emotion and race). Examining nonverbal reactions offers a window into patients’ implicit cognitive, attitudinal and affective processes related to clinical encounters. We simulated a weight management clinical interaction with a virtual reality-based physician, and experimentally manipulated patient emotional state (anger/fear) and whether the physician made genomic or personal behavior attributions for weight. Participants were 190 overweight females who racially identified as either Black or White. Participants made less visual contact when receiving genomic information in the anger condition, and Black participants exhibited lowered voice pitch when receiving genomic information. Black participants also increased their interpersonal distance when receiving genomic information in the anger condition. By studying non-conscious nonverbal behavior, we can better understand the nuances of these interactions. PMID:27146511
Shim, Donghwan; Park, Sin-Gi; Kim, Kangmin; Bae, Wonsil; Lee, Gir Won; Ha, Byeong-Suk; Ro, Hyeon-Su; Kim, Myungkil; Ryoo, Rhim; Rhee, Sung-Keun; Nou, Ill-Sup; Koo, Chang-Duck; Hong, Chang Pyo; Ryu, Hojin
2016-04-10
Lentinula edodes, the popular shiitake mushroom, is one of the most important cultivated edible mushrooms. It is used as a food and for medicinal purposes. Here, we present the 46.1 Mb draft genome of L. edodes, comprising 13,028 predicted gene models. The genome assembly consists of 31 scaffolds. Gene annotation provides key information about various signaling pathways and secondary metabolites. This genomic information should help establish the molecular genetic markers for MAS/MAB and increase our understanding of the genome structure and function. Copyright © 2016 Elsevier B.V. All rights reserved.
Compositional patterns in the genomes of unicellular eukaryotes.
Costantini, Maria; Alvarez-Valin, Fernando; Costantini, Susan; Cammarano, Rosalia; Bernardi, Giorgio
2013-11-05
The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentrations (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different, but all genomes tested show a compositional compartmentalization. The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Neither feature is surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes.
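Compositional patterns of this kind are usually derived from GC levels measured over fixed-size windows along a sequence. The sketch below shows that basic calculation only; it is not the authors' method, the function names are hypothetical, and the window size is left as a free parameter because the abstract does not state the one used.

```python
def gc_fraction(seq):
    """GC level of a sequence, counting only unambiguous bases.
    Returns None when the sequence contains no A, C, G or T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def window_gc(seq, window, step=None):
    """GC level of each full window along the sequence, as (start, gc) pairs.
    With step=None the windows do not overlap."""
    step = step or window
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

A compartmentalized genome would show runs of adjacent windows with similar GC levels separated by shifts to a different level, whereas a compositionally uniform one would give a narrow distribution of window values.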
Genetics/genomics education for nongenetic health professionals: a systematic literature review.
Talwar, Divya; Tseng, Tung-Sung; Foster, Margaret; Xu, Lei; Chen, Lei-Shih
2017-07-01
The completion of the Human Genome Project has enhanced avenues for disease prevention, diagnosis, and management. Owing to the shortage of genetic professionals, genetics/genomics training has been provided to nongenetic health professionals for years to establish their genomic competencies. We conducted a systematic literature review to summarize and evaluate the existing genetics/genomics education programs for nongenetic health professionals. Five electronic databases were searched from January 1990 to June 2016. Forty-four studies met our inclusion criteria. There was a growing publication trend. Program participants were mainly physicians and nurses. The curricula, which were most commonly provided face to face, included basic genetics; applied genetics/genomics; ethical, legal, and social implications of genetics/genomics; and/or genomic competencies/recommendations in particular professional fields. Only one-third of the curricula were theory-based. The majority of studies adopted a pre-/post-test design and lacked follow-up data collection. Nearly all studies reported participants' improvements in one or more of the following areas: knowledge, attitudes, skills, intention, self-efficacy, comfort level, and practice. However, most studies did not report participants' age, ethnicity, years of clinical practice, data validity, and data reliability. Many genetics/genomics education programs for nongenetic health professionals exist. Nevertheless, enhancement in methodological quality is needed to strengthen education initiatives. Genet Med advance online publication 20 October 2016.
Reproductive Toxicology Testing with EDCS
An introduction to reproductive toxicology: the basic approaches to testing chemicals for adverse effects using multigenerational studies with rats and how the regulatory agencies used the data in risk assessments. Case studies were presented of how endocrine or genomic data were...
Sharp, Richard R
2011-03-01
As we look to a time when whole-genome sequencing is integrated into patient care, it is possible to anticipate a number of ethical challenges that will need to be addressed. The most intractable of these concern informed consent and the responsible management of very large amounts of genetic information. Given the range of possible findings, it remains unclear to what extent it will be possible to obtain meaningful patient consent to genomic testing. Equally unclear is how clinicians will disseminate the enormous volume of genetic information produced by whole-genome sequencing. Toward developing practical strategies for managing these ethical challenges, we propose a research agenda that approaches multiplexed forms of clinical genetic testing as natural laboratories in which to develop best practices for managing the ethical complexities of genomic medicine.
EPA ACTIVITIES TO PREPARE FOR REGULATORY AND RISK ASSESSMENT APPLICATIONS OF GENOMICS INFORMATION
Genomics will have significant implications for risk assessment and regulatory decision making. Since 2002, the U.S. EPA has undertaken a number of cross-Agency activities to further prepare itself to receive, interpret and apply genomics information for risk assessment and regul...
Enhancing genomic laboratory reports from the patients' view: A qualitative analysis.
Stuckey, Heather; Williams, Janet L; Fan, Audrey L; Rahm, Alanna Kulchak; Green, Jamie; Feldman, Lynn; Bonhag, Michele; Zallen, Doris T; Segal, Michael M; Williams, Marc S
2015-10-01
The purpose of this study was to develop a family genomic laboratory report designed to communicate genome sequencing results to parents of children who were participating in a whole genome sequencing clinical research study. Semi-structured interviews were conducted with parents of children who participated in a whole genome sequencing clinical research study to address the elements, language and format of a sample family-directed genome laboratory report. The qualitative interviews were followed by two focus groups aimed at evaluating example presentations of information about prognosis and next steps related to the whole genome sequencing result. Three themes emerged from the qualitative data: (i) Parents described a continual search for valid information and resources regarding their child's condition, a need that prior reports did not meet for parents; (ii) Parents believed that the Family Report would help facilitate communication with physicians and family members; and (iii) Parents identified specific items they appreciated in a genomics Family Report: simplicity of language, logical flow, visual appeal, information on what to expect in the future and recommended next steps. Parents affirmed their desire for a family genomic results report designed for their use and reference. They articulated the need for clear, easy to understand language that provided information with temporal detail and specific recommendations regarding relevant findings consistent with that available to clinicians. © 2015 Wiley Periodicals, Inc.
Enhancing genomic laboratory reports from the patients' view: A qualitative analysis
Stuckey, Heather; Fan, Audrey L.; Rahm, Alanna Kulchak; Green, Jamie; Feldman, Lynn; Bonhag, Michele; Zallen, Doris T.; Segal, Michael M.; Williams, Marc S.
2015-01-01
The purpose of this study was to develop a family genomic laboratory report designed to communicate genome sequencing results to parents of children who were participating in a whole genome sequencing clinical research study. Semi‐structured interviews were conducted with parents of children who participated in a whole genome sequencing clinical research study to address the elements, language and format of a sample family‐directed genome laboratory report. The qualitative interviews were followed by two focus groups aimed at evaluating example presentations of information about prognosis and next steps related to the whole genome sequencing result. Three themes emerged from the qualitative data: (i) Parents described a continual search for valid information and resources regarding their child's condition, a need that prior reports did not meet for parents; (ii) Parents believed that the Family Report would help facilitate communication with physicians and family members; and (iii) Parents identified specific items they appreciated in a genomics Family Report: simplicity of language, logical flow, visual appeal, information on what to expect in the future and recommended next steps. Parents affirmed their desire for a family genomic results report designed for their use and reference. They articulated the need for clear, easy to understand language that provided information with temporal detail and specific recommendations regarding relevant findings consistent with that available to clinicians. PMID:26086630
Identification of Streptococcus mitis321A vaccine antigens based on reverse vaccinology
Zhang, Qiao; Lin, Kexiong; Wang, Changzheng; Xu, Zhi; Yang, Li; Ma, Qianli
2018-01-01
Streptococcus mitis (S. mitis) may transform into highly pathogenic bacteria. The aim of the present study was to identify potential antigen targets for designing an effective vaccine against the pathogenic S. mitis321A. The genome of S. mitis321A was sequenced using an Illumina Hiseq2000 instrument. Subsequently, Glimmer 3.02 and Tandem Repeat Finder (TRF) 4.04 were used to predict genes and tandem repeats, respectively, with DNA sequence function analysis using the Basic Local Alignment Search Tool (BLAST) in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG) databases. Putative gene antigen candidates were screened with BLAST ahead of phylogenetic tree analysis. The DNA sequence assembly size was 2,110,680 bp with 40.12% GC, 6 scaffolds and 9 contigs. Consequently, 1,944 genes were predicted, and 119 TRF, 56 microsatellite DNA, 10 minisatellite DNA and 154 transposons were acquired. The predicted genes were associated with various pathways and functions concerning membrane transport and energy metabolism. Multiple putative genes encoding surface proteins, secreted proteins and virulence factors, as well as essential genes were determined. The majority of essential genes belonged to a phylogenetic lineage, while 321AGL000129 and 321AGL000299 were on the same branch. The current study provided useful information regarding the biological function of the S. mitis321A genome and recommends putative antigen candidates for developing a potent vaccine against S. mitis. PMID:29620181
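For readers unfamiliar with the assembly statistics quoted above, the GC percentage is the share of G and C bases among the unambiguous bases of the assembly. The following is a minimal Python sketch of that calculation, not the authors' pipeline; the function name `gc_percent` and the toy sequence are illustrative assumptions.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G/C among unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Hypothetical toy contig; ambiguous N bases are ignored.
    toy_contig = "ATGCGCNNATAT"
    print(f"{gc_percent(toy_contig):.2f}% GC")
```

For a multi-scaffold assembly, the same count would be taken over the concatenation of all scaffolds rather than averaged per scaffold, so that long scaffolds are weighted by their length.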
Establishment and cryptic transmission of Zika virus in Brazil and the Americas
NASA Astrophysics Data System (ADS)
Faria, N. R.; Quick, J.; Claro, I. M.; Thézé, J.; de Jesus, J. G.; Giovanetti, M.; Kraemer, M. U. G.; Hill, S. C.; Black, A.; da Costa, A. C.; Franco, L. C.; Silva, S. P.; Wu, C.-H.; Raghwani, J.; Cauchemez, S.; Du Plessis, L.; Verotti, M. P.; de Oliveira, W. K.; Carmo, E. H.; Coelho, G. E.; Santelli, A. C. F. S.; Vinhal, L. C.; Henriques, C. M.; Simpson, J. T.; Loose, M.; Andersen, K. G.; Grubaugh, N. D.; Somasekar, S.; Chiu, C. Y.; Muñoz-Medina, J. E.; Gonzalez-Bonilla, C. R.; Arias, C. F.; Lewis-Ximenez, L. L.; Baylis, S. A.; Chieppe, A. O.; Aguiar, S. F.; Fernandes, C. A.; Lemos, P. S.; Nascimento, B. L. S.; Monteiro, H. A. O.; Siqueira, I. C.; de Queiroz, M. G.; de Souza, T. R.; Bezerra, J. F.; Lemos, M. R.; Pereira, G. F.; Loudal, D.; Moura, L. C.; Dhalia, R.; França, R. F.; Magalhães, T.; Marques, E. T.; Jaenisch, T.; Wallau, G. L.; de Lima, M. C.; Nascimento, V.; de Cerqueira, E. M.; de Lima, M. M.; Mascarenhas, D. L.; Neto, J. P. Moura; Levin, A. S.; Tozetto-Mendoza, T. R.; Fonseca, S. N.; Mendes-Correa, M. C.; Milagres, F. P.; Segurado, A.; Holmes, E. C.; Rambaut, A.; Bedford, T.; Nunes, M. R. T.; Sabino, E. C.; Alcantara, L. C. J.; Loman, N. J.; Pybus, O. G.
2017-06-01
Transmission of Zika virus (ZIKV) in the Americas was first confirmed in May 2015 in northeast Brazil. Brazil has had the highest number of reported ZIKV cases worldwide (more than 200,000 by 24 December 2016) and the most cases associated with microcephaly and other birth defects (2,366 confirmed by 31 December 2016). Since the initial detection of ZIKV in Brazil, more than 45 countries in the Americas have reported local ZIKV transmission, with 24 of these reporting severe ZIKV-associated disease. However, the origin and epidemic history of ZIKV in Brazil and the Americas remain poorly understood, despite the value of this information for interpreting observed trends in reported microcephaly. Here we address this issue by generating 54 complete or partial ZIKV genomes, mostly from Brazil, and reporting data generated by a mobile genomics laboratory that travelled across northeast Brazil in 2016. One sequence represents the earliest confirmed ZIKV infection in Brazil. Analyses of viral genomes with ecological and epidemiological data yield an estimate that ZIKV was present in northeast Brazil by February 2014 and is likely to have disseminated from there, nationally and internationally, before the first detection of ZIKV in the Americas. Estimated dates for the international spread of ZIKV from Brazil indicate the duration of pre-detection cryptic transmission in recipient regions. The role of northeast Brazil in the establishment of ZIKV in the Americas is further supported by geographic analysis of ZIKV transmission potential and by estimates of the basic reproduction number of the virus.
Establishment and cryptic transmission of Zika virus in Brazil and the Americas.
Faria, N R; Quick, J; Claro, I M; Thézé, J; de Jesus, J G; Giovanetti, M; Kraemer, M U G; Hill, S C; Black, A; da Costa, A C; Franco, L C; Silva, S P; Wu, C-H; Raghwani, J; Cauchemez, S; du Plessis, L; Verotti, M P; de Oliveira, W K; Carmo, E H; Coelho, G E; Santelli, A C F S; Vinhal, L C; Henriques, C M; Simpson, J T; Loose, M; Andersen, K G; Grubaugh, N D; Somasekar, S; Chiu, C Y; Muñoz-Medina, J E; Gonzalez-Bonilla, C R; Arias, C F; Lewis-Ximenez, L L; Baylis, S A; Chieppe, A O; Aguiar, S F; Fernandes, C A; Lemos, P S; Nascimento, B L S; Monteiro, H A O; Siqueira, I C; de Queiroz, M G; de Souza, T R; Bezerra, J F; Lemos, M R; Pereira, G F; Loudal, D; Moura, L C; Dhalia, R; França, R F; Magalhães, T; Marques, E T; Jaenisch, T; Wallau, G L; de Lima, M C; Nascimento, V; de Cerqueira, E M; de Lima, M M; Mascarenhas, D L; Neto, J P Moura; Levin, A S; Tozetto-Mendoza, T R; Fonseca, S N; Mendes-Correa, M C; Milagres, F P; Segurado, A; Holmes, E C; Rambaut, A; Bedford, T; Nunes, M R T; Sabino, E C; Alcantara, L C J; Loman, N J; Pybus, O G
2017-06-15
Transmission of Zika virus (ZIKV) in the Americas was first confirmed in May 2015 in northeast Brazil. Brazil has had the highest number of reported ZIKV cases worldwide (more than 200,000 by 24 December 2016) and the most cases associated with microcephaly and other birth defects (2,366 confirmed by 31 December 2016). Since the initial detection of ZIKV in Brazil, more than 45 countries in the Americas have reported local ZIKV transmission, with 24 of these reporting severe ZIKV-associated disease. However, the origin and epidemic history of ZIKV in Brazil and the Americas remain poorly understood, despite the value of this information for interpreting observed trends in reported microcephaly. Here we address this issue by generating 54 complete or partial ZIKV genomes, mostly from Brazil, and reporting data generated by a mobile genomics laboratory that travelled across northeast Brazil in 2016. One sequence represents the earliest confirmed ZIKV infection in Brazil. Analyses of viral genomes with ecological and epidemiological data yield an estimate that ZIKV was present in northeast Brazil by February 2014 and is likely to have disseminated from there, nationally and internationally, before the first detection of ZIKV in the Americas. Estimated dates for the international spread of ZIKV from Brazil indicate the duration of pre-detection cryptic transmission in recipient regions. The role of northeast Brazil in the establishment of ZIKV in the Americas is further supported by geographic analysis of ZIKV transmission potential and by estimates of the basic reproduction number of the virus.
Zinc Fingers, TALEs, and CRISPR Systems: A Comparison of Tools for Epigenome Editing.
Waryah, Charlene Babra; Moses, Colette; Arooj, Mahira; Blancafort, Pilar
2018-01-01
The completion of genome, epigenome, and transcriptome mapping in multiple cell types has created a demand for precision biomolecular tools that allow researchers to functionally manipulate DNA, reconfigure chromatin structure, and ultimately reshape gene expression patterns. Epigenetic editing tools provide the ability to interrogate the relationship between epigenetic modifications and gene expression. Importantly, this information can be exploited to reprogram cell fate for both basic research and therapeutic applications. Three different molecular platforms for epigenetic editing have been developed: zinc finger proteins (ZFs), transcription activator-like effectors (TALEs), and the system of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) proteins. These platforms serve as custom DNA-binding domains (DBDs), which are fused to epigenetic modifying domains to manipulate epigenetic marks at specific sites in the genome. The addition and/or removal of epigenetic modifications reconfigures local chromatin structure, with the potential to provoke long-lasting changes in gene transcription. Here we summarize the molecular structure and mechanism of action of ZF, TALE, and CRISPR platforms and describe their applications for the locus-specific manipulation of the epigenome. The advantages and disadvantages of each platform will be discussed with regard to genomic specificity, potency in regulating gene expression, and reprogramming cell phenotypes, as well as ease of design, construction, and delivery. Finally, we outline potential applications for these tools in molecular biology and biomedicine and identify possible barriers to their future clinical implementation.
Castrillo, Juan I; Lista, Simone; Hampel, Harald; Ritchie, Craig W
2018-01-01
Alzheimer's disease (AD) is a complex multifactorial disease, involving a combination of genomic, interactome, and environmental factors, with essential participation of (a) intrinsic genomic susceptibility and (b) a constant dynamic interplay between impaired pathways and central homeostatic networks of nerve cells. The proper investigation of the complexity of AD requires new holistic systems-level approaches, at both the experimental and computational level. Systems biology methods offer the potential to unveil new fundamental insights, basic mechanisms, and networks and their interplay. These may lead to the characterization of mechanism-based molecular signatures, and AD hallmarks at the earliest molecular and cellular levels (and beyond), for characterization of AD subtypes and stages, toward targeted interventions according to the evolving precision medicine paradigm. In this work, an update on advanced systems biology methods and strategies for holistic studies of multifactorial diseases, particularly AD, is presented. This includes next-generation genomics, neuroimaging and multi-omics methods, experimental and computational approaches, relevant disease models, and latest genome editing and single-cell technologies. Their progressive incorporation into basic research, cohort studies, and trials is beginning to provide novel insights into AD essential mechanisms, molecular signatures, and markers toward mechanism-based classification and staging, and tailored interventions. Selected methods which can be applied in cohort studies and trials, with the European Prevention of Alzheimer's Dementia (EPAD) project as a reference example, are presented and discussed.
Phylogenomic Insights into Mouse Evolution Using a Pseudoreference Approach
Sarver, Brice A.J.; Keeble, Sara; Cosart, Ted; Tucker, Priscilla K.; Dean, Matthew D.
2017-01-01
Comparative genomic studies are now possible across a broad range of evolutionary timescales, but the generation and analysis of genomic data across many different species still present a number of challenges. The most sophisticated genotyping and down-stream analytical frameworks are still predominantly based on comparisons to high-quality reference genomes. However, established genomic resources are often limited within a given group of species, necessitating comparisons to divergent reference genomes that could restrict or bias comparisons across a phylogenetic sample. Here, we develop a scalable pseudoreference approach to iteratively incorporate sample-specific variation into a genome reference and reduce the effects of systematic mapping bias in downstream analyses. To characterize this framework, we used targeted capture to sequence whole exomes (∼54 Mbp) in 12 lineages (ten species) of mice spanning the Mus radiation. We generated whole exome pseudoreferences for all species and show that this iterative reference-based approach improved basic genomic analyses that depend on mapping accuracy while preserving the associated annotations of the mouse reference genome. We then use these pseudoreferences to resolve evolutionary relationships among these lineages while accounting for phylogenetic discordance across the genome, contributing an important resource for comparative studies in the mouse system. We also describe patterns of genomic introgression among lineages and compare our results to previous studies. Our general approach can be applied to whole or partitioned genomic data and is easily portable to any system with sufficient genomic resources, providing a useful framework for phylogenomic studies in mice and other taxa. PMID:28338821
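The core idea of a pseudoreference, substituting sample-specific variants into the reference before the next round of mapping, can be illustrated with a toy Python sketch. This is not the authors' implementation: the function `apply_snps`, the 0-based coordinate convention, and the example data are assumptions, and the read mapping and variant calling that would happen between iterations are omitted.

```python
from typing import Dict


def apply_snps(reference: str, snps: Dict[int, str]) -> str:
    """Return a copy of `reference` with single-base substitutions applied.

    `snps` maps 0-based positions to alternate bases (hypothetical format).
    The length is unchanged, so annotation coordinates remain valid.
    """
    bases = list(reference)
    for pos, alt in snps.items():
        if not 0 <= pos < len(bases):
            raise IndexError(f"position {pos} is outside the reference")
        if len(alt) != 1:
            raise ValueError("only single-base substitutions are supported")
        bases[pos] = alt
    return "".join(bases)


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    # Each dict stands in for the variants called in one mapping round.
    rounds = [{1: "T", 7: "A"}, {4: "G"}]
    pseudoref = reference
    for called in rounds:
        pseudoref = apply_snps(pseudoref, called)
    print(pseudoref)
```

Restricting the sketch to substitutions is a deliberate simplification: because no bases are inserted or deleted, every coordinate in the pseudoreference still corresponds to the same coordinate in the original reference, which is one way the associated annotations can be preserved.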
Genome Sequence of the Electrogenic Petroleum-Degrading Thalassospira sp. Strain HJ.
Kiseleva, Larisa; Garushyants, Sofya K; Briliute, Justina; Simpson, David J W; Cohen, Michael F; Goryanin, Igor
2015-05-14
We present the draft genome of the petroleum-degrading Thalassospira sp. strain HJ, isolated from tidal marine sediment. Knowledge of this genomic information will inform studies on electrogenesis and means to degrade environmental organic contaminants, including compounds found in petroleum. Copyright © 2015 Kiseleva et al.
SSGP: SNP-set based genomic prediction to incorporate biological information
USDA-ARS's Scientific Manuscript database
Genomic prediction has emerged as an effective approach in plant and animal breeding and in precision medicine. Much research has been devoted to an improved accuracy in genomic prediction, and one of the potential ways is to incorporate biological information. Due to the statistical and computation...
Bharathi, Kosaraju; Sreenath, H L
2017-07-01
Coffea canephora is the commonly cultivated coffee species in the world along with Coffea arabica. Different pests and pathogens affect the production and quality of the coffee. Jasmonic acid (JA) is a plant hormone which plays an important role in plant growth, development, and defense mechanisms, particularly against insect pests. The key enzymes involved in the production of JA are lipoxygenase, allene oxide synthase, allene oxide cyclase, and 12-oxo-phytodienoic reductase. There is no report on the genes involved in the JA pathway in coffee plants. We made an attempt to identify and analyze the genes coding for these enzymes in C. canephora. First, protein sequences of jasmonate pathway genes from the model plant Arabidopsis thaliana were identified in the National Center for Biotechnology Information (NCBI) database. These protein sequences were used to search the web-based database Coffee Genome Hub to identify homologous protein sequences in the C. canephora genome using the Basic Local Alignment Search Tool (BLAST). Homologous protein sequences for key genes were identified in the C. canephora genome database. Protein sequences of the top matches were in turn used to search the NCBI database using BLAST to confirm the identity of the selected proteins and to identify closely related genes in other species. The protein sequences from the C. canephora database and the top matches in NCBI were aligned, phylogenetic trees were constructed using MEGA6 software, and the genetic distances of the respective genes were determined. The study identified the four key genes of the JA pathway in C. canephora, confirming the conserved nature of the pathway in coffee. The study is expected to be useful for further exploring the defense mechanisms of coffee plants. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA, viz., LOX, AOS, AOC, and OPR, are identified in C. canephora (robusta coffee) by bioinformatic approaches, confirming the conserved nature of the pathway in coffee. The findings are useful to understand the defense mechanisms of C. canephora and coffee breeding in the long run. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA, viz., LOX, AOS, AOC and OPR, were identified and analyzed in C. canephora (robusta coffee) by an in silico approach. The study has confirmed the conserved nature of the JA pathway in coffee; the findings are useful to further explore the defense mechanisms of coffee plants. Abbreviations used: C. canephora: Coffea canephora; C. arabica: Coffea arabica; JA: Jasmonic acid; CGH: Coffee Genome Hub; NCBI: National Centre for Biotechnology Information; BLAST: Basic Local Alignment Search Tool; A. thaliana: Arabidopsis thaliana; LOX: Lipoxygenase; AOS: Allene oxide synthase; AOC: Allene oxide cyclase; OPR: 12 oxo phytodienoic reductase.
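The two-step search described above (query proteins against the coffee genome, then top matches back against a reference database to confirm identity) resembles a reciprocal-best-hit check. The Python sketch below illustrates that simplified logic only; it is swapped in for illustration and is not the authors' procedure, which searched NCBI broadly. It runs no BLAST searches, and the function name and all identifiers are hypothetical placeholders rather than real accession numbers.

```python
from typing import Dict


def confirmed_homologs(
    forward_best: Dict[str, str], reverse_best: Dict[str, str]
) -> Dict[str, str]:
    """Keep query -> hit pairs whose hit maps back to the same query.

    forward_best: best hit in the target genome for each query protein.
    reverse_best: best hit in the query species for each target protein.
    Both tables are assumed to have been parsed from search output elsewhere.
    """
    return {
        query: hit
        for query, hit in forward_best.items()
        if reverse_best.get(hit) == query
    }


if __name__ == "__main__":
    # Hypothetical identifiers standing in for real search results.
    forward = {"At_LOX": "Cc_001", "At_AOS": "Cc_002"}
    reverse = {"Cc_001": "At_LOX", "Cc_002": "At_other"}
    print(confirmed_homologs(forward, reverse))
```

A pair that fails the reverse check is not necessarily a false homolog; in gene families with several close paralogs the best reverse hit can legitimately be a different family member, which is one reason a phylogenetic tree is a useful follow-up.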
Participants' recall and understanding of genomic research and large-scale data sharing.
Robinson, Jill Oliver; Slashinski, Melody J; Wang, Tao; Hilsenbeck, Susan G; McGuire, Amy L
2013-10-01
As genomic researchers are urged to openly share generated sequence data with other researchers, it is important to examine the utility of informed consent documents and processes, particularly as these relate to participants' engagement with and recall of the information presented to them, their objective or subjective understanding of the key elements of genomic research (e.g., data sharing), as well as how these factors influence or mediate the decisions they make. We conducted a randomized trial of three experimental informed consent documents (ICDs) with participants (n = 229) being recruited to genomic research studies; each document afforded varying control over breadth of release of genetic information. Recall and understanding, their impact on data sharing decisions, and comfort in decision making were assessed in a follow-up structured interview. Over 25% did not remember signing an ICD to participate in a genomic study, and the majority (54%) could not correctly identify with whom they had agreed to share their genomic data. However, participants felt that they understood enough to make an informed decision, and lack of recall did not impact final data sharing decisions or satisfaction with participation. These findings raise questions about the types of information participants need in order to provide valid informed consent, and whether subjective understanding and comfort with decision making are sufficient to satisfy the ethical principle of respect for persons.
Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions.
Hoban, Sean; Kelley, Joanna L; Lotterhos, Katie E; Antolin, Michael F; Bradburd, Gideon; Lowry, David B; Poss, Mary L; Reed, Laura K; Storfer, Andrew; Whitlock, Michael C
2016-10-01
Uncovering the genetic and evolutionary basis of local adaptation is a major focus of evolutionary biology. The recent development of cost-effective methods for obtaining high-quality genome-scale data makes it possible to identify some of the loci responsible for adaptive differences among populations. Two basic approaches for identifying putatively locally adaptive loci have been developed and are broadly used: one that identifies loci with unusually high genetic differentiation among populations (differentiation outlier methods) and one that searches for correlations between local population allele frequencies and local environments (genetic-environment association methods). Here, we review the promises and challenges of these genome scan methods, including correcting for the confounding influence of a species' demographic history, biases caused by missing aspects of the genome, matching scales of environmental data with population structure, and other statistical considerations. In each case, we make suggestions for best practices for maximizing the accuracy and efficiency of genome scans to detect the underlying genetic basis of local adaptation. With attention to their current limitations, genome scan methods can be an important tool in finding the genetic basis of adaptive evolutionary change.
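The differentiation outlier idea can be made concrete with a small Python sketch. It computes Wright's F_ST per biallelic locus for two populations from allele frequencies and flags loci above a cutoff. This is a deliberately naive illustration under stated assumptions (equal population sizes, known allele frequencies, a caller-supplied cutoff); the function names are hypothetical, and it ignores exactly the demographic confounding that the review warns about.

```python
from typing import List


def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus, two equally sized populations."""
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:  # locus fixed for the same allele in both populations
        return 0.0
    return (h_total - h_within) / h_total


def flag_outliers(fst_values: List[float], cutoff: float) -> List[int]:
    """Return indices of loci whose F_ST exceeds the supplied cutoff."""
    return [i for i, value in enumerate(fst_values) if value > cutoff]


if __name__ == "__main__":
    # Hypothetical allele frequencies (population 1, population 2) per locus.
    loci = [(0.5, 0.5), (0.4, 0.6), (0.1, 0.9), (0.3, 0.35)]
    values = [fst_two_pops(p1, p2) for p1, p2 in loci]
    print(values, flag_outliers(values, cutoff=0.25))
```

In practice the cutoff would come from a null distribution that reflects the species' demographic history rather than from a fixed number, which is the main source of false positives the abstract refers to.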
Evolving approaches to the ethical management of genomic data.
McEwen, Jean E; Boyer, Joy T; Sun, Kathie Y
2013-06-01
The ethical landscape in the field of genomics is rapidly shifting. Plummeting sequencing costs, along with ongoing advances in bioinformatics, now make it possible to generate an enormous volume of genomic data about vast numbers of people. The informational richness, complexity, and frequently uncertain meaning of these data, coupled with evolving norms surrounding the sharing of data and samples and persistent privacy concerns, have generated a range of approaches to the ethical management of genomic information. As calls increase for the expanded use of broad or even open consent, and as controversy grows about how best to handle incidental genomic findings, these approaches, informed by normative analysis and empirical data, will continue to evolve alongside the science. Published by Elsevier Ltd.
Evolving Approaches to the Ethical Management of Genomic Data
Boyer, Joy T.; Sun, Kathie Y.
2013-01-01
The ethical landscape in the field of genomics is rapidly shifting. Plummeting sequencing costs, along with ongoing advances in bioinformatics, now make it possible to generate an enormous volume of genomic data about vast numbers of people. The informational richness, complexity, and frequently uncertain meaning of these data, coupled with evolving norms surrounding the sharing of data and samples and persistent privacy concerns, have generated a range of approaches to the ethical management of genomic information. As calls increase for the expanded use of broad or even open consent, and as controversy grows about how best to handle incidental genomic findings, these approaches, informed by normative analysis and empirical data, will continue to evolve alongside the science. PMID:23453621
Yu, Yang; Zhang, Xiaojun; Yuan, Jianbo; Li, Fuhua; Chen, Xiaohan; Zhao, Yongzhen; Huang, Long; Zheng, Hongkun; Xiang, Jianhai
2015-01-01
The Pacific white shrimp Litopenaeus vannamei is the dominant crustacean species in global seafood mariculture. Understanding the genome and genetic architecture is useful for deciphering complex traits and accelerating the breeding program in shrimp. In this study, a genome survey was conducted and a high-density linkage map was constructed using a next-generation sequencing approach. The genome survey was used to identify preliminary genome characteristics and to generate a rough reference for linkage map construction. De novo SNP discovery resulted in 25,140 polymorphic markers. A total of 6,359 high-quality markers were selected for linkage map construction based on marker coverage among individuals and read depths. For the linkage map, a total of 6,146 markers spanning 4,271.43 cM were mapped to 44 sex-averaged linkage groups, with an average marker distance of 0.7 cM. An integration analysis linked 5,885 genome scaffolds and 1,504 BAC clones to the linkage map. Based on the high-density linkage map, several QTLs for body weight and body length were detected. This high-density genetic linkage map reveals basic genomic architecture and will be useful for comparative genomics research, genome assembly and genetic improvement of L. vannamei and other penaeid shrimp species. PMID:26503227
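The reported average marker distance can be checked from the other figures in the abstract. The short Python calculation below is a sketch; the abstract does not state which denominator was used, so both the per-marker and the per-interval versions are shown, and the second is an assumption about one plausible convention.

```python
total_length_cm = 4271.43  # map length reported in the abstract
mapped_markers = 6146      # markers placed on the map
linkage_groups = 44        # sex-averaged linkage groups

# Simple density: map length divided by the number of markers.
per_marker = total_length_cm / mapped_markers

# Assumed alternative: a linkage group with m markers has m - 1 intervals,
# so the map as a whole has (markers - groups) intervals.
intervals = mapped_markers - linkage_groups
per_interval = total_length_cm / intervals

markers_per_group = mapped_markers / linkage_groups

print(f"cM per marker:     {per_marker:.3f}")
print(f"cM per interval:   {per_interval:.3f}")
print(f"markers per group: {markers_per_group:.1f}")
```

Both versions round to the 0.7 cM stated in the abstract, so the reported figures are internally consistent under either convention.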
Brown, Eric W.; Detter, Chris; Gerner-Smidt, Peter; Gilmour, Matthew W.; Harmsen, Dag; Hendriksen, Rene S.; Hewson, Roger; Heymann, David L.; Johansson, Karin; Ijaz, Kashef; Keim, Paul S.; Koopmans, Marion; Kroneman, Annelies; Wong, Danilo Lo Fo; Lund, Ole; Palm, Daniel; Sawanpanyalert, Pathom; Sobel, Jeremy; Schlundt, Jørgen
2012-01-01
The rapid advancement of genome technologies holds great promise for improving the quality and speed of clinical and public health laboratory investigations and for decreasing their cost. The latest generation of genome DNA sequencers can provide highly detailed and robust information on disease-causing microbes, and in the near future these technologies will be suitable for routine use in national, regional, and global public health laboratories. With additional improvements in instrumentation, these next- or third-generation sequencers are likely to replace conventional culture-based and molecular typing methods to provide point-of-care clinical diagnosis and other essential information for quicker and better treatment of patients. Provided there is free-sharing of information by all clinical and public health laboratories, these genomic tools could spawn a global system of linked databases of pathogen genomes that would ensure more efficient detection, prevention, and control of endemic, emerging, and other infectious disease outbreaks worldwide. PMID:23092707
Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling
Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim; ...
2015-02-05
The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.
Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim
The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.
Genetic makeup of amantadine-resistant and oseltamivir-resistant human influenza A/H1N1 viruses.
Zaraket, Hassan; Saito, Reiko; Suzuki, Yasushi; Baranovich, Tatiana; Dapat, Clyde; Caperig-Dapat, Isolde; Suzuki, Hiroshi
2010-04-01
The emergence and widespread occurrence of antiviral drug-resistant seasonal human influenza A viruses, especially oseltamivir-resistant A/H1N1 virus, are major concerns. To understand the genetic background of antiviral drug-resistant A/H1N1 viruses, we performed full genome sequencing of prepandemic A/H1N1 strains. Seasonal influenza A/H1N1 viruses, including antiviral-susceptible viruses, amantadine-resistant viruses, and oseltamivir-resistant viruses, obtained from several areas in Japan during the 2007-2008 and 2008-2009 influenza seasons were analyzed. Sequencing of the full genomes of these viruses was performed, and the phylogenetic relationships among the sequences of each individual genome segment were inferred. Reference genome sequences from the Influenza Virus Resource database were included to determine the closest ancestor for each segment. Phylogenetic analysis revealed that the oseltamivir-resistant strain evolved from a reassortant oseltamivir-susceptible strain (clade 2B) which circulated in the 2007-2008 season by acquiring the H275Y resistance-conferring mutation in the NA gene. The oseltamivir-resistant lineage (corresponding to the Northern European resistant lineage) represented 100% of the H1N1 isolates from the 2008-2009 season and further acquired at least one mutation in each of the polymerase basic protein 2 (PB2), polymerase basic protein 1 (PB1), hemagglutinin (HA), and neuraminidase (NA) genes. Therefore, a reassortment event involving two distinct oseltamivir-susceptible lineages, followed by the H275Y substitution in the NA gene and other mutations elsewhere in the genome, contributed to the emergence of the oseltamivir-resistant lineage. 
In contrast, amantadine-resistant viruses from the 2007-2008 season distinctly clustered in clade 2C and were characterized by extensive amino acid substitutions across their genomes, suggesting that a fitness gap among their genetic components might have driven these mutations to maintain them in the population.
77 FR 51496 - Federal Acquisition Regulation; Basic Safeguarding of Contractor Information Systems
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-24
... Federal Acquisition Regulation; Basic Safeguarding of Contractor Information Systems AGENCY: Department of... Acquisition Regulation (FAR) to add a new subpart and contract clause for the basic safeguarding of contractor... information) that will be resident on or transiting through contractor information systems. DATES: Interested...
Soh, Jung; Gordon, Paul MK; Taschuk, Morgan L; Dong, Anguo; Ah-Seng, Andrew C; Turinsky, Andrei L; Sensen, Christoph W
2008-01-01
Background: The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results: Bluejay 1.0 is a genome viewer integrating genome annotation with: (i) gene expression information; and (ii) comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i) numerous display customization features; (ii) the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii) the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion: Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes. PMID:18940007
Exploring Other Genomes: Bacteria.
ERIC Educational Resources Information Center
Flannery, Maura C.
2001-01-01
Points out the importance of genome projects other than the Human Genome Project and provides information on sequenced bacterial genomes, including Pseudomonas aeruginosa, the agents of leprosy, cholera, meningitis, tuberculosis, and bubonic plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)
Cloud-based interactive analytics for terabytes of genomic variants data
Pan, Cuiping; McInnes, Gregory; Deflaux, Nicole; Snyder, Michael; Bingham, Jonathan; Datta, Somalee; Tsao, Philip S
2017-01-01
Motivation: Large scale genomic sequencing is now widely used to decipher questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. With the quantity and diversity these data harbor, a robust and scalable data handling and analysis solution is desired. Results: We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate such Big Data computing paradigms can provide orders of magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information. Availability and implementation: Our analysis framework is implemented in Google Cloud Platform and BigQuery. Codes are available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs. Contact: cuiping@stanford.edu or ptsao@stanford.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28961771
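To make two of the quality metrics named in this abstract concrete, the following is a minimal sketch of how a per-site call rate and allele frequencies could be computed. It is not the authors' BigQuery implementation: the function names, the toy data, and the encoding of a genotype as a tuple of allele indices (with None for a no-call) are assumptions chosen for illustration.

```python
from collections import Counter

def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call at one site."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)

def allele_frequencies(genotypes):
    """Allele frequencies among called genotypes (tuples of allele indices)."""
    counts = Counter(a for g in genotypes if g is not None for a in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical toy data: five diploid samples at one site; None marks a no-call.
site = [(0, 0), (0, 1), None, (1, 1), (0, 1)]
print(call_rate(site))           # 0.8
print(allele_frequencies(site))  # {0: 0.5, 1: 0.5}
```

In a columnar database the same quantities would typically be expressed as aggregate queries over all sites at once; the per-site functions here only show the arithmetic.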
The Adenovirus Genome Contributes to the Structural Stability of the Virion
Saha, Bratati; Wong, Carmen M.; Parks, Robin J.
2014-01-01
Adenovirus (Ad) vectors are currently the most commonly used platform for therapeutic gene delivery in human gene therapy clinical trials. Although these vectors are effective, many researchers seek to further improve the safety and efficacy of Ad-based vectors through detailed characterization of basic Ad biology relevant to its function as a vector system. Most Ad vectors are deleted of key, or all, viral protein coding sequences, which not only prevents virus replication but also increases the cloning capacity of the vector for foreign DNA. However, radical modifications to the genome size significantly decrease virion stability, suggesting that the virus genome plays a role in maintaining the physical stability of the Ad virion. Indeed, a similar relationship between genome size and virion stability has been noted for many viruses. This review discusses the impact of the genome size on Ad virion stability and emphasizes the need to consider this aspect of virus biology in Ad-based vector design. PMID:25254384
Kang, Hahk-Soo
2017-02-01
Genomics-based methods are now commonplace in natural products research. A phylogeny-guided mining approach provides a means to quickly screen a large number of microbial genomes or metagenomes in search of new biosynthetic gene clusters of interest. In this approach, biosynthetic genes serve as molecular markers, and phylogenetic trees built with known and unknown marker gene sequences are used to quickly prioritize biosynthetic gene clusters for characterization of their metabolites. Use of this approach has increased over the last couple of years along with the emergence of low-cost sequencing technologies. The aim of this review is to discuss the basic concept of a phylogeny-guided mining approach, and also to provide examples in which this approach was successfully applied to discover new natural products from microbial genomes and metagenomes. I believe that the phylogeny-guided mining approach will continue to play an important role in genomics-based natural products research.
Test Pricing and Reimbursement in Genomic Medicine: Towards a General Strategy.
Vozikis, Athanassios; Cooper, David N; Mitropoulou, Christina; Kambouris, Manousos E; Brand, Angela; Dolzan, Vita; Fortina, Paolo; Innocenti, Federico; Lee, Ming Ta Michael; Leyens, Lada; Macek, Milan; Al-Mulla, Fahd; Prainsack, Barbara; Squassina, Alessio; Taruscio, Domenica; van Schaik, Ron H; Vayena, Effy; Williams, Marc S; Patrinos, George P
2016-01-01
This paper aims to provide an overview of the rationale and basic principles guiding the governance of genomic testing services, to clarify their objectives, and allocate and define responsibilities among stakeholders in a health-care system, with a special focus on the EU countries. Particular attention is paid to issues pertaining to pricing and reimbursement policies, the availability of essential genomic tests which differs between various countries owing to differences in disease prevalence and public health relevance, the prescribing and use of genomic testing services according to existing or new guidelines, budgetary and fiscal control, the balance between price and access to innovative testing, monitoring and evaluation for cost-effectiveness and safety, and the development of research capacity. We conclude that addressing the specific items put forward in this article will help to create a robust policy in relation to pricing and reimbursement in genomic medicine. This will contribute to an effective and sustainable health-care system and will prove beneficial to the economy at large. © 2016 S. Karger AG, Basel.
CRISPR applications in ophthalmologic genome surgery.
Cabral, Thiago; DiCarlo, James E; Justus, Sally; Sengillo, Jesse D; Xu, Yu; Tsang, Stephen H
2017-05-01
The present review seeks to summarize and discuss the application of clustered regularly interspaced short palindromic repeats (CRISPR)-associated systems (Cas) for genome editing, also called genome surgery, in the field of ophthalmology. Precision medicine is an emerging approach for disease treatment and prevention that takes into account the variability of an individual's genetic sequence. Various groups have used CRISPR-Cas genome editing to make significant progress in mammalian preclinical models of eye disease, the basic science of eye development in zebrafish, the in vivo modification of ocular tissue, and the correction of stem cells with therapeutic applications. In addition, investigators have creatively used the targeted mutagenic potential of CRISPR-Cas systems to target pathogenic alleles in vitro. Over the past year, CRISPR-Cas genome editing has been used to correct pathogenic mutations in vivo and in transplantable stem cells. Although off-target mutagenesis remains a concern, improvement in CRISPR-Cas technology and careful screening for undesired mutations will likely lead to clinical eye therapeutics employing CRISPR-Cas systems in the near future.
Hierarchical Scaffolding With Bambus
Pop, Mihai; Kosack, Daniel S.; Salzberg, Steven L.
2004-01-01
The output of a genome assembler generally comprises a collection of contiguous DNA sequences (contigs) whose relative placement along the genome is not defined. A procedure called scaffolding is commonly used to order and orient these contigs using paired read information. This ordering of contigs is an essential step when finishing and analyzing the data from a whole-genome shotgun project. Most recent assemblers include a scaffolding module; however, users have little control over the scaffolding algorithm or the information produced. We thus developed a general-purpose scaffolder, called Bambus, which affords users significant flexibility in controlling the scaffolding parameters. Bambus was used recently to scaffold the low-coverage draft dog genome data. Most significantly, Bambus enables the use of linking data other than that inferred from mate-pair information. For example, the sequence of a completed genome can be used to guide the scaffolding of a related organism. We present several applications of Bambus: support for finishing, comparative genomics, analysis of the haplotype structure of genomes, and scaffolding of a mammalian genome at low coverage. Bambus is available as an open-source package from our Web site. PMID:14707177
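The idea of ordering contigs from paired read information can be illustrated with a small sketch. This is not the Bambus algorithm: it ignores contig orientation and distance estimates, and the function name, the `min_support` threshold, and the toy link counts are all assumptions made for illustration. It only shows how links supported by enough read pairs can be chained into a linear order.

```python
from collections import defaultdict

def order_contigs(links, min_support=2):
    """Chain contigs into one linear order using pairwise link counts.

    links: iterable of (contig_a, contig_b, n_pairs), where n_pairs is the
    number of read pairs joining the two contigs. Links with fewer than
    min_support pairs are ignored. Assumes the retained links form a single
    simple path; orientation is not handled.
    """
    neighbours = defaultdict(set)
    for a, b, n_pairs in links:
        if n_pairs >= min_support:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # A path end has exactly one neighbour; pick the smallest name so the
    # result is deterministic.
    ends = sorted(c for c, ns in neighbours.items() if len(ns) == 1)
    if not ends:
        return []
    order, previous, current = [], None, ends[0]
    while current is not None:
        order.append(current)
        following = [c for c in neighbours[current]
                     if c != previous and c not in order]
        previous, current = current, (following[0] if following else None)
    return order

# Hypothetical link counts; the c2-c3 link has too little support and is dropped.
links = [("c3", "c1", 5), ("c1", "c4", 3), ("c4", "c2", 4), ("c2", "c3", 1)]
print(order_contigs(links))  # ['c2', 'c4', 'c1', 'c3']
```

A real scaffolder also has to resolve conflicting links and branching; the support threshold here stands in for the kind of user-controllable parameter the abstract mentions.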
technologies using materials-by-design methods. The basic direction involves research on non-equilibrium doping in semiconductors Materials by Design and Materials Genome Non-equilibrium and metastable . 5, 1117 (2014) "Theoretical Prediction and Experimental Realization of New Stable Inorganic
Methods in molecular biology: plant cytogenetics
USDA-ARS's Scientific Manuscript database
Cytogenetic studies have contributed greatly to our understanding of genetics, biology, reproduction, and evolution. From early studies in basic chromosome behavior the field has expanded enabling whole genome analysis to the manipulation of chromosomes and their organization. This book covers a ran...
Centromere synteny among Brachypodium, wheat, and rice
USDA-ARS's Scientific Manuscript database
Rice, wheat and Brachypodium are plant genetic models with variable genome complexity and basic chromosome numbers, representing two subfamilies of the Poaceae. Centromeres are prominent chromosome landmarks, but their fate during this convoluted chromosome evolution has been more difficult to deter...
Comparative genomic analysis by microbial COGs self-attraction rate.
Santoni, Daniele; Romano-Spica, Vincenzo
2009-06-21
Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archaea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. In contrast, genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. The self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and Lateral Gene Transfer events may influence divergences from classical taxonomy. Each set of COG classes' aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.
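The abstract does not give the formula for the self-attraction rate, so the following is only a sketch under an assumed definition: for each functional class, the fraction of its genes whose next gene along a circular genome belongs to the same class. The function names, the three single-letter class labels, and the toy gene orders are hypothetical; the sketch shows how one vector per genome could be built and how two genomes could then be compared by the distance between their vectors.

```python
import math

def aggregation_vector(gene_classes, classes):
    """One component per class: the share of that class's genes whose next
    gene (circular order) has the same class. Assumed definition, see lead-in."""
    n = len(gene_classes)
    vector = []
    for c in classes:
        members = [i for i, g in enumerate(gene_classes) if g == c]
        same_next = sum(1 for i in members if gene_classes[(i + 1) % n] == c)
        vector.append(same_next / len(members) if members else 0.0)
    return vector

def genome_distance(u, v):
    """Euclidean distance between two aggregation vectors."""
    return math.dist(u, v)

# Hypothetical class labels and gene orders for two toy genomes.
classes = ["J", "K", "C"]
genome_a = ["J", "J", "J", "K", "K", "C", "K", "C"]
genome_b = ["J", "K", "J", "K", "C", "C", "J", "K"]
vec_a = aggregation_vector(genome_a, classes)
vec_b = aggregation_vector(genome_b, classes)
print(vec_a, vec_b, genome_distance(vec_a, vec_b))
```

With 18 classes instead of three, each genome would yield an eighteen-dimensional vector of the kind described above, and a pairwise distance matrix over all genomes could feed a standard clustering method.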
Zhou, Bin; Lin, Xudong; Wang, Wei; Halpin, Rebecca A.; Bera, Jayati; Stockwell, Timothy B.; Barr, Ian G.
2014-01-01
Although human influenza B virus (IBV) is a significant human pathogen, its great genetic diversity has limited our ability to universally amplify the entire genome for subsequent sequencing or vaccine production. The generation of sequence data via next-generation approaches and the rapid cloning of viral genes are critical for basic research, diagnostics, antiviral drugs, and vaccines to combat IBV. To overcome the difficulty of amplifying the diverse and ever-changing IBV genome, we developed and optimized techniques that amplify the complete segmented negative-sense RNA genome from any IBV strain in a single tube/well (IBV genomic amplification [IBV-GA]). Amplicons for >1,000 diverse IBV genomes from different sample types (e.g., clinical specimens) were generated and sequenced using this robust technology. These approaches are sensitive, robust, and sequence independent (i.e., universally amplify past, present, and future IBVs), which facilitates next-generation sequencing and advanced genomic diagnostics. Importantly, special terminal sequences engineered into the optimized IBV-GA2 products also enable ligation-free cloning to rapidly generate reverse-genetics plasmids, which can be used for the rescue of recombinant viruses and/or the creation of vaccine seed stock. PMID:24501036
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.
Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi
2017-07-01
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single DNA molecule in real time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is an effective tool not only in the basic biological sciences but also in the medical/clinical setting.
The BIG Data Center: from deposition to integration to translation
2017-01-01
Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658
Anderson, Olin D; Coleman-Derr, Devin; Gu, Yong Q; Heath, Sekou
2010-06-16
Among the dietary essential amino acids, the most severely limiting in the cereals is lysine. Since cereals make up half of the human diet, lysine limitation has quality/nutritional consequences. The breakdown of lysine is controlled mainly by the catabolic bifunctional enzyme lysine ketoglutarate reductase - saccharopine dehydrogenase (LKR/SDH). The LKR/SDH gene has been reported to produce transcripts for the bifunctional enzyme and separate monofunctional transcripts. In addition to lysine metabolism, this gene has been implicated in a number of metabolic and developmental pathways, which, along with its production of multiple transcript types and complex exon/intron structure, suggests an important node in plant metabolism. Understanding more about the LKR/SDH gene is thus interesting both from an applied standpoint and for basic plant metabolism. The current report describes a wheat genomic fragment containing an LKR/SDH gene and adjacent genes. The wheat LKR/SDH genomic segment was found to originate from the A-genome of wheat, and EST analysis indicates all three LKR/SDH genes in hexaploid wheat are transcriptionally active. A comparison of a set of plant LKR/SDH genes suggests regions of greater sequence conservation likely related to critical enzymatic functions and metabolic controls. Although most plants contain only a single LKR/SDH gene per genome, poplar contains at least two functional bifunctional genes in addition to a monofunctional LKR gene. Analysis of ESTs finds evidence for monofunctional LKR transcripts in switchgrass, and monofunctional SDH transcripts in wheat, Brachypodium, and poplar. The analysis of a wheat LKR/SDH gene and comparative structural and functional analyses among available plant genes provide new information on this important gene.
Both the structure of the LKR/SDH gene and the immediately adjacent genes show lineage-specific differences between monocots and dicots, and findings suggest variation in activity of LKR/SDH genes among plants. Although most plant genomes seem to contain a single conserved LKR/SDH gene per genome, poplar possesses multiple contiguous genes. A preponderance of SDH transcripts suggests the LKR region may be more rate-limiting. Only switchgrass has EST evidence for LKR monofunctional transcripts. Evidence for monofunctional SDH transcripts shows a novel intron in wheat, Brachypodium, and poplar.
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.
Tatusova, Tatiana
2016-01-01
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. A text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Closing the gap between knowledge and clinical application: challenges for genomic translation.
Burke, Wylie; Korngiebel, Diane M
2015-01-01
Despite early predictions and rapid progress in research, the introduction of personal genomics into clinical practice has been slow. Several factors contribute to this translational gap between knowledge and clinical application. The evidence available to support genetic test use is often limited, and implementation of new testing programs can be challenging. In addition, the heterogeneity of genomic risk information points to the need for strategies to select and deliver the information most appropriate for particular clinical needs. Accomplishing these tasks also requires recognition that some expectations for personal genomics are unrealistic, notably expectations concerning the clinical utility of genomic risk assessment for common complex diseases. Efforts are needed to improve the body of evidence addressing clinical outcomes for genomics, apply implementation science to personal genomics, and develop realistic goals for genomic risk assessment. In addition, translational research should emphasize the broader benefits of genomic knowledge, including applications of genomic research that provide clinical benefit outside the context of personal genomic risk.
The coffee genome hub: a resource for coffee genomes
Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan
2015-01-01
The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. PMID:25392413
MIPSPlantsDB—plant database resource for integrative and comparative plant genome research
Spannagl, Manuel; Noubibou, Octave; Haase, Dirk; Yang, Li; Gundlach, Heidrun; Hindemitt, Tobias; Klee, Kathrin; Haberer, Georg; Schoof, Heiko; Mayer, Klaus F. X.
2007-01-01
Genome-oriented plant research delivers a rapidly increasing amount of plant genome data. Comprehensive and structured information resources are required to structure and communicate genome and associated analytical data for model organisms as well as for crops. The increase in available plant genomic data enables powerful comparative analysis and integrative approaches. PlantsDB aims to provide data and information resources for individual plant species and in addition to build a platform for integrative and comparative plant genome research. PlantsDB is constituted from genome databases for Arabidopsis, Medicago, Lotus, rice, maize and tomato. Complementary data resources for cis elements, repetitive elements and extensive cross-species comparisons are implemented. The PlantsDB portal can be reached at . PMID:17202173
Known unknowns: building an ethics of uncertainty into genomic medicine.
Newson, Ainsley J; Leonard, Samantha J; Hall, Alison; Gaff, Clara L
2016-09-01
Genomic testing has reached the point where, technically at least, it can be cheaper to undertake panel-, exome- or whole genome testing than it is to sequence a single gene. An attribute of these approaches is that information gleaned will often have uncertain significance. In addition to the challenges this presents for pre-test counseling and informed consent, a further consideration emerges over how - ethically - we should conceive of and respond to this uncertainty. To date, the ethical aspects of uncertainty in genomics have remained under-explored. In this paper, we draft a conceptual and ethical response to the question of how to conceive of and respond to uncertainty in genomic medicine. After introducing the problem, we articulate a concept of 'genomic uncertainty'. Drawing on this, together with exemplar clinical cases and related empirical literature, we then critique the presumption that uncertainty is always problematic and something to be avoided, or eradicated. We conclude by outlining an 'ethics of genomic uncertainty'; describing how we might handle uncertainty in genomic medicine. This involves fostering resilience, welfare, autonomy and solidarity. Uncertainty will be an inherent aspect of clinical practice in genomics for some time to come. Genomic testing should not be offered with the explicit aim to reduce uncertainty. Rather, uncertainty should be appraised, adapted to and communicated about as part of the process of offering and providing genomic information.
Hay, Elizabeth A; Cowie, Philip; MacKenzie, Alasdair
2017-01-01
There can now be little doubt that the cis-regulatory genome represents the largest information source within the human genome essential for health. In addition to containing up to five times more information than the coding genome, the cis-regulatory genome also acts as a major reservoir of disease-associated polymorphic variation. The cis-regulatory genome, which is comprised of enhancers, silencers, promoters, and insulators, also acts as a major functional target for epigenetic modification including DNA methylation and chromatin modifications. These epigenetic modifications impact the ability of cis-regulatory sequences to maintain tissue-specific and inducible expression of genes that preserve health. There has been limited ability to identify and characterize the functional components of this huge and largely misunderstood part of the human genome that, for decades, was ignored as "Junk" DNA. In an attempt to address this deficit, the current chapter will first describe methods of identifying and characterizing functional elements of the cis-regulatory genome at a genome-wide level using databases such as ENCODE, the UCSC browser, and NCBI. We will then explore the databases on the UCSC genome browser, which provides access to DNA methylation and chromatin modification datasets. Finally, we will describe how we can superimpose the huge volume of study data contained in the NCBI archives onto that contained within the UCSC browser in order to glean relevant in vivo study data for any locus within the genome. An ability to access and utilize these information sources will become essential to informing the future design of experiments and subsequent determination of the role of epigenetics in health and disease and will form a critical step in our development of personalized medicine.
Gatekeepers or intermediaries? The role of clinicians in commercial genomic testing.
McGowan, Michelle L; Fishman, Jennifer R; Settersten, Richard A; Lambrix, Marcie A; Juengst, Eric T
2014-01-01
Many commentators on "direct-to-consumer" genetic risk information have raised concerns that giving results to individuals with insufficient knowledge and training in genomics may harm consumers, the health care system, and society. In response, several commercial laboratories offering genomic risk profiling have shifted to more traditional "direct-to-provider" (DTP) marketing strategies, repositioning clinicians as the intended recipients of advertising of laboratory services and as gatekeepers to personal genomic information. Increasing popularity of next generation sequencing puts a premium on ensuring that those who are charged with interpreting, translating, communicating and managing commercial genomic risk information are appropriately equipped for the job. To shed light on their gatekeeping role, we conducted a study to assess how and why early clinical users of genomic risk assessment incorporate these tools in their clinical practices and how they interpret genomic information for their patients. We conducted qualitative in-depth interviews with 18 clinicians providing genomic risk assessment services to their patients in partnership with DNA Direct and Navigenics. Our findings suggest that clinicians learned most of what they knew about genomics directly from the commercial laboratories. Clinicians rely on the expertise of the commercial laboratories without the ability to critically evaluate the knowledge or assess risks. The DTP service delivery model cannot guarantee that providers will have adequate expertise or sound clinical judgment. Even if clinicians want greater genomic knowledge, the current market structure is unlikely to build the independent substantive expertise of clinicians, but rather promote its continued outsourcing.
Because commercial laboratories have the most "skin in the game" financially, genetics professionals and policymakers should scrutinize the scientific validity and clinical soundness of the process by which these laboratories interpret their findings to assess whether self-interested commercial sources are the most appropriate entities for gate-keeping genomic interpretation.
Luo, Li; Zhu, Yun
2012-01-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, collapsing method, multivariate and collapsing (CMC) method, individual χ2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812
Luo, Li; Zhu, Yun; Xiong, Momiao
2012-06-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T(2), collapsing method, multivariate and collapsing (CMC) method, individual χ(2) test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.
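The collapsing method named in the abstract above can be illustrated with a minimal sketch. It assumes the simplest form of the idea: each individual is reduced to a carrier/non-carrier indicator over the rare variants in a region, and carrier counts in cases and controls are compared with an uncorrected Pearson chi-square statistic on the 2x2 table. The function names and the toy genotypes are hypothetical; this is not the authors' implementation and it is not the genome-information content-based statistic.

```python
def collapse(genotypes):
    """Collapse rare variants per individual.

    genotypes: list of rows, one per individual; each row holds the
    minor-allele count (0, 1 or 2) at every rare site in the region.
    Returns 1 for carriers of at least one rare allele, else 0.
    """
    return [1 if any(count > 0 for count in row) else 0 for row in genotypes]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]]; returns 0.0 when any margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def collapsing_statistic(case_genotypes, control_genotypes):
    """Compare carrier frequency between cases and controls."""
    case_carriers = sum(collapse(case_genotypes))
    control_carriers = sum(collapse(control_genotypes))
    return chi_square_2x2(
        case_carriers, len(case_genotypes) - case_carriers,
        control_carriers, len(control_genotypes) - control_carriers,
    )


# Hypothetical toy data: 4 cases (3 carriers) and 4 controls (1 carrier).
cases = [[1, 0, 0], [0, 0, 2], [0, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
stat = collapsing_statistic(cases, controls)
```

Because every rare site contributes equally to the carrier indicator, this sketch also shows the limitation the abstract points out: differences in effect among variants at different locations are ignored.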
Holistic Nursing in the Genetic/Genomic Era.
Sharoff, Leighsa
2016-06-01
Holistic nursing practice is an ever-evolving transformative process with core values that require continued growth, professional leadership, and advocacy. Holistic nurses are required to stay current with all new required competencies, such as the Core Competencies in Genetics for Health Professionals, and, as such, be adept at translating scientific evidence relating to genetics/genomics in the clinical setting. Knowledge of genetics/genomics in relation to nursing practice, policy, utilization, and research influences nurses' responsibilities. In addition to holistic nursing competencies, the holistic nurse must have basic knowledge and skills to integrate genetics/genomics aspects. It is important for holistic nurses to enhance their overall knowledge foundation, skills, and attitudes about genetics to prepare for the transformation in health care that is already underway. Holistic nurses can provide an important perspective to the application of genetics and genomics, focusing on health promotion, caring, and understanding the relationship between caring and families, community, and society. Yet there may be a lack of genetic and genomic knowledge to fully participate in the current genomic era. This article will explore the required core competencies for all health care professionals, share linkage of holistic nurses in practice with genetic/genomic conditions, and provide resources to further one's knowledge base. © The Author(s) 2015.
Genome Sequence of “Thalassospira australica” NP3b2T Isolated from St. Kilda Beach, Tasman Sea
López-Pérez, Mario; Webb, Hayden K.; Crawford, Russell J.
2014-01-01
Here, we present the draft genome of “Thalassospira australica” NP3b2T, a potential poly(ethylene terephthalate) (PET) plastic biodegrader. This genomic information will enhance information on the genetic basis of metabolic pathways for the degradation of PET plastic. PMID:25395631
The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins.
Dehal, Paramvir; Satou, Yutaka; Campbell, Robert K; Chapman, Jarrod; Degnan, Bernard; De Tomaso, Anthony; Davidson, Brad; Di Gregorio, Anna; Gelpke, Maarten; Goodstein, David M; Harafuji, Naoe; Hastings, Kenneth E M; Ho, Isaac; Hotta, Kohji; Huang, Wayne; Kawashima, Takeshi; Lemaire, Patrick; Martinez, Diego; Meinertzhagen, Ian A; Necula, Simona; Nonaka, Masaru; Putnam, Nik; Rash, Sam; Saiga, Hidetoshi; Satake, Masanobu; Terry, Astrid; Yamada, Lixy; Wang, Hong-Gang; Awazu, Satoko; Azumi, Kaoru; Boore, Jeffrey; Branno, Margherita; Chin-Bow, Stephen; DeSantis, Rosaria; Doyle, Sharon; Francino, Pilar; Keys, David N; Haga, Shinobu; Hayashi, Hiroko; Hino, Kyosuke; Imai, Kaoru S; Inaba, Kazuo; Kano, Shungo; Kobayashi, Kenji; Kobayashi, Mari; Lee, Byung-In; Makabe, Kazuhiro W; Manohar, Chitra; Matassi, Giorgio; Medina, Monica; Mochizuki, Yasuaki; Mount, Steve; Morishita, Tomomi; Miura, Sachiko; Nakayama, Akie; Nishizaka, Satoko; Nomoto, Hisayo; Ohta, Fumiko; Oishi, Kazuko; Rigoutsos, Isidore; Sano, Masako; Sasaki, Akane; Sasakura, Yasunori; Shoguchi, Eiichi; Shin-i, Tadasu; Spagnuolo, Antoinetta; Stainier, Didier; Suzuki, Miho M; Tassy, Olivier; Takatori, Naohito; Tokuoka, Miki; Yagi, Kasumi; Yoshizaki, Fumiko; Wada, Shuichi; Zhang, Cindy; Hyatt, P Douglas; Larimer, Frank; Detter, Chris; Doggett, Norman; Glavina, Tijana; Hawkins, Trevor; Richardson, Paul; Lucas, Susan; Kohara, Yuji; Levine, Michael; Satoh, Nori; Rokhsar, Daniel S
2002-12-13
The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordates and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains approximately 16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.
The Emerging Field of Human Social Genomics
Slavich, George M.; Cole, Steven W.
2013-01-01
Although we generally experience our bodies as being biologically stable across time and situations, an emerging field of research is demonstrating that external social conditions, especially our subjective perceptions of those conditions, can influence our most basic internal biological processes—namely, the expression of our genes. This research on human social genomics has begun to identify the types of genes that are subject to social-environmental regulation, the neural and molecular mechanisms that mediate the effects of social processes on gene expression, and the genetic polymorphisms that moderate individual differences in genomic sensitivity to social context. The molecular models resulting from this research provide new opportunities for understanding how social and genetic factors interact to shape complex behavioral phenotypes and susceptibility to disease. This research also sheds new light on the evolution of the human genome and challenges the fundamental belief that our molecular makeup is relatively stable and impermeable to social-environmental influence. PMID:23853742
Structure of Ljungan virus provides insight into genome packaging of this picornavirus
NASA Astrophysics Data System (ADS)
Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J.; Kotecha, Abhay; Siebert, C. Alistair; Lindberg, A. Michael; Fry, Elizabeth E.; Rao, Zihe; Tuthill, Tobias J.; Stuart, David I.
2015-10-01
Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses.
Human, Mouse, and Rat Genome Large-Scale Rearrangements: Stability Versus Speciation
Zhao, Shaying; Shetty, Jyoti; Hou, Lihua; Delcher, Arthur; Zhu, Baoli; Osoegawa, Kazutoyo; de Jong, Pieter; Nierman, William C.; Strausberg, Robert L.; Fraser, Claire M.
2004-01-01
Using paired-end sequences from bacterial artificial chromosomes, we have constructed high-resolution synteny and rearrangement breakpoint maps among human, mouse, and rat genomes. Among the >300 syntenic blocks identified are segments of over 40 Mb without any detected interspecies rearrangements, as well as regions with frequently broken synteny and extensive rearrangements. As closely related species, mouse and rat share the majority of the breakpoints and often have the same types of rearrangements when compared with the human genome. However, the breakpoints not shared between them indicate that mouse rearrangements are more often interchromosomal, whereas intrachromosomal rearrangements are more prominent in rat. Centromeres may have played a significant role in reorganizing a number of chromosomes in all three species. The comparison of the three species indicates that genome rearrangements follow a path that accommodates a delicate balance between maintaining a basic structure underlying all mammalian species and permitting variations that are necessary for speciation. PMID:15364903
The Amphimedon queenslandica genome and the evolution of animal complexity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srivastava, Mansi; Simakov, Oleg; Chapman, Jarrod
2010-07-01
Sponges are an ancient group of animals that diverged from other metazoans over 600 million years ago. Here we present the draft genome sequence of Amphimedon queenslandica, a demosponge from the Great Barrier Reef, and show that it is remarkably similar to other animal genomes in content, structure and organization. Comparative analysis enabled by the sponge sequence reveals genomic events linked to the origin and early evolution of animals, including the appearance, expansion, and diversification of pan-metazoan transcription factor, signaling pathway, and structural genes. This diverse 'toolkit' of genes correlates with critical aspects of all metazoan body plans, and comprises cell cycle control and growth, development, somatic and germ cell specification, cell adhesion, innate immunity, and allorecognition. Notably, many of the genes associated with the emergence of animals are also implicated in cancer, which arises from defects in basic processes associated with metazoan multicellularity.
Structure of Ljungan virus provides insight into genome packaging of this picornavirus.
Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J; Kotecha, Abhay; Siebert, C Alistair; Lindberg, A Michael; Fry, Elizabeth E; Rao, Zihe; Tuthill, Tobias J; Stuart, David I
2015-10-08
Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses.
Final Technical Report for Award # ER64999
DOE Office of Scientific and Technical Information (OSTI.GOV)
Metcalf, William W.
2014-10-08
This report provides a summary of activities for Award # ER64999, a Genomes to Life Project funded by the Office of Science, Basic Energy Research. The project was entitled "Methanogenic archaea and the global carbon cycle: a systems biology approach to the study of Methanosarcina species". The long-term goal of this multi-investigator project was the creation of integrated, multiscale models that accurately and quantitatively predict the role of Methanosarcina species in the global carbon cycle under dynamic environmental conditions. To achieve these goals we pursued four specific aims: (1) genome sequencing of numerous members of the Order Methanosarcinales, (2) identification of genomic sources of phenotypic variation through in silico comparative genomics, (3) elucidation of the transcriptional networks of two Methanosarcina species, and (4) development of comprehensive metabolic network models for characterized strains to address the question of how metabolic models scale with genetic distance.
Razin, S V
2018-04-01
This issue of Biochemistry (Moscow) is devoted to the cell nucleus and mechanisms of transcription regulation. Over the years, biochemical processes in the cell nucleus have been studied in isolation, outside the context of their spatial organization. Now it is clear that segregation of functional processes within a compartmentalized cell nucleus is very important for the implementation of basic genetic processes. The functional compartmentalization of the cell nucleus is closely related to the spatial organization of the genome, which in turn plays a key role in the operation of epigenetic mechanisms. In this issue of Biochemistry (Moscow), we present a selection of review articles covering the functional architecture of the eukaryotic cell nucleus, the mechanisms of genome folding, the role of stochastic processes in establishing 3D architecture of the genome, and the impact of genome spatial organization on transcription regulation.
Aspergillus Niger Genomics: Past, Present and into the Future
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Scott E.
2006-09-01
Aspergillus niger is a filamentous ascomycete fungus that is ubiquitous in the environment and has been implicated in opportunistic infections of humans. In addition to its role as an opportunistic human pathogen, A. niger is economically important as a fermentation organism used for the production of citric acid. Industrial citric acid production by A. niger represents one of the most efficient, highest yield bioprocesses in use currently by industry. The genome size of A. niger is estimated to be between 35.5 and 38.5 megabases (Mb) divided among eight chromosomes/linkage groups that vary in size from 3.5 - 6.6 Mb. Currently, there are three independent A. niger genome projects, an indication of the economic importance of this organism. The rich amount of data resulting from these multiple A. niger genome sequences will be used for basic and applied research programs applicable to fermentation process development, morphology and pathogenicity.
Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.
Andersen, Ethan J; Nepal, Madhav P
2017-08-01
We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.
Public attitudes to genomic science: an experiment in information provision.
Sturgis, Patrick; Brunton-Smith, Ian; Fife-Schaw, Chris
2010-03-01
We use an experimental panel study design to investigate the effect of providing "value-neutral" information about genomic science in the form of a short film to a random sample of the British public. We find little evidence of attitude change as a function of information provision. However, our results show that information provision significantly increased dropout from the study amongst less educated respondents. Our findings have implications both for our understanding of the knowledge-attitude relationship in public opinion toward genomic science and for science communication more generally.
Patient-derived Xenograft (PDX) Models In Basic and Translational Breast Cancer Research
Dobrolecki, Lacey E.; Airhart, Susie D.; Alferez, Denis G.; Aparicio, Samuel; Behbod, Fariba; Bentires-Alj, Mohamed; Brisken, Cathrin; Bult, Carol J.; Cai, Shirong; Clarke, Robert B.; Dowst, Heidi; Ellis, Matthew J.; Gonzalez-Suarez, Eva; Iggo, Richard D.; Kabos, Peter; Li, Shunqiang; Lindeman, Geoffrey J.; Marangoni, Elisabetta; McCoy, Aaron; Meric-Bernstam, Funda; Piwnica-Worms, Helen; Poupon, Marie-France; Reis-Filho, Jorge; Sartorius, Carol A.; Scabia, Valentina; Sflomos, George; Tu, Yizheng; Vaillant, François; Visvader, Jane E.; Welm, Alana; Wicha, Max S.
2017-01-01
Patient-derived xenograft (PDX) models of a growing spectrum of cancers are rapidly supplanting long-established traditional cell lines as preferred models for conducting basic and translational pre-clinical research. In breast cancer, to complement the now curated collection of approximately 45 long-established human breast cancer cell lines, a newly formed consortium of academic laboratories, currently from Europe, Australia, and North America, herein summarizes data on over 500 stably transplantable PDX models representing all three clinical subtypes of breast cancer (ER+, HER2+, and “Triple-negative” (TNBC)). Many of these models are well-characterized with respect to genomic, transcriptomic, and proteomic features, metastatic behavior, and treatment response to a variety of standard-of-care and experimental therapeutics. These stably transplantable PDX lines are generally available for dissemination to laboratories conducting translational research, and contact information for each collection is provided. This review summarizes current experiences related to PDX generation across participating groups, efforts to develop data standards for annotation and dissemination of patient clinical information that does not compromise patient privacy, efforts to develop complementary data standards for annotation of PDX characteristics and biology, and progress toward “credentialing” of PDX models as surrogates to represent individual patients for use in pre-clinical and co-clinical translational research. In addition, this review highlights important unresolved questions, as well as current limitations, that have hampered more efficient generation of PDX lines and more rapid adoption of PDX use in translational breast cancer research. PMID:28025748
ERIC Educational Resources Information Center
Pollack, Miriam
The "Mapping the Human Genome" project demonstrated that librarians can help whomever they serve in accessing information resources in the areas of biological and health information, whether it is the scientists who are developing the information or a member of the public who is using the information. Public libraries can guide library…
De Rocquigny, H; Ficheux, D; Gabus, C; Allain, B; Fournie-Zaluski, M C; Darlix, J L; Roques, B P
1993-01-01
The 56 amino acid nucleocapsid protein (NCp10) of Moloney Murine Leukemia Virus contains a CysX2CysX4HisX4Cys zinc finger flanked by basic residues. In vitro NCp10 promotes genomic RNA dimerization, a process most probably linked to genomic RNA packaging, and replication primer tRNA(Pro) annealing to the initiation site of reverse transcription. To characterize the amino-acid sequences involved in the various functions of NCp10, we have synthesized by the solid-phase method the native protein and a series of derived peptides shortened at the N- or C-terminus with or without the zinc finger domain. In the latter case, the two parts of the protein were linked by a Glycine-Glycine spacer. The in vitro studies of these peptides show that nucleic acid annealing activities of NCp10 do not require a zinc finger but are critically dependent on the presence of specific sequences located on each side of the CCHC domain and containing proline and basic residues. Thus, deletion of 11R or 49PRPQT of the fully active 29 residue peptide 11RQGGERRRSQLDRDGGKKPRGPRGPRPQT53 leads to a complete loss of NCp10 activity. Therefore it is proposed that in NCp10, the zinc finger directs the spatial recognition of the target RNAs by the basic domains surrounding the zinc finger. PMID:8451185
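The CysX2CysX4HisX4Cys spacing described above maps directly onto a regular expression, which may help when scanning protein sequences for candidate CCHC zinc fingers. The sketch below is illustrative only: the function name and the toy sequence are hypothetical (the toy sequence is made up and is not the NCp10 sequence), and a pattern match says nothing about whether a site actually binds zinc.

```python
import re

# Cys, any 2 residues, Cys, any 4 residues, His, any 4 residues, Cys.
CCHC_PATTERN = re.compile(r"C[A-Z]{2}C[A-Z]{4}H[A-Z]{4}C")


def find_cchc_motifs(protein_seq):
    """Return (start, motif) for each non-overlapping CCHC match.

    start is a 0-based index into the one-letter amino acid sequence.
    """
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in CCHC_PATTERN.finditer(seq)]


# Hypothetical toy sequence containing one CCHC-spaced motif.
toy_sequence = "GGSCPQCGSTAHLVNECAA"
hits = find_cchc_motifs(toy_sequence)
```

Each hit spans 14 residues (1 + 2 + 1 + 4 + 1 + 4 + 1), so flanking basic residues of the kind the abstract emphasizes would have to be inspected separately, for example by slicing the sequence on either side of the reported start position.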
Chen, Zhi-Teng; Wu, Hai-Yan; Du, Yu-Zhou
2016-07-01
We report the nearly complete mitochondrial genome of a stonefly species, Styloperla sp. (Plecoptera: Styloperlidae), which is a circular molecule of 15,416 bp in length and consists of 13 protein-coding genes, 2 ribosomal RNAs, 20 transfer RNAs and a partial control region (645 bp). Using the 13 protein-coding genes of 8 stoneflies and 3 other related species, we constructed a phylogenetic tree to verify the accuracy of the newly determined mitogenome sequences. Our results provide basic data for further study of phylogeny in Plecoptera.
Future Translational Applications From the Contemporary Genomics Era
Fox, Caroline S.; Hall, Jennifer L.; Arnett, Donna K.; Ashley, Euan A.; Delles, Christian; Engler, Mary B.; Freeman, Mason W.; Johnson, Julie A.; Lanfear, David E.; Liggett, Stephen B.; Lusis, Aldons J.; Loscalzo, Joseph; MacRae, Calum A.; Musunuru, Kiran; Newby, L. Kristin; O’Donnell, Christopher J.; Rich, Stephen S.; Terzic, Andre
2016-01-01
The field of genetics and genomics has advanced considerably with the achievement of recent milestones encompassing the identification of many loci for cardiovascular disease and variable drug responses. Despite this achievement, a gap exists in the understanding and advancement to meaningful translation that directly affects disease prevention and clinical care. The purpose of this scientific statement is to address the gap between genetic discoveries and their practical application to cardiovascular clinical care. In brief, this scientific statement assesses the current timeline for effective translation of basic discoveries to clinical advances, highlighting past successes. Current discoveries in the area of genetics and genomics are covered next, followed by future expectations, tools, and competencies for achieving the goal of improving clinical care. PMID:25882488
Sha, Li-Na; Fan, Xing; Li, Jun; Liao, Jin-Qiu; Zeng, Jian; Wang, Yi; Kang, Hou-Yang; Zhang, Hai-Qin; Zheng, You-Liang; Zhou, Yong-Hong
2017-09-01
Leymus Hochst. (Triticeae: Poaceae), a group of allopolyploid species with the NsXm genomes, is a perennial genus with diversity in morphology, cytology, ecology, and distribution in the Triticeae. To investigate the genome origin and evolutionary history of Leymus, three unlinked low-copy nuclear genes (Acc1, Pgk1, and GBSSI) and three chloroplast regions (trnL-F, matK, and rbcL) of 32 Leymus species were analyzed with those of 36 diploid species representing 18 basic genomes in the Triticeae. The phylogenetic relationships were reconstructed using Bayesian inference, Maximum parsimony, and NeighborNet methods. A time-calibrated phylogeny was generated to estimate the evolutionary history of Leymus. The results suggest that reticulate evolution has occurred in Leymus species, with several distinct progenitors contributing to the Leymus. The molecular data in resolution of the Xm-genome lineage resulted in two apparently contradictory results, with one placing the Xm-genome lineage as closely related to the P/F genome and the other splitting the Xm-genome lineage as sister to the Ns-genome donor. Our results suggested that (1) the Ns genome of Leymus was donated by Psathyrostachys, and additional Ns-containing alleles may be introgressed into some Leymus polyploids by recurrent hybridization; (2) The phylogenetic incongruence regarding the resolution of the Xm-genome lineage suggested that the Xm genome of Leymus was closely related to the P genome of Agropyron; (3) Both Ns- and Xm-genome lineages served as the maternal donor during the speciation of Leymus species; (4) The Pseudoroegneria, Lophopyrum and Australopyrum genomes contributed to some Leymus species. Copyright © 2017 Elsevier Inc. All rights reserved.
ReprDB and panDB: minimalist databases with maximal microbial representation.
Zhou, Wei; Gay, Nicole; Oh, Julia
2018-01-18
Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
Attitudes of stakeholders in psychiatry towards the inclusion of children in genomic research.
Sundby, Anna; Boolsen, Merete Watt; Burgdorf, Kristoffer Sølvsten; Ullum, Henrik; Hansen, Thomas Folkmann; Mors, Ole
2018-03-05
Genomic sequencing of children in research raises complex ethical issues. This study aims to gain more knowledge on the attitudes towards the inclusion of children as research subjects in genomic research and towards the disclosure of pertinent and incidental findings to the parents and the child. Qualitative data were collected from interviews with a wide range of informants: experts engaged in genomic research, clinical geneticists, persons with mental disorders, relatives, and blood donors. Quantitative data were collected from a cross-sectional web-based survey among 1227 parents and 1406 non-parents who were potential stakeholders in psychiatric genomic research. Participants generally expressed positive views on children's participation in genomic research. The informants in the qualitative interviews highlighted the age of the child as a critical aspect when disclosing genetic information. Other important aspects were the child's right to an autonomous choice, the emotional burden of knowing imposed on both the child and the parents, and the possibility of receiving beneficial clinical information regarding the future health of the child. Nevertheless, there was no consensus whether the parent or the child should receive the findings. A majority of survey stakeholders agreed that children should be able to participate in genomic research. The majority agreed that both pertinent and incidental findings should be returned to the parents and to the child when of legal age. Having children does not affect the stakeholder's attitudes towards the inclusion of children as research subjects in genomic research. Our findings illustrate that both the child's right to autonomy and the parents' interest to be informed are important factors that are found valuable by the participants. 
In future guidelines governing children as subjects in genomic research, it would thus be essential to incorporate the child's right to an open future, including the right to receive information on adult-onset genetic disorders.
Eukaryotic acquisition of a bacterial operon
USDA-ARS's Scientific Manuscript database
The yeast Saccharomyces cerevisiae is one of the champions of basic biomedical research due to its compact eukaryotic genome and ease of experimental manipulation. Despite these immense strengths, its impact on understanding the genetic basis of natural phenotypic variation has been limited by strai...
Genomic medicine in the military
De Castro, Mauricio; Biesecker, Leslie G; Turner, Clesson; Brenner, Ruth; Witkop, Catherine; Mehlman, Maxwell; Bradburne, Chris; Green, Robert C
2016-01-01
The announcement of the Precision Medicine Initiative was an important step towards establishing the use of genomic information as part of the wider practice of medicine. The US military has been exploring the role that genomic information will have in health care for service members (SMs) and its integration into the continuum of military medicine. An important part of the process is establishing robust safeguards to protect SMs from genetic discrimination in the era of exome/genome sequencing. PMID:29263806
Palmer, Jessica Elizabeth
2012-01-01
Should consumers be able to obtain information about their own bodies, even if it has no proven medical value? Direct-to-consumer ("DTC") genomic companies offer consumers two services: generation of the consumer's personal genetic sequence, and interpretation of that sequence in light of current research. Concerned that consumers will misunderstand genomic information and make ill-advised health decisions, regulators, legislators and scholars have advocated restricted access to DTC genomic services. The Food and Drug Administration, which has historically refrained from regulating most genetic tests, has announced its intent to treat DTC genomic services as medical devices because they make "medical claims." This Article argues that FDA regulation of genomic services as medical devices would be counterproductive. Clinical laboratories conducting genetic tests are already overseen by a federal regime administered by the Centers for Medicare and Medicaid Services. While consumers and clinicians would benefit from clearer communication of test results and their health implications, FDA's gatekeeping framework is ill-suited to weigh the safety and efficacy of genomic information that is not medically actionable in traditional ways. Playing gatekeeper would burden FDA's resources, conflict with the patient-empowering policies promoted by personalized medicine initiatives, impair individuals' access to information in which they have powerful autonomy interests, weaken novel participatory research infrastructures, and set a poor precedent for the future regulation of medical information. 
Rather than applying its risk-based regulatory framework to genetic information, FDA should ameliorate regulatory uncertainty by working with the Federal Trade Commission and Centers for Medicare and Medicaid Services to ensure that DTC genomic services deliver analytically valid data, market and implement their services in a truthful manner, and fully disclose the limitations of their services. Federal agencies with relevant expertise should collaborate on standards and best practices for interpreting genetic information in light of scientific uncertainty, and an adverse event reporting system should be established to collect empirical data verifying or disproving the speculative harms resulting from individual access to genetic information. Most of all, FDA should take advantage of this opportunity to adapt its regulatory process to an increasingly informational health ecosystem.
Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G
2008-04-01
The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome-genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I-III in one clade, while plastome IV appears to be closest to the common ancestor.
Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V.; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G.
2008-01-01
The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome–genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I–III in one clade, while plastome IV appears to be closest to the common ancestor. PMID:18299283
Genome resource banking for wildlife research, management, and conservation.
Wildt, D E
2000-01-01
Cryobiology offers an important opportunity to assist in the management and study of wildlife, including endangered species. The benefits of developing genome resource banks for wildlife are profound, perhaps more so than for traditional uses in terms of livestock and human fertility. In addition to preserving heterozygosity and assisting in the genetic management of rare populations held in captivity, frozen repositories help insure wild populations against natural and human-induced catastrophes. Such banks also are an invaluable source of new knowledge (for basic and applied research) from thousands of species that have yet to be studied. However, it is crucial that genome resource banks for wildlife species be developed in a coordinated fashion that first benefits the conservation of biodiversity. Spurious collections will be of no advantage to genuine conservation. The Conservation Breeding Specialist Group (CBSG; of the International Union for the Conservation of Nature and Natural Resources' Species Survival Commission) has promoted international dialogue on this topic. CBSG working groups have recognized that such repositories be developed according to specific, scientific guidelines consistent with an international standard that ensures practicality, high-quality ethics, and cost-effectiveness. Areas requiring priority attention also are reviewed, including the need for more basic research, advocacy, and support for developing organized repositories of biomaterials representing the world's diverse biota.
Emmons-Bell, Maya; Durant, Fallon; Hammelman, Jennifer; Bessonov, Nicholas; Volpert, Vitaly; Morokuma, Junji; Pinet, Kaylinnette; Adams, Dany S.; Pietak, Alexis; Lobo, Daniel; Levin, Michael
2015-01-01
The shape of an animal body plan is constructed from protein components encoded by the genome. However, bioelectric networks composed of many cell types have their own intrinsic dynamics, and can drive distinct morphological outcomes during embryogenesis and regeneration. Planarian flatworms are a popular system for exploring body plan patterning due to their regenerative capacity, but despite considerable molecular information regarding stem cell differentiation and basic axial patterning, very little is known about how distinct head shapes are produced. Here, we show that after decapitation in G. dorotocephala, a transient perturbation of physiological connectivity among cells (using the gap junction blocker octanol) can result in regenerated heads with quite different shapes, stochastically matching other known species of planaria (S. mediterranea, D. japonica, and P. felina). We use morphometric analysis to quantify the ability of physiological network perturbations to induce different species-specific head shapes from the same genome. Moreover, we present a computational agent-based model of cell and physical dynamics during regeneration that quantitatively reproduces the observed shape changes. Morphological alterations induced in a genomically wild-type G. dorotocephala during regeneration include not only the shape of the head but also the morphology of the brain, the characteristic distribution of adult stem cells (neoblasts), and the bioelectric gradients of resting potential within the anterior tissues. Interestingly, the shape change is not permanent; after regeneration is complete, intact animals remodel back to G. dorotocephala-appropriate head shape within several weeks in a secondary phase of remodeling following initial complete regeneration. We present a conceptual model to guide future work to delineate the molecular mechanisms by which bioelectric networks stochastically select among a small set of discrete head morphologies. 
Taken together, these data and analyses shed light on important physiological modifiers of morphological information in dictating species-specific shape, and reveal them to be a novel instructive input into head patterning in regenerating planaria. PMID:26610482
NASA Astrophysics Data System (ADS)
Serra, Reviewed By Martin J.
2000-01-01
Genomics is one of the most rapidly expanding areas of science. This book is an outgrowth of a series of lectures given by one of the former heads (CRC) of the Human Genome Initiative. The book is designed to reach a wide audience, from biologists with little chemical or physical science background through engineers, computer scientists, and physicists with little current exposure to the chemical or biological principles of genetics. The text starts with a basic review of the chemical and biological properties of DNA. However, without either a biochemistry background or a supplemental biochemistry text, this chapter and much of the rest of the text would be difficult to digest. The second chapter is designed to put DNA into the context of the larger chromosomal unit. Specialized chromosomal structures and sequences (centromeres, telomeres) are introduced, leading to a section on chromosome organization and purification. The next 4 chapters cover the physical (hybridization, electrophoresis), chemical (polymerase chain reaction), and biological (genetic) techniques that provide the backbone of genomic analysis. These chapters cover in significant detail the fundamental principles underlying each technique and provide a firm background for the remainder of the text. Chapters 7-9 consider the need and methods for the development of physical maps. Chapter 7 primarily discusses chromosomal localization techniques, including in situ hybridization, FISH, and chromosome paintings. The next two chapters focus on the development of libraries and clones. In particular, Chapter 9 considers the limitations of current mapping and clone production. The current state and future of DNA sequencing is covered in the next three chapters. The first considers the current methods of DNA sequencing - especially gel-based methods of analysis, although other possible approaches (mass spectrometry) are introduced.
Much of the chapter addresses the limitations of current methods, including analysis of error in sequencing and current bottlenecks in the sequencing effort. The next chapter describes the steps necessary to scale current technologies for the sequencing of entire genomes. Chapter 12 examines alternate methods for DNA sequencing. Initially, methods of single-molecule sequencing and sequencing by microscopy are introduced; the majority of the chapter is devoted to the development of DNA sequencing methods using chip microarrays and hybridization. The remaining chapters (13-15) consider the uses and analysis of DNA sequence information. The initial focus is on the identification of genes. Several examples are given of the use of DNA sequence information for diagnosis of inherited or infectious diseases. The sequence-specific manipulation of DNA is discussed in Chapter 14. The final chapter deals with the implications of large-scale sequencing, including methods for identifying genes and finding errors in DNA sequences, to the development of computer algorithms for the interpretation of DNA sequence information. The text figures are black and white line drawings that, although clearly done, seem a bit primitive for 1999. While I appreciated the simplicity of the drawings, many students accustomed to more colorful presentations will find them wanting. The four color figures in the center of the text seem an afterthought and add little to the text's clarity. Each chapter has a set of additional reading sources, mostly primary sources. Often, specialized topics are offset into boxes that provide clarification and amplification without cluttering the text. An appendix includes a list of the Web-based database resources. As an undergraduate instructor who has previously taught biochemistry, molecular biology, and a course on the human genome, I found many interesting tidbits and amplifications throughout the text. 
I would recommend this book as a text for an advanced undergraduate or beginning graduate course in genomics. Although the text works through several examples of genetic and genome analysis, additional problem/homework sets would need to be developed to ensure student comprehension. The text steers clear of the ethical implications of the Human Genome Initiative and remains true to its subtitle, The Science and Technology.
Seo, Joann; Ivanovich, Jennifer; Goodman, Melody S; Biesecker, Barbara B; Kaphingst, Kimberly A
2017-06-01
We investigated what information women diagnosed with breast cancer at a young age would want to learn when genome sequencing results are returned. We conducted 60 semi-structured interviews with women diagnosed with breast cancer at age 40 or younger. We examined what specific information participants would want to learn across result types and for each type of result, as well as how much information they would want. Genome sequencing was not offered to participants as part of the study. Two coders independently coded interview transcripts; analysis was conducted using NVivo10. Across result types, participants wanted to learn about health implications, risk and prevalence in quantitative terms, causes of variants, and causes of diseases. Participants wanted to learn actionable information for variants affecting risk of preventable or treatable disease, medication response, and carrier status. The amount of desired information differed for variants affecting risk of unpreventable or untreatable disease, with uncertain significance, and not health-related. Women diagnosed with breast cancer at a young age recognize the value of genome sequencing results in identifying potential causes and effective treatments and expressed interest in using the information to help relatives and to further understand their other health risks. Our findings can inform the development of effective feedback strategies for genome sequencing that meet patients' information needs and preferences.
Lee, Chi-Ching; Chen, Yi-Ping Phoebe; Yao, Tzu-Jung; Ma, Cheng-Yu; Lo, Wei-Cheng; Lyu, Ping-Chiang; Tang, Chuan Yi
2013-04-10
Sequencing of microbial genomes is important because microbes carry antibiotic and pathogenetic activities. However, even with the help of new assembling software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes whose sequencing is still ongoing. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of the ongoing genome projects in contigs or scaffolds can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project. Copyright © 2012 Elsevier B.V. All rights reserved.
A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.
Vandepoele, Klaas
2017-01-01
PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .
Kaphingst, Kimberly A.; Blanchard, Melvin; Milam, Laurel; Pokharel, Manusheela; Elrick, Ashley; Goodman, Melody S.
2017-01-01
The increasing importance of genomic information in clinical care heightens our need to examine how individuals understand, value, and communicate about this information. Based on a conceptual framework of genomics-related health literacy, we examined whether health literacy was related to knowledge, self-efficacy, and perceived importance of genetics and FHH and communication about FHH in a medically underserved population. The analytic sample was comprised of 624 patients at a primary care clinic at a large urban hospital. About half of participants (47%) had limited health literacy; 55% had no education beyond high school and 58% were Black. In multivariable models, limited health literacy was associated with lower genetic knowledge (β=−0.55; SE=0.10, p<.0001), lower awareness of FHH (OR=0.50; 95% CI=0.28,0.90, p=.020), greater perceived importance of genetic information (OR=1.95; 95% CI=1.27,3.00, p=.0022) but lower perceived importance of FHH information (OR=0.47; 95% CI=0.26,0.86, p=.013), and more frequent communication with a doctor about FHH (OR=2.02; 95% CI=1.27,3.23, p=.0032). The findings highlight the importance of considering domains of genomics-related health literacy (e.g., knowledge, oral literacy) in developing educational strategies for genomic information. Health literacy research is essential to avoid increasing disparities in information and health outcomes as genomic information reaches more patients. PMID:27043759
Compositional patterns in the genomes of unicellular eukaryotes
2013-01-01
Background The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentration (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. Results In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different but all genomes tested show a compositional compartmentalization. Conclusions The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Both features are not surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes. PMID:24188247
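The compositional measure this abstract relies on, the GC level of a genome and of stretches within it, can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the fixed non-overlapping windows, and the choice to ignore ambiguous bases are all assumptions made for illustration.

```python
def gc_fraction(seq):
    """Return the G+C fraction among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq, window, step=None):
    """Yield (start, gc_fraction) for fixed-size windows along seq.

    Windows do not overlap unless a smaller step is given; a trailing
    partial window is dropped.
    """
    step = step or window
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Toy sequence (hypothetical): an AT-rich stretch followed by a GC-rich one.
toy = "ATATATATAT" + "GCGCGCGCGC"
for start, gc in windowed_gc(toy, window=10):
    print(start, gc)
```

A genome whose windows all return similar values would be compositionally homogeneous at that window size; widely differing values are what a compartmentalized pattern would look like in this simplified view.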
Haraksingh, Rajini R.; Abyzov, Alexej; Gerstein, Mark; Urban, Alexander E.; Snyder, Michael
2011-01-01
Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications. PMID:22140474
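As a rough illustration of how sensitivity against a gold-standard CNV set can be computed, the sketch below counts a gold-standard CNV as detected when some call overlaps it reciprocally by at least 50%. The abstract does not state the matching criterion used in the study, so the 50% reciprocal-overlap rule, the half-open (chromosome, start, end) tuples, and the function names are assumptions.

```python
def reciprocal_overlap(a, b):
    """Return the smaller of the two overlap fractions between intervals a and b.

    Each interval is a (chromosome, start, end) tuple with start < end,
    treated as half-open. Intervals on different chromosomes score 0.0.
    """
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    if chrom_a != chrom_b:
        return 0.0
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0
    return min(overlap / (end_a - start_a), overlap / (end_b - start_b))


def sensitivity(gold, calls, min_overlap=0.5):
    """Fraction of gold-standard CNVs matched by at least one call."""
    if not gold:
        raise ValueError("gold standard set is empty")
    detected = sum(
        1
        for g in gold
        if any(reciprocal_overlap(g, c) >= min_overlap for c in calls)
    )
    return detected / len(gold)


# Hypothetical toy data, not taken from the study.
gold = [("chr1", 100, 200), ("chr1", 500, 700), ("chr2", 100, 200)]
calls = [("chr1", 120, 210), ("chr1", 690, 900), ("chr3", 100, 200)]
print(sensitivity(gold, calls))
```

Requiring the overlap to be reciprocal prevents a very large call from being credited with detecting every small gold-standard CNV it happens to span.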
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies.
Card, Daren C; Schield, Drew R; Reyes-Velasco, Jacobo; Fujita, Matthew K; Andrew, Audra L; Oyler-McCance, Sara J; Fike, Jennifer A; Tomback, Diana F; Ruggiero, Robert P; Castoe, Todd A
2014-01-01
As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5-5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies
Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.
2014-01-01
As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (~3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
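Two quantities mentioned in this abstract, sequencing coverage and assembly contiguity, can be made concrete with a small sketch. Mean coverage is total sequenced bases divided by genome size. N50 is one widely used contiguity summary; the abstract does not say which contiguity statistic the authors report, so its use here is an assumption, as are the function names and toy numbers.

```python
def mean_coverage(n_reads, read_length, genome_size):
    """Expected mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical numbers: 40 million 100 bp reads over a 1 Gb genome.
print(mean_coverage(40_000_000, 100, 1_000_000_000))
# Hypothetical contig lengths for a toy assembly.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Comparing N50 between a de novo and a reference-guided assembly of the same reads is one simple way to express a difference in contiguity, although it says nothing about correctness.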
A core viral protein binds host nucleosomes to sequester immune danger signals
Avgousti, Daphne C.; Herrmann, Christin; Kulej, Katarzyna; Pancholi, Neha J.; Sekulic, Nikolina; Petrescu, Joana; Molden, Rosalynn C.; Blumenthal, Daniel; Paris, Andrew J.; Reyes, Emigdio D.; Ostapchuk, Philomena; Hearing, Patrick; Seeholzer, Steven H.; Worthen, G. Scott; Black, Ben E.; Garcia, Benjamin A.; Weitzman, Matthew D.
2016-01-01
Viral proteins mimic host protein structure and function to redirect cellular processes and subvert innate defenses. Small basic proteins compact and regulate both viral and cellular DNA genomes. Nucleosomes are the repeating units of cellular chromatin and play an important role in innate immune responses. Viral encoded core basic proteins compact viral genomes but their impact on host chromatin structure and function remains unexplored. Adenoviruses encode a highly basic protein called protein VII that resembles cellular histones. Although protein VII binds viral DNA and is incorporated with viral genomes into virus particles, it is unknown whether protein VII impacts cellular chromatin. Our observation that protein VII alters cellular chromatin led us to hypothesize that this impacts antiviral responses during adenovirus infection. We found that protein VII forms complexes with nucleosomes and limits DNA accessibility. We identified post-translational modifications on protein VII that are responsible for chromatin localization. Furthermore, proteomic analysis demonstrated that protein VII is sufficient to alter protein composition of host chromatin. We found that protein VII is necessary and sufficient for retention in chromatin of members of the high-mobility group protein B family (HMGB1, HMGB2, and HMGB3). HMGB1 is actively released in response to inflammatory stimuli and functions as a danger signal to activate immune responses. We showed that protein VII can directly bind HMGB1 in vitro and further demonstrated that protein VII expression in mouse lungs is sufficient to decrease inflammation-induced HMGB1 content and neutrophil recruitment in the bronchoalveolar lavage fluid. Together our in vitro and in vivo results show that protein VII sequesters HMGB1 and can prevent its release. This study uncovers a viral strategy in which nucleosome binding is exploited to control extracellular immune signaling. PMID:27362237
Switzer, William M; Salemi, Marco; Qari, Shoukat H; Jia, Hongwei; Gray, Rebecca R; Katzourakis, Aris; Marriott, Susan J; Pryor, Kendle N; Wolfe, Nathan D; Burke, Donald S; Folks, Thomas M; Heneine, Walid
2009-01-01
Background: Human T-lymphotropic virus type 4 (HTLV-4) is a new deltaretrovirus recently identified in a primate hunter in Cameroon. Limited sequence analysis previously showed that HTLV-4 may be distinct from HTLV-1, HTLV-2, and HTLV-3, and their simian counterparts, STLV-1, STLV-2, and STLV-3, respectively. Analysis of full-length genomes can provide basic information on the evolutionary history and replication and pathogenic potential of new viruses. Results: We report here the first complete HTLV-4 sequence obtained by PCR-based genome walking using uncultured peripheral blood lymphocyte DNA from an HTLV-4-infected person. The HTLV-4(1863LE) genome is 8791 bp long and is equidistant from HTLV-1, HTLV-2, and HTLV-3, sharing only 62–71% nucleotide identity. HTLV-4 has a prototypic genomic structure with all enzymatic, regulatory, and structural proteins preserved. Like STLV-2, STLV-3, and HTLV-3, HTLV-4 is missing a third 21-bp transcription element found in the long terminal repeats of HTLV-1 and HTLV-2 but instead contains unique c-Myb and pre B-cell leukemic transcription factor binding sites. As in HTLV-2, the PDZ motif important for cellular signal transduction and transformation in HTLV-1 and HTLV-3 is missing from the C-terminus of the HTLV-4 Tax protein. A basic leucine zipper (b-ZIP) region, located in the antisense strand of HTLV-1 and believed to play a role in viral replication and oncogenesis, was also found in the complementary strand of HTLV-4. Detailed phylogenetic analysis shows that HTLV-4 is clearly a monophyletic viral group. Dating using a relaxed molecular clock inferred that the most recent common ancestor of HTLV-4 and HTLV-2/STLV-2 existed 49,800 to 378,000 years ago, making this the oldest known PTLV lineage. Interestingly, this period coincides with the emergence of Homo sapiens sapiens during the Middle Pleistocene, suggesting that early humans may have been susceptible hosts for the ancestral HTLV-4.
Conclusion: The inferred ancient origin of HTLV-4 coinciding with the appearance of Homo sapiens, the propensity of STLVs to cross species into humans, and the fact that HTLV-1 and -2 spread globally following migrations of ancient populations all suggest that HTLV-4 may be prevalent. Expanded surveillance and clinical studies are needed to better define the epidemiology and public health importance of HTLV-4 infection. PMID:19187529
Beekman, Janine B; Ferrer, Rebecca A; Klein, William M P; Persky, Susan
2016-01-01
Weight-based discrimination negatively influences health, potentially via increased willingness to engage in unhealthful behaviours. This study examines whether the provision of genomic obesity information in a clinical context can lead to less willingness to engage in unhealthy eating and alcohol consumption through a mediated process including reduced perceptions of blame and discrimination. A total of 201 overweight or obese women aged 20-50 interacted with a virtual physician in a simulated clinical primary care environment, which included physician-delivered information that emphasised either genomic or behavioural underpinnings of weight and weight loss. Outcome measures were perceived blame and weight discrimination from the doctor, and willingness to eat unhealthy foods and consume alcohol. Controlling for BMI and race, participants who received genomic information perceived less blame from the doctor than participants who received behavioural information. In a serial multiple mediation model, reduced perceived blame was associated with less perceived discrimination, and in turn, lower willingness to eat unhealthy foods and drink alcohol. Providing patients with genomic information about weight and weight loss may positively influence interpersonal dynamics between patients and providers by reducing perceived blame and perceived discrimination. These improved dynamics, in turn, positively influence health cognitions.
Genomic Signal Processing: Predicting Basic Molecular Biological Principles
NASA Astrophysics Data System (ADS)
Alter, Orly
2005-03-01
Advances in high-throughput technologies enable acquisition of different types of molecular biological data, monitoring the flow of biological information as DNA is transcribed to RNA, and RNA is translated to proteins, on a genomic scale. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment and drug development. Recently we described data-driven models for genome-scale molecular biological data, which use singular value decomposition (SVD) and the comparative generalized SVD (GSVD). Now we describe an integrative data-driven model, which uses pseudoinverse projection (1). We also demonstrate the predictive power of these matrix algebra models (2). The integrative pseudoinverse projection model formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. The mathematical variables of this integrative model, the pseudoinverse correlation patterns that are uncovered in the data, represent independent processes and corresponding cellular states (such as observed genome-wide effects of known regulators or transcription factors, the biological components of the cellular machinery that generate the genomic signals, and measured samples in which these regulators or transcription factors are over- or underactive). Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis.
Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis, and gives a global picture of the correlations and possibly also causal coordination of these two sets of states. Mapping genome-scale protein binding data by pseudoinverse projection onto patterns of RNA expression data that had been extracted by SVD and GSVD predicts a novel correlation between DNA replication initiation and RNA transcription during the cell cycle in yeast, which might be due to a previously unknown mechanism of regulation. (1) Alter & Golub, Proc. Natl. Acad. Sci. USA 101, 16577 (2004). (2) Alter, Golub, Brown & Botstein, Miami Nat. Biotechnol. Winter Symp. 2004 (www.med.miami.edu/mnbws/alter-.pdf)
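The reconstruction step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the basis profiles are the columns of a matrix B with linearly independent columns, so that the pseudoinverse reduces to (B^T B)^-1 B^T, and all function names (transpose, matmul, solve, pseudoinverse_projection) are hypothetical. The data matrix D is expressed in the basis as coefficients C = B^+ D, and B C is the reconstruction of the data in the basis.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def solve(a, b):
    """Solve a @ x = b for square a by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("basis columns are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pseudoinverse_projection(basis, data):
    """Return (coefficients, reconstruction) of data in the basis.

    Assumption: basis has full column rank, so B^+ = (B^T B)^-1 B^T.
    Rows are genes; columns are basis profiles or data samples.
    """
    basis_t = transpose(basis)
    coeffs = solve(matmul(basis_t, basis), matmul(basis_t, data))
    return coeffs, matmul(basis, coeffs)


if __name__ == "__main__":
    # Toy example: 3 genes, 2 basis profiles, 1 data sample.
    basis = [[1, 0], [0, 1], [0, 0]]
    data = [[2], [3], [5]]
    coeffs, recon = pseudoinverse_projection(basis, data)
    print(coeffs)  # coefficients of the sample in the basis
    print(recon)   # the part of the sample that the basis can represent
```

In the toy example the third gene has no counterpart in the basis, so its signal is absent from the reconstruction. This mirrors the statement that reconstruction shows only the cellular states in the data that correspond to those of the basis.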
Close Encounters - Probing Proximal Proteins in Live or Fixed Cells.
Lönn, Peter; Landegren, Ulf
2017-07-01
The well-oiled machinery of the cellular proteome operates via variable expression, modifications, and interactions of proteins, relaying genomic and transcriptomic information to coordinate cellular functions. In recent years, a number of techniques have emerged that serve to identify sets of proteins acting in close proximity in the course of orchestrating cellular activities. These proximity-dependent assays, including BiFC, BioID, APEX, FRET, and isPLA, have opened up new avenues to examine protein interactions in live or fixed cells. We review herein the current status of proximity-dependent in situ techniques. We compare the advantages and limitations of the methods, underlining recent progress and the growing importance of these techniques in basic research, and we discuss their potential as tools for drug development and diagnostics.
The fuzzy polynucleotide space: basic properties.
Torres, Angela; Nieto, Juan J
2003-03-22
Any triplet codon may be regarded as a 12-dimensional fuzzy code. Sufficient information about a particular sequence may not be available in certain situations. The investigator will be confronted with imprecise sequences, yet want to make comparisons of sequences. Fuzzy polynucleotides can be compared by using geometrical interpretation of fuzzy sets as points in a hypercube. We introduce the space of fuzzy polynucleotides and a means of measuring dissimilarities between them. We establish mathematical principles to measure dissimilarities between fuzzy polynucleotides and present several examples in this metric space. We calculate the frequencies of the nucleotides at the three base sites of a codon in the coding sequences of Escherichia coli K-12 and Mycobacterium tuberculosis H37Rv, and consider them as points in that fuzzy space. We compute the distance between the genomes of E. coli and M. tuberculosis.
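The 12-dimensional representation follows from three codon positions times four possible bases. The sketch below shows one way to place codons in the unit hypercube and compare them. The abstract does not state the distance formula, so the ratio used here (sum of absolute differences divided by sum of coordinate-wise maxima, a distance often used for fuzzy sets) is an assumption, as are the base ordering BASES and the function names codon_to_point and fuzzy_distance.

```python
BASES = "TCAG"  # assumed coordinate order within each codon position


def codon_to_point(codon):
    """Encode a precisely known codon as a 12-dimensional 0/1 point."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in BASES for base in codon):
        raise ValueError("expected a three-letter codon over T, C, A, G")
    point = []
    for base in codon:
        # Four coordinates per position: 1.0 for the observed base, else 0.0.
        point.extend(1.0 if base == b else 0.0 for b in BASES)
    return point


def fuzzy_distance(p, q):
    """Assumed dissimilarity: sum |p_i - q_i| / sum max(p_i, q_i)."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    denominator = sum(max(a, b) for a, b in zip(p, q))
    if denominator == 0:
        return 0.0  # both points are the origin
    return sum(abs(a - b) for a, b in zip(p, q)) / denominator


if __name__ == "__main__":
    atg = codon_to_point("ATG")
    atc = codon_to_point("ATC")
    print(fuzzy_distance(atg, atg))  # identical codons
    print(fuzzy_distance(atg, atc))  # codons differing at one position

    # An imprecise codon: third base equally likely to be C or G.
    imprecise = codon_to_point("ATG")
    imprecise[8:12] = [0.0, 0.5, 0.0, 0.5]
    print(fuzzy_distance(atg, imprecise))
```

Coordinates strictly between 0 and 1 represent uncertain bases or, as in the genome comparison mentioned in the abstract, observed nucleotide frequencies at each codon position.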
Models of chromatin spatial organisation in the cell nucleus
NASA Astrophysics Data System (ADS)
Nicodemi, Mario
2014-03-01
In the cell nucleus, chromosomes have a complex architecture serving vital functional purposes. Recent experiments have started unveiling the interaction map of DNA sites genome-wide, revealing different levels of organisation at different scales. The principles that orchestrate such a complex 3D structure, though, remain mysterious. I will give an overview of the scenario emerging from some classical polymer physics models of the general aspects of chromatin spatial organisation. The available experimental data, which can be rationalised in a single framework, support a picture where chromatin is a complex mixture of differently folded regions, self-organised across spatial scales according to basic physical mechanisms. I will also discuss applications to specific DNA loci, e.g. the HoxB locus, where models informed by biological details and tested against targeted experiments can help identify the determinants of folding.