Science.gov

Sample records for additional genomic regions

  1. New complete mitochondrial genome of the Perccottus glenii (Perciformes, Odontobutidae): additional non-coding region.

    PubMed

    Chen, Xiaohui; Shi, Yangbai; Zhong, Liqiang; Wang, Minghua; Sun, Lihui; Yang, Guoliang

    2016-05-01

    Perccottus glenii is a species of freshwater sleeper native to the Russian Far East, north-eastern China, and the northern part of the Korean Peninsula, with introduced populations in other regions of Eurasia. In this study, a new complete mitochondrial genome of Perccottus glenii is reported. The circular genome is 16,510 bp in length and consists of 13 protein-coding genes, 22 tRNA genes, 2 ribosomal RNA genes, and 1 control region. Besides the origin of light-strand replication (OL), an additional non-coding region was present between ND6 and tRNA-Glu on the light strand. The overall nucleotide composition was 30.5% A, 29.2% T, 24.4% C and 15.9% G, with an A + T bias of 59.7%. The gene composition and the structural arrangement of the P. glenii complete mtDNA were identical to those of most other vertebrates. The molecular data presented here could be useful for studying the evolutionary relationships and population genetics of Odontobutidae fishes. PMID:25329281
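    The composition figures in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are assumptions, and the `reported` dictionary simply restates the percentages quoted in the abstract rather than recomputing them from the actual sequence.

```python
from collections import Counter

def base_composition(seq):
    """Percentage of each of A, C, G, T in a DNA sequence (other symbols are ignored)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "ACGT"}

def at_content(composition):
    """A + T percentage from a composition dictionary."""
    return composition["A"] + composition["T"]

# Percentages as quoted in the abstract (not derived from the genome here).
reported = {"A": 30.5, "T": 29.2, "C": 24.4, "G": 15.9}
print(round(at_content(reported), 1))
```

    Applied to a real mitochondrial sequence, `base_composition` would give the per-base percentages directly, and `at_content` the A + T bias.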

  2. Atypical regions in large genomic DNA sequences

    SciTech Connect

    Scherer, S.; McPeek, M.S.; Speed, T.P.

    1994-07-19

    Large genomic DNA sequences contain regions with distinctive patterns of sequence organization. The authors describe a method using logarithms of probabilities based on seventh-order Markov chains to rapidly identify genomic sequences that do not resemble models of genome organization built from compilations of octanucleotide usage. Data bases have been constructed from Escherichia coli and Saccharomyces cerevisiae DNA sequences of >1000 nt and human sequences of >10,000 nt. Atypical genes and clusters of genes have been located in bacteriophage, yeast, and primate DNA sequences. The authors consider criteria for statistical significance of the results, offer possible explanations for the observed variation in genome organization, and give additional applications of these methods in DNA sequence analysis.
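    The idea of scoring a sequence by log-probabilities under a Markov chain can be sketched as follows. This is not the authors' implementation: it assumes a second-order chain for readability (the abstract uses seventh-order chains), and the pseudocount, window size, step, and function names are all illustrative assumptions.

```python
import math
from collections import defaultdict

def train_markov(sequences, order=2, pseudocount=1.0):
    """Return a function giving log P(base | previous `order` bases), estimated from counts."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def log_prob(context, base):
        ctx = counts.get(context, {})
        total = sum(ctx.values()) + 4 * pseudocount  # 4 possible bases
        return math.log((ctx.get(base, 0.0) + pseudocount) / total)

    return log_prob

def window_scores(seq, log_prob, order=2, window=20, step=10):
    """Mean per-base log-probability in sliding windows; low values flag atypical regions."""
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        total = sum(log_prob(chunk[i - order:i], chunk[i]) for i in range(order, window))
        scores.append((start, total / (window - order)))
    return scores
```

    Windows whose score falls well below the rest of the sequence do not resemble the training model; how far below counts as significant is the question of statistical criteria that the abstract mentions.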

  3. The opossum MHC genomic region revisited.

    PubMed

    Krasnec, Katina V; Sharp, Alana R; Williams, Tracey L; Miller, Robert D

    2015-04-01

    The gray short-tailed opossum Monodelphis domestica is one of the few marsupial species for which a high quality whole genome sequence is available and the major histocompatibility complex (MHC) region has been annotated. Previous analyses revealed only a single locus within the opossum MHC region, designated Modo-UA1, with the features expected for encoding a functionally classical class I α-chain. Nine other class I genes found within the MHC are highly divergent and have features usually associated with non-classical roles. The original annotation, however, was based on an early version of the opossum genome assembly. More recent analyses of allelic variation in individual opossums revealed too many Modo-UA1 sequences per individual to be accounted for by a single MHC class I locus found in the genome assembly. A reanalysis of a later generation assembly, MonDom5, revealed the presence of two additional loci, now designated Modo-UA3 and UA4, in a region that was expanded and more complete than in the earlier assembly. Modo-UA1, UA3, and UA4 are all transcribed, although Modo-UA4 transcripts are rarer. Modo-UA4 is also relatively non-polymorphic. The evidence presented supports the accuracy of the later assembly and the existence of three related class I genes in the opossum, making opossums more typical of mammals and most tetrapods by having multiple apparent classical MHC class I loci. PMID:25737310

  4. REEF: searching REgionally Enriched Features in genomes

    PubMed Central

    Coppe, Alessandro; Danieli, Gian Antonio; Bortoluzzi, Stefania

    2006-01-01

    Background In Eukaryotic genomes, different features including genes are not uniformly distributed. The integration of annotation information and genomic position of functional DNA elements in the Eukaryotic genomes opened the way to test novel hypotheses of higher order genome organization and regulation of expression. Results REEF is a new tool, aimed at identifying genomic regions enriched in specific features, such as a class or group of genes homogeneous for expression and/or functional characteristics. The method for the calculation of local feature enrichment uses a test statistic based on the Hypergeometric Distribution, applied genome-wide by using a sliding window approach and adopting the False Discovery Rate for controlling multiplicity. REEF software, source code and documentation are freely available at . Conclusion REEF can help shed light on the role of organization of specific genomic regions in the determination of their functional role. PMID:17042935
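    A minimal sketch of the approach the abstract describes (a hypergeometric test in sliding windows, followed by False Discovery Rate control) is given below. It is not REEF's code: windows are defined here by gene count rather than base pairs, the Benjamini-Hochberg procedure is assumed as the FDR method, and all function names are hypothetical.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K featured items, n drawn)."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def window_pvalues(flags, window):
    """flags[i] is True when gene i (in chromosomal order) carries the feature of interest."""
    N, K = len(flags), sum(flags)
    return [hypergeom_sf(sum(flags[s:s + window]), N, K, window)
            for s in range(N - window + 1)]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

    Windows whose adjusted p-value falls below a chosen FDR threshold would be reported as enriched; overlapping windows are correlated, so the adjusted values should be read as approximate.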

  5. regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests

    PubMed Central

    Gel, Bernat; Díez-Villanueva, Anna; Serra, Eduard; Buschbeck, Marcus; Peinado, Miguel A.; Malinverni, Roberto

    2016-01-01

    Motivation: Statistically assessing the relation between a set of genomic regions and other genomic features is a common challenging task in genomic and epigenomic analyses. Randomization based approaches implicitly take into account the complexity of the genome without the need of assuming an underlying statistical model. Summary: regioneR is an R package that implements a permutation test framework specifically designed to work with genomic regions. In addition to the predefined randomization and evaluation strategies, regioneR is fully customizable allowing the use of custom strategies to adapt it to specific questions. Finally, it also implements a novel function to evaluate the local specificity of the detected association. Availability and implementation: regioneR is an R package released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/regioneR). Contact: rmalinverni@carrerasresearch.org PMID:26424858
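    The permutation-test idea can be illustrated with a short sketch. This is not regioneR's API (regioneR is an R package); it is a Python illustration under simplifying assumptions: a single linear genome, half-open intervals, uniform random re-placement as the randomization strategy, and overlap count as the evaluation function. All names are hypothetical.

```python
import random

def count_overlaps(regions_a, regions_b):
    """Number of regions in A that overlap at least one region in B (half-open intervals)."""
    return sum(any(a0 < b1 and b0 < a1 for b0, b1 in regions_b) for a0, a1 in regions_a)

def randomize(regions, genome_length, rng):
    """Move each region to a uniformly random position, preserving its length."""
    out = []
    for start, end in regions:
        length = end - start
        new_start = rng.randint(0, genome_length - length)
        out.append((new_start, new_start + length))
    return out

def permutation_test(regions_a, regions_b, genome_length, n_perm=1000, seed=0):
    """Return the observed overlap count and a one-sided empirical p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(regions_a, regions_b)
    permuted = [count_overlaps(randomize(regions_a, genome_length, rng), regions_b)
                for _ in range(n_perm)]
    as_extreme = sum(p >= observed for p in permuted)
    return observed, (as_extreme + 1) / (n_perm + 1)
```

    Swapping in a different `randomize` or a different evaluation function corresponds to the kind of customization the abstract describes, although the real package's strategies are more elaborate.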

  6. The Probabilistic Admissible Region with Additional Constraints

    NASA Astrophysics Data System (ADS)

    Roscoe, C.; Hussein, I.; Wilkins, M.; Schumacher, P.

    The admissible region, in the space surveillance field, is defined as the set of physically acceptable orbits (e.g., orbits with negative energies) consistent with one or more observations of a space object. Given additional constraints on orbital semimajor axis, eccentricity, etc., the admissible region can be constrained, resulting in the constrained admissible region (CAR). Based on known statistics of the measurement process, one can replace hard constraints with a probabilistic representation of the admissible region. This results in the probabilistic admissible region (PAR), which can be used for orbit initiation in Bayesian tracking and prioritization of tracks in a multiple hypothesis tracking framework. The PAR concept was introduced by the authors at the 2014 AMOS conference. In that paper, a Monte Carlo approach was used to show how to construct the PAR in the range/range-rate space based on known statistics of the measurement, semimajor axis, and eccentricity. An expectation-maximization algorithm was proposed to convert the particle cloud into a Gaussian Mixture Model (GMM) representation of the PAR. This GMM can be used to initialize a Bayesian filter. The PAR was found to be significantly non-uniform, invalidating an assumption frequently made in CAR-based filtering approaches. Using the GMM or particle cloud representations of the PAR, orbits can be prioritized for propagation in a multiple hypothesis tracking (MHT) framework. In this paper, the authors focus on expanding the PAR methodology to allow additional constraints, such as a constraint on perigee altitude, to be modeled in the PAR. This requires re-expressing the joint probability density function for the attributable vector as well as the (constrained) orbital parameters and range and range-rate. The final PAR is derived by accounting for any interdependencies between the parameters. Noting that the concepts presented are general and can be applied to any measurement scenario, the idea

  7. Unraveling Additive from Nonadditive Effects Using Genomic Relationship Matrices

    PubMed Central

    Muñoz, Patricio R.; Resende, Marcio F. R.; Gezan, Salvador A.; Resende, Marcos Deon Vilela; de los Campos, Gustavo; Kirst, Matias; Huber, Dudley; Peter, Gary F.

    2014-01-01

    The application of quantitative genetics in plant and animal breeding has largely focused on additive models, which may also capture dominance and epistatic effects. Partitioning genetic variance into its additive and nonadditive components using pedigree-based models (P-genomic best linear unbiased predictor) (P-BLUP) is difficult with most commonly available family structures. However, the availability of dense panels of molecular markers makes possible the use of additive- and dominance-realized genomic relationships for the estimation of variance components and the prediction of genetic values (G-BLUP). We evaluated height data from a multifamily population of the tree species Pinus taeda with a systematic series of models accounting for additive, dominance, and first-order epistatic interactions (additive by additive, dominance by dominance, and additive by dominance), using either pedigree- or marker-based information. We show that, compared with the pedigree, use of realized genomic relationships in marker-based models yields a substantially more precise separation of additive and nonadditive components of genetic variance. We conclude that the marker-based relationship matrices in a model including additive and nonadditive effects performed better, improving breeding value prediction. Moreover, our results suggest that, for tree height in this population, the additive and nonadditive components of genetic variance are similar in magnitude. This novel result improves our current understanding of the genetic control and architecture of a quantitative trait and should be considered when developing breeding strategies. PMID:25324160
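    For readers unfamiliar with realized genomic relationships, the sketch below builds an additive relationship matrix from marker genotypes. The abstract does not state which estimator was used; this example assumes the widely used VanRaden-style form G = ZZ'/(2 Σ p(1 − p)), with genotypes coded 0/1/2 and allele frequencies estimated from the sample. The function name is hypothetical, and the dominance matrix is not shown.

```python
def additive_grm(genotypes):
    """genotypes[i][j] in {0, 1, 2}: copies of the counted allele for individual i at marker j."""
    n, m = len(genotypes), len(genotypes[0])
    # Sample allele frequency at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Center each genotype by twice the allele frequency.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
            for zi in centered]
```

    In a G-BLUP analysis this matrix would take the place of the pedigree-based relationship matrix in the mixed model.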

  8. Genomic regions associated with kyphosis in swine

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: A back curvature defect similar to kyphosis in humans has been observed in swine herds. The defect ranges from mild to severe curvature of the thoracic vertebrae in split carcasses and has an estimated heritability of 0.3. The objective of this study was to identify genomic regions that...

  9. Harnessing genomics to improve health in the Eastern Mediterranean Region - an executive course in genomics policy.

    PubMed

    Acharya, Tara; Rab, Mohammed Abdur; Singer, Peter A; Daar, Abdallah S

    2005-01-21

    BACKGROUND: While innovations in medicine, science and technology have resulted in improved health and quality of life for many people, the benefits of modern medicine continue to elude millions of people in many parts of the world. To assess the potential of genomics to address health needs in EMR, the World Health Organization's Eastern Mediterranean Regional Office and the University of Toronto Joint Centre for Bioethics jointly organized a Genomics and Public Health Policy Executive Course, held September 20th-23rd, 2003, in Muscat, Oman. The 4-day course was sponsored by WHO-EMRO with additional support from the Canadian Program in Genomics and Global Health. The overall objective of the course was to collectively explore how to best harness genomics to improve health in the region. This article presents the course findings and recommendations for genomics policy in EMR. METHODS: The course brought together senior representatives from academia, biotechnology companies, regulatory bodies, media, voluntary, and legal organizations to engage in discussion. Topics covered included scientific advances in genomics, followed by innovations in business models, public sector perspectives, ethics, legal issues and national innovation systems. 
    RESULTS: A set of recommendations, summarized below, was formulated for the Regional Office, the Member States and for individuals. * Advocacy for genomics and biotechnology for political leadership; * Networking between member states to share information, expertise, training, and regional cooperation in biotechnology; coordination of national surveys for assessment of health biotechnology innovation systems, science capacity, government policies, legislation and regulations, intellectual property policies, private sector activity; * Creation in each member country of an effective National Body on genomics, biotechnology and health to: - formulate national biotechnology strategies - raise biotechnology awareness - encourage teaching and

  10. Genomic distance entrained clustering and regression modelling highlights interacting genomic regions contributing to proliferation in breast cancer

    PubMed Central

    2010-01-01

    Background Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations means that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression. Results We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22. Conclusions Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. 
    Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint the key genes and interactions

  11. Genomic in situ hybridization analysis of Thinopyrum chromatin in a wheat-Th. intermedium partial amphiploid and six derived chromosome addition lines

    PubMed

    Chen; Conner; Laroche; Ji; Armstrong; Fedak

    1999-12-01

    The genomic origin of alien chromosomes present in a wheat-Thinopyrum intermedium partial amphiploid TAF46 (2n = 8x = 56) and six derived chromosome addition lines was analyzed by genomic in situ hybridization (GISH) using S genomic DNA from Pseudoroegneria strigosa (2n = 2x = 14, SS) as a probe. The GISH analysis clearly showed that the chromosome complement of the partial amphiploid TAF46 consists of an entire wheat genome plus one synthetic genome consisting of a mixture of six S genome chromosomes and eight J (=E) genome chromosomes derived from Th. intermedium (2n = 6x = 42, JJJ(s)J(s)SS). There were no Js genome chromosomes present in TAF46. The J genome chromosomes present in TAF46 displayed a unique GISH hybridization pattern with the S genomic DNA probe, in which S genome DNA strongly hybridized at the terminal regions and weakly hybridized over the remaining parts of the chromosomes. This provides a diagnostic marker for distinguishing J genome chromosomes from Js or S genome or wheat ABD genome chromosomes. The genomic origin of the alien chromosomes present in the six derived chromosome addition lines was identified by their characteristic GISH hybridization patterns with the S genomic DNA probe. GISH analysis showed that addition lines L1, L2, L3, and L5 carried one pair of J genome chromosomes, while addition lines L4 and L7 each carried one pair of S genome chromosomes. GISH patterns detected by the S genome probe on addition line L1 were identical to those of the J genome chromosomes present in the partial amphiploid TAF46, suggesting that these chromosomes were not structurally altered when they were transferred from TAF46 to addition lines. PMID:10659790

  12. Admixture mapping identifies introgressed genomic regions in North American canids.

    PubMed

    vonHoldt, Bridgett M; Kays, Roland; Pollinger, John P; Wayne, Robert K

    2016-06-01

    Hybrid zones typically contain novel gene combinations that can be tested by natural selection in a unique genetic context. Parental haplotypes that increase fitness can introgress beyond the hybrid zone, into the range of parental species. We used the Affymetrix canine SNP genotyping array to identify genomic regions tagged by multiple ancestry informative markers that are more frequent in an admixed population than expected. We surveyed a hybrid zone formed in the last 100 years as coyotes expanded their range into eastern North America. Concomitant with expansion, coyotes hybridized with wolves and some populations became more wolflike, such that coyotes in the northeast have the largest body size of any coyote population. Using a set of 3102 ancestry informative markers, we identified 60 differentially introgressed regions in 44 canines across this admixture zone. These regions are characterized by an excess of exogenous ancestry and, in northeastern coyotes, are enriched for genes affecting body size and skeletal proportions. Further, introgressed wolf-derived alleles have penetrated into Southern US coyote populations. Because no wolves currently exist in this area, these alleles are unlikely to have originated from recent hybridization. Instead, they probably originated from intraspecific gene flow or ancient admixture. We show that grey wolf and coyote admixture has far-reaching effects and, in addition to phenotypically transforming admixed populations, allows for the differential movement of alleles from different parental species to be tested in new genomic backgrounds. PMID:27106273

  13. Powerful methods for detecting introgressed regions from population genomic data.

    PubMed

    Rosenzweig, Benjamin K; Pease, James B; Besansky, Nora J; Hahn, Matthew W

    2016-06-01

    Understanding the types and functions of genes that are able to cross species boundaries, and those that are not, is an important step in understanding the forces maintaining species as largely independent lineages across the remainder of the genome. With large next-generation sequencing data sets we are now able to ask whether introgression has occurred across the genome, and multiple methods have been proposed to detect the signature of such events. Here, we introduce a new summary statistic that can be used to test for introgression, RNDmin, that makes use of the minimum pairwise sequence distance between two population samples relative to divergence to an outgroup. We find that our method offers a modest increase in power over other, related tests, but that all such tests have high power to detect introgressed loci when migration is recent and strong. RNDmin is robust to variation in the mutation rate, and remains reliable even when estimates of the divergence time between sister species are inaccurate. We apply RNDmin to population genomic data from the African mosquitoes Anopheles quadriannulatus and A. arabiensis, identifying three novel candidate regions for introgression. Interestingly, one of the introgressed loci is on the X chromosome, but outside of an inversion separating these two species. Our results suggest that significant, but rare, sharing of alleles is occurring between species that diverged more than 1 million years ago, and that application of these methods to additional systems is likely to reveal similar results. PMID:26945783
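    The statistic described in this abstract can be sketched in a few lines. The abstract does not give the formula, so this example assumes the form RNDmin = d_min / d_out, where d_min is the smallest pairwise distance between sequences from the two populations and d_out is the average of each population's mean distance to the outgroup. Distances are simple proportions of differing sites in aligned sequences, and all function names are hypothetical.

```python
def hamming_fraction(s, t):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(s, t)) / len(s)

def rnd_min(pop_x, pop_y, outgroup):
    """Minimum between-population distance, normalized by mean distance to the outgroup."""
    d_min = min(hamming_fraction(x, y) for x in pop_x for y in pop_y)
    d_xo = sum(hamming_fraction(x, outgroup) for x in pop_x) / len(pop_x)
    d_yo = sum(hamming_fraction(y, outgroup) for y in pop_y) / len(pop_y)
    d_out = (d_xo + d_yo) / 2.0
    return d_min / d_out
```

    Dividing by the outgroup distance is what makes the statistic insensitive to local mutation-rate variation: a slowly mutating window lowers both numerator and denominator. Unusually low values in a genomic window would then be candidates for introgression.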

  14. TAG Sequence Identification of Genomic Regions Using TAGdb.

    PubMed

    Ruperao, Pradeep

    2016-01-01

    Second-generation sequencing (SGS) technology has enabled the sequencing of genomes and identification of genes. However, large complex plant genomes remain particularly difficult for de novo assembly. Access to the vast quantity of raw sequence data may facilitate discoveries; however, the volume of these data makes access difficult. This chapter discusses the Web-based tool TAGdb that enables researchers to identify paired read second-generation DNA sequence data that share identity with a submitted query sequence. The identified reads can be used for PCR amplification of genomic regions to identify genes and promoters without the need for genome assembly. PMID:26519409

  15. Enhancer scanning to locate regulatory regions in genomic loci.

    PubMed

    Buckley, Melissa; Gjyshi, Anxhela; Mendoza-Fandiño, Gustavo; Baskin, Rebekah; Carvalho, Renato S; Carvalho, Marcelo A; Woods, Nicholas T; Monteiro, Alvaro N A

    2016-01-01

    This protocol provides a rapid, streamlined and scalable strategy to systematically scan genomic regions for the presence of transcriptional regulatory regions that are active in a specific cell type. It creates genomic tiles spanning a region of interest that are subsequently cloned by recombination into a luciferase reporter vector containing the simian virus 40 promoter. Tiling clones are transfected into specific cell types to test for the presence of transcriptional regulatory regions. The protocol includes testing of different single-nucleotide polymorphism (SNP) alleles to determine their effect on regulatory activity. This procedure provides a systematic framework for identifying candidate functional SNPs within a locus during functional analysis of genome-wide association studies. This protocol adapts and combines previous well-established molecular biology methods to provide a streamlined strategy, based on automated primer design and recombinational cloning, allowing one to rapidly go from a genomic locus to a set of candidate functional SNPs in 8 weeks. PMID:26658467
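    The tiling step can be illustrated with a small helper that splits a region of interest into overlapping tiles. The abstract does not specify tile sizes or overlaps, so the `tile_size` and `overlap` parameters, the half-open coordinate convention, and the function name are all assumptions made for illustration.

```python
def make_tiles(start, end, tile_size, overlap):
    """Half-open (start, end) tiles covering [start, end); neighbours share `overlap` bases."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be >= 0 and smaller than tile_size")
    step = tile_size - overlap
    tiles = []
    pos = start
    while pos < end:
        tiles.append((pos, min(pos + tile_size, end)))
        if pos + tile_size >= end:  # this tile already reaches the end of the region
            break
        pos += step
    return tiles

print(make_tiles(0, 5000, 2000, 500))
```

    Each tile's coordinates would then feed the automated primer design step mentioned in the abstract.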

  16. Enhancer scanning to locate regulatory regions in genomic loci

    PubMed Central

    Buckley, Melissa; Gjyshi, Anxhela; Mendoza-Fandiño, Gustavo; Baskin, Rebekah; Carvalho, Renato S.; Carvalho, Marcelo A.; Woods, Nicholas T.; Monteiro, Alvaro N.A.

    2016-01-01

    The present protocol provides a rapid, streamlined and scalable strategy to systematically scan genomic regions for the presence of transcriptional regulatory regions active in a specific cell type. It creates genomic tiles spanning a region of interest that are subsequently cloned by recombination into a luciferase reporter vector containing the Simian Virus 40 promoter. Tiling clones are transfected into specific cell types to test for the presence of transcriptional regulatory regions. The protocol includes testing of different SNP (single nucleotide polymorphism) alleles to determine their effect on regulatory activity. This procedure provides a systematic framework to identify candidate functional SNPs within a locus during functional analysis of genome-wide association studies. This protocol adapts and combines previous well-established molecular biology methods to provide a streamlined strategy, based on automated primer design and recombinational cloning to rapidly go from a genomic locus to a set of candidate functional SNPs in eight weeks. PMID:26658467

  17. Harnessing genomics to improve health in the Eastern Mediterranean Region – an executive course in genomics policy

    PubMed Central

    Acharya, Tara; Rab, Mohammed Abdur; Singer, Peter A; Daar, Abdallah S

    2005-01-01

    Background While innovations in medicine, science and technology have resulted in improved health and quality of life for many people, the benefits of modern medicine continue to elude millions of people in many parts of the world. To assess the potential of genomics to address health needs in EMR, the World Health Organization's Eastern Mediterranean Regional Office and the University of Toronto Joint Centre for Bioethics jointly organized a Genomics and Public Health Policy Executive Course, held September 20th–23rd, 2003, in Muscat, Oman. The 4-day course was sponsored by WHO-EMRO with additional support from the Canadian Program in Genomics and Global Health. The overall objective of the course was to collectively explore how to best harness genomics to improve health in the region. This article presents the course findings and recommendations for genomics policy in EMR. Methods The course brought together senior representatives from academia, biotechnology companies, regulatory bodies, media, voluntary, and legal organizations to engage in discussion. Topics covered included scientific advances in genomics, followed by innovations in business models, public sector perspectives, ethics, legal issues and national innovation systems. Results A set of recommendations, summarized below, was formulated for the Regional Office, the Member States and for individuals. • Advocacy for genomics and biotechnology for political leadership; • Networking between member states to share information, expertise, training, and regional cooperation in biotechnology; coordination of national surveys for assessment of health biotechnology innovation systems, science capacity, government policies, legislation and regulations, intellectual property policies, private sector activity; • Creation in each member country of an effective National Body on genomics, biotechnology and health to: - formulate national biotechnology strategies - raise biotechnology awareness - encourage

  18. Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region

    PubMed Central

    Jiang, Zhi J; Castoe, Todd A; Austin, Christopher C; Burbrink, Frank T; Herron, Matthew D; McGuire, Jimmy A; Parkinson, Christopher L; Pollock, David D

    2007-01-01

    Background The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. Results We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. Conclusion Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. 
    The among-lineage and

  19. Telomere maintenance through recruitment of internal genomic regions.

    PubMed

    Seo, Beomseok; Kim, Chuna; Hills, Mark; Sung, Sanghyun; Kim, Hyesook; Kim, Eunkyeong; Lim, Daisy S; Oh, Hyun-Seok; Choi, Rachael Mi Jung; Chun, Jongsik; Shim, Jaegal; Lee, Junho

    2015-01-01

    Cells surviving crisis are often tumorigenic and their telomeres are commonly maintained through the reactivation of telomerase. However, surviving cells occasionally activate a recombination-based mechanism called alternative lengthening of telomeres (ALT). Here we establish stably maintained survivors in telomerase-deleted Caenorhabditis elegans that escape from sterility by activating ALT. ALT survivors trans-duplicate an internal genomic region, which is already cis-duplicated to chromosome ends, across the telomeres of all chromosomes. These 'Template for ALT' (TALT) regions consist of a block of genomic DNA flanked by telomere-like sequences, and differ between two genetic backgrounds. We establish a model in which an ancestral duplication of a donor TALT region to a proximal telomere region forms a genomic reservoir ready to be incorporated into telomeres on ALT activation. PMID:26382656

  20. MapRepeat: an approach for effective assembly of repetitive regions in prokaryotic genomes

    PubMed Central

    Mariano, Diego CB; Pereira, Felipe L; Ghosh, Preetam; Barh, Debmalya; Figueiredo, Henrique CP; Silva, Artur; Ramos, Rommel TJ; Azevedo, Vasco AC

    2015-01-01

    The newest technologies for DNA sequencing have led to the determination of the primary structure of the genomes of organisms, mainly prokaryotes, with high efficiency and at lower costs. However, the presence of regions with repetitive sequences, in addition to the short reads produced by the Next-Generation Sequencing (NGS) platforms, created a lot of difficulty in reconstructing the original genome in silico. Thus, even today, genome assembly continues to be one of the major challenges in bioinformatics specifically when repetitive sequences are considered. In this paper, we present an approach to assemble repetitive regions in prokaryotic genomes. Our methodology enables (i) the identification of these regions through visual tools, (ii) the characterization of sequences on the extremities of gaps and (iii) the extraction of consensus sequences based on mapping of raw data to a reference genome. We also present a case study on the assembly of regions that encode ribosomal RNAs (rRNA) in the genome of Corynebacterium ulcerans FRC11, in order to show the efficiency of the strategies presented here. The proposed methods and tools will help in finishing genome assemblies, besides reducing the running time and associated costs. Availability All scripts are available at http://github.com/dcbmariano/maprepeat PMID:26229287
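    Step (iii), extracting a consensus from reads mapped to a reference, can be sketched as a per-position majority vote. This is not the MapRepeat code: it assumes gapless alignments given as (mapped start, read sequence) pairs in reference coordinates, marks uncovered positions with `N`, and uses hypothetical names throughout.

```python
from collections import Counter

def consensus(reads, region_start, region_end):
    """Majority-vote consensus over [region_start, region_end) from gapless mapped reads."""
    columns = [Counter() for _ in range(region_end - region_start)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if region_start <= pos < region_end:
                columns[pos - region_start][base] += 1
    # Most frequent base per column; 'N' where no read covers the position.
    return "".join(col.most_common(1)[0][0] if col else "N" for col in columns)
```

    A production pipeline would read alignments from a SAM/BAM file and handle insertions, deletions, and base qualities, none of which this sketch attempts.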

  1. Genome Sequences of Five Additional Brevibacillus laterosporus Bacteriophages

    PubMed Central

    Merrill, Bryan D.; Berg, Jordan A.; Graves, Kiel A.; Ward, Andy T.; Hilton, Jared A.; Wake, Braden N.; Grose, Julianne H.; Breakwell, Donald P.

    2015-01-01

    Brevibacillus laterosporus has been isolated from many different environments, including beehives, and produces compounds that are toxic to many organisms. Five B. laterosporus phages have been isolated previously. Here, we announce five additional phages that infect this bacterium, including the first B. laterosporus siphoviruses to be discovered. PMID:26494658

  2. The shared genomic architecture of human nucleolar organizer regions

    PubMed Central

    Floutsakou, Ioanna; Agrawal, Saumya; Nguyen, Thong T.; Seoighe, Cathal; Ganley, Austen R.D.; McStay, Brian

    2013-01-01

    The short arms of the five acrocentric human chromosomes harbor sequences that direct the assembly and function of the nucleolus, one of the key functional domains of the nucleus, yet they are absent from the current human genome assembly. Here we describe the genomic architecture of these human nucleolar organizers. Sequences distal and proximal to ribosomal gene arrays are conserved among the acrocentric chromosomes, suggesting they are sites of frequent recombination. Although previously believed to be heterochromatic, characterization of these two flanking regions reveals that they share a complex genomic architecture similar to other euchromatic regions of the genome, but they have distinct genomic characteristics. Proximal sequences are almost entirely segmentally duplicated, similar to the regions bordering centromeres. In contrast, the distal sequence is predominantly unique to the acrocentric short arms and is dominated by a very large inverted repeat. We show that the distal element is localized to the periphery of the nucleolus, where it appears to anchor the ribosomal gene repeats. This, combined with its complex chromatin structure and transcriptional activity, suggests that this region is involved in nucleolar organization. Our results provide a platform for investigating the role of NORs in nucleolar formation and function, and open the door for determining the role of these regions in the well-known empirical association of nucleoli with pathology. PMID:23990606

  3. Analysis of human accelerated DNA regions using archaic hominin genomes.

    PubMed

    Burbano, Hernán A; Green, Richard E; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  4. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes

    PubMed Central

    Burbano, Hernán A.; Green, Richard E.; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S.; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  5. Attenuation of Monkeypox Virus by Deletion of Genomic Regions

    PubMed Central

    Lopera, Juan G.; Falendysz, Elizabeth A.; Rocke, Tonie E.; Osorio, Jorge E.

    2015-01-01

    Monkeypox virus (MPXV) is an emerging pathogen from Africa that causes disease similar to smallpox. Two clades with different geographic distributions and virulence have been described. Here, we utilized bioinformatic tools to identify genomic regions in MPXV containing multiple virulence genes and explored their roles in pathogenicity; two selected regions were then deleted singularly or in combination. In vitro and in vivo studies indicated that these regions play a significant role in MPXV replication, tissue spread, and mortality in mice. Interestingly, while deletion of either region led to decreased virulence in mice, one region had no effect on in vitro replication. Deletion of both regions simultaneously also reduced cell culture replication and significantly increased the attenuation in vivo over either single deletion. Attenuated MPXV with genomic deletions present a safe and efficacious tool in the study of MPX pathogenesis and in the identification of genetic factors associated with virulence. PMID:25462353

  6. Attenuation of monkeypox virus by deletion of genomic regions

    USGS Publications Warehouse

    Lopera, Juan G.; Falendysz, Elizabeth A.; Rocke, Tonie E.; Osorio, Jorge E.

    2015-01-01

    Monkeypox virus (MPXV) is an emerging pathogen from Africa that causes disease similar to smallpox. Two clades with different geographic distributions and virulence have been described. Here, we utilized bioinformatic tools to identify genomic regions in MPXV containing multiple virulence genes and explored their roles in pathogenicity; two selected regions were then deleted singularly or in combination. In vitro and in vivo studies indicated that these regions play a significant role in MPXV replication, tissue spread, and mortality in mice. Interestingly, while deletion of either region led to decreased virulence in mice, one region had no effect on in vitro replication. Deletion of both regions simultaneously also reduced cell culture replication and significantly increased the attenuation in vivo over either single deletion. Attenuated MPXV with genomic deletions present a safe and efficacious tool in the study of MPX pathogenesis and in the identification of genetic factors associated with virulence.

  7. On the Additive and Dominant Variance and Covariance of Individuals Within the Genomic Selection Scope

    PubMed Central

    Vitezica, Zulma G.; Varona, Luis; Legarra, Andres

    2013-01-01

    Genomic evaluation models can fit additive and dominant SNP effects. Under quantitative genetics theory, additive or “breeding” values of individuals are generated by substitution effects, which involve both “biological” additive and dominant effects of the markers. Dominance deviations include only a portion of the biological dominant effects of the markers. Additive variance includes variation due to the additive and dominant effects of the markers. We describe a matrix of dominant genomic relationships across individuals, D, which is similar to the G matrix used in genomic best linear unbiased prediction. This matrix can be used in a mixed-model context for genomic evaluations or to estimate dominant and additive variances in the population. From the “genotypic” value of individuals, an alternative parameterization defines additive and dominance as the parts attributable to the additive and dominant effect of the markers. This approach underestimates the additive genetic variance and overestimates the dominance variance. Transforming the variances from one model into the other is trivial if the distribution of allelic frequencies is known. We illustrate these results with mouse data (four traits, 1884 mice, and 10,946 markers) and simulated data (2100 individuals and 10,000 markers). Variance components were estimated correctly in the model, considering breeding values and dominance deviations. For the model considering genotypic values, the inclusion of dominant effects biased the estimate of additive variance. Genomic models were more accurate for the estimation of variance components than their pedigree-based counterparts. PMID:24121775
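
    The abstract does not spell out how D is built. A minimal sketch follows, assuming the commonly used coding for the breeding-value and dominance-deviation model: for a marker with allele frequency p (q = 1 - p), additive covariates are g - 2p and dominance covariates are -2q², 2pq and -2p² for genotypes with 2, 1 and 0 copies of the counted allele, with G scaled by Σ2pq and D by Σ(2pq)². The function name and the toy genotypes are hypothetical, and allele frequencies are estimated from the sample itself.

```python
def genomic_relationship_matrices(genotypes):
    """Return (G, D) for a list of individuals, each a list of 0/1/2 allele counts.

    Assumes at least one marker is polymorphic in the sample; otherwise the
    scaling sums are zero and the matrices are undefined.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample frequency of the counted allele at each marker.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]

    Z, W = [], []
    for row in genotypes:
        z_row, w_row = [], []
        for g, pj in zip(row, p):
            qj = 1.0 - pj
            z_row.append(g - 2.0 * pj)  # additive covariate
            # Dominance-deviation covariate by genotype (copies of counted allele).
            w_row.append({2: -2.0 * qj * qj, 1: 2.0 * pj * qj, 0: -2.0 * pj * pj}[g])
        Z.append(z_row)
        W.append(w_row)

    add_scale = sum(2.0 * pj * (1.0 - pj) for pj in p)
    dom_scale = sum((2.0 * pj * (1.0 - pj)) ** 2 for pj in p)

    def cross(M, scale):
        # M M' / scale, written out without external libraries.
        return [[sum(a * b for a, b in zip(M[i], M[k])) / scale for k in range(n)]
                for i in range(n)]

    return cross(Z, add_scale), cross(W, dom_scale)


# Hypothetical toy data: 4 individuals, 3 markers.
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G, D = genomic_relationship_matrices(toy)
```

    Both matrices can then enter a mixed model as covariance structures for the additive and dominance random effects, which is the use the abstract describes.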

  8. A Bivariate Whole Genome Linkage Study Identified Genomic Regions Influencing Both BMD and Bone Structure

    PubMed Central

    Liu, Xiao-Gang; Liu, Yong-Jun; Liu, Jianfeng; Pei, Yufang; Xiong, Dong-Hai; Shen, Hui; Deng, Hong-Yi; Papasian, Christopher J; Drees, Betty M; Hamilton, James J; Recker, Robert R; Deng, Hong-Wen

    2008-01-01

    Areal BMD (aBMD) and areal bone size (ABS) are biologically correlated traits and are each important determinants of bone strength and risk of fractures. Studies showed that aBMD and ABS are genetically correlated, indicating that they may share some common genetic factors, which, however, are largely unknown. To study the genetic factors influencing both aBMD and ABS, bivariate whole genome linkage analyses were conducted for aBMD-ABS at the femoral neck (FN), lumbar spine (LS), and ultradistal (UD)-forearm in a large sample of 451 white pedigrees made up of 4498 individuals. We detected significant linkage on chromosome Xq27 (LOD = 4.89) for LS aBMD-ABS. In addition, we detected suggestive linkages at 20q11 (LOD = 3.65) and Xp11 (LOD = 2.96) for FN aBMD-ABS; at 12p11 (LOD = 3.39) and 17q21 (LOD = 2.94) for LS aBMD-ABS; and at 5q23 (LOD = 3.54), 7p15 (LOD = 3.45), Xq27 (LOD = 2.93), and 12p11 (LOD = 2.92) for UD-forearm aBMD-ABS. Subsequent discrimination analyses indicated that quantitative trait loci (QTLs) at 12p11 and 17q21 may have pleiotropic effects on aBMD and ABS. This study identified several genomic regions that may contain QTLs important for both aBMD and ABS. Further endeavors are necessary to follow these regions to eventually pinpoint the genetic variants affecting bone strength and risk of fractures. PMID:18597637

  9. Genome-wide identification of hypoxia-induced enhancer regions

    PubMed Central

    Preston, Jessica L.; Randel, Melissa A.; Johnson, Eric A.

    2015-01-01

    Here we present a genome-wide method for de novo identification of enhancer regions. This approach enables massively parallel empirical investigation of DNA sequences that mediate transcriptional activation and provides a platform for discovery of regulatory modules capable of driving context-specific gene expression. The method links fragmented genomic DNA to the transcription of randomer molecule identifiers and measures the functional enhancer activity of the library by massively parallel sequencing. We transfected a Drosophila melanogaster library into S2 cells in normoxia and hypoxia, and assayed 4,599,881 genomic DNA fragments in parallel. The locations of the enhancer regions strongly correlate with genes up-regulated after hypoxia and previously described enhancers. Novel enhancer regions were identified and integrated with RNAseq data and transcription factor motifs to describe the hypoxic response on a genome-wide basis as a complex regulatory network involving multiple stress-response pathways. This work provides a novel method for high-throughput assay of enhancer activity and the genome-scale identification of 31 hypoxia-activated enhancers in Drosophila. PMID:26713262

  10. regionReport: Interactive reports for region-level and feature-level genomic analyses

    PubMed Central

    Collado-Torres, Leonardo; Jaffe, Andrew E.; Leek, Jeffrey T.

    2016-01-01

    regionReport is an R package for generating detailed interactive reports from region-level genomic analyses as well as feature-level RNA-seq. The report includes quality-control checks, an overview of the results, an interactive table of the genomic regions or features of interest and reproducibility information. regionReport provides specialised reports for exploring DESeq2, edgeR, or derfinder differential expression analyses results. regionReport is also flexible and can easily be expanded with report templates for other analysis pipelines. PMID:27429738

  11. Annotation of the Protein Coding Regions of the Equine Genome

    PubMed Central

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.; Zeng, Zheng; Liu, Jinze; Orlando, Ludovic; MacLeod, James N.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons. PMID:26107351

  12. Centromeric motion facilitates the mobility of interphase genomic regions in fission yeast

    PubMed Central

    Kim, Kyoung-Dong; Tanizawa, Hideki; Iwasaki, Osamu; Corcoran, Christopher J.; Capizzi, Joseph R.; Hayden, James E.; Noma, Ken-ichi

    2013-01-01

    Dispersed genetic elements, such as retrotransposons and Pol-III-transcribed genes, including tRNA and 5S rRNA, cluster and associate with centromeres in fission yeast through the function of condensin. However, the dynamics of these condensin-mediated genomic associations remains unknown. We have examined the 3D motions of genomic loci including the centromere, telomere, rDNA repeat locus, and the loci carrying Pol-III-transcribed genes or long-terminal repeat (LTR) retrotransposons in live cells at as short as 1.5-second intervals. Treatment with carbendazim (CBZ), a microtubule-destabilizing agent, not only prevents centromeric motion, but also reduces the mobility of the other genomic loci during interphase. Further analyses demonstrate that condensin-mediated associations between centromeres and the genomic loci are clonal, infrequent and transient. However, when associated, centromeres and the genomic loci migrate together in a coordinated fashion. In addition, a condensin mutation that disrupts associations between centromeres and the genomic loci results in a concomitant decrease in the mobility of the loci. Our study suggests that highly mobile centromeres pulled by microtubules in cytoplasm serve as ‘genome mobility elements’ by facilitating physical relocations of associating genomic regions. PMID:23986481

  13. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    PubMed Central

    Zhang, Hao; van Diepeningen, Anne D.; van der Lee, Theo A. J.; Waalwijk, Cees; de Hoog, G. Sybren

    2016-01-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives, but also single copy regions can be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated such as: exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/). PMID
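
    The baiting idea described above (recruit reads that match a known seed, then let recruited reads extend the bait) can be sketched as follows. This is not GRAbB's implementation: the exact-k-mer matching rule, the function names, the parameters `k` and `max_rounds`, and the toy reads are all assumptions for illustration. The sketch ignores reverse complements and sequencing errors and stops before any assembly step.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bait_reads(bait, reads, k=5, max_rounds=10):
    """Iteratively select reads sharing at least one k-mer with the growing bait."""
    bait_kmers = kmers(bait, k)
    selected = set()
    for _ in range(max_rounds):
        # Reads not yet selected that overlap the current bait k-mer set.
        new = [i for i, read in enumerate(reads)
               if i not in selected and kmers(read, k) & bait_kmers]
        if not new:
            break  # no further reads recruited; the bait has stopped growing
        for i in new:
            selected.add(i)
            bait_kmers |= kmers(reads[i], k)
    return [reads[i] for i in sorted(selected)]


# Hypothetical toy data: three reads chain outward from the bait, one is unrelated.
bait = "ACGTACGGA"
reads = ["CGGATTCAG", "TTCAGGCTA", "GGGGGCCCC", "TACGGATTC"]
print(bait_reads(bait, reads))
```

    Because only the recruited subset would be passed on to an assembler, the downstream work scales with the size of the target region rather than the whole read set, which is the efficiency argument the abstract makes.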

  14. Nucleolar organizer regions: genomic 'dark matter' requiring illumination.

    PubMed

    McStay, Brian

    2016-07-15

    Nucleoli form around tandem arrays of a ribosomal gene repeat, termed nucleolar organizer regions (NORs). During metaphase, active NORs adopt a characteristic undercondensed morphology. Recent evidence indicates that the HMG-box-containing DNA-binding protein UBF (upstream binding factor) is directly responsible for this morphology and provides a mitotic bookmark to ensure rapid nucleolar formation beginning in telophase in human cells. This is likely to be a widely employed strategy, as UBF is present throughout metazoans. In higher eukaryotes, NORs are typically located within regions of chromosomes that form perinucleolar heterochromatin during interphase. Typically, the genomic architecture of NORs and the chromosomal regions within which they lie is very poorly described, yet recent evidence points to a role for context in their function. In Arabidopsis, NOR silencing appears to be controlled by sequences outside the rDNA (ribosomal DNA) array. Translocations reveal a role for context in the expression of the NOR on the X chromosome in Drosophila. Recent work has begun on characterizing the genomic architecture of human NORs. A role for distal sequences located in perinucleolar heterochromatin has been inferred, as they exhibit a complex transcriptionally active chromatin structure. Links between rDNA genomic stability and aging in Saccharomyces cerevisiae are now well established, and indications are emerging that this is important in aging and replicative senescence in higher eukaryotes. This, combined with the fact that rDNA arrays are recombinational hot spots in cancer cells, has focused attention on DNA damage responses in NORs. The introduction of DNA double-strand breaks into rDNA arrays leads to a dramatic reorganization of nucleolar structure. Damaged rDNA repeats move from the nucleolar interior to form caps at the nucleolar periphery, presumably to facilitate repair, suggesting that the chromosomal context of human NORs contributes to their genomic

  15. Evolutionary history of the ABCB2 genomic region in teleosts

    USGS Publications Warehouse

    Palti, Y.; Rodriguez, M.F.; Gahr, S.A.; Hansen, J.D.

    2007-01-01

    Gene duplication, silencing and translocation have all been implicated in shaping the unique genomic architecture of the teleost MH regions. Previously, we demonstrated that trout possess five unlinked regions encoding MH genes. One of these regions harbors ABCB2 which in all other vertebrate classes is found in the MHC class II region. In this study, we sequenced a BAC contig for the trout ABCB2 region. Analysis of this region revealed the presence of genes homologous to those located in the human class II (ABCB2, BRD2, ψDAA), extended class II (RGL2, PHF1, SYGP1) and class III (PBX2, Notch-L) regions. The organization and syntenic relationships of this region were then compared to similar regions in humans, Tetraodon and zebrafish to learn more about the evolutionary history of this region. Our analysis indicates that this region was generated during the teleost-specific duplication event while also providing insight about potential MH paralogous regions in teleosts. © 2006 Elsevier Ltd. All rights reserved.

  16. Targeted isolation of cloned genomic regions by recombineering for haplotype phasing and isogenic targeting.

    PubMed

    Nedelkova, Marta; Maresca, Marcello; Fu, Jun; Rostovskaya, Maria; Chenna, Ramu; Thiede, Christian; Anastassiadis, Konstantinos; Sarov, Mihail; Stewart, A Francis

    2011-11-01

    Studying genetic variations in the human genome is important for understanding phenotypes and complex traits, including rare personal variations and their associations with disease. The interpretation of polymorphisms requires reliable methods to isolate natural genetic variations, including combinations of variations, in a format suitable for downstream analysis. Here, we describe a strategy for targeted isolation of large regions (∼35 kb) from human genomes that is also applicable to any genome of interest. The method relies on recombineering to fish out target fosmid clones from pools and thereby circumvents the laborious need to plate and screen thousands of individual clones. To optimize the method, a new highly recombineering-efficient bacterial host, including inducible TrfA for fosmid copy number amplification, was developed. Various regions were isolated from human embryonic stem cell lines and a personal genome, including highly repetitive and duplicated ones. The maternal and paternal alleles at the MECP2/IRAK 1 loci were distinguished based on identification of novel allele-specific single-nucleotide polymorphisms in regulatory regions. Additionally, we applied further recombineering to construct isogenic targeting vectors for patient-specific applications. These methods will facilitate work to understand the linkage between personal variations and disease propensity, as well as possibilities for personal genome surgery. PMID:21852329

  17. Targeted isolation of cloned genomic regions by recombineering for haplotype phasing and isogenic targeting

    PubMed Central

    Nedelkova, Marta; Maresca, Marcello; Fu, Jun; Rostovskaya, Maria; Chenna, Ramu; Thiede, Christian; Anastassiadis, Konstantinos; Sarov, Mihail; Stewart, A. Francis

    2011-01-01

    Studying genetic variations in the human genome is important for understanding phenotypes and complex traits, including rare personal variations and their associations with disease. The interpretation of polymorphisms requires reliable methods to isolate natural genetic variations, including combinations of variations, in a format suitable for downstream analysis. Here, we describe a strategy for targeted isolation of large regions (∼35 kb) from human genomes that is also applicable to any genome of interest. The method relies on recombineering to fish out target fosmid clones from pools and thereby circumvents the laborious need to plate and screen thousands of individual clones. To optimize the method, a new highly recombineering-efficient bacterial host, including inducible TrfA for fosmid copy number amplification, was developed. Various regions were isolated from human embryonic stem cell lines and a personal genome, including highly repetitive and duplicated ones. The maternal and paternal alleles at the MECP2/IRAK 1 loci were distinguished based on identification of novel allele-specific single-nucleotide polymorphisms in regulatory regions. Additionally, we applied further recombineering to construct isogenic targeting vectors for patient-specific applications. These methods will facilitate work to understand the linkage between personal variations and disease propensity, as well as possibilities for personal genome surgery. PMID:21852329

  18. Genome Size Variation in the Genus Carthamus (Asteraceae, Cardueae): Systematic Implications and Additive Changes During Allopolyploidization

    PubMed Central

    GARNATJE, TERESA; GARCIA, SÒNIA; VILATERSANA, ROSER; VALLÈS, JOAN

    2006-01-01

    • Background and Aims Plant genome size is an important biological characteristic, with relationships to systematics, ecology and distribution. Currently, there is no information regarding nuclear DNA content for any Carthamus species. In addition to improving the knowledge base, this research focuses on interspecific variation and its implications for the infrageneric classification of this genus. Genome size variation in the process of allopolyploid formation is also addressed. • Methods Nuclear DNA samples from 34 populations of 16 species of the genus Carthamus were assessed by flow cytometry using propidium iodide. • Key Results The 2C values ranged from 2·26 pg for C. leucocaulos to 7·46 pg for C. turkestanicus, and monoploid genome size (1Cx-value) ranged from 1·13 pg in C. leucocaulos to 1·53 pg in C. alexandrinus. Mean genome sizes differed significantly, based on sectional classification. Both allopolyploid species (C. creticus and C. turkestanicus) exhibited nuclear DNA contents in accordance with the sum of the putative parental C-values (in one case with a slight reduction, frequent in polyploids), supporting their hybrid origin. • Conclusions Genome size represents a useful tool in elucidating systematic relationships between closely related species. A considerable reduction in monoploid genome size, possibly due to the hybrid formation, is also reported within these taxa. PMID:16390843

  19. Chromosome region-specific libraries for human genome analysis

    SciTech Connect

    Kao, Fa-Ten.

    1991-01-01

    We have made important progress since the beginning of the current grant year. We have further developed the microdissection and PCR-assisted microcloning techniques using the linker-adaptor method. We have critically evaluated the microdissection libraries constructed by this microtechnology and proved that they are of high quality. We further demonstrated that these microdissection clones are useful in identifying corresponding YAC clones for a thousand-fold expansion of the genomic coverage and for contig construction. We are also improving the technique of cloning the dissected fragments in test tube by the TDT method. We are applying both of these PCR cloning techniques to human chromosomes 2 and 5 to construct region-specific libraries for physical mapping purposes of LLNL and LANL. Finally, we are exploring efficient procedures to use unique sequence microclones to isolate cDNA clones from defined chromosomal regions as valuable resources for identifying expressed gene sequences in the human genome. We believe that we are making important progress under the auspices of this DOE human genome program grant and we will continue to make significant contributions in the coming year. 4 refs., 4 figs.

  20. The Control Region of Maternally and Paternally Inherited Mitochondrial Genomes of Three Species of the Sea Mussel Genus Mytilus

    PubMed Central

    Cao, Liqin; Ort, Brian S.; Mizi, Athanasia; Pogson, Grant; Kenchington, Elen; Zouros, Eleftherios; Rodakis, George C.

    2009-01-01

    Species of the mussel genus Mytilus possess maternally and paternally transmitted mitochondrial genomes. In the interbreeding taxa Mytilus edulis and M. galloprovincialis, several genomes of both types have been fully sequenced. The genome consists of the coding part (which, in addition to protein and RNA genes, contains several small noncoding sequences) and the main control region (CR), which in turn consists of three distinct parts: the first variable (VD1), the conserved (CD), and the second variable (VD2) domain. The maternal and paternal genomes are very similar in gene content and organization, even though they differ by >20% in primary sequence. They differ even more at VD1 and VD2, yet they are remarkably similar at CD. The complete sequence of a genome from the closely related species M. trossulus was previously reported and found to consist of a maternal-like coding part and a paternal-like and a maternal-like CR. From this and from the fact that it was extracted from a male individual, it was inferred that this is a genome that switched from maternal to paternal transmission. Here we provide clear evidence that this genome is the maternal genome of M. trossulus. We have found that in this genome the tRNAGln in the coding region is apparently defective and that an intact copy of this tRNA occurs in the CR, that one of the two conserved domains is missing essential motifs, and that one of the two first variable domains has a high rate of divergence. These features may explain the large size and mosaic structure of the CR of the maternal genome of M. trossulus. We have also obtained CR sequences of the maternal and paternal genomes of M. californianus, a more distantly related species. We compare the control regions from all three species, focusing on the divergence among genomes of different species origin and among genomes of different transmission routes. PMID:19139146

  1. Loss of heterozygosity on chromosomes 17 and 18 in breast carcinoma: two additional regions identified.

    PubMed Central

    Cropp, C S; Lidereau, R; Campbell, G; Champene, M H; Callahan, R

    1990-01-01

    The loss of heterozygosity (LOH) at specific regions of the human genome in tumor DNA is recognized as evidence for a tumor-suppressor gene located within the corresponding region of the homologous chromosome. Restriction fragment length polymorphism analysis of a panel of primary human breast tumor DNAs has led to the identification of two additional regions on chromosomes 17q and 18q that frequently are affected by LOH. Deletions of each of these regions have a significant correlation with clinical parameters that are associated with aggressive breast carcinomas. Previous restriction fragment length polymorphism analysis of this panel of tumors has uncovered several other frequently occurring mutations. LOH on chromosome 18q frequently occurs in tumors with concomitant LOH of loci on chromosomes 17p and 11p. Similarly, tumors having LOH on 17q also have LOH on chromosomes 1p and 3p. This suggests that certain combinations of mutations may collaborate in the development and malignant progression of breast carcinomas. PMID:1977164

  2. Exploring the diploid wheat ancestral A genome through sequence comparison at the high-molecular-weight glutenin locus region.

    PubMed

    Dong, Lingli; Huo, Naxin; Wang, Yi; Deal, Karin; Luo, Ming-Cheng; Wang, Daowen; Anderson, Olin D; Gu, Yong Qiang

    2012-12-01

    The polyploid nature of hexaploid wheat (T. aestivum, AABBDD) often represents a great challenge in various aspects of research including genetic mapping, map-based cloning of important genes, and sequencing and accurate assembly of its genome. To explore the utility of ancestral diploid species of polyploid wheat, sequence variation of T. urartu (A(u)A(u)) was analyzed by comparing its 277-kb large genomic region carrying the important Glu-1 locus with the homologous regions from the A genomes of the diploid T. monococcum (A(m)A(m)), tetraploid T. turgidum (AABB), and hexaploid T. aestivum (AABBDD). Our results revealed that in addition to a high degree of gene collinearity, nested retroelement structures were also considerably conserved among the A(u) genome and the A genomes in polyploid wheats, suggesting that the majority of the repetitive sequences in the A genomes of polyploid wheats originated from the diploid A(u) genome. The difference in the compared region between A(u) and A is mainly caused by four differential TE insertion and two deletion events between these genomes. The estimated divergence time of A genomes calculated on nucleotide substitution rate in both shared TEs and collinear genes further supports the closer evolutionary relationship of A to A(u) than to A(m). The structure conservation in the repetitive regions prompted us to develop repeat junction markers based on the A(u) sequence for mapping the A genome in hexaploid wheat. Eighty percent of these repeat junction markers were successfully mapped to the corresponding region in hexaploid wheat, suggesting that T. urartu could serve as a useful resource for developing molecular markers for genetic and breeding studies in hexaploid wheat. PMID:23052831

  3. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    PubMed Central

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  4. A genomic island provides Acidithiobacillus ferrooxidans ATCC 53993 additional copper resistance: a possible competitive advantage.

    PubMed

    Orellana, Luis H; Jerez, Carlos A

    2011-11-01

    There is great interest in understanding how extremophilic biomining bacteria adapt to exceptionally high copper concentrations in their environment. Acidithiobacillus ferrooxidans ATCC 53993 genome possesses the same copper resistance determinants as strain ATCC 23270. However, the former strain contains in its genome a 160-kb genomic island (GI), which is absent in ATCC 23270. This GI contains, amongst other genes, several genes coding for an additional putative copper ATPase and a Cus system. A. ferrooxidans ATCC 53993 showed a much higher resistance to CuSO(4) (>100 mM) than that of strain ATCC 23270 (<25 mM). When a similar number of bacteria from each strain were mixed and allowed to grow in the absence of copper, their respective final numbers remained approximately equal. However, in the presence of copper, there was a clear overgrowth of strain ATCC 53993 compared to ATCC 23270. This behavior is most likely explained by the presence of the additional copper-resistance genes in the GI of strain ATCC 53993. As determined by qRT-PCR, it was demonstrated that these genes are upregulated when A. ferrooxidans ATCC 53993 is grown in the presence of copper and were shown to be functional when expressed in copper-sensitive Escherichia coli mutants. Thus, the reason for resistance to copper of two strains of the same acidophilic microorganism could be determined by slight differences in their genomes, which may not only lead to changes in their capacities to adapt to their environment, but may also help to select the more fit microorganisms for industrial biomining operations. PMID:21789491

  5. Rapid evolution and complex structural organization in genomic regions harboring multiple prolamin genes in the polyploid wheat genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genes encoding wheat prolamins belong to complicated multi-gene families in the wheat genome. To understand the structural complexity of storage protein loci, we sequenced and analyzed orthologous regions containing both gliadin and LMW-glutenin genes from the A and B genomes of a tetraploid wheat ...

  6. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions

    PubMed Central

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain established close relationships with the Australian strain SA2. The recombination events were detected and confirmed in different subregions of the WG strain with the sequences of SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed close relationships between the Chinese strain WG and the Australian strain SA2. The information of the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions. PMID:26186451

  7. Multiple Comparison Analysis of Two New Genomic Sequences of ILTV Strains from China with Other Strains from Different Geographic Regions.

    PubMed

    Zhao, Yan; Kong, Congcong; Wang, Yunfeng

    2015-01-01

    To date, twenty complete genome sequences of ILTV strains have been published in GenBank, including one strain from China, and nineteen strains from Australia and the United States. To investigate the genomic information on ILTVs from different geographic regions, two additional individual complete genome sequences of WG and K317 strains from China were determined. The genomes of WG and K317 strains were 153,505 and 153,639 bp in length, respectively. Alignments performed on the amino acid sequences of the twelve glycoproteins showed that 13 out of 116 mutational sites were present only among the Chinese strain WG and the Australian strains SA2 and A20. The phylogenetic tree analysis suggested that the WG strain established close relationships with the Australian strain SA2. The recombination events were detected and confirmed in different subregions of the WG strain with the sequences of SA2 and K317 strains as parental. In this study, two new complete genome sequences of Chinese ILTV strains were used in comparative analysis with other complete genome sequences of ILTV strains from China, the United States, and Australia. The analysis of genome comparison, phylogenetic trees, and recombination events showed close relationships between the Chinese strain WG and the Australian strain SA2. The information of the two new complete genome sequences from China will help to facilitate the analysis of phylogenetic relationships and the molecular differences among ILTV strains from different geographic regions. PMID:26186451

  8. Mapping of lamin A- and progerin-interacting genome regions.

    PubMed

    Kubben, Nard; Adriaens, Michiel; Meuleman, Wouter; Voncken, Jan Willem; van Steensel, Bas; Misteli, Tom

    2012-10-01

    Mutations in the A-type lamins A and C, two major components of the nuclear lamina, cause a large group of phenotypically diverse diseases collectively referred to as laminopathies. These conditions often involve defects in chromatin organization. However, it is unclear whether A-type lamins interact with chromatin in vivo and whether aberrant chromatin-lamin interactions contribute to disease. Here, we have used an unbiased approach to comparatively map genome-wide interactions of gene promoters with lamin A and progerin, the mutated lamin A isoform responsible for the premature aging disorder Hutchinson-Gilford progeria syndrome (HGPS), in mouse cardiac myocytes and embryonic fibroblasts. We find that lamin A-associated genes are predominantly transcriptionally silent and that loss of lamin association leads to the relocation of peripherally localized genes, but not necessarily to their activation. We demonstrate that progerin induces global changes in chromatin organization by enhancing interactions with a specific subset of genes in addition to the identified lamin A-associated genes. These observations demonstrate disease-related changes in higher order genome organization in HGPS and provide novel insights into the role of lamin-chromatin interactions in chromatin organization. PMID:22610065

  9. Annotation of additional evolutionary conserved microRNAs in CHO cells from updated genomic data

    PubMed Central

    Hackl, Matthias; Klanert, Gerald; Jadhav, Vaibhav; Reithofer, Manuel; Stiefel, Fabian; Hesse, Friedemann; Grillari, Johannes; Borth, Nicole

    2015-01-01

    MicroRNAs are small non‐coding RNAs that play a critical role in post‐transcriptional control of gene expression. Recent publications of genomic sequencing data from the Chinese Hamster (CGR) and Chinese hamster ovary (CHO) cells provide new tools for the discovery of novel miRNAs in this important production system. Version 20 of the miRNA registry miRBase contains 307 mature miRNAs and 200 precursor sequences for CGR/CHO. We searched for evolutionary conserved miRNAs from miRBase v20 in recently published genomic data, derived from Chinese hamster and CHO cells, to further extend the list of known miRNAs. With our approach we could identify several hundred miRNA sequences in the genome. For several of these, the expression in CHO cells could be verified from multiple next‐generation sequencing experiments. In addition, several hundred unexpressed miRNAs are awaiting further confirmation by testing for their transcription in different Chinese hamster tissues. Biotechnol. Bioeng. 2015;112: 1488–1493. © 2015 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc. PMID:25689160

  10. DNA Replication Control Is Linked to Genomic Positioning of Control Regions in Escherichia coli.

    PubMed

    Frimodt-Møller, Jakob; Charbon, Godefroid; Krogfelt, Karen A; Løbner-Olesen, Anders

    2016-09-01

    Chromosome replication in Escherichia coli is in part controlled by three non-coding genomic sequences, DARS1, DARS2, and datA, that modulate the activity of the initiator protein DnaA. The relative distances from oriC to the non-coding regions are conserved among E. coli species, despite large variations in genome size. Here we use a combination of i) site directed translocation of each region to new positions on the bacterial chromosome and ii) random transposon mediated translocation followed by culture evolution, to show genetic evidence for the importance of position. Here we provide evidence that the genomic locations of these regulatory sequences are important for cell cycle control and bacterial fitness. In addition, our work shows that the functionally redundant DARS1 and DARS2 regions play different roles in replication control. DARS1 is mainly involved in maintaining the origin concentration, whereas DARS2 is also involved in maintaining single cell synchrony. PMID:27589233

  11. Genomic prediction of growth in pigs based on a model including additive and dominance effects.

    PubMed

    Lopes, M S; Bastiaansen, J W M; Janss, L; Knol, E F; Bovenhuis, H

    2016-06-01

    Independent of whether prediction is based on pedigree or genomic information, the focus of animal breeders has been on additive genetic effects or 'breeding values'. However, when predicting phenotypes rather than breeding values of an animal, models that account for both additive and dominance effects might be more accurate. Our aim with this study was to compare the accuracy of predicting phenotypes using a model that accounts for only additive effects (MA) and a model that accounts for both additive and dominance effects simultaneously (MAD). Lifetime daily gain (DG) was evaluated in three pig populations (1424 Pietrain, 2023 Landrace, and 2157 Large White). Animals were genotyped using the Illumina SNP60K Beadchip and assigned to either a training data set to estimate the genetic parameters and SNP effects, or to a validation data set to assess the prediction accuracy. Models MA and MAD applied random regression on SNP genotypes and were implemented in the program Bayz. The additive heritability of DG across the three populations and the two models was very similar at approximately 0.26. The proportion of phenotypic variance explained by dominance effects ranged from 0.04 (Large White) to 0.11 (Pietrain), indicating that importance of dominance might be breed-specific. Prediction accuracies were higher when predicting phenotypes using total genetic values (sum of breeding values and dominance deviations) from the MAD model compared to using breeding values from both MA and MAD models. The highest increase in accuracy (from 0.195 to 0.222) was observed in the Pietrain, and the lowest in Large White (from 0.354 to 0.359). Predicting phenotypes using total genetic values instead of breeding values in purebred data improved prediction accuracy and reduced the bias of genomic predictions. Additional benefit of the method is expected when applied to predict crossbred phenotypes, where dominance levels are expected to be higher. PMID:26676611
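
    The difference between predicting from additive effects alone and from additive plus dominance effects can be illustrated with a minimal sketch. It assumes SNP genotypes coded as 0, 1, or 2 copies of a reference allele, an additive covariate equal to the allele count, and a dominance covariate that is 1 for heterozygotes and 0 otherwise. The function names and effect sizes are hypothetical, and this is not the Bayz implementation used in the study; the simple coding also ignores the allele-frequency terms that enter formal definitions of breeding values and dominance deviations.

```python
def additive_part(genotypes, additive_effects):
    """Sum of allele count (0, 1, 2) times the per-SNP additive effect."""
    return sum(g * a for g, a in zip(genotypes, additive_effects))


def dominance_part(genotypes, dominance_effects):
    """Sum of per-SNP dominance effects over heterozygous loci (genotype == 1)."""
    return sum(d for g, d in zip(genotypes, dominance_effects) if g == 1)


def total_genetic_value(genotypes, additive_effects, dominance_effects):
    """Additive part plus dominance part, analogous to the MAD-style prediction."""
    return (additive_part(genotypes, additive_effects)
            + dominance_part(genotypes, dominance_effects))


# Hypothetical effects for three SNPs (illustrative values only).
a = [0.5, -0.25, 1.0]
d = [0.25, 0.5, -0.5]
animal = [0, 1, 2]  # genotypes at the three SNPs

print(additive_part(animal, a))           # additive-only prediction
print(total_genetic_value(animal, a, d))  # additive + dominance prediction
```

    In this toy case only the second SNP is heterozygous, so the two predictions differ by that SNP's dominance effect alone.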

  12. Single Molecule Analysis of Replicated DNA Reveals the Usage of Multiple KSHV Genome Regions for Latent Replication

    PubMed Central

    Verma, Subhash C.; Lu, Jie; Cai, Qiliang; Kosiyatrakul, Settapong; McDowell, Maria E.; Schildkraut, Carl L.; Robertson, Erle S.

    2011-01-01

    Kaposi's sarcoma associated herpesvirus (KSHV), an etiologic agent of Kaposi's sarcoma, Body Cavity Based Lymphoma and Multicentric Castleman's Disease, establishes lifelong latency in infected cells. The KSHV genome tethers to the host chromosome with the help of a latency associated nuclear antigen (LANA). Additionally, LANA supports replication of the latent origins within the terminal repeats by recruiting cellular factors. Our previous studies identified and characterized another latent origin, which supported the replication of plasmids ex-vivo without LANA expression in trans. Therefore identification of an additional origin site prompted us to analyze the entire KSHV genome for replication initiation sites using single molecule analysis of replicated DNA (SMARD). Our results showed that replication of DNA can initiate throughout the KSHV genome and the usage of these regions is not conserved in two different KSHV strains investigated. SMARD also showed that the utilization of multiple replication initiation sites occurs across large regions of the genome rather than a specified sequence. The replication origin of the terminal repeats showed only a slight preference for their usage indicating that LANA dependent origin at the terminal repeats (TR) plays only a limited role in genome duplication. Furthermore, we performed chromatin immunoprecipitation for ORC2 and MCM3, which are part of the pre-replication initiation complex to determine the genomic sites where these proteins accumulate, to provide further characterization of potential replication initiation sites on the KSHV genome. The ChIP data confirmed accumulation of these pre-RC proteins at multiple genomic sites in a cell cycle dependent manner. Our data also show that both the frequency and the sites of replication initiation vary within the two KSHV genomes studied here, suggesting that initiation of replication is likely to be affected by the genomic context rather than the DNA sequences. PMID

  13. Novel swine virulence determinant in the left variable region of the African swine fever virus genome.

    PubMed

    Neilan, J G; Zsak, L; Lu, Z; Kutish, G F; Afonso, C L; Rock, D L

    2002-04-01

    Previously we have shown that the African swine fever virus (ASFV) NL gene deletion mutant E70DeltaNL is attenuated in pigs. Our recent observations that NL gene deletion mutants of two additional pathogenic ASFV isolates, Malawi Lil-20/1 and Pr4, remained highly virulent in swine (100% mortality) suggested that these isolates encoded an additional virulence determinant(s) that was absent from E70. To map this putative virulence determinant, in vivo marker rescue experiments were performed by inoculating swine with infection-transfection lysates containing E70 NL deletion mutant virus (E70DeltaNL) and cosmid DNA clones from the Malawi NL gene deletion mutant (MalDeltaNL). A cosmid clone representing the left-hand 38-kb region (map units 0.05 to 0.26) of the MalDeltaNL genome was capable of restoring full virulence to E70DeltaNL. Southern blot analysis of recovered virulent viruses confirmed that they were recombinant E70DeltaNL genomes containing a 23- to 28-kb DNA fragment of the Malawi genome. These recombinants exhibited an unaltered MalDeltaNL disease and virulence phenotype when inoculated into swine. Additional in vivo marker rescue experiments identified a 20-kb fragment, encoding members of multigene families (MGF) 360 and 530, as being capable of fully restoring virulence to E70DeltaNL. Comparative nucleotide sequence analysis of the left variable region of the E70DeltaNL and Malawi Lil-20/1 genomes identified an 8-kb deletion in the E70DeltaNL isolate which resulted in the deletion and/or truncation of three MGF 360 genes and four MGF 530 genes. A recombinant MalDeltaNL deletion mutant lacking three members of each MGF gene family was constructed and evaluated for virulence in swine. The mutant virus replicated normally in macrophage cell culture but was avirulent in swine. Together, these results indicate that a region within the left variable region of the ASFV genome containing the MGF 360 and 530 genes represents a previously unrecognized virulence

  14. Novel Swine Virulence Determinant in the Left Variable Region of the African Swine Fever Virus Genome

    PubMed Central

    Neilan, J. G.; Zsak, L.; Lu, Z.; Kutish, G. F.; Afonso, C. L.; Rock, D. L.

    2002-01-01

    Previously we have shown that the African swine fever virus (ASFV) NL gene deletion mutant E70ΔNL is attenuated in pigs. Our recent observations that NL gene deletion mutants of two additional pathogenic ASFV isolates, Malawi Lil-20/1 and Pr4, remained highly virulent in swine (100% mortality) suggested that these isolates encoded an additional virulence determinant(s) that was absent from E70. To map this putative virulence determinant, in vivo marker rescue experiments were performed by inoculating swine with infection-transfection lysates containing E70 NL deletion mutant virus (E70ΔNL) and cosmid DNA clones from the Malawi NL gene deletion mutant (MalΔNL). A cosmid clone representing the left-hand 38-kb region (map units 0.05 to 0.26) of the MalΔNL genome was capable of restoring full virulence to E70ΔNL. Southern blot analysis of recovered virulent viruses confirmed that they were recombinant E70ΔNL genomes containing a 23- to 28-kb DNA fragment of the Malawi genome. These recombinants exhibited an unaltered MalΔNL disease and virulence phenotype when inoculated into swine. Additional in vivo marker rescue experiments identified a 20-kb fragment, encoding members of multigene families (MGF) 360 and 530, as being capable of fully restoring virulence to E70ΔNL. Comparative nucleotide sequence analysis of the left variable region of the E70ΔNL and Malawi Lil-20/1 genomes identified an 8-kb deletion in the E70ΔNL isolate which resulted in the deletion and/or truncation of three MGF 360 genes and four MGF 530 genes. A recombinant MalΔNL deletion mutant lacking three members of each MGF gene family was constructed and evaluated for virulence in swine. The mutant virus replicated normally in macrophage cell culture but was avirulent in swine. Together, these results indicate that a region within the left variable region of the ASFV genome containing the MGF 360 and 530 genes represents a previously unrecognized virulence determinant for domestic swine

  15. Dynamic evolution of Rht-1 homologous regions in grass genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Bread wheat contains A, B, and D subgenomes with its well characterized ancestral genomes that exist at the diploid and tetraploid levels. Therefore, the wheat genome system acts as a model species for studying genome evolutionary dynamics. Here, we performed intra- and inter-species comparative ana...

  16. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    SciTech Connect

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential
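
    Scanning a sequence with a position weight matrix, as described in the abstract above, can be sketched as follows. The count matrix is a hypothetical toy built around the well-known TATAAT consensus of the -10 box, not the matrices used by the authors. The `background` argument stands in for the per-genome calibration to noncoding base composition; the pseudocount, threshold, and function names are all assumptions, and sequences are assumed to contain only uppercase A, C, G, and T.

```python
import math


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Convert per-position base counts into log2 odds against a background."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background[b])
            for b in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (offset, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits


# Hypothetical counts: 8 observations of the consensus base at each position.
toy_counts = [{base: 8} for base in "TATAAT"]
uniform = {b: 0.25 for b in "ACGT"}  # assumed background composition

pwm = log_odds_matrix(toy_counts, uniform)
hits = scan("GGGTATAATGGG", pwm, threshold=5.0)
print(hits)
```

    Dividing the number of hits in a region by the region's length gives a signal density of the kind compared between regulatory and nonregulatory regions.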

  17. Fractionation of Synteny in a Genomic Region Containing Tandemly Duplicated Genes Across Glycine max, Medicago truncatula and Arabidopsis thaliana

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Extended comparison of gene sequences found on homeologous soybean BACs to Medicago truncatula and Arabidopsis thaliana genomic sequences demonstrated a network of synteny within conserved regions interrupted by gene addition and/or deletions. Consolidation of gene order among all three species prov...

  18. Additives

    NASA Technical Reports Server (NTRS)

    Smalheer, C. V.

    1973-01-01

    The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.

  19. Targeted gene addition into a specified location in the human genome using designed zinc finger nucleases

    PubMed Central

    Moehle, Erica A.; Rock, Jeremy M.; Lee, Ya-Li; Jouvenot, Yann; DeKelver, Russell C.; Gregory, Philip D.; Urnov, Fyodor D.; Holmes, Michael C.

    2007-01-01

    Efficient incorporation of novel DNA sequences into a specific site in the genome of living human cells remains a challenge despite its potential utility to genetic medicine, biotechnology, and basic research. We find that a precisely placed double-strand break induced by engineered zinc finger nucleases (ZFNs) can stimulate integration of long DNA stretches into a predetermined genomic location, resulting in high-efficiency site-specific gene addition. Using an extrachromosomal DNA donor carrying a 12-bp tag, a 900-bp ORF, or a 1.5-kb promoter-transcription unit flanked by locus-specific homology arms, we find targeted integration frequencies of 15%, 6%, and 5%, respectively, within 72 h of treatment, and with no selection for the desired event. Importantly, we find that the integration event occurs in a homology-directed manner and leads to the accurate reconstruction of the donor-specified genotype at the endogenous chromosomal locus, and hence presumably results from synthesis-dependent strand annealing repair of the break using the donor DNA as a template. This site-specific gene addition occurs with no measurable increase in the rate of random integration. Remarkably, we also find that ZFNs can drive the addition of an 8-kb sequence carrying three distinct promoter-transcription units into an endogenous locus at a frequency of 6%, also in the absence of any selection. These data reveal the surprising versatility of the specialized polymerase machinery involved in double-strand break repair, illuminate a powerful approach to mammalian cell engineering, and open the possibility of ZFN-driven gene addition therapy for human genetic disease. PMID:17360608

  20. Targeted gene addition into a specified location in the human genome using designed zinc finger nucleases.

    PubMed

    Moehle, Erica A; Rock, Jeremy M; Lee, Ya-Li; Jouvenot, Yann; DeKelver, Russell C; Gregory, Philip D; Urnov, Fyodor D; Holmes, Michael C

    2007-02-27

    Efficient incorporation of novel DNA sequences into a specific site in the genome of living human cells remains a challenge despite its potential utility to genetic medicine, biotechnology, and basic research. We find that a precisely placed double-strand break induced by engineered zinc finger nucleases (ZFNs) can stimulate integration of long DNA stretches into a predetermined genomic location, resulting in high-efficiency site-specific gene addition. Using an extrachromosomal DNA donor carrying a 12-bp tag, a 900-bp ORF, or a 1.5-kb promoter-transcription unit flanked by locus-specific homology arms, we find targeted integration frequencies of 15%, 6%, and 5%, respectively, within 72 h of treatment, and with no selection for the desired event. Importantly, we find that the integration event occurs in a homology-directed manner and leads to the accurate reconstruction of the donor-specified genotype at the endogenous chromosomal locus, and hence presumably results from synthesis-dependent strand annealing repair of the break using the donor DNA as a template. This site-specific gene addition occurs with no measurable increase in the rate of random integration. Remarkably, we also find that ZFNs can drive the addition of an 8-kb sequence carrying three distinct promoter-transcription units into an endogenous locus at a frequency of 6%, also in the absence of any selection. These data reveal the surprising versatility of the specialized polymerase machinery involved in double-strand break repair, illuminate a powerful approach to mammalian cell engineering, and open the possibility of ZFN-driven gene addition therapy for human genetic disease. PMID:17360608

  1. Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences

    PubMed Central

    2014-01-01

    Background: Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research. PMID:24980144
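
    Population differentiation at a single biallelic site can be illustrated with a minimal sketch. The abstract does not state which statistic was used, so this assumes Wright's fixation index computed from heterozygosities, F_ST = (H_T - H_S) / H_T, with equally weighted populations; the function name and the example frequencies are hypothetical.

```python
def fst(allele_freqs):
    """Wright's F_ST for one biallelic site from per-population allele frequencies.

    Populations are weighted equally (an assumption of this sketch).
    Returns 0.0 when the site is monomorphic overall (H_T == 0).
    """
    n = len(allele_freqs)
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / n
    # Expected heterozygosity in the pooled population.
    p_bar = sum(allele_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t


print(fst([0.5, 0.5]))    # identical frequencies: no differentiation
print(fst([0.25, 0.75]))  # moderate frequency difference
print(fst([0.0, 1.0]))    # fixed difference: maximal differentiation
```

    Sites near the upper end of this scale are the kind of highly differentiated variants the abstract treats as candidates for geographically localized selection.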

  2. Remarkably Divergent Regions Punctuate the Genome Assembly of the Caenorhabditis elegans Hawaiian Strain CB4856

    PubMed Central

    Thompson, Owen A.; Snoek, L. Basten; Nijveen, Harm; Sterken, Mark G.; Volkers, Rita J. M.; Brenchley, Rachel; van’t Hof, Arjen; Bevers, Roel P. J.; Cossins, Andrew R.; Yanai, Itai; Hajnal, Alex; Schmid, Tobias; Perkins, Jaryn D.; Spencer, David; Kruglyak, Leonid; Andersen, Erik C.; Moerman, Donald G.; Hillier, LaDeana W.; Kammenga, Jan E.; Waterston, Robert H.

    2015-01-01

    The Hawaiian strain (CB4856) of Caenorhabditis elegans is one of the most divergent from the canonical laboratory strain N2 and has been widely used in developmental, population, and evolutionary studies. To enhance the utility of the strain, we have generated a draft sequence of the CB4856 genome, exploiting a variety of resources and strategies. When compared against the N2 reference, the CB4856 genome has 327,050 single nucleotide variants (SNVs) and 79,529 insertion–deletion events that result in a total of 3.3 Mb of N2 sequence missing from CB4856 and 1.4 Mb of sequence present in CB4856 but not present in N2. As previously reported, the density of SNVs varies along the chromosomes, with the arms of chromosomes showing greater average variation than the centers. In addition, we find 61 regions totaling 2.8 Mb, distributed across all six chromosomes, which have a greatly elevated SNV density, ranging from 2 to 16% SNVs. A survey of other wild isolates shows that the two alternative haplotypes for each region are widely distributed, suggesting they have been maintained by balancing selection over long evolutionary times. These divergent regions contain an abundance of genes from large rapidly evolving families encoding F-box, MATH, BATH, seven-transmembrane G-coupled receptors, and nuclear hormone receptors, suggesting that they provide selective advantages in natural environments. The draft sequence makes available a comprehensive catalog of sequence differences between the CB4856 and N2 strains that will facilitate the molecular dissection of their phenotypic differences. Our work also emphasizes the importance of going beyond simple alignment of reads to a reference genome when assessing differences between genomes. PMID:25995208

  3. Remarkably Divergent Regions Punctuate the Genome Assembly of the Caenorhabditis elegans Hawaiian Strain CB4856.

    PubMed

    Thompson, Owen A; Snoek, L Basten; Nijveen, Harm; Sterken, Mark G; Volkers, Rita J M; Brenchley, Rachel; Van't Hof, Arjen; Bevers, Roel P J; Cossins, Andrew R; Yanai, Itai; Hajnal, Alex; Schmid, Tobias; Perkins, Jaryn D; Spencer, David; Kruglyak, Leonid; Andersen, Erik C; Moerman, Donald G; Hillier, LaDeana W; Kammenga, Jan E; Waterston, Robert H

    2015-07-01

    The Hawaiian strain (CB4856) of Caenorhabditis elegans is one of the most divergent from the canonical laboratory strain N2 and has been widely used in developmental, population, and evolutionary studies. To enhance the utility of the strain, we have generated a draft sequence of the CB4856 genome, exploiting a variety of resources and strategies. When compared against the N2 reference, the CB4856 genome has 327,050 single nucleotide variants (SNVs) and 79,529 insertion-deletion events that result in a total of 3.3 Mb of N2 sequence missing from CB4856 and 1.4 Mb of sequence present in CB4856 but not present in N2. As previously reported, the density of SNVs varies along the chromosomes, with the arms of chromosomes showing greater average variation than the centers. In addition, we find 61 regions totaling 2.8 Mb, distributed across all six chromosomes, which have a greatly elevated SNV density, ranging from 2 to 16% SNVs. A survey of other wild isolates shows that the two alternative haplotypes for each region are widely distributed, suggesting they have been maintained by balancing selection over long evolutionary times. These divergent regions contain an abundance of genes from large rapidly evolving families encoding F-box, MATH, BATH, seven-transmembrane G-coupled receptors, and nuclear hormone receptors, suggesting that they provide selective advantages in natural environments. The draft sequence makes available a comprehensive catalog of sequence differences between the CB4856 and N2 strains that will facilitate the molecular dissection of their phenotypic differences. Our work also emphasizes the importance of going beyond simple alignment of reads to a reference genome when assessing differences between genomes. PMID:25995208

  4. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions.

    PubMed

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  5. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  6. Linkage disequilibrium and diversity for three genomic regions in Azoreans and mainland Portuguese

    PubMed Central

    2009-01-01

    Studies on linkage disequilibrium (LD) across the genome and populations have been used in recent years with the main objective of improving gene mapping of complex traits. Here, we characterize the patterns of genetic diversity of HLA loci and evaluate LD (D') extent in three genomic regions: Xq13.3, NRY and HLA. In addition, we examine the distribution of DXS1225-DXS8082 haplotype diversity in Azoreans and mainland Portuguese. Allele distribution has demonstrated that the São Miguel population is genetically very diverse; haplotype analysis revealed 100% discriminatory power for X- and Y-markers and 94.3% for HLA markers. Standardized multiallelic D' in these three genomic regions shows values lower than 0.33, thereby suggesting there is no extensive LD in the São Miguel population. Data regarding the distribution of DXS1225-DXS8082 haplotypes indicate that there are no significant differences among all the populations studied, (Azorean geographical groups, the Azores archipelago and mainland Portugal). Moreover, in these as well as in other European populations, the most frequent DXS1225-DXS8082 haplotype is 210-219. Even though São Miguel islanders and Azoreans do not constitute isolated populations and show LD for only very short physical distances, certain characteristics, such as the absence of genetic structure, the same environment and the possibility of constructing extensive pedigrees through church and civil records, offer an opportunity for dissecting the genetic background of complex diseases in these populations. PMID:21637671
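
    The D' statistic used in the record above can be illustrated with a minimal sketch. The study reports a standardized multiallelic D'; the code below assumes the simpler two-locus, biallelic case (Lewontin's D'), and the function and parameter names are hypothetical, chosen only for illustration.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' for two biallelic loci (illustrative only).

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected under independence.
    d = p_ab - p_a * p_b
    # The largest attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic locus gives d_max == 0; report no disequilibrium.
    return 0.0 if d_max == 0 else abs(d) / d_max
```

    Values near 0 indicate little disequilibrium and values near 1 indicate that at least one of the four possible haplotypes is absent.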

  7. Augmenting Chinese hamster genome assembly by identifying regions of high confidence.

    PubMed

    Vishwanathan, Nandita; Bandyopadhyay, Arpan A; Fu, Hsu-Yuan; Sharma, Mohit; Johnson, Kathryn C; Mudge, Joann; Ramaraj, Thiruvarangan; Onsongo, Getiria; Silverstein, Kevin A T; Jacob, Nitya M; Le, Huong; Karypis, George; Hu, Wei-Shou

    2016-09-01

    Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of the genome sequence of Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, mammalian genomes assembled using shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high confidence regions in the assembled genome will facilitate its use for cell engineering and genome engineering. We assembled two independent drafts of the Chinese hamster genome by de novo assembly from shotgun sequencing reads and by re-scaffolding and gap-filling the draft genome from NCBI for improved scaffold lengths and gap fractions. We then used the two independent assemblies to identify high confidence regions using two different approaches. First, the two independent assemblies were compared at the sequence level to identify their consensus regions as "high confidence regions", which account for at least 78% of the assembled genome. Further, a genome-wide comparison of the Chinese hamster scaffolds with mouse chromosomes revealed scaffolds with large blocks of collinearity, which were also compiled as high-quality scaffolds. Genome-scale collinearity was complemented with EST-based synteny, which also revealed conserved gene order compared to mouse. As cell line sequencing becomes more commonly practiced, the approaches reported here are useful for assessing the quality of assembly and potentially facilitate the engineering of cell lines. PMID:27374913

  8. Addition of a breeding database in the Genome Database for Rosaceae.

    PubMed

    Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

    2013-01-01

    Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. Development of new infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage lists individuals with parents in common and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will

  9. Addition of a breeding database in the Genome Database for Rosaceae

    PubMed Central

    Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

    2013-01-01

    Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. Development of new infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage lists individuals with parents in common and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will

  10. Genome assemblies for 11 Yersinia pestis strains isolated in the Caucasus region

    SciTech Connect

    Zhgenti, Ekaterine; Johnson, Shannon L.; Davenport, Karen W.; Chanturia, Gvantsa; Daligault, Hajnalka E.; Chain, Patrick S.; Nikolich, Mikeljon P.

    2015-09-17

    Yersinia pestis, the causative agent of plague, is endemic to the Caucasus region but few reference strain genome sequences from that region are available. We present the improved draft or finished assembled genomes from 11 strains isolated in the nation of Georgia and surrounding countries.

  11. Genome assemblies for 11 Yersinia pestis strains isolated in the Caucasus region

    DOE PAGESBeta

    Zhgenti, Ekaterine; Johnson, Shannon L.; Davenport, Karen W.; Chanturia, Gvantsa; Daligault, Hajnalka E.; Chain, Patrick S.; Nikolich, Mikeljon P.

    2015-09-17

    Yersinia pestis, the causative agent of plague, is endemic to the Caucasus region but few reference strain genome sequences from that region are available. We present the improved draft or finished assembled genomes from 11 strains isolated in the nation of Georgia and surrounding countries.

  12. Genome Assemblies for 11 Yersinia pestis Strains Isolated in the Caucasus Region.

    PubMed

    Zhgenti, Ekaterine; Johnson, Shannon L; Davenport, Karen W; Chanturia, Gvantsa; Daligault, Hajnalka E; Chain, Patrick S; Nikolich, Mikeljon P

    2015-01-01

    Yersinia pestis, the causative agent of plague, is endemic to the Caucasus region but few reference strain genome sequences from that region are available. Here, we present the improved draft or finished assembled genomes from 11 strains isolated in the nation of Georgia and surrounding countries. PMID:26383663

  13. Ecological effects of cell-level processes: genome size, functional traits and regional abundance of herbaceous plant species

    PubMed Central

    Herben, Tomáš; Suda, Jan; Klimešová, Jitka; Mihulka, Stanislav; Říha, Pavel; Šímová, Irena

    2012-01-01

    Background and Aims Genome size is known to be correlated with a number of phenotypic traits associated with cell sizes and cell-division rates. Genome size was therefore used as a proxy for them in order to assess how common plant traits such as height, specific leaf area and seed size/number predict species regional abundance. In this study it is hypothesized that if there is residual correlation between genome size and abundance after these traits are partialled out, there must be additional ecological effects of cell size and/or cell-division rate. Methods Variation in genome size, plant traits and regional abundance were examined in 436 herbaceous species of central European flora, and relationships were sought for among these variables by correlation and path analysis. Key Results Species regional abundance was weakly but significantly correlated with genome size; the relationship was stronger for annuals (R² = 0·145) than for perennials (R² = 0·027). In annuals, genome size was linked to abundance via its effect on seed size, which constrains seed number and hence population growth rate. In perennials, it weakly affected (via height and specific leaf area) competitive ability. These relationships did not change qualitatively after phylogenetic correction. In both annuals and perennials there was an unresolved effect of genome size on abundance. Conclusions The findings indicate that additional predictors of regional abundance should be sought among variables that are linked to cell size and cell-division rate. Signals of these cell-level processes remain identifiable even at the landscape scale, and show deep differences between perennials and annuals. Plant population biology could thus possibly benefit from more systematic use of indicators of cell-level processes. PMID:22628380

  14. Comprehensive Repertoire of Foldable Regions within Whole Genomes

    PubMed Central

    Faure, Guilhem; Callebaut, Isabelle

    2013-01-01

    In order to get a comprehensive repertoire of foldable domains within whole proteomes, including orphan domains, we developed a novel procedure, called SEG-HCA. From only the information of a single amino acid sequence, SEG-HCA automatically delineates segments possessing high densities in hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA). These hydrophobic clusters mainly correspond to regular secondary structures, which together form structured or foldable regions. Genome-wide analyses revealed that SEG-HCA is opposite of disorder predictors, both addressing distinct structural states. Interestingly, there is however an overlap between the two predictions, including small segments of disordered sequences, which undergo coupled folding and binding. SEG-HCA thus gives access to these specific domains, which are generally poorly represented in domain databases. Comparison of the whole set of SEG-HCA predictions with the Conserved Domain Database (CDD) also highlighted a wide proportion of predicted large (length >50 amino acids) segments, which are CDD orphan. These orphan sequences may either correspond to highly divergent members of already known families or belong to new families of domains. Their comprehensive description thus opens new avenues to investigate new functional and/or structural features, which remained so far uncovered. Altogether, the data described here provide new insights into the protein architecture and organization throughout the three kingdoms of life. PMID:24204229

  15. Discovery and verification of functional single nucleotide polymorphisms in regulatory genomic regions: Current and developing technologies

    PubMed Central

    Chorley, Brian N.; Wang, Xuting; Campbell, Michelle R.; Pittman, Gary S.; Noureddine, Maher A.; Bell, Douglas A.

    2008-01-01

    The most common form of genetic variation, single nucleotide polymorphisms or SNPs, can affect the way an individual responds to the environment and modify disease risk. Although most of the millions of SNPs have little or no effect on gene regulation and protein activity, there are many circumstances where base changes can have deleterious effects. Non-synonymous SNPs that result in amino acid changes in proteins have been studied because of their obvious impact on protein activity. It is well known that SNPs within regulatory regions of the genome can result in dysregulation of gene transcription. However, the impact of SNPs located in putative regulatory regions, or rSNPs, is harder to predict for two primary reasons. First, the mechanistic roles of non-coding genomic sequence remain poorly defined. Second, experimental validation of the functional consequences of rSNPs is often slow and laborious. In this review, we summarize traditional and novel methodologies for candidate rSNP selection, in particular in silico techniques that aid in this selection. Additionally, we discuss molecular biological techniques that assess the impact of rSNPs on binding of regulatory machinery, as well as functional consequences on transcription. Standard techniques such as EMSA and luciferase reporter constructs are still widely used to assess effects of rSNPs on binding and gene transcription; however, these protocols are often bottlenecks in the discovery process. Therefore, we highlight novel and developing high-throughput protocols that promise to aid in shortening the process of rSNP validation. Given the large amount of genomic information generated from a multitude of re-sequencing and genome-wide SNP array efforts, future focus should be to develop validation techniques that will allow greater understanding of the impact these polymorphisms have on human health and disease. PMID:18565787

  16. LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor

    PubMed Central

    Sheffield, Nathan C.; Bock, Christoph

    2016-01-01

    Summary: Genomic datasets are often interpreted in the context of large-scale reference databases. One approach is to identify significantly overlapping gene sets, which works well for gene-centric data. However, many types of high-throughput data are based on genomic regions. Locus Overlap Analysis (LOLA) provides easy and automatable enrichment analysis for genomic region sets, thus facilitating the interpretation of functional genomics and epigenomics data. Availability and Implementation: R package available in Bioconductor and on the following website: http://lola.computational-epigenetics.org. Contact: nsheffield@cemm.oeaw.ac.at or cbock@cemm.oeaw.ac.at PMID:26508757
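
    The general idea of enrichment analysis for region sets can be sketched as follows. This is a minimal illustration and not LOLA's own code or API: it assumes regions are (chromosome, start, end) tuples with half-open coordinates, that the query set is a subset of a user-supplied universe, and that enrichment is scored with a one-sided hypergeometric (Fisher-style) tail probability. All function names are hypothetical.

```python
from math import comb

def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def count_hits(regions, reference):
    """Number of regions overlapping at least one reference region."""
    return sum(any(overlaps(r, ref) for ref in reference) for r in regions)

def right_tail(k, n, K, N):
    """P(X >= k) when n regions are drawn from a universe of N regions,
    K of which overlap the reference set (hypergeometric tail)."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def enrichment(query, universe, reference):
    """Return (overlap count, one-sided p-value) for a query region set."""
    k = count_hits(query, reference)
    K = count_hits(universe, reference)
    return k, right_tail(k, len(query), K, len(universe))
```

    The quadratic overlap scan keeps the sketch short; for realistically sized region sets an interval index would be the more sensible choice.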

  17. Additional evidence implicating Triticum searsii as the B-genome donor to wheat.

    PubMed

    Nath, J; Hanzel, J J; Thompson, J P; McNay, J W

    1984-02-01

    In vitro DNA:DNA hybridizations and hydroxyapatite thermal-elution chromatography were employed to identify the diploid wheat species ancestral to the B genome of Triticum turgidum. 3H-T. turgidum DNA was hybridized to the unlabeled DNAs of T. urartu, T. speltoides, T. sharonensis, T. bicorne, T. longissimum, and T. searsii. 3H-Labeled DNAs of T. monococcum and a synthetic tetraploid AADD were hybridized with unlabeled DNAs of T. urartu and T. searsii to determine the relationship of the A genome of polyploid wheat and T. urartu. The heteroduplex thermal stabilities indicated that T. searsii was most closely related to the B genome of T. turgidum (AB) and that the genome of T. urartu and the A genome have a great deal of base-sequence homology. Thus, it appears that T. searsii is the B-genome donor to polyploid wheat or a major chromosome donor if the B genome is polyphyletic in origin. PMID:6712588

  18. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep.

    PubMed

    Mousel, Michelle R; Reynolds, James O; White, Stephen N

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P < 0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P < 1×10^-5) were identified including markers in or near PIK3CB (P = 2.22×10^-6; additive model), KCNB1 (P = 2.93×10^-6; dominance model), ZC3H12C (P = 3.25×10^-6; genotypic model), JPH1 (P = 4.68×10^-6; genotypic model), and MYO3B (P = 5.74×10^-6; recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection. PMID:26098909

  19. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep

    PubMed Central

    Mousel, Michelle R.; Reynolds, James O.; White, Stephen N.

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P < 0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P < 1×10^-5) were identified including markers in or near PIK3CB (P = 2.22×10^-6; additive model), KCNB1 (P = 2.93×10^-6; dominance model), ZC3H12C (P = 3.25×10^-6; genotypic model), JPH1 (P = 4.68×10^-6; genotypic model), and MYO3B (P = 5.74×10^-6; recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection. PMID:26098909

  20. Identification of Low-Confidence Regions in the Pig Reference Genome (Sscrofa10.2).

    PubMed

    Warr, Amanda; Robert, Christelle; Hume, David; Archibald, Alan L; Deeb, Nader; Watson, Mick

    2015-01-01

    Many applications of high throughput sequencing rely on the availability of an accurate reference genome. Variant calling often produces large data sets that cannot be realistically validated and which may contain large numbers of false-positives. Errors in the reference assembly increase the number of false-positives. While resources are available to aid in the filtering of variants from human data, for other species these do not yet exist and strict filtering techniques must be employed which are more likely to exclude true-positives. This work assesses the accuracy of the pig reference genome (Sscrofa10.2) using whole genome sequencing reads from the Duroc sow whose genome the assembly was based on. Indicators of structural variation including high regional coverage, unexpected insert sizes, improper pairing and homozygous variants were used to identify low quality (LQ) regions of the assembly. Low coverage (LC) regions were also identified and analyzed separately. The LQ regions covered 13.85% of the genome, the LC regions covered 26.6% of the genome and combined (LQLC) they covered 33.07% of the genome. Over half of dbSNP variants were located in the LQLC regions. Of copy number variable regions identified in a previous study, 86.3% were located in the LQLC regions. The regions were also enriched for gene predictions from RNA-seq data with 42.98% falling in the LQLC regions. Excluding variants in the LQ, LC, or LQLC from future analyses will help reduce the number of false-positive variant calls. Researchers using WGS data should be aware that the current pig reference genome does not give an accurate representation of the copy number of alleles in the original Duroc sow's genome. PMID:26640477

  1. Estimating additive and dominance variances for complex traits in pigs combining genomic and pedigree information.

    PubMed

    Costa, E V; Diniz, D B; Veroneze, R; Resende, M D V; Azevedo, C F; Guimaraes, S E F; Silva, F F; Lopes, P S

    2015-01-01

    Knowledge of dominance effects should improve genetic evaluations, provide the accurate selection of purebred animals, and enable better breeding strategies, including the exploitation of heterosis in crossbreeds. In this study, we combined genomic and pedigree data to study the relative importance of additive and dominance genetic variation in growth and carcass traits in an F2 pig population. Two GBLUP models were used, a model without a polygenic effect (ADM) and a model with a polygenic effect (ADMP). Additive effects played a greater role in the control of growth and carcass traits than did dominance effects. However, dominance effects were important for all traits, particularly in backfat thickness. The narrow-sense and broad-sense heritability estimates for growth (0.06 to 0.42, and 0.10 to 0.51, respectively) and carcass traits (0.07 to 0.37, and 0.10 to 0.76, respectively) exhibited a wide variation. The inclusion of a polygenic effect in the ADMP model changed the broad-sense heritability estimates only for birth weight and weight at 21 days of age. PMID:26125833
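    For readers unfamiliar with the two heritability measures, the textbook definitions are sketched below. The abstract does not state how the variances were partitioned, so the symbols and the three-component decomposition (additive, dominance, residual) are assumptions that correspond most closely to a model without a polygenic term.

```latex
% Assumed decomposition of the phenotypic variance
\sigma^2_p = \sigma^2_a + \sigma^2_d + \sigma^2_e

% Narrow-sense heritability: additive share only
h^2 = \frac{\sigma^2_a}{\sigma^2_p}

% Broad-sense heritability: additive plus dominance share
H^2 = \frac{\sigma^2_a + \sigma^2_d}{\sigma^2_p}
```

    Under these definitions the gap between the two estimates, H^2 - h^2, is the dominance share of the phenotypic variance.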

  2. Breaking Good: Accounting for Fragility of Genomic Regions in Rearrangement Distance Estimation

    PubMed Central

    Biller, Priscila; Guéguen, Laurent; Knibbe, Carole; Tannier, Eric

    2016-01-01

    Models of evolution by genome rearrangements are prone to two types of flaws: One is to ignore the diversity of susceptibility to breakage across genomic regions, and the other is to suppose that susceptibility values are given. Without necessarily supposing their precise localization, we call “solid” the regions that are improbably broken by rearrangements and “fragile” the regions outside solid ones. We propose a model of evolution by inversions where breakage probabilities vary across fragile regions and over time. It contains as a particular case the uniform breakage model on the nucleotidic sequence, where breakage probabilities are proportional to fragile region lengths. This is very different from the frequently used pseudouniform model where all fragile regions have the same probability to break. Estimations of rearrangement distances based on the pseudouniform model completely fail on simulations with the truly uniform model. On pairs of amniote genomes, we show that identifying coding genes with solid regions yields incoherent distance estimations, especially with the pseudouniform model, and to a lesser extent with the truly uniform model. This incoherence is solved when we coestimate the number of fragile regions with the rearrangement distance. The estimated number of fragile regions is surprisingly small, suggesting that a minority of regions are recurrently used by rearrangements. Estimations for several pairs of genomes at different divergence times are in agreement with a slowly evolvable colocalization of active genomic regions in the cell. PMID:27190002
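    The difference between the two breakage models contrasted in this abstract can be illustrated with a short sketch. The function name and the list-of-lengths input are hypothetical; the code only encodes the two definitions given in the text and is not the authors' estimator.

```python
def breakage_probabilities(lengths, model="uniform"):
    """Per-region breakage probability for a list of fragile region lengths.

    model="uniform":       probability proportional to region length
                           (uniform breakage along the nucleotide sequence).
    model="pseudouniform": every fragile region equally likely to break,
                           whatever its length.
    """
    if not lengths or any(length <= 0 for length in lengths):
        raise ValueError("lengths must be a non-empty list of positive numbers")
    if model == "uniform":
        total = sum(lengths)
        return [length / total for length in lengths]
    if model == "pseudouniform":
        return [1.0 / len(lengths)] * len(lengths)
    raise ValueError(f"unknown model: {model!r}")
```

    With region lengths 10, 30 and 60, the uniform model assigns 0.1, 0.3 and 0.6, while the pseudouniform model assigns one third to each region. The two models diverge more as region lengths become more unequal.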

  3. Genomic regions associated with necrotic enteritis resistance in Fayoumi and White Leghorn chickens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, we used two breeds of chicken to identify genomic regions corresponding to necrotic enteritis (NE) resistance. We scanned the genomes of a resistant and susceptible line of Fayoumi and White Leghorn chicken using a chicken 60K Illumina SNP panel. A total of 235 loci with divergently ...

  4. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    PubMed

    Wang, Yu; Li, Wei; Xia, Yingying; Wang, Chongzhi; Tang, Y Tom; Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2014-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information. PMID:25919136

  5. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions

    PubMed Central

    Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2015-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information. PMID:25919136

  6. Combined analysis of genome-wide expression and copy number profiles to identify key altered genomic regions in cancer

    PubMed Central

    2012-01-01

    Background Analysis of DNA copy number alterations and gene expression changes in human samples have been used to find potential target genes in complex diseases. Recent studies have combined these two types of data using different strategies, but focusing on finding gene-based relationships. However, it has been proposed that these data can be used to identify key genomic regions, which may enclose causal genes under the assumption that disease-associated gene expression changes are caused by genomic alterations. Results Following this proposal, we undertake a new integrative analysis of genome-wide expression and copy number datasets. The analysis is based on the combined location of both types of signals along the genome. Our approach takes into account the genomic location in the copy number (CN) analysis and also in the gene expression (GE) analysis. To achieve this we apply a segmentation algorithm to both types of data using paired samples. Then, we perform a correlation analysis and a frequency analysis of the gene loci in the segmented CN regions and the segmented GE regions; selecting in both cases the statistically significant loci. In this way, we find CN alterations that show strong correspondence with GE changes. We applied our method to a human dataset of 64 Glioblastoma Multiforme samples finding key loci and hotspots that correspond to major alterations previously described for this type of tumors. Conclusions Identification of key altered genomic loci constitutes a first step to find the genes that drive the alteration in a malignant state. These driver genes can be found in regions that show high correlation in copy number alterations and expression changes. PMID:23095915

  7. Lost region in amyloid precursor protein (APP) through TALEN-mediated genome editing alters mitochondrial morphology.

    PubMed

    Wang, Yajie; Wu, Fengyi; Pan, Haining; Zheng, Wenzhong; Feng, Chi; Wang, Yunfu; Deng, Zixin; Wang, Lianrong; Luo, Jie; Chen, Shi

    2016-01-01

    Alzheimer's disease (AD) is characterized by amyloid-β (Aβ) deposition in the brain. Aβ plaques are produced through sequential β/γ cleavage of amyloid precursor protein (APP), of which there are three main APP isoforms: APP695, APP751 and APP770. KPI-APPs (APP751 and APP770) are known to be elevated in AD, but the reason remains unclear. Transcription activator-like (TAL) effector nucleases (TALENs) induce mutations with high efficiency at specific genomic loci, and it is thus possible to knock out specific regions using TALENs. In this study, we designed and expressed TALENs specific for the C-terminus of APP in HeLa cells, in which KPI-APPs are predominantly expressed. The KPI-APP mutants lack a 12-aa region that encompasses a 5-aa trans-membrane (TM) region and 7-aa juxta-membrane (JM) region. The mutated KPI-APPs exhibited decreased mitochondrial localization. In addition, mitochondrial morphology was altered, resulting in an increase in spherical mitochondria in the mutant cells through the disruption of the balance between fission and fusion. Mitochondrial dysfunction, including decreased ATP levels, disrupted mitochondrial membrane potential, increased ROS generation and impaired mitochondrial dehydrogenase activity, was also found. These results suggest that specific regions of KPI-APPs are important for mitochondrial localization and function. PMID:26924205

  8. Lost region in amyloid precursor protein (APP) through TALEN-mediated genome editing alters mitochondrial morphology

    PubMed Central

    Wang, Yajie; Wu, Fengyi; Pan, Haining; Zheng, Wenzhong; Feng, Chi; Wang, Yunfu; Deng, Zixin; Wang, Lianrong; Luo, Jie; Chen, Shi

    2016-01-01

    Alzheimer’s disease (AD) is characterized by amyloid-β (Aβ) deposition in the brain. Aβ plaques are produced through sequential β/γ cleavage of amyloid precursor protein (APP), of which there are three main APP isoforms: APP695, APP751 and APP770. KPI-APPs (APP751 and APP770) are known to be elevated in AD, but the reason remains unclear. Transcription activator-like (TAL) effector nucleases (TALENs) induce mutations with high efficiency at specific genomic loci, and it is thus possible to knock out specific regions using TALENs. In this study, we designed and expressed TALENs specific for the C-terminus of APP in HeLa cells, in which KPI-APPs are predominantly expressed. The KPI-APP mutants lack a 12-aa region that encompasses a 5-aa trans-membrane (TM) region and 7-aa juxta-membrane (JM) region. The mutated KPI-APPs exhibited decreased mitochondrial localization. In addition, mitochondrial morphology was altered, resulting in an increase in spherical mitochondria in the mutant cells through the disruption of the balance between fission and fusion. Mitochondrial dysfunction, including decreased ATP levels, disrupted mitochondrial membrane potential, increased ROS generation and impaired mitochondrial dehydrogenase activity, was also found. These results suggest that specific regions of KPI-APPs are important for mitochondrial localization and function. PMID:26924205

  9. Genome-wide meta-analysis of maize heterosis reveals the potential role of additive gene expression at pericentromeric loci

    PubMed Central

    2014-01-01

    Background The identification of QTL involved in heterosis formation is one approach to unravel the not yet fully understood genetic basis of heterosis - the improved agronomic performance of hybrid F1 plants compared to their inbred parents. The identification of candidate genes underlying a QTL is important both for developing markers and determining the molecular genetic basis of a trait, but remains difficult owing to the large number of genes often contained within individual QTL. To address this problem in heterosis analysis, we applied a meta-analysis strategy for grain yield (GY) of Zea mays L. as an example, incorporating QTL-, hybrid field-, and parental gene expression data. Results For the identification of genes underlying known heterotic QTL, we made use of tight associations between gene expression pattern and the trait of interest, identified by correlation analyses. Using this approach, genes strongly associated with heterosis for GY were discovered to be clustered in pericentromeric regions of the complex maize genome. This suggests that expression differences of sequences in recombination-suppressed regions are important in the establishment of heterosis for GY in F1 hybrids and also in the conservation of heterosis for GY across genotypes. Importantly, functional analysis of heterosis-associated genes from these genomic regions revealed over-representation of a number of functional classes, identifying key processes contributing to heterosis for GY. Based on the finding that the majority of the analyzed heterosis-associated genes were additively expressed, we propose a model referring to the influence of cis-regulatory variation on heterosis for GY by the compensation of fixed detrimental expression levels in parents. Conclusions The study highlights the utility of a meta-analysis approach that integrates phenotypic and multi-level molecular data to unravel complex traits in plants. It provides prospects for the identification of genes relevant for

  10. ECRbase: Database of Evolutionary Conserved Regions, Promoters, and Transcription Factor Binding Sites in Vertebrate Genomes

    DOE Data Explorer

    Loots, Gabriela G. [LLNL; Ovcharenko, I. [LLNL

    Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. This database of evolutionary conserved regions (ECRs) in vertebrate genomes features a database of syntenic blocks that recapitulate the evolution of rearrangements in vertebrates and a comprehensive collection of promoters in all vertebrate genomes generated using multiple sources of gene annotation. The database also contains a collection of annotated transcription factor binding sites (TFBSs) in evolutionary conserved and promoter elements. ECRbase currently includes human, rhesus macaque, dog, opossum, rat, mouse, chicken, frog, zebrafish, and fugu genomes. (taken from paper in Journal: Bioinformatics, November 7, 2006, pp. 122-124)

  11. Whole-genome sequencing reveals small genomic regions of introgression in an introduced crater lake population of threespine stickleback.

    PubMed

    Yoshida, Kohta; Miyagi, Ryutaro; Mori, Seiichi; Takahashi, Aya; Makino, Takashi; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2016-04-01

    Invasive species pose a major threat to biological diversity. Although introduced populations often experience population bottlenecks, some invasive species are thought to have originated from hybridization between multiple populations or species, which can contribute to the maintenance of high genetic diversity. Recent advances in genome sequencing enable us to trace the evolutionary history of invasive species even at the whole-genome level and may help to identify a history of past hybridization that may be overlooked by traditional marker-based analysis. Here, we conducted whole-genome sequencing of eight threespine stickleback (Gasterosteus aculeatus) individuals, four from a recently introduced crater lake population and four from the putative source population. We found that both populations have several small genomic regions with high genetic diversity, which resulted from introgression from a closely related species (Gasterosteus nipponicus). The sizes of the regions were too small to be detected with traditional marker-based analysis or even some reduced-representation sequencing methods. Further amplicon sequencing revealed linkage disequilibrium around an introgression site, which suggests the possibility of a selective sweep at the introgression site. Thus, interspecies introgression might predate introduction and increase genetic variation in the source population. Whole-genome sequencing of even a small number of individuals can therefore provide higher-resolution inference of the history of introduced populations. PMID:27069575

  12. Characterization of the flamenco region of the Drosophila melanogaster genome.

    PubMed Central

    Robert, V; Prud'homme, N; Kim, A; Bucheton, A; Pélisson, A

    2001-01-01

    The flamenco gene, located at 20A1-3 in the beta-heterochromatin of the Drosophila X chromosome, is a major regulator of the gypsy/mdg4 endogenous retrovirus. As a first step to characterize this gene, approximately 100 kb of genomic DNA flanking a P-element-induced mutation of flamenco was isolated. This DNA is located in a sequencing gap of the Celera Genomics project, i.e., one of those parts of the genome in which the "shotgun" sequence could not be assembled, probably because it contains long stretches of repetitive DNA, especially on the proximal side of the P insertion point. Deficiency mapping indicated that sequences required for the normal flamenco function are located >130 kb proximal to the insertion site. The distal part of the cloned DNA does, nevertheless, contain several unique sequences, including at least four different transcription units. Dip1, the closest one to the P-element insertion point, might be a good candidate for a gypsy regulator, since it putatively encodes a nuclear protein containing two double-stranded RNA-binding domains. However, transgenes containing dip1 genomic DNA were not able to rescue flamenco mutant flies. The possible nature of the missing flamenco sequences is discussed. PMID:11404334

  13. Characterization of the flamenco region of the Drosophila melanogaster genome.

    PubMed

    Robert, V; Prud'homme, N; Kim, A; Bucheton, A; Pélisson, A

    2001-06-01

    The flamenco gene, located at 20A1-3 in the beta-heterochromatin of the Drosophila X chromosome, is a major regulator of the gypsy/mdg4 endogenous retrovirus. As a first step to characterize this gene, approximately 100 kb of genomic DNA flanking a P-element-induced mutation of flamenco was isolated. This DNA is located in a sequencing gap of the Celera Genomics project, i.e., one of those parts of the genome in which the "shotgun" sequence could not be assembled, probably because it contains long stretches of repetitive DNA, especially on the proximal side of the P insertion point. Deficiency mapping indicated that sequences required for the normal flamenco function are located >130 kb proximal to the insertion site. The distal part of the cloned DNA does, nevertheless, contain several unique sequences, including at least four different transcription units. Dip1, the closest one to the P-element insertion point, might be a good candidate for a gypsy regulator, since it putatively encodes a nuclear protein containing two double-stranded RNA-binding domains. However, transgenes containing dip1 genomic DNA were not able to rescue flamenco mutant flies. The possible nature of the missing flamenco sequences is discussed. PMID:11404334

  14. Genomics of the hop pseudo-autosomal regions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Hop is one of the few crop species with female and male plants with sex being determined by either XX or XY chromosomes. Hop cones are only produced in female hops with or without fertilization. This has led to most genomic research being directed toward female plants. Very little work has been don...

  15. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity

    PubMed Central

    2013-01-01

    Background Hanwoo (Korean cattle), which originated from natural crossbreeding between taurine and zebu cattle, migrated to the Korean peninsula through North China. Hanwoo were raised as draft animals until the 1970s without the introduction of foreign germplasm. Since 1979, Hanwoo has been bred as beef cattle. Genetic variation was analyzed by whole-genome deep resequencing of a Hanwoo bull. The Hanwoo genome was compared to that of two other breeds, Black Angus and Holstein, and genes within regions of homozygosity were investigated to elucidate the genetic and genomic characteristics of Hanwoo. Results The Hanwoo bull genome was sequenced to 45.6-fold coverage using the ABI SOLiD system. In total, 4.7 million single-nucleotide polymorphisms and 0.4 million small indels were identified by comparison with the Btau4.0 reference assembly. Of the total number of SNPs and indels, 58% and 87%, respectively, were novel. The overall genotype concordance between the SNPs and BovineSNP50 BeadChip data was 96.4%. Of 1.6 million genetic differences in Hanwoo, approximately 25,000 non-synonymous SNPs, splice-site variants, and coding indels (NS/SS/Is) were detected in 8,360 genes. Among 1,045 genes containing reliable specific NS/SS/Is in Hanwoo, 109 genes contained more than one novel damaging NS/SS/I. Of the genes containing NS/SS/Is, 610 genes were assigned as trait-associated genes. Moreover, 16, 78, and 51 regions of homozygosity (ROHs) were detected in Hanwoo, Black Angus, and Holstein, respectively. ‘Regulation of actin filament length’ was revealed as a significant gene ontology term and 25 trait-associated genes for meat quality and disease resistance were found in 753 genes that resided in the ROHs of Hanwoo. In Hanwoo, 43 genes were located in common ROHs between whole-genome resequencing and SNP chips in BTA2, 10, and 13 coincided with quantitative trait loci for meat fat traits. In addition, the common ROHs in BTA2 and 16 were in agreement between Hanwoo and

  16. Genome-Wide Profiling of PARP1 Reveals an Interplay with Gene Regulatory Regions and DNA Methylation

    PubMed Central

    Nalabothula, Narasimharao; Al-jumaily, Taha; Eteleeb, Abdallah M.; Flight, Robert M.; Xiaorong, Shao; Moseley, Hunter; Rouchka, Eric C.; Fondufe-Mittendorf, Yvonne N.

    2015-01-01

    Poly (ADP-ribose) polymerase-1 (PARP1) is a nuclear enzyme involved in DNA repair, chromatin remodeling and gene expression. PARP1 interactions with chromatin architectural multi-protein complexes (i.e. nucleosomes) alter chromatin structure resulting in changes in gene expression. Chromatin structure impacts gene regulatory processes including transcription, splicing, DNA repair, replication and recombination. It is important to delineate whether PARP1 randomly associates with nucleosomes or is present at specific nucleosome regions throughout the cell genome. We performed genome-wide association studies in breast cancer cell lines to address these questions. Our studies show that PARP1 associates with epigenetic regulatory elements genome-wide, such as active histone marks, CTCF and DNase hypersensitive sites. Additionally, the binding of PARP1 to chromatin genome-wide is mutually exclusive with DNA methylation pattern suggesting a functional interplay between PARP1 and DNA methylation. Indeed, inhibition of PARylation results in genome-wide changes in DNA methylation patterns. Our results suggest that PARP1 controls the fidelity of gene transcription and marks actively transcribed gene regions by selectively binding to transcriptionally active chromatin. These studies provide a platform for developing our understanding of PARP1’s role in gene regulation. PMID:26305327

  17. Self-identification of protein-coding regions in microbial genomes.

    PubMed

    Audic, S; Claverie, J M

    1998-08-18

    A new method for predicting protein-coding regions in microbial genomic DNA sequences is presented. It uses an ab initio iterative Markov modeling procedure to automatically perform the partition of genomic sequences into three subsets shown to correspond to coding, coding on the opposite strand, and noncoding segments. In contrast to current methods, such as GENEMARK [Borodovsky, M. & McIninch, J. D. (1993) Comput. Chem. 17, 123-133], no training set or prior knowledge of the statistical properties of the studied genome are required. This new method tolerates error rates of 1-2% and can process unassembled sequences. It is thus ideal for the analysis of genome survey and/or fragmented sequence data from uncharacterized microorganisms. The method was validated on 10 complete bacterial genomes (from four major phylogenetic lineages). The results show that protein-coding regions can be identified with an accuracy of up to 90% with a totally automated and objective procedure. PMID:9707594

  18. Self-Confirmation and Ascertainment of the Candidate Genomic Regions of Complex Trait Loci – A None-Experimental Solution

    PubMed Central

    Wang, Lishi; Jiao, Yan; Wang, Yongjun; Zhang, Mengchen; Gu, Weikuan

    2016-01-01

    Over the past half century, thousands of quantitative trait loci (QTL) have been identified by using animal models and plant populations. However, the unreliability and imprecision of the genomic regions of these loci have remained the major hurdle for the identification of the causal genes for the corresponding traits. We used a non-experimental strategy of strain number reduction for testing accuracy and ascertainment of the candidate region for QTL. We tested the strategy in over 400 analyses with data from 47 studies. These studies include: 1) studies with recombinant inbred (RI) strains of mice. We first tested two previously mapped QTL with well-defined genomic regions; we then tested four additional studies with known QTL regions; and finally we examined the reliability of QTL in 38 sets of data which are produced from relatively large numbers of RI strains, derived from C57BL/6J (B6) X DBA/2J (D2), known as BXD RI mouse strains; 2) studies with RI strains of rats and plants; and 3) studies using F2 populations in mice, rats and plants. In these cases, our method identified the reliability of mapped QTL and localized the candidate genes into the defined genomic regions. Our data also suggest that the LRS score produced by permutation tests does not necessarily confirm the reliability of the QTL. The number of strains is not a reliable indicator of the accuracy of QTL either. Our strategy determines the reliability and accuracy of the genomic region of a QTL without any additional experimental study such as congenic breeding. PMID:27203862

  19. Self-Confirmation and Ascertainment of the Candidate Genomic Regions of Complex Trait Loci - A None-Experimental Solution.

    PubMed

    Wang, Lishi; Jiao, Yan; Wang, Yongjun; Zhang, Mengchen; Gu, Weikuan

    2016-01-01

    Over the past half century, thousands of quantitative trait loci (QTL) have been identified by using animal models and plant populations. However, the unreliability and imprecision of the genomic regions of these loci have remained the major hurdle for the identification of the causal genes for the corresponding traits. We used a non-experimental strategy of strain number reduction for testing accuracy and ascertainment of the candidate region for QTL. We tested the strategy in over 400 analyses with data from 47 studies. These studies include: 1) studies with recombinant inbred (RI) strains of mice. We first tested two previously mapped QTL with well-defined genomic regions; we then tested four additional studies with known QTL regions; and finally we examined the reliability of QTL in 38 sets of data which are produced from relatively large numbers of RI strains, derived from C57BL/6J (B6) X DBA/2J (D2), known as BXD RI mouse strains; 2) studies with RI strains of rats and plants; and 3) studies using F2 populations in mice, rats and plants. In these cases, our method identified the reliability of mapped QTL and localized the candidate genes into the defined genomic regions. Our data also suggest that the LRS score produced by permutation tests does not necessarily confirm the reliability of the QTL. The number of strains is not a reliable indicator of the accuracy of QTL either. Our strategy determines the reliability and accuracy of the genomic region of a QTL without any additional experimental study such as congenic breeding. PMID:27203862

  20. Sequencing of 15 622 gene-bearing BACs clarifies the gene-dense regions of the barley genome.

    PubMed

    Muñoz-Amatriaín, María; Lonardi, Stefano; Luo, MingCheng; Madishetty, Kavitha; Svensson, Jan T; Moscou, Matthew J; Wanamaker, Steve; Jiang, Tao; Kleinhofs, Andris; Muehlbauer, Gary J; Wise, Roger P; Stein, Nils; Ma, Yaqin; Rodriguez, Edmundo; Kudrna, Dave; Bhat, Prasanna R; Chao, Shiaoman; Condamine, Pascal; Heinen, Shane; Resnik, Josh; Wing, Rod; Witt, Heather N; Alpert, Matthew; Beccuti, Marco; Bozdag, Serdar; Cordero, Francesca; Mirebrahim, Hamid; Ounit, Rachid; Wu, Yonghui; You, Frank; Zheng, Jie; Simková, Hana; Dolezel, Jaroslav; Grimwood, Jane; Schmutz, Jeremy; Duma, Denisa; Altschmied, Lothar; Blake, Tom; Bregitzer, Phil; Cooper, Laurel; Dilbirligi, Muharrem; Falk, Anders; Feiz, Leila; Graner, Andreas; Gustafson, Perry; Hayes, Patrick M; Lemaux, Peggy; Mammadov, Jafar; Close, Timothy J

    2015-10-01

    Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosome (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low recombination is particularly relevant. PMID:26252423

  1. Analysis of real time PCR amplification efficiencies from three genomic regions of dengue virus.

    PubMed

    Odreman-Macchioli, María; Vielma, Silvana; Atchley, Daniel; Comach, Guillermo; Ramirez, Alvaro; Pérez, Saberio; Téllez, Luis; Quintero, Beatriz; Hernández, Erick; Muñoz, Maritza; Mendoza, José

    2013-03-01

    Early diagnosis of dengue virus (DENV) infection represents a key factor in preventing clinical complications attributed to the disease. The aim of this study was to evaluate the amplification efficiencies of an in-house quantitative real time-PCR (qPCR) assay of DENV, using the non-structural conserved genomic region protein-5 (NS5) versus two genomic regions usually employed for virus detection, the capsid/pre-membrane region (C-prM) and the 3'-noncoding region (3'NC). One hundred sixty-seven acute phase serum samples from febrile patients were used for validation purposes. Results showed that the three genomic regions had similar amplification profiles and correlation coefficients (0.987-0.999). When isolated viruses were used, the NS5 region had the highest qPCR efficiencies for the four serotypes (98-100%). Amplification from acute serum samples showed that 41.1% (67/167) were positive for the universal assay by at least two of the selected genomic regions. The agreement rates between NS5/C-prM and NS5/3'NC regions were 56.7% and 97%, respectively. Amplification concordance values between C-prM/NS5 and NS5/3'NC regions showed weak (kappa = 0.109; CI 95%) and moderate (kappa = 0.489; CI 95%) agreement, respectively. Serotyping assay using a singleplex NS5-TaqMan format was much more sensitive than the C-prM/SYBR Green I protocol (76%). External evaluation showed a high sensitivity (100%), specificity (78%) and high agreement between the assays. According to the results, the NS5 genomic region provides the best genomic region for optimal detection and typification of DENV in clinical samples. PMID:23781709
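
    Two quantities reported in this abstract, qPCR amplification efficiency and the kappa concordance between assays, follow standard formulas that can be sketched as below. The function names and the example counts are hypothetical and are not taken from the study; the efficiency formula assumes a standard curve of quantification cycle against log10 template amount.

```python
def qpcr_efficiency(slope):
    """Amplification efficiency as a fraction (1.0 = 100%).

    slope: slope of the standard curve (Cq versus log10 template amount).
    Uses the usual relation E = 10 ** (-1 / slope) - 1.
    """
    return 10 ** (-1.0 / slope) - 1.0


def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two assays scored positive/negative on the same samples."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n  # fraction positive by assay A
    b_pos = (both_pos + only_b) / n  # fraction positive by assay B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 counts, for illustration only.
kappa = cohen_kappa(both_pos=20, only_a=5, only_b=10, both_neg=15)
# A slope of about -3.32 corresponds to perfect doubling per cycle.
efficiency = qpcr_efficiency(-3.3219280948873626)
```

    Kappa corrects the raw agreement rate for agreement expected by chance, which is why a high percentage agreement and a modest kappa can coexist.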

  2. Regions identity between the genome of vertebrates and non-retroviral families of insect viruses

    PubMed Central

    2011-01-01

    Background The scope of our understanding of the evolutionary history between viruses and animals is limited. The recent availability of many complete insect virus genomes and vertebrate genomes, together with the ability to screen these sequences, makes it possible to gain a new perspective on the evolutionary interaction between insect viruses and vertebrates. This study aims to determine whether sequence identity exists between the genomes of insect viruses and vertebrates, to explain this phenomenon in terms of mobile genetic elements, and to investigate the evolutionary relationship between these short regions of identity among these species. Results Some of the studied insect viruses contain variable numbers of short regions of sequence identity to vertebrate genomes, with nucleotide sequence lengths from 28 bp to 124 bp. They are located at multiple sites in the vertebrate genomes. The ontology of animal genes with identical regions involves several processes, including chromatin remodeling, regulation of apoptosis, signaling pathways, nervous system development and some enzyme-like catalysis. Phylogenetic analysis reveals that at least some short regions of sequence identity in vertebrate genomes are derived from the ancestors of insect viruses. Conclusion Short regions of sequence identity were found in vertebrates and insect viruses. These sequences played an important role not only in the long-term evolution of vertebrates, but also in the promotion of insect viruses. This typical win-win strategy may come from natural selection. PMID:22073942

  3. A novel sandwich hybridization method for selecting cDNAs from large genomic regions: Identification of cDNAs from the cloned genomic DNA spanning the XLRP locus

    SciTech Connect

    Yan, D.; McHenry, C.; Fujita, R.

    1994-09-01

    We have developed an efficient hybridization-based cDNA-selection method. A sandwich of three species - single-stranded cDNA, tagged RNA derived from genomic DNA, and biotinylated RNA complementary to the tag - allows specific retention of hybrids on an avidin-matrix. Previously, using model experiments, we demonstrated highly specific and efficient selection of a retinal gene, NRL, from complex mixtures of cDNA clones, using a sub-library from a 5 kb NRL genomic clone. We have now applied this selection strategy to isolate cDNAs from human adult retina and fetal eye libraries, with the "genomic RNA" derived from two YAC clones (OTC-C and 55B) spanning the region of X-linked retinitis pigmentosa (XLRP) locus RP3 at Xp21.1. Effectiveness of the selection-method was monitored by enrichment of TCTEX-1L gene that maps within the 55B YAC. Of the 15 selected cDNA clones that hybridized to the 55B YAC DNA, five appear to map to specific cosmid clones derived from the 55B YAC. Inserts in these selected cDNA clones range from 0.5 to 2.3 kb in size. Additional clones are now being isolated and characterized. This procedure should be independent of the size or complexity of genomic DNA being used for selection, allow for the isolation of full-length cDNAs, and may have wider application.

  4. Genome-Wide Analyses in Bacteria Show Small-RNA Enrichment for Long and Conserved Intergenic Regions

    PubMed Central

    Tsai, Chen-Hsun; Liao, Rick; Chou, Brendan; Palumbo, Michael

    2014-01-01

    Interest in finding small RNAs (sRNAs) in bacteria has significantly increased in recent years due to their regulatory functions. Development of high-throughput methods and more sophisticated computational algorithms has allowed rapid identification of sRNA candidates in different species. However, given their various sizes (50 to 500 nucleotides [nt]) and their potential genomic locations in the 5′ and 3′ untranslated regions as well as in intergenic regions, identification and validation of true sRNAs have been challenging. In addition, the evolution of bacterial sRNAs across different species continues to be puzzling, given that they can exert similar functions with various sequences and structures. In this study, we analyzed the enrichment patterns of sRNAs in 13 well-annotated bacterial species using existing transcriptome and experimental data. All intergenic regions were analyzed by WU-BLAST to examine conservation levels relative to species within or outside their genus. In total, more than 900 validated bacterial sRNAs and 23,000 intergenic regions were analyzed. The results indicate that sRNAs are enriched in intergenic regions, which are longer and more conserved than the average intergenic regions in the corresponding bacterial genome. We also found that sRNA-coding regions have different conservation levels relative to their flanking regions. This work provides a way to analyze how noncoding RNAs are distributed in bacterial genomes and also shows conserved features of intergenic regions that encode sRNAs. These results also provide insight into the functions of regions surrounding sRNAs and into optimization of RNA search algorithms. PMID:25313390
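
    The intergenic regions analyzed in this abstract are the gaps left between annotated genes. A minimal sketch of how such gaps could be derived from gene coordinates is shown below; the function name and coordinates are hypothetical, and the genome is treated as linear for simplicity even though most bacterial chromosomes are circular.

```python
def intergenic_regions(genes, genome_length):
    """Return the gaps between annotated genes as (start, end) tuples.

    genes: iterable of (start, end) pairs, 1-based and inclusive; overlapping
    genes are merged implicitly. Assumption: a linear genome.
    """
    regions = []
    prev_end = 0  # right-most gene coordinate seen so far
    for start, end in sorted(genes):
        if start > prev_end + 1:
            regions.append((prev_end + 1, start - 1))
        prev_end = max(prev_end, end)
    if prev_end < genome_length:
        regions.append((prev_end + 1, genome_length))
    return regions


# Hypothetical annotation: two overlapping genes and one separate gene.
gaps = intergenic_regions([(10, 20), (15, 30), (50, 60)], genome_length=70)
lengths = [end - start + 1 for start, end in gaps]
```

    The resulting region lengths are the kind of quantity that could then be compared between sRNA-containing and other intergenic regions.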

  5. Population Genomic Analysis of 962 Whole Genome Sequences of Humans Reveals Natural Selection in Non-Coding Regions

    PubMed Central

    Gazave, Elodie; Chang, Diana; Raj, Srilakshmi; Hunter-Zinck, Haley; Blekhman, Ran; Arbiza, Leonardo; Van Hout, Cris; Morrison, Alanna; Johnson, Andrew D.; Bis, Joshua; Cupples, L. Adrienne; Psaty, Bruce M.; Muzny, Donna; Yu, Jin; Gibbs, Richard A.; Keinan, Alon; Clark, Andrew G.; Boerwinkle, Eric

    2015-01-01

    Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that as expected approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories. There was a significant enrichment of purifying selection signals detected in enhancers, transcription factor binding sites, microRNAs and target sites, but not on lincRNA or piRNAs, suggesting different evolutionary constraints for these domains. Loci in “repressed or low activity regions” and loci near or overlapping the transcription start site were the most significantly over-represented annotations among the top 1% of signals for positive selection. PMID:25807536

  6. Identification of Whole Mitochondrial Genomes from Venezuela and Implications on Regional Phylogenies in South America.

    PubMed

    Lee, Esther J; Merriwether, D Andrew

    2015-01-01

    Recent studies have expanded and refined the founding haplogroups of the Americas using whole mitochondrial (mtDNA) genome analysis. In addition to pan-American lineages, specific variants have been identified in a number of studies that show higher frequencies in restricted geographical areas. To further characterize Native American maternal lineages and specifically examine local patterns within South America, we analyzed 12 maternally unrelated Yekuana whole mtDNA genomes from one village (Sharamaña) that include the four major Native American haplogroups A2, B2, C1, and D1. Based on our results, we propose a reconfiguration of one subhaplogroup A2 (A2aa) that is specific to South America and identify other singleton branches across the four haplogroups. Furthermore, we show nucleotide diversity values that increase from north to south for haplogroups C1 and D1. The results from our work add to the growing mitogenomic data that highlight local phylogenies and support the rapid genetic differentiation of South American populations, which has been correlated with the linguistic diversity in the region by previous studies. PMID:26416320

  7. HYBRIDCHECK: software for the rapid detection, visualization and dating of recombinant regions in genome sequence data.

    PubMed

    Ward, Ben J; van Oosterhout, Cock

    2016-03-01

    HYBRIDCHECK is a software package to visualize the recombination signal in large DNA sequence data sets, and it can be used to analyse recombination, genetic introgression, hybridization and horizontal gene transfer. It can scan large (multiple kb) contigs and whole-genome sequences of three or more individuals. HYBRIDCHECK is written in R for OS X, Linux and Windows operating systems, and it has a simple graphical user interface. In addition, the R code can be readily incorporated in scripts and analysis pipelines. HYBRIDCHECK implements several ABBA-BABA tests and visualizes the effects of hybridization and the resulting mosaic-like genome structure in high-density graphics. The package also reports the following: (i) the breakpoint positions, (ii) the number of mutations in each introgressed block, (iii) the probability that the identified region is not caused by recombination and (iv) the estimated age of each recombination event. The divergence times between the donor and recombinant sequence are calculated using a JC, K80, F81, HKY or GTR correction, and the dating algorithm is exceedingly fast. By estimating the coalescence time of introgressed blocks, it is possible to distinguish between hybridization and incomplete lineage sorting. HYBRIDCHECK is libre software and it and its manual are free to download from http://ward9250.github.io/HybridCheck/. PMID:26394708
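
    Two ingredients named in this abstract, an ABBA-BABA test and the JC correction, have simple textbook forms that can be sketched as follows. This is not HYBRIDCHECK's code or interface: the function names are hypothetical, the D statistic shown is the commonly used Patterson's D, and the sequences are assumed to be aligned, ungapped and of equal length.

```python
import math


def patterson_d(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences.

    ABBA sites: P1 matches the outgroup while P2 and P3 share the other allele.
    BABA sites: P1 and P3 share one allele while P2 matches the outgroup.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c:
            abba += 1
        elif a == c and b == o:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / (abba + baba), abba, baba


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    p = sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment, for illustration only.
d_stat, n_abba, n_baba = patterson_d("AAAAC", "GGATC", "GGTAC", "AATTC")
dist = jc69_distance("AAAAAAAAAA", "AAAAAAAAAT")
```

    A D value near zero is what incomplete lineage sorting alone would be expected to produce, whereas an excess of one pattern points toward introgression between the corresponding lineages.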

  8. Evaluating genome architecture of a complex region via generalized bipartite matching.

    PubMed

    Lo, Christine; Kim, Sangwoo; Zakov, Shay; Bafna, Vineet

    2013-01-01

    With the remarkable development in inexpensive sequencing technologies and supporting computational tools, we have the promise of medicine being personalized by knowledge of the individual genome. Current technologies provide high throughput, but short reads. Reconstruction of the donor genome is based either on de novo assembly of the (short) reads, or on mapping donor reads to a standard reference. While such techniques demonstrate high success rates for inferring 'simple' genomic segments, they are confounded by segments with complex duplication patterns, including regions of direct medical relevance, like the HLA and the KIR regions. In this work, we address this problem with a method for assessing the quality of a predicted genome sequence for complex regions of the genome. This method combines two natural types of evidence: sequence similarity of the mapped reads to the predicted donor genome, and distribution of reads across the predicted genome. We define a new scoring function for read-to-genome matchings, which penalizes for sequence dissimilarities and deviations from expected read location distribution, and present an efficient algorithm for finding matchings that minimize the penalty. The algorithm is based on a formal problem, first defined in this paper, called Coverage Sensitive many-to-many min-cost bipartite Matching (CSM). This new problem variant generalizes the standard (one-to-one) weighted bipartite matching problem, and can be solved using network flows. The resulting Java-based tool, called SAGE (Scoring function for Assembled GEnomes), is freely available upon request. We demonstrate over simulated data that SAGE can be used to infer correct haplotypes of the highly repetitive KIR region on the Human chromosome 19. PMID:23734567
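
    For orientation, the standard one-to-one weighted bipartite matching problem that CSM is said to generalize can be sketched in a few lines. This is only the baseline problem, solved by brute force on a toy cost matrix; it is not the CSM formulation or the SAGE tool, and the function name and costs are hypothetical.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Exhaustive one-to-one min-cost matching for a small square cost matrix.

    cost[i][j] is the penalty for assigning read i to location j.
    Returns (assignment, total_cost), where assignment[i] is the chosen j.
    Brute force is exponential; it is used here only to make the objective clear.
    """
    n = len(cost)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)


# Hypothetical penalties for three reads and three candidate locations.
assignment, penalty = min_cost_matching([[4, 1, 3],
                                         [2, 0, 5],
                                         [3, 2, 2]])
```

    The many-to-many, coverage-sensitive variant described in the abstract adds penalties for deviating from expected read depth, which is why it is solved with network flows rather than by enumeration.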

  9. Genomic Regions Associated with Sheep Resistance to Gastrointestinal Nematodes.

    PubMed

    Benavides, Magda Vieira; Sonstegard, Tad S; Van Tassell, Curtis

    2016-06-01

    Genetic markers for sheep resistance to gastrointestinal parasites have long been sought by the livestock industry as a way to select more resistant individuals and to help farmers reduce parasite transmission by identifying and removing high egg shedders from the flock. Polymorphisms related to the major histocompatibility complex and interferon (IFN)-γ genes have been the most frequently reported markers associated with infection. Recently, a new picture is emerging from genome-wide studies, showing that not only immune mechanisms are important determinants of host resistance but that gastrointestinal mucus production and hemostasis pathways may also play a role. PMID:27183838

  10. Isolation of Specific Genomic Regions and Identification of Associated Molecules by enChIP

    PubMed Central

    Fujita, Toshitsugu; Fujii, Hodaka

    2016-01-01

    The identification of molecules associated with specific genomic regions of interest is required to understand the mechanisms of regulation of the functions of these regions. To enable the non-biased identification of molecules interacting with a specific genomic region of interest, we recently developed the engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) technique. Here, we describe how to use enChIP to isolate specific genomic regions and identify the associated proteins and RNAs. First, a genomic region of interest is tagged with a transcription activator-like (TAL) protein or a clustered regularly interspaced short palindromic repeats (CRISPR) complex consisting of a catalytically inactive form of Cas9 and a guide RNA. Subsequently, the chromatin is crosslinked and fragmented by sonication. The tagged locus is then immunoprecipitated and the crosslinking is reversed. Finally, the proteins or RNAs that are associated with the isolated chromatin are subjected to mass spectrometric or RNA sequencing analyses, respectively. This approach allows the successful identification of proteins and RNAs associated with a genomic region of interest. PMID:26862718

  11. Differentially Methylated Genomic Regions in Birth-Weight Discordant Twin Pairs.

    PubMed

    Chen, Mubo; Baumbach, Jan; Vandin, Fabio; Röttger, Richard; Barbosa, Eudes; Dong, Mingchui; Frost, Morten; Christiansen, Lene; Tan, Qihua

    2016-03-01

    Poor nutrition during critical growth phases may alter the structural and physiologic development of vital organs thus "programming" the susceptibility to adult-onset diseases and disease-related health conditions. Epigenome-wide association studies have been performed in birth-weight discordant twin pairs to find evidence for such "programming" effects, but no significant results emerged. We further investigated this issue using a new computational approach: Instead of probing single genomic sites for significant alterations in epigenetic marks, we scan for differentially methylated genomic regions. Whole genome DNA methylation levels were measured in whole blood from 150 pairs of adult identical twins discordant for birth-weight. Intrapair differential DNA methylation was associated with qualitative (large or small) and quantitative (percentage) birth-weight discordance at each genomic site using regression models adjusting for age and sex. Based on the regression results, genomic regions with consistent alteration patterns of DNA methylation were located and tested for significant robustness using computational permutation tests. This yielded an interesting genomic region on chromosome 1, which is significantly differentially methylated for quantitative birth-weight discordance. The region covers two genes (TYW3 and CRYZ) both reportedly associated with metabolism. We conclude that prenatal conditions for birth-weight discordance may result in persistent epigenetic modifications potentially affecting even adult health. PMID:26831219
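
    The permutation testing mentioned in this abstract can be illustrated with a deliberately simplified stand-in. The sketch assumes one summary methylation difference per twin pair for a candidate region and applies an exact sign-flip test; the study itself used regression models and its own permutation scheme, so the function name, the statistic and the example values are all hypothetical.

```python
from itertools import product


def sign_flip_p_value(differences):
    """Exact two-sided sign-flip permutation p-value for paired differences.

    Under the null hypothesis the label of "heavier" and "lighter" twin is
    exchangeable within a pair, so each difference may have its sign flipped.
    Enumerates all 2**n sign patterns, so it is only suitable for small n.
    """
    observed = abs(sum(differences))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(differences)):
        stat = abs(sum(s * d for s, d in zip(signs, differences)))
        n_total += 1
        if stat >= observed - 1e-12:  # tolerance for floating-point sums
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical within-pair differences for one region.
p_value = sign_flip_p_value([1, 2, 3, 4])
```

    With larger samples the full enumeration would normally be replaced by a random subset of sign patterns.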

  12. ECRbase: Database of Evolutionary Conserved Regions, Promoters, and Transcription Factor Binding Sites in Vertebrate Genomes

    SciTech Connect

    Loots, G; Ovcharenko, I

    2006-08-08

    Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. We have created a database of evolutionary conserved regions (ECRs) in vertebrate genomes entitled ECRbase that is constructed from a collection of pairwise vertebrate genome alignments produced by the ECR Browser database. ECRbase features a database of syntenic blocks that recapitulate the evolution of rearrangements in vertebrates and a collection of promoters in all vertebrate genomes presented in the database. The database also contains a collection of annotated transcription factor binding sites (TFBS) in all ECRs and promoter elements. ECRbase currently includes human, rhesus macaque, dog, opossum, rat, mouse, chicken, frog, zebrafish, and two pufferfish genomes. It is freely accessible at http://ECRbase.dcode.org.

  13. Chromosome region-specific libraries for human genome analysis

    SciTech Connect

    Kao, Fa-Ten.

    1992-08-01

    During the grant period progress has been made in the successful demonstration of regional mapping of microclones derived from microdissection libraries; successful demonstration of the feasibility of converting microclones with short inserts into yeast artificial chromosome clones with very large inserts for high resolution physical mapping of the dissected region; successful demonstration of the usefulness of region-specific microclones to isolate region-specific cDNA clones as candidate genes to facilitate search for the crucial genes underlying genetic diseases assigned to the dissected region; and the successful construction of four region-specific microdissection libraries for human chromosome 2, including 2q35-q37, 2q33-q35, 2p23-p25 and 2p21-p23. The 2q35-q37 library has been characterized in detail. The characterization of the other three libraries is in progress. These region-specific microdissection libraries and the unique sequence microclones derived from the libraries will be valuable resources for investigators engaged in high resolution physical mapping and isolation of disease-related genes residing in these chromosomal regions.

  14. Mitochondrial genome phylogeny among Asiatic black bear Ursus thibetanus subspecies and comprehensive analysis of their control regions.

    PubMed

    Choi, Eun Hwa; Kim, Sang Ki; Ryu, Shi Hyun; Jang, Kuem Hee; Hwang, Ui Wook

    2010-06-01

    The complete mitochondrial genome (16,824 bp) of an Asiatic black bear Ursus thibetanus ussuricus (Mammalia, Carnivora, Ursidae) was newly sequenced and characterized in detail. It is the second mitochondrial genome from this subspecies which has been completely sequenced. The two U. t. ussuricus individuals were compared with each other and then with individuals from the other four U. thibetanus subspecies and the other nine ursid species, focusing especially on the control regions in the 14 mitochondrial genomes. Within these control regions, tandem repeats of basically 10 bp (5'-ACGCACGTGT-3' or its derivatives) were found in Domain II. Plausible secondary structures of the repeat region were compared between the North and South Korean individuals of U. t. ussuricus. According to the maximum likelihood and Bayesian inference trees inferred from the nucleotide sequences of 13 protein-coding and two rRNA genes, the ursine members within the monophyletic ursid clade can be divided into at least three groups: A, B, and C. According to this analysis, U. thibetanus subspecies were found with Ursus americanus and Ursus malayanus within Group A, showing the following relationships with nodal bootstrap values above 91% and Bayesian posterior probabilities of 1.00: ([(U. t. thibetanus, U. t. formosanus), U. t. spp.], U. t. ussuricus), U. t. mupinensis. In addition, we present a hypothetical scenario of the evolution of the major repeat motifs in the control region. PMID:20795781
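
    Locating tandem copies of the 10-bp motif quoted in this abstract can be sketched with a regular expression. The function name is hypothetical, and the sketch matches only exact copies of the motif, so the derivative repeat units mentioned in the abstract would need a more tolerant pattern.

```python
import re


def find_tandem_repeats(sequence, motif="ACGCACGTGT", min_copies=2):
    """Return (start, end, copies) for each run of exact tandem motif copies.

    Coordinates are 0-based with an exclusive end, as in Python slicing.
    """
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(sequence)
    ]


# Toy sequence: three tandem copies, a spacer, then one isolated copy.
toy = "TT" + "ACGCACGTGT" * 3 + "GG" + "ACGCACGTGT"
repeats = find_tandem_repeats(toy)
```

    The isolated single copy is not reported because it falls below the minimum copy number.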

  15. Augmented Annotation of the Schizosaccharomyces pombe Genome Reveals Additional Genes Required for Growth and Viability

    PubMed Central

    Bitton, Danny A.; Wood, Valerie; Scutt, Paul J.; Grallert, Agnes; Yates, Tim; Smith, Duncan L.; Hagan, Iain M.; Miller, Crispin J.

    2011-01-01

    Genome annotation is a synthesis of computational prediction and experimental evidence. Small genes are notoriously difficult to detect because the patterns used to identify them are often indistinguishable from chance occurrences, leading to an arbitrary cutoff threshold for the length of a protein-coding gene identified solely by in silico analysis. We report a systematic reappraisal of the Schizosaccharomyces pombe genome that ignores thresholds. A complete six-frame translation was compared to a proteome data set, the Pfam domain database, and the genomes of six other fungi. Thirty-nine novel loci were identified. RT-PCR and RNA-Seq confirmed transcription at 38 loci; 33 novel gene structures were delineated by 5′ and 3′ RACE. Expression levels of 14 transcripts fluctuated during meiosis. Translational evidence for 10 genes, evolutionary conservation data supporting 35 predictions, and distinct phenotypes upon ORF deletion (one essential, four slow-growth, two delayed-division phenotypes) suggest that all 39 predictions encode functional proteins. The popularity of S. pombe as a model organism suggests that this augmented annotation will be of interest in diverse areas of molecular and cellular biology, while the generality of the approach suggests widespread applicability to other genomes. PMID:21270388
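
    The six-frame translation used in this abstract as the starting point for the comparison can be sketched as below. The sketch assumes the standard genetic code and plain A/C/G/T input; the function names and the frame labels (+1 to +3, -1 to -3) are illustrative choices, not the authors' pipeline.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame label to the translated protein string."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

    Translating every frame without a minimum length is what allows very short open reading frames to remain visible for comparison against proteome and domain data.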

  16. Genome-Wide Association Study of Intelligence: Additive Effects of Novel Brain Expressed Genes

    ERIC Educational Resources Information Center

    Loo, Sandra K.; Shtir, Corina; Doyle, Alysa E.; Mick, Eric; McGough, James J.; McCracken, James; Biederman, Joseph; Smalley, Susan L.; Cantor, Rita M.; Faraone, Stephen V.; Nelson, Stanley F.

    2012-01-01

    Objective: The purpose of the present study was to identify common genetic variants that are associated with human intelligence or general cognitive ability. Method: We performed a genome-wide association analysis with a dense set of 1 million single-nucleotide polymorphisms (SNPs) and quantitative intelligence scores within an ancestrally…

  17. CNV-based genome wide association study reveals additional variants contributing to meat quality in swine

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pork quality is important both to the meat processing industry and consumers’ purchasing attitudes. Copy number variation (CNV) is a burgeoning kind of variant that may influence meat quality. Herein, a genome-wide association study (GWAS) was performed between CNVs and meat quality traits in swine....

  18. Identification of accessory genome regions in poultry Clostridium perfringens isolates carrying the netB plasmid.

    PubMed

    Lepp, D; Gong, J; Songer, J G; Boerlin, P; Parreira, V R; Prescott, J F

    2013-03-01

    Necrotic enteritis (NE) is an economically important disease of poultry caused by certain Clostridium perfringens type A strains. NE pathogenesis involves the NetB toxin, which is encoded on a large conjugative plasmid within a 42-kb pathogenicity locus. Recent multilocus sequence type (MLST) studies have identified two predominant NE-associated clonal groups, suggesting that host genes are also involved in NE pathogenesis. We used microarray comparative genomic hybridization (CGH) to assess the gene content of 54 poultry isolates from birds that were healthy or that suffered from NE. A total of 400 genes were variably present among the poultry isolates and nine nonpoultry strains, many of which had putative functions related to nutrient uptake and metabolism and cell wall and capsule biosynthesis. The variable genes were organized into 142 genomic regions, 49 of which contained genes significantly associated with netB-positive isolates. These regions included three previously identified NE-associated loci as well as several apparent fitness-related loci, such as a carbohydrate ABC transporter, a ferric-iron siderophore uptake system, and an adhesion locus. Additional loci were related to plasmid maintenance. Cluster analysis of the CGH data grouped all of the netB-positive poultry isolates into two major groups, separated according to two prevalent clonal groups based on MLST analysis. This study identifies chromosomal loci associated with netB-positive poultry strains, suggesting that the chromosomal background can confer a selective advantage to NE-causing strains, possibly through mechanisms involving iron acquisition, carbohydrate metabolism, and plasmid maintenance. PMID:23292780

  19. De Novo Identification of Regulatory Regions in Intergenic Spaces of Prokaryotic Genomes

    SciTech Connect

    Chain, P; Garcia, E; Mcloughlin, K; Ovcharenko, I

    2007-02-20

    This project was begun to implement, test, and experimentally validate the results of a novel algorithm for genome-wide identification of candidate transcription-factor binding sites in prokaryotes. Most techniques used to identify regulatory regions rely on conservation between different genomes or have a predetermined sequence motif(s) to perform a genome-wide search. Therefore, such techniques cannot be used with new genome sequences, where information regarding such motifs has not yet been discovered. This project aimed to apply a de novo search algorithm to identify candidate binding-site motifs in intergenic regions of prokaryotic organisms, initially testing the available genomes of the Yersinia genus. We retrofitted existing nucleotide pattern-matching algorithms and analyzed the candidate sites identified by these algorithms, as well as their target genes, to screen for meaningful patterns. Using properly annotated prokaryotic genomes, this project aimed to develop a set of procedures to identify candidate intergenic sites important for gene regulation. We planned to demonstrate this in Yersinia pestis, a model biodefense, Category A Select Agent pathogen, and then follow up with experimental evidence that these regions are indeed involved in regulation. The ability to quickly characterize transcription-factor binding sites will help lead to a better understanding of how known virulence pathways are modulated in biodefense-related organisms, and will help our understanding and exploration of regulons--gene regulatory networks--and novel pathways for metabolic processes in environmental microbes.

  20. An improved method for oriT-directed cloning and functionalization of large bacterial genomic regions.

    PubMed

    Kvitko, Brian H; McMillan, Ian A; Schweizer, Herbert P

    2013-08-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  1. An Improved Method for oriT-Directed Cloning and Functionalization of Large Bacterial Genomic Regions

    PubMed Central

    Kvitko, Brian H.; McMillan, Ian A.

    2013-01-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  2. Transcription restores DNA repair to heterochromatin, determining regional mutation rates in cancer genomes.

    PubMed

    Zheng, Christina L; Wang, Nicholas J; Chung, Jongsuk; Moslehi, Homayoun; Sanborn, J Zachary; Hur, Joseph S; Collisson, Eric A; Vemula, Swapna S; Naujokas, Agne; Chiotti, Kami E; Cheng, Jeffrey B; Fassihi, Hiva; Blumberg, Andrew J; Bailey, Celeste V; Fudem, Gary M; Mihm, Frederick G; Cunningham, Bari B; Neuhaus, Isaac M; Liao, Wilson; Oh, Dennis H; Cleaver, James E; LeBoit, Philip E; Costello, Joseph F; Lehmann, Alan R; Gray, Joe W; Spellman, Paul T; Arron, Sarah T; Huh, Nam; Purdom, Elizabeth; Cho, Raymond J

    2014-11-20

    Somatic mutations in cancer are more frequent in heterochromatic and late-replicating regions of the genome. We report that regional disparities in mutation density are virtually abolished within transcriptionally silent genomic regions of cutaneous squamous cell carcinomas (cSCCs) arising in an XPC(-/-) background. XPC(-/-) cells lack global genome nucleotide excision repair (GG-NER), thus establishing differential access of DNA repair machinery within chromatin-rich regions of the genome as the primary cause for the regional disparity. Strikingly, we find that increasing levels of transcription reduce mutation prevalence on both strands of gene bodies embedded within H3K9me3-dense regions, and only to those levels observed in H3K9me3-sparse regions, also in an XPC-dependent manner. Therefore, transcription appears to reduce mutation prevalence specifically by relieving the constraints imposed by chromatin structure on DNA repair. We model this relationship among transcription, chromatin state, and DNA repair, revealing a new, personalized determinant of cancer risk. PMID:25456125

  3. U3 Region in the HIV-1 Genome Adopts a G-Quadruplex Structure in Its RNA and DNA Sequence

    PubMed Central

    2015-01-01

    Genomic regions rich in G residues are prone to adopt G-quadruplex structure. Multiple Sp1-binding motifs arranged in tandem have been suggested to form this structure in promoters of cancer-related genes. Here, we demonstrate that the G-rich proviral DNA sequence of the HIV-1 U3 region, which serves as a promoter of viral transcription, adopts a G-quadruplex structure. The sequence contains three binding elements for transcription factor Sp1, which is involved in the regulation of HIV-1 latency, reactivation, and high-level virus expression. We show that the three Sp1 binding motifs can adopt different forms of G-quadruplex structure and that the Sp1 protein can recognize and bind to its site folded into a G-quadruplex. In addition, a c-kit2 specific antibody, designated hf2, binds to two different G-quadruplexes formed in Sp1 sites. Since U3 is encoded at both viral genomic ends, the G-rich sequence is also present in the RNA genome. We demonstrate that the RNA sequence of U3 forms dimers with characteristics known for intermolecular G-quadruplexes. Together with previous reports showing G-quadruplex dimers in the gag and cPPT regions, these results suggest that integrity of the two viral genomes is maintained through numerous intermolecular G-quadruplexes formed in different RNA genome locations. Reconstituted reverse transcription shows that the potassium-dependent structure formed in U3 RNA facilitates RT template switching, suggesting that the G-quadruplex contributes to recombination in U3. PMID:24735378

  4. Endogenous Hot Spots of De Novo Telomere Addition in the Yeast Genome Contain Proximal Enhancers That Bind Cdc13.

    PubMed

    Obodo, Udochukwu C; Epum, Esther A; Platts, Margaret H; Seloff, Jacob; Dahlson, Nicole A; Velkovsky, Stoycho M; Paul, Shira R; Friedman, Katherine L

    2016-06-15

    DNA double-strand breaks (DSBs) pose a threat to genome stability and are repaired through multiple mechanisms. Rarely, telomerase, the enzyme that maintains telomeres, acts upon a DSB in a mutagenic process termed telomere healing. The probability of telomere addition is increased at specific genomic sequences termed sites of repair-associated telomere addition (SiRTAs). By monitoring repair of an induced DSB, we show that SiRTAs on chromosomes V and IX share a bipartite structure in which a core sequence (Core) is directly targeted by telomerase, while a proximal sequence (Stim) enhances the probability of de novo telomere formation. The Stim and Core sequences are sufficient to confer a high frequency of telomere addition to an ectopic site. Cdc13, a single-stranded DNA binding protein that recruits telomerase to endogenous telomeres, is known to stimulate de novo telomere addition when artificially recruited to an induced DSB. Here we show that the ability of the Stim sequence to enhance de novo telomere addition correlates with its ability to bind Cdc13, indicating that natural sites at which telomere addition occurs at high frequency require binding by Cdc13 to a sequence 20 to 100 bp internal from the site at which telomerase acts to initiate de novo telomere addition. PMID:27044869

  5. STaRRRT: a table of short tandem repeats in regulatory regions of the human genome

    PubMed Central

    2013-01-01

    Background: Tandem repeats (TRs) are unstable regions commonly found within genomes that have consequences for evolution and disease. In humans, polymorphic TRs are known to cause neurodegenerative and neuromuscular disorders as well as being associated with complex diseases such as diabetes and cancer. If present in upstream regulatory regions, TRs can modify chromatin structure and affect transcription, resulting in altered gene expression and protein abundance. The most common TRs are short tandem repeats (STRs), or microsatellites. Promoter-located STRs are considerably more polymorphic than coding-region STRs. As such, they may be a common driver of phenotypic variation. To study STRs located in regulatory regions, we have performed genome-wide analysis to identify all STRs present in a region that is 2 kilobases upstream and 1 kilobase downstream of the transcription start sites of genes. Results: The Short Tandem Repeats in Regulatory Regions Table, STaRRRT, contains the results of the genome-wide analysis, outlining the characteristics of 5,264 STRs present in the upstream regulatory region of 4,441 human genes. Gene set enrichment analysis has revealed significant enrichment for STRs in cellular, transcriptional and neurological system gene promoters and genes important in ion and calcium homeostasis. The set of enriched terms has broad similarity to that seen in coding regions, suggesting that regulatory region STRs are subject to similar evolutionary pressures as STRs in coding regions and may, like coding region STRs, have an important role in controlling gene expression. Conclusions: STaRRRT is a readily searchable resource for investigating potentially polymorphic STRs that could influence the expression of any gene of interest. The processes and genes enriched for regulatory region STRs provide potential novel targets for diagnosing and treating disease, and support a role for these STRs in the evolution of the human genome. PMID:24228761
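
    The window described in this abstract (2 kb upstream to 1 kb downstream of a transcription start site) and a simple repeat scan can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used to build STaRRRT: the function names, the 0-based half-open coordinate convention, the 1-6 bp unit size, and the minimum of three copies are all hypothetical choices made for the example.

```python
import re
from typing import List, Tuple


def regulatory_window(tss: int, strand: str,
                      upstream: int = 2000,
                      downstream: int = 1000) -> Tuple[int, int]:
    """Return a 0-based half-open interval [start, end) around a TSS.

    "Upstream" means lower coordinates on the + strand and higher
    coordinates on the - strand. The TSS base itself is counted as the
    first downstream base (an assumption of this sketch).
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream + 1, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 0), end


def find_strs(seq: str, min_repeats: int = 3,
              max_unit: int = 6) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats; return (start, unit, copy_number)."""
    # Lazy unit match so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits
```

    In practice the interval from `regulatory_window` would be used to slice a chromosome sequence before calling `find_strs`; imperfect repeats, which dedicated repeat finders handle, are ignored here.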

  6. A genome-wide association study identifies a genomic region for the polycerate phenotype in sheep (Ovis aries)

    PubMed Central

    Ren, Xue; Yang, Guang-Li; Peng, Wei-Feng; Zhao, Yong-Xin; Zhang, Min; Chen, Ze-Hui; Wu, Fu-An; Kantanen, Juha; Shen, Min; Li, Meng-Hua

    2016-01-01

    Horns are a cranial appendage found exclusively in Bovidae, and play important roles in accessing resources and mates. In sheep (Ovis aries), horns vary from polled to six-horned, and humans have been selecting polled animals in farming and breeding. Here, we conducted a genome-wide association study on 24 two-horned versus 22 four-horned phenotypes in a native Chinese breed of Sishui Fur sheep. Together with linkage disequilibrium (LD) analyses and haplotype-based association tests, we identified a genomic region comprising 132.0–133.1 Mb on chromosome 2 that contained the top 10 SNPs (including 4 significant SNPs) and 5 most significant haplotypes associated with the polycerate phenotype. In humans and mice, this genomic region contains the HOXD gene cluster and adjacent functional genes EVX2 and KIAA1715, which have a close association with the formation of limbs and genital buds. Our results provide new insights into the genetic basis underlying variable numbers of horns and represent a new resource for use in sheep genetics and breeding. PMID:26883901

  7. The catecholamine biosynthetic enzyme dopamine β-hydroxylase (DBH): first genome-wide search positions trait-determining variants acting additively in the proximal promoter

    PubMed Central

    Mustapic, Maja; Maihofer, Adam X.; Mahata, Manjula; Chen, Yuqing; Baker, Dewleen G.; O'Connor, Daniel T.; Nievergelt, Caroline M.

    2014-01-01

    Dopamine beta-hydroxylase (DBH) is the biosynthetic enzyme catalyzing formation of norepinephrine. Changes in DBH expression or activity have been implicated in the pathogenesis of cardiovascular and neuropsychiatric disorders. Genetic determination of DBH enzymatic activity and its secretion are only incompletely understood. We began with a genome-wide association search for loci contributing to DBH activity in human plasma. Initially, in a population sample of European ancestry, we identified the proximal DBH promoter as a region harboring three common trait-determining variants (top hit rs1611115, P = 7.2 × 10^−51). We confirmed their effects on transcription and showed that the three variants each acted additively on gene expression. Results were replicated in a population sample of Native American descent (top hit rs1611115, P = 4.1 × 10^−15). Jointly, DBH variants accounted for 57% of DBH trait variation. We further identified a genome-wide significant SNP at the LOC338797 locus on chromosome 12 as trans-quantitative trait locus (QTL) (rs4255618, P = 4.62 × 10^−8). Conditional analyses on DBH identified a third genomic region contributing to DBH variation: a likely cis-QTL adjacent to DBH in SARDH (rs7040170, P = 1.31 × 10^−14) on chromosome 9q. We conclude that three common SNPs in the DBH promoter act additively to control phenotypic variation in DBH levels, and that two additional novel loci (SARDH and LOC338797) may also contribute to the expression of this catecholamine biosynthetic trait. Identification of DBH variants with strong effects makes it possible to take advantage of Mendelian randomization approaches to test causal effects of this intermediate trait on disease. PMID:24986918

  8. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs.

    PubMed

    Freedman, Adam H; Schweizer, Rena M; Ortega-Del Vecchyo, Diego; Han, Eunjung; Davis, Brian W; Gronau, Ilan; Silva, Pedro M; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R; Parker, Heidi G; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D; Harkins, Timothy T; Nelson, Stanley F; Marques-Bonet, Tomas; Ostrander, Elaine A; Wayne, Robert K; Novembre, John

    2016-03-01

    Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, a role supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers. PMID:26943675
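
    One generic way to attach a false discovery rate to an outlier threshold, given statistics simulated under a neutral demographic model, can be sketched as below. This is an assumption-laden illustration and not the authors' procedure: the abstract does not specify how the FDR was computed, and the function name and the ratio-of-tail-fractions estimator are choices made for this example.

```python
from typing import Sequence


def empirical_fdr(observed: Sequence[float],
                  null_simulated: Sequence[float],
                  threshold: float) -> float:
    """Estimate the FDR for calling every region with statistic >= threshold.

    The estimate is the fraction of neutral (null) simulations exceeding
    the threshold divided by the fraction of observed regions exceeding
    it, capped at 1.0. Returns NaN when no observed region passes.
    """
    n_obs_pass = sum(1 for x in observed if x >= threshold)
    if n_obs_pass == 0 or len(null_simulated) == 0:
        return float("nan")
    frac_null = sum(1 for x in null_simulated if x >= threshold) / len(null_simulated)
    frac_obs = n_obs_pass / len(observed)
    return min(1.0, frac_null / frac_obs)
```

    The point of the sketch is that the null distribution comes from simulations under the inferred demography rather than from the empirical tail of the observed data, which is the contrast the abstract draws with empirical outlier approaches.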

  9. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs

    PubMed Central

    Freedman, Adam H.; Schweizer, Rena M.; Ortega-Del Vecchyo, Diego; Han, Eunjung; Davis, Brian W.; Gronau, Ilan; Silva, Pedro M.; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R.; Parker, Heidi G.; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D.; Harkins, Timothy T.; Nelson, Stanley F.; Marques-Bonet, Tomas; Ostrander, Elaine A.; Wayne, Robert K.; Novembre, John

    2016-01-01

    Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, a role supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers. PMID:26943675

  10. SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand.

    PubMed

    Tang, Haibao; Bomhoff, Matthew D; Briones, Evan; Zhang, Liangsheng; Schnable, James C; Lyons, Eric

    2015-12-01

    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, even when no such gene is present. This capability means that synteny-based methods are far more effective than sequence similarity-based methods in identifying true-negatives, a necessity for studying gene loss and gene transposition. However, the identification of syntenic regions requires complex analyses which must be repeated for pairwise comparisons between any two species. Therefore, as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of target genomes. SynFind is capable of reporting per-gene information, useful for researchers studying specific gene families, as well as genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interest between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc. PMID:26560340

  11. SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand

    PubMed Central

    Tang, Haibao; Bomhoff, Matthew D.; Briones, Evan; Zhang, Liangsheng; Schnable, James C.; Lyons, Eric

    2015-01-01

    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, even when no such gene is present. This capability means that synteny-based methods are far more effective than sequence similarity-based methods in identifying true-negatives, a necessity for studying gene loss and gene transposition. However, the identification of syntenic regions requires complex analyses which must be repeated for pairwise comparisons between any two species. Therefore, as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of target genomes. SynFind is capable of reporting per-gene information, useful for researchers studying specific gene families, as well as genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interest between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc. PMID:26560340

  12. Genome-Enabled Estimates of Additive and Nonadditive Genetic Variances and Prediction of Apple Phenotypes Across Environments

    PubMed Central

    Kumar, Satish; Molloy, Claire; Muñoz, Patricio; Daetwyler, Hans; Chagné, David; Volz, Richard

    2015-01-01

    The nonadditive genetic effects may have an important contribution to total genetic variation of phenotypes, so estimates of both the additive and nonadditive effects are desirable for breeding and selection purposes. Our main objectives were to: estimate additive, dominance and epistatic variances of apple (Malus × domestica Borkh.) phenotypes using relationship matrices constructed from genome-wide dense single nucleotide polymorphism (SNP) markers; and compare the accuracy of genomic predictions using genomic best linear unbiased prediction models with or without including nonadditive genetic effects. A set of 247 clonally replicated individuals was assessed for six fruit quality traits at two sites, and also genotyped using an Illumina 8K SNP array. Across several fruit quality traits, the additive, dominance, and epistatic effects contributed about 30%, 16%, and 19%, respectively, to the total phenotypic variance. Models ignoring nonadditive components yielded upwardly biased estimates of additive variance (heritability) for all traits in this study. The accuracy of genomic predicted genetic values (GEGV) varied from about 0.15 to 0.35 for various traits, and these were almost identical for models with or without including nonadditive effects. However, models including nonadditive genetic effects further reduced the bias of GEGV. Between-site genotypic correlations were high (>0.85) for all traits, and genotype-site interaction accounted for <10% of the phenotypic variability. The accuracy of prediction, when the validation set was present only at one site, was generally similar for both sites, and varied from about 0.50 to 0.85. The prediction accuracies were strongly influenced by trait heritability, and genetic relatedness between the training and validation families. PMID:26497141

  13. Chromosome region-specific libraries for human genome analysis. Final progress report, 1 March 1991–28 February 1994

    SciTech Connect

    Kao, F.T.

    1994-04-01

    The objectives of this grant proposal include (1) development of a chromosome microdissection and PCR-mediated microcloning technology, and (2) application of this microtechnology to the construction of region-specific libraries for human genome analysis. During this grant period, the authors have successfully developed this microtechnology and have applied it to the construction of microdissection libraries for the following chromosome regions: a whole chromosome 21 (21E), 2 region-specific libraries for the long arm of chromosome 2, 2q35-q37 (2Q1) and 2q33-q35 (2Q2), and 4 region-specific libraries for the entire short arm of chromosome 2, 2p23-p25 (2P1), 2p21-p23 (2P2), 2p14-p16 (2P3) and 2p11-p13 (2P4). In addition, 20–40 unique sequence microclones have been isolated and characterized for genomic studies. These region-specific libraries and the single-copy microclones from the library have been used as valuable resources for (1) isolating microsatellite probes in linkage analysis to further refine the disease locus; (2) isolating corresponding clones with large inserts, e.g. YAC, BAC, P1, cosmid and phage, to facilitate construction of contigs for high resolution physical mapping; and (3) isolating region-specific cDNA clones for use as candidate genes. These libraries are being deposited in the American Type Culture Collection (ATCC) for general distribution.

  14. Internal genomic regions mobilized for telomere maintenance in C. elegans

    PubMed Central

    Kim, Chuna; Sung, Sanghyun; Lee, Junho

    2016-01-01

    Because DNA polymerase cannot replicate telomeric DNA at linear chromosomal ends, eukaryotes have developed specific telomere maintenance mechanisms (TMMs). A major TMM involves specialized reverse transcriptase, telomerase. However, there also exist various telomerase-independent TMMs (TI-TMMs), which can arise both in pathological conditions (such as cancers) and during evolution. The TI-TMM in cancer cells is called alternative lengthening of telomeres (ALT), whose mechanism is not fully understood. We generated stably maintained telomerase-independent survivors from C. elegans telomerase mutants and found that, unlike previously described survivors in worms, these survivors “mobilize” specific internal sequence blocks for telomere lengthening, which we named TALTs (templates for ALT). The cis-duplication of internal genomic TALTs produces “reservoirs” of TALTs, whose trans-duplication occurs at all chromosome ends in the ALT survivors. Our discovery that different TALTs are utilized in different wild isolates provides insight into the molecular events leading to telomere evolution. PMID:27073737

  15. Internal genomic regions mobilized for telomere maintenance in C. elegans.

    PubMed

    Kim, Chuna; Sung, Sanghyun; Lee, Junho

    2016-01-01

    Because DNA polymerase cannot replicate telomeric DNA at linear chromosomal ends, eukaryotes have developed specific telomere maintenance mechanisms (TMMs). A major TMM involves specialized reverse transcriptase, telomerase. However, there also exist various telomerase-independent TMMs (TI-TMMs), which can arise both in pathological conditions (such as cancers) and during evolution. The TI-TMM in cancer cells is called alternative lengthening of telomeres (ALT), whose mechanism is not fully understood. We generated stably maintained telomerase-independent survivors from C. elegans telomerase mutants and found that, unlike previously described survivors in worms, these survivors "mobilize" specific internal sequence blocks for telomere lengthening, which we named TALTs (templates for ALT). The cis-duplication of internal genomic TALTs produces "reservoirs" of TALTs, whose trans-duplication occurs at all chromosome ends in the ALT survivors. Our discovery that different TALTs are utilized in different wild isolates provides insight into the molecular events leading to telomere evolution. PMID:27073737

  16. OcculterCut: A Comprehensive Survey of AT-Rich Regions in Fungal Genomes

    PubMed Central

    Testa, Alison C.; Oliver, Richard P.; Hane, James K.

    2016-01-01

    We present a novel method to measure the local GC-content bias in genomes and a survey of published fungal species. The method, enacted as “OcculterCut” (https://sourceforge.net/projects/occultercut, last accessed April 30, 2016), identified species containing distinct AT-rich regions. In most fungal taxa, AT-rich regions are a signature of repeat-induced point mutation (RIP), which targets repetitive DNA and decreases GC-content through the conversion of cytosine to thymine bases. RIP has in turn been identified as a driver of fungal genome evolution, as RIP mutations can also occur in single-copy genes neighboring repeat-rich regions. Over time RIP perpetuates “two speeds” of gene evolution in the GC-equilibrated and AT-rich regions of fungal genomes. In this study, genomes showing evidence of this process are found to be common, particularly among the Pezizomycotina. Further analysis highlighted differences in amino acid composition and putative functions of genes from these regions, supporting the hypothesis that these regions play an important role in fungal evolution. OcculterCut can also be used to identify genes undergoing RIP-assisted diversifying selection, such as small, secreted effector proteins that mediate host-microbe disease interactions. PMID:27289099
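
    The idea of local GC content can be illustrated with a fixed-window scan. This is a minimal sketch and not the OcculterCut algorithm, whose segmentation method is not described in the abstract; the window size, step, the 0.35 threshold, and the function names are assumptions chosen for the example.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """GC fraction over unambiguous bases; NaN if there are none."""
    seq = seq.upper()
    gc = sum(1 for b in seq if b in "GC")
    at = sum(1 for b in seq if b in "AT")
    total = gc + at
    return gc / total if total else float("nan")


def windowed_gc(seq: str, window: int = 1000,
                step: int = 1000) -> List[Tuple[int, float]]:
    """Return (window_start, gc_fraction) for each full window.

    A sequence shorter than one window is scored as a single window.
    """
    last_start = max(len(seq) - window + 1, 1)
    return [(i, gc_fraction(seq[i:i + window]))
            for i in range(0, last_start, step)]


def at_rich_windows(seq: str, window: int = 1000, step: int = 1000,
                    threshold: float = 0.35) -> List[int]:
    """Start positions of windows whose GC fraction is below threshold."""
    return [start for start, gc in windowed_gc(seq, window, step)
            if gc < threshold]
```

    A genome with distinct AT-rich regions would show a bimodal distribution of the per-window values returned by `windowed_gc`, whereas a GC-equilibrated genome would show a single peak.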

  17. OcculterCut: A Comprehensive Survey of AT-Rich Regions in Fungal Genomes.

    PubMed

    Testa, Alison C; Oliver, Richard P; Hane, James K

    2016-01-01

    We present a novel method to measure the local GC-content bias in genomes and a survey of published fungal species. The method, enacted as "OcculterCut" (https://sourceforge.net/projects/occultercut, last accessed April 30, 2016), identified species containing distinct AT-rich regions. In most fungal taxa, AT-rich regions are a signature of repeat-induced point mutation (RIP), which targets repetitive DNA and decreases GC-content through the conversion of cytosine to thymine bases. RIP has in turn been identified as a driver of fungal genome evolution, as RIP mutations can also occur in single-copy genes neighboring repeat-rich regions. Over time RIP perpetuates "two speeds" of gene evolution in the GC-equilibrated and AT-rich regions of fungal genomes. In this study, genomes showing evidence of this process are found to be common, particularly among the Pezizomycotina. Further analysis highlighted differences in amino acid composition and putative functions of genes from these regions, supporting the hypothesis that these regions play an important role in fungal evolution. OcculterCut can also be used to identify genes undergoing RIP-assisted diversifying selection, such as small, secreted effector proteins that mediate host-microbe disease interactions. PMID:27289099

  18. Comparative genomics provides insight into maize adaptation in temperate regions.

    PubMed

    Hufford, Matthew B

    2016-01-01

    A new study provides insights into the evolution of maize during its global spread into temperate regions from its origin in coastal Mexico. Please see related Research article: http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1009-x. PMID:27411931

  19. Expression of transcribed ultraconserved regions of genome in rat cerebral cortex

    PubMed Central

    Mehta, Suresh L.; Dharap, Ashutosh; Vemuganti, Raghu

    2014-01-01

    Emerging evidence indicates that 481 regions of the genome (>200 bp) that actively transcribe noncoding RNAs show 100% homology between humans, rats and mice. These transcribed ultraconserved regions (T-UCRs) are thought to control the essential regulatory functions basic for life in rodents and mammals. Using microarray analysis, we presently show that 107 T-UCRs are actively expressed in adult rat cerebral cortex. They are grouped into intragenic (61) and intergenic (46) based on their genic location. Interestingly, 10 T-UCRs are expressed at unusually high levels in cerebral cortex. Additionally, many T-UCRs also showed cogenic expression. We further analyzed the correlation of intragenic T-UCRs with their host protein coding genes. Surprisingly, most of the expressed intragenic T-UCRs (54 out of 61) displayed a negative correlation with their host gene expression. T-UCRs are thought to control the splicing and transcription of the protein-coding genes that host them and flank them. Bioinformatics analysis indicated that the protein products of the majority of these genes are nuclear in localization, share protein domains and are involved in the regulation of diverse biological and molecular functions including metabolism, development, cell cycle, binding and transcription factor regulation. In conclusion, this is the first study to show that many T-UCRs are expressed in rodent brain and that they might play a role in physiological brain functions. PMID:24953281

  20. Systematic sequencing of the Escherichia coli genome: analysis of the 0-2.4 min region.

    PubMed Central

    Yura, T; Mori, H; Nagai, H; Nagata, T; Ishihama, A; Fujita, N; Isono, K; Mizobuchi, K; Nakata, A

    1992-01-01

    A contiguous 111,402-nucleotide sequence corresponding to the 0 to 2.4 min region of the E. coli chromosome was determined as a first step to complete structural analysis of the genome. The resulting sequence was used to predict open reading frames and to search for sequence similarity against the PIR protein database. A number of novel genes were found whose predicted protein sequences showed significant homology with known proteins from various organisms, including several clusters of genes similar to those involved in fatty acid metabolism in bacteria (e.g., betT, baiF) and higher organisms, iron transport (sfuA, B, C) in Serratia marcescens, and symbiotic nitrogen fixation or electron transport (fixA, B, C, X) in Azorhizobium caulinodans. In addition, several genes and IS elements that had been mapped but not sequenced (e.g., leuA, B, C, D) were identified. We estimate that about 90 genes are represented in this region of the chromosome with little spacer. PMID:1630901

  1. Genomic variation in the porcine immunoglobulin lambda variable region.

    PubMed

    Guo, Xi; Schwartz, John C; Murtaugh, Michael P

    2016-04-01

    Production of a vast antibody repertoire is essential for the protection against pathogens. Variable region germline complexity contributes to repertoire diversity and is a standard feature of mammalian immunoglobulin loci, but functional V region genes are limited in swine. For example, the porcine lambda light chain locus is composed of 23 variable (V) genes and 4 joining (J) genes, but only 10 or 11 V and 2 J genes are functional. Allelic variation in V and J may increase overall diversity within a population, yet lead to repertoire holes in individuals lacking key alleles. Previous studies focused on heavy chain genetic variation, thus light chain allelic diversity is not known. We characterized allelic variation of the porcine immunoglobulin lambda variable (IGLV) region genes. All intact IGLV genes in 81 pigs were amplified, sequenced, and analyzed to determine their allelic variation and functionality. We observed mutational variation across the entire length of the IGLV genes, in both framework and complementarity determining regions (CDRs). Three recombination hotspot motifs were also identified suggesting that non-allelic homologous recombination is an evolutionarily alternative mechanism for generating germline antibody diversity. Functional alleles were greatest in the most highly expressed families, IGLV3 and IGLV8. At the population level, allelic variation appears to help maintain the potential for broad antibody repertoire diversity in spite of reduced gene segment choices and limited germline sequence modification. The trade-off may be a reduction in repertoire diversity within individuals that could result in an increased variation in immunity to infectious disease and response to vaccination. PMID:26791019

  2. The Complete Chloroplast Genome Sequences of Three Veroniceae Species (Plantaginaceae): Comparative Analysis and Highly Divergent Regions

    PubMed Central

    Choi, Kyoung Su; Chung, Myong Gi; Park, SeonJoo

    2016-01-01

    Previous studies of Veronica and related genera were weakly supported by molecular and paraphyletic taxa. Here, we report the complete chloroplast genome sequence of Veronica nakaiana and the related species Veronica persica and Veronicastrum sibiricum. The chloroplast genome length of V. nakaiana, V. persica, and V. sibiricum ranged from 150,198 bp to 152,930 bp. A total of 112 genes comprising 79 protein-coding genes, 29 tRNA genes, and 4 rRNA genes were observed in the three chloroplast genomes. The total number of SSRs was 48, 51, and 53 in V. nakaiana, V. persica, and V. sibiricum, respectively. Two SSRs (10 bp of AT and 12 bp of AATA) were observed in the same regions (rpoC2 and ndhD) in the three chloroplast genomes. A comparison of coding genes and non-coding regions between V. nakaiana and V. persica revealed divergent sites, with the greatest variation occurring in the petD-rpoA region. The complete chloroplast genome sequence information regarding the three Veroniceae will be helpful for elucidating Veroniceae phylogenetic relationships. PMID:27047524

  3. Extensive Pyrosequencing Reveals Frequent Intra-Genomic Variations of Internal Transcribed Spacer Regions of Nuclear Ribosomal DNA

    PubMed Central

    Li, Dezhu; Sun, Yongzhen; Niu, Yunyun; Chen, Zhiduan; Luo, Hongmei; Pang, Xiaohui; Sun, Zhiying; Liu, Chang; Lv, Aiping; Deng, Youping; Larson-Rabin, Zachary; Wilkinson, Mike; Chen, Shilin

    2012-01-01

    Background: Internal transcribed spacer of nuclear ribosomal DNA (nrDNA) is already one of the most popular phylogenetic and DNA barcoding markers. However, the existence of its multiple copies has complicated such usage and a detailed characterization of intra-genomic variations is critical to address such concerns. Methodology/Principal Findings: In this study, we used sequence-tagged pyrosequencing and genome-wide analyses to characterize intra-genomic variations of internal transcribed spacer 2 (ITS2) regions from 178 plant species. We discovered that mutation of ITS2 is frequent, with a mean of 35 variants per species. On average, three of the most abundant variants make up 91% of all ITS2 copies. Moreover, we found different congeneric species share identical variants in 13 genera. Interestingly, different species across different genera also share identical variants. In particular, one minor variant of ITS2 in Eleutherococcus giraldii was found identical to the ITS2 major variant of Panax ginseng, both from the Araliaceae family. In addition, DNA barcoding gap analysis showed that the intra-genomic distances were markedly smaller than those of the intra-specific or inter-specific variants. When each of 5543 variants was examined for its species discrimination efficiency, a 97% success rate was obtained at the species level. Conclusions: Identification of identical ITS2 variants across intra-generic or inter-generic species revealed complex species evolutionary history, possibly involving horizontal gene transfer and ancestral hybridization. Although intra-genomic multiple variants are frequently found within each genome, the usage of the major variants alone is sufficient for phylogeny construction and species determination in most cases. Furthermore, the inclusion of minor variants further improves the resolution of species identification. PMID:22952830

  4. Genetics/Genomics Research in the Central Region

    USGS Publications Warehouse

    U.S. Geological Survey

    2006-01-01

    Genetics-based research within the Biological Resources Discipline (BRD) Science Centers in the Central Region incorporates many aspects of the field of genetics. Research activities range from documenting patterns of genetic variation in order to investigate relationships among species, populations and individuals to investigating the structure, function and expression of genes and their response to environmental stressors. Research in the broad areas of genetics requires multidisciplinary expertise and specialized equipment and instrumentation. Brief summaries of the capabilities of the five BRD Centers are given below.

  5. Characterization of the genomic region containing the Shadow of Prion Protein (SPRN) gene in sheep

    PubMed Central

    Lampo, Evelyne; Van Poucke, Mario; Hugot, Karine; Hayes, Hélène; Van Zeveren, Alex; Peelman, Luc J

    2007-01-01

    Background: TSEs are a group of fatal neurodegenerative diseases occurring in man and animals. They are caused by prions, alternatively folded forms of the endogenous prion protein, encoded by PRNP. Since differences in the sequence of PRNP cannot explain all variation in TSE susceptibility, there is growing interest in other genes that might have an influence on this susceptibility. One of these genes is SPRN, a gene coding for a protein showing remarkable similarities with the prion protein. Until now, SPRN has not been described in sheep, a highly relevant species in prion matters. Results: In order to characterize the genomic region containing SPRN in sheep, a BAC mini-contig was built, covering approximately 200,000 bp and containing the genes ECHS1, PAOX, MTG1, SPRN, LOC619207, CYP2E1 and at least partially SYCE1. FISH mapping of the two most exterior BAC clones of the contig positioned this contig on Oari22q24. A fragment of 4,544 bp was also sequenced, covering the entire SPRN gene and 1,206 bp of the promoter region. In addition, the transcription profile of SPRN in 21 tissues was determined by RT-PCR, showing high levels in cerebrum and cerebellum, and low levels in testis, lymph node, jejunum, ileum, colon and rectum. Conclusion: Annotation of a mini-contig including SPRN suggests conserved linkage between Oari22q24 and Hsap10q26. The ovine SPRN sequence, described for the first time, shows a high level of homology with the bovine, and to a lesser extent with the human SPRN sequence. In addition, transcription profiling in sheep reveals main expression of SPRN in brain tissue, as in rat, cow, man and mouse. PMID:17537256

  6. The complete mitochondrial genome sequence of the tubeworm Lamellibrachia satsuma and structural conservation in the mitochondrial genome control regions of Order Sabellida.

    PubMed

    Patra, Ajit Kumar; Kwon, Yong Min; Kang, Sung Gyun; Fujiwara, Yoshihiro; Kim, Sang-Jin

    2016-04-01

    The control region of the mitochondrial genomes shows high variation in conserved sequence organizations, which follow distinct evolutionary patterns in different species or taxa. In this study, we sequenced the complete mitochondrial genome of Lamellibrachia satsuma from the cold-seep region of Kagoshima Bay, as a part of whole genome study and extensively studied the structural features and patterns of the control region sequences. We obtained 15,037 bp of mitochondrial genome using Illumina sequencing and identified the non-coding AT-rich region or control region (354 bp, AT=83.9%) located between trnH and trnR. We found 7 conserved sequence blocks (CSB), scattered throughout the control region of L. satsuma and other taxa of Annelida. The poly-TA stretches, which commonly form the stem of multiple stem-loop structures, are most conserved in the CSB-I and CSB-II regions. The mitochondrial genome of L. satsuma encodes a unique repetitive sequence in the control region, which forms a unique secondary structure in comparison to Lamellibrachia luymesi. Phylogenetic analyses of all protein-coding genes indicate that L. satsuma forms a monophyletic clade with L. luymesi along with other tubeworms found in cold-seep regions (genera: Lamellibrachia, Escarpia, and Seepiophila). In general, the control region sequences of Annelida could be aligned with certainty within each genus, and to some extent within the family, but with a higher rate of variation in conserved regions. PMID:26776396

  7. Genomic Characterization and Comparison of Multi-Regional and Pooled Tumor Biopsy Specimens

    PubMed Central

    Kim, Sang Cheol; Jung, HyunChul; Park, Woong-Yang; Song, Sang-Yong

    2016-01-01

    A single tumor biopsy specimen is typically used in cancer genome studies. However, it may represent incompletely the underlying mutational and transcriptional profiles of tumor biology. Multi-regional biopsies have the advantage of increased sensitivity for genomic profiling, but they are not cost-effective. The concept of an alternative method such as the pooling of multiple biopsies is a challenge. In order to determine if the pooling of distinct regions is representative at the genomic and transcriptome level, we performed sequencing of four regional samples and pooled samples for four cancer types including colon, stomach, kidney and liver cancer. Subsequently, a comparative analysis was conducted to explore differences in mutations and gene expression profiles between multiple regional biopsies and pooled biopsy for each tumor. Our analysis revealed a marginal level of regional difference in detected variants, but in those with low allele frequency, considerable discrepancies were observed. In conclusion, sequencing pooled samples has the benefit of detecting many variants with moderate allele frequency that occur in partial regions, but it is not applicable for detecting low-frequency mutations that require deep sequencing. PMID:27010638

  8. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

    PubMed Central

    Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

    1999-01-01

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne 1926) PMID:10471707
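
    The densities and percentages quoted in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the region length is the 2.9 Mb given in the record title and that both percentages use the 218 predicted protein-coding genes as the denominator, which the abstract does not state explicitly. The function and constant names are made up for this example.

```python
# Figures quoted in the record; REGION_KB comes from the title (2.9 Mb).
REGION_KB = 2900
PROTEIN_CODING_GENES = 218
TRANSPOSABLE_ELEMENTS = 17
EST_MATCHES = 95
PROTEIN_SIMILARITIES = 144


def kb_per_feature(region_kb, n_features):
    """Average spacing, in kb, between features spread over the region."""
    return region_kb / n_features


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


gene_spacing = kb_per_feature(REGION_KB, PROTEIN_CODING_GENES)
te_spacing = kb_per_feature(REGION_KB, TRANSPOSABLE_ELEMENTS)
# Assumption: the denominator for both percentages is the 218 predicted genes.
est_pct = percent(EST_MATCHES, PROTEIN_CODING_GENES)
sim_pct = percent(PROTEIN_SIMILARITIES, PROTEIN_CODING_GENES)

print(f"one gene every {gene_spacing:.1f} kb")
print(f"one transposable element every {te_spacing:.1f} kb")
print(f"EST matches: {est_pct:.1f}%, protein similarities: {sim_pct:.1f}%")
```

    Under these assumptions the rounded values agree with the abstract (13 kb, 171 kb, 44% and 66%).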

  9. Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR).

    PubMed

    Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J; Laclette, Juan P; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

    2015-01-01

    Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest. PMID:25989346
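
    The abstract says only that the AAR value is built from sequence length and the number of antigenic regions; it does not give the formula. A minimal sketch follows, assuming the value is the ratio of sequence length to the number of predicted antigenic regions, so that a lower value means antigenic regions are more densely packed. That definition, the function names and the toy numbers are all assumptions made for illustration, not data or code from the study.

```python
from statistics import mean


def aar_value(sequence_length, n_antigenic_regions):
    """Assumed definition: residues per predicted antigenic region."""
    if n_antigenic_regions <= 0:
        return None  # undefined when no antigenic region was predicted
    return sequence_length / n_antigenic_regions


def mean_aar(proteins):
    """Average the assumed AAR over (length, n_regions) pairs, skipping undefined ones."""
    values = [aar_value(length, n) for length, n in proteins]
    values = [v for v in values if v is not None]
    return mean(values) if values else None


# Hypothetical proteins: (sequence length in residues, predicted antigenic regions).
toy_secretome = [(300, 12), (150, 5), (420, 0)]
print(mean_aar(toy_secretome))
```

    Comparing such an average between a secretome and a set of non-secreted proteins mirrors the kind of comparison the abstract describes, although the actual prediction tools and statistics used by the authors are not given here.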

  10. Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR)

    PubMed Central

    Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J.; Laclette, Juan P.; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

    2015-01-01

    Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest. PMID:25989346

  11. β-globin matrix attachment region improves stable genomic expression of the Sleeping Beauty transposon.

    PubMed

    Sjeklocha, Lucas; Chen, Yixin; Daly, Meghan C; Steer, Clifford J; Kren, Betsy T

    2011-09-01

    The liver is an attractive target for gene therapy due to its extensive capability for protein production and the numerous diseases resulting from a loss of gene function it normally provides. The Sleeping Beauty Transposon (SB-Tn) system is a non-viral vector capable of delivering and mediating therapeutic transgene(s) insertion into the host genome for long-term expression. A current challenge for this system is the low efficiency of integration of the transgene. In this study we use a human hepatoma cell line (HuH-7) and primary human blood outgrowth endothelial cells (BOECs) to test vectors containing DNA elements to enhance transposition without integrating themselves. We employed the human β-globin matrix attachment region (MAR) and the Simian virus 40 (SV40) nuclear translocation signal to increase the percentage of HuH-7 cells persistently expressing a GFP::Zeo reporter construct by ∼50% for each element, while combining both did not show an additive effect. Interestingly, both elements together displayed an additive effect on the number of insertion sites, and in BOECs the SV40 alone appeared to have an inhibitory effect on transposition. In long-term cultures the loss of plasmid DNA, transposase expression and mapping of insertion sites demonstrated bona fide transposition without episomal expression. These results show that addition of the β-globin MAR and potentially other elements to the backbone of the SB-Tn system can enhance transposition and expression of therapeutic transgenes. These findings may have a significant influence on the use of SB transgene delivery to liver for the treatment of a wide variety of disorders. PMID:21520245

  12. Identification of Genomic Regions Associated with Phenotypic Variation between Dog Breeds using Selection Mapping

    PubMed Central

    Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Åke; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.

    2011-01-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease. PMID:22022279
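
    The abstract mentions blocks of homozygosity longer than one megabase. As a rough illustration of what such a block is, the sketch below scans the genotype calls of one individual along one chromosome and reports maximal runs of consecutive homozygous SNPs spanning at least a chosen length. It is a toy under stated assumptions, not the test statistics used in the study: the input layout, the function name and the strict "no heterozygous call allowed" rule are all choices made for this example.

```python
def homozygous_blocks(calls, min_length_bp=1_000_000):
    """Return (start, end) positions of runs of homozygous calls.

    `calls` is a position-sorted list of (position_bp, is_homozygous) pairs
    for one individual on one chromosome (a hypothetical input layout).
    """
    blocks = []
    start = end = None
    for position, is_homozygous in calls:
        if is_homozygous:
            if start is None:
                start = position  # open a new run
            end = position        # extend the current run
        else:
            # A heterozygous call closes the current run, if any.
            if start is not None and end - start >= min_length_bp:
                blocks.append((start, end))
            start = end = None
    # Close a run that reaches the end of the chromosome.
    if start is not None and end - start >= min_length_bp:
        blocks.append((start, end))
    return blocks


# Toy data: one 1.2 Mb run, one heterozygous SNP, then a 0.5 Mb run.
toy_calls = [
    (0, True), (500_000, True), (1_200_000, True),
    (1_300_000, False),
    (1_400_000, True), (1_900_000, True),
]
print(homozygous_blocks(toy_calls))
```

    Real scans usually tolerate occasional genotyping errors and account for SNP density; those refinements are left out to keep the idea visible.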

  13. Fine-mapping in the MHC region accounts for 18% additional genetic risk for celiac disease

    PubMed Central

    Gutierrez-Achury, Javier; Zhernakova, Alexandra; Pulit, Sara L.; Trynka, Gosia; Hunt, Karen A.; Romanos, Jihane; Raychaudhuri, Soumya; van Heel, David A.; Wijmenga, Cisca; de Bakker, Paul I.W.

    2015-01-01

    Although dietary gluten is the trigger, celiac disease risk is strongly influenced by genetic variation in the major histocompatibility complex (MHC) region. We fine-mapped the MHC association signal to identify additional risk factors independent of the HLA-DQ alleles and observed five novel associations that account for 18% of the genetic risk. Together with the 57 known non-MHC loci, genetic variation can now explain up to 48% of celiac disease heritability. PMID:25894500

  14. Evolutionary Genomics Suggests That CheV Is an Additional Adaptor for Accommodating Specific Chemoreceptors within the Chemotaxis Signaling Complex

    PubMed Central

    Ortega, Davi R.; Zhulin, Igor B.

    2016-01-01

    Escherichia coli and Salmonella enterica are models for many experiments in molecular biology including chemotaxis, and most of the results obtained with one organism have been generalized to another. While most components of the chemotaxis pathway are strongly conserved between the two species, Salmonella genomes contain some chemoreceptors and an additional protein, CheV, that are not found in E. coli. The role of CheV was examined in distantly related species Bacillus subtilis and Helicobacter pylori, but its role in bacterial chemotaxis is still not well understood. We tested a hypothesis that in enterobacteria CheV functions as an additional adaptor linking the CheA kinase to certain types of chemoreceptors that cannot be effectively accommodated by the universal adaptor CheW. Phylogenetic profiling, genomic context and comparative protein sequence analyses suggested that CheV interacts with specific domains of CheA and chemoreceptors from an orthologous group exemplified by the Salmonella McpC protein. Structural consideration of the conservation patterns suggests that CheV and CheW share the same binding spot on the chemoreceptor structure, but have some affinity bias towards chemoreceptors from different orthologous groups. Finally, published experimental results and data newly obtained via comparative genomics support the idea that CheV functions as a “phosphate sink” possibly to off-set the over-stimulation of the kinase by certain types of chemoreceptors. Overall, our results strongly suggest that CheV is an additional adaptor for accommodating specific chemoreceptors within the chemotaxis signaling complex. PMID:26844549

  15. Evolutionary Genomics Suggests That CheV Is an Additional Adaptor for Accommodating Specific Chemoreceptors within the Chemotaxis Signaling Complex.

    PubMed

    Ortega, Davi R; Zhulin, Igor B

    2016-02-01

    Escherichia coli and Salmonella enterica are models for many experiments in molecular biology including chemotaxis, and most of the results obtained with one organism have been generalized to another. While most components of the chemotaxis pathway are strongly conserved between the two species, Salmonella genomes contain some chemoreceptors and an additional protein, CheV, that are not found in E. coli. The role of CheV was examined in distantly related species Bacillus subtilis and Helicobacter pylori, but its role in bacterial chemotaxis is still not well understood. We tested a hypothesis that in enterobacteria CheV functions as an additional adaptor linking the CheA kinase to certain types of chemoreceptors that cannot be effectively accommodated by the universal adaptor CheW. Phylogenetic profiling, genomic context and comparative protein sequence analyses suggested that CheV interacts with specific domains of CheA and chemoreceptors from an orthologous group exemplified by the Salmonella McpC protein. Structural consideration of the conservation patterns suggests that CheV and CheW share the same binding spot on the chemoreceptor structure, but have some affinity bias towards chemoreceptors from different orthologous groups. Finally, published experimental results and data newly obtained via comparative genomics support the idea that CheV functions as a "phosphate sink" possibly to off-set the over-stimulation of the kinase by certain types of chemoreceptors. Overall, our results strongly suggest that CheV is an additional adaptor for accommodating specific chemoreceptors within the chemotaxis signaling complex. PMID:26844549

  16. Comparative annotation of functional regions in the human genome using epigenomic data

    PubMed Central

    Won, Kyoung-Jae; Zhang, Xian; Wang, Tao; Ding, Bo; Raha, Debasish; Snyder, Michael; Ren, Bing; Wang, Wei

    2013-01-01

    Epigenetic regulation is dynamic and cell-type dependent. The recently available epigenomic data in multiple cell types provide an unprecedented opportunity for a comparative study of the epigenetic landscape. We developed a machine-learning method called ChroModule to annotate the epigenetic states in eight ENCyclopedia Of DNA Elements cell types. The trained model successfully captured the characteristic histone-modification patterns associated with regulatory elements, such as promoters and enhancers, and showed superior performance on identifying enhancers compared with the state-of-the-art methods. In addition, given the fixed number of epigenetic states in the model, ChroModule allows straightforward illustration of epigenetic variability in multiple cell types. Using this feature, we found that invariable and variable epigenetic states across cell types correspond to housekeeping functions and stimulus response, respectively. In particular, we observed that enhancers, but not the other regulatory elements, dictate cell specificity, as similar cell types share common enhancers, and cell-type–specific enhancers are often bound by transcription factors playing critical roles in that cell type. More interestingly, we found some genomic regions are dormant in one cell type but primed to become active in other cell types. These observations highlight the usefulness of ChroModule in comparative analysis and interpretation of multiple epigenomes. PMID:23482391

  17. Comparative annotation of functional regions in the human genome using epigenomic data.

    PubMed

    Won, Kyoung-Jae; Zhang, Xian; Wang, Tao; Ding, Bo; Raha, Debasish; Snyder, Michael; Ren, Bing; Wang, Wei

    2013-04-01

    Epigenetic regulation is dynamic and cell-type dependent. The recently available epigenomic data in multiple cell types provide an unprecedented opportunity for a comparative study of the epigenetic landscape. We developed a machine-learning method called ChroModule to annotate the epigenetic states in eight ENCyclopedia Of DNA Elements cell types. The trained model successfully captured the characteristic histone-modification patterns associated with regulatory elements, such as promoters and enhancers, and showed superior performance on identifying enhancers compared with the state-of-the-art methods. In addition, given the fixed number of epigenetic states in the model, ChroModule allows straightforward illustration of epigenetic variability in multiple cell types. Using this feature, we found that invariable and variable epigenetic states across cell types correspond to housekeeping functions and stimulus response, respectively. In particular, we observed that enhancers, but not the other regulatory elements, dictate cell specificity, as similar cell types share common enhancers, and cell-type-specific enhancers are often bound by transcription factors playing critical roles in that cell type. More interestingly, we found some genomic regions are dormant in one cell type but primed to become active in other cell types. These observations highlight the usefulness of ChroModule in comparative analysis and interpretation of multiple epigenomes. PMID:23482391

  18. Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology

    PubMed Central

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K. P.; Woo, Patrick C. Y.

    2015-01-01

    Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10–49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n = 2), Pichia (Candida) norvegensis (n = 2), Candida tropicalis (n = 1) and Saccharomyces cerevisiae (n = 1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study. PMID:26506340

  19. Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation.

    PubMed

    Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

    2015-01-01

    Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes. PMID:25836973

  20. Evaluation of Apis mellifera syriaca Levant region honeybee conservation using comparative genome hybridization.

    PubMed

    Haddad, Nizar Jamal; Batainh, Ahmed; Saini, Deepti; Migdadi, Osama; Aiyaz, Mohamed; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; Al-Shagour, Banan; Brake, Mohammad; Bourgeois, Lelania; De Guzman, Lilia; Rinderer, Thomas; Hamouri, Zayed Mahoud

    2016-06-01

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, and attacks of the Oriental wasp, and it is resistant to Varroa mites. The A. m. syriaca control reference sample (CRS) in this study was originally collected from "Wadi Ben Hammad", a remote valley in the southern region of Jordan, and has been stored since 2001. Morphometric and mitochondrial DNA markers of these honeybees showed the highest similarity to reference A. m. syriaca samples collected in the Middle East in 1952 by Brother Adam. Samples 1-5 were collected from the National Center for Agricultural Research and Extension breeding apiary, which was established for the conservation of A. m. syriaca. Our objective was to determine the success of an A. m. syriaca honeybee conservation program using genomic information from an array-based comparative genomic hybridization platform to evaluate genetic similarities to a historic reference collection (CRS). Our results showed insignificant genomic differences between the current population in the conservation program and the CRS, indicating that the program is successfully conserving A. m. syriaca. Functional genomic variations were identified which are useful for conservation monitoring and may be useful for breeding programs designed to improve locally adapted strains of A. m. syriaca. PMID:27010806

  1. Analysis of Genomic Regions of Trichoderma harzianum IOC-3844 Related to Biomass Degradation

    PubMed Central

    Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

    2015-01-01

    Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes. PMID:25836973

  2. Genome wide signatures of positive selection: The comparison of independent samples and the identification of regions associated to traits

    PubMed Central

    Barendse, William; Harrison, Blair E; Bunch, Rowan J; Thomas, Merle B; Turner, Lex B

    2009-01-01

    Background The goal of genome wide analyses of polymorphisms is to achieve a better understanding of the link between genotype and phenotype. Part of that goal is to understand the selective forces that have operated on a population. Results In this study we compared the signals of selection, identified through population divergence in the Bovine HapMap project, to those found in an independent sample of cattle from Australia. Evidence for population differentiation across the genome, as measured by FST, was highly correlated in the two data sets. Nevertheless, 40% of the variance in FST between the two studies was attributed to the differences in breed composition. Seventy six percent of the variance in FST was attributed to differences in SNP composition and density when the same breeds were compared. The difference between FST of adjacent loci increased rapidly with the increase in distance between SNP, reaching an asymptote after 20 kb. Using 129 SNP that have highly divergent FST values in both data sets, we identified 12 regions that had additive effects on the traits residual feed intake, beef yield or intramuscular fatness measured in the Australian sample. Four of these regions had effects on more than one trait. One of these regions includes the R3HDM1 gene, which is under selection in European humans. Conclusion Firstly, many different populations will be necessary for a full description of selective signatures across the genome, not just a small set of highly divergent populations. Secondly, it is necessary to use the same SNP when comparing the signatures of selection from one study to another. Thirdly, useful signatures of selection can be obtained where many of the groups have only minor genetic differences and may not be clearly separated in a principal component analysis. Fourthly, combining analyses of genome wide selection signatures and genome wide associations to traits helps to define the trait under selection or the population group in which

  3. Estimation of Additive, Dominance, and Imprinting Genetic Variance Using Genomic Data

    PubMed Central

    Lopes, Marcos S.; Bastiaansen, John W. M.; Janss, Luc; Knol, Egbert F.; Bovenhuis, Henk

    2015-01-01

    Traditionally, exploration of genetic variance in humans, plants, and livestock species has been limited mostly to the use of additive effects estimated using pedigree data. However, with the development of dense panels of single-nucleotide polymorphisms (SNPs), the exploration of genetic variation of complex traits is moving from quantifying the resemblance between family members to the dissection of genetic variation at individual loci. With SNPs, we were able to quantify the contribution of additive, dominance, and imprinting variance to the total genetic variance by using a SNP regression method. The method was validated in simulated data and applied to three traits (number of teats, backfat, and lifetime daily gain) in three purebred pig populations. In simulated data, the estimates of additive, dominance, and imprinting variance were very close to the simulated values. In real data, dominance effects account for a substantial proportion of the total genetic variance (up to 44%) for these traits in these populations. The contribution of imprinting to the total phenotypic variance of the evaluated traits was relatively small (1–3%). Our results indicate a strong relationship between additive variance explained per chromosome and chromosome length, which has been described previously for other traits in other species. We also show that a similar linear relationship exists for dominance and imprinting variance. These novel results improve our understanding of the genetic architecture of the evaluated traits and show promise for applying the SNP regression method to other traits and species, including human diseases. PMID:26438289

  4. "Replicated" genome wide association for dependence on illegal substances: genomic regions identified by overlapping clusters of nominally positive SNPs.

    PubMed

    Drgon, Tomas; Johnson, Catherine A; Nino, Michelle; Drgonova, Jana; Walther, Donna M; Uhl, George R

    2011-03-01

    Declaring "replication" from results of genome wide association (GWA) studies is straightforward when major gene effects provide genome-wide significance for association of the same allele of the same SNP in each of multiple independent samples. However, such unambiguous replication may be unlikely when phenotypes display polygenic genetic architecture, allelic heterogeneity, locus heterogeneity, and when different samples display linkage disequilibria with different fine structures. We seek chromosomal regions that are tagged by clustered SNPs that display nominally significant association in each of several independent samples. This approach provides one "nontemplate" approach to identifying overall replication of groups of GWA results in the face of difficult genetic architectures. We apply this strategy to 1 million (1M) SNP Affymetrix and Illumina GWA results for dependence on illegal substances. This approach provides high confidence in rejecting the null hypothesis that chance alone accounts for the extent to which clustered, nominally significant SNPs from samples of the same racial/ethnic background identify the same chromosomal regions. There is more modest confidence in: (a) identification of individual chromosomal regions and genes and (b) overlap between results from samples of different racial/ethnic backgrounds. The strong overlap identified among the samples with similar racial/ethnic backgrounds, together with prior work that identified overlapping results in samples of different racial/ethnic backgrounds, support contributions to individual differences in vulnerability to addictions that come from both relatively older allelic variants that are common in many current human populations and newer allelic variants that are common in fewer current human populations. PMID:21302341

  5. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-01-01

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations, thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. Its ability to predict functional regions, together with its generalizable statistical framework, makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu. PMID:26015273

  6. Comparative genomics of Lupinus angustifolius gene-rich regions: BAC library exploration, genetic mapping and cytogenetics

    PubMed Central

    2013-01-01

    Background The narrow-leafed lupin, Lupinus angustifolius L., is a grain legume species with a relatively compact genome. The species has 2n = 40 chromosomes and its genome size is 960 Mbp/1C. During the last decade, L. angustifolius genomic studies have achieved several milestones, such as molecular-marker development, linkage maps, and bacterial artificial chromosome (BAC) libraries. Here, these resources were integratively used to identify and sequence two gene-rich regions (GRRs) of the genome. Results The genome was screened with a probe representing the sequence of a microsatellite fragment length polymorphism (MFLP) marker linked to Phomopsis stem blight resistance. BAC clones selected by hybridization were subjected to restriction fingerprinting and contig assembly, and 232 BAC-ends were sequenced and annotated. BAC fluorescence in situ hybridization (BAC-FISH) identified eight single-locus clones. Based on physical mapping, cytogenetic localization, and BAC-end annotation, five clones were chosen for sequencing. Within the sequences of clones that hybridized in FISH to a single-locus, two large GRRs were identified. The GRRs showed strong and conserved synteny to Glycine max duplicated genome regions, illustrated by both identical gene order and parallel orientation. In contrast, in the clones with dispersed FISH signals, more than one-third of sequences were transposable elements. Sequenced, single-locus clones were used to develop 12 genetic markers, increasing the number of L. angustifolius chromosomes linked to appropriate linkage groups by five pairs. Conclusions In general, probes originating from MFLP sequences can assist genome screening and gene discovery. However, such probes are not useful for positional cloning, because they tend to hybridize to numerous loci. GRRs identified in L. angustifolius contained a low number of interspersed repeats and had a high level of synteny to the genome of the model legume G. max. Our results showed that

  7. Information compression exploits patterns of genome composition to discriminate populations and highlight regions of evolutionary interest

    PubMed Central

    2014-01-01

    Background Genomic information allows population relatedness to be inferred and selected genes to be identified. Single nucleotide polymorphism microarray (SNP-chip) data, a proxy for genome composition, contains patterns in allele order and proportion. These patterns can be quantified by compression efficiency (CE). In principle, the composition of an entire genome can be represented by a CE number quantifying allele representation and order. Results We applied a compression algorithm (DEFLATE) to genome-wide high-density SNP data from 4,155 human, 1,800 cattle, 1,222 sheep, 81 dogs and 49 mice samples. All human ethnic groups can be clustered by CE and the clusters recover phylogeography based on traditional fixation index (FST) analyses. CE analysis of other mammals results in segregation by breed or species, and is sensitive to admixture and past effective population size. This clustering is a consequence of individual patterns such as runs of homozygosity. Intriguingly, a related approach can also be used to identify genomic loci that show population-specific CE segregation. A high resolution CE ‘sliding window’ scan across the human genome, organised at the population level, revealed genes known to be under evolutionary pressure. These include SLC24A5 (European and Gujarati Indian skin pigmentation), HERC2 (European eye color), LCT (European and Maasai milk digestion) and EDAR (Asian hair thickness). We also identified a set of previously unidentified loci with high population-specific CE scores including the chromatin remodeler SCMH1 in Africans and EDA2R in Asians. Closer inspection reveals that these prioritised genomic regions do not correspond to simple runs of homozygosity but rather compositionally complex regions that are shared by many individuals of a given population. Unlike FST, CE analyses do not require ab initio population comparisons and are amenable to the hemizygous X chromosome. Conclusions We conclude with a discussion of the
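
    The compression idea in the record above can be illustrated with a minimal sketch. It assumes, purely for illustration, that compression efficiency (CE) is defined as 1 minus the ratio of compressed to raw size; the abstract does not give the authors' exact definition. Genotypes are encoded here as a made-up string of "A"/"B" allele calls, and Python's zlib (which wraps DEFLATE in a small header and checksum) stands in for the compressor. The function name compression_efficiency is hypothetical.

```python
import random
import zlib


def compression_efficiency(genotypes: str) -> float:
    """Return 1 - compressed_size / raw_size for an ASCII genotype string.

    This CE definition is an assumption made for this sketch, not
    necessarily the one used in the study.
    """
    raw = genotypes.encode("ascii")
    if not raw:
        raise ValueError("genotype string is empty")
    # zlib.compress applies DEFLATE; level 9 asks for maximum compression.
    compressed = zlib.compress(raw, 9)
    return 1.0 - len(compressed) / len(raw)


# Toy data: a long run of identical calls versus a shuffled mix of two alleles.
rng = random.Random(0)
homozygous_run = "A" * 2000
mixed = "".join(rng.choice("AB") for _ in range(2000))

ce_run = compression_efficiency(homozygous_run)
ce_mixed = compression_efficiency(mixed)
print(f"CE, homozygous run: {ce_run:.3f}")
print(f"CE, mixed alleles:  {ce_mixed:.3f}")
```

    In this toy setting the uniform run compresses far better than the mixed string, which is the kind of pattern (for example, runs of homozygosity) that the abstract says CE picks up.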

  8. Read clouds uncover variation in complex regions of the human genome

    PubMed Central

    Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E.; West, Robert; Sidow, Arend; Batzoglou, Serafim

    2015-01-01

    Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. PMID:26286554

  9. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3′ untranslated region of 1247 nucleotides, and a highly GC-rich 5′ untranslated region. The coding and 3′ UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5′ end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  10. A 5'-proximal region of the Citrus tristeza virus genome encoding two leader proteases is involved in virus superinfection exclusion.

    PubMed

    Atallah, Osama O; Kang, Sung-Hwan; El-Mohtar, Choaa A; Shilts, Turksen; Bergua, María; Folimonova, Svetlana Y

    2016-02-01

    Superinfection exclusion (SIE), a phenomenon in which a primary virus infection prevents a secondary infection with the same or closely related virus, has been observed with various viruses. Earlier we demonstrated that SIE by Citrus tristeza virus (CTV) requires viral p33 protein. In this work we show that p33 alone is not sufficient for virus exclusion. To define the additional viral components that are involved in this phenomenon, we engineered a hybrid virus in which a 5'-proximal region in the genome of the T36 isolate containing coding sequences for the two leader proteases L1 and L2 has been substituted with a corresponding region from the genome of a heterologous T68-1 isolate. Sequential inoculation of plants pre-infected with the CTV L1L2T68 hybrid with T36 CTV resulted in superinfection with the challenge virus, which indicated that the substitution of the L1-L2 coding region affected SIE ability of the virus. PMID:26748332

  11. Transferability of regional permafrost disturbance susceptibility modelling using generalized linear and generalized additive models

    NASA Astrophysics Data System (ADS)

    Rudy, Ashley C. A.; Lamoureux, Scott F.; Treitz, Paul; van Ewijk, Karin Y.

    2016-07-01

    To effectively assess and mitigate risk of permafrost disturbance, disturbance-prone areas can be predicted through the application of susceptibility models. In this study we developed regional susceptibility models for permafrost disturbances using a field disturbance inventory to test the transferability of the model to a broader region in the Canadian High Arctic. Resulting maps of susceptibility were then used to explore the effect of terrain variables on the occurrence of disturbances within this region. To account for a large range of landscape characteristics, the model was calibrated using two locations: Sabine Peninsula, Melville Island, NU, and Fosheim Peninsula, Ellesmere Island, NU. Spatial patterns of disturbance were predicted with a generalized linear model (GLM) and generalized additive model (GAM), each calibrated using disturbed and randomized undisturbed locations from both sites and GIS-derived terrain predictor variables including slope, potential incoming solar radiation, wetness index, topographic position index, elevation, and distance to water. Each model was validated for the Sabine and Fosheim Peninsulas using independent data sets while the transferability of the model to an independent site was assessed at Cape Bounty, Melville Island, NU. The regional GLM and GAM validated well for both calibration sites (Sabine and Fosheim) with the area under the receiver operating characteristic curve (AUROC) > 0.79. Both models were applied directly to Cape Bounty without calibration and validated equally with AUROCs of 0.76; however, each model predicted disturbed and undisturbed samples differently. Additionally, the sensitivity of the transferred model was assessed using data sets with different sample sizes. Results indicated that models based on larger sample sizes transferred more consistently and captured the variability within the terrain attributes in the respective study areas. Terrain attributes associated with the initiation of disturbances were

  12. Two genomic regions contribute disproportionately to geographic differentiation in wild barley.

    PubMed

    Fang, Zhou; Gonzales, Ana M; Clegg, Michael T; Smith, Kevin P; Muehlbauer, Gary J; Steffenson, Brian J; Morrell, Peter L

    2014-07-01

    Genetic differentiation in natural populations is driven by geographic distance and by ecological or physical features within and between natural habitats that reduce migration. The primary population structure in wild barley differentiates populations east and west of the Zagros Mountains. Genetic differentiation between eastern and western populations is uneven across the genome and is greatest on linkage groups 2H and 5H. Genetic markers in these two regions demonstrate the largest difference in frequency between the primary populations and have the highest informativeness for assignment to each population. Previous cytological and genetic studies suggest there are chromosomal structural rearrangements (inversions or translocations) in these genomic regions. Environmental association analyses identified an association with both temperature and precipitation variables on 2H and with precipitation variables on 5H. PMID:24760390

  13. Two Genomic Regions Contribute Disproportionately to Geographic Differentiation in Wild Barley

    PubMed Central

    Fang, Zhou; Gonzales, Ana M.; Clegg, Michael T.; Smith, Kevin P.; Muehlbauer, Gary J.; Steffenson, Brian J.; Morrell, Peter L.

    2014-01-01

    Genetic differentiation in natural populations is driven by geographic distance and by ecological or physical features within and between natural habitats that reduce migration. The primary population structure in wild barley differentiates populations east and west of the Zagros Mountains. Genetic differentiation between eastern and western populations is uneven across the genome and is greatest on linkage groups 2H and 5H. Genetic markers in these two regions demonstrate the largest difference in frequency between the primary populations and have the highest informativeness for assignment to each population. Previous cytological and genetic studies suggest there are chromosomal structural rearrangements (inversions or translocations) in these genomic regions. Environmental association analyses identified an association with both temperature and precipitation variables on 2H and with precipitation variables on 5H. PMID:24760390

  14. Generation of Recombinant Polioviruses Harboring RNA Affinity Tags in the 5′ and 3′ Noncoding Regions of Genomic RNAs

    PubMed Central

    Flather, Dylan; Cathcart, Andrea L.; Cruz, Casey; Baggs, Eric; Ngo, Tuan; Gershon, Paul D.; Semler, Bert L.

    2016-01-01

    Despite being intensely studied for more than 50 years, a complete understanding of the enterovirus replication cycle remains elusive. Specifically, only a handful of cellular proteins have been shown to be involved in the RNA replication cycle of these viruses. In an effort to isolate and identify additional cellular proteins that function in enteroviral RNA replication, we have generated multiple recombinant polioviruses containing RNA affinity tags within the 3′ or 5′ noncoding region of the genome. These recombinant viruses retained RNA affinity sequences within the genome while remaining viable and infectious over multiple passages in cell culture. Further characterization of these viruses demonstrated that viral protein production and growth kinetics were unchanged or only slightly altered relative to wild type poliovirus. However, attempts to isolate these genetically-tagged viral genomes from infected cells have been hindered by high levels of co-purification of nonspecific proteins and the limited matrix-binding efficiency of RNA affinity sequences. Regardless, these recombinant viruses represent a step toward more thorough characterization of enterovirus ribonucleoprotein complexes involved in RNA replication. PMID:26861382

  15. Predicting the effects of nanoscale cerium additives in diesel fuel on regional-scale air quality.

    PubMed

    Erdakos, Garnet B; Bhave, Prakash V; Pouliot, George A; Simon, Heather; Mathur, Rohit

    2014-11-01

    Diesel vehicles are a major source of air pollutant emissions. Fuel additives containing nanoparticulate cerium (nCe) are currently being used in some diesel vehicles to improve fuel efficiency. These fuel additives also reduce fine particulate matter (PM2.5) emissions and alter the emissions of carbon monoxide (CO), nitrogen oxides (NOx), and hydrocarbon (HC) species, including several hazardous air pollutants (HAPs). To predict their net effect on regional air quality, we review the emissions literature and develop a multipollutant inventory for a hypothetical scenario in which nCe additives are used in all on-road and nonroad diesel vehicles. We apply the Community Multiscale Air Quality (CMAQ) model to a domain covering the eastern U.S. for a summer and a winter period. Model calculations suggest modest decreases of average PM2.5 concentrations and relatively larger decreases in particulate elemental carbon. The nCe additives also have an effect on 8 h maximum ozone in summer. Variable effects on HAPs are predicted. The total U.S. emissions of fine-particulate cerium are estimated to increase 25-fold and result in elevated levels of airborne cerium (up to 22 ng/m3), which might adversely impact human health and the environment. PMID:25271762

  16. Genotyping of infectious laryngotracheitis virus using allelic variations from multiple genomic regions.

    PubMed

    Choi, Eun-Jung; La, Tae-Min; Choi, In-Soo; Song, Chang-Seon; Park, Seung-Yong; Lee, Joong-Bok; Lee, Sang-Won

    2016-08-01

    Live attenuated vaccines are extensively used worldwide to control the outbreak of infectious laryngotracheitis. Virulent field strains showing close genetic relationship with the infectious laryngotracheitis virus (ILTV) vaccines of chicken embryo origin have been detected in the poultry industry. Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis, a reliable molecular epidemiological method, of multiple genomic regions was performed. The PCR-RFLP is a time-consuming method that requires a considerable amount of intact viral genomic DNA to amplify genomic regions greater than 4 kb. In this study, six variable genomic regions were selected and amplified for sequencing. The multi-allelic PCR-sequence genotyping showed better discrimination power than that of previous PCR-sequencing schemes using single or two target regions. The allelic variation patterns yielded 16 strains of ILTV classified into 14 different genotypes. Three Korean field strains, 550/05/Ko, 0010/05/Ko and 40032/08/Ko, were found to have the same genotype as the commercial vaccine strain, Laryngo Vac (Zoetis, Florham Park, NJ, USA). Three other Korean field strains, 40798/10/Ko, 12/07/Ko, and 30678/14/Ko, showed recombined allelic patterns. The multi-allelic PCR-sequencing method proved to be an efficient and practical procedure to classify the different strains of ILTV. The method could serve as an alternate diagnostic and differentiating tool for the classification of ILTV, and contribute to understanding of the epidemiology of the disease at a global level. PMID:26956802

  17. Large homogeneous genome regions (isochores) in soybean [Glycine max (L.) Merr.].

    PubMed

    Woody, J L; Beavis, W; Shoemaker, R C

    2012-01-01

    The landscape of plant genomes, while slowly being characterized and defined, is still composed primarily of regions of undefined function. Many eukaryotic genomes contain isochore regions, mosaics of homogeneous GC content that can abruptly change from one neighboring isochore to the next. Isochores are broken into families that are characterized by their GC levels. We identified 4,339 compositionally distinct domains and 331 of these were identified as long homogeneous genome regions (LHGRs). We assigned these to four families based on finite mixture models of GC content. We then characterized each family with respect to exon length, gene content, and transposable elements. The LHGR pattern of soybeans is unique in that while the majority of the genes within LHGRs are found within a single LHGR family with a narrow GC range (Family B), that family is not the highest in GC content as seen in vertebrates and invertebrates. Instead Family B has a mean GC content of 35%. The range of GC content for all LHGRs is 16-59% GC which is a larger range than what is typical of vertebrates. This is the first study in which LHGRs have been identified in soybeans and the functions of the genes within the LHGRs have been analyzed. PMID:22934101
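
    As a rough illustration of the quantity behind the isochore analysis above, the following sketch computes GC content in fixed, non-overlapping windows along a sequence. It is not the segmentation or finite-mixture method used in the study; the function names, window size, and toy sequence are all assumptions chosen for the example.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return float("nan")  # window holds only ambiguous bases such as N
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "AT" * 10 + "GC" * 10
profile = windowed_gc(toy, window=20, step=20)
for start, gc in profile:
    print(start, f"{gc:.2f}")
```

    A real analysis would use much larger windows and a statistical segmentation step to decide where one compositionally homogeneous domain ends and the next begins.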

  18. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein‐Coding Regions

    PubMed Central

    Lelieveld, Stefan H.; Spielmann, Malte; Mundlos, Stefan; Veltman, Joris A.

    2015-01-01

    ABSTRACT For next‐generation sequencing technologies, sufficient base‐pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole‐genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole‐exome sequencing (WES) platforms, and compared single‐base coverage for a large set of exome and genome samples. We find that WES platforms have improved considerably in the last years, but at comparable sequencing depth, WGS outperforms WES in terms of covered coding regions. At higher sequencing depth (95x–160x), WES successfully captures 95% of the coding regions with a minimal coverage of 20x, compared with 98% for WGS at 87‐fold coverage. Three different assessments of sequence coverage bias showed consistent biases for WES but not for WGS. We found no clear differences for the technologies concerning their ability to achieve complete coverage of 2,759 clinically relevant genes. We show that WES performs comparably to WGS in terms of covered bases if sequenced at two to three times higher coverage. This does, however, come at the cost of substantially more sequencing biases in WES approaches. Our findings will guide laboratories to make an informed decision on which sequencing platform and coverage to choose. PMID:25973577
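
    The "fraction of coding regions with a minimal coverage of 20x" reported above can be sketched as a simple breadth-of-coverage calculation over per-base depths. The depth values below are made up for illustration, and the function name fraction_covered is hypothetical; the study's actual pipeline is not described in this record.

```python
from typing import Iterable


def fraction_covered(depths: Iterable[int], min_depth: int = 20) -> float:
    """Fraction of positions whose read depth is at least min_depth."""
    depths = list(depths)
    if not depths:
        raise ValueError("no positions supplied")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Made-up per-base depths for eight coding positions.
toy_depths = [0, 5, 19, 20, 21, 40, 87, 100]
frac = fraction_covered(toy_depths)
print(f"{frac:.1%} of positions reach 20x")
```

    In practice the per-base depths would come from an alignment file restricted to the coding intervals of interest, and the same function could be evaluated at several thresholds to compare platforms.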

  19. Genome-Wide Chromatin Remodeling Identified at GC-Rich Long Nucleosome-Free Regions

    PubMed Central

    Hochreiter, Sepp

    2012-01-01

    To gain deeper insights into principles of cell biology, it is essential to understand how cells reorganize their genomes by chromatin remodeling. We analyzed chromatin remodeling on next generation sequencing data from resting and activated T cells to determine a whole-genome chromatin remodeling landscape. We consider chromatin remodeling in terms of nucleosome repositioning which can be observed most robustly in long nucleosome-free regions (LNFRs) that are occupied by nucleosomes in another cell state. We found that LNFR sequences are either AT-rich or GC-rich, where nucleosome repositioning was observed much more prominently in GC-rich LNFRs — a considerable proportion of them outside promoter regions. Using support vector machines with string kernels, we identified a GC-rich DNA sequence pattern indicating loci of nucleosome repositioning in resting T cells. This pattern appears to be also typical for CpG islands. We found out that nucleosome repositioning in GC-rich LNFRs is indeed associated with CpG islands and with binding sites of the CpG-island-binding ZF-CXXC proteins KDM2A and CFP1. That this association occurs prominently inside and also prominently outside of promoter regions hints at a mechanism governing nucleosome repositioning that acts on a whole-genome scale. PMID:23144837

  20. Genomic regions associated with dermal hyperpigmentation, polydactyly and other morphological traits in the Silkie chicken.

    PubMed

    Dorshorst, Ben; Okimoto, Ron; Ashwell, Chris

    2010-01-01

    The Silkie chicken has been a model of melanocyte precursor and neural crest cell migration and proliferation in the developing embryo due to its extensive hyperpigmentation of dermal and connective tissues. Although previous studies have focused on the distribution and structure of the Silkie's pigment or the general mechanisms by which this phenotype presents itself, the causal genetic variants have not been identified. Classical breeding experiments have determined this trait to be controlled by 2 interacting genes, the sex-linked inhibitor of dermal melanin (Id) and autosomal fibromelanosis (Fm) genes. Genome-wide single nucleotide polymorphism (SNP)-trait association analysis was used to detect genomic regions showing significant association with these pigmentation genes in 2 chicken mapping populations designed to segregate independently for Id and Fm. The SNP showing the highest association with Id was located at 72.3 Mb on chromosome Z, and the region at 10.3-13.1 Mb on chromosome 20 showed the highest association with Fm. Prior to this study, the linkage group to which Fm belonged was unknown. Although the primary focus of this study was to identify loci contributing to dermal pigmentation in the Silkie chicken, loci associated with various other morphological traits segregating in these populations were also detected. A single SNP in a highly conserved cis-regulatory region of Sonic Hedgehog was significantly associated with polydactyly (Po). Genomic regions in association with silkie feathering or hookless (h), feathered legs (Pti), vulture hock (V), rose comb (R), and duplex comb (D) were also identified. PMID:20064842

  1. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes

    PubMed Central

    Zeggini, Eleftheria; Scott, Laura J.; Saxena, Richa; Voight, Benjamin F.; Marchini, Jonathan L; Hu, Tainle; de Bakker, Paul IW; Abecasis, Gonçalo R; Almgren, Peter; Andersen, Gitte; Ardlie, Kristin; Boström, Kristina Bengtsson; Bergman, Richard N; Bonnycastle, Lori L; Borch-Johnsen, Knut; Burtt, Noël P; Chen, Hong; Chines, Peter S; Daly, Mark J; Deodhar, Parimal; Ding, Charles; Doney, Alex S F; Duren, William L; Elliott, Katherine S; Erdos, Michael R; Frayling, Timothy M; Freathy, Rachel M; Gianniny, Lauren; Grallert, Harald; Grarup, Niels; Groves, Christopher J; Guiducci, Candace; Hansen, Torben; Herder, Christian; Hitman, Graham A; Hughes, Thomas E; Isomaa, Bo; Jackson, Anne U; Jørgensen, Torben; Kong, Augustine; Kubalanza, Kari; Kuruvilla, Finny G; Kuusisto, Johanna; Langenberg, Claudia; Lango, Hana; Lauritzen, Torsten; Li, Yun; Lindgren, Cecilia M; Lyssenko, Valeriya; Marvelle, Amanda F; Meisinger, Christa; Midthjell, Kristian; Mohlke, Karen L; Morken, Mario A; Morris, Andrew D; Narisu, Narisu; Nilsson, Peter; Owen, Katharine R; Palmer, Colin NA; Payne, Felicity; Perry, John RB; Pettersen, Elin; Platou, Carl; Prokopenko, Inga; Qi, Lu; Qin, Li; Rayner, Nigel W; Rees, Matthew; Roix, Jeffrey J; Sandbæk, Anelli; Shields, Beverley; Sjögren, Marketa; Steinthorsdottir, Valgerdur; Stringham, Heather M; Swift, Amy J; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Timpson, Nicholas J; Tuomi, Tiinamaija; Tuomilehto, Jaakko; Walker, Mark; Watanabe, Richard M; Weedon, Michael N; Willer, Cristen J; Illig, Thomas; Hveem, Kristian; Hu, Frank B; Laakso, Markku; Stefansson, Kari; Pedersen, Oluf; Wareham, Nicholas J; Barroso, Inês; Hattersley, Andrew T; Collins, Francis S; Groop, Leif; McCarthy, Mark I; Boehnke, Michael; Altshuler, David

    2009-01-01

    Genome-wide association (GWA) studies have identified multiple new genomic loci at which common variants modestly but reproducibly influence risk of type 2 diabetes (T2D) [1-11]. Established associations to common and rare variants explain only a small proportion of the heritability of T2D. As previously published analyses had limited power to discover loci at which common alleles have modest effects, we performed meta-analysis of three T2D GWA scans encompassing 10,128 individuals of European descent and ~2.2 million SNPs (directly genotyped and imputed). Replication testing was performed in an independent sample with an effective sample size of up to 53,975. At least six new loci with robust evidence for association were detected, including the JAZF1 (p = 5.0 × 10^-14), CDC123/CAMK1D (p = 1.2 × 10^-10), TSPAN8/LGR5 (p = 1.1 × 10^-9), THADA (p = 1.1 × 10^-9), ADAMTS9 (p = 1.2 × 10^-8), and NOTCH2 (p = 4.1 × 10^-8) gene regions. The large number of loci with relatively small effects indicates the value of large discovery and follow-up samples in identifying additional clues about the inherited basis of T2D. PMID:18372903

  2. Polymorphic simple sequence repeat regions in chloroplast genomes: applications to the population genetics of pines.

    PubMed Central

    Powell, W; Morgante, M; McDevitt, R; Vendramin, G G; Rafalski, J A

    1995-01-01

    Simple sequence repeats (SSRs), consisting of tandemly repeated multiple copies of mono-, di-, tri-, or tetranucleotide motifs, are ubiquitous in eukaryotic genomes and are frequently used as genetic markers, taking advantage of their length polymorphism. We have examined the polymorphism of such sequences in the chloroplast genomes of plants, by using a PCR-based assay. GenBank searches identified the presence of several (dA)n.(dT)n mononucleotide stretches in chloroplast genomes. A chloroplast (cp) SSR was identified in three pine species (Pinus contorta, Pinus sylvestris, and Pinus thunbergii) 312 bp upstream of the psbA gene. DNA amplification of this repeated region from 11 pine species identified nine length variants. The polymorphic amplified fragments were isolated and the DNA sequence was determined, confirming that the length polymorphism was caused by variation in the length of the repeated region. In the pines, the chloroplast genome is transmitted through pollen and this PCR assay may be used to monitor gene flow in this genus. Analysis of 305 individuals from seven populations of Pinus leucodermis Ant. revealed the presence of four variants with intrapopulational diversities ranging from 0.000 to 0.629 and an average of 0.320. Restriction fragment length polymorphism analysis of cpDNA on the same populations previously failed to detect any variation. Population subdivision based on cpSSR was higher (Gst = 0.22, where Gst is coefficient of gene differentiation) than that revealed in a previous isozyme study (Gst = 0.05). We anticipate that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations. PMID:7644491

  3. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  4. Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites

    PubMed Central

    Reinprecht, Yarmilla; Arif, Muhammad; Simon, Leonardo C.; Pauls, K. Peter

    2015-01-01

    Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are, potentially, significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production due to the functional properties of their fibers in composites being unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue. PMID:26167917

  5. Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites.

    PubMed

    Reinprecht, Yarmilla; Arif, Muhammad; Simon, Leonardo C; Pauls, K Peter

    2015-01-01

    Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are, potentially, significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production due to the functional properties of their fibers in composites being unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue. PMID:26167917

  6. Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping

    PubMed Central

    Zeng, Xin; Li, Bo; Welch, Rene; Rojo, Constanza; Zheng, Ye; Dewey, Colin N.; Keleş, Sündüz

    2015-01-01

    Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells’ regulatory programs. Advancements in next generation sequencing enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq). However, interactions in highly repetitive regions of genomes have proven difficult to map since short reads of 50–100 base pairs (bps) from these regions map to multiple locations in reference genomes. Standard analytical methods discard such multi-mapping reads and the few that can accommodate them are prone to large false positive and negative rates. We developed Perm-seq, a prior-enhanced read allocation method for ChIP-seq experiments, that can allocate multi-mapping reads in highly repetitive regions of the genomes with high accuracy. We comprehensively evaluated Perm-seq, and found that our prior-enhanced approach significantly improves multi-read allocation accuracy over approaches that do not utilize additional data types. The statistical formalism underlying our approach facilitates supervising of multi-read allocation with a variety of data sources including histone ChIP-seq. We applied Perm-seq to 64 ENCODE ChIP-seq datasets from GM12878 and K562 cells and identified many novel protein-DNA interactions in segmental duplication regions. Our analysis reveals that although the protein-DNA interaction sites are evolutionarily less conserved in repetitive regions, they share the overall sequence characteristics of the protein-DNA interactions in non-repetitive regions. PMID:26484757

  7. Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-Enhanced Read Mapping.

    PubMed

    Zeng, Xin; Li, Bo; Welch, Rene; Rojo, Constanza; Zheng, Ye; Dewey, Colin N; Keleş, Sündüz

    2015-10-01

    Segmental duplications and other highly repetitive regions of genomes contribute significantly to cells' regulatory programs. Advancements in next generation sequencing enabled genome-wide profiling of protein-DNA interactions by chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq). However, interactions in highly repetitive regions of genomes have proven difficult to map since short reads of 50-100 base pairs (bps) from these regions map to multiple locations in reference genomes. Standard analytical methods discard such multi-mapping reads and the few that can accommodate them are prone to large false positive and negative rates. We developed Perm-seq, a prior-enhanced read allocation method for ChIP-seq experiments, that can allocate multi-mapping reads in highly repetitive regions of the genomes with high accuracy. We comprehensively evaluated Perm-seq, and found that our prior-enhanced approach significantly improves multi-read allocation accuracy over approaches that do not utilize additional data types. The statistical formalism underlying our approach facilitates supervising of multi-read allocation with a variety of data sources including histone ChIP-seq. We applied Perm-seq to 64 ENCODE ChIP-seq datasets from GM12878 and K562 cells and identified many novel protein-DNA interactions in segmental duplication regions. Our analysis reveals that although the protein-DNA interaction sites are evolutionarily less conserved in repetitive regions, they share the overall sequence characteristics of the protein-DNA interactions in non-repetitive regions. PMID:26484757

  8. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximally variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae), a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations
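    The p-distance quoted in the abstract above is the proportion of aligned sites at which two sequences differ. A minimal sketch of the calculation is given below. Skipping any column that contains a gap is an assumption made here for illustration; the abstract does not state how indels were treated, and the sequences are made up.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an illustrative choice, not taken from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differences / compared


# Made-up alignment: 9 gap-free columns, 1 substitution.
print(p_distance("ACGTACGT-A", "ACGTTCGTAA"))
```

    On this scale, a p-distance of 0.00145 corresponds to roughly one to two differing sites per thousand compared positions.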

  9. Genome-Centric Analysis of Microbial Populations Enriched by Hydraulic Fracture Fluid Additives in a Coal Bed Methane Production Well.

    PubMed

    Robbins, Steven J; Evans, Paul N; Parks, Donovan H; Golding, Suzanne D; Tyson, Gene W

    2016-01-01

    Coal bed methane (CBM) is generated primarily through the microbial degradation of coal. Despite a limited understanding of the microorganisms responsible for this process, there is significant interest in developing methods to stimulate additional methane production from CBM wells. Physical techniques including hydraulic fracture stimulation are commonly applied to CBM wells, however the effects of specific additives contained in hydraulic fracture fluids on native CBM microbial communities are poorly understood. Here, metagenomic sequencing was applied to the formation waters of a hydraulically fractured and several non-fractured CBM production wells to determine the effect of this stimulation technique on the in-situ microbial community. The hydraulically fractured well was dominated by two microbial populations belonging to the class Phycisphaerae (within phylum Planctomycetes) and candidate phylum Aminicenantes. Populations from these phyla were absent or present at extremely low abundance in non-fractured CBM wells. Detailed metabolic reconstruction of near-complete genomes from these populations showed that their high relative abundance in the hydraulically fractured CBM well could be explained by the introduction of additional carbon sources, electron acceptors, and biocides contained in the hydraulic fracture fluid. PMID:27375557

  10. Genome-Centric Analysis of Microbial Populations Enriched by Hydraulic Fracture Fluid Additives in a Coal Bed Methane Production Well

    PubMed Central

    Robbins, Steven J.; Evans, Paul N.; Parks, Donovan H.; Golding, Suzanne D.; Tyson, Gene W.

    2016-01-01

    Coal bed methane (CBM) is generated primarily through the microbial degradation of coal. Despite a limited understanding of the microorganisms responsible for this process, there is significant interest in developing methods to stimulate additional methane production from CBM wells. Physical techniques including hydraulic fracture stimulation are commonly applied to CBM wells, however the effects of specific additives contained in hydraulic fracture fluids on native CBM microbial communities are poorly understood. Here, metagenomic sequencing was applied to the formation waters of a hydraulically fractured and several non-fractured CBM production wells to determine the effect of this stimulation technique on the in-situ microbial community. The hydraulically fractured well was dominated by two microbial populations belonging to the class Phycisphaerae (within phylum Planctomycetes) and candidate phylum Aminicenantes. Populations from these phyla were absent or present at extremely low abundance in non-fractured CBM wells. Detailed metabolic reconstruction of near-complete genomes from these populations showed that their high relative abundance in the hydraulically fractured CBM well could be explained by the introduction of additional carbon sources, electron acceptors, and biocides contained in the hydraulic fracture fluid. PMID:27375557

  11. Different linkages in the long and short regions of the genomes of duck enteritis virus Clone-03 and VAC Strains

    PubMed Central

    2011-01-01

    Background: Duck enteritis virus (DEV) is an unassigned member in the family Herpesviridae. To demonstrate further the evolutionary position of DEV in the family Herpesviridae, we have described a 42,897-bp fragment. We demonstrated novel genomic organization at one end of the long (L) region and in the entire short (S) region in the Clone-03 strain of DEV. Results: A 42,897-bp fragment located downstream of the LORF11 gene was amplified from the Clone-03 strain of DEV by using 'targeted gene walking PCR'. Twenty-two open reading frames (ORFs) were predicted and determined in the following order: 5'-LORF11-RLORF1-ORF1-ICP4-S1-S2-US1-US10-SORF3-US2-MDV091.5-like-US3-US4-US5-US6-US7-US8-ORFx-US1-S2-S1-ICP4-3'. This was different from that of the published VAC strain, both in the linkage of the L region and S region, and in the length of the US10 and US7 proteins. The MDV091.5-like gene, ORFx gene, S1 gene and S2 gene were first observed in the DEV genome. The lengths of DEV US10 and US7 were determined to be 311 and 371 amino acids, respectively, in the Clone-03 strain of DEV, and these were different from those of other strains. The comparison of genomic organization in the fragment studied herein with those of other herpesviruses showed that DEV possesses some unique characteristics, such as the duplicated US1 at each end of the US region, and the US5, which showed no homology with those of other herpesviruses. In addition, the results of phylogenetic analysis of ORFs in the represented fragment indicated that DEV is closest to its counterparts VZV (Varicellovirus) and other avian herpesviruses. Conclusion: The molecular characteristics of the 42,897-bp fragment of Clone-03 have been found to be different from those of the VAC strain. The phylogenetic analysis of genes in this region showed that DEV should be a separate member of the subfamily Alphaherpesvirinae. PMID:21535884

  12. Genomic region operation kit for flexible processing of deep sequencing data.

    PubMed

    Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

    2013-01-01

    Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to a language amenable to computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/. PMID:23702556
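    The set-algebra view of genomic regions described in the abstract above can be illustrated with plain interval arithmetic. The sketch below is not GROK's API; the type alias, function names, and coordinates are all hypothetical. It treats each region set as a list of half-open intervals on a single chromosome and implements union and intersection.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) on one chromosome


def merge(regions: List[Region]) -> List[Region]:
    """Normalize a region set: sort, then fuse overlapping or touching intervals."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union(a: List[Region], b: List[Region]) -> List[Region]:
    """Positions covered by a or b."""
    return merge(a + b)


def intersect(a: List[Region], b: List[Region]) -> List[Region]:
    """Positions covered by both a and b, found with a two-pointer sweep."""
    a, b = merge(a), merge(b)
    out: List[Region] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical binding regions for two transcription factors (made-up coordinates).
tf1_regions = [(100, 200), (150, 300), (500, 600)]
tf2_regions = [(250, 550)]

print(intersect(tf1_regions, tf2_regions))  # regions bound by both factors
print(union(tf1_regions, tf2_regions))      # regions bound by either factor
```

    Sorted lists keep the sketch short; a balanced tree, as the abstract mentions, would suit region sets that are updated incrementally.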

  13. COUGER—co-factors associated with uniquely-bound genomic regions

    PubMed Central

    Munteanu, Alina; Ohler, Uwe; Gordân, Raluca

    2014-01-01

    Most transcription factors (TFs) belong to protein families that share a common DNA binding domain and have very similar DNA binding preferences. However, many paralogous TFs (i.e. members of the same TF family) perform different regulatory functions and interact with different genomic regions in the cell. A potential mechanism for achieving this differential in vivo specificity is through interactions with protein co-factors. Computational tools for studying the genomic binding profiles of paralogous TFs and identifying their putative co-factors are currently lacking. Here, we present an interactive web implementation of COUGER, a classification-based framework for identifying protein co-factors that might provide specificity to paralogous TFs. COUGER takes as input two sets of genomic regions bound by paralogous TFs, and it identifies a small set of putative co-factors that best distinguish the two sets of sequences. To achieve this task, COUGER uses a classification approach, with features that reflect the DNA-binding specificities of the putative co-factors. The identified co-factors are presented in a user-friendly output page, together with information that allows the user to understand and to explore the contributions of individual co-factor features. COUGER can be run as a stand-alone tool or through a web interface: http://couger.oit.duke.edu. PMID:24861628

  14. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker

    PubMed Central

    Fopa Fomeju, Berline; Falentin, Cyril; Lassalle, Gilles; Manzanares-Dauleux, Maria J.; Delourme, Régine

    2015-01-01

    All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U, and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling, or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the resistance phenotype

  15. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker.

    PubMed

    Fopa Fomeju, Berline; Falentin, Cyril; Lassalle, Gilles; Manzanares-Dauleux, Maria J; Delourme, Régine

    2015-01-01

    All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U, and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling, or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the resistance phenotype

  16. Complete genome sequence of Deltapapillomavirus 4 (bovine papillomavirus 2) from a bovine papillomavirus lesion in Amazon Region, Brazil

    PubMed Central

    Daudt, Cíntia; da Silva, Flavio RC; Cibulski, Samuel P; Weber, Matheus N; Mayer, Fabiana Q; Varela, Ana Paula M; Roehe, Paulo M; Canal, Cláudio W

    2016-01-01

    The complete genome sequence of bovine papillomavirus 2 (BPV2) from the Brazilian Amazon Region was determined using multiple-primed rolling circle amplification followed by Illumina sequencing. The genome is 7,947 bp long, with 45.9% GC content. It encodes seven early (E1, E2, E4, E5, E6, E7, and E8) and two late (L1 and L2) genes. A complete BPV2 genome can help future studies, since this BPV type is widely reported worldwide despite the lack of available complete genome sequences. PMID:27074259

  17. Complete genome sequence of Deltapapillomavirus 4 (bovine papillomavirus 2) from a bovine papillomavirus lesion in Amazon Region, Brazil.

    PubMed

    Daudt, Cíntia; Silva, Flavio Rc da; Cibulski, Samuel P; Weber, Matheus N; Mayer, Fabiana Q; Varela, Ana Paula M; Roehe, Paulo M; Canal, Cláudio W

    2016-04-01

    The complete genome sequence of bovine papillomavirus 2 (BPV2) from the Brazilian Amazon Region was determined using multiple-primed rolling circle amplification followed by Illumina sequencing. The genome is 7,947 bp long, with 45.9% GC content. It encodes seven early (E1, E2, E4, E5, E6, E7, and E8) and two late (L1 and L2) genes. The complete genome of a BPV2 can help in future studies, since this BPV type is widely reported worldwide despite the lack of available complete genome sequences. PMID:27074259

  18. Targeted enrichment of genomic DNA regions for next-generation sequencing

    PubMed Central

    ElSharawy, Abdou; Sauer, Sascha; van Helvoort, Joop M.L.M.; van der Zaag, P.J.; Franke, Andre; Nilsson, Mats; Lehrach, Hans; Brookes, Anthony J.

    2011-01-01

    In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings. PMID:22121152

  19. High-density linkage mapping and distribution of segregation distortion regions in the oak genome

    PubMed Central

    Bodénès, Catherine; Chancerel, Emilie; Ehrenmann, François; Kremer, Antoine; Plomion, Christophe

    2016-01-01

    We developed the densest single-nucleotide polymorphism (SNP)-based linkage genetic map to date for the genus Quercus. An 8k gene-based SNP array was used to genotype more than 1,000 full-sibs from two intraspecific and two interspecific full-sib families of Quercus petraea and Quercus robur. A high degree of collinearity was observed between the eight parental maps of the two species. A composite map was then established with 4,261 SNP markers spanning 742 cM over the 12 linkage groups (LGs) of the oak genome. Nine genomic regions from six LGs displayed highly significant distortions of segregation. Two main hypotheses concerning the mechanisms underlying segregation distortion are discussed: genetic load vs. reproductive barriers. Our findings suggest a predominance of pre-zygotic over post-zygotic barriers. PMID:27013549

  20. High-density linkage mapping and distribution of segregation distortion regions in the oak genome.

    PubMed

    Bodénès, Catherine; Chancerel, Emilie; Ehrenmann, François; Kremer, Antoine; Plomion, Christophe

    2016-04-01

    We developed the densest single-nucleotide polymorphism (SNP)-based linkage genetic map to date for the genus Quercus. An 8k gene-based SNP array was used to genotype more than 1,000 full-sibs from two intraspecific and two interspecific full-sib families of Quercus petraea and Quercus robur. A high degree of collinearity was observed between the eight parental maps of the two species. A composite map was then established with 4,261 SNP markers spanning 742 cM over the 12 linkage groups (LGs) of the oak genome. Nine genomic regions from six LGs displayed highly significant distortions of segregation. Two main hypotheses concerning the mechanisms underlying segregation distortion are discussed: genetic load vs. reproductive barriers. Our findings suggest a predominance of pre-zygotic over post-zygotic barriers. PMID:27013549

  1. Genomic Anatomy of a Premier Major Histocompatibility Complex Paralogous Region on Chromosome 1q21–q22

    PubMed Central

    Shiina, Takashi; Ando, Asako; Suto, Yumiko; Kasai, Fumio; Shigenari, Atsuko; Takishima, Nobusada; Kikkawa, Eri; Iwata, Kyoko; Kuwano, Yuko; Kitamura, Yuka; Matsuzawa, Yumiko; Sano, Kazumi; Nogami, Masahiro; Kawata, Hisako; Li, Suyun; Fukuzumi, Yasuhito; Yamazaki, Masaaki; Tashiro, Hiroyuki; Tamiya, Gen; Kohda, Atsushi; Okumura, Katsuzumi; Ikemura, Toshimichi; Soeda, Eiichi; Mizuki, Nobuhisa; Kimura, Minoru; Bahram, Seiamak; Inoko, Hidetoshi

    2001-01-01

    Human chromosomes 1q21–q25, 6p21.3–22.2, 9q33–q34, and 19p13.1–p13.4 carry clusters of paralogous loci, to date best defined by the flagship 6p MHC region. They have presumably been created by two rounds of large-scale genomic duplications around the time of vertebrate emergence. Phylogenetically, the 1q21–q25 region seems most closely related to the 6p21.3 MHC region, as it is the only MHC paralogous region that includes bona fide MHC class I genes, the CD1 and MR1 loci. Here, to clarify the genomic structure of this model MHC paralogous region as well as to gain insight into the evolutionary dynamics of the entire quadriplication process, a detailed analysis of a critical 1.7 megabase (Mb) region was performed. To this end, a composite, deep, YAC, BAC, and PAC contig encompassing all five CD1 genes and linking the centromeric +P5 locus to the telomeric KRTC7 locus was constructed. Within this contig a 1.1-Mb BAC and PAC core segment joining CD1D to FCER1A was fully sequenced and thoroughly analyzed. This led to the mapping of a total of 41 genes (12 expressed genes, 12 possibly expressed genes, and 17 pseudogenes), among which 31 were novel. The latter include 20 olfactory receptor (OR) genes, 9 of which are potentially expressed. Importantly, CD1, SPTA1, OR, and FCER1A belong to multigene families, which have paralogues in the other three regions. Furthermore, it is noteworthy that 12 of the 13 expressed genes in the 1q21–q22 region around the CD1 loci are immunologically relevant. In addition to CD1A-E, these include SPTA1, MNDA, IFI-16, AIM2, BL1A, FY and FCER1A. This functional convergence of structurally unrelated genes is reminiscent of the 6p MHC region, and perhaps represents the emergence of yet another antigen presentation gene cluster, in this case dedicated to lipid/glycolipid antigens rather than antigen-derived peptides. [The nucleotide sequence data reported in this paper have been submitted to the DDBJ, EMBL, and GenBank databases under

  2. Cooperative and specific binding of Vif to the 5' region of HIV-1 genomic RNA.

    PubMed

    Henriet, Simon; Richer, Delphine; Bernacchi, Serena; Decroly, Etienne; Vigne, Robert; Ehresmann, Bernard; Ehresmann, Chantal; Paillart, Jean-Christophe; Marquet, Roland

    2005-11-18

    The viral infectivity factor (Vif) protein of human immunodeficiency virus type 1 (HIV-1) is essential for viral replication in vivo. Packaging of Vif into viral particles is mediated by an interaction with viral genomic RNA and association with viral nucleoprotein complexes. Despite recent findings on the RNA-binding properties of Vif suggesting that Vif could be involved in retroviral assembly, no RNA sequence or structure specificity has been determined so far. To gain further insight into the mechanisms by which Vif might regulate viral replication, we studied the interactions of Vif with HIV-1 genomic RNA in vitro. Using extensive biochemical analysis, we have measured the affinity of recombinant Vif proteins for synthetic RNAs corresponding to various regions of the HIV-1 genome. We found that recombinant Vif proteins bind specifically to HIV-1 viral RNA fragments corresponding to the 5'-untranslated region (5'-UTR), gag and the 5' part of pol (Kd between 45 nM and 65 nM). RNAs encompassing nucleotides 1-497 or 499-996 of the HIV-1 genomic RNA bind 9 ± 2 and 21 ± 3 Vif molecules, respectively, and at least some of these proteins bind in a cooperative manner (Hill constant αH = 2.3). In contrast, RNAs corresponding to other parts of the HIV-1 genome or heterologous RNAs showed poor binding capacity and weak cooperativity (Kd > 200 nM). Moreover, RNase T1 footprinting revealed a hierarchical binding of Vif, pointing to TAR and the poly(A) stem-loop structures as primary strong affinity targets, and downstream structures as secondary sites with moderate affinity. Taken together, our findings suggest that Vif may assist other proteins to maintain a correct folding of the genomic RNA in order to facilitate its packaging and further steps such as reverse transcription. Interestingly, our results also suggest that Vif could bind the viral RNA in order to protect it from the action of the antiviral factor APOBEC-3G/3F. PMID:16236319
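
    The cooperative binding described in this record can be illustrated with the Hill equation, theta = L^n / (K^n + L^n), where K is the apparent half-saturation concentration and n is the Hill coefficient. The sketch below is only an illustration, not the authors' fitting procedure: the function name `hill_fraction_bound` is hypothetical, and the value of 55 nM is an assumed midpoint of the 45-65 nM range quoted in the abstract.

```python
def hill_fraction_bound(conc_nm: float, k_half_nm: float, hill_n: float) -> float:
    """Fractional occupancy from the Hill equation.

    conc_nm   -- free ligand concentration (nM)
    k_half_nm -- apparent half-saturation concentration (nM); assumed, not fitted
    hill_n    -- Hill coefficient (n > 1 indicates positive cooperativity)
    """
    if conc_nm < 0 or k_half_nm <= 0 or hill_n <= 0:
        raise ValueError("concentration must be >= 0; k_half and n must be > 0")
    powered = conc_nm ** hill_n
    return powered / (k_half_nm ** hill_n + powered)


if __name__ == "__main__":
    # Compare a cooperative curve (n = 2.3, as reported) with a
    # non-cooperative one (n = 1) at a few illustrative concentrations.
    for conc in (10.0, 55.0, 200.0):
        coop = hill_fraction_bound(conc, 55.0, 2.3)
        flat = hill_fraction_bound(conc, 55.0, 1.0)
        print(f"{conc:6.1f} nM  n=2.3: {coop:.3f}  n=1: {flat:.3f}")
```

    With n above 1 the curve is steeper around the half-saturation point, so occupancy stays lower below K and rises faster above it than in the non-cooperative case.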

  3. Origins of the Xylella fastidiosa Prophage-Like Regions and Their Impact in Genome Differentiation

    PubMed Central

    de Mello Varani, Alessandro; Souza, Rangel Celso; Nakaya, Helder I.; de Lima, Wanessa Cristina; Paula de Almeida, Luiz Gonzaga; Kitajima, Elliot Watanabe; Chen, Jianchi; Civerolo, Edwin; Vasconcelos, Ana Tereza Ribeiro; Van Sluys, Marie-Anne

    2008-01-01

    Xylella fastidiosa is a Gram-negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1), and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrographs reveal the existence of putative viral particles with similar morphology to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from the lysogenic to the lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, thus suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal that they group in five lineages, all possessing a tyrosine recombinase catalytic domain, and are phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity to the differentiation of Xylella genomes. PMID:19116666

  4. Gene map of large yellow croaker (Larimichthys crocea) provides insights into teleost genome evolution and conserved regions associated with growth

    PubMed Central

    Xiao, Shijun; Wang, Panpan; Zhang, Yan; Fang, Lujing; Liu, Yang; Li, Jiong-Tang; Wang, Zhi-Yong

    2015-01-01

    The genetic map of a species is essential for its whole genome assembly and can be applied to the mapping of important traits. In this study, we performed RNA-seq for a family of large yellow croakers (Larimichthys crocea) and constructed a high-density genetic map. In this map, 24 linkage groups comprised 3,448 polymorphic SNP markers. Approximately 72.4% (2,495) of the markers were located in protein-coding regions. Comparison of the croaker genome with those of five model fish species revealed that the croaker genome structure was closer to that of the medaka than to the remaining four genomes. Because the medaka genome preserves the teleost ancestral karyotype, this result indicated that the croaker genome might also maintain the teleost ancestral genome structure. The analysis also revealed different genome rearrangements across teleosts. QTL mapping and association analysis consistently identified growth-related QTL regions and associated genes. Orthologs of the associated genes in other species were demonstrated to regulate development, indicating that these genes might regulate development and growth in croaker. This gene map will enable us to construct the croaker genome for comparative studies and to provide an important resource for selective breeding of croaker. PMID:26689832

  5. Non-additive genome-wide association scan reveals a new gene associated with habitual coffee consumption.

    PubMed

    Pirastu, Nicola; Kooyman, Maarten; Robino, Antonietta; van der Spek, Ashley; Navarini, Luciano; Amin, Najaf; Karssen, Lennart C; Van Duijn, Cornelia M; Gasparini, Paolo

    2016-01-01

    Coffee is one of the most consumed beverages world-wide and one of the primary sources of caffeine intake. Given its important health and economic impact, the underlying genetics of its consumption has been widely studied. Despite these efforts, much has still to be uncovered. In particular, the use of non-additive genetic models may uncover new information about the genetic variants driving coffee consumption. We have conducted a genome-wide association study in two Italian populations using additive, recessive and dominant models for analysis. This has uncovered a significant association in the PDSS2 gene under the recessive model that has been replicated in an independent cohort from the Netherlands (ERF). The identified gene has been shown to negatively regulate the expression of the caffeine metabolism genes and can thus be linked to coffee consumption. Further bioinformatics analysis of eQTL and histone marks from Roadmap data has evidenced a possible role of the identified SNPs in regulating PDSS2 gene expression through enhancers present in its intron. Our results highlight a novel gene which regulates coffee consumption by regulating the expression of the genes linked to caffeine metabolism. Further studies will be needed to clarify the biological mechanism which links PDSS2 and coffee consumption. PMID:27561104

  6. Non-additive genome-wide association scan reveals a new gene associated with habitual coffee consumption

    PubMed Central

    Pirastu, Nicola; Kooyman, Maarten; Robino, Antonietta; van der Spek, Ashley; Navarini, Luciano; Amin, Najaf; Karssen, Lennart C.; Van Duijn, Cornelia M; Gasparini, Paolo

    2016-01-01

    Coffee is one of the most consumed beverages world-wide and one of the primary sources of caffeine intake. Given its important health and economic impact, the underlying genetics of its consumption has been widely studied. Despite these efforts, much has still to be uncovered. In particular, the use of non-additive genetic models may uncover new information about the genetic variants driving coffee consumption. We have conducted a genome-wide association study in two Italian populations using additive, recessive and dominant models for analysis. This has uncovered a significant association in the PDSS2 gene under the recessive model that has been replicated in an independent cohort from the Netherlands (ERF). The identified gene has been shown to negatively regulate the expression of the caffeine metabolism genes and can thus be linked to coffee consumption. Further bioinformatics analysis of eQTL and histone marks from Roadmap data has evidenced a possible role of the identified SNPs in regulating PDSS2 gene expression through enhancers present in its intron. Our results highlight a novel gene which regulates coffee consumption by regulating the expression of the genes linked to caffeine metabolism. Further studies will be needed to clarify the biological mechanism which links PDSS2 and coffee consumption. PMID:27561104

  7. Establishment of regions of genomic activity during the Drosophila maternal to zygotic transition.

    PubMed

    Li, Xiao-Yong; Harrison, Melissa M; Villalta, Jacqueline E; Kaplan, Tommy; Eisen, Michael B

    2014-01-01

    We describe the genome-wide distributions and temporal dynamics of nucleosomes and post-translational histone modifications throughout the maternal-to-zygotic transition in embryos of Drosophila melanogaster. At mitotic cycle 8, when few zygotic genes are being transcribed, embryonic chromatin is in a relatively simple state: there are few nucleosome free regions, undetectable levels of the histone methylation marks characteristic of mature chromatin, and low levels of histone acetylation at a relatively small number of loci. Histone acetylation increases by cycle 12, but it is not until cycle 14 that nucleosome free regions and domains of histone methylation become widespread. Early histone acetylation is strongly associated with regions that we have previously shown to be bound in early embryos by the maternally deposited transcription factor Zelda, suggesting that Zelda triggers a cascade of events, including the accumulation of specific histone modifications, that plays a role in the subsequent activation of these sequences. PMID:25313869

  8. PCR primers for 30 novel gene regions in the nuclear genomes of Lepidoptera

    PubMed Central

    Wahlberg, Niklas; Peña, Carlos; Ahola, Milla; Wheat, Christopher W.; Rota, Jadranka

    2016-01-01

    We report primer pairs for 30 new gene regions in the nuclear genomes of Lepidoptera that can be amplified using a standard PCR protocol. The new primers were tested across diverse Lepidoptera, including nonditrysians and a wide selection of ditrysians. These new gene regions give a total of 11,043 bp of DNA sequence data and they show similar variability to traditionally used nuclear gene regions in studies of Lepidoptera. We feel that a PCR-based approach still has its place in molecular systematic studies of Lepidoptera, particularly at the intrafamilial level, and our new set of primers now provides a route to generating phylogenomic datasets using traditional methods. PMID:27408580

  9. Pedigree-based analysis of derivation of genome segments of an elite rice reveals key regions during its breeding.

    PubMed

    Zhou, Degui; Chen, Wei; Lin, Zechuan; Chen, Haodong; Wang, Chongrong; Li, Hong; Yu, Renbo; Zhang, Fengyun; Zhen, Gang; Yi, Junliang; Li, Kanghuo; Liu, Yaoguang; Terzaghi, William; Tang, Xiaoyan; He, Hang; Zhou, Shaochuan; Deng, Xing Wang

    2016-02-01

    Analyses of genome variations with high-throughput assays have improved our understanding of the genetic basis of crop domestication and identified the selected genome regions, but little is known about the genetic basis of modern breeding, which has limited the usefulness of the many elite cultivars in further breeding. Here we deploy pedigree-based analysis of an elite rice, Huanghuazhan, to identify key genome regions during its breeding. The cultivars in the pedigree were resequenced with 7.6× depth on average, and 2.1 million high-quality single nucleotide polymorphisms (SNPs) were obtained. Tracing the derivation of genome blocks with pedigree and information on SNPs revealed the chromosomal recombination during breeding, which showed that 26.22% of the Huanghuazhan genome consists of strictly conserved key regions. These major effect regions were further supported by a QTL mapping of 260 recombinant inbred lines derived from the cross of Huanghuazhan and a very dissimilar cultivar, Shuanggui 36, and by the genome profile of eight cultivars and 36 elite lines derived from Huanghuazhan. Overlaying the cloned genes on these regions revealed that they include a number of key genes, which were then used to demonstrate how Huanghuazhan was bred after 30 years of effort and to dissect the deficiency of artificial selection. We conclude that these regions are helpful for further breeding based on this pedigree and for performing breeding by design. Our study provides genetic dissection of modern rice breeding and sheds new light on how to perform genome-wide breeding by design. PMID:26096084

  10. DNA Rearrangement in Orthologous Orp Regions of the Maize, Rice and Sorghum Genomes

    PubMed Central

    Ma, Jianxin; SanMiguel, Phillip; Lai, Jinsheng; Messing, Joachim; Bennetzen, Jeffrey L.

    2005-01-01

    The homeologous Orp1 and Orp2 regions of maize and the orthologous regions in sorghum and rice were compared by generating sequence data for >486 kb of genomic DNA. At least three genic rearrangements differentiate the maize Orp1 and Orp2 segments, including an insertion of a single gene and two deletions that removed one gene each, while no genic rearrangements were detected in the maize Orp2 region relative to sorghum. Extended comparison of the orthologous Orp regions of sorghum and japonica rice uncovered numerous genic rearrangements and the presence of a transposon-rich region in rice. Only 11 of 27 genes (40%) are arranged in the same order and orientation between sorghum and rice. Of the 8 genes that are uniquely present in the sorghum region, 4 were found to have single-copy homologs in both rice and Arabidopsis, but none of these genes are located near each other, indicating frequent gene movement. Further comparison of the Orp segments from two rice subspecies, japonica and indica, revealed that the transposon-rich region is both an ancient and current hotspot for retrotransposon accumulation and genic rearrangement. We also identify unequal gene conversion as a mechanism for maize retrotransposon rearrangement.

  11. Small activating RNA binds to the genomic target site in a seed-region-dependent manner

    PubMed Central

    Meng, Xing; Jiang, Qian; Chang, Nannan; Wang, Xiaoxia; Liu, Chujun; Xiong, Jingwei; Cao, Huiqing; Liang, Zicai

    2016-01-01

    RNA activation (RNAa) is the upregulation of gene expression by small activating RNAs (saRNAs). In order to investigate the mechanism by which saRNAs act in RNAa, we used the progesterone receptor (PR) gene as a model, established a panel of effective saRNAs and assessed the involvement of the sense and antisense strands of saRNA in RNAa. All active saRNAs had their antisense strand effectively incorporated into Ago2, whereas such consistency did not occur for the sense strand. Using a distal hotspot for saRNA targeting at 1.6-kb upstream from the PR transcription start site, we further established that gene activation mediated by saRNA depended on the complementarity of the 5′ region of the antisense strand, and that such activity was largely abolished by mutations in this region of the saRNA. We found markedly reduced RNAa effects when we created mutations in the genomic target site of saRNA PR-1611, thus providing evidence that RNAa depends on the integrity of the DNA target. We further demonstrated that this saRNA bound the target site on promoter DNA. These results demonstrated that saRNAs work via an on-site mechanism by binding to target genomic DNA in a seed-region-dependent manner, reminiscent of miRNA-like target recognition. PMID:26873922

  12. A genome walking strategy for the identification of nucleotide sequences adjacent to known regions.

    PubMed

    Wang, Hailong; Yao, Ting; Cai, Mei; Xiao, Xiuqing; Ding, Xuezhi; Xia, Liqiu

    2013-02-01

    To identify the transposon insertion sites in a soil actinomycete, Saccharopolyspora spinosa, a genome walking approach, termed SPTA-PCR, was developed. In SPTA-PCR, a simple procedure consisting of TA cloning and a high-stringency PCR, following the single primer-mediated, randomly primed PCR, can eliminate non-target DNA fragments and obtain target fragments specifically. Using SPTA-PCR, the DNA sequence adjacent to the highly conserved region of the lectin-coding gene in an onion plant, Allium chinense, was also cloned. PMID:23108875

  13. [Topological Conflicts in Phylogenetic Analysis of Different Regions of the Sable (Martes zibellina L.) Mitochondrial Genome].

    PubMed

    Malyarchuk, B A; Derenko, M V; Denisova, G A; Litvinov, A N

    2015-08-01

    Phylogenetic analysis of different regions of the mitochondrial genome of the sable showed the presence of several topologies of phylogenetic trees, but the most statistically significant topology is A-BC, which was obtained as a result of the analysis of the mitochondrial genome as a whole, as well as of the individual CO1, ND4, and ND5 genes. Analysis of the intergroup divergence of the mtDNA haplotypes (Dxy) indicated that the maximum Dxy values between A and BC groups were accompanied by minimum differences between B and C groups only for six genes showing the A-BC topology (12S rRNA; CO1, CO2, ND4, ND5, and CYTB). It is assumed that the topological conflicts observed in the analysis of individual sable mtDNA genes are associated with the uneven distribution of mutations along the mitochondrial genome and the mitochondrial tree. This may be due to random causes, as well as the nonuniform effect of selection. PMID:26601491

  14. Identification and mapping of DNA binding proteins target sequences in long genomic regions by two-dimensional EMSA.

    PubMed

    Chernov, Igor P; Akopov, Sergey B; Nikolaev, Lev G; Sverdlov, Eugene D

    2006-07-01

    Specific binding of nuclear proteins, in particular transcription factors, to target DNA sequences is a major mechanism of genome functioning and gene expression regulation in eukaryotes. Therefore, identification and mapping specific protein target sites (PTS) is necessary for understanding genomic regulation. Here we used a novel two-dimensional electrophoretic mobility shift assay (2D-EMSA) procedure for identification and mapping of 52 PTS within a 563-kb human genome region located between the FXYD5 and TZFP genes. The PTS occurred with approximately equal frequency within unique and repetitive genomic regions. PTS belonging to unique sequences tended to group together within gene introns and close to their 5' and 3' ends, whereas PTS located within repeats were evenly distributed between transcribed and intragenic regions. PMID:16869519

  15. Full-genome sequences of hepatitis B virus subgenotype D3 isolates from the Brazilian Amazon Region.

    PubMed

    Spitz, Natália; Mello, Francisco C A; Araujo, Natalia Motta

    2015-02-01

    The Brazilian Amazon Region is a highly endemic area for hepatitis B virus (HBV). However, little is known regarding the genetic variability of the strains circulating in this geographical region. Here, we describe the first full-length genomes of HBV isolated in the Brazilian Amazon Region; these genomes are also the first complete HBV subgenotype D3 genomes reported for Brazil. The genomes of the five Brazilian isolates were all 3,182 base pairs in length and the isolates were classified as belonging to subgenotype D3, subtypes ayw2 (n = 3) and ayw3 (n = 2). Phylogenetic analysis suggested that the Brazilian sequences are not likely to be closely related to European D3 sequences. Such results will contribute to further epidemiological and evolutionary studies of HBV. PMID:25742278

  16. Composite selection signals can localize the trait specific genomic regions in multi-breed populations of cattle and sheep

    PubMed Central

    2014-01-01

    Background: Discerning the traits evolving under neutral conditions from those traits evolving rapidly because of various selection pressures is a great challenge. We propose a new method, composite selection signals (CSS), which unifies the multiple pieces of selection evidence from the rank distribution of its diverse constituent tests. The extreme CSS scores capture highly differentiated loci and underlying common variants hauling excess haplotype homozygosity in the samples of a target population. Results: The data on high-density genotypes were analyzed for evidence of an association with either polledness or double muscling in various cohorts of cattle and sheep. In cattle, extreme CSS scores were found in the candidate regions on autosomes BTA-1 and BTA-2, flanking the POLL locus and MSTN gene, for polledness and double muscling, respectively. In sheep, the regions with extreme scores were localized on autosome OAR-2 harbouring the MSTN gene for double muscling and on OAR-10 harbouring the RXFP2 gene for polledness. In comparison to the constituent tests, there was a partial agreement between the signals at the four candidate loci; however, they consistently identified additional genomic regions harbouring no known genes. Persuasively, our list of all the additional significant CSS regions contains genes that have been successfully implicated in secondary phenotypic diversity among several subpopulations in our data. For example, the method identified a strong selection signature for stature in cattle capturing selective sweeps harbouring UQCC-GDF5 and PLAG1-CHCHD7 gene regions on BTA-13 and BTA-14, respectively. Both gene pairs have been previously associated with height in humans, while PLAG1-CHCHD7 has also been reported for stature in cattle. In the additional analysis, CSS identified significant regions harbouring multiple genes for various traits under selection in European cattle including polledness, adaptation, metabolism, growth rate, stature
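
    The idea of combining several selection tests through their rank distributions can be sketched as follows. This is a hedged illustration of one plausible rank-based combination, not the published CSS implementation: the function names, the toy scores, and the specific choices (fractional ranks, inverse-normal transform, averaging z-scores across tests, treating tests as independent) are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def fractional_ranks(scores):
    """Map each score to rank/(n+1), so values lie strictly in (0, 1).

    Ties are broken by input order; a real analysis would need a tie rule.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position / (n + 1)
    return ranks


def composite_scores(tests):
    """Combine several per-locus test statistics into one score per locus.

    tests -- list of equally long lists; tests[t][j] is the statistic of
             test t at locus j, with larger values meaning stronger evidence.
    Returns -log10(p) per locus under the assumed independence of tests.
    """
    normal = NormalDist()
    n_tests = len(tests)
    n_loci = len(tests[0])
    z_by_test = [[normal.inv_cdf(r) for r in fractional_ranks(t)] for t in tests]
    combined = []
    for j in range(n_loci):
        mean_z = sum(z[j] for z in z_by_test) / n_tests
        # The mean of m independent standard normals has sd 1/sqrt(m).
        p_value = 1.0 - normal.cdf(mean_z * math.sqrt(n_tests))
        combined.append(-math.log10(p_value))
    return combined


if __name__ == "__main__":
    # Two hypothetical tests scored at five loci; locus 2 ranks highest in both.
    toy_tests = [[0.1, 0.5, 0.9, 0.2, 0.3], [1.0, 4.0, 9.0, 2.0, 3.0]]
    print(composite_scores(toy_tests))
```

    Working on ranks rather than raw statistics puts tests with very different scales on a common footing before they are combined.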

  17. Microcollinearity in an ethylene receptor coding gene region of the Coffea canephora genome is extensively conserved with Vitis vinifera and other distant dicotyledonous sequenced genomes

    PubMed Central

    Guyot, Romain; de la Mare, Marion; Viader, Véronique; Hamon, Perla; Coriton, Olivier; Bustamante-Porras, José; Poncet, Valérie; Campa, Claudine; Hamon, Serge; de Kochko, Alexandre

    2009-01-01

    Background: Coffea canephora, also called Robusta, belongs to the Rubiaceae, the fourth largest angiosperm family. This diploid species (2x = 2n = 22) has a fairly small genome size of ≈ 690 Mb and despite its extreme economic importance, particularly for developing countries, knowledge of the genome composition, structure and evolution remains very limited. Here, we report the 160 kb of the first C. canephora Bacterial Artificial Chromosome (BAC) clone ever sequenced and its fine analysis. Results: This clone contains the CcEIN4 gene, encoding an ethylene receptor, and twenty other predicted genes showing a high gene density of one gene per 7.8 kb. Most of them display perfect matches with C. canephora expressed sequence tags or show transcriptional activities through PCR amplifications on cDNA libraries. Twenty-three transposable elements, mainly Class II transposon derivatives, were identified at this locus. Most of these Class II elements are Miniature Inverted-repeat Transposable Elements (MITE) known to be closely associated with plant genes. This BAC composition gives a pattern similar to those found in gene-rich regions of Solanum lycopersicum and Medicago truncatula genomes, indicating that the CcEIN4 regions may belong to a gene-rich region in the C. canephora genome. Comparative sequence analysis indicated an extensive conservation between C. canephora and most of the reference dicotyledonous genomes studied in this work, such as tomato (S. lycopersicum), grapevine (V. vinifera), barrel medic (M. truncatula), black cottonwood (Populus trichocarpa) and Arabidopsis thaliana. The highest degree of microcollinearity was found between C. canephora and V. vinifera, which belong respectively to the Asterids and Rosids, two clades that diverged more than 114 million years ago. Conclusion: This study provides a first glimpse of C. canephora genome composition and evolution. Our data revealed a remarkable conservation of the microcollinearity between C. canephora and V

  18. ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions

    PubMed Central

    2011-01-01

    ZINBA (Zero-Inflated Negative Binomial Algorithm) identifies genomic regions enriched in a variety of ChIP-seq and related next-generation sequencing experiments (DNA-seq), calling both broad and narrow modes of enrichment across a range of signal-to-noise ratios. ZINBA models and accounts for factors that co-vary with background or experimental signal, such as G/C content, and identifies enrichment in genomes with complex local copy number variations. ZINBA provides a single unified framework for analyzing DNA-seq experiments in challenging genomic contexts. Software website: http://code.google.com/p/zinba/ PMID:21787385
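
    As an illustration of the distribution named in the abstract above, the following is a minimal sketch of a zero-inflated negative binomial probability mass function. It is not ZINBA's implementation: the function names and the mean/dispersion parameterization (`mu`, `size`) and the zero-inflation weight `pi_zero` are assumptions chosen for the example, and the covariate modeling described in the abstract is omitted.

```python
import math


def nb_log_pmf(k, mu, size):
    """Log-probability of count k under a negative binomial
    with mean mu > 0 and dispersion parameter size > 0 (hypothetical names)."""
    return (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )


def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated negative binomial: with probability pi_zero the count
    is a structural zero; otherwise it is drawn from the negative binomial."""
    nb = math.exp(nb_log_pmf(k, mu, size))
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * nb
    return (1.0 - pi_zero) * nb
```

    In this sketch, a window with zero reads receives extra probability mass from the `pi_zero` component, which is the general motivation for zero inflation when many genomic windows have no coverage.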

  19. A genomic region involved in the formation of adhesin fibers in Bacillus cereus biofilms

    PubMed Central

    Caro-Astorga, Joaquín; Pérez-García, Alejandro; de Vicente, Antonio; Romero, Diego

    2015-01-01

    Bacillus cereus is a bacterial pathogen that is responsible for many recurrent disease outbreaks due to food contamination. Spores and biofilms are considered the most important reservoirs of B. cereus in contaminated fresh vegetables and fruits. Biofilms are bacterial communities that are difficult to eradicate from biotic and abiotic surfaces because of their stable and extremely strong extracellular matrix. These extracellular matrixes contain exopolysaccharides, proteins, extracellular DNA, and other minor components. Although B. cereus can form biofilms, the bacterial features governing assembly of the protective extracellular matrix are not known. Using the well-studied bacterium B. subtilis as a model, we identified two genomic loci in B. cereus, which encode two orthologs of the amyloid-like protein TasA of B. subtilis and a SipW signal peptidase. Deletion of this genomic region in B. cereus inhibited biofilm assembly; notably, mutation of the putative signal peptidase SipW caused the same phenotype. However, mutations in tasA or calY did not completely prevent biofilm formation; strains that were mutated for either of these genes formed phenotypically different surface attached biofilms. Electron microscopy studies revealed that TasA polymerizes to form long and abundant fibers on cell surfaces, whereas CalY does not aggregate similarly. Heterologous expression of this amyloid-like cassette in a B. subtilis strain lacking the factors required for the assembly of TasA amyloid-like fibers revealed (i) the involvement of this B. cereus genomic region in formation of the air-liquid interphase pellicles and (ii) the intrinsic ability of TasA to form fibers similar to the amyloid-like fibers produced by its B. subtilis ortholog. PMID:25628606

  20. Long regions of homologous DNA are incorporated into the tobacco plastid genome by transformation.

    PubMed Central

    Staub, J M; Maliga, P

    1992-01-01

    We investigated the size of flanking DNA incorporated into the tobacco plastid genome alongside a selectable antibiotic resistance mutation. The results showed that integration of a long uninterrupted region of homologous DNA, rather than of small fragments as previously thought, is the more likely event in plastid transformation of land plants. Transforming plasmid pJS75 contains a 6.2-kb DNA fragment from the inverted repeat region of the tobacco plastid genome. A spectinomycin resistance mutation is encoded in the gene of the 16S rRNA and, 3.2 kb away, a streptomycin resistance mutation is encoded in exon II of the ribosomal protein gene rps12. Transplastomic lines were obtained after introduction of pJS75 DNA into leaf cells by the biolistic process and selection for the spectinomycin resistance marker. Homologous replacement of resident wild-type sequences resulted in integration of all, or almost all, of the 6.2-kb plastid DNA sequence from pJS75. Plasmid pJS75, which contains engineered cloning sites between two selectable markers, can be used as a plastid insertion vector. PMID:1356049

  1. Identification of Sex-Linked SNPs and Sex-Determining Regions in the Yellowtail Genome.

    PubMed

    Koyama, Takashi; Ozaki, Akiyuki; Yoshida, Kazunori; Suzuki, Junpei; Fuji, Kanako; Aoki, Jun-ya; Kai, Wataru; Kawabata, Yumi; Tsuzaki, Tatsuo; Araki, Kazuo; Sakamoto, Takashi

    2015-08-01

    Unlike the conservation of sex-determining (SD) modes seen in most mammals and birds, teleost fishes exhibit a wide variety of SD systems and genes. Hence, the study of SD genes and sex chromosome turnover in fish is one of the most interesting topics in evolutionary biology. To increase resolution of the SD gene evolutionary trajectory in fish, identification of the SD gene in more fish species is necessary. In this study, we focused on the yellowtail, a species widely cultivated in Japan. It is a member of family Carangidae in which no heteromorphic sex chromosome has been observed, and no SD gene has been identified to date. By performing linkage analysis and BAC walking, we identified a genomic region and SNPs with complete linkage to yellowtail sex. Comparative genome analysis revealed the yellowtail SD region ancestral chromosome structure as medaka-fugu. Two inversions occurred in the yellowtail lineage after it diverged from the yellowtail-medaka ancestor. An association study using wild yellowtails and the SNPs developed from BAC ends identified two SNPs that can reasonably distinguish the sexes. Therefore, these will be useful genetic markers for yellowtail breeding. Based on a comparative study, it was suggested that a PDZ domain containing the GIPC protein might be involved in yellowtail sex determination. The homomorphic sex chromosomes widely observed in the Carangidae suggest that this family could be a suitable marine fish model to investigate the early stages of sex chromosome evolution, for which our results provide a good starting point. PMID:25975833

  2. Bacillus subtilis genome editing using ssDNA with short homology regions.

    PubMed

    Wang, Yang; Weng, Jun; Waseem, Raza; Yin, Xihou; Zhang, Ruifu; Shen, Qirong

    2012-07-01

    In this study, we developed a simple and efficient Bacillus subtilis genome editing method in which targeted gene(s) could be inactivated by single-stranded PCR product(s) flanked by short homology regions and in-frame deletion could be achieved by incubating the transformants at 42°C. In this process, homologous recombination (HR) was promoted by the lambda beta protein synthesized under the control of promoter P(RM) in the lambda cI857 P(RM)-P(R) promoter system on a temperature sensitive plasmid pWY121. Promoter P(R) drove the expression of the recombinase gene cre at 42°C for excising the floxed (lox sites flanked) disruption cassette that contained a bleomycin resistance marker and a heat inducible counter-selectable marker (hewl, encoding hen egg white lysozyme). Then, we amplified the single-stranded disruption cassette using the primers that carried 70 nt homology extensions corresponding to the regions flanking the target gene. By transforming the respective PCR products into the B. subtilis that harbored pWY121 and incubating the resultant mutants at 42°C, we knocked out multiple genes in the same genetic background with no marker left. This process is simple and efficient and can be widely applied to large-scale genome analysis of recalcitrant Bacillus species. PMID:22422839

  3. The polycystic kidney disease 1 gene lies in a duplicated genomic region

    SciTech Connect

    Ward, C.J.; Hughes, J.; Peral, B.

    1994-09-01

    The polycystic kidney disease 1 (PKD1) gene is situated in chromosomal band 16p13.3 and encodes a 14 kb transcript. The 5′ region of the PKD1 gene is located within a 40-50 kb stretch of genomic DNA which is duplicated several times in the more proximal region, 16p13.1. This proximal area gives rise to at least three transcripts designated homologous gene A (HG-A; 21 kb), HG-B (17 kb) and HG-C (8.5 kb). These three transcripts share substantial homology with each other and the PKD1 transcript. However, the 3′ 3.8 kb section of the PKD1 transcript is unique because it is encoded by a region of the gene that lies outside the duplicated area. The presence of the duplicate transcripts in all tissues analyzed has hampered attempts to clone and sequence the bona fide PKD1 gene. Comparison of cDNAs known to arise from the PKD1 transcript to those from the HG transcripts reveals that divergence of 2-3% has occurred between these sequences. To overcome the problem of the duplication, a large 15 kb section of genomic DNA has been sequenced together with several large HG cDNAs. Utilizing a radiation hybrid which contains only the 16p13.3 region and expresses low levels of the PKD1 transcript, we are now attempting to clone the duplicated part of the PKD1 gene by exon linking.

  4. In silico screening of the chicken genome for overlaps between genomic regions: microRNA genes, coding and non-coding transcriptional units, QTL, and genetic variations.

    PubMed

    Zorc, Minja; Kunej, Tanja

    2016-05-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in the chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region. Identified miRNA-related genomic hotspots in chicken can serve researchers as a

  5. PacBio SMRT assembly of a complex multi-replicon genome reveals chlorocatechol degradative operon in a region of genome plasticity.

    PubMed

    Ricker, N; Shen, S Y; Goordial, J; Jin, S; Fulthorpe, R R

    2016-07-25

    We have sequenced a Burkholderia genome that contains multiple replicons and large repetitive elements that would make it inherently difficult to assemble by short read sequencing technologies. We illustrate how the integrated long read correction algorithms implemented through the PacBio Single Molecule Real-Time (SMRT) sequencing technology successfully provided a de novo assembly that is a reasonable estimate of both the gene content and genome organization without making any further modifications. This assembly is comparable to related organisms assembled by more labour intensive methods. Our assembled genome revealed regions of genome plasticity for further investigation, one of which harbours a chlorocatechol degradative operon highly homologous to those previously identified on globally ubiquitous plasmids. In an ideal world, this assembly would still require experimental validation to confirm gene order and copy number of repeated elements. However, we submit that particularly in instances where a polished genome is not the primary goal of the sequencing project, PacBio SMRT sequencing provides a financially viable option for generating a biologically relevant genome estimate that can be utilized by other researchers for comparative studies. PMID:27063562

  6. A novel method for discovering local spatial clusters of genomic regions with functional relationships from DNA contact maps

    PubMed Central

    Hu, Xihao; Shi, Christina Huan; Yip, Kevin Y.

    2016-01-01

    Motivation: The three-dimensional structure of genomes makes it possible for genomic regions not adjacent in the primary sequence to be spatially proximal. These DNA contacts have been found to be related to various molecular activities. Previous methods for analyzing DNA contact maps obtained from Hi-C experiments have largely focused on studying individual interactions, forming spatial clusters composed of contiguous blocks of genomic locations, or classifying these clusters into general categories based on some global properties of the contact maps. Results: Here, we describe a novel computational method that can flexibly identify small clusters of spatially proximal genomic regions based on their local contact patterns. Using simulated data that highly resemble Hi-C data obtained from real genome structures, we demonstrate that our method identifies spatial clusters that are more compact than methods previously used for clustering genomic regions based on DNA contact maps. The clusters identified by our method enable us to confirm functionally related genomic regions previously reported to be spatially proximal in different species. We further show that each genomic region can be assigned a numeric affinity value that indicates its degree of participation in each local cluster, and these affinity values correlate quantitatively with DNase I hypersensitivity, gene expression, super enhancer activities and replication timing in a cell type specific manner. We also show that these cluster affinity values can precisely define boundaries of reported topologically associating domains, and further define local sub-domains within each domain. Availability and implementation: The source code of BNMF and tutorials on how to use the software to extract local clusters from contact maps are available at http://yiplab.cse.cuhk.edu.hk/bnmf/. Contact: kevinyip@cse.cuhk.edu.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307607

  7. Three Distinct Regions of the Murine Gammaherpesvirus 68 Genome Are Transcriptionally Active in Latently Infected Mice

    PubMed Central

    Virgin, Herbert W.; Presti, Rachel M.; Li, Xi-Yang; Liu, Carl; Speck, Samuel H.

    1999-01-01

    The program(s) of gene expression operating during murine gammaherpesvirus 68 (γHV68) latency is undefined, as is the relationship between γHV68 latency and latency of primate gammaherpesviruses. We used a nested reverse transcriptase PCR strategy (sensitive to approximately one copy of γHV68 genome for each genomic region tested) to screen for the presence of viral transcripts in latently infected mice. Based on the positions of known latency-associated genes in other gammaherpesviruses, we screened for the presence of transcripts corresponding to 11 open reading frames (ORFs) in the γHV68 genome in RNA from spleens and peritoneal cells of latently infected B-cell-deficient (MuMT) mice, which have been shown to contain high levels of reactivable latent γHV68 (K. E. Weck, M. L. Barkon, L. I. Yoo, S. H. Speck, and H. W. Virgin, J. Virol. 70:6775–6780, 1996). To control for the possible presence of viral lytic activity, we determined that RNA from latently infected peritoneal and spleen cells contained few or no detectable transcripts corresponding to seven ORFs known to encode viral gene products associated with lytic replication. However, we did detect low-level expression of transcripts arising from the region of gene 50 (encoding the putative homolog of the Epstein-Barr virus BRLF1 transactivator) in peritoneal but not spleen cells. Latently infected peritoneal cells consistently scored positive for expression of RNA derived from 4 of the 11 candidate latency-associated ORFs examined, including the regions of ORF M2, ORF M11 (encoding v-bcl-2), gene 73 (a homolog of the Kaposi’s sarcoma-associated herpesvirus [human herpesvirus 8] gene encoding latency-associated nuclear antigen), and gene 74 (encoding a G-protein coupled receptor homolog, v-GCR). Latently infected spleen cells consistently scored positive for RNA derived from 3 of the 11 candidate latency-associated ORFs examined, including ORF M2, ORF M3, and ORF M9. To further characterize transcription of these

  8. Sequence analysis of the groESL-cotA region of the Bacillus subtilis genome, containing the restriction/modification system genes.

    PubMed

    Kasahara, Y; Nakai, S; Ogasawara, N; Yata, K; Sadaie, Y

    1997-10-31

    We have determined a 35-kb sequence of the groESL-gutR-cotA (45°-52°) region of the Bacillus subtilis genome. In addition to the groESL, gutRB and cotA genes reported previously, we have newly identified 24 ORFs including gutA and fruC genes, encoding glucitol permease and fructokinase, respectively. The inherent restriction/modification system genes, hsdMR and hsdMM, were mapped between groESL and gutRB, and we have identified two open reading frames (ORFs) encoding 5-methylcytosine forming DNA methyl transferase and an operon probably encoding a restriction enzyme complex. The unusual genome structure of few ORFs and lower GC content around the restriction/modification genes strongly suggests that the region originated from a bacteriophage integrated during evolution. PMID:9455482

  9. Epigenetic Mechanisms of Genomic Imprinting: Common Themes in the Regulation of Imprinted Regions in Mammals, Plants, and Insects

    PubMed Central

    MacDonald, William A.

    2012-01-01

    Genomic imprinting is a form of epigenetic inheritance whereby the regulation of a gene or chromosomal region is dependent on the sex of the transmitting parent. During gametogenesis, imprinted regions of DNA are differentially marked in accordance with the sex of the parent, resulting in parent-specific expression. While mice are the primary research model used to study genomic imprinting, imprinted regions have been described in a broad variety of organisms, including other mammals, plants, and insects. Each of these organisms employs multiple, interrelated, epigenetic mechanisms to maintain parent-specific expression. While imprinted genes and imprint control regions are often species- and locus-specific, the same suites of epigenetic mechanisms are often used to achieve imprinted expression. This review examines some examples of the epigenetic mechanisms responsible for genomic imprinting in mammals, plants, and insects. PMID:22567394

  10. A Spontaneous Deletion of the US1.67/US2 Genomic Region on the Bovine Herpesvirus 1 Strain Cooper

    PubMed Central

    Campos, F. S.; Paim, W. P.; Silva, A. G.; Santos, R. N.; Firpo, R. M.; Scheffer, C. M.; Finoketti, F.; Franco, A. C.

    2016-01-01

    Bovine herpesvirus 1 (BoHV-1) is an alphaherpesvirus with a genome of 135 kb. Some BoHV-1 genes are nonessential and may be deleted from the viral genome. Here, a spontaneous gene deletion was identified in the BoHV-1 strain Cooper. Genes of the US1.67/US2 region were absent, as determined by next-generation sequencing. PMID:26847888

  11. Core and region-enriched networks of behaviorally regulated genes and the singing genome

    PubMed Central

    Whitney, Osceola; Pfenning, Andreas R.; Howard, Jason T.; Blatti, Charles A; Liu, Fang; Ward, James M.; Wang, Rui; Audet, Jean-Nicolas; Kellis, Manolis; Mukherjee, Sayan; Sinha, Saurabh; Hartemink, Alexander J.; West, Anne E.; Jarvis, Erich D.

    2015-01-01

    Songbirds represent an important model organism for elucidating molecular mechanisms that link genes with complex behaviors, in part because they have discrete vocal learning circuits that have parallels with those that mediate human speech. We found that ~10% of the genes in the avian genome were regulated by singing, and we found a striking regional diversity of both basal and singing-induced programs in the four key song nuclei of the zebra finch, a vocal learning songbird. The region-enriched patterns were a result of distinct combinations of region-enriched transcription factors (TFs), their binding motifs, and presinging acetylation of histone 3 at lysine 27 (H3K27ac) enhancer activity in the regulatory regions of the associated genes. RNA interference manipulations validated the role of the calcium-response transcription factor (CaRF) in regulating genes preferentially expressed in specific song nuclei in response to singing. Thus, differential combinatorial binding of a small group of activity-regulated TFs and predefined epigenetic enhancer activity influences the anatomical diversity of behaviorally regulated gene networks. PMID:25504732

  12. Genomic instability and mobile genetic elements in regions surrounding two discoidin I genes of Dictyostelium discoideum.

    PubMed Central

    Poole, S J; Firtel, R A

    1984-01-01

    We have found that the genomic regions surrounding the linked discoidin I genes of various Dictyostelium discoideum strains have undergone rapid changes. Wild-type strain NC-4 has three complete discoidin I genes; its axenic derivative strain Ax-3L has duplicated a region starting approximately 1 kilobase upstream from the two linked genes and extending for at least 8 kilobases past the genes. A separately maintained stock, strain Ax-3K, does not have this duplication but has undergone a different rearrangement approximately 3 kilobases farther upstream. We show that there are repeat elements in these rapidly changing regions. At least two of these elements, Tdd-2 and Tdd-3, have characteristics associated with mobile genetic elements. The Tdd-3 element is found in different locations in related strains and causes a 9- to 10-base-pair duplication of the target site DNA. The Tdd-2 and Tdd-3 elements do not cross-hybridize, but they share a 22-base-pair homology near one end. At two separate sites, the Tdd-3 element has transposed into the Tdd-2 element, directly adjacent to the 22-base-pair homology. The Tdd-3 element may use this 22-base-pair region as a preferential site of insertion. PMID:6325889

  13. Value addition of wild apricot fruits grown in North-West Himalayan regions-a review.

    PubMed

    Sharma, Rakesh; Gupta, Anil; Abrol, G S; Joshi, V K

    2014-11-01

    Wild apricot (Prunus armeniaca L.), commonly known as chulli, is a potential fruit widely distributed in the North-West Himalayan regions. The fruits are a good source of carbohydrates, vitamins, and minerals, besides having an attractive colour and typical flavour. Unlike table-purpose varieties of apricot such as New Castle, the fruits of wild apricot are unsuitable for fresh consumption because of their high acid and low sugar content. However, the fruits are traditionally used for open sun drying and for pulping to prepare products such as jams, chutney, and naturally fermented and distilled liquor. Scientific literature on processing and value addition of wild apricot is nevertheless scanty. Jam prepared with 25% wild apricot + 75% apple showed the maximum score for organoleptic characteristics due to better taste and colour. Osmotic dehydration has been found to be a suitable method for drying wild-type acidic apricots. A good quality sauce using wild apricot pulp and tomato pulp in the ratio of 1:1 has been prepared, while chutney of good acceptability prepared from wild apricot pulp (100%) has also been documented. Preparation of apricot-soy protein enriched products like apricot-soya leather, toffee and fruit bars has been reported; these are reported to meet the protein requirements of adults and children as per the recommendations of ICMR. Besides these processed products, preparation of alcoholic beverages like wine, vermouth and brandy from wild apricot fruits has also been reported by various researchers. Further, after utilization of pulp for preparation of value added products, the stones left over have been successfully utilized for oil extraction, which has medicinal and cosmetic value. The traditional method of oil extraction has been reported to be unhygienic and to result in low oil yield with poor quality, whereas an improved mechanical method of oil extraction has been found to produce good quality oil. The apricot kernel oil and press cake have

  14. Canine parvovirus host range is determined by the specific conformation of an additional region of the capsid.

    PubMed Central

    Parker, J S; Parrish, C R

    1997-01-01

    We analyzed a region of the capsid of canine parvovirus (CPV) which determines the ability of the virus to infect canine cells. This region is distinct from those previously shown to determine the canine host range differences between CPV and feline panleukopenia virus. It lies on a ridge of the threefold spike of the capsid and is comprised of five interacting loops from three capsid protein monomers. We analyzed 12 mutants of CPV which contained amino acid changes in two adjacent loops exposed on the surface of this region. Nine mutants infected and grew in feline cells but were restricted in replication in one or the other of two canine cell lines tested. Three other mutants whose genomes contain mutations which affect one probable interchain bond were nonviable and could not be propagated in either canine or feline cells, although the VP1 and VP2 proteins from those mutants produced empty capsids when expressed from a plasmid vector. Although wild-type and mutant capsids bound to canine and feline cells in similar amounts, infection or viral DNA replication was greatly reduced after inoculation of canine cells with most of the mutants. The viral genomes of two host range-restricted mutants and two nonviable mutants replicated to wild-type levels in both feline and canine cells upon transfection with plasmid clones. The capsids of wild-type CPV and two mutants were similar in susceptibility to heat inactivation, but one of those mutants and one other were more stable against urea denaturation. Most mutations in this structural region altered the ability of monoclonal antibodies to recognize epitopes within a major neutralizing antigenic site, and that site could be subdivided into a number of distinct epitopes. These results argue that a specific structure of this region is required for CPV to retain its canine host range. PMID:9371580

  15. Dynamic chromatin environment of key lytic cycle regulatory regions of the Epstein-Barr virus genome.

    PubMed

    Ramasubramanyan, Sharada; Osborn, Kay; Flower, Kirsty; Sinclair, Alison J

    2012-02-01

    The ability of Epstein-Barr virus (EBV) to establish latency allows it to evade the immune system and to persist for the lifetime of its host; one distinguishing characteristic is the lack of transcription of the majority of viral genes. Entry into the lytic cycle is coordinated by the viral transcription factor, Zta (BZLF1, ZEBRA, and EB1), and downstream effectors, while viral genome replication requires the concerted action of Zta and six other viral proteins at the origins of lytic replication. We explored the chromatin context at key EBV lytic cycle promoters (BZLF1, BRLF1, BMRF1, and BALF5) and the origins of lytic replication during latency and lytic replication. We show that a repressive heterochromatin-like environment (trimethylation of histone H3 at lysine 9 [H3K9me3] and lysine 27 [H3K27me3]), which blocks the interaction of some transcription factors with DNA, encompasses the key early lytic regulatory regions. Epigenetic silencing of the EBV genome is also imposed by DNA methylation during latency. The chromatin environment changes during the lytic cycle with activation of histones H3, H4, and H2AX occurring at both the origins of replication and at the key lytic regulatory elements. We propose that Zta is able to reverse the effects of latency-associated repressive chromatin at EBV early lytic promoters by interacting with Zta response elements within the H3K9me3-associated chromatin and demonstrate that these interactions occur in vivo. Since the interaction of Zta with DNA is not inhibited by DNA methylation, it is clear that Zta uses two routes to overcome epigenetic silencing of its genome. PMID:22090141

  16. Multiple region whole-exome sequencing reveals dramatically evolving intratumor genomic heterogeneity in esophageal squamous cell carcinoma

    PubMed Central

    Cao, W; Wu, W; Yan, M; Tian, F; Ma, C; Zhang, Q; Li, X; Han, P; Liu, Z; Gu, J; Biddle, F G

    2015-01-01

    Cancer is a disease of genome instability and genomic alterations; now, genomic heterogeneity is rapidly emerging as a defining feature of cancer, both within and between tumors. Motivation for our pilot study of tumor heterogeneity in esophageal squamous cell carcinoma (ESCC) is that it is not well studied, but the highest incidences of esophageal cancers are found in China and ESCC is the most common type. We profiled the mutations and changes in copy number that were identified by whole-exome sequencing and array-based comparative genomic hybridization in multiple regions within an ESCC from two patients. The average mutational heterogeneity rate was 90% in all regions of the individual tumors in each patient; most somatic point mutations were nonsynonymous substitutions, small Indels occurred in untranslated regions of genes, and copy number alterations varied among multiple regions of a tumor. Independent Sanger sequencing technology confirmed selected gene mutations with more than 88% concordance. Phylogenetic analysis of the somatic mutation frequency demonstrated that multiple, genomically heterogeneous divergent clones evolve and co-exist within a primary ESCC and metastatic subclones result from the dispersal and adaptation of an initially non-metastatic parental clone. Therefore, a single-region sampling will not reflect the evolving architecture of a genomically heterogeneous landscape of mutations in ESCC tumors and the divergent complexity of this genomic heterogeneity among patients will complicate any promise of a simple genetic or epigenetic diagnostic signature in ESCC. We conclude that any potential for informative biomarker discovery in ESCC and targeted personalized therapies will require a deeper understanding of the functional biology of the ontogeny and phylogeny of the tumor heterogeneity. PMID:26619400

  17. A comparison of the molecular organization of genomic regions associated with resistance to common bacterial blight in two Phaseolus vulgaris genotypes

    PubMed Central

    Perry, Gregory; DiNatale, Claudia; Xie, Weilong; Navabi, Alireza; Reinprecht, Yarmilla; Crosby, William; Yu, Kangfu; Shi, Chun; Pauls, K. Peter

    2013-01-01

    Resistance to common bacterial blight, caused by Xanthomonas axonopodis pv. phaseoli, in Phaseolus vulgaris is conditioned by several loci on different chromosomes. Previous studies with OAC-Rex, a CBB-resistant, white bean variety of Mesoamerican origin, identified two resistance loci associated with the molecular markers Pv-CTT001 and SU91, on chromosomes 4 and 8, respectively. Resistance to CBB is assumed to be derived from an interspecific cross with Phaseolus acutifolius in the pedigree of OAC-Rex. Our current whole genome sequencing effort with OAC-Rex provided the opportunity to compare its genome in the regions associated with CBB resistance with the v1.0 release of the P. vulgaris line G19833, which is a large seeded bean of Andean origin, and (assumed to be) CBB susceptible. In addition, the genomic regions containing SAP6, a marker associated with P. vulgaris-derived CBB-resistance on chromosome 10, were compared. These analyses indicated that gene content was highly conserved between G19833 and OAC-Rex across the regions examined (>80%). However, fifty-nine genes unique to OAC-Rex were identified, with resistance gene homologues making up the largest category (10 genes identified). Two unique genes in OAC-Rex located within the SU91 resistance QTL have homology to P. acutifolius ESTs and may be potential sources of CBB resistance. As the genomic sequence assembly of OAC-Rex is completed, we expect that further comparisons between it and the G19833 genome will lead to a greater understanding of CBB resistance in bean. PMID:24009615

  18. A comparison of the molecular organization of genomic regions associated with resistance to common bacterial blight in two Phaseolus vulgaris genotypes.

    PubMed

    Perry, Gregory; Dinatale, Claudia; Xie, Weilong; Navabi, Alireza; Reinprecht, Yarmilla; Crosby, William; Yu, Kangfu; Shi, Chun; Pauls, K Peter

    2013-01-01

    Resistance to common bacterial blight (CBB), caused by Xanthomonas axonopodis pv. phaseoli, in Phaseolus vulgaris is conditioned by several loci on different chromosomes. Previous studies with OAC-Rex, a CBB-resistant, white bean variety of Mesoamerican origin, identified two resistance loci associated with the molecular markers Pv-CTT001 and SU91, on chromosomes 4 and 8, respectively. Resistance to CBB is assumed to be derived from an interspecific cross with Phaseolus acutifolius in the pedigree of OAC-Rex. Our current whole genome sequencing effort with OAC-Rex provided the opportunity to compare its genome in the regions associated with CBB resistance with the v1.0 release of the P. vulgaris line G19833, which is a large seeded bean of Andean origin, and (assumed to be) CBB susceptible. In addition, the genomic regions containing SAP6, a marker associated with P. vulgaris-derived CBB-resistance on chromosome 10, were compared. These analyses indicated that gene content was highly conserved between G19833 and OAC-Rex across the regions examined (>80%). However, fifty-nine genes unique to OAC-Rex were identified, with resistance gene homologues making up the largest category (10 genes identified). Two unique genes in OAC-Rex located within the SU91 resistance QTL have homology to P. acutifolius ESTs and may be potential sources of CBB resistance. As the genomic sequence assembly of OAC-Rex is completed, we expect that further comparisons between it and the G19833 genome will lead to a greater understanding of CBB resistance in bean. PMID:24009615

  19. Genome analysis: Assigning protein coding regions to three-dimensional structures.

    PubMed Central

    Salamov, A. A.; Suwa, M.; Orengo, C. A.; Swindells, M. B.

    1999-01-01

    We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through the use of fold recognition software, we only use conventional sequence alignment tools, but apply them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23 and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. This is significantly higher than most previously reported methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products. PMID:10211823
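
    The coverage condition quoted in the abstract above (an aligned region of either >100 residues or >50% of the smaller sequence) can be written as a simple check. The following is only a minimal sketch: the function name and parameters are hypothetical, and the abstract does not give the statistical significance threshold the authors applied, so that part is deliberately left out.

```python
def meets_coverage_criterion(aligned_len, len_a, len_b,
                             min_residues=100, min_fraction=0.5):
    """Return True if an alignment satisfies the coverage condition
    described in the abstract: more than `min_residues` aligned residues,
    OR more than `min_fraction` of the shorter of the two sequences.

    All names here are illustrative assumptions, not the authors' code.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("sequence lengths must be positive")
    shorter = min(len_a, len_b)
    return aligned_len > min_residues or aligned_len > min_fraction * shorter
```

    Under these assumptions, a 120-residue alignment between sequences of 400 and 300 residues passes on the first condition, a 40-residue alignment against a 70-residue sequence passes on the second, and a 60-residue alignment between sequences of 200 and 300 residues passes on neither.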

  20. QTL Mapping of Genome Regions Controlling Temephos Resistance in Larvae of the Mosquito Aedes aegypti

    PubMed Central

    Reyes-Solis, Guadalupe del Carmen; Saavedra-Rodriguez, Karla; Suarez, Adriana Flores; Black, William C.

    2014-01-01

    Introduction The mosquito Aedes aegypti is the principal vector of dengue and yellow fever flaviviruses. Temephos is an organophosphate insecticide used globally to suppress Ae. aegypti larval populations but resistance has evolved in many locations. Methodology/Principal Findings Quantitative Trait Loci (QTL) controlling temephos survival in Ae. aegypti larvae were mapped in a pair of F3 advanced intercross lines arising from temephos resistant parents from Solidaridad, México and temephos susceptible parents from Iquitos, Peru. Two sets of 200 F3 larvae were exposed to a discriminating dose of temephos and then dead larvae were collected and preserved for DNA isolation every two hours up to 16 hours. Larvae surviving longer than 16 hours were considered resistant. For QTL mapping, single nucleotide polymorphisms (SNPs) were identified at 23 single copy genes and 26 microsatellite loci of known physical positions in the Ae. aegypti genome. In both reciprocal crosses, Multiple Interval Mapping identified eleven QTL associated with time until death. In the Solidaridad×Iquitos (SLD×Iq) cross twelve were associated with survival but in the reciprocal Iq×SLD cross, only six QTL were survival associated. Polymorphisms at acetylcholine esterase (AchE) loci 1 and 2 were not associated with either resistance phenotype suggesting that target site insensitivity is not an organophosphate resistance mechanism in this region of México. Conclusions/Significance Temephos resistance is under the control of many metabolic genes of small effect and dispersed throughout the Ae. aegypti genome. PMID:25330200

  1. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea

    PubMed Central

    2013-01-01

    Background The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706

  2. Interpreting Mammalian Evolution using Fugu Genome Comparisons

    SciTech Connect

    Stubbs, L; Ovcharenko, I; Loots, G G

    2004-04-02

    Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.

  3. Genomic analysis of the Xp21 region around the RP3 locus

    SciTech Connect

    Navia, B.A.; Eisenman, R.E.; Bruns, G.A.

    1994-09-01

    One form of X-linked retinitis pigmentosa has been localized by deletion and linkage analysis to proximal Xp21 near the OTC locus and the proximal breakpoint of the BB deletion. A deletion junction clone, previously isolated from this region, was used to initiate a series of bidirectional walks in a human genomic library in EMBL3A. A phage contig of nearly 70 kb has been cloned and systematically searched for conserved sequences and CA repeats. A number of unique sequences around the breakpoint have been sequenced and analyzed with exon identification programs. An HTF island was identified approximately 35 kb distal to the centromeric breakpoint of the BB deletion and several CA repeat-containing areas were found in the contig. Two YACs that contain the breakpoint and surrounding region were isolated. A phage sublibrary was constructed from one of the YACs and is being used to extend the contig map further centromeric. To isolate transcripts from the region, two rounds of cDNA selection from a combined short insert human retinal and fetal brain library were performed against the pooled phage clones from the contig and against the pooled phage from the YAC derived sublibrary. Among the selected cDNAs, several unique sequences have been identified and are currently being mapped and sequenced.

  4. Complete coding region of the mitochondrial genome of Monochamus alternatus hope (Coleoptera: Cerambycidae).

    PubMed

    Wang, Cheng-Ye; Feng, Ying; Chen, Xiao-Ming

    2013-07-01

    The Japanese pine sawyer, Monochamus alternatus Hope, 1842, an important forest pest, mainly occurs in the Far East. It is the main vector of the pine wood nematode Bursaphelenchus xylophilus, which causes pine wilt disease. We determined the complete mitochondrial genome coding region of M. alternatus using long PCR and conserved primer walking. Our results show that the entire mitogenome coding region is 14,649 bp long, with 78.22% A+T content [deposited in GenBank (JX987292)]. Positions and arrangement of the 37 genes encoded by the coding region are identical to those of two other longhorn beetles (Psacothea hilaris and Anoplophora glabripennis) for which the complete gene content and arrangement are known. All protein-coding genes start with ATN, a typical initiation codon in insects. All tRNAs show the standard clover-leaf structure, except tRNA(Ser) (AGN), which lacks the dihydrouridine (DHU) arm. The most unusual feature found is the use of TCT as the tRNA(Ser) (AGN) anticodon instead of GCT, which is used in most other arthropods. This provides further insights into the diversity and evolution of the Cerambycidae family of long-horned beetles. PMID:23829217

  5. Genomic regions associated with the nitrogen limitation response revealed in a global wheat core collection.

    PubMed

    Bordes, Jacques; Ravel, C; Jaubertie, J P; Duperrier, B; Gardet, O; Heumez, E; Pissavy, A L; Charmet, G; Le Gouis, J; Balfourier, F

    2013-03-01

    Modern wheat (Triticum aestivum L.) varieties in Western Europe have mainly been bred and selected under conditions where high levels of nitrogen-rich fertilizer are applied. However, high input crop management has greatly increased the risk of nitrates leaching into groundwater with negative impacts on the environment. To investigate wheat nitrogen tolerance characteristics that could be adapted to low input crop management, we supplied 196 accessions of a wheat core collection of old and modern cultivars with high or moderate amounts of nitrogen fertilizer in an experimental network consisting of three sites and 2 years. The main breeding traits were assessed including grain yield and grain protein content. The response to nitrogen level was estimated for grain yield and grain number per m(2) using both the difference and the ratio between performance at the two input levels and the slope of joint regression. A large variability was observed for all the traits studied and the response to nitrogen level. Whole genome association mapping was carried out using 899 molecular markers taking into account the five ancestral group structure of the collection. We identified 54 main regions involving almost all chromosomes that influence yield and its components, plant height, heading date and grain protein concentration. Twenty-three regions, including several genes, spread over 16 chromosomes were involved in the response to nitrogen level. These chromosomal regions may be good candidates to be used in breeding programs to improve the performance of wheat varieties at moderate nitrogen input levels. PMID:23192671
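
    Two of the response measures named in the abstract above, the difference and the ratio between performance at the two nitrogen input levels, are simple to compute per accession. The sketch below is illustrative only: the function name is hypothetical, and the direction of the difference (high minus moderate) and of the ratio (moderate over high) are assumptions, since the abstract does not state them. The slope of joint regression is not covered here.

```python
def nitrogen_response(high_input, moderate_input):
    """Return (difference, ratio) for one trait value measured at the
    high and the moderate nitrogen input level.

    Assumed conventions (not stated in the abstract):
      difference = high_input - moderate_input
      ratio      = moderate_input / high_input
    """
    if high_input == 0:
        raise ValueError("high-input value must be non-zero to form a ratio")
    difference = high_input - moderate_input
    ratio = moderate_input / high_input
    return difference, ratio


# Hypothetical grain-yield values for a single accession (arbitrary units).
diff, ratio = nitrogen_response(8.0, 6.0)
```

    With these made-up inputs, the accession loses 2.0 units of yield at moderate input and retains 0.75 of its high-input performance.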

  6. siRNA Targeting the 2Apro Genomic Region Prevents Enterovirus 71 Replication In Vitro.

    PubMed

    Liu, Haibing; Qin, Yanyan; Kong, Zhenzhen; Shao, Qixiang; Su, Zhaoliang; Wang, Shengjun; Chen, Jianguo

    2016-01-01

    Enterovirus 71 (EV71) is the most important etiological agent of hand, foot, and mouth disease (HFMD) in young children, which is associated with severe neurological complications and has caused significant mortalities in recent HFMD outbreaks in Asia. However, there is no effective antiviral therapy against EV71. In this study, RNA interference (RNAi) was used as an antiviral strategy to inhibit EV71 replication. Three small interfering RNAs (siRNAs) targeting the 2Apro region of the EV71 genome were designed and synthesized. All the siRNAs were transfected individually into rhabdomyosarcoma (RD) cells, which were then infected with strain EV71-2006-52-9. The cytopathic effects (CPEs) in the infected RD cells, cell viability, viral titer, and viral RNA and protein expression were examined to evaluate the specific viral inhibition by the siRNAs. The results of cytopathogenicity and MTT tests indicated that the RD cells transfected with the three siRNAs showed slight CPEs and significantly high viability. The 50% tissue culture infective dose (TCID50) values demonstrated that the viral titer of the groups treated with three siRNAs were lower than those of the control groups. qRT-PCR and western blotting revealed that the levels of viral RNA and protein in the RD cells treated with the three siRNAs were lower than those in the controls. When RD cells transfected with siRNAs were also infected with strain EV71-2008-43-16, the expression of the VP1 protein was significantly inhibited. The levels of interferon α (IFN-α) and IFN-β did not differ significantly in any group. These results suggest that siRNAs targeting the 2Apro region of the EV71 genome exerted antiviral effects in vitro. PMID:26886455

  7. siRNA Targeting the 2Apro Genomic Region Prevents Enterovirus 71 Replication In Vitro

    PubMed Central

    Kong, Zhenzhen; Shao, Qixiang; Su, Zhaoliang; Wang, Shengjun; Chen, Jianguo

    2016-01-01

    Enterovirus 71 (EV71) is the most important etiological agent of hand, foot, and mouth disease (HFMD) in young children, which is associated with severe neurological complications and has caused significant mortalities in recent HFMD outbreaks in Asia. However, there is no effective antiviral therapy against EV71. In this study, RNA interference (RNAi) was used as an antiviral strategy to inhibit EV71 replication. Three small interfering RNAs (siRNAs) targeting the 2Apro region of the EV71 genome were designed and synthesized. All the siRNAs were transfected individually into rhabdomyosarcoma (RD) cells, which were then infected with strain EV71-2006-52-9. The cytopathic effects (CPEs) in the infected RD cells, cell viability, viral titer, and viral RNA and protein expression were examined to evaluate the specific viral inhibition by the siRNAs. The results of cytopathogenicity and MTT tests indicated that the RD cells transfected with the three siRNAs showed slight CPEs and significantly high viability. The 50% tissue culture infective dose (TCID50) values demonstrated that the viral titer of the groups treated with three siRNAs were lower than those of the control groups. qRT–PCR and western blotting revealed that the levels of viral RNA and protein in the RD cells treated with the three siRNAs were lower than those in the controls. When RD cells transfected with siRNAs were also infected with strain EV71-2008-43-16, the expression of the VP1 protein was significantly inhibited. The levels of interferon α (IFN-α) and IFN-β did not differ significantly in any group. These results suggest that siRNAs targeting the 2Apro region of the EV71 genome exerted antiviral effects in vitro. PMID:26886455

  8. Genome-Based Identification of Active Prophage Regions by Next Generation Sequencing in Bacillus licheniformis DSM13

    PubMed Central

    Hertel, Robert; Rodríguez, David Pintor; Hollensteiner, Jacqueline; Dietrich, Sascha; Leimbach, Andreas; Hoppert, Michael; Liesegang, Heiko; Volland, Sonja

    2015-01-01

    Prophages are viruses that have integrated their genomes into the genome of a bacterial host. The status of the prophage genome can vary from fully intact with the potential to form infective particles to a remnant state where only a few phage genes persist. Prophages have an impact on the properties of their host and are therefore of great interest for genomic research and strain design. Here we present a genome- and next generation sequencing (NGS)-based approach for identification and activity evaluation of prophage regions. Seven prophage or prophage-like regions were identified in the genome of Bacillus licheniformis DSM13. Six of these regions show similarity to members of the Siphoviridae phage family. The remaining region encodes the B. licheniformis orthologue of the PBSX prophage from Bacillus subtilis. Analysis of isolated phage particles (induced by mitomycin C) from the wild-type strain and prophage deletion mutant strains revealed activity of the prophage regions BLi_Pp2 (PBSX-like), BLi_Pp3 and BLi_Pp6. In contrast to BLi_Pp2 and BLi_Pp3, neither phage DNA nor phage particles of BLi_Pp6 could be visualized. However, the ability of prophage BLi_Pp6 to generate particles could be confirmed by sequencing of particle-protected DNA mapping to prophage locus BLi_Pp6. The introduced NGS-based approach allows the investigation of prophage regions and their ability to form particles. Our results show that this approach increases the sensitivity of prophage activity analysis and can complement more conventional approaches such as transmission electron microscopy (TEM). PMID:25811873

  9. Regulation of mitochondrial genome replication by hypoxia: The role of DNA oxidation in D-loop region.

    PubMed

    Pastukh, Viktor M; Gorodnya, Olena M; Gillespie, Mark N; Ruchko, Mykhaylo V

    2016-07-01

    Mitochondria of mammalian cells contain multiple copies of mitochondrial (mt) DNA. Although mtDNA copy number can fluctuate dramatically depending on physiological and pathophysiologic conditions, the mechanisms regulating mitochondrial genome replication remain obscure. Hypoxia, like many other physiologic stimuli that promote growth, cell proliferation and mitochondrial biogenesis, uses reactive oxygen species as signaling molecules. Emerging evidence suggests that hypoxia-induced transcription of nuclear genes requires controlled DNA damage and repair in specific sequences in the promoter regions. Whether similar mechanisms are operative in mitochondria is unknown. Here we test the hypothesis that controlled oxidative DNA damage and repair in the D-loop region of the mitochondrial genome are required for mitochondrial DNA replication and transcription in hypoxia. We found that hypoxia had little impact on expression of mitochondrial proteins in pulmonary artery endothelial cells, but elevated mtDNA content. The increase in mtDNA copy number was accompanied by oxidative modifications in the D-loop region of the mitochondrial genome. To investigate the role of this sequence-specific oxidation of mitochondrial genome in mtDNA replication, we overexpressed mitochondria-targeted 8-oxoguanine glycosylase Ogg1 in rat pulmonary artery endothelial cells, enhancing the mtDNA repair capacity of transfected cells. Overexpression of Ogg1 resulted in suppression of hypoxia-induced mtDNA oxidation in the D-loop region and attenuation of hypoxia-induced mtDNA replication. Ogg1 overexpression also reduced binding of mitochondrial transcription factor A (TFAM) to both regulatory and coding regions of the mitochondrial genome without altering total abundance of TFAM in either control or hypoxic cells. These observations suggest that oxidative DNA modifications in the D-loop region during hypoxia are important for increased TFAM binding and ensuing replication of the mitochondrial

  10. Differential DNA Methylation Regions in Cytokine and Transcription Factor Genomic Loci Associate with Childhood Physical Aggression

    PubMed Central

    Provençal, Nadine; Suderman, Matthew J.; Caramaschi, Doretta; Wang, Dongsha; Hallett, Michael; Vitaro, Frank

    2013-01-01

    Background Animal and human studies suggest that inflammation is associated with behavioral disorders including aggression. We have recently shown that physical aggression of boys during childhood is strongly associated with reduced plasma levels of cytokines IL-1α, IL-4, IL-6, IL-8 and IL-10, later in early adulthood. This study tests the hypothesis that there is an association between differential DNA methylation regions in cytokine genes in T cells and monocytes DNA in adult subjects and a trajectory of physical aggression from childhood to adolescence. Methodology/Principal Findings We compared the methylation profiles of the entire genomic loci encompassing the IL-1α, IL-6, IL-4, IL-10 and IL-8 and three of their regulatory transcription factors (TF) NFkB1, NFAT5 and STAT6 genes in adult males on a chronic physical aggression trajectory (CPA) and males with the same background who followed a normal physical aggression trajectory (control group) from childhood to adolescence. We used the method of methylated DNA immunoprecipitation with comprehensive cytokine gene loci and TF loci microarray hybridization, statistical analysis and false discovery rate correction. We found differentially methylated regions to associate with CPA in both the cytokine loci as well as in their transcription factors loci analyzed. Some of these differentially methylated regions were located in known regulatory regions whereas others, to our knowledge, were previously unknown as regulatory areas. However, using the ENCODE database, we were able to identify key regulatory elements in many of these regions that indicate that they might be involved in the regulation of cytokine expression. Conclusions We provide here the first evidence for an association between differential DNA methylation in cytokines and their regulators in T cells and monocytes and male physical aggression. PMID:23977113

  11. Structure and organization of Marchantia polymorpha chloroplast genome. IV. Inverted repeat and small single copy regions.

    PubMed

    Kohchi, T; Shirai, H; Fukuzawa, H; Sano, T; Komano, T; Umesono, K; Inokuchi, H; Ozeki, H; Ohyama, K

    1988-09-20

    We characterized the genes in the regions of large inverted repeats (IRA and IRB, 10,058 base-pairs each) and a small single copy (SSC 19,813 bp) of chloroplast DNA from Marchantia polymorpha. The inverted repeat (IR) regions contain genes for four ribosomal RNAs (16 S, 23 S, 4.5 S and 5 S rRNAs) and five transfer RNAs (valine tRNA(GAC), isoleucine tRNA(GAU), alanine tRNA(UGC), arginine tRNA(ACG) and asparagine tRNA(GUU)). The gene organization of the IR regions in the liverwort chloroplast genome is conserved, although the IR regions are smaller (10,058 base-pairs) than any reported in higher plant chloroplasts. The small single-copy region (19,813 base-pairs) encoded genes for 17 open reading frames, a leucine tRNA(UAG) and a proline tRNA(GGG)-like sequence. We identified 12 open reading frames by homology of their coding sequences to a 4Fe-4S-type ferredoxin protein, a bacterial nitrogenase reductase component (Fe-protein), five human mitochondrial components of NADH dehydrogenase (ND1, ND4, ND4L, ND5 and ND6), two Escherichia coli ribosomal proteins (S15 and L21), two putative proteins encoded in the kinetoplast maxicircle DNA of Leishmania tarentolae (LtORF 3 and LtORF 4), and a bacterial permease inner membrane component (encoded by malF in E. coli or hisQ in Salmonella typhimurium). PMID:3199437

  12. Two distinct genomic regions, harbouring the period and fruitless genes, affect male courtship song in Drosophila montana

    PubMed Central

    Lagisz, M; Wen, S-Y; Routtu, J; Klappert, K; Mazzi, D; Morales-Hojas, R; Schäfer, M A; Vieira, J; Hoikkala, A; Ritchie, M G; Butlin, R K

    2012-01-01

    Acoustic signals often have a significant role in pair formation and in species recognition. Determining the genetic basis of signal divergence will help to understand signal evolution by sexual selection and its role in the speciation process. An earlier study investigated quantitative trait loci for male courtship song carrier frequency (FRE) in Drosophila montana using microsatellite markers. We refined this study by adding to the linkage map markers for 10 candidate genes known to affect song production in Drosophila melanogaster. We also extended the analyses to additional song characters (pulse train length (PTL), pulse number (PN), interpulse interval, pulse length (PL) and cycle number (CN)). Our results indicate that loci in two different regions of the genome control distinct features of the courtship song. Pulse train traits (PTL and PN) mapped to the X chromosome, showing significant linkage with the period gene. In contrast, characters related to song pulse properties (PL, CN and carrier FRE) mapped to the region of chromosome 2 near the candidate gene fruitless, identifying these genes as suitable loci for further investigations. In previous studies, the pulse train traits have been found to vary substantially between Drosophila species, and so are potential species recognition signals, while the pulse traits may be more important in intra-specific mate choice. PMID:22234247

  13. Recombination within the apospory specific genomic region leads to the uncoupling of apomixis components in Cenchrus ciliaris.

    PubMed

    Conner, Joann A; Gunawan, Gunawati; Ozias-Akins, Peggy

    2013-07-01

    Apomixis enables the clonal propagation of maternal genotypes through seed. If apomixis could be harnessed via genetic engineering or introgression, it would have a major economic impact for agricultural crops. In the grass species Pennisetum squamulatum and Cenchrus ciliaris (syn. P. ciliare), apomixis is controlled by a single dominant "locus", the apospory-specific genomic region (ASGR). For P. squamulatum, 18 published sequenced characterized amplified region (SCAR) markers have been identified which always co-segregate with apospory. Six of these markers are conserved SCARs in the closely related species, C. ciliaris and co-segregate with the trait. A screen of progeny from a cross of sexual × apomictic C. ciliaris genotypes identified a plant, A8, retaining two of the six ASGR-linked SCAR markers. Additional and newly identified ASGR-linked markers were generated to help identify the extent of recombination within the ASGR. Based on analysis of missing markers, the A8 recombinant plant has lost a significant portion of the ASGR but continues to form aposporous embryo sacs. Seedlings produced from aposporous embryo sacs are 6× in ploidy level and hence the A8 recombinant does not express parthenogenesis. The recombinant A8 plant represents a step forward in reducing the complexity of the ASGR locus to determine the factor(s) required for aposporous embryo sac formation and documents the separation of expression of the two components of apomixis in C. ciliaris. PMID:23553451

  14. Characterisation of a genomic clone covering the structural mouse MyoD1 gene and its promoter region.

    PubMed Central

    Zingg, J M; Alva, G P; Jost, J P

    1991-01-01

    We have isolated the mouse MyoD1 gene flanked by its promoter region by screening a genomic library with synthetic oligonucleotides. The structural gene is interrupted by two G + C rich introns. Transfection of the cloned gene inserted into an expression vector converts fibroblasts to myoblasts. Sequence analysis of about 650 bp of the 5' upstream region revealed the presence of several potential regulatory elements such as a TATA-box, an AP2-box, two SP1-boxes and a CAAT-box. In addition, there are three half palindromic estrogen response elements, a potential cAMP response element and various muscle specific elements such as a muscle-specific CAAT-box (MCAT) and four potential binding sites for MyoD1. Using S1 protection analysis the major start site of transcription in muscle and myoblast cells was mapped 3 bp upstream of the published cDNA 5' end. Promoter activity of the 650 bp upstream fragment was tested by in vitro transcription and by transfection analysis of myoblasts and fibroblasts. In all promoter test systems used, MyoD1 promoter activity was detected in myoblasts as well as in fibroblasts. Furthermore, DNA methylation was found to turn off MyoD1 promoter activity both in myoblasts and in fibroblasts. PMID:1754380

  15. Additional Routes to Staphylococcus aureus Daptomycin Resistance as Revealed by Comparative Genome Sequencing, Transcriptional Profiling, and Phenotypic Studies

    PubMed Central

    Song, Yang; Rubio, Aileen; Jayaswal, Radheshyam K.; Silverman, Jared A.; Wilkinson, Brian J.

    2013-01-01

    Daptomycin is an extensively used anti-staphylococcal agent due to the rise in methicillin-resistant Staphylococcus aureus, but the mechanism(s) of resistance is poorly understood. Comparative genome sequencing, transcriptomics, ultrastructure, and cell envelope studies were carried out on two relatively higher level (4 and 8 µg/ml) laboratory-derived daptomycin-resistant strains (strains CB1541 and CB1540 respectively) compared to their parent strain (CB1118; MW2). Several mutations were found in the strains. Both strains had the same mutations in the two-component system genes walK and agrA. In strain CB1540 mutations were also detected in the ribose phosphate pyrophosphokinase (prs) and polyribonucleotide nucleotidyltransferase genes (pnpA), a hypothetical protein gene, and in an intergenic region. In strain CB1541 there were mutations in clpP, an ATP-dependent protease, and two different hypothetical protein genes. The strain CB1540 transcriptome was characterized by upregulation of cap (capsule) operon genes, genes involved in the accumulation of the compatible solute glycine betaine, ure genes of the urease operon, and mscL encoding a mechanosensitive channel. Downregulated genes included smpB, femAB and femH involved in the formation of the pentaglycine interpeptide bridge, genes involved in protein synthesis and fermentation, and spa encoding protein A. Genes altered in their expression common to both transcriptomes included some involved in glycine betaine accumulation, mscL, ure genes, femH, spa and smpB. However, the CB1541 transcriptome was further characterized by upregulation of various heat shock chaperone and protease genes, consistent with a mutation in clpP, and lytM and sceD. Both strains showed slow growth, and strongly decreased autolytic activity that appeared to be mainly due to decreased autolysin production. In contrast to previous common findings, we did not find any mutations in phospholipid biosynthesis genes, and it appears there

  16. Comparative Genomics and Phylogenetic Analyses of Newly Cloned Genomic Regions From the Citrus Huanglongbing (HLB)-associated Bacterium Candidatus Liberibacter

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Huanglongbing (HLB), or citrus greening disease, caused by Candidatus Liberibacter species, is a serious threat to citrus production worldwide. The pathogen is a gram negative, unculturable, phloem-limited bacterium, with little known genomic information. Here, we report cloning and characterizatio...

  17. Prior genetic architecture impacting genomic regions under selection: An example using genomic selection in two poultry breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The objective of this study is to investigate if selection on similar traits in different populations progress from selection on similar genes. With the aid of high-density genome wide single-nucleotide polymorphism (SNP) genotyping, it is possible to directly assess changes in allelic f...

  18. Additive and epistatic genome-wide association for growth and ultrasound scan measures of carcass-related traits in Brahman cattle.

    PubMed

    Ali, A A; Khatkar, M S; Kadarmideen, H N; Thomson, P C

    2015-04-01

    Genome-wide association studies are routinely used to identify genomic regions associated with traits of interest. However, this ignores an important class of genomic associations, that of epistatic interactions. A genome-wide interaction analysis between single nucleotide polymorphisms (SNPs) using highly dense markers can detect epistatic interactions, but is a difficult task due to multiple testing and computational demand. Nevertheless, it is important for revealing complex trait heredity. This study considers analytical methods that detect statistical interactions between pairs of loci. We investigated a three-stage modelling procedure: (i) a model without the SNP to estimate the variance components; (ii) a model with the SNP using variance component estimates from (i), thus avoiding iteration; and (iii) using the significant SNPs from (ii) for genome-wide epistasis analysis. We fitted these three-stage models to field data for growth and ultrasound measures for subcutaneous fat thickness in Brahman cattle. The study demonstrated the usefulness of modelling epistasis in the analysis of complex traits as it revealed extra sources of genetic variation and identified potential candidate genes affecting the concentration of insulin-like growth factor-1 and ultrasound scan measure of fat depth traits. Information about epistasis can add to our understanding of the complex genetic networks that form the fundamental basis of biological systems. PMID:25754883

  19. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: The Adh region

    SciTech Connect

    Ashburner, M.; Misra, S.; Roote, J.; Lewis, S.E.; Blazej, R.; Davis, T.; Doyle, C.; Galle, R.; George, R.; Harris, N.; Hartzell, G.; Harvey, D.; Hong, L.; Houston, K.; Hoskins, R.; Johnson, G.; Martin, C.; Moshrefi, A.; Palazzolo, M.; Reese, M.G.; Spradling, A.; Tsang, G.; Wan, K.; Whitelaw, K.; Kimmel, B.; Celniker, S.; Rubin, G.M.

    1999-03-24

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized

  20. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: BAC-based physical maps provide for sequencing across an entire genome or selected sub-genome regions of biological interest. Using the minimum tiling path as a guide, it is possible to select specific BAC clones from prioritized genome sections such as a genetically defined QTL interv...

  1. Mapping Association between Long-Range Cis-Regulatory Regions and Their Target Genes Using Comparative Genomics

    NASA Astrophysics Data System (ADS)

    Mongin, Emmanuel; Dewar, Ken; Blanchette, Mathieu

    In chordates, long-range cis-regulatory regions are involved in the control of transcription initiation (either as repressors or enhancers). They can be located as far as 1 Mb from the transcription start site of the target gene and can regulate more than one gene. Therefore, proper characterization of functional interactions between long-range cis-regulatory regions and their target genes remains problematic. We present a novel method to predict such interactions based on the analysis of rearrangements between the human and 16 other vertebrate genomes. Our method is based on the assumption that genome rearrangements that would disrupt the functional interaction between a cis-regulatory region and its target gene are likely to be deleterious. Therefore, conservation of synteny through evolution would be an indication of a functional interaction. We use our algorithm to classify a set of 1,406,084 putative associations from the human genome. This genome-wide map of interactions has many potential applications, including the selection of candidate regions prior to in vivo experimental characterization, a better characterization of regulatory regions involved in position effect diseases, and an improved understanding of the mechanisms and importance of long-range regulation.

  2. In silico comparison of genomic regions containing genes coding for enzymes and transcription factors for the phenylpropanoid pathway in Phaseolus vulgaris L. and Glycine max L. Merr

    PubMed Central

    Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter

    2013-01-01

    Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
    Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter

  3. Comparative Genomic Sequence Analysis of the Human Chromosome 21 Down Syndrome Critical Region

    PubMed Central

    Toyoda, Atsushi; Noguchi, Hideki; Taylor, Todd D.; Ito, Takehiko; Pletcher, Mathew T.; Sakaki, Yoshiyuki; Reeves, Roger H.; Hattori, Masahira

    2002-01-01

    Comprehensive knowledge of the gene content of human chromosome 21 (HSA21) is essential for understanding the etiology of Down syndrome (DS). Here we report the largest comparison of finished mouse and human sequence to date for a 1.35-Mb region of mouse chromosome 16 (MMU16) that corresponds to human chromosome 21q22.2. This includes a portion of the commonly described “DS critical region,” thought to contain a gene or genes whose dosage imbalance contributes to a number of phenotypes associated with DS. We used comparative sequence analysis to construct a DNA feature map of this region that includes all known genes, plus 144 conserved sequences ≥100 bp long that show ≥80% identity between mouse and human but do not match known exons. Twenty of these have matches to expressed sequence tag and cDNA databases, indicating that they may be transcribed sequences from chromosome 21. Eight putative CpG islands are found at conserved positions. Models for two human genes, DSCR4 and DSCR8, are not supported by conserved sequence, and close examination indicates that low-level transcripts from these loci are unlikely to encode proteins. Gene prediction programs give different results when used to analyze the well-conserved regions between mouse and human sequences. Our findings have implications for evolution and for modeling the genetic basis of DS in mice. [Sequence data described in this paper have been submitted to the DDBJ/GenBank under accession nos. AP003148 through AP003158, and AB066227. Supplemental material is available at http://www.genome.org.] PMID:12213769
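
    The conservation criterion quoted above (sequences at least 100 bp long with at least 80% identity) can be illustrated with a small sliding-window scan. This is only a sketch under stated assumptions: the abstract does not say how the authors computed conserved sequences, the function name `conserved_segments` is hypothetical, and the input is assumed to be a pairwise alignment supplied as two equal-length strings with "-" for gaps.

```python
from itertools import accumulate


def conserved_segments(aln_a, aln_b, min_len=100, min_identity=0.80):
    """Return merged (start, end) alignment-column ranges covered by windows
    of min_len columns whose identity is at least min_identity.

    Hypothetical helper for illustration; gap columns count as mismatches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")

    # 1 where both sequences carry the same base, 0 otherwise.
    matches = [
        int(a == b and a != "-")
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]
    # Prefix sums give the number of matches in any window in O(1).
    prefix = [0, *accumulate(matches)]

    segments = []
    for start in range(len(matches) - min_len + 1):
        end = start + min_len
        if prefix[end] - prefix[start] >= min_identity * min_len:
            # Merge with the previous segment when windows overlap or touch.
            if segments and start <= segments[-1][1]:
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments
```

    With the default parameters the function reports column ranges that meet the thresholds stated in the abstract; excluding ranges that match known exons, as the authors did, would need an annotation step that is not sketched here.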

  4. Robust physical methods that enrich genomic regions identical by descent for linkage studies: confirmation of a locus for osteogenesis imperfecta

    PubMed Central

    Brooks, Peter; Marcaillou, Charles; Vanpeene, Maud; Saraiva, Jean-Paul; Stockholm, Daniel; Francke, Stephan; Favis, Reyna; Cohen, Nadine; Rousseau, Francis; Tores, Frédéric; Lindenbaum, Pierre; Hager, Jörg; Philippi, Anne

    2009-01-01

    Background: The monogenic disease osteogenesis imperfecta (OI) is due to single mutations in either of the collagen genes Col1A1 or Col1A2, but within the same family a given mutation is accompanied by a wide range of disease severity. Although this phenotypic variability implies the existence of modifier gene variants, genome wide scanning of DNA from OI patients has not been reported. Promising genome wide marker-independent physical methods for identifying disease-related loci have lacked robustness for widespread applicability. Therefore we sought to improve these methods and demonstrate their performance to identify known and novel loci relevant to OI. Results: We have improved methods for enriching regions of identity-by-descent (IBD) shared between related, afflicted individuals. The extent of enrichment exceeds 10- to 50-fold for some loci. The efficiency of the new process is shown by confirmation of the identification of the Col1A2 locus in osteogenesis imperfecta patients from Amish families. Moreover the analysis revealed additional candidate linkage loci that may harbour modifier genes for OI; a locus on chromosome 1q includes COX-2, a gene implicated in osteogenesis. Conclusion: Technology for physical enrichment of IBD loci is now robust and applicable for finding genes for monogenic diseases and genes for complex diseases. The data support the further investigation of genetic loci other than collagen gene loci to identify genes affecting the clinical expression of osteogenesis imperfecta. The discrimination of IBD mapping will be enhanced when the IBD enrichment procedure is coupled with deep resequencing. PMID:19331686

  5. [Mutation frequencies in HIV-1 subtype-A genome in regions containing efficient RNAi targets].

    PubMed

    Kravatsky, Y V; Chechetkin, V R; Fedoseeva, D M; Gorbacheva, M A; Kretova, O V; Tchurikov, N A

    2016-01-01

    The development of gene-therapy technology using RNAi for AIDS/HIV-1 treatment is a promising alternative to traditional anti-retroviral therapy. RNAi targets could be selected in HIV-1 transcripts and in CCR5 mRNA. Previously, we experimentally selected a number of efficient siRNAs that target HIV-1 RNAs. The viral genome mutates frequently, and RNAi strength is very sensitive, even to a single mismatch. That is why it is important to study nucleotide sequences of targets in clinical isolates of HIV-1. In the present study, we analyzed mutations in six regions of about 300 bp containing RNAi targets from HIV-1 subtype A isolates in Russia. Estimates of the mean frequencies of mutations in the targets were obtained and the frequencies of mutations in the different codon positions were compared. The frequencies of mutations in the vicinity of the targets and directly within the targets were also compared and have been shown to be approximately the same. The frequencies of indels in the chosen regions have been assessed. These proved to be two to three orders of magnitude lower than the frequencies of mutations. PMID:27414786
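
    The comparison of mutation frequencies across codon positions described above can be sketched as a simple per-position tally. The code below is an illustration only, not the authors' pipeline: the function name `mutation_freq_by_codon_position` and the `frame_offset` parameter are hypothetical, and the isolates are assumed to be already aligned to the reference without insertions relative to it.

```python
def mutation_freq_by_codon_position(reference, isolates, frame_offset=0):
    """Return mismatch frequencies for codon positions 1, 2 and 3.

    reference    -- reference nucleotide sequence (assumed in frame after
                    skipping frame_offset bases)
    isolates     -- sequences aligned column-by-column to the reference
    frame_offset -- index of the first base of a codon (hypothetical knob)
    """
    mismatches = [0, 0, 0]
    compared = [0, 0, 0]
    for seq in isolates:
        for i, (ref_base, base) in enumerate(zip(reference, seq)):
            # Skip gaps and ambiguous calls rather than counting them.
            if ref_base in "-N" or base in "-N":
                continue
            pos = (i - frame_offset) % 3
            compared[pos] += 1
            mismatches[pos] += ref_base != base
    return [m / c if c else 0.0 for m, c in zip(mismatches, compared)]
```

    Indels are deliberately left out of this tally, in line with the abstract, which assesses their frequencies separately.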

  6. Association of cohesin and Nipped-B with transcriptionally active regions of the Drosophila melanogaster genome

    PubMed Central

    Misulovin, Ziva; Schwartz, Yuri B.; Li, Xiao-Yong; Kahn, Tatyana G.; Gause, Maria; MacArthur, Stewart; Fay, Justin C.; Eisen, Michael B.; Pirrotta, Vincenzo; Biggin, Mark D.

    2008-01-01

    The cohesin complex is a chromosomal component required for sister chromatid cohesion that is conserved from yeast to man. The similarly conserved Nipped-B protein is needed for cohesin to bind to chromosomes. In higher organisms, Nipped-B and cohesin regulate gene expression and development by unknown mechanisms. Using chromatin immunoprecipitation, we find that Nipped-B and cohesin bind to the same sites throughout the entire non-repetitive Drosophila genome. They preferentially bind transcribed regions and overlap with RNA polymerase II. This contrasts sharply with yeast, where cohesin binds almost exclusively between genes. Differences in cohesin and Nipped-B binding between Drosophila cell lines often correlate with differences in gene expression. For example, cohesin and Nipped-B bind the Abd-B homeobox gene in cells in which it is transcribed, but not in cells in which it is silenced. They bind to the Abd-B transcription unit and downstream regulatory region and thus could regulate both transcriptional elongation and activation. We posit that transcription facilitates cohesin binding, perhaps by unfolding chromatin, and that Nipped-B then regulates gene expression by controlling cohesin dynamics. These mechanisms are likely involved in the etiology of Cornelia de Lange syndrome, in which mutation of one copy of the NIPBL gene encoding the human Nipped-B ortholog causes diverse structural and mental birth defects. PMID:17965872

  7. Genome-Wide Transcriptome Profiling of Region-Specific Vulnerability to Oxidative Stress in the Hippocampus

    PubMed Central

    Wang, Xinkun; Pal, Ranu; Chen, Xue-wen; Kumar, Keshava N.; Kim, Ok-Jin; Michaelis, Elias K.

    2007-01-01

    Neurons in the hippocampal CA1 region are particularly sensitive to oxidative stress (OS), whereas those in CA3 are resistant. To uncover mechanisms for selective CA1 vulnerability to OS, we treated organotypic hippocampal slices with duroquinone and compared transcriptional profiles of CA1 vs. CA3 cells at various intervals. Gene Ontology and biological pathway analyses of differentially expressed genes showed that at all time points, CA1 had higher transcriptional activity of stress/inflammatory response, transition metal transport, ferroxidase, and pre-synaptic signaling activity, while CA3 had higher GABA-signaling, postsynaptic, and calcium and potassium channel activity. Real-time PCR and immunoblots confirmed the transcriptome data and the induction of OS by duroquinone in both hippocampal regions. Our functional genomics approach has identified in CA1 cells molecular pathways as well as unique genes, such as guanosine deaminase, lipocalin2, synaptotagmin 4, and latrophilin 2, whose time-dependent induction following the initiation of OS may represent attempts at neurite outgrowth, synaptic recovery, and resistance against OS. PMID:17553663

  8. Sequence analysis of two genomic regions containing the KIT and the FMS receptor tyrosine kinase genes

    SciTech Connect

    Andre, C.; Hampe, A.; Lachaume, P.

    1997-01-15

    The KIT and FMS tyrosine kinase receptors, which are implicated in the control of cell growth and differentiation, stem through duplications from a common ancestor. We have conducted a detailed structural analysis of the two loci containing the KIT and FMS genes. The sequence of the ~90-kb KIT locus reveals the position and size of the 21 introns and of the 5′ regulatory region of the KIT gene. The introns and the 3′-untranslated parts of KIT and FMS have been analyzed in parallel. Comparison of the two sequences shows that, while introns of both genes have extensively diverged in size and sequence, this divergence is, at least in part, due to intron expansion through internal duplications, as suggested by the discrete extant analogies. Repetitive elements as well as exon predictions obtained using the GRAIL and GENEFINDER programs are described in detail. These programs led us to identify a novel gene, designated SMF, immediately downstream of FMS, in the opposite orientation. This finding emphasizes the gene-rich characteristic of this genomic region. 49 refs., 4 figs., 7 tabs.

  9. Genomic stability of murine leukemia viruses containing insertions at the Env-3' untranslated region boundary.

    PubMed

    Logg, C R; Logg, A; Tai, C K; Cannon, P M; Kasahara, N

    2001-08-01

    Retroviruses containing inserts of exogenous sequences frequently eliminate the inserted sequences upon spread in susceptible cells. We have constructed replication-competent murine leukemia virus (MLV) vectors containing internal ribosome entry site (IRES)-transgene cassettes at the env-3' untranslated region boundary in order to examine the effects of insert sequence and size on the loss of inserts during viral replication. A virus containing an insertion of 1.6 kb replicated with greatly attenuated kinetics relative to wild-type virus and lost the inserted sequences in a single infection cycle. In contrast, MLVs containing inserts of 1.15 to 1.30 kb replicated with kinetics only slightly attenuated compared to wild-type MLV and exhibited much greater stability, maintaining their genomic integrity over multiple serial infection cycles. Eventually, multiple species of deletion mutants were detected simultaneously in later infection cycles; once detected, these variants rapidly dominated the population and thereafter appeared to be maintained at a relative equilibrium. Sequence analysis of these variants identified preferred sites of recombination in the parental viruses, including both short direct repeats and inverted repeats. One instance of insert deletion through recombination with an endogenous retrovirus was also observed. When specific sequences involved in these recombination events were eliminated, deletion variants still arose with the same kinetics upon virus passage and by apparently similar mechanisms, although at different locations in the vectors. Our results suggest that while lengthened, insert-containing genomes can be maintained over multiple replication cycles, preferential deletions resulting in loss of the inserted sequences confer a strong selective advantage. PMID:11435579

  10. Deletion of the E4 region of the genome produces adenovirus DNA concatemers.

    PubMed Central

    Weiden, M D; Ginsberg, H S

    1994-01-01

    Two mutants containing large deletions in the E4 region of the adenovirus genome, H5dl366 (91.9-98.3 map units) and H2dl808 (93.0-97.1 map units), were used to investigate the role of E4 genes in adenovirus DNA synthesis. Infection of KB human epidermoid carcinoma cells with either mutant resulted in production of large concatemers of viral DNA. Only monomer viral genome forms were produced, however, when mutants infected W162 cells, a monkey kidney cell line transformed with and expressing the E4 genes. Diffusible E4 gene products, therefore, complement the E4 mutant phenotype. The viral DNA concatemers produced in dl366- and dl808-infected KB cells did not have any specific orientation of monomer joining: the junctions consisted of head-to-head, head-to-tail, and tail-to-tail joints. The junctions were covalently linked molecules, but molecules were not precisely joined, and restriction enzyme maps revealed a heterogeneous size distribution of junction fragments. A series of mutants that disrupted single E4 open reading frames (ORFs) was also studied: none showed phenotypes similar to that of dl366 or dl808. Mutants containing defects in both ORF3 and ORF6, however, manifested the concatemer phenotype, indicating redundancy in genes preventing concatemer formation. These data suggest that the E4 ORFs 3 and 6 express functions critical for regulation of viral DNA replication and that concatemer intermediates may exist during adenovirus DNA synthesis. PMID:8278357

  11. Variable Genome Sequences of the Murine Pneumotropic Virus (Polyomaviridae) Regulatory Region Isolated from an Infected Mouse Tissue Viral Suspension

    PubMed Central

    Libbey, Jane E.

    2016-01-01

    The murine pneumotropic virus genome, isolated from an infected murine tissue homogenate, was sequenced to completion. The lungs, liver, spleen, and kidneys were the source of the tissue homogenate in order to mirror the heterogeneity of the virus population in vivo. The regulatory region sequence was found to be highly variable. PMID:27231357

  12. Use of sample pooling in a genome-wide association study identifies chromosomal regions affecting incidence of bovine respiratory disease

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We hypothesize that genome-wide association (GWA) based on high-density SNP arrays can be used to identify chromosomal regions affecting disease incidence using a case/control type approach. However, the large sample size required to map a lowly heritable trait like susceptibility to bovine respirat...

  13. Identification of Nine Genomic Regions of Amplification in Urothelial Carcinoma, Correlation with Stage, and Potential Prognostic and Therapeutic Value

    PubMed Central

    Rosenberg, Jonathan; Riester, Markus; Dai, Qishan; Lin, Sharron; Guo, Yanan; McDougal, W. Scott; Kwiatkowski, David J.

    2013-01-01

    We performed a genome wide analysis of 164 urothelial carcinoma samples and 27 bladder cancer cell lines to identify copy number changes associated with disease characteristics, and examined the association of amplification events with stage and grade of disease. Multiplex inversion probe (MIP) analysis, a recently developed genomic technique, was used to study 80 urothelial carcinomas to identify mutations and copy number changes. Selected amplification events were then analyzed in a validation cohort of 84 bladder cancers by multiplex ligation-dependent probe assay (MLPA). In the MIP analysis, 44 regions of significant copy number change were identified using GISTIC. Nine gene-containing regions of amplification were selected for validation in the second cohort by MLPA. Amplification events at these 9 genomic regions were found to correlate strongly with stage, being seen in only 2 of 23 (9%) Ta grade 1 or 1–2 cancers, in contrast to 31 of 61 (51%) Ta grade 3 and T2 grade 2 cancers, p<0.001. These observations suggest that analysis of genomic amplification of these 9 regions might help distinguish non-invasive from invasive urothelial carcinoma, although further study is required. Both MIP and MLPA methods perform well on formalin-fixed paraffin-embedded DNA, enhancing their potential clinical use. Furthermore several of the amplified genes identified here (ERBB2, MDM2, CCND1) are potential therapeutic targets. PMID:23593348
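
    The stage comparison reported above (amplification in 2 of 23 versus 31 of 61 cancers) can be checked with a 2x2 contingency test. The abstract does not name the test used, so the sketch below assumes a two-sided Fisher's exact test; the function name `fisher_exact_two_sided` is hypothetical, and the computation uses exact integer arithmetic from the standard library only.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def weight(k):
        # Unnormalised probability of seeing k in the top-left cell.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return Fraction(extreme, comb(n, row1))


# Rows: lower-stage group (2 amplified, 21 not) and higher-stage group
# (31 amplified, 30 not), taken from the counts in the abstract.
p_value = fisher_exact_two_sided(2, 21, 31, 30)
print(float(p_value))
```

    Under this assumed test the p-value falls below 0.001, which agrees with the significance level stated in the abstract.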

  14. Variable Genome Sequences of the Murine Pneumotropic Virus (Polyomaviridae) Regulatory Region Isolated from an Infected Mouse Tissue Viral Suspension.

    PubMed

    Libbey, Jane E; Fujinami, Robert S

    2016-01-01

    The murine pneumotropic virus genome, isolated from an infected murine tissue homogenate, was sequenced to completion. The lungs, liver, spleen, and kidneys were the source of the tissue homogenate in order to mirror the heterogeneity of the virus population in vivo. The regulatory region sequence was found to be highly variable. PMID:27231357

  15. Genomic regions associated with incidence of disease in cattle using DNA pooling and a high density single nucleotide polymorphism array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic regions associated with general disease (respiratory disease, foot rot, and pinkeye) in beef cattle were identified using treatment records on 2,849 animals. General disease cases included animals treated for bovine respiratory disease, foot rot, or pinkeye. Untreated cohorts, matched on b...

  16. 30 CFR 250.1166 - What additional reporting is required for developments in the Alaska OCS Region?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 2 2011-07-01 2011-07-01 false What additional reporting is required for developments in the Alaska OCS Region? 250.1166 Section 250.1166 Mineral Resources BUREAU OF OCEAN ENERGY... development is jointly regulated by MMS and the State of Alaska, MMS and the Alaska Oil and Gas...

  17. Adaptation of Maize to Temperate Climates: Mid-Density Genome-Wide Association Genetics and Diversity Patterns Reveal Key Genomic Regions, with a Major Contribution of the Vgt2 (ZCN8) Locus

    PubMed Central

    Bouchet, Sophie; Servin, Bertrand; Bertin, Pascal; Madur, Delphine; Combes, Valérie; Dumas, Fabrice; Brunel, Dominique; Laborde, Jacques; Charcosset, Alain; Nicolas, Stéphane

    2013-01-01

    The migration of maize from tropical to temperate climates was accompanied by a dramatic evolution in flowering time. To gain insight into the genetic architecture of this adaptive trait, we conducted a 50K SNP-based genome-wide association and diversity investigation on a panel of tropical and temperate American and European representatives. Eighteen genomic regions were associated with flowering time. The number of early alleles cumulated along these regions was highly correlated with flowering time. Polymorphism in the vicinity of the ZCN8 gene, which is the closest maize homologue to Arabidopsis major flowering time (FT) gene, had the strongest effect. This polymorphism is in the vicinity of the causal factor of Vgt2 QTL. Diversity was lower, whereas differentiation and LD were higher for associated loci compared to the rest of the genome, which is consistent with selection acting on flowering time during maize migration. Selection tests also revealed supplementary loci that were highly differentiated among groups and not associated with flowering time in our panel, whereas they were in other linkage-based studies. This suggests that allele fixation led to a lack of statistical power when structure and relatedness were taken into account in a linear mixed model. Complementary designs and analysis methods are necessary to unravel the architecture of complex traits. Based on linkage disequilibrium (LD) estimates corrected for population structure, we concluded that the number of SNPs genotyped should be at least doubled to capture all QTLs contributing to the genetic architecture of polygenic traits in this panel. These results show that maize flowering time is controlled by numerous QTLs of small additive effect and that strong polygenic selection occurred under cool climatic conditions. They should contribute to more efficient genomic predictions of flowering time and facilitate the dissemination of diverse maize genetic resources under a wide range of

  18. The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires

    PubMed Central

    Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel

    2016-01-01

    Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423
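
    The core genome, pan-genome and accessory genome named above reduce to set operations once every strain has been assigned a set of gene-family identifiers. The sketch below assumes that assignment has already been done (for example by ortholog clustering, which is not shown); the function name `core_and_pan_genome` and the toy strain labels are hypothetical.

```python
def core_and_pan_genome(strain_genes):
    """Return (core, pan, accessory) gene sets.

    strain_genes -- mapping of strain name to an iterable of gene-family
                    identifiers present in that strain
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set(), set()
    core = set.intersection(*gene_sets)   # present in every strain
    pan = set.union(*gene_sets)           # present in at least one strain
    accessory = pan - core                # variable part of the gene pool
    return core, pan, accessory


# Toy input with made-up strain and gene names.
example = {
    "strain_1": ["geneA", "geneB", "geneC"],
    "strain_2": ["geneA", "geneB", "geneD"],
    "strain_3": ["geneA", "geneB"],
}
core, pan, accessory = core_and_pan_genome(example)
```

    In the abstract's terms, the 2,164 core genes correspond to the intersection over the 40 strains and the 4,711 genes of the pan-genome to the union.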

  19. The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires.

    PubMed

    Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P; Smokvina, Tamara; de Vos, Willem M; Knol, Jan; Kleerebezem, Michiel

    2016-01-01

    Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence-absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains' core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423

  20. 12q14 Microdeletions: Additional Case Series with Confirmation of a Macrocephaly Region

    PubMed Central

    Mc Cormack, Adrian; Sharpe, Cynthia; Gregersen, Nerine; Smith, Warwick; Hayes, Ian; George, Alice M.; Love, Donald R.

    2015-01-01

    To date, there have been only a few reports of patients carrying a microdeletion in chromosome 12q14. These patients usually present with pre- and postnatal growth retardation, and developmental delay. Here we report on two additional patients with both genotype and phenotype differences. Similar to previously published cases, one patient has haploinsufficiency of the HMGA2 gene and shows severe short stature and developmental delay. The second patient is only one of a handful without the loss of the HMGA2 gene and shows a much better growth profile, but with absolute macrocephaly. This patient's deletion is unique and hence defines a likely macrocephaly locus that contributes to the general phenotype characterising the 12q14 syndrome. PMID:26266063

  1. Structural organization of poliovirus RNA replication is mediated by viral proteins of the P2 genomic region.

    PubMed Central

    Bienz, K; Egger, D; Troxler, M; Pasamontes, L

    1990-01-01

    Transcriptionally active replication complexes bound to smooth membrane vesicles were isolated from poliovirus-infected cells. In electron microscopic, negatively stained preparations, the replication complex appeared as an irregularly shaped, oblong structure attached to several virus-induced vesicles of a rosettelike arrangement. Electron microscopic immunocytochemistry of such preparations demonstrated that the poliovirus replication complex contains the proteins coded by the P2 genomic region (P2 proteins) in a membrane-associated form. In addition, the P2 proteins are also associated with viral RNA, and they can be cross-linked to viral RNA by UV irradiation. Guanidine hydrochloride prevented the P2 proteins from becoming membrane bound but did not change their association with viral RNA. The findings allow the conclusion that the protein 2C or 2C-containing precursor(s) is responsible for the attachment of the viral RNA to the vesicular membrane and for the spatial organization of the replication complex necessary for its proper functioning in viral transcription. A model for the structure of the viral replication complex and for the function of the 2C-containing P2 protein(s) and the vesicular membranes is proposed. PMID:2154600

  2. The complete mitochondrial genome of spittlebug Paphnutius ruficeps (Insecta: Hemiptera: Cercopidae) with a fairly short putative control region.

    PubMed

    Liu, Jie; Liang, Aiping

    2013-04-01

    The mitochondrial genome of the spittlebug Paphnutius ruficeps is a double-stranded circular DNA molecule of 14,841 bp with a total A and T content of 73.8%. It is one of the shortest genomes among published hemipteran mitogenomes and encodes 13 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA (tRNA) genes. The gene order is consistent with the hypothesized ancestral arthropod genome arrangement. Most of the protein-coding genes use ATG as the start codon and TAA as the stop codon. The codons show an evident bias toward the nucleotides T and A at the third codon position, and the most commonly used codons contain more A and T than their synonymous ones. The anticodons of the 22 tRNA genes are identical to those of the mitogenome of Philaenus spumarius, another studied spittlebug. All the tRNAs could be folded into traditional cloverleaf secondary structures. The putative control region (traditionally called the A + T-rich region) is the main non-coding part of the mitogenome. The AT content of this region (74.5%) is not significantly higher than that of the total mitogenome (73.8%) and slightly lower than that of the N-chain protein-coding genes (75.3%). The absence of repeat sequences and its short length are the most obvious characteristics of the mitochondrial genome of Paphnutius ruficeps compared with those of other published hemipteran species. PMID:23532251

  3. Molecular mapping of genomic regions underlying barley yellow dwarf tolerance in cultivated oat (Avena sativa L.).

    PubMed

    Zhu, S; Kolb, F L; Kaeppler, H F

    2003-05-01

    Barley yellow dwarf (BYD) is one of the most important viral diseases in small grains, including oat (Avena sativa L.). Breeding for BYD tolerance is an effective and efficient means to control the disease. Characterization of major sources of tolerance, and identification of marker-trait associations, will directly benefit breeding for BYD tolerance. Genomic regions underlying BYD tolerance were mapped and characterized in an oat population consisting of 152 recombinant inbred lines from the cross of 'Ogle' (tolerant)/MAM17-5 (sensitive). Tolerance was evaluated in replicated field trials across 2 years under artificial inoculation with viruliferous aphids harboring BYD virus isolate PAV-IL. Composite interval mapping was used for quantitative trait loci (QTLs) analysis with a framework map consisting of 272 molecular markers. Four QTLs, BYDq1, BYDq2, BYDq3 and BYDq4, for BYD tolerance were identified on linkage groups OM1, 5, 7 and 24, respectively. All but BYDq2 were consistently detected across both years. Significant epistasis was found between some QTLs. The final model including the epistatic effect explained 50.3 to 58.2% of the total phenotypic variation for BYD tolerance. Some QTLs for BYD tolerance were closely linked to QTLs for plant height and days to heading. Potential problems with QTL mapping for BYD tolerance are discussed. The identified association of markers and tolerance should be useful to pyramid favorable alleles for BYD tolerance into individual oat lines. PMID:12748782

  4. Genomic and Network Patterns of Schizophrenia Genetic Variation in Human Evolutionary Accelerated Regions

    PubMed Central

    Xu, Ke; Schadt, Eric E.; Pollard, Katherine S.; Roussos, Panos; Dudley, Joel T.

    2015-01-01

    The population persistence of schizophrenia despite associated reductions in fitness and fecundity suggests that the genetic basis of schizophrenia has a complex evolutionary history. A recent meta-analysis of schizophrenia genome-wide association studies offers novel opportunities for assessment of the evolutionary trajectories of schizophrenia-associated loci. In this study, we hypothesize that components of the genetic architecture of schizophrenia are attributable to human lineage-specific evolution. Our results suggest that schizophrenia-associated loci enrich in genes near previously identified human accelerated regions (HARs). Specifically, we find that genes near HARs conserved in nonhuman primates (pHARs) are enriched for schizophrenia-associated loci, and that pHAR-associated schizophrenia genes are under stronger selective pressure than other schizophrenia genes and other pHAR-associated genes. We further evaluate pHAR-associated schizophrenia genes in regulatory network contexts to investigate associated molecular functions and mechanisms. We find that pHAR-associated schizophrenia genes significantly enrich in a GABA-related coexpression module that was previously found to be differentially regulated in schizophrenia affected individuals versus healthy controls. In another two independent networks constructed from gene expression profiles from prefrontal cortex samples, we find that pHAR-associated schizophrenia genes are located in more central positions and their average path lengths to the other nodes are significantly shorter than those of other schizophrenia genes. Together, our results suggest that HARs are associated with potentially important functional roles in the genetic architecture of schizophrenia. PMID:25681384

  5. Narrowing and genomic annotation of the commonly deleted region of the 5q- syndrome

    SciTech Connect

    Boultwood, Jacqueline; Fidler, Carrie; Strickson, Amanda J.; Watkins, Fiona; Gama, Susana; Kearney, Lyndal; Tosi, Sabrina; Kasprzyk, Arek; Cheng, Jan-Fang; Jaju, Rina J.; Wainscoat, James S.

    2002-01-15

    The 5q- syndrome is the most distinct of the myelodysplastic syndromes, and the molecular basis for this disorder remains unknown. We describe the narrowing of the commonly deleted region (CDR) of the 5q- syndrome to the approximately 1.5-megabase interval at 5q32 flanked by D5S413 and the GLRA1 gene. The Ensembl gene prediction program has been used for the complete genomic annotation of the CDR. The CDR is gene rich and contains 24 known genes and 16 novel (predicted) genes. Of 40 genes in the CDR, 33 are expressed in CD34+ cells and, therefore, represent candidate genes since they are expressed within the hematopoietic stem/progenitor cell compartment. A number of the genes assigned to the CDR represent good candidates for the 5q- syndrome, including MEGF1, G3BP, and several of the novel gene predictions. These data now afford a comprehensive mutational/expression analysis of all candidate genes assigned to the CDR.

  6. Identification of genomic regions involved in resistance against Sclerotinia sclerotiorum from wild Brassica oleracea.

    PubMed

    Mei, Jiaqin; Ding, Yijuan; Lu, Kun; Wei, Dayong; Liu, Yao; Disi, Joseph Onwusemu; Li, Jiana; Liu, Liezhao; Liu, Shengyi; McKay, John; Qian, Wei

    2013-02-01

    The lack of resistance sources has greatly restrained resistance breeding of rapeseed (Brassica napus, AACC) against Sclerotinia sclerotiorum, which causes severe yield losses in rapeseed production all over the world. Recently, several wild Brassica oleracea accessions (CC) with a high level of resistance have been identified (Mei et al. in Euphytica 177:393-400, 2011), bringing new hope for improving Sclerotinia resistance of rapeseed. To map quantitative trait loci (QTL) for Sclerotinia resistance from wild B. oleracea, an F2 population consisting of 149 genotypes, with several clones of each genotype, was developed from one F1 individual derived from the cross between a resistant accession of wild B. oleracea (B. incana) and a susceptible accession of cultivated B. oleracea var. alboglabra. The F2 population was evaluated for Sclerotinia reaction in 2009 and 2010 under controlled conditions. Significant differences among genotypes and high heritability for leaf and stem reaction indicated that genetic components accounted for a large portion of the phenotypic variance. A total of 12 QTL for leaf resistance and six QTL for stem resistance were identified in 2 years, each explaining 2.2-28.4% of the phenotypic variation. The combined effect of alleles from wild B. oleracea reduced the relative susceptibility by 22.5% in leaves and 15% in stems on average over 2 years. A 12.8-cM genetic region on chromosome C09 of B. oleracea consisting of two major QTL intervals for both leaf and stem resistance was assigned into a 2.7-Mb genomic region on chromosome A09 of B. rapa, harboring about 30 putative resistance-related genes. Significant negative correlations were found between flowering time and relative susceptibility of leaf and stem. The association of flowering time with Sclerotinia resistance is discussed. PMID:23096003

  7. Allelic Variation in a Willow Warbler Genomic Region Is Associated with Climate Clines

    PubMed Central

    Larson, Keith W.; Liedvogel, Miriam; Addison, BriAnne; Kleven, Oddmund; Laskemoen, Terje; Lifjeld, Jan T.; Lundberg, Max; Åkesson, Susanne; Bensch, Staffan

    2014-01-01

    Local adaptation is an important process contributing to population differentiation which can occur in continuous or isolated populations connected by various amounts of gene flow. The willow warbler (Phylloscopus trochilus) is one of the most common songbirds in Fennoscandia. It has a continuous breeding distribution where it is found in all forested habitats from sea level to the tree line and therefore constitutes an ideal species for the study of locally adapted genes associated with environmental gradients. Previous studies in this species identified a genetic marker (AFLP-WW1) that showed a steep north-south cline in central Sweden with one allele associated with coastal lowland habitats and the other with mountainous habitats. It was further demonstrated that this marker is embedded in a highly differentiated chromosome region that spans several megabases. In the present study, we sampled 2,355 individuals at 128 sites across all of Fennoscandia to study the geographic and climatic variables associated with the allele frequency distributions of WW1. Our results demonstrate that 1) allele frequency patterns significantly differ between mountain and lowland populations, 2) these allele differences coincide with extreme temperature conditions and the short growing season in the mountains, and milder conditions in coastal areas, and 3) the northern-allele or “altitude variant” of WW1 occurs in willow warblers that occupy mountainous habitat regardless of subspecies. Finally these results suggest that climate may exert selection on the genomic region associated with these alleles and would allow us to develop testable predictions for the distribution of the genetic marker based on climate change scenarios. PMID:24788148

  8. Implementation of the Realized Genomic Relationship Matrix to Open-Pollinated White Spruce Family Testing for Disentangling Additive from Nonadditive Genetic Effects

    PubMed Central

    Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Porth, Ilga; Chen, Charles; El-Kassaby, Yousry A.

    2016-01-01

    The open-pollinated (OP) family testing combines the simplest known progeny evaluation and quantitative genetics analyses as candidates’ offspring are assumed to represent independent half-sib families. The accuracy of genetic parameter estimates is often questioned as the assumption of “half-sibling” in OP families may often be violated. We compared the pedigree- vs. marker-based genetic models by analysing 22-yr height and 30-yr wood density for 214 white spruce [Picea glauca (Moench) Voss] OP families represented by 1694 individuals growing on one site in Quebec, Canada. Assuming half-sibling, the pedigree-based model was limited to estimating the additive genetic variances which, in turn, were grossly overestimated as they were confounded by very minor dominance and major additive-by-additive epistatic genetic variances. In contrast, the implemented genomic pairwise realized relationship models allowed the disentanglement of additive from all nonadditive factors through genetic variance decomposition. The marker-based models produced more realistic narrow-sense heritability estimates and, for the first time, allowed estimating the dominance and epistatic genetic variances from OP testing. In addition, the genomic models showed better prediction accuracies compared to pedigree models and were able to predict individual breeding values for new individuals from untested families, which was not possible using the pedigree-based model. Clearly, the use of marker-based relationship approach is effective in estimating the quantitative genetic parameters of complex traits even under simple and shallow pedigree structure. PMID:26801647

  9. Computational tools and resources for prediction and analysis of gene regulatory regions in the chick genome

    PubMed Central

    Khan, Mohsin A. F.; Soto-Jimenez, Luz Mayela; Howe, Timothy; Streit, Andrea; Sosinsky, Alona; Stern, Claudio D.

    2013-01-01

    The discovery of cis-regulatory elements is a challenging problem in bioinformatics, owing to distal locations and context-specific roles of these elements in controlling gene regulation. Here we review the current bioinformatics methodologies and resources available for systematic discovery of cis-acting regulatory elements and conserved transcription factor binding sites in the chick genome. In addition, we propose and make available a novel workflow using computational tools that integrate CTCF analysis to predict putative insulator elements, enhancer prediction and TFBS analysis. To demonstrate the usefulness of this computational workflow, we then use it to analyze the locus of the gene Sox2, whose developmental expression is known to be controlled by a complex array of cis-acting regulatory elements. The workflow accurately predicts most of the experimentally verified elements along with some that have not yet been discovered. A web version of the CTCF tool, together with instructions for using the workflow, can be accessed from http://toolshed.g2.bx.psu.edu/view/mkhan1980/ctcf_analysis. For local installation of the tool, relevant Perl scripts and instructions are provided in the directory named “code” in the supplementary materials. PMID:23355428

  10. Assessing the patterns of linkage disequilibrium in genic regions of the human genome.

    PubMed

    Sun, Peng; Zhang, Ruijie; Jiang, Yongshuai; Wang, Xing; Li, Jin; Lv, Hongchao; Tang, Guoping; Guo, Xiaodan; Meng, Xianwen; Zhang, Haikun; Zhang, Ruimin

    2011-10-01

    We used the genotyping data generated by the International HapMap Project to study the patterns of linkage disequilibrium (LD) in human genic regions. LD patterns for 11,998 genes from 11 HapMap populations were identified by analyzing the distribution of haplotype blocks. The genes were prioritized using LD levels. The results showed that there were significant differences in the degree of LD between genes. Genes with high or low LD (the upper and lower quartiles of the LD levels) fell into different Gene Ontology functional categories. The high LD genes clustered preferentially in the metabolic process, macromolecule localization and cell-cycle categories, whereas the low LD genes clustered in the developmental process, ion transport, and immune and regulation system categories. Furthermore, we subdivided the genic region into 3'-UTR, 5'-UTR and CDS (coding region), and compared the different LD patterns in these subregions. We found that the LD patterns in low LD genes had a more interspersed block structure compared with the high LD genes. This was especially true in the CDS and 5'-UTR. The extent of LD was somewhat higher in 5'-UTRs compared with 3'-UTRs for both high and low LD genes. In addition, we assessed the overlap for the intragenic LD regions and found that the LD regions in high LD genes were more consistent among populations. Comprehensive information about the distribution of LD patterns in gene regions in populations may provide insights into the evolutionary history of humans and help in the selection of biomarkers for disease association studies. PMID:21824289
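    As background to the LD levels discussed in this abstract, the standard pairwise LD measures for two biallelic loci can be sketched as follows. This is a minimal illustration under the assumption that haplotype and allele frequencies are already known; the function name `ld_stats` is hypothetical, and the study itself worked from haplotype-block distributions rather than this calculation.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete LD: allele A always travels with allele B.
complete = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
# Linkage equilibrium: haplotype frequency equals the product of allele frequencies.
equilibrium = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)
```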

  11. Coding DNA repeated throughout intergenic regions of the Arabidopsis thaliana genome: Evolutionary footprints of RNA silencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pyknons are non-random sequence patterns significantly repeated throughout non-coding genomic DNA that also appear at least once among genes. They are interesting because they portend an unforeseen connection between coding and non-coding DNA. Pyknons have only been discovered in the human genome,...

  12. Origins of the Xylella fastidiosa prophage-like regions and their impact in genome differentiation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Xylella fastidiosa is a Gram negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome compositi...

  13. Organellar genome analysis of rye (Secale cereale) representing diverse geographic regions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Rye (Secale cereale) is an important diploid (2n = 14, RR) crop species of the Triticeae and a better understanding of its organellar genome variation can aid in its improvement. Previous genetic analyses of rye focused on the nuclear genome. In the present study, the objective was to investigate the ...

  14. Changes in twelve conserved soybean genomic regions following three rounds of polyploidy

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With the advent of high throughput sequencing, the availability of genomic sequence for comparative genomics is increasing exponentially. A set of highly conserved homoeologous segments would be valuable in the exploration of the retention and evolution of genes within gene families due to the evol...

  15. Comparative genomics reveals a functional thyroid-specific element in the far upstream region of the PAX8 gene

    PubMed Central

    2010-01-01

    Background: The molecular mechanisms leading to a fully differentiated thyrocyte are still the object of intense study, even though it is well known that thyroglobulin, thyroperoxidase, NIS and TSHr are the marker genes of thyroid differentiation. It is also well known that Pax8, TTF-1, Foxe1 and Hhex are the thyroid-enriched transcription factors responsible for the expression of the above genes, and thus for the differentiated thyroid phenotype. In particular, the role of Pax8 in the fully developed thyroid gland has been studied in depth, and it has been established that it plays a key role in thyroid development and differentiation. However, the bases for the thyroid-enriched expression of this transcription factor have not yet been unraveled. Here, we report the identification and characterization of a functional thyroid-specific enhancer element located far upstream of the Pax8 gene. Results: We hypothesized that regulatory cis-acting elements are conserved among mammalian genes. Comparison of a genomic region extending for about 100 kb at the 5'-flanking region of the mouse and human Pax8 gene revealed several conserved regions that were tested for enhancer activity in thyroid and non-thyroid cells. Using this approach we identified one putative thyroid-specific regulatory element located 84.6 kb upstream of the Pax8 transcription start site. The in silico data were verified by promoter-reporter assays in thyroid and non-thyroid cells. Interestingly, the identified far upstream element manifested a very high transcriptional activity in the thyroid cell line PC Cl3, but showed no activity in HeLa cells. In addition, the data reported here indicate that the thyroid-enriched transcription factor TTF-1 is able to bind the Pax8 far upstream element in vitro and in vivo, and is capable of activating transcription from it. Conclusions: Results of this study reveal the presence of a thyroid-specific regulatory element in the 5' upstream region of the Pax8 gene. The

  16. Random Addition Concatenation Analysis: A Novel Approach to the Exploration of Phylogenomic Signal Reveals Strong Agreement between Core and Shell Genomic Partitions in the Cyanobacteria

    PubMed Central

    Narechania, Apurva; Baker, Richard H.; Sit, Ryan; Kolokotronis, Sergios-Orestis; DeSalle, Rob; Planet, Paul J.

    2012-01-01

    Recent whole-genome approaches to microbial phylogeny have emphasized partitioning genes into functional classes, often focusing on differences between a stable core of genes and a variable shell. To rigorously address the effects of partitioning and combining genes in genome-level analyses, we developed a novel technique called Random Addition Concatenation Analysis (RADICAL). RADICAL operates by sequentially concatenating randomly chosen gene partitions starting with a single-gene partition and ending with the entire genomic data set. A phylogenetic tree is built for every successive addition, and the entire process is repeated creating multiple random concatenation paths. The result is a library of trees representing a large variety of differently sized random gene partitions. This library can then be mined to identify unique topologies, assess overall agreement, and measure support for different trees. To evaluate RADICAL, we used 682 orthologous genes across 13 cyanobacterial genomes. Despite previous assertions of substantial differences between a core and a shell set of genes for this data set, RADICAL reveals the two partitions contain congruent phylogenetic signal. Substantial disagreement within the data set is limited to a few nodes and genes involved in metabolism, a functional group that is distributed evenly between the core and the shell partitions. We highlight numerous examples where RADICAL reveals aspects of phylogenetic behavior not evident by examining individual gene trees or a “total evidence” tree. Our method also demonstrates that most emergent phylogenetic signal appears early in the concatenation process. The software is freely available at http://desalle.amnh.org. PMID:22094860
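    The random-addition step described in this abstract can be sketched roughly as below. This is an illustrative outline only, not the RADICAL software: the names `Alignment` and `concatenation_path` and the toy gene alignments are hypothetical, and the tree-building that RADICAL performs after every addition is left out (an external phylogenetics tool would be called at that point).

```python
import random
from typing import Dict, Iterator, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence for one gene


def concatenation_path(
    partitions: List[Alignment], rng: random.Random
) -> Iterator[Tuple[int, Alignment]]:
    """Yield (gene index added, cumulative concatenated alignment) along one
    random addition order, from a single gene up to the full data set."""
    order = list(range(len(partitions)))
    rng.shuffle(order)  # one random concatenation path
    taxa = sorted(partitions[0])
    combined = {taxon: "" for taxon in taxa}
    for idx in order:
        for taxon in taxa:
            combined[taxon] += partitions[idx][taxon]
        # A tree would be inferred from `combined` here in the real method.
        yield idx, dict(combined)


# Toy data: three hypothetical gene alignments over two taxa.
genes = [
    {"A": "AC", "B": "AT"},
    {"A": "GGG", "B": "GGA"},
    {"A": "T", "B": "C"},
]
# Repeating with different seeds gives multiple random concatenation paths.
paths = [list(concatenation_path(genes, random.Random(seed))) for seed in range(3)]
```

    Each path contains one cumulative alignment per gene added, so the trees built along all paths would form the library that the abstract describes mining for topologies and support.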

  17. Comparative Genomic Hybridizations Reveal Genetic Regions within the Mycobacterium avium Complex That Are Divergent from Mycobacterium avium subsp. paratuberculosis Isolates†

    PubMed Central

    Paustian, Michael L.; Kapur, Vivek; Bannantine, John P.

    2005-01-01

    Mycobacterium avium subsp. paratuberculosis is genetically similar to other members of the Mycobacterium avium complex (MAC), some of which are nonpathogenic and widespread in the environment. We have utilized an M. avium subsp. paratuberculosis whole-genome microarray representing over 95% of the predicted coding sequences to examine the genetic conservation among 10 M. avium subsp. paratuberculosis isolates, two isolates each of Mycobacterium avium subsp. silvaticum and Mycobacterium avium subsp. avium, and a single isolate each of both Mycobacterium intracellulare and Mycobacterium smegmatis. Genomic DNA from each isolate was competitively hybridized with DNA from M. avium subsp. paratuberculosis K10, and open reading frames (ORFs) were classified as present, divergent, or intermediate. None of the M. avium subsp. paratuberculosis isolates had ORFs classified as divergent. The two M. avium subsp. avium isolates had 210 and 135 divergent ORFs, while the two M. avium subsp. silvaticum isolates examined had 77 and 103 divergent ORFs. Similarly, 130 divergent ORFs were identified in M. intracellulare. A set of 97 ORFs were classified as divergent or intermediate in all of the nonparatuberculosis MAC isolates tested. Many of these ORFs are clustered together on the genome in regions with relatively low average GC content compared with the entire genome and contain mobile genetic elements. One of these regions of sequence divergence contained genes homologous to a mammalian cell entry (mce) operon. Our results indicate that closely related MAC mycobacteria can be distinguished from M. avium subsp. paratuberculosis by multiple clusters of divergent ORFs. PMID:15774884

  18. Genetic and physical mapping of the genomic region spanning CMT4A

    SciTech Connect

    Othmane, K.B.; Loeb, D.; Roses, A.D.

    1994-09-01

    Autosomal recessive Charcot-Marie-Tooth disease (CMT4) is a severe childhood neuropathy classified into three types: A, B, and C. We previously mapped CMT4A to chromosome 8q13-q21 in four large Tunisian families. Analysis of recombination events suggested the order: cent.-D8S279-(D8S286,D8S164, CMT4A)-D8S84-tel. Families with types B and C were subsequently typed and linkage for these types was excluded for the CMT4A region and other known CMT loci. Recently, the gene for a major peripheral myelin protein (PMP2) was mapped by FISH to chromosome 8q21-q22 and therefore appeared to be a strong candidate gene for CMT4A. We used SSCP analysis, DNA sequencing, FISH and YAC mapping analysis, and demonstrated that PMP2 is not the defect in CMT4A. Using physical mapping data, we sublocalized a new genethon marker (D8S548) to the CMT4A region between D8S286 and D8S164. All affected CMT4A patients were homozygotes for this polymorphic microsatellite as expected from its physical localization. We screened the CEPH megabase YAC library using the closest markers; over 30 YACs were isolated and characterized by PFGE. FISH analysis revealed about 16% chimeras. The YACs span the 8 cM region between D8S279 and PMP2 (mapped distal to D8S84), with a current 1 cM gap between D8S164 and D8S84. We are currently using Alu-PCR and vectorette to develop end clones in order to identify new YACs in the region and further close this gap. Alu-PCR fragments have identified several new microsatellites in the region which can be used for additional mapping of the CMT4A gene.

  19. Genomics of Clostridium tetani.

    PubMed

    Brüggemann, Holger; Brzuszkiewicz, Elzbieta; Chapeton-Montes, Diana; Plourde, Lucile; Speck, Denis; Popoff, Michel R

    2015-05-01

    Genomic information about Clostridium tetani, the causative agent of the tetanus disease, is scarce. The genome of strain E88, a strain used in vaccine production, was sequenced about 10 years ago. One additional genome (strain 12124569) has recently been released. Here we report three new genomes of C. tetani and describe major differences among all five C. tetani genomes. They all harbor tetanus-toxin-encoding plasmids that contain highly conserved genes for TeNT (tetanus toxin), TetR (transcriptional regulator of TeNT) and ColT (collagenase), but substantially differ in other plasmid regions. The chromosomes share a large core genome that contains about 85% of all genes of a given chromosome. The non-core chromosome comprises mainly prophage-like genomic regions and genes encoding environmental interaction and defense functions (e.g. surface proteins, restriction-modification systems, toxin-antitoxin systems, CRISPR/Cas systems) and other fitness functions (e.g. transport systems, metabolic activities). This new genome information will help to assess the level of genome plasticity of the species C. tetani and provide the basis for detailed comparative studies. PMID:25638019

  20. Impacts of Additional HONO Sources on Concentrations and Deposition of NOy in the Beijing-Tianjin-Hebei Region of China

    NASA Astrophysics Data System (ADS)

    Li, Ying; An, Junling; Kajino, Mizuo; Li, Jian; Qu, Yu

    2015-04-01

    Reactive nitrogen-containing compounds (NOy) are involved in many important chemical processes in the atmosphere, including aerosol formation as well as ozone (O3) production and destruction. As NOy deposition increased rapidly in China from the 1980s to the 2000s, great effort is urgently needed to reduce N deposition. HONO, an important component of NOy, is a significant precursor of the hydroxyl radical (OH) that drives the formation of O3 and fine particles (PM2.5). Nevertheless, the detailed formation mechanisms of HONO and the strength of its sources remain unclear. Unknown HONO sources and their potential impacts on air quality have attracted extensive interest, but to our knowledge the impact of HONO sources on regional-scale deposition of NOy has not yet been quantified. The goal of this work is to evaluate the effects of the additional HONO sources on concentrations and deposition of individual NOy species as well as the NOy budget in the northern Chinese regions affected by heavy pollution. Simulations of HONO contributions over the Beijing-Tianjin-Hebei region (BTH) during summer and winter periods of 2007 using the fully coupled Weather Research and Forecasting/Chemistry (WRF/Chem) model are performed by including three additional HONO sources: 1) the reaction of photo-excited nitrogen dioxide (NO2*) with water vapor, 2) NO2 heterogeneous reaction at the aerosol surfaces, and 3) HONO emissions. The model results show that the three additional HONO sources produce a 20%~40% (> 100%) increase in monthly-mean OH concentrations in many urban areas in August (February), leading to a 10%~40% (10%~100%) variation in monthly-mean concentrations of NOx, nitrate and PAN, a 5%~10% (10%~40%) increase in the total dry deposition of NOy, and an enhancement of 1.4 Gg N (1.5 Gg N) in the total of dry and wet deposition of NOy over this region in August (February). These results suggest that the additional HONO sources aggravate regional-scale acid deposition

  1. Nuclear Pore Proteins Nup153 and Megator Define Transcriptionally Active Regions in the Drosophila Genome

    PubMed Central

    Miura, Kota; Luscombe, Nicholas M.; Akhtar, Asifa

    2010-01-01

    Transcriptional regulation is one of the most important processes for modulating gene expression. Though much of this control is attributed to transcription factors, histones, and associated enzymes, it is increasingly apparent that the spatial organization of chromosomes within the nucleus has a profound effect on transcriptional activity. Studies in yeast indicate that the nuclear pore complex might promote transcription by recruiting chromatin to the nuclear periphery. In higher eukaryotes, however, it is not known whether such regulation has global significance. Here we establish nucleoporins as a major class of global regulators for gene expression in Drosophila melanogaster. Using chromatin-immunoprecipitation combined with microarray hybridisation, we show that Nup153 and Megator (Mtor) bind to 25% of the genome in continuous domains extending 10 kb to 500 kb. These Nucleoporin-Associated Regions (NARs) are dominated by markers for active transcription, including high RNA polymerase II occupancy and histone H4K16 acetylation. RNAi–mediated knock-down of Nup153 alters the expression of ∼5,700 genes, with a pronounced down-regulatory effect within NARs. We find that nucleoporins play a central role in coordinating dosage compensation—an organism-wide process involving the doubling of expression of the male X chromosome. NARs are enriched on the male X chromosome and occupy 75% of this chromosome. Furthermore, Nup153-depletion abolishes the normal function of the male-specific dosage compensation complex. Finally, by extensive 3D imaging, we demonstrate that NARs contribute to gene expression control irrespective of their sub-nuclear localization. Therefore, we suggest that NAR–binding is used for chromosomal organization that enables gene expression control. PMID:20174442

  2. Gap Closing/Finishing by Targeted Genomic Region Enrichment and Sequencing

    SciTech Connect

    Singh, Kanwar; Froula, Jeff; Trice, Hope; Pennacchio, Len A.; Chen, Feng

    2010-05-27

    Gap closing/finishing of draft genome assemblies is a labor- and cost-intensive process in which several rounds of repetitious amplification and sequencing are required. Here we demonstrate a high-throughput procedure in which custom primers flanking gaps in draft genomes are designed. Primer libraries containing up to 4,000 unique pairs in independent droplets are merged with a fragmented genomic template. From this, millions of picoliter-scale droplets are formed, each one being the functional equivalent of an individual PCR reaction. The PCR products are concatenated and sequenced on the Illumina platform; the resulting reads are then assembled and used for gap closure. Here we present an overall experimental strategy, primer design algorithm and initial results.
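
    As an illustration of the first step described above (finding the gaps that primers must flank), here is a minimal sketch. It is not the authors' primer design algorithm, which the abstract does not describe. It assumes gaps are encoded as runs of N in the draft sequence, and the function names (find_gaps, flanking_windows) and the flank size are hypothetical choices made for this example.

```python
import re

def find_gaps(sequence, min_gap=1):
    """Return half-open (start, end) coordinates of each run of N/n
    that is at least min_gap bases long (assumed gap encoding)."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", sequence)
        if m.end() - m.start() >= min_gap
    ]

def flanking_windows(sequence, flank=200, min_gap=1):
    """For each gap, report the left and right windows in which a
    primer pair could be placed. 'flank' is an illustrative default."""
    windows = []
    for start, end in find_gaps(sequence, min_gap):
        left = (max(0, start - flank), start)                # clipped at sequence start
        right = (end, min(len(sequence), end + flank))       # clipped at sequence end
        windows.append({"gap": (start, end), "left": left, "right": right})
    return windows

# Toy draft contig: 20 bases, a 10-base gap, then 20 more bases.
draft = "ACGT" * 5 + "N" * 10 + "TTGCA" * 4
for window in flanking_windows(draft, flank=8):
    print(window)
```

    Actual primer selection within each window (melting temperature, specificity, amplicon length) is outside the scope of this sketch.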

  3. Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions

    SciTech Connect

    MacArthur, Stewart; Li, Xiao-Yong; Li, Jingyi; Brown, James B.; Chu, Hou Cheng; Zeng, Lucy; Grondona, Brandi P.; Hechmer, Aaron; Simirenko, Lisa; Keranen, Soile V.E.; Knowles, David W.; Stapleton, Mark; Bickel, Peter; Biggin, Mark D.; Eisen, Michael B.

    2009-05-15

    BACKGROUND: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. RESULTS: Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. CONCLUSIONS: It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.

  4. Long non-coding RNA containing ultraconserved genomic region 8 promotes bladder cancer tumorigenesis.

    PubMed

    Olivieri, Michele; Ferro, Matteo; Terreri, Sara; Durso, Montano; Romanelli, Alessandra; Avitabile, Concetta; De Cobelli, Ottavio; Messere, Anna; Bruzzese, Dario; Vannini, Ivan; Marinelli, Luciana; Novellino, Ettore; Zhang, Wei; Incoronato, Mariarosaria; Ilardi, Gennaro; Staibano, Stefania; Marra, Laura; Franco, Renato; Perdonà, Sisto; Terracciano, Daniela; Czerniak, Bogdan; Liguori, Giovanna L; Colonna, Vincenza; Fabbri, Muller; Febbraio, Ferdinando; Calin, George A; Cimmino, Amelia

    2016-04-12

    Ultraconserved regions (UCRs) have been shown to originate non-coding RNA transcripts (T-UCRs) that have different expression profiles and play functional roles in the pathophysiology of multiple cancers. The relevance of these functions to the pathogenesis of bladder cancer (BlCa) is speculative. To elucidate this relevance, we first used genome-wide profiling to evaluate the expression of T-UCRs in BlCa tissues. Analysis of two datasets comprising normal bladder tissues and BlCa specimens with a custom T-UCR microarray identified ultraconserved RNA (uc.) 8+ as the most upregulated T-UCR in BlCa tissues, although its expression was lower than in pericancerous bladder tissues. These results were confirmed on BlCa tissues by real-time PCR and by in situ hybridization. Although uc.8+ is located within intron 1 of CASZ1, a zinc-finger transcription factor, the transcribed non-coding RNA encoding uc.8+ is expressed independently of CASZ1. In vitro experiments evaluating the effects of uc.8+ silencing showed significantly decreased capacities for cancer cell invasion, migration, and proliferation. From this, we proposed and validated a model of interaction in which uc.8+ shuttles from the nucleus to the cytoplasm of BlCa cells, interacts with microRNA (miR)-596, and cooperates in the promotion and development of BlCa. Using computational analysis, we investigated the miR-binding domain accessibility, as determined by base-pairing interactions within the uc.8+ predicted secondary structure, RNA binding affinity, and RNA species abundance in bladder tissues and showed that uc.8+ is a natural decoy for miR-596. Thus uc.8+ upregulation results in increased expression of MMP9, increasing the invasive potential of BlCa cells. These interactions between evolutionarily conserved regions of DNA suggest that natural selection has preserved this potentially regulatory layer that uses RNA to modulate miR levels, opening up the possibility for development of useful markers for

  5. Long non-coding RNA containing ultraconserved genomic region 8 promotes bladder cancer tumorigenesis

    PubMed Central

    Durso, Montano; Romanelli, Alessandra; Avitabile, Concetta; De Cobelli, Ottavio; Messere, Anna; Bruzzese, Dario; Vannini, Ivan; Marinelli, Luciana; Novellino, Ettore; Zhang, Wei; Incoronato, Mariarosaria; Ilardi, Gennaro; Staibano, Stefania; Marra, Laura; Franco, Renato; Perdonà, Sisto; Terracciano, Daniela; Czerniak, Bogdan; Liguori, Giovanna L.; Colonna, Vincenza; Fabbri, Muller; Febbraio, Ferdinando

    2016-01-01

    Ultraconserved regions (UCRs) have been shown to originate non-coding RNA transcripts (T-UCRs) that have different expression profiles and play functional roles in the pathophysiology of multiple cancers. The relevance of these functions to the pathogenesis of bladder cancer (BlCa) is speculative. To elucidate this relevance, we first used genome-wide profiling to evaluate the expression of T-UCRs in BlCa tissues. Analysis of two datasets comprising normal bladder tissues and BlCa specimens with a custom T-UCR microarray identified ultraconserved RNA (uc.) 8+ as the most upregulated T-UCR in BlCa tissues, although its expression was lower than in pericancerous bladder tissues. These results were confirmed on BlCa tissues by real-time PCR and by in situ hybridization. Although uc.8+ is located within intron 1 of CASZ1, a zinc-finger transcription factor, the transcribed non-coding RNA encoding uc.8+ is expressed independently of CASZ1. In vitro experiments evaluating the effects of uc.8+ silencing showed significantly decreased capacities for cancer cell invasion, migration, and proliferation. From this, we proposed and validated a model of interaction in which uc.8+ shuttles from the nucleus to the cytoplasm of BlCa cells, interacts with microRNA (miR)-596, and cooperates in the promotion and development of BlCa. Using computational analysis, we investigated the miR-binding domain accessibility, as determined by base-pairing interactions within the uc.8+ predicted secondary structure, RNA binding affinity, and RNA species abundance in bladder tissues and showed that uc.8+ is a natural decoy for miR-596. Thus uc.8+ upregulation results in increased expression of MMP9, increasing the invasive potential of BlCa cells. These interactions between evolutionarily conserved regions of DNA suggest that natural selection has preserved this potentially regulatory layer that uses RNA to modulate miR levels, opening up the possibility for development of useful markers for

  6. Sequence Variability in Viral Genome Non-coding Regions Likely Contribute to Observed Differences in Viral Replication Amongst MARV Strains

    PubMed Central

    Alonso, Jesus A.; Patterson, Jean L.

    2013-01-01

    The Marburg viruses Musoke (MARV-Mus) and Angola (MARV-Ang) have highly similar genomic sequences. Analysis of viral replication using various assays consistently identified MARV-Ang as the faster replicating virus. Non-coding genomic regions of negative sense RNA viruses are known to play a role in viral gene expression. A comparison of the six non-coding regions using bicistronic minigenomes revealed that the first two non-coding regions (NP / VP35 and VP35 / VP40) differed significantly in their transcriptional regulation. Deletion mutation analysis of the MARV-Mus NP / VP35 region further revealed that the MARV polymerase (L) is able to initiate production of the downstream gene without the presence of highly conserved regulatory signals. Bicistronic minigenome assays also identified the VP30 mRNA 5′ untranslated region as an rZAP-targeted RNA motif. Overall, our studies indicate that the high variation of MARV non-coding regions may play a significant role in observed differences in transcription and/or replication. PMID:23510675

  7. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  8. Comparative Analyses Between Lolium/Festuca Introgression Lines and Rice Reveal the Major Fraction of Functionally Annotated Gene Models Is Located in Recombination-Poor/Very Recombination-Poor Regions of the Genome

    PubMed Central

    King, Julie; Armstead, Ian P.; Donnison, S. Iain; Roberts, Luned A.; Harper, John A.; Skøt, Kirsten; Elborough, Kieran; King, Ian P.

    2007-01-01

    Publication of the rice genome sequence has allowed an in-depth analysis of genome organization in a model monocot plant species. This has provided a powerful tool for genome analysis in large-genome unsequenced agriculturally important monocot species such as wheat, barley, rye, Lolium, etc. Previous data have indicated that the majority of genes in large-genome monocots are located toward the ends of chromosomes in gene-rich regions that undergo high frequencies of recombination. Here we demonstrate that a substantial component of the coding sequences in monocots is localized proximally in regions of very low and even negligible recombination frequencies. The implications of our findings are that during domestication of monocot plant species selection has concentrated on genes located in the terminal regions of chromosomes within areas of high recombination frequency. Thus a large proportion of the genetic variation available for selection of superior plant genotypes has not been exploited. In addition our findings raise the possibility of the evolutionary development of large supergene complexes that confer a selective advantage to the individual. PMID:17603095

  9. Draft Genome Sequence of Bacillus sp. GZT, a 2,4,6-Tribromophenol-Degrading Strain Isolated from the River Sludge of an Electronic Waste-Dismantling Region.

    PubMed

    Liang, Zhishu; Li, Guiying; An, Taicheng; Das, Ranjit

    2016-01-01

    Here, we report the draft genome sequence of Bacillus sp. strain GZT, a 2,4,6-tribromophenol (TBP)-degrading bacterium previously isolated from an electronic waste-dismantling region. The draft genome sequence is 5.18 Mb and has a G+C content of 35.1%. This is the first genome report of a brominated flame retardant-degrading strain. PMID:27257197

  10. Draft Genome Sequence of Bacillus sp. GZT, a 2,4,6-Tribromophenol-Degrading Strain Isolated from the River Sludge of an Electronic Waste-Dismantling Region

    PubMed Central

    Liang, Zhishu; Li, Guiying; Das, Ranjit

    2016-01-01

    Here, we report the draft genome sequence of Bacillus sp. strain GZT, a 2,4,6-tribromophenol (TBP)-degrading bacterium previously isolated from an electronic waste-dismantling region. The draft genome sequence is 5.18 Mb and has a G+C content of 35.1%. This is the first genome report of a brominated flame retardant-degrading strain. PMID:27257197

  11. Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in man and the house mouse

    SciTech Connect

    Lundin, L.G.

    1993-04-01

    Gene constellations on several human chromosomes are interpreted as indications of large regional duplications that took place during evolution of the vertebrate genome. Four groups of paralogous chromosomal regions in man and the house mouse are suggested and are believed to be conserved remnants of the two or three rounds of tetraploidization that are likely to have occurred during evolution of the vertebrates. The phenomenon of differential silencing of genes is described. The importance of conservation of linkage of particular genes is discussed in relation to genetic regulation and cell differentiation. 120 refs., 5 tabs.

  12. Comparative Genomics Identifies the Mouse Bmp3 Promoter and an Upstream Evolutionary Conserved Region (ECR) in Mammals

    PubMed Central

    Lowery, Jonathan W.; LaVigne, Anna W.; Kokabu, Shoichiro; Rosen, Vicki

    2013-01-01

    The Bone Morphogenetic Protein (BMP) pathway is a multi-member signaling cascade whose basic components are found in all animals. One member, BMP3, which arose more recently in evolution and is found only in deuterostomes, serves a unique role as an antagonist to both the canonical BMP and Activin pathways. However, the mechanisms that control BMP3 expression, and the cis-regulatory regions mediating this regulation, remain poorly defined. With this in mind, we sought to identify the Bmp3 promoter in mouse (M. musculus) through functional and comparative genomic analyses. We found that the minimal promoter required for expression resides within 0.8 kb upstream of Bmp3 in a region that is highly conserved with rat (R. norvegicus). We also found that an upstream region abutting the minimal promoter acts as a repressor of the minimal promoter in HEK293T cells and osteoblasts. Strikingly, a portion of this region is conserved among all available eutherian mammal genomes (47/47), but not in any non-eutherian animal (0/136). We also identified multiple conserved transcription factor binding sites in the Bmp3 upstream ECR, suggesting that this region may preserve common cis-regulatory elements that govern Bmp3 expression across eutherian mammals. Since dysregulation of BMP signaling appears to play a role in human health and disease, our findings may have application in the development of novel therapeutics aimed at modulating BMP signaling in humans. PMID:23451274

  13. Genome Analysis of Treponema pallidum subsp. pallidum and subsp. pertenue Strains: Most of the Genetic Differences Are Localized in Six Regions

    PubMed Central

    Mikalová, Lenka; Strouhal, Michal; Čejková, Darina; Zobaníková, Marie; Pospíšilová, Petra; Norris, Steven J.; Sodergren, Erica; Weinstock, George M.; Šmajs, David

    2010-01-01

    The genomes of eight treponemes including T. p. pallidum strains (Nichols, SS14, DAL-1 and Mexico A), T. p. pertenue strains (Samoa D, CDC-2 and Gauthier), and the Fribourg-Blanc isolate, were amplified in 133 overlapping amplicons, and the restriction patterns of these fragments were compared. The approximate sizes of the genomes investigated based on this whole genome fingerprinting (WGF) analysis ranged from 1139.3–1140.4 kb, with the estimated genome sequence identity of 99.57–99.98% in the homologous genome regions. Restriction target site analysis, detecting the presence of 1773 individual restriction sites found in the reference Nichols genome, revealed a high genome structure similarity of all strains. The unclassified simian Fribourg-Blanc isolate was more closely related to T. p. pertenue than to T. p. pallidum strains. Most of the genetic differences between T. p. pallidum and T. p. pertenue strains were accumulated in six genomic regions. These genome differences likely contribute to the observed differences in pathogenicity between T. p. pallidum and T. p. pertenue strains. These regions of sequence divergence could be used for the molecular detection and discrimination of syphilis and yaws strains. PMID:21209953

  14. Genome-Wide Genetic Diversity and Differentially Selected Regions among Suffolk, Rambouillet, Columbia, Polypay, and Targhee Sheep

    PubMed Central

    Zhang, Lifan; Mousel, Michelle R.; Wu, Xiaolin; Michal, Jennifer J.; Zhou, Xiang; Ding, Bo; Dodson, Michael V.; El-Halawany, Nermin K.; Lewis, Gregory S.; Jiang, Zhihua

    2013-01-01

    Sheep are among the major economically important livestock species worldwide because the animals produce milk, wool, skin, and meat. In the present study, the Illumina OvineSNP50 BeadChip was used to investigate genetic diversity and genome selection among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds from the United States. After quality-control filtering of SNPs (single nucleotide polymorphisms), we used 48,026 SNPs, including 46,850 SNPs on autosomes that were in Hardy-Weinberg equilibrium and 1,176 SNPs on chromosome X, for analysis. Phylogenetic analysis based on all 46,850 SNPs clearly separated Suffolk from Rambouillet, Columbia, Polypay, and Targhee, which was not surprising as Rambouillet contributed to the synthesis of the latter three breeds. Based on pair-wise estimates of FST, significant genetic differentiation appeared between Suffolk and Rambouillet (FST = 0.1621), while Rambouillet and Targhee had the closest relationship (FST = 0.0681). A scan of the genome revealed 45 and 41 differentially selected regions (DSRs) between Suffolk and Rambouillet and among Rambouillet-related breed populations, respectively. Our data indicated that regions 13 and 24 between Suffolk and Rambouillet might be good candidates for evaluating breed differences. Furthermore, the ovine genome v3.1 assembly was used as a reference to link functionally known homologous genes to economically important traits covered by these differentially selected regions. In brief, our present study provides a comprehensive genome-wide view on within- and between-breed genetic differentiation, biodiversity, and evolution among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds. These results may provide new guidance for the synthesis of new breeds with different breeding objectives. PMID:23762451

  15. Genetic variation between Schistosoma japonicum lineages from lake and mountainous regions in China revealed by resequencing whole genomes.

    PubMed

    Yin, Mingbo; Liu, Xiao; Xu, Bin; Huang, Jian; Zheng, Qi; Yang, Zhong; Feng, Zheng; Han, Ze-Guang; Hu, Wei

    2016-09-01

    Schistosoma infection is a major cause of morbidity and mortality worldwide. Schistosomiasis japonica is endemic in mainland China along the Yangtze River, typically distributed in two geographical categories of lake and mountainous regions. The study of schistosome genetic diversity is of interest for understanding parasite biology and transmission, and for formulating control strategies. Certain genetic variations may be associated with adaptations to different ecological habitats. The aim of this study is to gain insight into Schistosoma japonicum genetic variation, evolutionary origin and associated causes of different geographic lineages by examining homozygous Single Nucleotide Polymorphisms (SNPs) based on resequenced genome data. We collected S. japonicum samples from four sites, three in the lake regions (LR) of mid-east China (Guichi and Tonglin in Anhui province, Laogang in Hunan province) and one in the mountainous region (MR) of south-west China (Xichang in Sichuan province), resequenced their genomes using Next Generation Sequencing (NGS) technology, and used the available S. japonicum draft genomic sequence as a reference in genome mapping. A total of 14,575 SNPs from 2059 genes were identified in the four lineages. Phylogenetic analysis confirmed significant genetic variation between the different geographical lineages, and further revealed that the MR Xichang lineage is phylogenetically closer to the LR Guichi lineage than to the other two LR lineages, and that the MR lineage might have evolved from LR lineages. More than two thirds of detected SNPs were nonsynonymous; functional annotation of the SNP-containing genes showed that they are involved mainly in biological processes such as signaling and response to stimuli. Notably, unique nonsynonymous SNP variations were detected in 66 genes of the MR lineage, suggesting possible genetic adaptation to mountainous ecological conditions. PMID:27207135

  16. Captured segment exchange: a strategy for custom engineering large genomic regions in Drosophila melanogaster.

    PubMed

    Bateman, Jack R; Palopoli, Michael F; Dale, Sarah T; Stauffer, Jennifer E; Shah, Anita L; Johnson, Justine E; Walsh, Conor W; Flaten, Hanna; Parsons, Christine M

    2013-02-01

    Site-specific recombinases (SSRs) are valuable tools for manipulating genomes. In Drosophila, thousands of transgenic insertions carrying SSR recognition sites have been distributed throughout the genome by several large-scale projects. Here we describe a method with the potential to use these insertions to make custom alterations to the Drosophila genome in vivo. Specifically, by employing recombineering techniques and a dual recombinase-mediated cassette exchange strategy based on the phiC31 integrase and FLP recombinase, we show that a large genomic segment that lies between two SSR recognition-site insertions can be "captured" as a target cassette and exchanged for a sequence that was engineered in bacterial cells. We demonstrate this approach by targeting a 50-kb segment spanning the tsh gene, replacing the existing segment with corresponding recombineered sequences through simple and efficient manipulations. Given the high density of SSR recognition-site insertions in Drosophila, our method affords a straightforward and highly efficient approach to explore gene function in situ for a substantial portion of the Drosophila genome. PMID:23150604

  17. Genome-environment association study suggests local adaptation to climate at the regional scale in Fagus sylvatica.

    PubMed

    Pluess, Andrea R; Frank, Aline; Heiri, Caroline; Lalagüe, Hadrien; Vendramin, Giovanni G; Oddou-Muratorio, Sylvie

    2016-04-01

    The evolutionary potential of long-lived species, such as forest trees, is fundamental for their local persistence under climate change (CC). Genome-environment association (GEA) analyses reveal if species in heterogeneous environments at the regional scale are under differential selection resulting in populations with potential preadaptation to CC within this area. In 79 natural Fagus sylvatica populations, neutral genetic patterns were characterized using 12 simple sequence repeat (SSR) markers, and genomic variation (144 single nucleotide polymorphisms (SNPs) out of 52 candidate genes) was related to 87 environmental predictors in the latent factor mixed model, logistic regressions and isolation by distance/environmental (IBD/IBE) tests. SSR diversity revealed relatedness at up to 150 m intertree distance but an absence of large-scale spatial genetic structure and IBE. In the GEA analyses, 16 SNPs in 10 genes responded to one or several environmental predictors and IBE, corrected for IBD, was confirmed. The GEA often reflected the proposed gene functions, including indications for adaptation to water availability and temperature. Genomic divergence and the lack of large-scale neutral genetic patterns suggest that gene flow allows the spread of advantageous alleles in adaptive genes. Thereby, adaptation processes are likely to take place in species occurring in heterogeneous environments, which might reduce their regional extinction risk under CC. PMID:26777878

  18. A genome-wide association study identifies genomic regions for virulence in the non-model organism Heterobasidion annosum s.s.

    PubMed

    Dalman, Kerstin; Himmelstrand, Kajsa; Olson, Åke; Lind, Mårten; Brandström-Durling, Mikael; Stenlid, Jan

    2013-01-01

    The dense single nucleotide polymorphisms (SNP) panels needed for genome wide association (GWA) studies have hitherto been expensive to establish and use on non-model organisms. To overcome this, we used a next generation sequencing approach to both establish SNPs and to determine genotypes. We conducted a GWA study on a fungal species, analysing the virulence of Heterobasidion annosum s.s., a necrotrophic pathogen, on its hosts Picea abies and Pinus sylvestris. From a set of 33,018 single nucleotide polymorphisms (SNP) in 23 haploid isolates, twelve SNP markers distributed on seven contigs were associated with virulence (P<0.0001). Four of the contigs harbour known virulence genes from other fungal pathogens and the remaining three harbour novel candidate genes. Two contigs link closely to virulence regions recognized previously by QTL mapping in the congeneric hybrid H. irregulare × H. occidentale. Our study demonstrates the efficiency of GWA studies for dissecting important complex traits of small populations of non-model haploid organisms with small genomes. PMID:23341945

  19. Analysis of Genomic Regions Associated With Coronary Artery Disease Reveals Continent-Specific Single Nucleotide Polymorphisms in North African Populations

    PubMed Central

    Zanetti, Daniela; Via, Marc; Carreras-Torres, Robert; Esteban, Esther; Chaabani, Hassen; Anaibar, Fatima; Harich, Nourdin; Habbal, Rachida; Ghalim, Noreddine; Moral, Pedro

    2016-01-01

    BACKGROUND: In recent years, several genomic regions have been robustly associated with coronary artery disease (CAD) in different genome-wide association studies (GWASs) conducted mainly in people of European descent. These kinds of data are lacking in African populations, even though heart diseases are a major cause of premature death and disability. METHODS: Here, 384 single nucleotide polymorphisms (SNPs) in the top four CAD risk regions (1p13, 1q41, 9p21, and 10q11) were genotyped in 274 case-control samples from Morocco and Tunisia, with the aim of analyzing for the first time whether the associations found in European populations were transferable to North Africans. RESULTS: The results indicate that, as in Europe, these four genetic regions are also important for CAD risk in North Africa. However, the individual SNPs associated with CAD in Africa are different from those identified in Europe in most cases (1p13, 1q41, and 9p21). Moreover, the seven risk variants identified in North Africans are efficient in discriminating between cases and controls in North African populations, but not in European populations. CONCLUSIONS: This study indicates a disparity in markers associated with CAD susceptibility between North Africans and Europeans that may be related to population differences in the chromosomal architecture of these risk regions. PMID:26780859

  20. Filarial and Wolbachia genomics.

    PubMed

    Scott, A L; Ghedin, E; Nutman, T B; McReynolds, L A; Poole, C B; Slatko, B E; Foster, J M

    2012-01-01

    Filarial nematode parasites, the causative agents for a spectrum of acute and chronic diseases including lymphatic filariasis and river blindness, threaten the well-being and livelihood of hundreds of millions of people in the developing regions of the world. The 2007 publication of a draft assembly of the 95-Mb genome of the human filarial parasite Brugia malayi, representing the first helminth parasite genome to be sequenced, has been followed in rapid succession by projects that have resulted in the genome sequencing of six additional filarial species, seven nonfilarial nematode parasites of animals and nearly 30 plant parasitic and free-living species. Parallel to the genomic sequencing, transcriptomic and proteomic projects have facilitated genome annotation, expanded our understanding of stage-associated gene expression and provided a first look at the role of epigenetic regulation of filarial genomes through microRNAs. The expansion in filarial genomics will also provide a significant enrichment in our knowledge of the diversity and variability in the genomes of the endosymbiotic bacterium Wolbachia, leading to a better understanding of the genetic principles that govern filarial-Wolbachia mutualism. The goal here is to provide an overview of the trends and advances in filarial and Wolbachia genomics. PMID:22098559

  1. Integration of multiethnic fine-mapping and genomic annotation to prioritize candidate functional SNPs at prostate cancer susceptibility regions.

    PubMed

    Han, Ying; Hazelett, Dennis J; Wiklund, Fredrik; Schumacher, Fredrick R; Stram, Daniel O; Berndt, Sonja I; Wang, Zhaoming; Rand, Kristin A; Hoover, Robert N; Machiela, Mitchell J; Yeager, Merideth; Burdette, Laurie; Chung, Charles C; Hutchinson, Amy; Yu, Kai; Xu, Jianfeng; Travis, Ruth C; Key, Timothy J; Siddiq, Afshan; Canzian, Federico; Takahashi, Atsushi; Kubo, Michiaki; Stanford, Janet L; Kolb, Suzanne; Gapstur, Susan M; Diver, W Ryan; Stevens, Victoria L; Strom, Sara S; Pettaway, Curtis A; Al Olama, Ali Amin; Kote-Jarai, Zsofia; Eeles, Rosalind A; Yeboah, Edward D; Tettey, Yao; Biritwum, Richard B; Adjei, Andrew A; Tay, Evelyn; Truelove, Ann; Niwa, Shelley; Chokkalingam, Anand P; Isaacs, William B; Chen, Constance; Lindstrom, Sara; Le Marchand, Loic; Giovannucci, Edward L; Pomerantz, Mark; Long, Henry; Li, Fugen; Ma, Jing; Stampfer, Meir; John, Esther M; Ingles, Sue A; Kittles, Rick A; Murphy, Adam B; Blot, William J; Signorello, Lisa B; Zheng, Wei; Albanes, Demetrius; Virtamo, Jarmo; Weinstein, Stephanie; Nemesure, Barbara; Carpten, John; Leske, M Cristina; Wu, Suh-Yuh; Hennis, Anselm J M; Rybicki, Benjamin A; Neslund-Dudas, Christine; Hsing, Ann W; Chu, Lisa; Goodman, Phyllis J; Klein, Eric A; Zheng, S Lilly; Witte, John S; Casey, Graham; Riboli, Elio; Li, Qiyuan; Freedman, Matthew L; Hunter, David J; Gronberg, Henrik; Cook, Michael B; Nakagawa, Hidewaki; Kraft, Peter; Chanock, Stephen J; Easton, Douglas F; Henderson, Brian E; Coetzee, Gerhard A; Conti, David V; Haiman, Christopher A

    2015-10-01

    Interpretation of biological mechanisms underlying genetic risk associations for prostate cancer is complicated by the relatively large number of risk variants (n = 100) and the thousands of surrogate SNPs in linkage disequilibrium. Here, we combined three distinct approaches: multiethnic fine-mapping, putative functional annotation (based upon epigenetic data and genome-encoded features), and expression quantitative trait loci (eQTL) analyses, in an attempt to reduce this complexity. We examined 67 risk regions using genotyping and imputation-based fine-mapping in populations of European (cases/controls: 8600/6946), African (cases/controls: 5327/5136), Japanese (cases/controls: 2563/4391) and Latino (cases/controls: 1034/1046) ancestry. Markers at 55 regions passed a region-specific significance threshold (P-value cutoff range: 3.9 × 10^-4 to 5.6 × 10^-3) and in 30 regions we identified markers that were more significantly associated with risk than the previously reported variants in the multiethnic sample. Novel secondary signals (P < 5.0 × 10^-6) were also detected in two regions (rs13062436/3q21 and rs17181170/3p12). Among 666 variants in the 55 regions with P-values within one order of magnitude of the most-associated marker, 193 variants (29%) in 48 regions overlapped with epigenetic or other putative functional marks. In 11 of the 55 regions, cis-eQTLs were detected with nearby genes. For 12 of the 55 regions (22%), the most significant region-specific, prostate-cancer associated variant represented the strongest candidate functional variant based on our annotations; the number of regions increased to 20 (36%) and 27 (49%) when examining the 2 and 3 most significantly associated variants in each region, respectively. These results have prioritized subsets of candidate variants for downstream functional evaluation. PMID:26162851
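    The candidate-selection rule in this abstract (variants whose P-values lie within one order of magnitude of the most-associated marker) and its reported percentages can be illustrated with a minimal Python sketch. The function names, variant IDs, and P-values below are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
def within_one_order(pvalues):
    """Return the variant IDs whose P-value is at most 10x the smallest
    P-value in the region (i.e. within one order of magnitude of the lead)."""
    lead = min(pvalues.values())
    cutoff = lead * 10
    return sorted(v for v, p in pvalues.items() if p <= cutoff)


def percent(part, whole):
    """Percentage rounded to the nearest integer, as quoted in the abstract."""
    return round(100 * part / whole)


# Hypothetical P-values for one risk region (illustrative, not study data).
region = {"rsA": 2.0e-9, "rsB": 1.5e-8, "rsC": 4.0e-7, "rsD": 9.0e-9}
candidates = within_one_order(region)

# Counts taken from the abstract: 193 of 666 variants, and 12, 20, 27 of 55 regions.
shares = [percent(193, 666), percent(12, 55), percent(20, 55), percent(27, 55)]
```

    With these inputs, `candidates` keeps rsA, rsB and rsD, and `shares` reproduces the 29%, 22%, 36% and 49% figures quoted above.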

  2. Whole genome sequence analyses of Xylella fastidiosa PD strains from different geographical regions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome sequences were determined for two Pierce’s disease (PD) causing Xylella fastidiosa (Xf) strains, one from Florida and one from Taiwan. The Florida strain was ATCC 35879, the type strain used as a standard reference for related taxonomy research. By contrast, the Taiwan strain used was only...

  3. Comparative genomics of Campylobacter iguaniorum to unravel genetic regions associated with reptilian hosts

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Campylobacter iguaniorum is genetically related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be the primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated...

  4. Recent artificial selection in U.S. Jersey cattle impacts autozygosity levels of specific genomic regions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome signatures of artificial selection in U.S. Jersey cattle were identified by examining changes in haplotype homozygosity for a resource population of animals born between 1962 and 2005. Genetic merit of this population changed dramatically during this period for a number of traits, especially ...

  5. Distinct factors control histone variant H3.3 localization at specific genomic regions

    PubMed Central

    Goldberg, Aaron D.; Banaszynski, Laura A.; Noh, Kyung-Min; Lewis, Peter W.; Elsaesser, Simon J.; Stadler, Sonja; Dewell, Scott; Law, Martin; Guo, Xingyi; Li, Xuan; Wen, Duancheng; Chapgier, Ariane; DeKelver, Russell C.; Miller, Jeffrey C.; Lee, Ya-Li; Boydston, Elizabeth A.; Holmes, Michael C.; Gregory, Philip D.; Greally, John M.; Rafii, Shahin; Yang, Chingwen; Scambler, Peter J.; Garrick, David; Gibbons, Richard J.; Higgs, Douglas R.; Cristea, Ileana M.; Urnov, Fyodor D.; Zheng, Deyou; Allis, C. David

    2010-01-01

    The incorporation of histone H3 variants has been implicated in the epigenetic memory of cellular state. Using genome editing with zinc finger nucleases to tag endogenous H3.3, we report genome-wide profiles of H3 variants in mammalian embryonic stem (ES) cells and neuronal precursor cells. Genome-wide patterns of H3.3 are dependent on amino acid sequence, and change with cellular differentiation at developmentally regulated loci. The H3.3 chaperone Hira is required for H3.3 enrichment at active and repressed genes. Strikingly, Hira is not essential for localization of H3.3 at telomeres and many transcription factor binding sites. Immunoaffinity purification and mass spectrometry reveal that the proteins Atrx and Daxx associate with H3.3 in a Hira-independent manner. Atrx is required for Hira-independent localization of H3.3 at telomeres, and for the repression of telomeric RNA. Our data demonstrate that multiple and distinct factors are responsible for H3.3 localization at specific genomic locations in mammalian cells. PMID:20211137

  6. Development and validation of new SSR markers from expressed regions in the garlic genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Only a limited number of simple sequence repeat (SSR) markers are available for the genome of garlic (Allium sativum L.), although SSR markers have become one of the most preferred marker systems because they are typically co-dominant, reproducible, cross-species transferable and highly polymorphic. In this ...

  7. Isolation of a Genomic Region Affecting Most Components of Metabolic Syndrome in a Chromosome-16 Congenic Rat Model

    PubMed Central

    Šedová, Lucie; Pravenec, Michal; Křenová, Drahomíra; Kazdová, Ludmila; Zídek, Václav; Krupková, Michaela; Liška, František; Křen, Vladimír; Šeda, Ondřej

    2016-01-01

    Metabolic syndrome is a highly prevalent human disease with substantial genomic and environmental components. Previous studies indicate the presence of significant genetic determinants of several features of metabolic syndrome on rat chromosome 16 (RNO16) and the syntenic regions of the human genome. We derived the SHR.BN16 congenic strain by introgression of a limited RNO16 region from the Brown Norway congenic strain (BN-Lx) into the genomic background of the spontaneously hypertensive rat (SHR) strain. We compared the morphometric, metabolic, and hemodynamic profiles of adult male SHR and SHR.BN16 rats. We also compared in silico the DNA sequences for the differential segment in the BN-Lx and SHR parental strains. SHR.BN16 congenic rats had significantly lower weight, decreased concentrations of total triglycerides and cholesterol, and improved glucose tolerance compared with SHR rats. The concentrations of insulin, free fatty acids, and adiponectin were comparable between the two strains. SHR.BN16 rats had significantly lower systolic (18–28 mmHg difference) and diastolic (10–15 mmHg difference) blood pressure throughout the experiment (repeated-measures ANOVA, P < 0.001). The differential segment spans approximately 22 Mb of the telomeric part of the short arm of RNO16. The in silico analyses revealed over 1200 DNA variants between the BN-Lx and SHR genomes in the SHR.BN16 differential segment, 44 of which lead to missense mutations, and only eight of which (in Asb14, Il17rd, Itih1, Syt15, Ercc6, RGD1564958, Tmem161a, and Gatad2a genes) are predicted to be damaging to the protein product. Furthermore, a number of genes within the RNO16 differential segment associated with metabolic syndrome components in human studies showed polymorphisms between SHR and BN-Lx (including Lpl, Nrg3, Pbx4, Cilp2, and Stab1). Our novel congenic rat model demonstrates that a limited genomic region on RNO16 in the SHR significantly affects many of the features of metabolic syndrome

  8. [Variation in evolutionary unstable regions of the chloroplast genome in plants obtained in anther culture of dihaploid wheat lines].

    PubMed

    Mozgova, G V; Orlov, P A; Shalygo, N V

    2006-02-01

    In dihaploid wheats, two evolutionarily unstable regions of the chloroplast genome were examined. These regions include the following genes, changes in which could be associated with albinism in anther culture: rbcL, encoding the large Rubisco subunit; psaA, encoding p700 apoprotein Ia; petA, encoding cytochrome f; atpB and atpE, encoding respectively beta and epsilon subunits of the CF1 ATPase complex; trnE, encoding glutamate tRNA; and cemA, encoding a cell membrane protein. Using PCR, we have shown that atpB was the gene most often not detected in the lines examined. These results suggest that regeneration of albino plants is accompanied by a deletion of a chloroplast DNA region harboring this gene. PMID:16583703

  9. Complete mitochondrial genome of the frillneck lizard (Chlamydosaurus kingii, Reptilia; Agamidae), another squamate with two control regions.

    PubMed

    Ujvari, Beata; Madsen, Thomas

    2008-10-01

    Using PCR, the complete mitochondrial genome was sequenced in three frillneck lizards (Chlamydosaurus kingii). The mitochondrial genome spanned 16,761 bp. As in other vertebrates, two rRNA genes, 22 tRNA genes and 13 protein-coding genes were identified. However, similar to some other squamate reptiles, two control regions (CRI and CRII) were identified, spanning 801 and 812 bp, respectively. Our results were compared with another Australian member of the family Agamidae, the bearded dragon (Pogona vitticeps). The overall base composition of the light-strand sequence largely mirrored that observed in P. vitticeps. Furthermore, similar to P. vitticeps, we observed an insertion 801 bp long between the ND5 and ND6 genes. However, in contrast to P. vitticeps, we did not observe a conserved sequence block III region. Based on a comparison among the three frillneck lizards, we also present data on the proportion of variable sites within the major mitochondrial regions. PMID:19489141

  10. Genomic footprinting of a yeast tRNA gene reveals stable complexes over the 5'-flanking region.

    PubMed Central

    Huibregtse, J M; Engelke, D R

    1989-01-01

    We have shown by genomic footprinting that the 5'-flanking region of the Saccharomyces cerevisiae tRNASUP53 gene is protected from DNase I digestion. The protected region has a 5' boundary at -40 (relative to the transcription initiation site) and extends into the coding region of the gene, with a 3' boundary at approximately +15. Although the DNase I protection over this region was much greater than at the A- and B-box internal promoters, point mutations within the A or B box that reduced transcription in vitro eliminated the upstream DNase I protection. This implies that formation of a stable complex over the 5'-flanking region is dependent on interaction of the gene with transcription factor IIIC but that stability of the complex may not require continued interaction with this factor. The DNase I protection under varied growth conditions further suggested that the upstream complex is composed of two or more components. The region over the transcription initiation site (approximately +15 to -10) was less protected in stationary-phase cultures, whereas the more upstream region (approximately -10 to -40) was protected in both exponential- and stationary-phase cultures. PMID:2677668

  11. Genome-wide coexpression of steroid receptors in the mouse brain: Identifying signaling pathways and functionally coordinated regions

    PubMed Central

    Lelieveldt, Boudewijn P. F.; Grefhorst, Aldo; van Weert, Lisa T. C. M.; Mol, Isabel M.; Sips, Hetty C. M.; van den Heuvel, José K.; Datson, Nicole A.; Visser, Jenny A.; Meijer, Onno C.

    2016-01-01

    Steroid receptors are pleiotropic transcription factors that coordinate adaptation to different physiological states. An important target organ is the brain, but even though their effects are well studied in specific regions, brain-wide steroid receptor targets and mediators remain largely unknown due to the complexity of the brain. Here, we tested the idea that novel aspects of steroid action can be identified through spatial correlation of steroid receptors with genome-wide mRNA expression across different regions in the mouse brain. First, we observed significant coexpression of six nuclear receptors (NRs) [androgen receptor (Ar), estrogen receptor alpha (Esr1), estrogen receptor beta (Esr2), glucocorticoid receptor (Gr), mineralocorticoid receptor (Mr), and progesterone receptor (Pgr)] with sets of steroid target genes that were identified in single brain regions. These coexpression relationships were also present in distinct other brain regions, suggestive of as yet unidentified coordinate regulation of brain regions by, for example, glucocorticoids and estrogens. Second, coexpression of a set of 62 known NR coregulators and the six steroid receptors in 12 nonoverlapping mouse brain regions revealed selective downstream pathways, such as Pak6 as a mediator for the effects of Ar and Gr on dopaminergic transmission. Third, Magel2 and Irs4 were identified and validated as strongly responsive targets to the estrogen diethylstilbestrol in the mouse hypothalamus. The brain- and genome-wide correlations of mRNA expression levels of six steroid receptors that we provide constitute a rich resource for further predictions and understanding of brain modulation by steroid hormones. PMID:26811448
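    The spatial-correlation idea described in this abstract (comparing a receptor's expression profile across brain regions with that of every other gene) can be sketched in a few lines of Python. The expression values, gene names, and the choice of Pearson correlation are assumptions made for illustration; the abstract does not state which correlation measure the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression of one receptor across 12 brain regions.
receptor = [float(v) for v in range(1, 13)]

# Hypothetical candidate genes measured in the same 12 regions.
genes = {
    "geneA": [2 * v + 1 for v in receptor],            # rises with the receptor
    "geneB": [13 - v for v in receptor],               # falls as the receptor rises
    "geneC": [5, 1, 4, 8, 2, 9, 3, 7, 6, 2, 8, 4],     # no clear relationship
}

# Rank genes from most to least coexpressed with the receptor.
ranked = sorted(genes, key=lambda g: pearson(receptor, genes[g]), reverse=True)
```

    Genes at the top of `ranked` would be the candidates for coordinated regulation; a real analysis would also need a significance threshold and correction for the number of genes tested.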

  12. Genome-wide coexpression of steroid receptors in the mouse brain: Identifying signaling pathways and functionally coordinated regions.

    PubMed

    Mahfouz, Ahmed; Lelieveldt, Boudewijn P F; Grefhorst, Aldo; van Weert, Lisa T C M; Mol, Isabel M; Sips, Hetty C M; van den Heuvel, José K; Datson, Nicole A; Visser, Jenny A; Reinders, Marcel J T; Meijer, Onno C

    2016-03-01

    Steroid receptors are pleiotropic transcription factors that coordinate adaptation to different physiological states. An important target organ is the brain, but even though their effects are well studied in specific regions, brain-wide steroid receptor targets and mediators remain largely unknown due to the complexity of the brain. Here, we tested the idea that novel aspects of steroid action can be identified through spatial correlation of steroid receptors with genome-wide mRNA expression across different regions in the mouse brain. First, we observed significant coexpression of six nuclear receptors (NRs) [androgen receptor (Ar), estrogen receptor alpha (Esr1), estrogen receptor beta (Esr2), glucocorticoid receptor (Gr), mineralocorticoid receptor (Mr), and progesterone receptor (Pgr)] with sets of steroid target genes that were identified in single brain regions. These coexpression relationships were also present in distinct other brain regions, suggestive of as yet unidentified coordinate regulation of brain regions by, for example, glucocorticoids and estrogens. Second, coexpression of a set of 62 known NR coregulators and the six steroid receptors in 12 nonoverlapping mouse brain regions revealed selective downstream pathways, such as Pak6 as a mediator for the effects of Ar and Gr on dopaminergic transmission. Third, Magel2 and Irs4 were identified and validated as strongly responsive targets to the estrogen diethylstilbestrol in the mouse hypothalamus. The brain- and genome-wide correlations of mRNA expression levels of six steroid receptors that we provide constitute a rich resource for further predictions and understanding of brain modulation by steroid hormones. PMID:26811448

  13. Evidence of Shared Genome-Wide Additive Genetic Effects on Interpersonal Trauma Exposure and Generalized Vulnerability to Drug Dependence in a Population of Substance Users.

    PubMed

    Palmer, Rohan H C; Nugent, Nicole R; Brick, Leslie A; Bidwell, Cinnamon L; McGeary, John E; Keller, Matthew C; Knopik, Valerie S

    2016-06-01

    Exposure to traumatic experiences is associated with an increased risk for drug dependence and poorer response to substance abuse treatment (Claus & Kindleberger, 2002; Jaycox, Ebener, Damesek, & Becker, 2004). Despite this evidence, the reasons for the observed associations of trauma and the general tendency to be dependent upon drugs of abuse remain unclear. Data (N = 2,596) from the Study of Addiction: Genetics and Environment were used to analyze (a) the degree to which commonly occurring single nucleotide polymorphisms (SNPs; minor allele frequency > 1%) in the human genome explains exposure to interpersonal traumatic experiences, and (b) the extent to which additive genetic effects on trauma are shared with additive genetic effects on drug dependence. Our results suggested moderate additive genetic influences on interpersonal trauma, h(2) SNP-Interpersonal = .47, 95% confidence interval (CI) [.10, .85], that are partially shared with additive genetic effects on generalized vulnerability to drug dependence, h(2) SNP-DD = .36, 95% CI [.11, .61]; rG-SNP = .49, 95% CI [.02, .96]. Although the design/technique does not exclude the possibility that substance abuse causally increases risk for traumatic experiences (or vice versa), these findings raise the possibility that commonly occurring SNPs influence both the general tendency towards drug dependence and interpersonal trauma. PMID:27214850

  14. swDMR: A Sliding Window Approach to Identify Differentially Methylated Regions Based on Whole Genome Bisulfite Sequencing

    PubMed Central

    Shao, Qianzhi; Liu, Qi; Chen, BingYu; Huang, Dongsheng

    2015-01-01

    DNA methylation is a widespread epigenetic modification that plays an essential role in gene expression through transcriptional regulation and chromatin remodeling. The emergence of whole genome bisulfite sequencing (WGBS) represents an important milestone in the detection of DNA methylation. Characterization of differentially methylated regions (DMRs) is likewise fundamental for further functional analysis. In this study, we present swDMR (http://sourceforge.net/projects/swdmr/) for the comprehensive analysis of DMRs from whole genome methylation profiles by a sliding window approach. It is an integrated tool designed for WGBS data that implements accessible statistical methods for hypothesis testing on two or more samples without replicates and controls the false discovery rate by multiple test correction. Downstream analysis tools are also provided, including cluster, annotation and visualization modules. In summary, based on WGBS data, swDMR can produce abundant information on differentially methylated regions. We believe that swDMR, as a convenient and flexible tool, will bring us closer to unveiling the potential functional regions involved in epigenetic regulation. PMID:26176536
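    The general sliding-window idea described here can be sketched as follows: pool methylated and unmethylated read counts per window for two samples, test each window, then adjust the P-values for multiple testing. This is a minimal illustration, not swDMR's implementation. The choice of Fisher's exact test, the Benjamini-Hochberg adjustment, the window and step sizes, and all data values are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def sliding_window_dmrs(sites, window=1000, step=500, alpha=0.05):
    """sites: (position, meth_a, total_a, meth_b, total_b) tuples sorted by position.
    Returns (start, end, p, q) for windows with adjusted P-value <= alpha."""
    if not sites:
        return []
    start, last, tested = sites[0][0], sites[-1][0], []
    while start <= last:
        inside = [s for s in sites if start <= s[0] < start + window]
        if inside:  # skip windows that contain no covered CpG site
            ma, ta = sum(s[1] for s in inside), sum(s[2] for s in inside)
            mb, tb = sum(s[3] for s in inside), sum(s[4] for s in inside)
            tested.append((start, start + window,
                           fisher_exact_two_sided(ma, ta - ma, mb, tb - mb)))
        start += step
    qvals = benjamini_hochberg([w[2] for w in tested])
    return [(s, e, p, q) for (s, e, p), q in zip(tested, qvals) if q <= alpha]


# Hypothetical read counts for two samples at six CpG sites.
sites = [
    (100, 18, 20, 2, 20), (400, 17, 20, 3, 20), (700, 19, 20, 1, 20),
    (2100, 10, 20, 10, 20), (2500, 11, 20, 9, 20), (2900, 9, 20, 11, 20),
]
dmr_windows = [(s, e) for s, e, _, _ in sliding_window_dmrs(sites)]
```

    In this toy data set only the two windows covering the first three sites pass the threshold. A fuller tool would typically merge overlapping significant windows into a single region, which this sketch leaves out.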

  15. Genome walking.

    PubMed

    Shapter, Frances M; Waters, Daniel L E

    2014-01-01

    Genome walking is a method for determining the DNA sequence of unknown genomic regions flanking a region of known DNA sequence. Genome walking has the potential to capture 6-7 kb of sequence in a single round. It is ideal for identifying gene promoter regions where only the coding region is known. Genome walking also has significant utility for capturing homologous genes in new species when there are areas in the target gene with strong sequence conservation to the characterized species. The increasing use of next-generation sequencing technologies will see the principles of genome walking adapted to in silico methods. However, for smaller projects, PCR-based genome walking will remain an efficient method of characterizing unknown flanking sequence. PMID:24243201

  16. Replication protein of tobacco mosaic virus cotranslationally binds the 5′ untranslated region of genomic RNA to enable viral replication

    PubMed Central

    Kawamura-Nagaya, Kazue; Ishibashi, Kazuhiro; Huang, Ying-Ping; Miyashita, Shuhei; Ishikawa, Masayuki

    2014-01-01

    Genomic RNAs of positive-strand RNA viruses replicate via complementary (i.e., negative-strand) RNA in membrane-bound replication complexes. Before replication complex formation, virus-encoded replication proteins specifically recognize genomic RNA molecules and recruit them to sites of replication. Moreover, in many of these viruses, selection of replication templates by the replication proteins occurs preferentially in cis. This property is advantageous to the viruses in several aspects of viral replication and evolution, but the underlying molecular mechanisms have not been characterized. Here, we used an in vitro translation system to show that a 126-kDa replication protein of tobacco mosaic virus (TMV), a positive-strand RNA virus, binds a 5′-terminal ∼70-nucleotide region of TMV RNA cotranslationally, but not posttranslationally. TMV mutants that carried nucleotide changes in the 5′-terminal region and showed a defect in the binding were unable to synthesize negative-strand RNA, indicating that this binding is essential for template selection. A C-terminally truncated 126-kDa protein, but not the full-length 126-kDa protein, was able to posttranslationally bind TMV RNA in vitro, suggesting that binding of the 126-kDa protein to the 70-nucleotide region occurs during translation and before synthesis of the C-terminal inhibitory domain. We also show that binding of the 126-kDa protein prevents further translation of the bound TMV RNA. These data provide a mechanistic explanation of how the 126-kDa protein selects replication templates in cis and how fatal collision between translating ribosomes and negative-strand RNA-synthesizing polymerases on the genomic RNA is avoided. PMID:24711385

  17. QTL Mapping in Three Rice Populations Uncovers Major Genomic Regions Associated with African Rice Gall Midge Resistance

    PubMed Central

    Semagn, Kassa; Sow, Mounirou; Nwilene, Francis; Kolade, Olufisayo; Bocco, Roland; Oyetunji, Olumoye; Mitchell-Olds, Thomas; Ndjiondjop, Marie-Noëlle

    2016-01-01

    African rice gall midge (AfRGM) is one of the most destructive pests of irrigated and lowland African ecologies. This study aimed to identify the quantitative trait loci (QTL) associated with AfRGM pest incidence and resistance in three independent bi-parental rice populations (ITA306xBW348-1, ITA306xTOG7106 and ITA306xTOS14519), and to conduct meta QTL (mQTL) analysis to explore whether any genomic regions are conserved across different genetic backgrounds. Composite interval mapping (CIM) conducted on the three populations independently uncovered a total of 28 QTLs associated with pest incidence (12) and pest severity (16). The number of QTLs per population associated with AfRGM resistance varied from three in the ITA306xBW348-1 population to eight in the ITA306xTOG7106 population. Each QTL individually explained 1.3 to 34.1% of the phenotypic variance. The major genomic region for AfRGM resistance had a LOD score and R² of 60.0 and 34.1%, respectively, and mapped at 111 cM on chromosome 4 (qAfRGM4) in the ITA306xTOS14519 population. The meta-analysis reduced the number of QTLs from 28 to 17 mQTLs, each explaining 1.3 to 24.5% of phenotypic variance, and narrowed the confidence intervals by 2.2 cM. There was only one minor effect mQTL on chromosome 1 that was common in the TOS14519 and TOG7106 genetic backgrounds; all other mQTLs were background specific. We are currently fine-mapping and validating the major effect genomic region on chromosome 4 (qAfRGM4). This is the first report on mapping the genomic regions associated with AfRGM resistance, and will be highly useful for rice breeders. PMID:27508500

  18. QTL Mapping in Three Rice Populations Uncovers Major Genomic Regions Associated with African Rice Gall Midge Resistance.

    PubMed

    Yao, Nasser; Lee, Cheng-Ruei; Semagn, Kassa; Sow, Mounirou; Nwilene, Francis; Kolade, Olufisayo; Bocco, Roland; Oyetunji, Olumoye; Mitchell-Olds, Thomas; Ndjiondjop, Marie-Noëlle

    2016-01-01

    African rice gall midge (AfRGM) is one of the most destructive pests of irrigated and lowland African ecologies. This study aimed to identify the quantitative trait loci (QTL) associated with AfRGM pest incidence and resistance in three independent bi-parental rice populations (ITA306xBW348-1, ITA306xTOG7106 and ITA306xTOS14519), and to conduct meta QTL (mQTL) analysis to explore whether any genomic regions are conserved across different genetic backgrounds. Composite interval mapping (CIM) conducted on the three populations independently uncovered a total of 28 QTLs associated with pest incidence (12) and pest severity (16). The number of QTLs per population associated with AfRGM resistance varied from three in the ITA306xBW348-1 population to eight in the ITA306xTOG7106 population. Each QTL individually explained 1.3 to 34.1% of the phenotypic variance. The major genomic region for AfRGM resistance had a LOD score and R² of 60.0 and 34.1%, respectively, and mapped at 111 cM on chromosome 4 (qAfRGM4) in the ITA306xTOS14519 population. The meta-analysis reduced the number of QTLs from 28 to 17 mQTLs, each explaining 1.3 to 24.5% of phenotypic variance, and narrowed the confidence intervals by 2.2 cM. There was only one minor effect mQTL on chromosome 1 that was common in the TOS14519 and TOG7106 genetic backgrounds; all other mQTLs were background specific. We are currently fine-mapping and validating the major effect genomic region on chromosome 4 (qAfRGM4). This is the first report on mapping the genomic regions associated with AfRGM resistance, and will be highly useful for rice breeders. PMID:27508500

  19. Isolation of Transducing Particles of ϕ80 Bacteriophage That Carry Different Regions of the Escherichia coli Genome

    PubMed Central

    Press, R.; Glansdorff, N.; Miner, P.; Vries, J. De; Kadner, R.; Maas, W. K.

    1971-01-01

    It has been possible to mate two strains harboring F-prime (F′) factors and to isolate from such matings rare recombinants that behave as though the two episomes had fused. Thus, two genes not previously linked may be brought into close proximity. An F′ factor carrying the attachment site for ϕ80 was fused with one carrying the met-ppc-arg region of the chromosome. Lysogenization of such a strain, followed by induction, led to the isolation of ϕ80arg+ and ϕ80met+ transducing phages. This technique may be utilized as a general method for joining diverse bacterial genes to the genome of phage ϕ80. PMID:4927673

  20. Multiple recent horizontal transfers of a large genomic region in cheese making fungi

    PubMed Central

    Cheeseman, Kevin; Ropars, Jeanne; Renault, Pierre; Dupont, Joëlle; Gouzy, Jérôme; Branca, Antoine; Abraham, Anne-Laure; Ceppi, Maurizio; Conseiller, Emmanuel; Debuchy, Robert; Malagnac, Fabienne; Goarin, Anne; Silar, Philippe; Lacoste, Sandrine; Sallet, Erika; Bensimon, Aaron; Giraud, Tatiana; Brygoo, Yves

    2014-01-01

    While the extent and impact of horizontal transfers in prokaryotes are widely acknowledged, their importance to the eukaryotic kingdom is unclear and thought by many to be anecdotal. Here we report multiple recent transfers of a huge genomic island between Penicillium spp. found in the food environment. Sequencing of the two leading filamentous fungi used in cheese making, P. roqueforti and P. camemberti, and comparison with the penicillin producer P. rubens reveals a 575 kb long genomic island in P. roqueforti—called Wallaby—present as identical fragments at non-homologous loci in P. camemberti and P. rubens. Wallaby is detected in Penicillium collections exclusively in strains from food environments. Wallaby encompasses about 250 predicted genes, some of which are probably involved in competition with microorganisms. The occurrence of multiple recent eukaryotic transfers in the food environment provides strong evidence for the importance of this understudied and probably underestimated phenomenon in eukaryotes. PMID:24407037

  1. Genome Sequences of 11 Brucella abortus Isolates from Persistently Infected Italian Regions.

    PubMed

    Garofolo, Giuliano; Foster, Jeffrey T; Drees, Kevin; Zilli, Katiuscia; Platone, Ilenia; Ancora, Massimo; Cammà, Cesare; De Massis, Fabrizio; Calistri, Paolo; Di Giannatale, Elisabetta

    2015-01-01

    Bovine brucellosis, typically caused by Brucella abortus, has been eradicated from much of the developed world. However, the disease remains prevalent in southern Italy, persisting as a public and livestock health concern. We report here the whole-genome sequences of 11 isolates from cattle (Bos taurus) and water buffalo (Bubalus bubalis) that are representative of the current genetic diversity of B. abortus lineages circulating in Italy. PMID:26679575

  2. AnABlast: a new in silico strategy for the genome-wide search of novel genes and fossil regions.

    PubMed

    Jimenez, Juan; Duncan, Caia D S; Gallardo, María; Mata, Juan; Perez-Pulido, Antonio J

    2015-12-01

    Genome annotation, assisted by computer programs, is one of the great advances in modern biology. Nevertheless, the in silico identification of small and complex coding sequences is still challenging. We observed that amino acid sequences inferred from coding (but rarely from non-coding) DNA sequences accumulated alignments in low-stringency BLAST searches, suggesting that this accumulation of alignments could be used to highlight coding regions in sequenced DNA. To investigate this possibility, we developed a computer program (AnABlast) that generates profiles of accumulated alignments in query amino acid sequences using a low-stringency BLAST strategy. To validate this approach, all six-frame translations of DNA sequences between every two annotated exons of the fission yeast genome were analysed with AnABlast. AnABlast-generated profiles identified three new copies of known genes, and four new genes supported by experimental evidence. New pseudogenes, ancestral carboxyl- and amino-terminal subtractions, complex gene rearrangements, and ancient fragments of mitDNA and of bacterial origin, were also inferred. Thus, this novel in silico approach provides a powerful tool to uncover new genes, as well as fossil-coding sequences, thus providing insight into the evolutionary history of annotated genomes. PMID:26494834
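    The notion of a profile of accumulated alignments can be illustrated with a short Python sketch: count how many alignments cover each position of a query sequence, then report the stretches where that count reaches a chosen threshold. The function names, coordinates, and threshold are hypothetical, and the sketch does not reproduce AnABlast's actual scoring or its BLAST parameters.

```python
def alignment_profile(query_length, alignments):
    """Per-position count of alignments covering a query.

    alignments: (start, end) pairs in 1-based inclusive query coordinates,
    e.g. taken from the query start/end columns of a tabular BLAST report.
    """
    diff = [0] * (query_length + 2)
    for start, end in alignments:
        diff[start] += 1      # an alignment begins covering at `start`
        diff[end + 1] -= 1    # and stops covering after `end`
    profile, running = [], 0
    for pos in range(1, query_length + 1):
        running += diff[pos]
        profile.append(running)
    return profile


def peaks(profile, min_hits):
    """Return (start, end) stretches where coverage is at least min_hits."""
    regions, begin = [], None
    for pos, depth in enumerate(profile, start=1):
        if depth >= min_hits and begin is None:
            begin = pos
        elif depth < min_hits and begin is not None:
            regions.append((begin, pos - 1))
            begin = None
    if begin is not None:
        regions.append((begin, len(profile)))
    return regions


# Hypothetical alignments against a 30-residue translated query.
hits = [(5, 15), (8, 18), (10, 20), (25, 28)]
profile = alignment_profile(30, hits)
candidate_regions = peaks(profile, min_hits=2)
```

    Here the three overlapping hits pile up between positions 8 and 18, while the isolated hit at 25 to 28 stays below the threshold; in the approach described above, such pile-ups are what would flag a putative coding region.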

  3. AnABlast: a new in silico strategy for the genome-wide search of novel genes and fossil regions

    PubMed Central

    Jimenez, Juan; Duncan, Caia D. S.; Gallardo, María; Mata, Juan; Perez-Pulido, Antonio J.

    2015-01-01

    Genome annotation, assisted by computer programs, is one of the great advances in modern biology. Nevertheless, the in silico identification of small and complex coding sequences is still challenging. We observed that amino acid sequences inferred from coding (but rarely from non-coding) DNA sequences accumulated alignments in low-stringency BLAST searches, suggesting that this accumulation of alignments could be used to highlight coding regions in sequenced DNA. To investigate this possibility, we developed a computer program (AnABlast) that generates profiles of accumulated alignments in query amino acid sequences using a low-stringency BLAST strategy. To validate this approach, all six-frame translations of DNA sequences between every two annotated exons of the fission yeast genome were analysed with AnABlast. AnABlast-generated profiles identified three new copies of known genes, and four new genes supported by experimental evidence. New pseudogenes, ancestral carboxyl- and amino-terminal subtractions, complex gene rearrangements, and ancient fragments of mitDNA and of bacterial origin, were also inferred. Thus, this novel in silico approach provides a powerful tool to uncover new genes, as well as fossil-coding sequences, thus providing insight into the evolutionary history of annotated genomes. PMID:26494834

  4. The vertebrate ancestral repertoire of visual opsins, transducin alpha subunits and oxytocin/vasopressin receptors was established by duplication of their shared genomic region in the two rounds of early vertebrate genome duplications

    PubMed Central

    2013-01-01

    Background Vertebrate color vision is dependent on four major color opsin subtypes: RH2 (green opsin), SWS1 (ultraviolet opsin), SWS2 (blue opsin), and LWS (red opsin). Together with the dim-light receptor rhodopsin (RH1), these form the family of vertebrate visual opsins. Vertebrate genomes contain many multi-membered gene families that can largely be explained by the two rounds of whole genome duplication (WGD) in the vertebrate ancestor (2R) followed by a third round in the teleost ancestor (3R). Related chromosome regions resulting from WGD or block duplications are said to form a paralogon. We describe here a paralogon containing the genes for visual opsins, the G-protein alpha subunit families for transducin (GNAT) and adenylyl cyclase inhibition (GNAI), the oxytocin and vasopressin receptors (OT/VP-R), and the L-type voltage-gated calcium channels (CACNA1-L). Results Sequence-based phylogenies and analyses of conserved synteny show that the above-mentioned gene families, and many neighboring gene families, expanded in the early vertebrate WGDs. This allows us to deduce the following evolutionary scenario: The vertebrate ancestor had a chromosome containing the genes for two visual opsins, one GNAT, one GNAI, two OT/VP-Rs and one CACNA1-L gene. This chromosome was quadrupled in 2R. Subsequent gene losses resulted in a set of five visual opsin genes, three GNAT and GNAI genes, six OT/VP-R genes and four CACNA1-L genes. These regions were duplicated again in 3R resulting in additional teleost genes for some of the families. Major chromosomal rearrangements have taken place in the teleost genomes. By comparison with the corresponding chromosomal regions in the spotted gar, which diverged prior to 3R, we could time these rearrangements to post-3R. Conclusions We present an extensive analysis of the paralogon housing the visual opsin, GNAT and GNAI, OT/VP-R, and CACNA1-L gene families. 
    The combined data imply that the early vertebrate WGD events contributed to the

  5. Seed colour loci, homoeology and linkage groups of the C genome chromosomes revealed in Brassica rapa–B. oleracea monosomic alien addition lines

    PubMed Central

    Heneen, Waheeb K.; Geleta, Mulatu; Brismar, Kerstin; Xiong, Zhiyong; Pires, J. Chris; Hasterok, Robert; Stoute, Andrew I.; Scott, Roderick J.; King, Graham J.; Kurup, Smita

    2012-01-01

    Background and Aims Brassica rapa and B. oleracea are the progenitors of oilseed rape B. napus. The addition of each chromosome of B. oleracea to the chromosome complement of B. rapa results in a series of monosomic alien addition lines (MAALs). Analysis of MAALs determines which B. oleracea chromosomes carry genes controlling specific phenotypic traits, such as seed colour. Yellow-seeded oilseed rape is a desirable breeding goal both for food and livestock feed end-uses that relate to oil, protein and fibre contents. The aims of this study included developing a missing MAAL to complement an available series, for studies on seed colour control, chromosome homoeology and assignment of linkage groups to B. oleracea chromosomes. Methods A new batch of B. rapa–B. oleracea aneuploids was produced to generate the missing MAAL. Seed colour and other plant morphological features relevant to differentiation of MAALs were recorded. For chromosome characterization, Snow's carmine, fluorescence in situ hybridization (FISH) and genomic in situ hybridization (GISH) were used. Key Results The final MAAL was developed. Morphological traits that differentiated the MAALs comprised cotyledon number, leaf morphology, flower colour and seed colour. Seed colour was controlled by major genes on two B. oleracea chromosomes and minor genes on five other chromosomes of this species. Homoeologous pairing was largely between chromosomes with similar centromeric positions. FISH, GISH and a parallel microsatellite marker analysis defined the chromosomes in terms of their linkage groups. Conclusions A complete set of MAALs is now available for genetic, genomic, evolutionary and breeding perspectives. Defining chromosomes that carry specific genes, physical localization of DNA markers and access to established genetic linkage maps contribute to the integration of these approaches, manifested in the confirmed correspondence of linkage groups with specific chromosomes. 
    Applications include marker

  6. Mouse models of Down syndrome: how useful can they be? Comparison of the gene content of human chromosome 21 with orthologous mouse genomic regions.

    PubMed

    Gardiner, Katheleen; Fortna, Andrew; Bechtel, Lawrence; Davisson, Muriel T

    2003-10-30

    With an incidence of approximately 1 in 700 live births, Down syndrome (DS) remains the most common genetic cause of mental retardation. The phenotype is assumed to be due to overexpression of some number of the >300 genes encoded by human chromosome 21. Mouse models, in particular the chromosome 16 segmental trisomies, Ts65Dn and Ts1Cje, are indispensable for DS-related studies of gene-phenotype correlations. Here we compare the updated gene content of the finished sequence of human chromosome 21 (364 genes and putative genes) with the gene content of the homologous mouse genomic regions (291 genes and putative genes) obtained from annotation of the public sector C57Bl/6 draft sequence. Annotated genes fall into one of three classes. First, there are 170 highly conserved, human/mouse orthologues. Second, there are 83 minimally conserved, possible orthologues. Included among the conserved and minimally conserved genes are 31 antisense transcripts. Third, there are species-specific genes: 111 spliced human transcripts show no orthologues in the syntenic mouse regions although 13 have homologous sequences elsewhere in the mouse genomic sequence, and 38 spliced mouse transcripts show no identifiable human orthologues. While these species-specific genes are largely based solely on spliced EST data, a majority can be verified in RNA expression experiments. In addition, preliminary data suggest that many human-specific transcripts may represent a novel class of primate-specific genes. Lastly, updated functional annotation of orthologous genes indicates genes encoding components of several cellular pathways are dispersed throughout the orthologous mouse chromosomal regions and are not completely represented in the Down syndrome segmental mouse models. Together, these data point out the potential for existing mouse models to produce extraneous phenotypes and to fail to produce DS-relevant phenotypes. PMID:14585506

  7. Comparative Complete Genome Analysis of Chicken and Turkey Megriviruses (Family Picornaviridae): Long 3′ Untranslated Regions with a Potential Second Open Reading Frame and Evidence for Possible Recombination

    PubMed Central

    Boros, Ákos; Pankovics, Péter; Knowles, Nick J.; Nemes, Csaba; Delwart, Eric

    2014-01-01

    ABSTRACT Members of the family Picornaviridae consist of small positive-sense single-stranded RNA (+ssRNA) viruses capable of infecting various vertebrate species, including birds. One of the recently identified avian picornaviruses, with a remarkably long (>9,040-nucleotide) but still incompletely sequenced genome, is turkey hepatitis virus 1 (THV-1; species Melegrivirus A, genus Megrivirus), a virus associated with liver necrosis and enteritis in commercial turkeys (Meleagris gallopavo). This report presents the results of the genetic analysis of three complete genomes of megriviruses from fecal samples of chickens (chicken/B21-CHV/2012/HUN, GenBank accession no. KF961186, and chicken/CHK-IV-CHV/2013/HUN, GenBank accession no. KF961187) (Gallus gallus domesticus) and turkey (turkey/B407-THV/2011/HUN, GenBank accession no. KF961188) (Meleagris gallopavo) with the largest picornavirus genome (up to 9,739 nucleotides) so far described. The close phylogenetic relationship to THV-1 in the nonstructural protein-coding genome region and possession of the same internal ribosomal entry site type (IVB-like) suggest that the study strains belong to the genus Megrivirus. However, the genome comparisons revealed numerous unique variations (e.g., different numbers of potential 2A peptides, unusually long 3′ genome parts with various lengths of a potential second open reading frame, and multiple repeating sequence motifs in the 3′ untranslated region) and heterogeneous sequence relationships between the structural and nonstructural genome regions. These differences suggest the classification of chicken megrivirus-like viruses into a candidate novel species in the genus Megrivirus. Based on the different phylogenetic positions of chicken megrivirus-like viruses at the structural and nonstructural genome regions, the recombinant nature of these viruses is plausible. IMPORTANCE The comparative genome analysis of turkey and novel chicken megriviruses revealed numerous unique

  8. Genomic analysis of a 1 Mb region near the telomere of Hessian fly chromosome X2 and avirulence gene vH13

    PubMed Central

    Lobo, Neil F; Behura, Susanta K; Aggarwal, Rajat; Chen, Ming-Shun; Collins, Frank H; Stuart, Jeff J

    2006-01-01

    Background To gain insight into the Mayetiola destructor (Hessian fly) genome, we performed an in silico comparative genomic analysis utilizing genetic mapping, genomic sequence and EST sequence data along with data available from public databases. Results Chromosome walking and FISH were utilized to identify a contig of 50 BAC clones near the telomere of the short arm of Hessian fly chromosome X2 and near the avirulence gene vH13. These clones enabled us to correlate physical and genetic distance in this region of the Hessian fly genome. Sequence data from these BAC ends, encompassing a 760 kb region, and a fully sequenced and assembled 42.6 kb BAC clone were utilized to perform a comparative genomic study. In silico gene prediction combined with BLAST analyses was used to determine putative orthology to the sequenced dipteran genomes of the fruit fly, Drosophila melanogaster, and the malaria mosquito, Anopheles gambiae, and to infer evolutionary relationships. Conclusion This initial effort enables us to advance our understanding of the structure, composition and evolution of the genome of this important agricultural pest and is an invaluable tool for a whole genome sequencing effort. PMID:16412254

  9. Genome-Wide Mapping of 5mC and 5hmC Identified Differentially Modified Genomic Regions in Late-Onset Severe Preeclampsia: A Pilot Study

    PubMed Central

    Zhu, Lisha; Lv, Ruitu; Kong, Lingchun; Cheng, Haidong; Lan, Fei; Li, Xiaotian

    2015-01-01

    Preeclampsia (PE) is a leading cause of perinatal morbidity and mortality. However, the etiology of late-onset PE, a common form of PE, remains elusive. We analyzed 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) levels in the placentas of late-onset severe PE patients (n = 4) and normal controls (n = 4) using a (hydroxy)methylated DNA immunoprecipitation approach combined with deep sequencing ([h]MeDIP-seq), and the results were verified by (h)MeDIP-qPCR. The most significant differentially methylated regions (DMRs) were verified by MassARRAY EpiTYPER in an enlarged sample size (n = 20). Bioinformatics analysis identified 714 peaks of 5mC that were associated with 403 genes and 119 peaks of 5hmC that were associated with 61 genes, thus showing significant differences between the PE patients and the controls (>2-fold, p<0.05). Further, only one gene, PTPRN2, had both 5mC and 5hmC changes in patients. The ErbB signaling pathway was enriched in those 403 genes that had significantly different 5mC levels between the groups. This genome-wide mapping of 5mC and 5hmC in late-onset severe PE and normal controls demonstrates that both 5mC and 5hmC play epigenetic roles in the regulation of the disease, but work independently. We reveal the genome-wide mapping of DNA methylation and DNA hydroxymethylation in late-onset PE placentas for the first time, and the identified ErbB signaling pathway and the gene PTPRN2 may be relevant to the epigenetic pathogenesis of late-onset PE. PMID:26214307

  10. Genome-Wide Mapping of 5mC and 5hmC Identified Differentially Modified Genomic Regions in Late-Onset Severe Preeclampsia: A Pilot Study.

    PubMed

    Zhu, Lisha; Lv, Ruitu; Kong, Lingchun; Cheng, Haidong; Lan, Fei; Li, Xiaotian

    2015-01-01

    Preeclampsia (PE) is a leading cause of perinatal morbidity and mortality. However, the etiology of late-onset PE, a common form of PE, remains elusive. We analyzed 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) levels in the placentas of late-onset severe PE patients (n = 4) and normal controls (n = 4) using a (hydroxy)methylated DNA immunoprecipitation approach combined with deep sequencing ([h]MeDIP-seq), and the results were verified by (h)MeDIP-qPCR. The most significant differentially methylated regions (DMRs) were verified by MassARRAY EpiTYPER in an enlarged sample size (n = 20). Bioinformatics analysis identified 714 peaks of 5mC that were associated with 403 genes and 119 peaks of 5hmC that were associated with 61 genes, thus showing significant differences between the PE patients and the controls (>2-fold, p<0.05). Further, only one gene, PTPRN2, had both 5mC and 5hmC changes in patients. The ErbB signaling pathway was enriched in those 403 genes that had significantly different 5mC levels between the groups. This genome-wide mapping of 5mC and 5hmC in late-onset severe PE and normal controls demonstrates that both 5mC and 5hmC play epigenetic roles in the regulation of the disease, but work independently. We reveal the genome-wide mapping of DNA methylation and DNA hydroxymethylation in late-onset PE placentas for the first time, and the identified ErbB signaling pathway and the gene PTPRN2 may be relevant to the epigenetic pathogenesis of late-onset PE. PMID:26214307

  11. Genome organization and transcription strategy in the complex GNS-L intergenic region of bovine ephemeral fever rhabdovirus.

    PubMed

    McWilliam, S M; Kongsuwan, K; Cowley, J A; Byrne, K A; Walker, P J

    1997-06-01

    A 1622 nucleotide region of the bovine ephemeral fever virus (BEFV) genome, located between the second glycoprotein (GNS) gene and the polymerase (L) gene, has been cloned and sequenced in Australian (BB7721) and Chinese (Beijing-1) isolates of the virus. In the Australian isolate, the region contains five long open reading frames (ORFs) organized into three coding regions (alpha, beta and gamma), each of which are bound by a consensus transcription initiation and transcription termination-polyadenylation-like sequences. The alpha coding region contains three long ORFs (alpha 1, alpha 2 and alpha 3). The alpha 1 ORF encodes a 10.6 kDa polypeptide which contains hydrophobic and highly basic regions characteristic of a viroporin. The alpha 2 ORF encodes a 13.7 kDa polypeptide and overlaps the alpha 3 ORF which encodes a 5.7 kDa polypeptide. The beta coding region contains a single long ORF encoding a polypeptide of 12.2 kDa. The gamma coding region, which does not occur in Adelaide River virus (ARV), contains a single long ORF encoding a polypeptide of 13.4 kDa. The Chinese isolate shares 91% nucleotide sequence identity with the Australian isolate. The organization of the alpha, beta and gamma coding regions is preserved and the sequences of the encoded polypeptides are similar to those of BB7721. The major transcription products of the region were identified in BB7721 as polycistronic alpha (alpha 1-alpha 2-alpha 3) and beta-gamma mRNAs. Sequence similarities in the BEFV alpha-beta and beta-gamma gene junctions, and the gamma-L and beta-L gene junctions of BEFV and ARV, suggest that the gamma gene may have evolved from the beta-gene by sequence duplication. PMID:9191923

  12. The Effectiveness of Three Regions in Mitochondrial Genome for Aphid DNA Barcoding: A Case in Lachininae

    PubMed Central

    Chen, Rui; Jiang, Li-Yun; Qiao, Ge-Xia

    2012-01-01

    Background The mitochondrial gene COI has been widely used by taxonomists as a standard DNA barcode sequence for the identification of many animal species. However, the COI region is of limited use for identifying certain species and is not efficiently amplified by PCR in all animal taxa. To evaluate the utility of COI as a DNA barcode and to identify other barcode genes, we chose the aphid subfamily Lachninae (Hemiptera: Aphididae) as the focus of our study. We compared the results obtained using COI with two other mitochondrial genes, COII and Cytb. In addition, we propose a new method to improve the efficiency of species identification using DNA barcoding. Methodology/Principal Findings Three mitochondrial genes (COI, COII and Cytb) were sequenced and were used in the identification of over 80 species of Lachninae. The COI and COII genes demonstrated a greater PCR amplification efficiency than Cytb. Species identification using COII sequences had a higher frequency of success (96.9% in “best match” and 90.8% in “best close match”) and yielded lower intra- and higher interspecific genetic divergence values than the other two markers. The use of “tag barcodes” is a new approach that involves attaching a species-specific tag to the standard DNA barcode. With this method, the “barcoding overlap” can be nearly eliminated. As a result, we were able to increase the identification success rate from 83.9% to 95.2% by using COI and the “best close match” technique. Conclusions/Significance A COII-based identification system should be more effective in identifying lachnine species than COI or Cytb. However, the Cytb gene is an effective marker for the study of aphid population genetics due to its high sequence diversity. Furthermore, the use of “tag barcodes” can improve the accuracy of DNA barcoding identification by reducing or removing the overlap between intra- and inter-specific genetic divergence values. PMID:23056258

  13. Unique and conserved genome regions in Vibrio harveyi and related species in comparison with the shrimp pathogen Vibrio harveyi CAIM 1792.

    PubMed

    Espinoza-Valles, Iliana; Vora, Gary J; Lin, Baochuan; Leekitcharoenphon, Pimlapas; González-Castillo, Adrián; Ussery, Dave; Høj, Lone; Gomez-Gil, Bruno

    2015-09-01

    Vibrio harveyi CAIM 1792 is a marine bacterial strain that causes mortality in farmed shrimp in north-west Mexico, and the identification of virulence genes in this strain is important for understanding its pathogenicity. The aim of this work was to compare the V. harveyi CAIM 1792 genome with related genome sequences to determine their phylogenic relationship and explore unique regions in silico that differentiate this strain from other V. harveyi strains. Twenty-one newly sequenced genomes were compared in silico against the CAIM 1792 genome at nucleotidic and predicted proteome levels. The proteome of CAIM 1792 had higher similarity to those of other V. harveyi strains (78%) than to those of the other closely related species Vibrio owensii (67%), Vibrio rotiferianus (63%) and Vibrio campbellii (59%). Pan-genome ORFans trees showed the best fit with the accepted phylogeny based on DNA-DNA hybridization and multi-locus sequence analysis of 11 concatenated housekeeping genes. SNP analysis clustered 34/38 genomes within their accepted species. The pangenomic and SNP trees showed that V. harveyi is the most conserved of the four species studied and V. campbellii may be divided into at least three subspecies, supported by intergenomic distance analysis. blastp atlases were created to identify unique regions among the genomes most related to V. harveyi CAIM 1792; these regions included genes encoding glycosyltransferases, specific type restriction modification systems and a transcriptional regulator, LysR, reported to be involved in virulence, metabolism, quorum sensing and motility. PMID:26198743

  14. Structural and Functional Divergence of a 1-Mb Duplicated Region in the Soybean (Glycine max) Genome and Comparison to an Orthologous Region from Phaseolus vulgaris

    PubMed Central

    Lin, Jer-Young; Stupar, Robert M.; Hans, Christian; Hyten, David L.; Jackson, Scott A.

    2010-01-01

    Soybean (Glycine max) has undergone at least two rounds of polyploidization, resulting in a paleopolyploid genome that is a mosaic of homoeologous regions. To determine the structural and functional impact of these duplications, we sequenced two ~1-Mb homoeologous regions of soybean, Gm8 and Gm15, derived from the most recent ~13 million year duplication event and the orthologous region from common bean (Phaseolus vulgaris), Pv5. We observed inversions leading to major structural variation and a bias between the two chromosome segments as Gm15 experienced more gene movement (gene retention rate of 81% in Gm15 versus 91% in Gm8) and a nearly twofold increase in the deletion of long terminal repeat (LTR) retrotransposons via solo LTR formation. Functional analyses of Gm15 and Gm8 revealed decreases in gene expression and synonymous substitution rates for Gm15; for instance, transcript levels from Gm8 showed a 38% increase relative to Gm15. Transcriptional divergence of homoeologs was found based on expression patterns among seven tissues and developmental stages. Our results indicate asymmetric evolution between homoeologous regions of soybean as evidenced by structural changes and expression variances of homoeologous genes. PMID:20729383

  15. Role of different regions of the hepatitis C virus genome in the therapeutic response to interferon-based treatment.

    PubMed

    Khaliq, Saba; Latief, Noreen; Jahan, Shah

    2014-01-01

    Hepatitis C virus (HCV) is considered a significant risk factor in HCV-induced liver diseases and development of hepatocellular carcinoma (HCC). Nucleotide substitutions in the viral genome result in its diversification into quasispecies, subtypes and distinct genotypes. Different genotypes vary in their infectivity and immune response due to these nucleotide/amino acid variations. The current combination treatment for HCV infection is pegylated interferon α (PEG-IFN-α) with ribavirin, with a highly variable response rate mainly depending upon the HCV genotype. Genotypes 2 and 3 are found to respond better than genotypes 1 and 4, which are more resistant to IFN-based therapies. Different studies have been conducted worldwide to explore the basis of this difference in therapy response, which identified some putative regions in the HCV genome, especially in Core and NS5a, and to some extent in the E2 region, containing specific sequences in different genotypes that act differently with respect to the IFN response. In the review, we try to summarize the role of HCV proteins and their nucleotide sequences in association with treatment outcome in IFN-based therapy. PMID:23851652

  16. Prospection of genomic regions divergently selected in racing line of Quarter Horses in relation to cutting line.

    PubMed

    Meira, C T; Curi, R A; Farah, M M; de Oliveira, H N; Béltran, N A R; Silva, J A V; Mota, M D S da

    2014-11-01

    Selection of Quarter Horses for different purposes has led to the formation of lines, including racing and cutting horses. The objective of this study was to identify genomic regions divergently selected in the racing line of Quarter Horses in relation to the cutting line by applying relative extended haplotype homozygosity (REHH) analysis, an extension of extended haplotype homozygosity (EHH) analysis, and the fixation index (FST) statistic. A total of 188 horses of both sexes, born between 1985 and 2009 and registered at the Brazilian Association of Quarter Horse Breeders, including 120 of the racing line and 68 of the cutting line, were genotyped using single nucleotide polymorphism arrays. On the basis of 27 genomic regions identified as selection signatures by REHH and FST statistics, functional annotations of genes were made in order to identify those that could have been important during formation of the racing line and that could be used subsequently for the development of selection tools. Genes involved in muscle growth (n=8), skeletal growth (n=10), muscle energy metabolism (n=15), cardiovascular system (n=14) and nervous system (n=23) were identified, including the FKTN, INSR, GYS1, CLCN1, MYLK, SYK, ANG, CNTFR and HTR2B. PMID:25032727

  17. Whole Genome Comparisons Suggest Random Distribution of Mycobacterium ulcerans Genotypes in a Buruli Ulcer Endemic Region of Ghana

    PubMed Central

    Ablordey, Anthony S.; Vandelannoote, Koen; Frimpong, Isaac A.; Ahortor, Evans K.; Amissah, Nana Ama; Eddyani, Miriam; Durnez, Lies; Portaels, Françoise; de Jong, Bouke C.; Leirs, Herwig; Porter, Jessica L.; Mangas, Kirstie M.; Lam, Margaret M. C.; Buultjens, Andrew; Seemann, Torsten; Tobias, Nicholas J.; Stinear, Timothy P.

    2015-01-01

    Efforts to control the spread of Buruli ulcer, an emerging ulcerative skin infection caused by Mycobacterium ulcerans, have been hampered by our poor understanding of reservoirs and transmission. To help address this issue, we compared whole genomes from 18 clinical M. ulcerans isolates from a 30 km² region within the Asante Akim North District, Ashanti region, Ghana, with 15 other M. ulcerans isolates from elsewhere in Ghana and the surrounding countries of Ivory Coast, Togo, Benin and Nigeria. Contrary to our expectations of finding minor DNA sequence variations among isolates representing a single M. ulcerans circulating genotype, we found instead two distinct genotypes. One genotype was closely related to isolates from neighbouring regions of Amansie West and Densu, consistent with the predicted local endemic clone, but the second genotype (separated by 138 single nucleotide polymorphisms [SNPs] from other Ghanaian strains) most closely matched M. ulcerans from Nigeria, suggesting another introduction of M. ulcerans to Ghana, perhaps from that country. Both the exotic genotype and the local Ghanaian genotype displayed highly restricted intra-strain genetic variation, with less than 50 SNP differences across a 5.2 Mbp core genome within each genotype. Interestingly, there was no discernible spatial clustering of genotypes at the local village scale. Interviews revealed no obvious epidemiological links among BU patients who had been infected with identical M. ulcerans genotypes but lived in geographically separate villages. We conclude that M. ulcerans is spread widely across the region, with multiple genotypes present in any one area. These data give us new perspectives on the behaviour of possible reservoirs and subsequent transmission mechanisms of M. ulcerans. These observations also show for the first time that M. ulcerans can be mobilized, introduced to a new area and then spread within a population. Potential reservoirs of M. ulcerans thus might include

  18. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome

    PubMed Central

    Beh, Leslie Y.; Müller, Manuel M.; Muir, Tom W.; Kaplan, Noam; Landweber, Laura F.

    2015-01-01

    A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these “seed” nucleosomes—together with trans-acting factors—may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences. PMID:26330564

  19. Evaluation of Apis mellifera syriaca Levant Region honeybee conservation using Comparative Genome Hybridization

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant Region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, attacks of the Oriental wasp and is resistant to Varroa mites. The A. m. syriaca control r...

  20. Additional stratifications in the equatorial F region at dawn and dusk during geomagnetic storms: Role of electrodynamics

    NASA Astrophysics Data System (ADS)

    Sreeja, V.; Balan, N.; Ravindran, Sudha; Pant, Tarun Kumar; Sridharan, R.; Bailey, G. J.

    2009-08-01

    The role of electrodynamics in producing additional stratifications in the equatorial F region (F3 layer) at dawn and dusk during geomagnetic storms is discussed. Two cases of F3 layer at dawn (0600-0730 LT on 5 October 2000 and 8 December 2000) and one case of F3 layer at dusk (1600-1730 LT on 5 October 2000) are observed, for the first time, by the digital ionosonde at the equatorial station Trivandrum (8.5°N, 77°E; dip ~0.5°N) in India. The unusual F3 layers occurred during the geomagnetic storms and are associated with southward turning of the interplanetary magnetic field Bz, suggesting that an eastward prompt penetration electric field could be the main cause of the F3 layers. The dawn F3 layer on 5 October is modeled using the Sheffield University Plasmasphere-Ionosphere Model by using the E × B drift estimated from the real height variation of the ionospheric peak during the morning period. The model qualitatively reproduces the dawn F3 layer. While the existing F2 layer rapidly drifts upward and forms the F3 layer and topside ledge, a new layer forming at lower heights develops into the normal F2 layer.

  1. Apollo 16 regolith breccias and soils - Recorders of exotic component addition to the Descartes region of the moon

    NASA Technical Reports Server (NTRS)

    Simon, S. B.; Papike, J. J.; Laul, J. C.; Hughes, S. S.; Schmitt, R. A.

    1988-01-01

    Using the subdivision of Apollo 16 regolith breccias into ancient (about 4 Gyr) and younger samples (McKay et al., 1986), with the present-day soils as a third sample, a petrologic and chemical determination of regolith evolution and exotic component addition at the A-16 site was performed. The modal petrologies and mineral and chemical compositions of the regolith breccias in the region are presented. It is shown that the early regolith was composed of fragments of plutonic rocks, impact melt rocks, and minerals and impact glasses. It is found that KREEP lithologies and impact melts formed early in lunar history. The mare components, mainly orange high-TiO2 glass and green low-TiO2 glass, were added to the site after formation of the ancient breccias and prior to the formation of young breccias. The major change in the regolith since the formation of the young breccias is an increase in maturity represented by the formation of fused soil particles with prolonged exposure to micrometeorite impacts.

  2. The Genomic Ancestry of Individuals from Different Geographical Regions of Brazil Is More Uniform Than Expected

    PubMed Central

    Pena, Sérgio D. J.; Di Pietro, Giuliano; Fuchshuber-Moraes, Mateus; Genro, Julia Pasqualini; Hutz, Mara H.; Kehdy, Fernanda de Souza Gomes; Kohlrausch, Fabiana; Magno, Luiz Alexandre Viana; Montenegro, Raquel Carvalho; Moraes, Manoel Odorico; de Moraes, Maria Elisabete Amaral; de Moraes, Milene Raiol; Ojopi, Élida B.; Perini, Jamila A.; Racciopi, Clarice; Ribeiro-dos-Santos, Ândrea Kely Campos; Rios-Santos, Fabrício; Romano-Silva, Marco A.; Sortica, Vinicius A.; Suarez-Kurtz, Guilherme

    2011-01-01

    Based on pre-DNA racial/color methodology, clinical and pharmacological trials have traditionally considered the different geographical regions of Brazil as being very heterogeneous. We wished to ascertain how such diversity of regional color categories correlated with ancestry. Using a panel of 40 validated ancestry-informative insertion-deletion DNA polymorphisms we estimated individually the European, African and Amerindian ancestry components of 934 self-categorized White, Brown or Black Brazilians from the four most populous regions of the Country. We unraveled great ancestral diversity between and within the different regions. Especially, color categories in the northern part of Brazil diverged significantly in their ancestry proportions from their counterparts in the southern part of the Country, indicating that diverse regional semantics were being used in the self-classification as White, Brown or Black. To circumvent these regional subjective differences in color perception, we estimated the general ancestry proportions of each of the four regions in a form independent of color considerations. For that, we multiplied the proportions of a given ancestry in a given color category by the official census information about the proportion of that color category in the specific region, to arrive at a “total ancestry” estimate. Once such a calculation was performed, there emerged a much higher level of uniformity than previously expected. In all regions studied, the European ancestry was predominant, with proportions ranging from 60.6% in the Northeast to 77.7% in the South. We propose that the immigration of six million Europeans to Brazil in the 19th and 20th centuries - a phenomenon described and intended as the “whitening of Brazil” - is in large part responsible for dissipating previous ancestry dissimilarities that reflected region-specific population histories. These findings, of both clinical and sociological importance for Brazil, should also be
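
    The "total ancestry" calculation described above is a census-weighted sum: for each region, the proportion of a given ancestry within each color category is multiplied by that category's share of the regional population, and the products are added. A minimal Python sketch follows, assuming a simple dictionary layout; the function name `total_ancestry` and all numbers are hypothetical illustrations, not data or code from the study.

```python
def total_ancestry(ancestry_by_color, color_share):
    """Weight each color category's ancestry proportions by its census share.

    ancestry_by_color: {color: {ancestry: proportion within that color}}
    color_share:       {color: proportion of that color in the region}
    """
    totals = {}
    for color, share in color_share.items():
        for ancestry, proportion in ancestry_by_color[color].items():
            totals[ancestry] = totals.get(ancestry, 0.0) + share * proportion
    return totals


# Hypothetical illustrative numbers for one region (NOT from the study).
ancestry_by_color = {
    "White": {"European": 0.80, "African": 0.10, "Amerindian": 0.10},
    "Brown": {"European": 0.60, "African": 0.25, "Amerindian": 0.15},
    "Black": {"European": 0.40, "African": 0.50, "Amerindian": 0.10},
}
color_share = {"White": 0.5, "Brown": 0.4, "Black": 0.1}

result = total_ancestry(ancestry_by_color, color_share)
for ancestry, value in result.items():
    print(f"{ancestry}: {value:.2f}")
```

    Because the census shares sum to 1 and each category's ancestry proportions sum to 1, the weighted totals also sum to 1, which is what makes the regional estimates comparable regardless of how color categories are used locally.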

  3. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments

    PubMed Central

    2011-01-01

    Background The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly-enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Results Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. Conclusions This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. It will serve as a basic reference to functional and

  4. Genetic region characterization (Gene RECQuest) - software to assist in identification and selection of candidate genes from genomic regions

    PubMed Central

    Sadasivam, Rajani S; Sundar, Gayathri; Vaughan, Laura K; Tanik, Murat M; Arnett, Donna K

    2009-01-01

    Background The availability of research platforms like the web tools of the National Center for Biotechnology Information (NCBI) has transformed the time-consuming task of identifying candidate genes from genetic studies to an interactive process where data from a variety of sources are obtained to select likely genes for follow-up. This process presents its own set of challenges, as the genetic researcher has to interact with several tools in a time-intensive, manual, and cumbersome manner. We developed a method and implemented an effective software system to address these challenges by multidisciplinary efforts of professional software developers with domain experts. The method presented in this paper, Gene RECQuest, simplifies the interaction with existing research platforms through the use of advanced integration technologies. Findings Gene RECQuest is a web-based application that assists in the identification of candidate genes from linkage and association studies using information from Online Mendelian Inheritance in Man (OMIM) and PubMed. To illustrate the utility of Gene RECQuest we used it to identify genes physically located within a linkage region as potential candidate genes for a quantitative trait locus (QTL) for very low density lipoprotein (VLDL) response on chromosome 18. Conclusion Gene RECQuest provides a tool which enables researchers to easily identify and organize literature supporting their own expertise and make informed decisions. It is important to note that Gene RECQuest is a data acquisition and organization software, and not a data analysis method. PMID:19793396

  5. Draft Genome Sequence of Microvirga vignae Strain BR 3299T, a Novel Symbiotic Nitrogen-Fixing Alphaproteobacterium Isolated from a Brazilian Semiarid Region

    PubMed Central

    Edson Zilli, Jerri; Ribeiro Passos, Samuel; Leite, Jakson; Ribeiro Xavier, Gustavo; Gouvea Rumjaneck, Norma

    2015-01-01

    Microvirga vignae is a recently described species of root-nodule bacteria isolated from cowpeas grown in a Brazilian semiarid region. We report here the 6.4-Mb draft genome sequence and annotation of M. vignae type strain BR 3299. This genome information may help to understand the mechanisms underlying the ability of the organism to grow under drought and high-temperature conditions. PMID:26159523

  6. 3.6-Mb genomic and YAC physical map of the Down syndrome chromosome region on chromosome 21

    SciTech Connect

    Dufresne-Zacharia, M.C.; Dahmane, N.; Theophile, D.; Orti, R.; Chettouh, Z.; Sinet, P.M.; Delabar, J.M.

    1994-02-01

    The Down syndrome chromosome region (DCR) on chromosome 21 has been shown to contain a gene(s) important in the pathogenesis of Down syndrome. The authors constructed a long-range restriction map of the D21S55-D21S65 region covering the proximal part of the DCR. Pulsed-field gel electrophoresis of lymphocyte DNA digested with three rare-cutting enzymes, NotI, NruI, and MluI, was used to establish two physical linkage groups of 5 and 7 markers, respectively, spanning 4.6 Mb on the NotI map. Mapping analysis of 40 YACs allowed the selection of 13 YACs covering 95% of the D21S55-D21S65 region and spanning 3.6 Mb. The restriction maps of these YACs and their positioning on the genomic map allowed 19 markers to be ordered, including 4 NotI linking clones, 9 polymorphic markers, the CBR gene, and the AML1 gene. The distances between markers could also be estimated. This physical map and the location of eight NotI sites between D21S55 and D21S17 should facilitate the isolation of previously unidentified genes in this region. 34 refs., 2 figs., 2 tabs.

  7. Familiarity to a Feed Additive Modulates Its Effects on Brain Responses in Reward and Memory Regions in the Pig Model.

    PubMed

    Val-Laillet, David; Meurice, Paul; Clouard, Caroline

    2016-01-01

    Brain responses to feed flavors with or without a feed additive (FA) were investigated in piglets familiarized or not with this FA. Sixteen piglets were allocated to 2 dietary treatments from weaning until d 37: the naive group (NAI) received a standard control feed and the familiarized group (FAM) received the same feed added with a FA mainly made of orange extracts. Animals were subjected to a feed transition at d 16 post-weaning, and to 2-choice feeding tests at d 16 and d 23. Production traits of the piglets were assessed up to d 28 post-weaning. From d 26 onwards, animals underwent 2 brain imaging sessions (positron emission tomography of 18FDG) under anesthesia to investigate the brain activity triggered by the exposure to the flavors of the feed with (FA) or without (C) the FA. Images were analyzed with SPM8 and a region of interest (ROI)-based small volume correction (p < 0.05, k ≥ 25 voxels per cluster). The brain ROI were selected upon their role in sensory evaluation, cognition and reward, and included the prefrontal cortex, insular cortex, fusiform gyrus, limbic system and corpus striatum. The FAM animals showed a moderate preference for the novel post-transition FA feed compared to the C feed on d 16, i.e., day of the feed transition (67% of total feed intake). The presence or absence of the FA in the diet from weaning had no impact on body weight, average daily gain, and feed efficiency of the animals over the whole experimental period (p ≥ 0.10). Familiar feed flavors activated the prefrontal cortex. The amygdala, insular cortex, and prepyriform area were only activated in familiarized animals exposed to the FA feed flavor. The perception of FA feed flavor in the familiarized animals activated the dorsal striatum differently than the perception of the C feed flavor in naive animals. Our data demonstrated that the perception of FA in familiarized individuals induced different brain responses in regions involved in reward anticipation and

  8. Genome sequence of foot-and-mouth disease virus outside the 3A region is also responsible for virus replication in bovine cells.

    PubMed

    Ma, Xueqing; Li, Pinghua; Sun, Pu; Lu, Zengjun; Bao, Huifang; Bai, Xingwen; Fu, Yuanfang; Cao, Yimei; Li, Dong; Chen, Yingli; Qiao, Zilin; Liu, Zaixin

    2016-07-15

    The deletion of residues 93-102 in non-structural protein 3A of foot-and-mouth disease virus (FMDV) is associated with the inability of FMDV to grow in bovine cells and attenuated virulence in cattle. However, a previously reported FMDV strain, O/HKN/21/70, harboring the 93-102 deletion in 3A protein grew equally well in bovine and swine cells. This suggests that changes in the FMDV genome sequence, in addition to the 93-102 deletion in 3A, may also affect the viral growth phenotype in bovine cells during infection and replication. However, it is unclear which region (inside or outside of the 3A region) influences the FMDV growth phenotype in bovine cells. In this study, to determine the region in the FMDV genome affecting viral growth phenotype in bovine cells, we constructed chimeric FMDVs, rvGZSB-HKN3A and rvHN-HKN3A, by introducing the 3A coding region of O/HKN/21/70 into the context of O/SEA/Mya-98 strain O/GZSB/2011 and O Cathay topotype strain O/HN/CHA/93, respectively, since O/GZSB/2011 containing full-length 3A protein replicated well in bovine and swine cells, and O/HN/CHA/93 harboring 93-102 deletion in 3A protein grew poorly in bovine cells. The chimeric viruses rvGZSB-HKN3A and rvHN-HKN3A displayed growth properties and plaque phenotypes similar to those of the parental viruses rvGZSB and rv-HN in BHK-21 and primary fetal porcine kidney (FPK) cells. However, rvHN-HKN3A and rv-HN replicated poorly in primary fetal bovine kidney (FBK) cells with no visible plaques, and rvGZSB-HKN3A exhibited lower growth rate and smaller plaque size phenotypes than those of the parental virus in FBK cells, but similar growth properties and plaque phenotypes to those of the recombinant viruses harboring 93-102 deletion in 3A. These results demonstrate that differences in the FMDV genome sequence outside the 3A coding region also influence FMDV replication ability in bovine cells. PMID:27094491

  9. Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors

    PubMed Central

    2012-01-01

    Background Transcription factors function by binding different classes of regulatory elements. The Encyclopedia of DNA Elements (ENCODE) project has recently produced binding data for more than 100 transcription factors from about 500 ChIP-seq experiments in multiple cell types. While this large amount of data creates a valuable resource, it is nonetheless overwhelmingly complex and simultaneously incomplete since it covers only a small fraction of all human transcription factors. Results As part of the consortium effort in providing a concise abstraction of the data for facilitating various types of downstream analyses, we constructed statistical models that capture the genomic features of three paired types of regions by machine-learning methods: firstly, regions with active or inactive binding; secondly, those with extremely high or low degrees of co-binding, termed HOT and LOT regions; and finally, regulatory modules proximal or distal to genes. From the distal regulatory modules, we developed computational pipelines to identify potential enhancers, many of which were validated experimentally. We further associated the predicted enhancers with potential target transcripts and the transcription factors involved. For HOT regions, we found a significant fraction of transcription factor binding without clear sequence motifs and showed that this observation could be related to strong DNA accessibility of these regions. Conclusions Overall, the three pairs of regions exhibit intricate differences in chromosomal locations, chromatin features, factors that bind them, and cell-type specificity. Our machine learning approach enables us to identify features potentially general to all transcription factors, including those not included in the data. PMID:22950945

  10. RNA-protein interactions: involvement of NS3, NS5, and 3' noncoding regions of Japanese encephalitis virus genomic RNA.

    PubMed Central

    Chen, C J; Kuo, M D; Chien, L J; Hsu, S L; Wang, Y M; Lin, J H

    1997-01-01

    The mechanism of replication of the flavivirus Japanese encephalitis virus (JEV) is not well known. The structures at the 3' end of the viral genome are highly conserved among divergent flaviviruses, suggesting that they may function as cis-acting signals for RNA replication and, as such, might specifically bind to cellular or viral proteins. UV cross-linking experiments were performed to identify the proteins that bind with the JEV plus-strand 3' noncoding region (NCR). Two proteins, p71 and p110, from JEV-infected but not from uninfected cell extracts were shown to bind specifically to the plus-strand 3' NCR. The quantities of these binding proteins increased during the course of JEV infection and correlated with the levels of JEV RNA synthesis in cell extracts. UV cross-linking coupled with Western blot and immunoprecipitation analysis showed that the p110 and p71 proteins were JEV NS5 and NS3, respectively, which are proposed as components of the RNA replicase. The putative stem-loop structure present within the plus-strand 3' NCR was required for the binding of these proteins. Furthermore, both proteins could interact with each other and form a protein-protein complex in vivo. These findings suggest that the 3' NCR of JEV genomic RNA may form a replication complex together with NS3 and NS5; this complex may be involved in JEV minus-strand RNA synthesis. PMID:9094618

  11. QTLs Regulating the Contents of Antioxidants, Phenolics, and Flavonoids in Soybean Seeds Share a Common Genomic Region

    PubMed Central

    Li, Man-Wah; Muñoz, Nacira B.; Wong, Chi-Fai; Wong, Fuk-Ling; Wong, Kwong-Sen; Wong, Johanna Wing-Hang; Qi, Xinpeng; Li, Kwan-Pok; Ng, Ming-Sin; Lam, Hon-Ming

    2016-01-01

    Soybean seeds are a rich source of phenolic compounds, especially isoflavonoids, which are important nutraceuticals. Our study using 14 wild- and 16 cultivated-soybean accessions shows that seeds from cultivated soybeans generally contain lower total antioxidants compared to their wild counterparts, likely an unintended consequence of domestication or human selection. Using a recombinant inbred population resulting from a wild and a cultivated soybean parent and a bin map approach, we have identified an overlapping genomic region containing major quantitative trait loci (QTLs) that regulate the seed contents of total antioxidants, phenolics, and flavonoids. The QTL for seed antioxidant content contains 14 annotated genes based on the Williams 82 reference genome (Gmax1.01). None of these genes encodes functions that are related to the phenylpropanoid pathway of soybean. However, we found three putative Multidrug And Toxic Compound Extrusion (MATE) transporter genes within this QTL and one adjacent to it (GmMATE1-4). Moreover, we have identified non-synonymous changes between GmMATE1 and GmMATE2, and that GmMATE3 encodes an antisense transcript that expresses in pods. Whether the polymorphisms in GmMATE proteins are major determinants of the antioxidant contents, or whether the antisense transcripts of GmMATE3 play important regulatory roles, awaits further functional investigations. PMID:27379137

  12. QTLs Regulating the Contents of Antioxidants, Phenolics, and Flavonoids in Soybean Seeds Share a Common Genomic Region.

    PubMed

    Li, Man-Wah; Muñoz, Nacira B; Wong, Chi-Fai; Wong, Fuk-Ling; Wong, Kwong-Sen; Wong, Johanna Wing-Hang; Qi, Xinpeng; Li, Kwan-Pok; Ng, Ming-Sin; Lam, Hon-Ming

    2016-01-01

    Soybean seeds are a rich source of phenolic compounds, especially isoflavonoids, which are important nutraceuticals. Our study using 14 wild- and 16 cultivated-soybean accessions shows that seeds from cultivated soybeans generally contain lower total antioxidants compared to their wild counterparts, likely an unintended consequence of domestication or human selection. Using a recombinant inbred population resulting from a wild and a cultivated soybean parent and a bin map approach, we have identified an overlapping genomic region containing major quantitative trait loci (QTLs) that regulate the seed contents of total antioxidants, phenolics, and flavonoids. The QTL for seed antioxidant content contains 14 annotated genes based on the Williams 82 reference genome (Gmax1.01). None of these genes encodes functions that are related to the phenylpropanoid pathway of soybean. However, we found three putative Multidrug And Toxic Compound Extrusion (MATE) transporter genes within this QTL and one adjacent to it (GmMATE1-4). Moreover, we have identified non-synonymous changes between GmMATE1 and GmMATE2, and that GmMATE3 encodes an antisense transcript that expresses in pods. Whether the polymorphisms in GmMATE proteins are major determinants of the antioxidant contents, or whether the antisense transcripts of GmMATE3 play important regulatory roles, awaits further functional investigations. PMID:27379137

  13. Regional signals in the planarian body guide stem cell fate in the presence of genomic instability.

    PubMed

    Peiris, T Harshani; Ramirez, Daniel; Barghouth, Paul G; Ofoha, Udokanma; Davidian, Devon; Weckerle, Frank; Oviedo, Néstor J

    2016-05-15

    Cellular fate decisions are influenced by their topographical location in the adult body. For instance, tissue repair and neoplastic growth are greater in anterior than in posterior regions of adult animals. However, the molecular underpinnings of these regional differences are unknown. We identified a regional switch in the adult planarian body upon systemic disruption of homologous recombination with RNA interference of Rad51. Rad51 knockdown increases DNA double-strand breaks (DSBs) throughout the body, but stem cells react differently depending on their location along the anteroposterior axis. In the presence of extensive DSBs, cells in the anterior part of the body resist death, whereas cells in the posterior region undergo apoptosis. Furthermore, we found that proliferation of cells with DNA damage is induced in the presence of brain tissue and that the retinoblastoma pathway enables overproliferation of cells with DSBs while attending to the demands of tissue growth and repair. Our results implicate both autonomous and non-autonomous mechanisms as key mediators of regional cell behavior and cellular transformation in the adult body. PMID:27013241

  14. Cloning and characterization of human PREB; a gene that maps to a genomic region associated with trisomy 2p syndrome.

    PubMed

    Taylor Clelland, C L; Levy, B; McKie, J M; Duncan, A M; Hirschhorn, K; Bancroft, C

    2000-08-01

    We have isolated the human homolog of a novel rodent gene that may be involved in the regulation of pituitary gene transcription. The human PREB gene encodes a predicted protein of 417 amino acids, exhibiting several sequences characteristic of the WD-motif protein family. PREB transcripts were detected in every human fetal and adult tissue examined, although a great variation in levels of expression was observed. PREB was mapped to human Chromosome 2p23, a region of the genome associated with partial trisomy 2p syndrome. Although variable, the common duplication phenotype includes facial abnormalities, skeletal defects, growth and mental retardation, congenital heart and neural tube defects, and abnormalities of the genitalia. We propose that PREB has a role during human development and that abnormal dosage of this transcription factor may be involved in some of the developmental abnormalities observed in patients with partial trisomy 2p. PMID:10920239

  15. An analysis by restriction enzymes of the genomic structure of the 3' untranslated region of the human estrogen receptor gene.

    PubMed

    Keaveney, M; Neilan, J; Gannon, F

    1989-04-12

    The estrogen receptor gene has a very long 3' untranslated region. As a first step towards the analysis of this structural feature for any functional role, we have cloned the human genomic estrogen receptor gene. Extensive restriction enzyme analysis of this DNA and comparison of the sizes of the DNA fragments obtained with those predicted from published cDNA sequences indicate that the 3' exon extends for at least 4304 bases from base number 2018 in the cDNA to the end of the cDNA. The data also show that the most 3' intron in this gene occurs between bases 1902 and 2018 of the cDNA. PMID:2930778

  16. Origins of the Moken Sea Gypsies inferred from mitochondrial hypervariable region and whole genome sequences.

    PubMed

    Dancause, Kelsey Needham; Chan, Chim W; Arunotai, Narumon Hinshiranan; Lum, J Koji

    2009-02-01

    The origins of the Moken 'Sea Gypsies,' a group of traditionally boat-dwelling nomadic foragers, remain speculative despite previous examinations from linguistic, sociocultural and genetic perspectives. We explored Moken origin(s) and affinities by comparing whole mitochondrial genome and hypervariable segment I sequences from 12 Moken individuals, sampled from four islands of the Mergui Archipelago, to other mainland Asian, Island Southeast Asian (ISEA) and Oceanic populations. These analyses revealed a major (11/12) and a minor (1/12) haplotype in the population, indicating low mitochondrial diversity likely resulting from historically low population sizes, isolation and consequent genetic drift. Phylogenetic analyses revealed close relationships between the major lineage (MKN1) and ISEA, mainland Asian and aboriginal Malay populations, and of the minor lineage (MKN2) to populations from ISEA. MKN1 belongs to a recently defined subclade of the ancient yet localized M21 haplogroup. MKN2 is not closely related to any previously sampled lineages, but has been tentatively assigned to the basal M46 haplogroup that possibly originated among the original inhabitants of ISEA. Our analyses suggest that MKN1 originated within coastal mainland SEA and dispersed into ISEA and rapidly into the Mergui Archipelago within the past few thousand years as a result of climate change induced population pressure. PMID:19158811
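
    The low mitochondrial diversity implied by the 11/12 and 1/12 haplotype split can be made concrete with a standard summary statistic. The sketch below computes Nei's unbiased haplotype diversity, H = n/(n-1) × (1 - Σ p_i²), from haplotype counts. It is offered only as an illustration under that assumption: the abstract does not state which diversity measure the authors used, and the function name `haplotype_diversity` is hypothetical.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from a list of haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    # Sum of squared haplotype frequencies (probability two draws match).
    sum_sq = sum((c / n) ** 2 for c in counts)
    # Small-sample correction n / (n - 1).
    return n / (n - 1) * (1.0 - sum_sq)


# Counts taken from the abstract: 11 individuals with the major haplotype,
# 1 individual with the minor haplotype.
h = haplotype_diversity([11, 1])
print(f"haplotype diversity: {h:.4f}")
```

    A value near 0 means almost every pair of sampled individuals shares a haplotype; a value near 1 means almost every pair differs.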

  17. Genome-Wide Association Studies Identifies Seven Major Regions Responsible for Iron Deficiency Chlorosis in Soybean (Glycine max)

    PubMed Central

    Mamidi, Sujan; Lee, Rian K.; Goos, Jay R.; McClean, Phillip E.

    2014-01-01

    Iron deficiency chlorosis (IDC) is a yield-limiting problem in soybean (Glycine max (L.) Merr) production regions with calcareous soils. A genome-wide association study (GWAS) was performed using a high-density SNP map to discover significant markers, QTL and candidate genes associated with IDC trait variation. A stepwise regression model included eight markers after considering LD between markers, and identified seven major effect QTL on seven chromosomes. Twelve candidate genes known to be associated with iron metabolism mapped near these QTL, supporting the polygenic nature of IDC. A non-synonymous substitution with the highest significance in a major QTL region suggests soybean orthologs of FRE1 on Gm03 is a major gene responsible for trait variation. NAS3, a gene that encodes the enzyme nicotianamine synthase, which synthesizes the iron chelator nicotianamine, also maps to the same QTL region. Disease resistance genes also map to the major QTL, supporting the hypothesis that pathogens compete with the plant for Fe and increase iron deficiency. The markers and the allelic combinations identified here can be further used for marker-assisted selection. PMID:25225893

  18. Genome-wide association studies identifies seven major regions responsible for iron deficiency chlorosis in soybean (Glycine max).

    PubMed

    Mamidi, Sujan; Lee, Rian K; Goos, Jay R; McClean, Phillip E

    2014-01-01

    Iron deficiency chlorosis (IDC) is a yield-limiting problem in soybean (Glycine max (L.) Merr) production regions with calcareous soils. A genome-wide association study (GWAS) was performed using a high-density SNP map to discover significant markers, QTL and candidate genes associated with IDC trait variation. A stepwise regression model included eight markers after considering LD between markers, and identified seven major effect QTL on seven chromosomes. Twelve candidate genes known to be associated with iron metabolism mapped near these QTL, supporting the polygenic nature of IDC. A non-synonymous substitution with the highest significance in a major QTL region suggests soybean orthologs of FRE1 on Gm03 is a major gene responsible for trait variation. NAS3, a gene that encodes the enzyme nicotianamine synthase, which synthesizes the iron chelator nicotianamine, also maps to the same QTL region. Disease resistance genes also map to the major QTL, supporting the hypothesis that pathogens compete with the plant for Fe and increase iron deficiency. The markers and the allelic combinations identified here can be further used for marker-assisted selection. PMID:25225893

  19. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions.

    PubMed

    Kaneko, T; Sato, S; Kotani, H; Tanaka, A; Asamizu, E; Nakamura, Y; Miyajima, N; Hirosawa, M; Sugiura, M; Sasamoto, S; Kimura, T; Hosouchi, T; Matsuno, A; Muraki, A; Nakazaki, N; Naruo, K; Okumura, S; Shimpo, S; Takeuchi, C; Wada, T; Watanabe, A; Yamada, M; Yasuda, M; Tabata, S

    1996-06-30

    The sequence determination of the entire genome of the Synechocystis sp. strain PCC6803 was completed. The total length of the genome finally confirmed was 3,573,470 bp, including the previously reported sequence of 1,003,450 bp from map position 64% to 92% of the genome. The entire sequence was assembled from the sequences of the physical map-based contigs of cosmid clones and of lambda clones and long PCR products which were used for gap-filling. The accuracy of the sequence was guaranteed by analysis of both strands of DNA through the entire genome. The authenticity of the assembled sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA using the assembled sequence data. To predict the potential protein-coding regions, analysis of open reading frames (ORFs), analysis by the GeneMark program and similarity search to databases were performed. As a result, a total of 3,168 potential protein genes were assigned on the genome, in which 145 (4.6%) were identical to reported genes and 1,257 (39.6%) and 340 (10.8%) showed similarity to reported and hypothetical genes, respectively. The remaining 1,426 (45.0%) had no apparent similarity to any genes in databases. Among the potential protein genes assigned, 128 were related to the genes participating in photosynthetic reactions. The sum of the sequences coding for potential protein genes occupies 87% of the genome length. By adding rRNA and tRNA genes, therefore, the genome has a very compact arrangement of protein- and RNA-coding regions. A notable feature on the gene organization of the genome was that 99 ORFs, which showed similarity to transposase genes and could be classified into 6 groups, were found spread all over the genome, and at least 26 of them appeared to remain intact. The result implies that rearrangement of the genome occurred frequently during and after establishment of this species. PMID:8905231
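
    As a rough illustration of what "analysis of open reading frames (ORFs)" involves, the sketch below scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at the first in-frame stop codon. This is a deliberately simplified assumption-laden example, not the procedure used in the study: the study analyzed the whole genome and combined ORF analysis with GeneMark and database similarity searches, whereas this sketch ignores the reverse strand and alternative start codons. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) half-open spans of forward-strand ORFs.

    An ORF here starts at the first ATG seen in a frame and ends at the
    next in-frame stop codon (the stop codon is included in the span).
    min_codons counts codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and keep scanning this frame
    return sorted(orfs)


# Toy sequence for illustration only.
toy = "CCATGAAATTTTAGGG"
spans = find_orfs(toy)
for start, end in spans:
    print(start, end, toy[start:end])
```

    A practical pipeline would also scan the reverse complement and apply a much larger minimum length before treating an ORF as a potential protein-coding region.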

  20. HTLV-1 Integration into Transcriptionally Active Genomic Regions Is Associated with Proviral Expression and with HAM/TSP

    PubMed Central

    Meekings, Kiran N.; Leipzig, Jeremy; Bushman, Frederic D.; Taylor, Graham P.; Bangham, Charles R. M.

    2008-01-01

    Human T-lymphotropic virus type 1 (HTLV-1) causes leukaemia or chronic inflammatory disease in ∼5% of infected hosts. The level of proviral expression of HTLV-1 differs significantly among infected people, even at the same proviral load (proportion of infected mononuclear cells in the circulation). A high level of expression of the HTLV-1 provirus is associated with a high proviral load and a high risk of the inflammatory disease of the central nervous system known as HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). But the factors that control the rate of HTLV-1 proviral expression remain unknown. Here we show that proviral integration sites of HTLV-1 in vivo are not randomly distributed within the human genome but are associated with transcriptionally active regions. Comparison of proviral integration sites between individuals with high and low levels of proviral expression, and between provirus-expressing and provirus non-expressing cells from within an individual, demonstrated that frequent integration into transcription units was associated with an increased rate of proviral expression. An increased frequency of integration sites in transcription units in individuals with high proviral expression was also associated with the inflammatory disease HAM/TSP. By comparing the distribution of integration sites in human lymphocytes infected in short-term cell culture with those from persistent infection in vivo, we infer the action of two selective forces that shape the distribution of integration sites in vivo: positive selection for cells containing proviral integration sites in transcriptionally active regions of the genome, and negative selection against cells with proviral integration sites within transcription units. PMID:18369476

  1. Integrated Pathway-Based Approach Identifies Association between Genomic Regions at CTCF and CACNB2 and Schizophrenia

    PubMed Central

    Zapatka, Marc; Frank, Josef; Witt, Stephanie H.; Mühleisen, Thomas W.; Treutlein, Jens; Strohmaier, Jana; Meier, Sandra; Degenhardt, Franziska; Giegling, Ina; Ripke, Stephan; Leber, Markus; Lange, Christoph; Schulze, Thomas G.; Mössner, Rainald; Nenadic, Igor; Sauer, Heinrich; Rujescu, Dan; Maier, Wolfgang; Børglum, Anders; Ophoff, Roel; Cichon, Sven; Nöthen, Markus M.; Rietschel, Marcella; Mattheisen, Manuel; Brors, Benedikt

    2014-01-01

    In the present study, an integrated hierarchical approach was applied to: (1) identify pathways associated with susceptibility to schizophrenia; (2) detect genes that may be potentially affected in these pathways since they contain an associated polymorphism; and (3) annotate the functional consequences of such single-nucleotide polymorphisms (SNPs) in the affected genes or their regulatory regions. The Global Test was applied to detect schizophrenia-associated pathways using discovery and replication datasets comprising 5,040 and 5,082 individuals of European ancestry, respectively. Information concerning functional gene-sets was retrieved from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and the Molecular Signatures Database. Fourteen of the gene-sets or pathways identified in the discovery dataset were confirmed in the replication dataset. These include functional processes involved in transcriptional regulation and gene expression, synapse organization, cell adhesion, and apoptosis. For two genes, i.e. CTCF and CACNB2, evidence for association with schizophrenia was available (at the gene-level) in both the discovery study and published data from the Psychiatric Genomics Consortium schizophrenia study. Furthermore, these genes mapped to four of the 14 presently identified pathways. Several of the SNPs assigned to CTCF and CACNB2 have potential functional consequences, and a gene in close proximity to CACNB2, i.e. ARL5B, was identified as a potential gene of interest. Application of the present hierarchical approach thus allowed: (1) identification of novel biological gene-sets or pathways with potential involvement in the etiology of schizophrenia, as well as replication of these findings in an independent cohort; (2) detection of genes of interest for future follow-up studies; and (3) the highlighting of novel genes in previously reported candidate regions for schizophrenia. PMID:24901509

  2. Integrated pathway-based approach identifies association between genomic regions at CTCF and CACNB2 and schizophrenia.

    PubMed

    Juraeva, Dilafruz; Haenisch, Britta; Zapatka, Marc; Frank, Josef; Witt, Stephanie H; Mühleisen, Thomas W; Treutlein, Jens; Strohmaier, Jana; Meier, Sandra; Degenhardt, Franziska; Giegling, Ina; Ripke, Stephan; Leber, Markus; Lange, Christoph; Schulze, Thomas G; Mössner, Rainald; Nenadic, Igor; Sauer, Heinrich; Rujescu, Dan; Maier, Wolfgang; Børglum, Anders; Ophoff, Roel; Cichon, Sven; Nöthen, Markus M; Rietschel, Marcella; Mattheisen, Manuel; Brors, Benedikt

    2014-06-01

    In the present study, an integrated hierarchical approach was applied to: (1) identify pathways associated with susceptibility to schizophrenia; (2) detect genes that may be potentially affected in these pathways since they contain an associated polymorphism; and (3) annotate the functional consequences of such single-nucleotide polymorphisms (SNPs) in the affected genes or their regulatory regions. The Global Test was applied to detect schizophrenia-associated pathways using discovery and replication datasets comprising 5,040 and 5,082 individuals of European ancestry, respectively. Information concerning functional gene-sets was retrieved from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and the Molecular Signatures Database. Fourteen of the gene-sets or pathways identified in the discovery dataset were confirmed in the replication dataset. These include functional processes involved in transcriptional regulation and gene expression, synapse organization, cell adhesion, and apoptosis. For two genes, i.e. CTCF and CACNB2, evidence for association with schizophrenia was available (at the gene-level) in both the discovery study and published data from the Psychiatric Genomics Consortium schizophrenia study. Furthermore, these genes mapped to four of the 14 presently identified pathways. Several of the SNPs assigned to CTCF and CACNB2 have potential functional consequences, and a gene in close proximity to CACNB2, i.e. ARL5B, was identified as a potential gene of interest. Application of the present hierarchical approach thus allowed: (1) identification of novel biological gene-sets or pathways with potential involvement in the etiology of schizophrenia, as well as replication of these findings in an independent cohort; (2) detection of genes of interest for future follow-up studies; and (3) the highlighting of novel genes in previously reported candidate regions for schizophrenia. PMID:24901509

  3. Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology

    PubMed Central

    Rossin, Elizabeth J.; Lage, Kasper; Raychaudhuri, Soumya; Xavier, Ramnik J.; Tatar, Diana; Benita, Yair

    2011-01-01

    Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein–protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in
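    The permutation idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible null model: random gene sets of the same size drawn from a background list. The abstract does not describe the authors' own permutation schemes, so this is not their method, and every gene name and interaction below is made up for illustration.

```python
import random

def edges_within(genes, interactions):
    """Count interactions whose two partners are both in the gene set."""
    genes = set(genes)
    return sum(1 for a, b in interactions if a in genes and b in genes)

def permutation_p_value(genes, interactions, background, n_perm=1000, seed=0):
    """Empirical p-value for 'this gene set is more connected than chance'.

    Null model (an assumption of this sketch): gene sets of equal size
    sampled uniformly from `background`, which must be a list.
    """
    rng = random.Random(seed)
    observed = edges_within(genes, interactions)
    at_least_as_connected = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(genes))
        if edges_within(sample, interactions) >= observed:
            at_least_as_connected += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_connected + 1) / (n_perm + 1)

# Hypothetical toy network: G1-G3 form a triangle, the rest are sparse.
background = ["G%d" % i for i in range(1, 11)]
interactions = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G7", "G8")]
observed, p_value = permutation_p_value(["G1", "G2", "G3"], interactions, background)
```

    A uniform null like this ignores the fact that some proteins have far more known partners than others, so a real analysis would normally use a permutation that preserves each protein's number of interactions.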

  4. Sequence Level Analysis of Recently Duplicated Regions in the Soybean [Glycine max (L.) Merr.] Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A single recessive gene, rxp, on linkage group (LG) D2 controls bacterial leaf pustule resistance in soybean. Markers linked to rxp were used to develop BAC contigs spanning the Rxp region. We identified two homoeologous contigs (GmA and GmA’) composed of five bacterial artificial chromosomes (BAC...

  5. Genome-wide association study reveals regions associated with gestation length in two pig populations.

    PubMed

    Hidalgo, A M; Lopes, M S; Harlizius, B; Bastiaansen, J W M

    2016-04-01

    Reproduction traits, such as gestation length (GLE), play an important role in dam line breeding in pigs. The objective of our study was to identify single nucleotide polymorphisms (SNPs) that are associated with GLE in two pig populations. Genotypes and deregressed breeding values were available for 2081 Dutch Landrace-based (DL) and 2301 Large White-based (LW) pigs. We identified two QTL regions for GLE, one in each population. For DL, three associated SNPs were detected in one QTL region spanning 0.52 Mbp on Sus scrofa chromosome (SSC) 2. For LW, four associated SNPs were detected in one region of 0.14 Mbp on SSC5. The region on SSC2 contains the heparin-binding EGF-like growth factor (HBEGF) gene, which promotes embryo implantation and has been reported to be involved in embryo survival throughout gestation. The associated SNPs can be used for marker-assisted selection in the studied populations, and further studies of the HBEGF gene are warranted to investigate its role in GLE. PMID:26667091

  6. Avian papillomaviruses: the parrot Psittacus erithacus papillomavirus (PePV) genome has a unique organization of the early protein region and is phylogenetically related to the chaffinch papillomavirus

    PubMed Central

    Tachezy, Ruth; Rector, Annabel; Havelkova, Marta; Wollants, Elke; Fiten, Pierre; Opdenakker, Ghislain; Jenson, A Bennett; Sundberg, John P; Van Ranst, Marc

    2002-01-01

    Background: An avian papillomavirus genome has been cloned from a cutaneous exophytic papilloma from an African grey parrot (Psittacus erithacus). The nucleotide sequence, genome organization, and phylogenetic position of the Psittacus erithacus papillomavirus (PePV) were determined. This PePV sequence represents the first complete avian papillomavirus genome defined. Results: The PePV genome (7304 basepairs) differs from other papillomaviruses, in that it has a unique organization of the early protein region lacking classical E6 and E7 open reading frames. Phylogenetic comparison of the PePV sequence with partial E1 and L1 sequences of the chaffinch (Fringilla coelebs) papillomavirus (FPV) reveals that these two avian papillomaviruses form a monophyletic cluster with a common branch that originates near the unresolved center of the papillomavirus evolutionary tree. Conclusions: The PePV genome has a unique layout of the early protein region which represents a novel prototypic genomic organization for avian papillomaviruses. The close relationship between PePV and FPV, and between their Psittaciformes and Passeriformes hosts, supports the hypothesis that papillomaviruses have co-evolved and speciated together with their host species throughout evolution. PMID:12110158

  7. Potential non-B DNA regions in the human genome are associated with higher rates of nucleotide mutation and expression variation

    PubMed Central

    Du, Xiangjun; Gertz, E. Michael; Wojtowicz, Damian; Zhabinskaya, Dina; Levens, David; Benham, Craig J.; Schäffer, Alejandro A.; Przytycka, Teresa M.

    2014-01-01

    While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression variation between individuals. This coupling between mutagenicity and tolerance for expression variability of downstream genes may be a result of evolutionary adaptation, which allows reconciling mutagenicity of non-B DNA structures with their location in functionally important regions and their potential regulatory role. PMID:25336616

  8. The complete mitochondrial genome of the grand jackknife clam, Solen grandis (Bivalvia: Solenidae): a novel gene order and unusual non-coding region.

    PubMed

    Yuan, Yang; Li, Qi; Kong, Lingfeng; Yu, Hong

    2012-02-01

    Molluscs in general, and bivalves in particular, exhibit an extraordinary degree of mitochondrial gene order variation when compared with other metazoans. The complete mitochondrial genome of Solen grandis (Bivalvia: Solenidae) was determined using long-PCR and genome walking techniques. The entire mitochondrial genome sequence of S. grandis is 16,784 bp in length, and contains 36 genes including 12 protein-coding genes (atp8 is absent), 2 ribosomal RNAs, and 22 tRNAs. All genes are encoded on the same strand. Compared with other species, it bears a novel gene order. Besides these, we find a peculiar non-coding region of 435 bp with a microsatellite-like (TA)12 element, poly-structures and many hairpin structures. In contrast to the available heterodont mitochondrial genomes from GenBank, the complete mtDNA of S. grandis has the shortest cox3 gene, and the longest atp6, nad4, nad5 genes. PMID:21598108

  9. In situ optical sequencing and structure analysis of a trinucleotide repeat genome region by localization microscopy after specific COMBO-FISH nano-probing

    NASA Astrophysics Data System (ADS)

    Stuhlmüller, M.; Schwarz-Finsterle, J.; Fey, E.; Lux, J.; Bach, M.; Cremer, C.; Hinderhofer, K.; Hausmann, M.; Hildenbrand, G.

    2015-10-01

    Trinucleotide repeat expansions (such as (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as Fragile-X syndrome. How these expansions develop during cell proliferation is not yet clearly understood. Therefore in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5' untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promoter region was condensed similarly to a centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results for the first time demonstrate that in situ chromatin structure measurements on the nanoscale are feasible. With further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated chromatin strand structural changes on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of hereditary diseases.

  10. A novel mitochondrial genome architecture in thrips (Insecta: Thysanoptera): extreme size asymmetry among chromosomes and possible recent control region duplication

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Multi-partite mitochondrial genomes are very rare in animals but have been found previously in two insect orders with highly rearranged genomes, the Phthiraptera (parasitic lice), and the Psocoptera (booklice/barklice). We provide the first report of a multi-partite mitochondrial genome architecture...

  11. Predicting the Effects of Nano-Scale Cerium Additives in Diesel Fuel on Regional-Scale Air Quality

    EPA Science Inventory

    Diesel vehicles are a major source of air pollutant emissions. Fuel additives containing nanoparticulate cerium (nCe) are currently being used in some diesel vehicles to improve fuel efficiency. These fuel additives also reduce fine particulate matter (PM2.5) emissio...

  12. A genome-wide association study of marginal zone lymphoma shows association to the HLA region

    PubMed Central

    Vijai, Joseph; Wang, Zhaoming; Berndt, Sonja I.; Skibola, Christine F.; Slager, Susan L.; de Sanjose, Silvia; Melbye, Mads; Glimelius, Bengt; Bracci, Paige M.; Conde, Lucia; Birmann, Brenda M.; Wang, Sophia S.; Brooks-Wilson, Angela R.; Lan, Qing; de Bakker, Paul I. W.; Vermeulen, Roel C. H.; Portlock, Carol; Ansell, Stephen M.; Link, Brian K.; Riby, Jacques; North, Kari E.; Gu, Jian; Hjalgrim, Henrik; Cozen, Wendy; Becker, Nikolaus; Teras, Lauren R.; Spinelli, John J.; Turner, Jenny; Zhang, Yawei; Purdue, Mark P.; Giles, Graham G.; Kelly, Rachel S.; Zeleniuch-Jacquotte, Anne; Ennas, Maria Grazia; Monnereau, Alain; Bertrand, Kimberly A.; Albanes, Demetrius; Lightfoot, Tracy; Yeager, Meredith; Chung, Charles C.; Burdett, Laurie; Hutchinson, Amy; Lawrence, Charles; Montalvan, Rebecca; Liang, Liming; Huang, Jinyan; Ma, Baoshan; Villano, Danylo J.; Maria, Ann; Corines, Marina; Thomas, Tinu; Novak, Anne J.; Dogan, Ahmet; Liebow, Mark; Thompson, Carrie A.; Witzig, Thomas E.; Habermann, Thomas M.; Weiner, George J.; Smith, Martyn T.; Holly, Elizabeth A.; Jackson, Rebecca D.; Tinker, Lesley F.; Ye, Yuanqing; Adami, Hans-Olov; Smedby, Karin E.; De Roos, Anneclaire J.; Hartge, Patricia; Morton, Lindsay M.; Severson, Richard K.; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; McKay, James; Staines, Anthony; Diver, W. Ryan; Vajdic, Claire M.; Armstrong, Bruce K.; Kricker, Anne; Zheng, Tongzhang; Holford, Theodore R.; Severi, Gianluca; Vineis, Paolo; Ferri, Giovanni M.; Ricco, Rosalia; Miligi, Lucia; Clavel, Jacqueline; Giovannucci, Edward; Kraft, Peter; Virtamo, Jarmo; Smith, Alex; Kane, Eleanor; Roman, Eve; Chiu, Brian C. H.; Fraumeni, Joseph F.; Wu, Xifeng; Cerhan, James R.; Offit, Kenneth; Chanock, Stephen J.; Rothman, Nathaniel; Nieters, Alexandra

    2015-01-01

    Marginal zone lymphoma (MZL) is the third most common subtype of B-cell non-Hodgkin lymphoma. Here we perform a two-stage GWAS of 1,281 MZL cases and 7,127 controls of European ancestry and identify two independent loci near BTNL2 (rs9461741, P=3.95 × 10^-15) and HLA-B (rs2922994, P=2.43 × 10^-9) in the HLA region significantly associated with MZL risk. This is the first evidence that genetic variation in the major histocompatibility complex influences MZL susceptibility. PMID:25569183

  13. A genome-wide linkage scan identifies multiple chromosomal regions influencing serum lipid levels in the population on the Samoan islands

    PubMed Central

    Åberg, Karolina; Dai, Feng; Sun, Guangyun; Keighley, Ember; Indugula, Subba Rao; Bausserman, Linda; Viali, Satupaitea; Tuitele, John; Deka, Ranjan; Weeks, Daniel E.; McGarvey, Stephen T.

    2008-01-01

    Abnormal lipid levels are important risk factors for cardiovascular diseases. We conducted genome-wide variance component linkage analyses to search for loci influencing total cholesterol (TC), LDL, HDL and triglyceride in families residing in American Samoa and Samoa as well as in a combined sample from the two polities. We adjusted the traits for a number of environmental covariates, such as smoking, alcohol consumption, physical activity, and material lifestyle. We found suggestive univariate linkage with log of the odds (LOD) scores > 3 for LDL on 6p21-p12 (LOD 3.13) in Samoa and on 12q21-q23 (LOD 3.07) in American Samoa. Furthermore, in American Samoa on 12q21, we detected genome-wide linkage (LODeq 3.38) to the bivariate trait TC-LDL. Telomeric of this region, on 12q24, we found suggestive bivariate linkage to TC-HDL (LODeq 3.22) in the combined study sample. In addition, we detected suggestive univariate linkage (LOD 1.9–2.93) on chromosomes 4p-q, 6p, 7q, 9q, 11q, 12q, 13q, 15q, 16p, 18q, 19p, 19q and Xq23 and suggestive bivariate linkage (LODeq 2.05–2.62) on chromosomes 6p, 7q, 12p, 12q, and 19p-q. In conclusion, chromosomes 6p and 12q may host promising susceptibility loci influencing lipid levels; however, the low degree of overlap between the three study samples strongly encourages further studies of the lipid-related traits. PMID:18594117

  14. KIF14 is a candidate oncogene in the 1q minimal region of genomic gain in multiple cancers.

    PubMed

    Corson, Timothy W; Huang, Annie; Tsao, Ming-Sound; Gallie, Brenda L

    2005-07-14

    Gain of chromosome 1q31-1q32 is seen in >50% of retinoblastoma and is common in other tumors. To define the minimal 1q region of gain, we determined genomic copy number by quantitative multiplex PCR of 14 sequence tagged sites (STSs) spanning 1q25.3-1q41. The most frequently gained STS at 1q32.1 (71%; 39 of 55 retinoblastoma) defined a 3.06 Mbp minimal region of gain between flanking markers, containing 14 genes. Of these, only KIF14, a putative chromokinesin, was overexpressed in various cancers by real-time RT-PCR. KIF14 mRNA was expressed in 20/22 retinoblastoma samples 100-1000-fold higher than in retina (t-test P=0.00002); cell lines (n=10) had higher levels than tumors (n=12) (P=0.009). KIF14 protein was overexpressed in retinoblastoma tumors and breast cancer cell lines by immunoblot. KIF14 was expressed in 4/4 breast cancer cell lines 31-92-fold higher than in normal breast tissue, in 5/5 medulloblastoma cell lines 22-79-fold higher than in fetal brain, and in 10/22 primary lung tumors 3-34-fold higher than in normal lung. Patients with lung tumors that overexpress KIF14 showed a trend toward decreased survival. KIF14 may thus be important in oncogenesis, and has promise as a prognostic indicator and therapeutic target. PMID:15897902

  15. A fine structure genomic map of the region of 12q13 containing SAS and CDK4

    SciTech Connect

    Linder, C.Y.; Elkahloun, A.G.; Su, Y.A.

    1994-09-01

    We have recently adapted a method, originally described by Rackwitz, to the rapid restriction mapping of multiple cosmid DNA samples. Linearization of the cosmids at the lambda cohesive site using lambda terminase is followed by partial digestion with selected restriction enzymes and hybridization to oligonucleotides specific for the right or left hand termini. Partial digestions are performed in a microtiter plate thus allowing up to 12 cosmid clones to be digested with one restriction enzyme. We have applied this rapid restriction mapping method to cosmids derived from a region of chromosome 12q13 that has recently been shown to be amplified in a variety of cancers including malignant fibrous histiocytoma, fibrosarcoma, liposarcoma, osteosarcoma and brain tumors. A small segment of this amplification unit containing three genes, SAS (a membrane protein), CDK4 (a cyclin dependent kinase) and OS-9 (a recently described cDNA) has been analyzed with the system described above. This fine structure genomic map will be useful for completing the expression map of this region as well as characterizing its pattern of amplification in tumor specimens.
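    The mapping logic described above can be sketched in a few lines, assuming that every fragment detected with an end-specific oligonucleotide runs from that terminus to one restriction site, so that its length is the distance of the site from that end. The function name, the tolerance parameter and the fragment sizes are hypothetical and only illustrate the arithmetic.

```python
def sites_from_partial_digest(left_fragments, right_fragments, total_length, tolerance=0.2):
    """Infer restriction-site positions (kb from the left end).

    left_fragments / right_fragments: sizes (kb) of partial-digest fragments
    detected with the left- or right-end probe. A fragment as long as the
    whole linearized clone is the undigested molecule and carries no site.
    """
    from_left = [f for f in left_fragments if f < total_length]
    # A fragment seen from the right end places its site at total - size.
    from_right = [total_length - f for f in right_fragments if f < total_length]

    merged = []
    for position in sorted(from_left + from_right):
        if merged and abs(position - merged[-1]) <= tolerance:
            # The same site measured from both ends: average the two estimates.
            merged[-1] = (merged[-1] + position) / 2
        else:
            merged.append(position)
    return merged

# Hypothetical 40 kb linearized cosmid digested with one enzyme.
site_positions = sites_from_partial_digest(
    left_fragments=[5.0, 12.0, 40.0],
    right_fragments=[10.0, 28.0, 35.0, 40.0],
    total_length=40.0,
)
```

    Reading the map from both termini gives two independent estimates for most sites, which is why the sketch merges positions that agree within a small tolerance.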

  16. Nucleotide sequence of the 3'-noncoding region of alfalfa mosaic virus RNA 4 and its homology with the genomic RNAs.

    PubMed Central

    Koper-Zwarthoff, E C; Brederode, F T; Walstra, P; Bol, J F

    1979-01-01

    A 226-nucleotide fragment was derived from alfalfa mosaic virus RNA 4 (AlMV RNA 4), the subgenomic messenger for viral coat protein, and its sequence was deduced by in vitro labeling with polynucleotide kinase and application of RNA sequencing techniques. The fragment contains the 3'-terminal 45 nucleotides of the coat protein cistron and the complete 3'-noncoding region of 182 nucleotides. The total length of RNA 4 was calculated to be 881 nucleotides. AlMV RNAs 1, 2 and 3 were elongated with a 3'-terminal poly(A) stretch and subjected to sequence analysis by using a specific primer, reverse transcriptase and chain terminators. This revealed an extensive homology between the 3'-terminal 140 to 150 nucleotides of all four AlMV RNAs. Despite a number of base substitutions, the secondary structure of the homologous region is highly conserved. The observed homology indicates that, as with RNA 4, the sites with a high affinity for the viral coat protein are located at the 3'-termini of the genomic RNAs. PMID:537914

  17. Genomic analysis of a region encompassing QRfs1 and QRfs2: genes that underlie soybean resistance to sudden death syndrome.

    PubMed

    Triwitayakorn, K; Njiti, V N; Iqbal, M J; Yaegashi, S; Town, C; Lightfoot, D A

    2005-02-01

    Candidate genes were identified for two loci, QRfs2 providing resistance to the leaf scorch called soybean (Glycine max (L.) Merr.) sudden death syndrome (SDS) and QRfs1 providing resistance to root infection by the causal pathogen Fusarium solani f.sp. glycines. The 7.5 +/- 0.5 cM region of chromosome 18 (linkage group G) was shown to encompass a cluster of resistance loci using recombination events from 4 near-isogenic line populations and 9 DNA markers. The DNA markers anchored 9 physical map contigs (7 are shown on the soybean Gbrowse, 2 are unpublished), 45 BAC end sequences (41 in Gbrowse), and contiguous DNA sequences of 315, 127, and 110 kbp. Gene density was high at 1 gene per 7 kbp only around the already sequenced regions. Three to 4 gene-rich islands were inferred to be distributed across the entire 7.5 cM or 3.5 Mbp showing that genes are clustered in the soybean genome. Candidate resistance genes were identified and a molecular basis for interactions among the disease resistance genes in the cluster inferred. PMID:15729404

  18. Identification of Genomic Regions and the Isoamylase Gene for Reduced Grain Chalkiness in Rice

    PubMed Central

    Sun, Wenqian; Zhou, Qiaoling; Yao, Yue; Qiu, Xianjin; Xie, Kun; Yu, Sibin

    2015-01-01

    Grain chalkiness is an important grain quality related to starch granules in the endosperm. A high percentage of grain chalkiness is a major problem because it diminishes grain quality in rice. Here, we report quantitative trait loci identification for grain chalkiness using high-throughput single nucleotide polymorphism genotyping of a chromosomal segment substitution line population in which each line carried one or a few introduced japonica cultivar Nipponbare segments in the genetic background of the indica cultivar ZS97. Ten quantitative trait loci regions were commonly identified for the percentage of grain chalkiness and the degree of endosperm chalkiness. The allelic effects at nine of these quantitative trait loci reduced grain chalkiness. Furthermore, a quantitative trait locus (qPGC8-2) on chromosome 8 was validated in a chromosomal segment substitution line–derived segregation population, and had a stable effect on chalkiness in a multiple-environment evaluation of the near-isogenic lines. Residing on the qPGC8-2 region, the isoamylase gene (ISA1) was preferentially expressed in the endosperm and revealed some nucleotide polymorphisms between two varieties, Nipponbare and ZS97. Transgenic lines with suppression of ISA1 by RNA interference produced grains with 20% more chalkiness than the control. The results support that the gene may underlie qPGC8-2 for grain chalkiness. The multiple-environment trials of the near-isogenic lines also show that combination of the favorable alleles such as the ISA1 gene for low chalkiness and the GS3 gene for long grains considerably improved grain quality of ZS97, which proves useful for grain quality improvement in rice breeding programs. PMID:25790260

  19. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

    PubMed

    Besemer, J; Lomsadze, A; Borodovsky, M

    2001-06-15

    Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-coding regions and models of regulatory sites near gene start within an iterative Hidden Markov model based algorithm. The new gene prediction method, called GeneMarkS, utilizes a non-supervised training procedure and can be used for a newly sequenced prokaryotic genome with no prior knowledge of any protein or rRNA genes. The GeneMarkS implementation uses an improved version of the gene finding program GeneMark.hmm, heuristic Markov models of coding and non-coding regions and the Gibbs sampling multiple alignment program. GeneMarkS predicted precisely 83.2% of the translation starts of GenBank annotated Bacillus subtilis genes and 94.4% of translation starts in an experimentally validated set of Escherichia coli genes. We have also observed that GeneMarkS detects prokaryotic genes, in terms of identifying open reading frames containing real genes, with an accuracy matching the level of the best currently used gene detection methods. Accurate translation start prediction, in addition to the refinement of protein sequence N-terminal data, provides the benefit of precise positioning of the sequence region situated upstream to a gene start. Therefore, sequence motifs related to transcription and translation regulatory sites can be revealed and analyzed with higher precision. These motifs were shown to possess a significant variability, the functional and evolutionary connections of which are discussed. PMID:11410670

  20. Shared Genomic Regions Between Derivatives of a Large Segregating Population of Maize Identified Using Bulked Segregant Analysis Sequencing and Traditional Linkage Analysis

    PubMed Central

    Haase, Nicholas J.; Beissinger, Timothy; Hirsch, Candice N.; Vaillancourt, Brieanne; Deshpande, Shweta; Barry, Kerrie; Buell, C. Robin; Kaeppler, Shawn M.; de Leon, Natalia

    2015-01-01

    Delayed transition from the vegetative stage to the reproductive stage of development and increased plant height have been shown to increase biomass productivity in grasses. The goal of this project was to detect quantitative trait loci using extremes from a large synthetic population, as well as a related recombinant inbred line mapping population for these two traits. Ten thousand individuals from a B73 × Mo17 noninbred population intermated for 14 generations (IBM Syn14) were grown at a density of approximately 16,500 plants ha^-1. Flowering time and plant height were measured within this population. DNA was pooled from the 46 most extreme individuals from each distributional tail for each of the traits measured and used in bulk segregant analysis (BSA) sequencing. Allelic divergence at each of the ∼1.1 million SNP loci was estimated as the difference in allele frequencies between the selected extremes. Additionally, 224 intermated B73 × Mo17 recombinant inbred lines were concomitantly grown at a similar density adjacent to the large synthetic population and were assessed for flowering time and plant height. Using the BSA sequencing method, 14 and 13 genomic regions were identified for flowering time and plant height, respectively. Linkage mapping with the RIL population identified eight and three regions for flowering time and plant height, respectively. Of the regions identified, three colocalized between the two populations for flowering time and two colocalized for plant height. This study demonstrates the utility of using BSA sequencing for the dissection of complex quantitative traits important for production of lignocellulosic ethanol. PMID:26038364
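    The allelic divergence measure described in this abstract, the difference in allele frequencies between the two selected extremes, can be sketched as follows. The sketch assumes that allele frequencies are estimated from pooled read counts at each SNP, which the abstract does not spell out; the function names, SNP identifiers and counts are invented for illustration.

```python
def allele_frequency(ref_count, alt_count):
    """Fraction of reads carrying the reference allele, or None without coverage."""
    total = ref_count + alt_count
    if total == 0:
        return None
    return ref_count / total

def allelic_divergence(high_pool, low_pool):
    """Per-SNP allele-frequency difference between two pooled extremes.

    Each pool maps a SNP identifier to (ref_count, alt_count). SNPs missing
    from either pool, or without reads in either pool, are skipped.
    """
    divergence = {}
    for snp in high_pool.keys() & low_pool.keys():
        f_high = allele_frequency(*high_pool[snp])
        f_low = allele_frequency(*low_pool[snp])
        if f_high is None or f_low is None:
            continue
        divergence[snp] = f_high - f_low
    return divergence

# Hypothetical read counts for the tall and short pools.
tall_pool = {"snp1": (30, 10), "snp2": (20, 20), "snp3": (0, 0)}
short_pool = {"snp1": (10, 30), "snp2": (22, 18)}
divergence = allelic_divergence(tall_pool, short_pool)
```

    Values near zero suggest a SNP that is unlinked to the trait, while values far from zero mark candidate regions; in practice the per-SNP values would usually be smoothed along the chromosome before regions are called.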

  1. GENOMIC ANALYSIS OF A 1 MB REGION NEAR THE TELOMERE OF HESSIAN FLY CHROMOSOME X2 AND AVIRULENCE GENE VH13

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chromosome walking and FISH were utilized to identify a contig of 50 BAC clones near the telomere of the short arm of Hessian fly chromosome X2 and near the avirulence gene vH13. These clones enabled us to correlate physical and genetic distance in this region of the Hessian fly genome. Sequence da...

  2. Genome-wide linkage analysis to identify chromosomal regions affecting phenotypic traits in the chicken. I. Growth and average daily gain

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genome scan was used to detect chromosomal regions and QTL that control quantitative traits of economic importance in chickens. Two unique F2 crosses generated from a commercial broiler male line and 2 genetically distinct inbred lines (Leghorn and Fayoumi) were used to identify QTL affecting BW a...

  3. Detection of short protein coding regions within the cyanobacterium genome: application of the hidden Markov model.

    PubMed

    Yada, T; Hirosawa, M

    1996-12-31

    The gene-finding programs developed so far have not paid much attention to the detection of short protein coding regions (CDSs). However, the detection of short CDSs is important for the study of photosynthesis. We utilized GeneHacker, a gene-finding program based on the hidden Markov model (HMM), to detect short CDSs (from 90 to 300 bases) in a 1.0-Mb contiguous sequence of the cyanobacterium Synechocystis sp. strain PCC6803, which carries a complete set of genes for oxygenic photosynthesis. GeneHacker differs from other gene-finding programs based on the HMM in that it utilizes di-codon statistics as well. GeneHacker successfully detected seven out of the eight short CDSs annotated in this sequence and was clearly superior to GeneMark in this range of length. GeneHacker detected 94 potentially new CDSs, 9 of which have counterparts in the genetic databases. Four of the nine CDSs were less than 150 bases and were photosynthesis-related genes. The results show the effectiveness of GeneHacker in detecting very short CDSs corresponding to genes. PMID:9097038

  4. The impact of nutrition on differential methylated regions of the genome.

    PubMed

    Parle-McDermott, Anne; Ozaki, Mari

    2011-11-01

    Nutrition has always played an important role in health and disease, ranging from common diseases to its likely contribution to the fetal origins of adult disease. However, deciphering the molecular details of this role is much more challenging. The impact of nutrition on the methylome, i.e., DNA methylation, has received particular attention in more recent years. Our understanding of the complexity of the methylome is evolving as efforts to catalog the DNA methylation differences that exist between different tissues and individuals continue. We review selected examples of animal and human studies that provide evidence that, in fact, specific genes and DNA methylation sites are subject to change during development and during a lifetime as a direct response to nutrition. Investigation of the methyl donors folate, choline, and methionine provide the most compelling evidence of a role in mediating DNA methylation changes. Although a number of candidate regions/genes have been identified to date, we are just at the beginning in terms of cataloging so-called nutrient-sensitive methylation variable positions in humans. PMID:22332089

  5. Extensive duplication events account for multiple control regions and pseudo-genes in the mitochondrial genome of the velvet worm Metaperipatus inae (Onychophora, Peripatopsidae).

    PubMed

    Braband, Anke; Podsiadlowski, Lars; Cameron, Stephen L; Daniels, Savel; Mayer, Georg

    2010-10-01

    The phylogeny of Onychophora (velvet worms) is unresolved and even the monophyly of the two major onychophoran subgroups, Peripatidae and Peripatopsidae, is uncertain. Previous studies of complete mitochondrial genomes from two onychophoran species revealed two strikingly different gene arrangement patterns, ranging from highly conserved in a representative of Peripatopsidae to highly derived in a species of Peripatidae, suggesting that these data might be informative for clarifying the onychophoran phylogeny. In order to assess the diversity of mitochondrial genomes among onychophorans, we analyzed the complete mitochondrial genome of Metaperipatus inae, a second representative of Peripatopsidae from Chile. Compared to the proposed ancestral gene order in Onychophora, the mitochondrial genome of M. inae shows dramatic rearrangements, although all protein-coding and ribosomal RNA genes are encoded on the same strands as in the ancestral peripatopsid genome. The retained strand affiliation of all protein-coding and ribosomal RNA genes and the occurrence of three control regions and several pseudo-genes suggest that the derived mitochondrial gene arrangement pattern in M. inae evolved by partial genome duplications, followed by a subsequent loss of redundant genes. Our findings, thus, confirm the diversity of the mitochondrial gene arrangement patterns among onychophorans and support their utility for clarifying the phylogeography of Onychophora, in particular of the Peripatopsidae species from South Africa and Chile. PMID:20510379

  6. RNA from an immediate early region of the type 1 herpes simplex virus genome is present in the trigeminal ganglia of latently infected mice

    SciTech Connect

    Deatly, A.M.; Spivack, J.G.; Lavi, E.; Fraser, N.W.

    1987-05-01

    Transcription of the type 1 herpes simplex virus (HSV-1) genome in trigeminal ganglia of latently infected mice was studied using in situ hybridization. Probes representative of each temporal gene class were used to determine the regions of the genome that encode the transcripts present in latently infected cells. Probes encoding HSV-1 sequences of the five immediate early genes and representative early (thymidine kinase), early-late (major capsid protein), and late (glycoprotein C) genes were used in these experiments. Of the probes tested, only those encoding the immediate early gene product infected-cell polypeptide (ICP) 0 hybridized to RNA in latently infected tissues. Probes containing the other immediate early genes (ICP4, ICP22, ICP27, and ICP47) and the representative early, early-late, and late genes did not hybridize. Two probes covering approximately 30% of the HSV-1 genome and encoding over 20 early and late transcripts also did not hybridize to RNA in latently infected tissues. These results, with probes spanning >60% of the HSV-1 genome, suggest that transcription of the HSV-1 genome is restricted to one region in latently infected mouse trigeminal ganglia.

  7. The compact Brachypodium genome conserves centromeric regions of a common ancestor with wheat and rice.

    PubMed

    Qi, Lili; Friebe, Bernd; Wu, Jiajie; Gu, Yongqiang; Qian, Chen; Gill, Bikram S

    2010-11-01

    The evolution of five chromosomes of Brachypodium distachyon from a 12-chromosome ancestor of all grasses by dysploidy raises an interesting question about the fate of redundant centromeres. Three independent but complementary approaches were pursued to study centromeric region homologies among the chromosomes of Brachypodium, wheat, and rice. The genes present in pericentromeres of the basic set of seven chromosomes of wheat and the Triticeae, and the 80 rice centromeric genes spanning the CENH3 binding domain of centromeres 3, 4, 5, 7, and 8 were used as "anchor" markers to identify centromere locations in the B. distachyon chromosomes. A total of 53 B. distachyon bacterial artificial chromosome (BAC) clones anchored by wheat pericentromeric expressed sequence tags (ESTs) were used as probes for BAC-fluorescence in situ hybridization (FISH) analysis of B. distachyon mitotic chromosomes. Integrated sequence alignment and BAC-FISH data were used to determine the approximate positions of active and inactive centromeres in the five B. distachyon chromosomes. The following syntenic relationships of the centromeres for Brachypodium (Bd), rice (R), and wheat (W) were evident: Bd1-R6, Bd2-R5-W1, Bd3-R10, Bd4-R11-W4, and Bd5-R4. Six rice centromeres syntenic to five wheat centromeres were inactive in Brachypodium chromosomes. The conservation of centromere gene synteny among several sets of homologous centromeres of three species indicates that active genes can persist in ancient centromeres with more than 40 million years of shared evolutionary history. Annotation of a BAC contig spanning an inactive centromere in chromosome Bd3 which is syntenic to rice Cen8 and W7 pericentromeres, along with BAC FISH data from inactive centromeres revealed that the centromere inactivation was accompanied by the loss of centromeric retrotransposons and turnover of centromere-specific satellites during Bd chromosome evolution. PMID:20842403

  8. Genomic regions underlying agronomic traits in linseed (Linum usitatissimum L.) as revealed by association mapping.

    PubMed

    Soto-Cerda, Braulio J; Duguid, Scott; Booker, Helen; Rowland, Gordon; Diederichsen, Axel; Cloutier, Sylvie

    2014-01-01

    The extreme climate of the Canadian Prairies poses a major challenge to improve yield. Although it is possible to breed for yield per se, focusing on yield-related traits could be advantageous because of their simpler genetic architecture. The Canadian flax core collection of 390 accessions was genotyped with 464 simple sequence repeat markers, and phenotypic data for nine agronomic traits including yield, bolls per area, 1,000 seed weight, seeds per boll, start of flowering, end of flowering, plant height, plant branching, and lodging collected from up to eight environments was used for association mapping. Based on a mixed model (principal component analysis (PCA) + kinship matrix (K)), 12 significant marker-trait associations for six agronomic traits were identified. Most of the associations were stable across environments as revealed by multivariate analyses. Statistical simulation for five markers associated with 1000 seed weight indicated that the favorable alleles have additive effects. None of the modern cultivars carried the five favorable alleles and the maximum number of four observed in any accessions was mostly in breeding lines. Our results confirmed the complex genetic architecture of yield-related traits and the inherent difficulties associated with their identification while illustrating the potential for improvement through marker-assisted selection. PMID:24138336

  9. 77 FR 5714 - Change of Addresses for Regional Offices, Addition of One New Address, and Correction of Names of...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-06

    ... Order 12045 (62 FR 19885, April 23, 1997), because it is not economically significant. This action does..., Addition of One New Address, and Correction of Names of House and Senate Committees We Must Notify AGENCY...: We, the U.S. Fish and Wildlife Service (we, or the Service), are revising our rights-of-way...

  10. Genomic structure and evolution of the ancestral chromosome fusion site in 2q13-2q14.1 and paralogous regions on other human chromosomes.

    PubMed

    Fan, Yuxin; Linardopoulou, Elena; Friedman, Cynthia; Williams, Eleanor; Trask, Barbara J

    2002-11-01

    Human chromosome 2 was formed by the head-to-head fusion of two ancestral chromosomes that remained separate in other primates. Sequences that once resided near the ends of the ancestral chromosomes are now interstitially located in 2q13-2q14.1. Portions of these sequences had duplicated to other locations prior to the fusion. Here we present analyses of the genomic structure and evolutionary history of >600 kb surrounding the fusion site and closely related sequences on other human chromosomes. Sequence blocks that closely flank the inverted arrays of degenerate telomere repeats marking the fusion site are duplicated at many, primarily subtelomeric, locations. In addition, large portions of a 168-kb centromere-proximal block are duplicated at 9pter, 9p11.2, and 9q13, with 98%-99% average sequence identity. A 67-kb block on the distal side of the fusion site is highly homologous to sequences at 22qter. A third ~100-kb segment is 96% identical to a region in 2q11.2. By integrating data on the extent and similarity of these paralogous blocks, including the presence of phylogenetically informative repetitive elements, with observations of their chromosomal distribution in nonhuman primates, we infer the order of the duplications that led to their current arrangement. Several of these duplicated blocks may be associated with breakpoints of inversions that occurred during primate evolution and of recurrent chromosome rearrangements in humans. PMID:12421751

  11. mGenomeSubtractor: a web-based tool for parallel in silico subtractive hybridization analysis of multiple bacterial genomes.

    PubMed

    Shao, Yucheng; He, Xinyi; Harrison, Ewan M; Tai, Cui; Ou, Hong-Yu; Rajakumar, Kumar; Deng, Zixin

    2010-07-01

    mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/. PMID:20435682
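
    The sliding window-based genome fragmentation mentioned in this abstract can be pictured with a short sketch. This is not mGenomeSubtractor's own code; the function name, the window and step sizes, and the example sequence are hypothetical and chosen only to show how a reference sequence might be cut into overlapping fragments before each fragment is searched against the subject genomes.

```python
from typing import List, Tuple


def sliding_windows(sequence: str, window: int, step: int) -> List[Tuple[int, str]]:
    """Cut a sequence into fixed-length fragments whose starts are `step` apart.

    Returns (start offset, fragment) pairs. Only full-length windows are
    produced, so trailing bases are left uncovered when the step does not
    land on the final window start; a sequence shorter than `window`
    yields one shorter fragment.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(sequence) - window, 0)
    return [
        (start, sequence[start:start + window])
        for start in range(0, last_start + 1, step)
    ]


# Hypothetical 10-base reference, 4-base windows, step of 3 (1-base overlap).
fragments = sliding_windows("ACGTACGTAC", window=4, step=3)
```

    A step smaller than the window gives overlapping fragments, which is what lets short unique sequences within or between genes be located at finer resolution than whole coding sequences.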

  12. Mutation of the RDR1 gene caused genome-wide changes in gene expression, regional variation in small RNA clusters and localized alteration in DNA methylation in rice

    PubMed Central

    2014-01-01

    Background: Endogenous small (sm) RNAs (primarily si- and miRNAs) are important trans/cis-acting regulators involved in diverse cellular functions. In plants, the RNA-dependent RNA polymerases (RDRs) are essential for smRNA biogenesis. It has been established that RDR2 is involved in the 24 nt siRNA-dependent RNA-directed DNA methylation (RdDM) pathway. Recent studies have suggested that RDR1 is involved in a second RdDM pathway that relies mostly on 21 nt smRNAs and functions to silence a subset of genomic loci that are usually refractory to the normal RdDM pathway in Arabidopsis. Whether and to what extent the homologs of RDR1 may have similar functions in other plants remained unknown. Results: We characterized a loss-of-function mutant (Osrdr1) of the OsRDR1 gene in rice (Oryza sativa L.) derived from a retrotransposon Tos17 insertion. Microarray analysis identified 1,175 differentially expressed genes (5.2% of all expressed genes in the shoot-tip tissue of rice) between Osrdr1 and WT, of which 896 and 279 genes were up- and down-regulated, respectively, in Osrdr1. smRNA sequencing revealed regional alterations in smRNA clusters across the rice genome. Some of the regions with altered smRNA clusters were associated with changes in DNA methylation. In addition, altered expression of several miRNAs was detected in Osrdr1, at least some of which were associated with altered expression of predicted miRNA target genes. Despite these changes, no phenotypic difference was identified in Osrdr1 relative to WT under normal conditions; however, ephemeral phenotypic fluctuations occurred under some abiotic stress conditions. Conclusions: Our results showed that OsRDR1 plays a role in regulating a substantial number of endogenous genes with diverse functions in rice through smRNA-mediated pathways involving DNA methylation, and that it participates in abiotic stress response. PMID:24980094

  13. High-throughput engineering of a mammalian genome reveals building principles of methylation states at CG rich regions

    PubMed Central

    Krebs, Arnaud R; Dessus-Babus, Sophie; Burger, Lukas; Schübeler, Dirk

    2014-01-01

    The majority of mammalian promoters are CpG islands; regions of high CG density that require protection from DNA methylation to be functional. Importantly, how sequence architecture mediates this unmethylated state remains unclear. To address this question in a comprehensive manner, we developed a method to interrogate methylation states of hundreds of sequence variants inserted at the same genomic site in mouse embryonic stem cells. Using this assay, we were able to quantify the contribution of various sequence motifs towards the resulting DNA methylation state. Modeling of this comprehensive dataset revealed that CG density alone is a minor determinant of their unmethylated state. Instead, these data argue for a principal role for transcription factor binding sites, a prediction confirmed by testing synthetic mutant libraries. Taken together, these findings establish the hierarchy between the two cis-encoded mechanisms that define the DNA methylation state and thus the transcriptional competence of CpG islands. DOI: http://dx.doi.org/10.7554/eLife.04094.001 PMID:25259795

  14. Characterization of an Essential RNA Secondary Structure in the 3′ Untranslated Region of the Murine Coronavirus Genome

    PubMed Central

    Hsue, Bilan; Hartshorne, Toinette; Masters, Paul S.

    2000-01-01

    We have previously identified a functionally essential bulged stem-loop in the 3′ untranslated region of the positive-stranded RNA genome of mouse hepatitis virus. This 68-nucleotide structure is composed of six stem segments interrupted by five bulges, and its structure, but not its primary sequence, is entirely conserved in the related bovine coronavirus. The functional importance of individual stem segments of this stem-loop was characterized by genetic analysis using targeted RNA recombination. We also examined the effects of stem segment mutations on the replication of mouse hepatitis virus defective interfering RNAs. These studies were complemented by enzymatic and chemical probing of the stem-loop. Taken together, our results confirmed most of the previously proposed structure, but they revealed that the terminal loop and an internal loop are larger than originally thought. Three of the stem segments were found to be essential for viral replication. Further, our results suggest that the stem segment at the base of the stem-loop is an alternative base-pairing structure for part of a downstream, and partially overlapping, RNA pseudoknot that has recently been shown to be necessary for bovine coronavirus replication. PMID:10888630

  15. Genomic organization and evolution of double minutes/homogeneously staining regions with MYC amplification in human cancer.

    PubMed

    L'Abbate, Alberto; Macchia, Gemma; D'Addabbo, Pietro; Lonoce, Angelo; Tolomeo, Doron; Trombetta, Domenico; Kok, Klaas; Bartenhagen, Christoph; Whelan, Christopher W; Palumbo, Orazio; Severgnini, Marco; Cifola, Ingrid; Dugas, Martin; Carella, Massimo; De Bellis, Gianluca; Rocchi, Mariano; Carbone, Lucia; Storlazzi, Clelia Tiziana

    2014-08-01

    The mechanism for generating double minutes chromosomes (dmin) and homogeneously staining regions (hsr) in cancer is still poorly understood. Through an integrated approach combining next-generation sequencing, single nucleotide polymorphism array, fluorescent in situ hybridization and polymerase chain reaction-based techniques, we inferred the fine structure of MYC-containing dmin/hsr amplicons harboring sequences from several different chromosomes in seven tumor cell lines, and characterized an unprecedented number of hsr insertion sites. Local chromosome shattering involving a single-step catastrophic event (chromothripsis) was recently proposed to explain clustered chromosomal rearrangements and genomic amplifications in cancer. Our bioinformatics analyses based on the listed criteria to define chromothripsis led us to exclude it as the driving force underlying amplicon genesis in our samples. Instead, the finding of coexisting heterogeneous amplicons, differing in their complexity and chromosome content, in cell lines derived from the same tumor indicated the occurrence of a multi-step evolutionary process in the genesis of dmin/hsr. Our integrated approach allowed us to gather a complete view of the complex chromosome rearrangements occurring within MYC amplicons, suggesting that more than one model may be invoked to explain the origin of dmin/hsr in cancer. Finally, we identified PVT1 as a target of fusion events, confirming its role as breakpoint hotspot in MYC amplification. PMID:25034695

  16. Methylation quantitative trait loci in the developing brain and their enrichment in schizophrenia-associated genomic regions

    PubMed Central

    Hannon, Eilis; Spiers, Helen; Viana, Joana; Pidsley, Ruth; Burrage, Joe; Murphy, Therese M; Troakes, Claire; Turecki, Gustavo; O’Donovan, Michael C.; Schalkwyk, Leonard C.; Bray, Nicholas J.; Mill, Jonathan

    2015-01-01

    We characterized DNA methylation quantitative trait loci (mQTLs) in a large collection (n=166) of human fetal brain samples spanning 56–166 days post-conception, identifying >16,000 fetal brain mQTLs. Fetal brain mQTLs are primarily cis-acting, enriched in regulatory chromatin domains and transcription factor binding sites, and show significant overlap with genetic variants also associated with gene expression in the brain. Using tissue from three distinct regions of the adult brain (prefrontal cortex, striatum and cerebellum) we show that most fetal brain mQTLs are developmentally stable, although a subset is characterized by fetal-specific effects. We show that fetal brain mQTLs are enriched amongst risk loci identified in a recent large-scale genome-wide association study (GWAS) of schizophrenia, a severe psychiatric disorder with a hypothesized neurodevelopmental component. Finally, we demonstrate how mQTLs can be used to refine GWAS loci through the identification of discrete sites of variable fetal brain methylation associated with schizophrenia risk variants. PMID:26619357

  17. BamHI E region of the Epstein-Barr virus genome encodes three transformation-associated nuclear proteins.

    PubMed Central

    Ricksten, A; Kallin, B; Alexander, H; Dillner, J; Fåhraeus, R; Klein, G; Lerner, R; Rymo, L

    1988-01-01

    Recombinant vectors carrying DNA fragments from the BamHI E region of the B95-8 Epstein-Barr virus (EBV) genome were transfected into COS-1 cells, and the transient expression of EBV-encoded nuclear antigens (EBNAs) was analyzed by using polyvalent human antisera and rabbit antibodies to synthetic peptides. Vector DNA containing two rightward open reading frames in the BamHI E fragment, BERF2a and BERF2b, induced the expression of a nuclear antigen identical serologically and with respect to size to the larger of the two polypeptides previously designated as EBNA4 in B95-8 cells. An antigen corresponding to the smaller polypeptide was induced in cells transfected with constructs that contained two neighboring reading frames, BERF3 and BERF4. This antigen also reacted with a rabbit antiserum to the synthetic peptide 203, deduced from BERF4. Thus, the findings show that the two components of the EBNA4 doublet in B95-8 cells are encoded by separate genes. The antigen encoded by BERF2a and/or BERF2b has been designated as EBNA4 and the antigen encoded by BERF3 and/or BERF4 has been designated as EBNA6. Polyvalent human antisera detected EBNA4 and EBNA6 in 9 of 11 lymphoid cell lines carrying independent EBV isolates. In the remaining two lines, either EBNA4 or EBNA6 was not detectable. PMID:2829223

  18. Target Site Specificity of the Tos17 Retrotransposon Shows a Preference for Insertion within Genes and against Insertion in Retrotransposon-Rich Regions of the Genome

    PubMed Central

    Miyao, Akio; Tanaka, Katsuyuki; Murata, Kazumasa; Sawaki, Hiromichi; Takeda, Shin; Abe, Kiyomi; Shinozuka, Yoriko; Onosato, Katsura; Hirochika, Hirohiko

    2003-01-01

    Because retrotransposons are the major component of plant genomes, analysis of the target site selection of retrotransposons is important for understanding the structure and evolution of plant genomes. Here, we examined the target site specificity of the rice retrotransposon Tos17, which can be activated by tissue culture. We have produced 47,196 Tos17-induced insertion mutants of rice. This mutant population carries ∼500,000 insertions. We analyzed >42,000 flanking sequences of newly transposed Tos17 copies from 4316 mutant lines. More than 20,000 unique loci were assigned on the rice genomic sequence. Analysis of these sequences showed that insertion events are three times more frequent in genic regions than in intergenic regions. Consistent with this result, Tos17 was shown to prefer gene-dense regions over centromeric heterochromatin regions. Analysis of insertion target sequences revealed a palindromic consensus sequence, ANGTT-TSD-AACNT, flanking the 5-bp target site duplication. Although insertion targets are distributed throughout the chromosomes, they tend to cluster, and 76% of the clusters are located in genic regions. The mechanisms of target site selection by Tos17, the utility of the mutant lines, and the knockout gene database are discussed. PMID:12897251
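
    The palindromic nature of the consensus ANGTT-TSD-AACNT reported in this abstract can be checked directly: the right flank should be the reverse complement of the left flank. The sketch below is an illustration only; the helper name and the decision to treat N as its own complement are assumptions, not part of the original analysis.

```python
# Complement table for the four bases plus the IUPAC wildcard N
# (N is assumed to pair with N for the purpose of this check).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T, N."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Flanks of the 5-bp target site duplication (TSD) as given in the abstract.
left_flank = "ANGTT"
right_flank = "AACNT"

# True when the two flanks form a palindrome around the TSD.
is_palindromic = reverse_complement(left_flank) == right_flank
```

    The same helper could be applied to individual flanking sequences to count how many insertion sites match the consensus on both sides.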

  19. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems.

    PubMed

    Wade, Len J; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K; Reddy, K R Kamalnath; Varma, C Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H E; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response and root traits, and practical methods for studying them, are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located: on chromosome 1 (39.7-40.7 Mb) and on chromosome 8 (20.3-21.9 Mb). Across experiments, the soil type/growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems point to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  20. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems

    PubMed Central

    Wade, Len J.; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K.; Reddy, K. R. Kamalnath; Varma, C. Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R.; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H. E.; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L.; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response and root traits, and practical methods for studying them, are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located: on chromosome 1 (39.7–40.7 Mb) and on chromosome 8 (20.3–21.9 Mb). Across experiments, the soil type/growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems point to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  1. Comparative analysis of the full genome sequence of European bat lyssavirus type 1 and type 2 with other lyssaviruses and evidence for a conserved transcription termination and polyadenylation motif in the G-L 3' non-translated region.

    PubMed

    Marston, D A; McElhinney, L M; Johnson, N; Müller, T; Conzelmann, K K; Tordo, N; Fooks, A R

    2007-04-01

    We report the first full-length genomic sequences for European bat lyssavirus type-1 (EBLV-1) and type-2 (EBLV-2). The EBLV-1 genomic sequence was derived from a virus isolated from a serotine bat in Hamburg, Germany, in 1968 and the EBLV-2 sequence was derived from a virus isolate from a human case of rabies that occurred in Scotland in 2002. A long-distance PCR strategy was used to amplify the open reading frames (ORFs), followed by standard and modified RACE (rapid amplification of cDNA ends) techniques to amplify the 3' and 5' ends. The lengths of each complete viral genome for EBLV-1 and EBLV-2 were 11 966 and 11 930 base pairs, respectively, and follow the standard rhabdovirus genome organization of five viral proteins. Comparison with other lyssavirus sequences demonstrates variation in degrees of homology, with the genomic termini showing a high degree of complementarity. The nucleoprotein was the most conserved, both intra- and intergenotypically, followed by the polymerase (L), matrix and glycoproteins, with the phosphoprotein being the most variable. In addition, we have shown that the two EBLVs utilize a conserved transcription termination and polyadenylation (TTP) motif, approximately 50 nt upstream of the L gene start codon. All available lyssavirus sequences to date, with the exception of Pasteur virus (PV) and PV-derived isolates, use the second TTP site. This observation may explain differences in pathogenicity between lyssavirus strains, dependent on the length of the untranslated region, which might affect transcriptional activity and RNA stability. PMID:17374776

  2. Reconstitution of the mitochondrial Hsp70 (mortalin)-p53 interaction using purified proteins--identification of additional interacting regions.

    PubMed

    Iosefson, Ohad; Azem, Abdussalam

    2010-03-19

    Previous studies have shown that the mammalian mitochondrial 70 kDa heat-shock protein (mortalin) can also be detected in the cytosol. Cytosolic mortalin binds p53 and, by doing so, prevents translocation of the tumor suppressor into the nucleus. In this study, we developed a novel binding assay, using purified proteins, for tracking the interaction between p53 and mortalin. Our results reveal that: (i) p53 binds to the peptide-binding site of mortalin, which enhances the ability of the former to bind DNA. (ii) An additional previously unknown binding site for mortalin exists within the C-terminal domain of p53. PMID:20153329

  3. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations.

    PubMed

    McNally, Alan; Oren, Yaara; Kelly, Darren; Pascoe, Ben; Dunn, Steven; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B; Ashour, Amgad; Avram, Oren; Pupko, Tal; Dobrindt, Ulrich; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H; Zhiyong, Zong; Sheppard, Samuel K; McInerney, James O; Corander, Jukka

    2016-09-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  4. Scaffolder - software for manual genome scaffolding

    PubMed Central

    2012-01-01

    Background: The assembly of next-generation short-read sequencing data can result in a fragmented non-contiguous set of genomic sequences. Therefore a common step in a genome project is to join neighbouring sequence regions together and fill gaps. This scaffolding step is non-trivial and requires manually editing large blocks of nucleotide sequence. Joining these sequences together also hides the source of each region in the final genome sequence. Taken together these considerations may make reproducing or editing an existing genome scaffold difficult. Methods: The software outlined here, “Scaffolder,” is implemented in the Ruby programming language and can be installed via the RubyGems software management system. Genome scaffolds are defined using YAML - a data format which is both human and machine-readable. Command line binaries and extensive documentation are available. Results: This software allows a genome build to be defined in terms of the constituent sequences using a relatively simple syntax. This syntax further allows unknown regions to be specified and additional sequence to be used to fill known gaps in the scaffold. Defining the genome construction in a file makes the scaffolding process reproducible and easier to edit compared with large FASTA nucleotide sequences. Conclusions: Scaffolder is easy-to-use genome scaffolding software which promotes reproducibility and continuous development in a genome project. Scaffolder can be found at http://next.gs. PMID:22640820
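
    The idea of declaring a genome build as an ordered list of constituent sequences and unknown regions can be illustrated with a minimal Python sketch. It is not Scaffolder itself (which is written in Ruby and reads YAML): the scaffold is written as a plain Python list standing in for the YAML file, and the field names ("sequence", "unresolved", "source", "reverse", "length"), the helper functions, and the toy contigs are all hypothetical, chosen only for illustration.

```python
# Hypothetical scaffold description: a Python list standing in for a YAML file.
# The field names below are illustrative assumptions, not Scaffolder's
# documented schema.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_scaffold(entries, contigs):
    """Concatenate contigs and unresolved regions in the declared order."""
    parts = []
    for entry in entries:
        if "sequence" in entry:
            spec = entry["sequence"]
            seq = contigs[spec["source"]]
            if spec.get("reverse", False):
                seq = reverse_complement(seq)
            parts.append(seq)
        elif "unresolved" in entry:
            # Unknown regions are represented as runs of N.
            parts.append("N" * entry["unresolved"]["length"])
        else:
            raise ValueError(f"unknown entry type: {entry!r}")
    return "".join(parts)


# Toy input: two made-up contigs joined across a 4-nt unknown region.
contigs = {"contig1": "ATGCC", "contig2": "GGTAA"}
scaffold = [
    {"sequence": {"source": "contig1"}},
    {"unresolved": {"length": 4}},
    {"sequence": {"source": "contig2", "reverse": True}},
]

assembled = build_scaffold(scaffold, contigs)
print(assembled)
```

    Because the build is described as data rather than as an edited FASTA file, reordering or reversing a contig is a one-line change to the description, which is the reproducibility argument the abstract makes.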

  5. Genome Sequence of EU-Unauthorized Genetically Modified Bacillus subtilis Strain 2014-3557 Overproducing Riboflavin, Isolated from a Vitamin B2 80% Feed Additive

    PubMed Central

    Barbau-Piednoir, Elodie; De Keersmaecker, Sigrid C. J.; Wuyts, Véronique; Gau, Céline; Pirovano, Walter; Costessi, Adalberto; Philipp, Patrick

    2015-01-01

    This paper announces the genome sequence and annotation of the genetically modified (GM) Bacillus subtilis strain 2014-3557 overproducing riboflavin (vitamin B2). This GM-strain is unauthorized in the European Union. Nevertheless, it has been isolated from a lot of vitamin B2 (riboflavin) 80% feed grade imported to Europe from China. PMID:25858836

  6. Sequencing of 15,622 gene-bearing BACs clarifies the gene-dense regions of the barley genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework....

  7. Changuinola Virus Serogroup, New Genomes within the Genus Orbivirus (Family Reoviridae) Isolated in the Brazilian Amazon Region

    PubMed Central

    Dilcher, Meik; Weidmann, Manfred; Carvalho, Valéria L.; Casseb, Alexandre R.; Silva, Eliana V. P.; Nunes, Keley N. B.; Chiang, Jannifer O.; Martins, Lívia C.; Vasconcelos, Pedro F. C.

    2013-01-01

    We report here the first complete genome sequence of a Changuinola virus (CGLV) serotype Irituia virus (BE AN 28873) isolated from a wild rodent (Oryzomys goeldi) in the municipality of Ipixuna, State of Pará, northern Brazil. All genome segments showed similarity with those belonging to members of the genus Orbivirus, family Reoviridae. PMID:24285662

  8. Analysis of a genomic DNA region from the cyanobacterium Synechococcus sp. strain PCC7942 involved in carboxysome assembly and function.

    PubMed Central

    Price, G D; Howitt, S M; Harrison, K; Badger, M R

    1993-01-01

    We report on the sequencing and analysis of a 3,557-bp genomic DNA clone that is located between 4.8 and 1.2 kilobase pairs (kb) upstream of the rbcL gene and is capable of complementing a class of cyanobacterium Synechococcus sp. strain PCC7942 mutants requiring a high level of CO2. The upstream 2,704 bp of this sequence is novel, the remaining 852 bp having been reported by other workers. Four new open reading frames (ORFs) have been identified along with putative promoter elements. These ORFs, which could code for proteins of 7, 10.9, 11, and 58 kDa in size, have been named ORF 64, ccmK, ccmL, and ccmM, respectively. The last three have been named ccm genes on the basis that insertional mutagenesis of each produces a phenotype requiring a high level of CO2 (i.e., each produces a lesion in the CO2 concentrating mechanism). The putative gene product for the large ccmM ORF has three internally repeated regions and also has two possible DNA binding motifs. Two defined mutants in the 3,557-bp region, mutants PVU and P-N, have been more fully characterized. The PVU mutant has a drug marker inserted into the ccmL gene, and it possesses abnormal rod-shaped carboxysomes. The P-N mutant is a 2.64-kb deletion of DNA from the same position in ccmL to a region closer to rbcL. This mutant, which has previously been shown to lack carboxysomes and have soluble ribulosebisphosphate carboxylase/oxygenase activity, has now been shown to have a predominantly soluble carboxysomal carbonic anhydrase activity. Both mutants were found to possess carboxysomal carbonic anhydrase activities which are below wild-type levels, and in the P-N mutant this activity appears to be unstable. The results are discussed in terms of the possible interactions of putative ccm gene products in the process of carboxysome assembly and function. PMID:8491708

  9. Development of a multiplex RT-PCR assay for the identification of recombination types at different genomic regions of vaccine-derived polioviruses.

    PubMed

    Dimitriou, T G; Kyriakopoulou, Z; Tsakogiannis, D; Fikatas, A; Gartzonika, C; Levidiotou-Stefanou, S; Markoulatos, P

    2016-08-01

    Polioviruses (PVs) are the causal agents of acute paralytic poliomyelitis. Since the 1960s, poliomyelitis has been effectively controlled by the use of two vaccines containing all three serotypes of PVs, the inactivated poliovirus vaccine and the live attenuated oral poliovirus vaccine (OPV). Despite the success of OPV in the polio eradication programme, a significant disadvantage was revealed: the emergence of vaccine-associated paralytic poliomyelitis (VAPP). VAPP is the result of accumulated mutations and putative recombination events in the genome of attenuated vaccine Sabin strains. In the present study, ten Sabin isolates derived from OPV vaccinees and environmental samples were studied in order to identify recombination types located in the VP1 to 3D genomic regions of the virus genome. The experimental procedure consisted of virus RNA extraction, reverse transcription to convert the virus genome into cDNA, and PCR and multiplex PCR using specifically designed primers able to localize and identify each recombination following agarose gel electrophoresis. This multiplex RT-PCR assay allows for the immediate detection and identification of multiple recombination types located in the viral genome of OPV derivatives. After the eradication of wild PVs, the remaining sources of poliovirus infection worldwide would be the OPV derivatives. As a consequence, the immediate detection and molecular characterization of recombinant derivatives are important to avoid epidemics due to the circulation of neurovirulent viral strains. PMID:27098645

  10. The human genome: a multifractal analysis

    PubMed Central

    2011-01-01

    Background: Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode. Results: We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content. In contrast, little or no relationship was found for LINE, MIR, MER, and LTR elements or for DNA regions poor in genetic information. Gene function, cluster of orthologous genes, metabolic pathways, and exons tended to increase their frequencies with ranges of multifractality, and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and classification for human chromosomes are proposed. Conclusions: Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization where many regions coexist that are far from equilibrium and 2) that this non-linear organization has significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful. PMID:21999602

  11. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  12. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  13. Use of whole-genome sequencing to trace, control and characterize the regional expansion of extended-spectrum β-lactamase producing ST15 Klebsiella pneumoniae

    PubMed Central

    Zhou, Kai; Lokate, Mariette; Deurenberg, Ruud H.; Tepper, Marga; Arends, Jan P.; Raangs, Erwin G. C.; Lo-Ten-Foe, Jerome; Grundmann, Hajo; Rossen, John W. A.; Friedrich, Alexander W.

    2016-01-01

    The study describes the transmission of a CTX-M-15-producing ST15 Klebsiella pneumoniae between patients treated in a single center and the subsequent inter-institutional spread by patient referral occurring between May 2012 and September 2013. A suspected epidemiological link between clinical K. pneumoniae isolates was supported by patient contact tracing and genomic phylogenetic analysis from May to November 2012. By May 2013, a patient treated in three institutions in two cities was involved in an expanding cluster caused by this high-risk clone (HiRiC) (local expansion, CTX-M-15 producing, and containing hypervirulence factors). A clone-specific multiplex PCR was developed for patient screening by which another patient was identified in September 2013. Genomic phylogenetic analysis including published ST15 genomes revealed a close homology with isolates previously found in the USA. Environmental contamination and lack of consistent patient screening were identified as being responsible for the clone dissemination. The investigation addresses the advantages of whole-genome sequencing in the early detection of HiRiC with a high propensity of nosocomial transmission and prolonged circulation in the regional patient population. Our study suggests the necessity for inter-institutional/regional collaboration for infection/outbreak management of K. pneumoniae HiRiCs. PMID:26864946

  14. The dark matter of the cancer genome: aberrations in regulatory elements, untranslated regions, splice sites, non-coding RNA and synonymous mutations.

    PubMed

    Diederichs, Sven; Bartsch, Lorenz; Berkmann, Julia C; Fröse, Karin; Heitmann, Jana; Hoppe, Caroline; Iggena, Deetje; Jazmati, Danny; Karschnia, Philipp; Linsenmeier, Miriam; Maulhardt, Thomas; Möhrmann, Lino; Morstein, Johannes; Paffenholz, Stella V; Röpenack, Paula; Rückert, Timo; Sandig, Ludger; Schell, Maximilian; Steinmann, Anna; Voss, Gjendine; Wasmuth, Jacqueline; Weinberger, Maria E; Wullenkord, Ramona

    2016-01-01

    Cancer is a disease of the genome caused by oncogene activation and tumor suppressor gene inhibition. Deep sequencing studies including large consortia such as TCGA and ICGC identified numerous tumor-specific mutations not only in protein-coding sequences but also in non-coding sequences. Although 98% of the genome is not translated into proteins, most studies have neglected the information hidden in this "dark matter" of the genome. Malignancy-driving mutations can occur in all genetic elements outside the coding region, namely in enhancer, silencer, insulator, and promoter as well as in 5'-UTR and 3'-UTR. Intron or splice site mutations can alter the splicing pattern. Moreover, cancer genomes contain mutations within non-coding RNA, such as microRNA, lncRNA, and lincRNA. A synonymous mutation changes the coding region in the DNA and RNA but not the protein sequence. Importantly, oncogenes such as TERT or miR-21 as well as tumor suppressor genes such as TP53/p53, APC, BRCA1, or RB1 can be affected by these alterations. In summary, coding-independent mutations can affect gene regulation from transcription, splicing, mRNA stability to translation, and hence, this largely neglected area needs functional studies to elucidate the mechanisms underlying tumorigenesis. This review will focus on the important role and novel mechanisms of these non-coding or allegedly silent mutations in tumorigenesis. PMID:26992833

  15. An examination of the origin and evolution of additional tandem repeats in the mitochondrial DNA control region of Japanese sika deer (Cervus nippon).

    PubMed

    Ba, Hengxing; Wu, Lang; Liu, Zongyue; Li, Chunyi

    2016-01-01

    Tandem repeat units are only detected in the left domain of the mitochondrial DNA control region in sika deer. Previous studies showed that Japanese sika deer have more tandem repeat units than its cousins from the Asian continent and Taiwan, which often have only three repeat units. To determine the origin and evolution of these additional repeat units in Japanese sika deer, we obtained the sequence of repeat units from an expanded dataset of the control region from all sika deer lineages. The functional constraint is inferred to act on the first repeat unit because this repeat has the least sequence divergence in comparison to the other units. Based on slipped-strand mispairing mechanisms, the illegitimate elongation model could account for the addition or deletion of these additional repeat units in the Japanese sika deer population. We also report that these additional repeat units could be occurring in the internal positions of tandem repeat regions, possibly via coupling with a homogenization mechanism within and among these lineages. Moreover, the increased number of repeat units in the Japanese sika deer population could reflect a balance between mutation and selection, as well as genetic drift. PMID:24621225

  16. Genome-wide analysis of glucocorticoid receptor binding regions in adipocytes reveal gene network involved in triglyceride homeostasis.

    PubMed

    Yu, Chi-Yi; Mayba, Oleg; Lee, Joyce V; Tran, Joanna; Harris, Charlie; Speed, Terence P; Wang, Jen-Chywan

    2010-01-01

    Glucocorticoids play important roles in the regulation of distinct aspects of adipocyte biology. Excess glucocorticoids in adipocytes are associated with metabolic disorders, including central obesity, insulin resistance and dyslipidemia. To understand the mechanisms underlying the glucocorticoid action in adipocytes, we used chromatin immunoprecipitation sequencing to isolate genome-wide glucocorticoid receptor (GR) binding regions (GBRs) in 3T3-L1 adipocytes. Furthermore, gene expression analyses were used to identify genes that were regulated by glucocorticoids. Overall, 274 glucocorticoid-regulated genes contain or lie near a GBR. We found that many GBRs were located in or near genes involved in triglyceride (TG) synthesis (Scd-1, 2, 3, GPAT3, GPAT4, Agpat2, Lpin1), lipolysis (Lipe, Mgll), lipid transport (Cd36, Lrp-1, Vldlr, Slc27a2) and storage (S3-12). Gene expression analysis showed that, except for Scd-3, the other 13 genes were induced in mouse inguinal fat upon 4-day glucocorticoid treatment. Reporter gene assays showed that, except for Agpat2, the other 12 glucocorticoid-regulated genes contain at least one GBR that can mediate hormone response. In agreement with the fact that glucocorticoids activated genes in both TG biosynthetic and lipolytic pathways, we confirmed that 4-day glucocorticoid treatment increased TG synthesis and lipolysis concomitantly in inguinal fat. Notably, we found that 9 of these 12 genes were induced in transgenic mice that have constant elevated plasma glucocorticoid levels. These results suggested that a similar mechanism was used to regulate TG homeostasis during chronic glucocorticoid treatment. In summary, our studies have identified molecular components in a glucocorticoid-controlled gene network involved in the regulation of TG homeostasis in adipocytes. Understanding the regulation of this gene network should provide important insight for future therapeutic developments for metabolic diseases. PMID:21187916

  17. Coordinate regulation of two cytoplasmic RNA species transcribed from early region 2 of the adenovirus 2 genome.

    PubMed

    Goldenberg, C J; Rosenthal, R; Bhaduri, S; Raskas, H

    1981-06-01

    Early region 2 (E2) of the adenovirus 2 genome specifies a 72,000-dalton DNA-binding protein that is required for viral DNA replication. Electron microscopy studies have detected two major forms of 20S E2 mRNA, one species with a 5' leader from map position 75 and a second form having a leader from position 72 (Chow et al., J. Mol. Biol. 134:265-303, 1979). Only the species with a leader from position 75 was detected at early times; however, both forms were found at late times. We have analyzed the temporal regulation of E2 expression by documenting mRNA accumulation in the cytoplasm. Kinetic studies of pulse-labeled RNAs demonstrated a peak of E2 cytoplasmic RNA synthesis at 10 to 12 h, coinciding with the time of maximal synthesis of the 72,000-dalton DNA binding protein and viral DNA. To estimate the relative abundances of the two major E2 RNA species at various times during infection, total E2 cytoplasmic and polysomal 20S RNAs were isolated by hybridization-selection with specific DNA probes. The leader sequences in the selected RNAs were then quantitated by further RNA-DNA hybridization. We found that the elevated accumulation rate for E2 cytoplasmic RNA at late times reflected an increase in formation of both major species. Moreover, for all time points examined 66% of the mRNA species had a 5' end from map position 75, and 33% had a 5' terminus from position 72. Continuous labeling experiments provided evidence that both RNA forms have comparable half-lives. The results suggest that the two major species encoded by E2 are regulated in a coordinate fashion late in infection. PMID:6894621

  18. Recombination and Evolution of Duplicate Control Regions in the Mitochondrial Genome of the Asian Big-Headed Turtle, Platysternon megacephalum

    PubMed Central

    Zheng, Chenfei; Nie, Liuwang; Wang, Jue; Zhou, Huaxing; Hou, Huazhen; Wang, Hao; Liu, Juanjuan

    2013-01-01

    Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the completed sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss the micro-evolutionary analysis of the evolution of duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3′ end using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that any two CRs within individuals were more genetically distant from orthologous genes in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P. megacephalum. PMID
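
    The abstract does not state which distance model was used in MEGA 4.1. As a minimal sketch, assuming the simplest uncorrected p-distance (the proportion of aligned sites that differ), the comparison of orthologous or paralogous CR copies could look like the following. The function names and the toy aligned sequences are hypothetical and are not data from the study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def mean_pairwise_distance(seqs):
    """Average p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments (made up) standing in for CR copies from
# three individuals.
toy_copies = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAT"]

for a, b in combinations(toy_copies, 2):
    print(a, b, f"{100 * p_distance(a, b):.1f}%")
print(f"mean: {100 * mean_pairwise_distance(toy_copies):.1f}%")
```

    Applying the same function to copies of one CR from different individuals versus CR1 and CR2 from the same individual gives the kind of orthologous versus paralogous contrast the abstract reports, although a corrected substitution model would generally give somewhat larger values than the raw p-distance.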

  19. Evaluation of a Partial Genome Screening of Two Asthma Susceptibility Regions Using Bayesian Network Based Bayesian Multilevel Analysis of Relevance

    PubMed Central

    Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba

    2012-01-01

    Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and we applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses Bayesian network representation to provide detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to estimate the posteriors whether a variable is directly relevant or its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43 (1.2–1.8); p = 3×10⁻⁴). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong relevance based methods to include partial relevance, global characterization of relevance and multi-target relevance. PMID:22432035

  20. Recombination and evolution of duplicate control regions in the mitochondrial genome of the Asian big-headed turtle, Platysternon megacephalum.

    PubMed

    Zheng, Chenfei; Nie, Liuwang; Wang, Jue; Zhou, Huaxing; Hou, Huazhen; Wang, Hao; Liu, Juanjuan

    2013-01-01

    Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the completed sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss the micro-evolutionary analysis of the evolution of duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3' end using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that any two CRs within individuals were more genetically distant from orthologous genes in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P. megacephalum. PMID

  1. The mitochondrial genome of the banded guitarfish, Zapteryx exasperata (Jordan and Gilbert, 1880), possesses a non-coding duplication remnant region.

    PubMed

    Castillo-Páez, Ana; Del Río-Portilla, Miguel Angel; Oñate-González, Erick; Rocha-Olivares, Axayácatl

    2016-05-01

    The complete mitochondrial genome of the banded guitarfish is 17,310 bp long and includes 2 ribosomal RNA, 22 transfer RNA, and 13 protein-coding genes, a replication origin and a control region (GenBank accession number KM370325). Gene arrangement is similar to that found in other batoids. An extra non-coding region was found between the genes coding for transfer RNA proline and threonine possessing a set of tandem repeat motifs pointing to its origin as a duplication remnant. Start codon ATG and stop codon TAA/T were found in most protein-coding genes. The base composition of the genome is 32.3% A, 30.2% T, 24.3% C and 13.1% G. PMID:25208175
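
    The reported base composition implies an A + T content of about 62.5% (32.3% + 30.2%); the four percentages sum to 99.9% because of rounding. A minimal Python sketch of how such figures are derived from a sequence is given below. The function names and the toy sequence are hypothetical; only the dictionary of reported percentages comes from the abstract.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(composition):
    """A + T percentage from a composition dictionary."""
    return composition["A"] + composition["T"]


# Toy sequence (made up, not from the genome) to exercise the function.
toy = "AATTACGTAT"
toy_comp = base_composition(toy)
print(toy_comp, at_content(toy_comp))

# Percentages as reported in the abstract for Zapteryx exasperata.
reported = {"A": 32.3, "T": 30.2, "C": 24.3, "G": 13.1}
print(f"A+T from reported values: {at_content(reported):.1f}%")
```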

  2. High-resolution array CGH identifies novel regions of genomic alteration in intermediate-risk prostate cancer.

    PubMed

    Ishkanian, Adrian S; Mallof, Chad A; Ho, James; Meng, Alice; Albert, Monique; Syed, Amena; van der Kwast, Theodorus; Milosevic, Michael; Yoshimoto, Maisa; Squire, Jeremy A; Lam, Wan L; Bristow, Robert G

    2009-07-01

    Approximately one-third of prostate cancer patients present with intermediate risk disease. Interestingly, while this risk group is clinically well defined, it demonstrates the most significant heterogeneity in PSA-based biochemical outcome. Further, the majority of candidate genes associated with prostate cancer progression have been identified using cell lines, xenograft models, and high-risk androgen-independent or metastatic patient samples. We used a global high-resolution array comparative genomic hybridization (CGH) assay to characterize copy number alterations (CNAs) in intermediate risk prostate cancer. Herein, we show this risk group contains a number of alterations previously associated with high-risk disease: (1) deletions at 21q22.2 (TMPRSS2:ERG), 16q22-24 (containing CDH1), 13q14.2 (RB1), 10q23.31 (PTEN), 8p21 (NKX3.1); and, (2) amplification at 8q21.3-24.3 (containing c-MYC). In addition, we identified six novel microdeletions at high frequency: 1q42.12-q42.3 (33.3%), 5q12.3-13.3 (21%), 20q13.32-13.33 (29.2%), 22q11.21 (25%), 22q12.1 (29.2%), and 22q13.31 (33.3%). Further, we show there is little concordance between CNAs from these clinical samples and those found in commonly used prostate cancer cell models. These unexpected findings suggest that the intermediate-risk category is a crucial cohort warranting further study to determine if a unique molecular fingerprint can predict aggressive versus indolent phenotypes. PMID:19350549

  3. A microsatellite linkage map for the cultivated strawberry (Fragaria × ananassa) suggests extensive regions of homozygosity in the genome that may have resulted from breeding and selection.

    PubMed

    Sargent, D J; Passey, T; Surbanovski, N; Lopez Girona, E; Kuchta, P; Davik, J; Harrison, R; Passey, A; Whitehouse, A B; Simpson, D W

    2012-05-01

    The linkage maps of the cultivated strawberry, Fragaria × ananassa (2n = 8x = 56), that have been reported to date have been developed predominantly from AFLPs, supplemented with transferrable microsatellite (SSR) markers. For the investigation of the inheritance of morphological characters in the cultivated strawberry and for the development of tools for marker-assisted breeding and selection, it is desirable to populate maps of the genome with an abundance of transferrable molecular markers such as microsatellites (SSRs) and gene-specific markers. Exploiting the recent release of the genome sequence of the diploid F. vesca, and the publication of an extensive number of polymorphic SSR markers for the genus Fragaria, we have extended the linkage map of the 'Redgauntlet' × 'Hapil' (RG × H) mapping population to include a further 330 loci, generated from 160 primer pairs, to create a linkage map for F. × ananassa containing 549 loci, 490 of which are transferrable SSR or gene-specific markers. The map covers 2140.3 cM in the expected 28 linkage groups for an integrated map (where one group is composed of two separate male and female maps), which represents an estimated 91% of the cultivated strawberry genome. Despite the relative saturation of the linkage map on the majority of linkage groups, regions of apparent extensive homozygosity were identified in the genomes of 'Redgauntlet' and 'Hapil' which may be indicative of allele fixation during the breeding and selection of modern F. × ananassa cultivars. The genomes of the octoploid and diploid Fragaria are largely collinear, but through comparison of mapped markers on the RG × H linkage map to their positions on the genome sequence of F. vesca, a number of inversions were identified that may have occurred before the polyploidisation event that led to the evolution of the modern octoploid strawberry species. PMID:22218676

  4. Systematic sequencing of the Escherichia coli genome: analysis of the 2.4-4.1 min (110,917-193,643 bp) region.

    PubMed Central

    Fujita, N; Mori, H; Yura, T; Ishihama, A

    1994-01-01

    The complete sequence analysis of the E. coli genome was initiated as a collaborative study in Japan. Following the initial analysis of the 0-2.4 min region (Yura, T. et al. (1992) Nucleic Acids Res. 20, 3305-3308), a contiguous sequence of 82,727 bp corresponding to the 2.4-4.1 min region (110,917-193,643 bp as counted from 0 min) was determined. The resulting sequence was found to contain at least 33 known genes and 24 putative genes predicted from protein sequence homology. PMID:8202364
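
    The stated length of 82,727 bp follows from the quoted coordinates if both endpoints are counted, that is, if the region is treated as a closed, 1-based interval. The sketch below checks that arithmetic; the helper name is illustrative, and the inclusive-coordinate convention is an assumption inferred from the numbers rather than stated in the abstract.

```python
def interval_length(start, end):
    """Length in bp of a closed, 1-based interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints belong to the interval, hence the +1.
    return end - start + 1


# Coordinates of the 2.4-4.1 min region as given in the abstract.
region_start, region_end = 110_917, 193_643

length = interval_length(region_start, region_end)
print(length)
```

    With half-open coordinates (end excluded) the same endpoints would give 82,726 bp, one base short of the reported figure.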

  5. Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer's disease, and shows evidence for additional susceptibility genes

    PubMed Central

    Harold, Denise; Abraham, Richard; Hollingworth, Paul; Sims, Rebecca; Gerrish, Amy; Hamshere, Marian; Singh Pahwa, Jaspreet; Moskvina, Valentina; Dowzell, Kimberley; Williams, Amy; Jones, Nicola; Thomas, Charlene; Stretton, Alexandra; Morgan, Angharad; Lovestone, Simon; Powell, John; Proitsi, Petroula; Lupton, Michelle K; Brayne, Carol; Rubinsztein, David C.; Gill, Michael; Lawlor, Brian; Lynch, Aoibhinn; Morgan, Kevin; Brown, Kristelle; Passmore, Peter; Craig, David; McGuinness, Bernadette; Todd, Stephen; Holmes, Clive; Mann, David; Smith, A. David; Love, Seth; Kehoe, Patrick G.; Hardy, John; Mead, Simon; Fox, Nick; Rossor, Martin; Collinge, John; Maier, Wolfgang; Jessen, Frank; Schürmann, Britta; van den Bussche, Hendrik; Heuser, Isabella; Kornhuber, Johannes; Wiltfang, Jens; Dichgans, Martin; Frölich, Lutz; Hampel, Harald; Hüll, Michael; Rujescu, Dan; Goate, Alison; Kauwe, John S.K.; Cruchaga, Carlos; Nowotny, Petra; Morris, John C.; Mayo, Kevin; Sleegers, Kristel; Bettens, Karolien; Engelborghs, Sebastiaan; De Deyn, Peter; van Broeckhoven, Christine; Livingston, Gill; Bass, Nicholas J.; Gurling, Hugh; McQuillin, Andrew; Gwilliam, Rhian; Deloukas, Panagiotis; Al-Chalabi, Ammar; Shaw, Christopher E.; Tsolaki, Magda; Singleton, Andrew; Guerreiro, Rita; Mühleisen, Thomas W.; Nöthen, Markus M.; Moebus, Susanne; Jöckel, Karl-Heinz; Klopp, Norman; Wichmann, H-Erich; Carrasquillo, Minerva M.; Pankratz, V. Shane; Younkin, Steven G.; Holmans, Peter; O'Donovan, Michael; Owen, Michael J.; Williams, Julie

    2010-01-01

    We undertook a two-stage genome-wide association study of Alzheimer's disease involving over 16,000 individuals. In stage 1 (3,941 cases and 7,848 controls), we replicated the established association with the APOE locus (most significant SNP: rs2075650, p = 1.8 × 10^−157) and observed genome-wide significant association with SNPs at two novel loci: rs11136000 in the CLU or APOJ gene (p = 1.4 × 10^−9) and rs3851179, a SNP 5′ to the PICALM gene (p = 1.9 × 10^−8). Both novel associations were supported in stage 2 (2,023 cases and 2,340 controls), producing compelling evidence for association with AD in the combined dataset (rs11136000: p = 8.5 × 10^−10, odds ratio = 0.86; rs3851179: p = 1.3 × 10^−9, odds ratio = 0.86). We also observed more variants associated at p < 1 × 10^−5 than expected by chance (p = 7.5 × 10^−6), including polymorphisms at the BIN1, DAB1 and CR1 loci. PMID:19734902

  6. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility

    PubMed Central

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K.; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C.; Burgess, Shawn M.; Sampath, Karuna

    2016-01-01

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. PMID:26818075

  7. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

    PubMed

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

    2016-01-01

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. PMID:26818075

  8. Effect of the additional installation of implants in the posterior region on the prognosis of treatment in the edentulous mandibular jaw.

    PubMed

    Miyamoto, Youji; Fujisawa, Kenji; Takechi, Masaaki; Momota, Yukihiro; Yuasa, Tetsuya; Tatehara, Seiko; Nagayama, Masaru; Yamauchi, Eiji

    2003-12-01

    The aim of this study was to elucidate the effect of the additional installation of implants in the posterior region on the prognosis of treatment in the edentulous mandibular jaw. Fifteen patients who had received implants (Brånemark system, Nobel Biocare, Göteborg, Sweden) in the edentulous mandible and completed a 1-year follow-up after the fitting of implant-anchored fixed prostheses were selected. In seven patients (Group A), four or five implants were installed between the mental foramina, and in eight patients (Group P), one or two implants, one on each side, were installed in the posterior regions in addition to the implants between the foramina. All implants of both groups achieved osseointegration. In Group A, there was no implant loss after loading. Six implants were lost in five patients of Group P within 1 year after loading. All of them were located in the posterior region. To elucidate whether or not the failure rate of the implants in the posterior region of Group P after loading was especially high, the failures were also compared with 89 implants, which were installed in the posterior region of the mandibles to support implant-anchored fixed partial prostheses, during the same period (Group C). The cumulative survival rate of the implants of Group P was 60%, while that of the implants of Group C was 100% (P<0.001). When the survival rates of posterior implants with the same length of the two groups were compared, there were significant differences for the 7- and 10-mm-length implants only. These data demonstrate that the posterior implants in Group P are at greater risk. Deformation of the mandible due to jaw movement was thought to be the most likely cause of the implant loss. Therefore, when such modified treatment is chosen, it should be performed with meticulous attention. PMID:15015949

  9. Generation and Comparative Analysis of ∼3.3 Mb of Mouse Genomic Sequence Orthologous to the Region of Human Chromosome 7q11.23 Implicated in Williams Syndrome

    PubMed Central

    DeSilva, Udaya; Elnitski, Laura; Idol, Jacquelyn R.; Doyle, Johannah L.; Gan, Weiniu; Thomas, James W.; Schwartz, Scott; Dietrich, Nicole L.; Beckstrom-Sternberg, Stephen M.; McDowell, Jennifer C.; Blakesley, Robert W.; Bouffard, Gerard G.; Thomas, Pamela J.; Touchman, Jeffrey W.; Miller, Webb; Green, Eric D.

    2002-01-01

    Williams syndrome is a complex developmental disorder that results from the heterozygous deletion of a ∼1.6-Mb segment of human chromosome 7q11.23. These deletions are mediated by large (∼300 kb) duplicated blocks of DNA of near-identical sequence. Previously, we showed that the orthologous region of the mouse genome is devoid of such duplicated segments. Here, we extend our studies to include the generation of ∼3.3 Mb of genomic sequence from the mouse Williams syndrome region, of which just over 1.4 Mb is finished to high accuracy. Comparative analyses of the mouse and human sequences within and immediately flanking the interval commonly deleted in Williams syndrome have facilitated the identification of nine previously unreported genes, provided detailed sequence-based information regarding 30 genes residing in the region, and revealed a number of potentially interesting conserved noncoding sequences. Finally, to facilitate comparative sequence analysis, we implemented several enhancements to the program PipMaker, including the addition of links from annotated features within a generated percent-identity plot to specific records in public databases. Taken together, the results reported here provide an important comparative sequence resource that should catalyze additional studies of Williams syndrome, including those that aim to characterize genes within the commonly deleted interval and to develop mouse models of the disorder. [The sequence data described in this paper have been submitted to GenBank under accession nos. AF267747, AF289666, AF289667, AF289664, AF289665, AC091250, AC079938, AC084109, AC024607, AC074359, AC024608, AC083858, AC083948, AC084162, AC087420, AC083890, AC080158, AC084402, AC083889, AC083857, and AC079872.] PMID:11779826

  10. Inside the Pan-genome - Methods and Software Overview

    PubMed Central

    Guimarães, Luis Carlos; Florczak-Wyspianska, Jolanta; de Jesus, Leandro Benevides; Viana, Marcus Vinícius Canário; Silva, Artur; Ramos, Rommel Thiago Jucá; Soares, Siomar de Castro

    2015-01-01

    The number of genomes that have been deposited in databases has increased exponentially after the advent of Next-Generation Sequencing (NGS), which produces high-throughput sequence data; this circumstance has demanded the development of new bioinformatics software and the creation of new areas, such as comparative genomics. In comparative genomics, the genetic content of an organism is compared against other organisms, which helps in the prediction of gene function and coding region sequences, identification of evolutionary events and determination of phylogenetic relationships. However, by expanding comparative genomics to a large number of related bacteria, we can infer their lifestyles, gene repertoires and minimal genome size. In this context, a powerful approach called Pan-genome has been initiated and developed. This approach involves the genomic comparison of different strains of the same species, or even genus. Its main goal is to establish the total number of non-redundant genes that are present in a determined dataset. Pan-genome consists of three parts: core genome; accessory or dispensable genome; and species-specific or strain-specific genes. Furthermore, pan-genome is considered to be “open” as long as new genes are added significantly to the total repertoire for each new additional genome and “closed” when the newly added genomes cannot be inferred to significantly increase the total repertoire of the genes. To perform all of the required calculations, a substantial amount of software has been developed, based on orthologous and paralogous gene identification. PMID:27006628
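
    The three-part split described in this abstract (core, accessory, and strain-specific genes) can be illustrated with plain set operations once genes have been grouped into families. This is a minimal sketch, not the method of any particular pan-genome tool: the strain names and gene family identifiers are hypothetical, and it assumes ortholog clustering has already been done so that each gene is represented by a family label.

```python
from collections import Counter


def partition_pan_genome(genomes):
    """Split gene families into core, accessory and strain-specific sets.

    genomes: dict mapping a strain name to the set of gene family ids it carries.
    """
    n = len(genomes)
    if n < 2:
        raise ValueError("need at least two genomes to partition a pan-genome")
    # Count in how many strains each gene family occurs.
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    pan = set(counts)                                    # all non-redundant families
    core = {g for g, c in counts.items() if c == n}      # present in every strain
    unique = {g for g, c in counts.items() if c == 1}    # present in exactly one strain
    accessory = pan - core - unique                      # present in some, but not all
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy dataset: three strains, six gene families.
toy = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}

parts = partition_pan_genome(toy)
for name in ("pan", "core", "accessory", "unique"):
    print(name, sorted(parts[name]))
```

    Whether a pan-genome is "open" or "closed" would then depend on how the size of the `pan` set grows as further genomes are added, which this toy example does not attempt to model.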

  11. Complete mtDNA genomes of Filipino ethnolinguistic groups: a melting pot of recent and ancient lineages in the Asia-Pacific region.

    PubMed

    Delfin, Frederick; Min-Shan Ko, Albert; Li, Mingkun; Gunnarsdóttir, Ellen D; Tabbada, Kristina A; Salvador, Jazelyn M; Calacal, Gayvelline C; Sagum, Minerva S; Datar, Francisco A; Padilla, Sabino G; De Ungria, Maria Corazon A; Stoneking, Mark

    2014-02-01

    The Philippines is a strategic point in the Asia-Pacific region for the study of human diversity, history and origins, as it is a cross-road for human migrations and consequently exhibits enormous ethnolinguistic diversity. Following on a previous in-depth study of Y-chromosome variation, here we provide new insights into the maternal genetic history of Filipino ethnolinguistic groups by surveying complete mitochondrial DNA (mtDNA) genomes from a total of 14 groups (11 groups in this study and 3 groups previously published) including previously published mtDNA hypervariable segment (HVS) data from Filipino regional center groups. Comparison of HVS data indicates genetic differences between ethnolinguistic and regional center groups. The complete mtDNA genomes of 14 ethnolinguistic groups reveal genetic aspects consistent with the Y-chromosome, namely: diversity and heterogeneity of groups, no support for a simple dichotomy between Negrito and non-Negrito groups, and different genetic affinities with Asia-Pacific groups that are both ancient and recent. Although some mtDNA haplogroups can be associated with the Austronesian expansion, there are others that associate with South Asia, Near Oceania and Australia that are consistent with a southern migration route for ethnolinguistic group ancestors into the Asia-Pacific, with a timeline that overlaps with the initial colonization of the Asia-Pacific region, the initial colonization of the Philippines and a possible separate post-colonization migration into the Philippine archipelago. PMID:23756438

  12. Complete mtDNA genomes of Filipino ethnolinguistic groups: a melting pot of recent and ancient lineages in the Asia-Pacific region

    PubMed Central

    Delfin, Frederick; Min-Shan Ko, Albert; Li, Mingkun; Gunnarsdóttir, Ellen D; Tabbada, Kristina A; Salvador, Jazelyn M; Calacal, Gayvelline C; Sagum, Minerva S; Datar, Francisco A; Padilla, Sabino G; De Ungria, Maria Corazon A; Stoneking, Mark

    2014-01-01

    The Philippines is a strategic point in the Asia-Pacific region for the study of human diversity, history and origins, as it is a cross-road for human migrations and consequently exhibits enormous ethnolinguistic diversity. Following on a previous in-depth study of Y-chromosome variation, here we provide new insights into the maternal genetic history of Filipino ethnolinguistic groups by surveying complete mitochondrial DNA (mtDNA) genomes from a total of 14 groups (11 groups in this study and 3 groups previously published) including previously published mtDNA hypervariable segment (HVS) data from Filipino regional center groups. Comparison of HVS data indicates genetic differences between ethnolinguistic and regional center groups. The complete mtDNA genomes of 14 ethnolinguistic groups reveal genetic aspects consistent with the Y-chromosome, namely: diversity and heterogeneity of groups, no support for a simple dichotomy between Negrito and non-Negrito groups, and different genetic affinities with Asia-Pacific groups that are both ancient and recent. Although some mtDNA haplogroups can be associated with the Austronesian expansion, there are others that associate with South Asia, Near Oceania and Australia that are consistent with a southern migration route for ethnolinguistic group ancestors into the Asia-Pacific, with a timeline that overlaps with the initial colonization of the Asia-Pacific region, the initial colonization of the Philippines and a possible separate post-colonization migration into the Philippine archipelago. PMID:23756438