Sample records for adjacent genomic regions

  1. Best Merge Region Growing Segmentation with Integrated Non-Adjacent Region Object Aggregation

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Tarabalka, Yuliya; Montesano, Paul M.; Gofman, Emanuel

    2012-01-01

    Best merge region growing normally produces segmentations with closed connected region objects. Recognizing that spectrally similar objects often appear in spatially separate locations, we present an approach for tightly integrating best merge region growing with non-adjacent region object aggregation, which we call Hierarchical Segmentation or HSeg. However, the original implementation of non-adjacent region object aggregation in HSeg required excessive computing time even for moderately sized images because of the required intercomparison of each region with all other regions. This problem was previously addressed by a recursive approximation of HSeg, called RHSeg. In this paper we introduce a refined implementation of non-adjacent region object aggregation in HSeg that reduces the computational requirements of HSeg without resorting to the recursive approximation. In this refinement, HSeg's region inter-comparisons among non-adjacent regions are limited to regions of a dynamically determined minimum size. We show that this refined version of HSeg can process moderately sized images in about the same amount of time as RHSeg incorporating the original HSeg. Nonetheless, RHSeg is still required for processing very large images due to its lower computer memory requirements and amenability to parallel processing. We then note a limitation of RHSeg with the original HSeg for high spatial resolution images, and show how incorporating the refined HSeg into RHSeg overcomes this limitation. The quality of the image segmentations produced by the refined HSeg is then compared with other available best merge segmentation approaches. Finally, we comment on the unique nature of the hierarchical segmentations produced by HSeg.
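
    As a rough illustration of the adjacent-region step only, the sketch below applies best merge region growing to a one-dimensional signal: at each iteration the pair of neighbouring regions with the closest means is merged. It is not the HSeg implementation; the function name, the mean-difference criterion, and the stopping rule are assumptions made for this example, and the non-adjacent aggregation described in the abstract is not included.

```python
def best_merge_segmentation(values, n_regions):
    """Merge the most similar pair of adjacent regions until n_regions remain.

    Hypothetical helper for illustration; the dissimilarity criterion
    (absolute difference of region means) is an assumption.
    """
    # Each region is stored as (sum, count) over a contiguous run of samples.
    regions = [(float(v), 1) for v in values]
    while len(regions) > n_regions:
        # Dissimilarity between each pair of neighbouring regions.
        costs = [
            abs(regions[i][0] / regions[i][1]
                - regions[i + 1][0] / regions[i + 1][1])
            for i in range(len(regions) - 1)
        ]
        i = costs.index(min(costs))  # the "best merge" is the cheapest one
        merged = (regions[i][0] + regions[i + 1][0],
                  regions[i][1] + regions[i + 1][1])
        regions[i:i + 2] = [merged]
    # Report each region as (mean, size).
    return [(s / c, c) for s, c in regions]


segments = best_merge_segmentation([1, 1, 2, 9, 9, 10], n_regions=2)
```

    Recording the sequence of merges, rather than stopping at a fixed region count, would give a segmentation hierarchy in the spirit of the abstract.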

  2. Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits.

    PubMed

    Gebreyesus, Grum; Lund, Mogens S; Buitenhuis, Bart; Bovenhuis, Henk; Poulsen, Nina A; Janss, Luc G

    2017-12-05

    Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in

  3. A genome-wide association study identifies a genomic region for the polycerate phenotype in sheep (Ovis aries).

    PubMed

    Ren, Xue; Yang, Guang-Li; Peng, Wei-Feng; Zhao, Yong-Xin; Zhang, Min; Chen, Ze-Hui; Wu, Fu-An; Kantanen, Juha; Shen, Min; Li, Meng-Hua

    2016-02-17

    Horns are a cranial appendage found exclusively in Bovidae, and play important roles in accessing resources and mates. In sheep (Ovis aries), horns vary from polled to six-horned, and humans have been selecting polled animals in farming and breeding. Here, we conducted a genome-wide association study on 24 two-horned versus 22 four-horned phenotypes in a native Chinese breed of Sishui Fur sheep. Together with linkage disequilibrium (LD) analyses and haplotype-based association tests, we identified a genomic region comprising 132.0-133.1 Mb on chromosome 2 that contained the top 10 SNPs (including 4 significant SNPs) and 5 most significant haplotypes associated with the polycerate phenotype. In humans and mice, this genomic region contains the HOXD gene cluster and adjacent functional genes EVX2 and KIAA1715, which have a close association with the formation of limbs and genital buds. Our results provide new insights into the genetic basis underlying variable numbers of horns and represent a new resource for use in sheep genetics and breeding.

  4. AnnotateGenomicRegions: a web application.

    PubMed

    Zammataro, Luca; DeMolfetta, Rita; Bucci, Gabriele; Ceol, Arnaud; Muller, Heiko

    2014-01-01

    Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions.
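
    The core operation described here, reporting annotations that overlap or neighbor an input region, can be sketched in a few lines. This is an illustrative sketch, not the AnnotateGenomicRegions code: the function name, the tuple layout, the half-open coordinate convention, and the 1,000 bp flank are all assumptions.

```python
def annotate(region, features, flank=1000):
    """Return (overlapping, neighboring) feature names for one region.

    region   : (chrom, start, end), half-open interval [start, end)
    features : iterable of (chrom, start, end, name)
    flank    : distance in bp within which a feature counts as neighboring
               (hypothetical default)
    """
    chrom, start, end = region
    overlapping, neighboring = [], []
    for f_chrom, f_start, f_end, name in features:
        if f_chrom != chrom:
            continue
        if f_start < end and start < f_end:
            # The intervals share at least one base.
            overlapping.append(name)
        elif f_start < end + flank and start - flank < f_end:
            # No overlap, but the feature lies within the flank.
            neighboring.append(name)
    return overlapping, neighboring


# Toy annotations, invented for the example.
features = [
    ("chr1", 100, 200, "geneA"),
    ("chr1", 1500, 1600, "geneB"),
    ("chr1", 5000, 6000, "geneC"),
    ("chr2", 100, 200, "geneD"),
]
result = annotate(("chr1", 150, 1000), features)
```

    A linear scan is adequate for small annotation sets; for genome-scale tracks an interval tree or sorted sweep would be the usual choice.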

  5. AnnotateGenomicRegions: a web application

    PubMed Central

    2014-01-01

    Background: Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Results: Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. Conclusions: The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions. PMID:24564446

  6. Comparative transgenic analysis of enhancers from the human SHOX and mouse Shox2 genomic regions.

    PubMed

    Rosin, Jessica M; Abassah-Oppong, Samuel; Cobb, John

    2013-08-01

    Disruption of presumptive enhancers downstream of the human SHOX gene (hSHOX) is a frequent cause of the zeugopodal limb defects characteristic of Léri-Weill dyschondrosteosis (LWD). The closely related mouse Shox2 gene (mShox2) is also required for limb development, but in the more proximal stylopodium. In this study, we used transgenic mice in a comparative approach to characterize enhancer sequences in the hSHOX and mShox2 genomic regions. Among conserved noncoding elements (CNEs) that function as enhancers in vertebrate genomes, those that are maintained near paralogous genes are of particular interest given their ancient origins. Therefore, we first analyzed the regulatory potential of a genomic region containing one such duplicated CNE (dCNE) downstream of mShox2 and hSHOX. We identified a strong limb enhancer directly adjacent to the mShox2 dCNE that recapitulates the expression pattern of the endogenous gene. Interestingly, this enhancer requires sequences only conserved in the mammalian lineage in order to drive strong limb expression, whereas the more deeply conserved sequences of the dCNE function as a neural enhancer. Similarly, we found that a conserved element downstream of hSHOX (CNE9) also functions as a neural enhancer in transgenic mice. However, when the CNE9 transgenic construct was enlarged to include adjacent, non-conserved sequences frequently deleted in LWD patients, the transgene drove expression in the zeugopodium of the limbs. Therefore, both hSHOX and mShox2 limb enhancers are coupled to distinct neural enhancers. This is the first report demonstrating the activity of cis-regulatory elements from the hSHOX and mShox2 genomic regions in mammalian embryos.

  7. Regional Spectral Model simulations of the summertime regional climate over Taiwan and adjacent areas

    Treesearch

    Ching-Teng Lee; Ming-Chin Wu; Shyh-Chin Chen

    2005-01-01

    The National Centers for Environmental Prediction (NCEP) regional spectral model (RSM) version 97 was used to investigate the regional summertime climate over Taiwan and adjacent areas for June-July-August of 1990 through 2000. The simulated sea-level-pressure and wind fields of RSM1 with 50-km grid space are similar to the reanalysis, but the strength of the...

  8. West Nile virus (WNV) genome RNAs with up to three adjacent mutations that disrupt long distance 5'-3' cyclization sequence basepairs are viable

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Basu, Mausumi; Brinton, Margo A.

    2011-03-30

    Mosquito-borne flavivirus genomes contain conserved 5' and 3' cyclization sequences (CYC) that facilitate long distance RNA-RNA interactions. In previous studies, flavivirus replicon RNA replication was completely inhibited by single or multiple mismatching CYC nt substitutions. In the present study, full-length WNV genomes with one, two or three mismatching CYC substitutions showed reduced replication efficiencies but were viable and generated revertants with increased replication efficiency. Several different three adjacent mismatching CYC substitution mutant RNAs were rescued by a second site mutation that created an additional basepair (nts 147-10913) on the internal genomic side of the 5'-3' CYC. The finding that full-length genomes with up to three mismatching CYC mutations are viable and can be rescued by a single nt spontaneous mutation indicates that more than three adjacent CYC basepair substitutions would be required to increase the safety of vaccine genomes by creating mismatches in inter-genomic recombinants.

  9. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler, et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  10. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, by comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates, to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  11. regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests.

    PubMed

    Gel, Bernat; Díez-Villanueva, Anna; Serra, Eduard; Buschbeck, Marcus; Peinado, Miguel A; Malinverni, Roberto

    2016-01-15

    Statistically assessing the relation between a set of genomic regions and other genomic features is a common challenging task in genomic and epigenomic analyses. Randomization based approaches implicitly take into account the complexity of the genome without the need of assuming an underlying statistical model. regioneR is an R package that implements a permutation test framework specifically designed to work with genomic regions. In addition to the predefined randomization and evaluation strategies, regioneR is fully customizable allowing the use of custom strategies to adapt it to specific questions. Finally, it also implements a novel function to evaluate the local specificity of the detected association. regioneR is an R package released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/regioneR). Contact: rmalinverni@carrerasresearch.org. © The Author 2015. Published by Oxford University Press.
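
    The general idea of a permutation test on genomic regions (randomize one region set, re-evaluate, compare with the observed value) can be sketched as follows. This is a Python illustration of the concept rather than regioneR's R API: the function names, the single linear "genome", the uniform re-placement strategy, and the overlap-count evaluation are assumptions chosen for brevity.

```python
import random


def count_overlaps(a, b):
    """Number of intervals in `a` that overlap at least one interval in `b`."""
    return sum(any(s < e2 and s2 < e for s2, e2 in b) for s, e in a)


def permutation_test(a, b, genome_length, n_perm=1000, seed=0):
    """Return (observed overlaps, empirical p-value).

    Hypothetical sketch: regions are half-open (start, end) pairs on one
    linear genome of `genome_length` bases.
    """
    rng = random.Random(seed)
    observed = count_overlaps(a, b)
    hits = 0
    for _ in range(n_perm):
        # Randomization: place each region of `a` uniformly at random,
        # preserving its width.
        shuffled = []
        for s, e in a:
            width = e - s
            new_start = rng.randrange(genome_length - width + 1)
            shuffled.append((new_start, new_start + width))
        # Evaluation: does the randomized set overlap at least as often?
        if count_overlaps(shuffled, b) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy region sets, invented for the example.
a = [(100, 200), (1000, 1100), (5000, 5100)]
b = [(150, 250), (1050, 1150), (5050, 5150)]
observed, p_value = permutation_test(a, b, genome_length=1_000_000)
```

    Swapping in a different randomization (for example, one that respects chromosome boundaries or masked regions) or a different evaluation function changes the question being asked, which is the kind of customization the abstract refers to.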

  12. Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

    PubMed Central

    Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

    2018-01-01

    Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048

  13. Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data.

    PubMed

    Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

    2018-03-01

    Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.

  14. Protospacer Adjacent Motif (PAM)-Distal Sequences Engage CRISPR Cas9 DNA Target Cleavage

    PubMed Central

    Ethier, Sylvain; Schmeing, T. Martin; Dostie, Josée; Pelletier, Jerry

    2014-01-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)-associated enzyme Cas9 is an RNA-guided nuclease that has been widely adapted for genome editing in eukaryotic cells. However, the in vivo target specificity of Cas9 is poorly understood and most studies rely on in silico predictions to define the potential off-target editing spectrum. Using chromatin immunoprecipitation followed by sequencing (ChIP-seq), we delineate the genome-wide binding panorama of catalytically inactive Cas9 directed by two different single guide (sg) RNAs targeting the Trp53 locus. Cas9:sgRNA complexes are able to load onto multiple sites with short seed regions adjacent to 5′NGG3′ protospacer adjacent motifs (PAM). Yet among 43 ChIP-seq sites harboring seed regions analyzed for mutational status, we find editing only at the intended on-target locus and one off-target site. In vitro analysis of target site recognition revealed that interactions between the 5′ end of the guide and PAM-distal target sequences are necessary to efficiently engage Cas9 nucleolytic activity, providing an explanation for why off-target editing is significantly lower than expected from ChIP-seq data. PMID:25275497
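
    To make the notion of sites "adjacent to 5′NGG3′ protospacer adjacent motifs" concrete, the sketch below scans the forward strand of a DNA string for NGG motifs and reports the 20 nt immediately upstream of each. It is an illustration only, not the study's analysis pipeline; the function name and the 20-nt protospacer length are assumptions, and the reverse strand is not searched.

```python
import re


def find_pam_sites(seq, protospacer_len=20):
    """Yield (start, protospacer, pam) for each NGG PAM on the forward strand.

    Hypothetical helper: `start` is the 0-based index of the first
    protospacer base; PAMs too close to the 5' end are skipped.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping PAMs (as in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            start = pam_start - protospacer_len
            yield start, seq[start:pam_start], match.group(1)


# Toy sequence, invented for the example.
sites = list(find_pam_sites("A" * 20 + "TGG" + "C" * 5))
```

    The PAM-proximal end of each reported protospacer (its last bases before the NGG) corresponds to what the abstract calls the seed region.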

  15. The influence of specific neighboring bases on substitution bias in noncoding regions of the plant chloroplast genome.

    PubMed

    Morton, B R; Oberholzer, V M; Clegg, M T

    1997-09-01

    Substitutions occurring in noncoding sequences of the plant chloroplast genome violate the independence of sites that is assumed by substitution models in molecular evolution. The probability that a substitution at a site is a transversion, as opposed to a transition, increases significantly with increasing A + T content of the two adjacent nucleotides. In the present study, this dependency of substitutions on local context is examined further in a number of noncoding regions from the chloroplast genome of members of the grass family (Poaceae). Two features were examined: the influence of specific neighboring bases, as opposed to the general A + T content, on transversion proportion and an influence on substitutions by nucleotides other than the two immediately adjacent to the site of substitution. In both cases, a significant effect was found. In the case of specific nucleotides, transversion proportion is significantly higher at sites with a pyrimidine immediately 5' on either strand. Substitutions at sites of the type YNR, where N is the site of substitution, have the highest rate of transversion. This specific effect is secondary to the A + T content effect such that, in terms of proportion of substitutions that are transversions, the nucleotides are ranked T > A > C > G as to their effect when they are immediately 5' to the site of substitution. In the case of nucleotides other than the immediate neighbors, a significant influence on substitution dynamics is observed in the case where the two neighboring bases are both A and/or T. Thus, substitutions are primarily, but not exclusively, influenced by the composition of the two nucleotides that are immediately adjacent. These results indicate that the pattern of molecular evolution of the plant chloroplast genome is extremely complex as a result of a variety of inter-site dependencies.
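
    The quantity discussed here, the proportion of substitutions that are transversions as a function of the A + T content of the two flanking bases, can be tabulated with a short script. This is a minimal sketch under assumed inputs: the function names and the four-field tuple layout are hypothetical, and the example substitutions are toy data, not values from the study.

```python
PURINES = {"A", "G"}


def is_transversion(ref, alt):
    """A transversion exchanges a purine for a pyrimidine or vice versa."""
    return (ref in PURINES) != (alt in PURINES)


def transversion_proportion_by_context(substitutions):
    """Map flanking A+T count (0, 1 or 2) to the transversion proportion.

    Each substitution is (five_prime_base, ref, alt, three_prime_base).
    Contexts with no observations are omitted from the result.
    """
    totals = {0: 0, 1: 0, 2: 0}
    transversions = {0: 0, 1: 0, 2: 0}
    for five, ref, alt, three in substitutions:
        # Count how many of the two immediate neighbours are A or T.
        at_count = sum(base in "AT" for base in (five, three))
        totals[at_count] += 1
        transversions[at_count] += is_transversion(ref, alt)
    return {k: transversions[k] / totals[k] for k in totals if totals[k]}


# Toy substitutions, invented for the example.
toy = [
    ("G", "A", "G", "C"),  # 0 flanking A/T, transition
    ("A", "A", "C", "T"),  # 2 flanking A/T, transversion
    ("A", "C", "T", "T"),  # 2 flanking A/T, transition
    ("T", "G", "T", "G"),  # 1 flanking A/T, transversion
]
proportions = transversion_proportion_by_context(toy)
```

    Keying the tally on the identity of the 5' base instead of the A + T count would give the per-nucleotide comparison (T, A, C, G) mentioned in the abstract.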

  16. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    PubMed Central

    Zhang, Hao; van Diepeningen, Anne D.; van der Lee, Theo A. J.; Waalwijk, Cees; de Hoog, G. Sybren

    2016-01-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives, but also single copy regions can be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated such as: exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/). PMID

  17. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research.

    PubMed

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    2016-06-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives, but also single copy regions can be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated such as: exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).

  18. Lithospheric structure of the South China Sea and adjacent regions: Results from potential field modelling

    NASA Astrophysics Data System (ADS)

    Chen, Ming; Fang, Jian; Cui, Ronghua

    2018-02-01

    This work aims to investigate the crustal and lithospheric mantle thickness of the South China Sea (SCS) and adjacent regions. The crust-mantle interface, average crustal density, and lithospheric mantle base are calculated from free-air gravity anomaly and topographic data using an iterative inversion method. We construct a three-dimensional lithospheric model with different hierarchical layers. The satellite-derived gravity is used to invert the average crustal density and Moho (crust-mantle interface) undulations. The average crustal density and LAB (lithosphere-asthenosphere boundary) depths are further adjusted by topographic data under the assumption of local isostasy. The average difference in Moho depths between this study and the seismic measurement results is <1.5 km. The results show that in oceanic regions, the Moho depths are 7.5-30 km and the LAB depths are 65-120 km. The lithospheric thickness of the SCS basin and the adjacent regions increases from the sea basin to the continental margin with a large gradient in the ocean-continent transition zones. The Moho depths of conjugate plots during the opening of SCS, Zhongsha Islands and Reed Bank, reveal the asymmetric spreading pattern of SCS seafloor spreading. The lithospheric thinning pattern indicates two different spreading directions during seafloor spreading, which changed from N-S to NW-SE after the southward transition of the spreading axis. The lithosphere of the SCS basin and adjacent regions indicate that the SCS basin is a young basin with a stable interior lithosphere.

  19. Using Morphological, Molecular and Climatic Data to Delimitate Yews along the Hindu Kush-Himalaya and Adjacent Regions

    PubMed Central

    Poudel, Ram C.; Möller, Michael; Gao, Lian-Ming; Ahrends, Antje; Baral, Sushim R.; Liu, Jie; Thomas, Philip; Li, De-Zhu

    2012-01-01

    Background: Despite the availability of several studies to clarify taxonomic problems on the highly threatened yews of the Hindu Kush-Himalaya (HKH) and adjacent regions, the total number of species and their exact distribution ranges remains controversial. We explored the use of comprehensive sets of morphological, molecular and climatic data to clarify taxonomy and distributions of yews in this region. Methodology/Principal Findings: A total of 743 samples from 46 populations of wild yew and 47 representative herbarium specimens were analyzed. Principal component analyses on 27 morphological characters and 15 bioclimatic variables plus altitude and maximum parsimony analysis on molecular ITS and trnL-F sequences indicated the existence of three distinct species occurring in different ecological (climatic) and altitudinal gradients along the HKH and adjacent regions: Taxus contorta from eastern Afghanistan to the eastern end of Central Nepal, T. wallichiana from the western end of Central Nepal to Northwest China, and the first report of the South China low to mid-elevation species T. mairei in Nepal, Bhutan, Northeast India, Myanmar and South Vietnam. Conclusion/Significance: The detailed sampling and combination of different data sets allowed us to identify three clearly delineated species and their precise distribution ranges in the HKH and adjacent regions, which showed no overlap or no distinct hybrid zone. This might be due to differences in the ecological (climatic) requirements of the species. The analyses further provided the selection of diagnostic morphological characters for the identification of yews occurring in the HKH and adjacent regions. Our work demonstrates that extensive sampling combined with the analysis of diverse data sets can reliably address the taxonomy of morphologically challenging plant taxa. PMID:23056501

  20. Telomere maintenance through recruitment of internal genomic regions.

    PubMed

    Seo, Beomseok; Kim, Chuna; Hills, Mark; Sung, Sanghyun; Kim, Hyesook; Kim, Eunkyeong; Lim, Daisy S; Oh, Hyun-Seok; Choi, Rachael Mi Jung; Chun, Jongsik; Shim, Jaegal; Lee, Junho

    2015-09-18

    Cells surviving crisis are often tumorigenic and their telomeres are commonly maintained through the reactivation of telomerase. However, surviving cells occasionally activate a recombination-based mechanism called alternative lengthening of telomeres (ALT). Here we establish stably maintained survivors in telomerase-deleted Caenorhabditis elegans that escape from sterility by activating ALT. ALT survivors trans-duplicate an internal genomic region, which is already cis-duplicated to chromosome ends, across the telomeres of all chromosomes. These 'Template for ALT' (TALT) regions consist of a block of genomic DNA flanked by telomere-like sequences, and differ between the two genetic backgrounds. We establish a model in which an ancestral duplication of a donor TALT region to a proximal telomere region forms a genomic reservoir ready to be incorporated into telomeres on ALT activation.

  1. Tsunami Ready Recognition Program for the Caribbean and Adjacent Regions Launched in 2015

    NASA Astrophysics Data System (ADS)

    von Hillebrandt-Andrade, C.; Hinds, K.; Aliaga, B.; Brome, A.; Lopes, R.

    2015-12-01

    Over 75 tsunamis have been documented in the Caribbean and Adjacent Regions over the past 500 years with 4,561 associated deaths according to the NOAA Tsunami Database. The most recent devastating tsunamis occurred in 1946 in the Dominican Republic; 1,865 people died. With the explosive increase in residents, tourists, infrastructure, and economic activity along the coasts, the potential for human and economic loss is enormous. It has been estimated that on any day, more than 500,000 people in the Caribbean could be in harm's way just along the beaches, with hundreds of thousands more working and living in the tsunami hazard zones. In 2005 the UNESCO Intergovernmental Oceanographic Commission established the Intergovernmental Coordination Group for the Tsunami and other Coastal Hazards Warning System for the Caribbean and Adjacent Regions (ICG CARIBE EWS) to coordinate tsunami efforts among the 48 participating countries and territories in the region. In addition to monitoring, modeling and communication systems, one of the fundamental components of the warning system is community preparedness, readiness and resilience. Over the past 10 years 49 coastal communities in the Caribbean have been recognized as TsunamiReady® by the US National Weather Service (NWS) in the case of Puerto Rico and the US Virgin Islands and jointly by UNESCO and NWS in the case of the non-US jurisdictions of Anguilla and the British Virgin Islands. In response to the positive feedback on the implementation of TsunamiReady, the ICG CARIBE EWS in 2015 recommended the approval of the guidelines for a Community Performance Based Recognition program. It also recommended the adoption of the name "Tsunami Ready", on which the NWS was consulted and responded positively. Ten requirements were established for recognition and are divided among Preparedness, Mitigation and Response elements which were adapted from the proposed new US TsunamiReady guidelines and align well with emergency management functions. Both a

  2. Conservation of synteny between the genome of the pufferfish (Fugu rubripes) and the region on human chromosome 14 (14q24.3) associated with familial Alzheimer disease (AD3 locus)

    PubMed

    Trower, M K; Orton, S M; Purvis, I J; Sanseau, P; Riley, J; Christodoulou, C; Burt, D; See, C G; Elgar, G; Sherrington, R; Rogaev, E I; St George-Hyslop, P; Brenner, S; Dykes, C W

    1996-02-20

    The genome of the pufferfish (Fugu rubripes) (400 Mb) is approximately 7.5 times smaller than the human genome, but it has a similar gene repertoire to that of man. If regions of the two genomes exhibited conservation of gene order (i.e., were syntenic), it should be possible to reduce dramatically the effort required for identification of candidate genes in human disease loci by sequencing syntenic regions of the compact Fugu genome. We have demonstrated that three genes (dihydrolipoamide succinyltransferase, S31iii125, and S20i15), which are linked to FOS in the familial Alzheimer disease focus (AD3) on human chromosome 14, have homologues in the Fugu genome adjacent to Fugu cFOS. The relative gene order of cFOS, S31iii125, and S20i15 was the same in both genomes, but in Fugu these three genes lay within a 12.4-kb region, compared to >600 kb in the human AD3 locus. These results demonstrate the conservation of synteny between the genomes of Fugu and man and highlight the utility of this approach for sequence-based identification of genes in human disease loci.

  3. Enhancer scanning to locate regulatory regions in genomic loci

    PubMed Central

    Buckley, Melissa; Gjyshi, Anxhela; Mendoza-Fandiño, Gustavo; Baskin, Rebekah; Carvalho, Renato S.; Carvalho, Marcelo A.; Woods, Nicholas T.; Monteiro, Alvaro N.A.

    2016-01-01

    The present protocol provides a rapid, streamlined and scalable strategy to systematically scan genomic regions for the presence of transcriptional regulatory regions active in a specific cell type. It creates genomic tiles spanning a region of interest that are subsequently cloned by recombination into a luciferase reporter vector containing the Simian Virus 40 promoter. Tiling clones are transfected into specific cell types to test for the presence of transcriptional regulatory regions. The protocol includes testing of different SNP (single nucleotide polymorphism) alleles to determine their effect on regulatory activity. This procedure provides a systematic framework to identify candidate functional SNPs within a locus during functional analysis of genome-wide association studies. This protocol adapts and combines previous well-established molecular biology methods to provide a streamlined strategy, based on automated primer design and recombinational cloning to rapidly go from a genomic locus to a set of candidate functional SNPs in eight weeks. PMID:26658467

  4. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

    PubMed

    Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

    2018-05-31

    In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. Developments in high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES, based on supervised deep learning approaches, for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates the potential of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.

  5. Augmenting Chinese hamster genome assembly by identifying regions of high confidence.

    PubMed

    Vishwanathan, Nandita; Bandyopadhyay, Arpan A; Fu, Hsu-Yuan; Sharma, Mohit; Johnson, Kathryn C; Mudge, Joann; Ramaraj, Thiruvarangan; Onsongo, Getiria; Silverstein, Kevin A T; Jacob, Nitya M; Le, Huong; Karypis, George; Hu, Wei-Shou

    2016-09-01

    Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of genome sequences for Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, the mammalian genomes assembled using shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high confidence regions in the assembled genome will facilitate its use for cell engineering and genome engineering. We assembled two independent drafts of the Chinese hamster genome by de novo assembly from shotgun sequencing reads and by re-scaffolding and gap-filling the draft genome from NCBI for improved scaffold lengths and gap fractions. We then used the two independent assemblies to identify high confidence regions using two different approaches. First, the two independent assemblies were compared at the sequence level to identify their consensus regions as "high confidence regions", which account for at least 78% of the assembled genome. Further, a genome wide comparison of the Chinese hamster scaffolds with mouse chromosomes revealed scaffolds with large blocks of collinearity, which were also compiled as high-quality scaffolds. Genome scale collinearity was complemented with EST based synteny which also revealed conserved gene order compared to mouse. As cell line sequencing becomes more commonly practiced, the approaches reported here are useful for assessing the quality of assembly and potentially facilitate the engineering of cell lines. Copyright © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Is mammalian chromosomal evolution driven by regions of genome fragility?

    PubMed Central

    Ruiz-Herrera, Aurora; Castresana, Jose; Robinson, Terence J

    2006-01-01

    Background A fundamental question in comparative genomics concerns the identification of mechanisms that underpin chromosomal change. In an attempt to shed light on the dynamics of mammalian genome evolution, we analyzed the distribution of syntenic blocks, evolutionary breakpoint regions, and evolutionary breakpoints taken from public databases available for seven eutherian species (mouse, rat, cattle, dog, pig, cat, and horse) and the chicken, and examined these for correspondence with human fragile sites and tandem repeats. Results Our results confirm previous investigations that showed the presence of chromosomal regions in the human genome that have been repeatedly used as illustrated by a high breakpoint accumulation in certain chromosomes and chromosomal bands. We show, however, that there is a striking correspondence between fragile site location, the positions of evolutionary breakpoints, and the distribution of tandem repeats throughout the human genome, which similarly reflect a non-uniform pattern of occurrence. Conclusion These observations provide further evidence that certain chromosomal regions in the human genome have been repeatedly used in the evolutionary process. As a consequence, the genome is a composite of fragile regions prone to reorganization that have been conserved in different lineages, and genomic tracts that do not exhibit the same levels of evolutionary plasticity. PMID:17156441

  7. Origin of the CMS gene locus in rapeseed cybrid mitochondria: active and inactive recombination produces the complex CMS gene region in the mitochondrial genomes of Brassicaceae.

    PubMed

    Oshima, Masao; Kikuchi, Rie; Imamura, Jun; Handa, Hirokazu

    2010-01-01

    CMS (cytoplasmic male sterile) rapeseed is produced by asymmetrical somatic cell fusion between the Brassica napus cv. Westar and the Raphanus sativus Kosena CMS line (Kosena radish). The CMS rapeseed contains a CMS gene, orf125, which is derived from Kosena radish. Our sequence analyses revealed that the orf125 region in CMS rapeseed originated from recombination between the orf125/orfB region and the nad1C/ccmFN1 region by way of a 63 bp repeat. A precise sequence comparison among the related sequences in CMS rapeseed, Kosena radish and normal rapeseed showed that the orf125 region in CMS rapeseed consisted of the Kosena orf125/orfB region and the rapeseed nad1C/ccmFN1 region, even though Kosena radish had both the orf125/orfB region and the nad1C/ccmFN1 region in its mitochondrial genome. We also identified three tandem repeat sequences in the regions surrounding orf125, including a 63 bp repeat, which were involved in several recombination events. Interestingly, differences in the recombination activity for each repeat sequence were observed, even though these sequences were located adjacent to each other in the mitochondrial genome. We report results indicating that recombination events within the mitochondrial genomes are regulated at the level of specific repeat sequences depending on the cellular environment.

  8. GANESH: software for customized annotation of genome regions.

    PubMed

    Huntley, Derek; Hummerich, Holger; Smedley, Damian; Kittivoravitkul, Sasivimol; McCarthy, Mark; Little, Peter; Sergot, Marek

    2003-09-01

    GANESH is a software package designed to support the genetic analysis of regions of human and other genomes. It provides a set of components that may be assembled to construct a self-updating database of DNA sequence, mapping data, and annotations of possible genome features. Once one or more remote sources of data for the target region have been identified, all sequences for that region are downloaded, assimilated, and subjected to a (configurable) set of standard database-searching and genome-analysis packages. The results are stored in compressed form in a relational database, and are updated automatically on a regular schedule so that they are always immediately available in their most up-to-date versions. A Java front-end, executed as a stand-alone application or web applet, provides a graphical interface for navigating the database and for viewing the annotations. There are facilities for importing and exporting data in the format of the Distributed Annotation System (DAS), enabling a GANESH database to be used as a component of a DAS configuration. The system has been used to construct databases for about a dozen regions of human chromosomes and for three regions of mouse chromosomes.

  9. Deformation Rates in the Snake River Plain and Adjacent Basin and Range Regions Based on GPS Measurements

    NASA Astrophysics Data System (ADS)

    Payne, S. J.; McCaffrey, R.; King, R. W.; Kattenhorn, S. A.

    2012-12-01

    We estimate horizontal velocities for 405 sites using Global Positioning System (GPS) phase data collected from 1994 to 2010 within the Northern Basin and Range Province, U.S.A. The velocities reveal a slowly-deforming region within the Snake River Plain in Idaho and Owyhee-Oregon Plateau in Oregon separated from the actively extending adjacent Basin and Range regions by shear. Our results show a NE-oriented extensional strain rate of 5.6 ± 0.7 nanostrain/yr in the Centennial Tectonic Belt and an ~E-oriented extensional strain rate of 3.5 ± 0.2 nanostrain/yr in the Great Basin. These extensional rates contrast with the very low strain rate within the 125 km x 650 km region of the Snake River Plain and Owyhee-Oregon Plateau which is not distinguishable from zero (-0.1 ± 0.4 nanostrain/yr). Inversions of Snake River Plain velocities with dike-opening models indicate that rapid extension by dike intrusion in volcanic rift zones, as previously hypothesized, is not currently occurring. GPS data also disclose that rapid extension in the surrounding regions adjacent to the slowly-deforming region of the Snake River Plain drives shear between them. We estimate right-lateral shear with slip rates of 0.3-1.5 mm/yr along the northwestern boundary adjacent to the Centennial Tectonic Belt and left-lateral oblique extension with slip rates of 0.5-1.5 mm/yr along the southeastern boundary adjacent to the Intermountain Seismic Belt. The fastest lateral shearing evident in the GPS occurs near the Yellowstone Plateau where earthquakes with right-lateral strike-slip focal mechanisms are within a NE-trending zone of seismicity. The regional velocity gradients are best fit by nearby poles of rotation for the Centennial Tectonic Belt, Snake River Plain, Owyhee-Oregon Plateau, and eastern Oregon, indicating that clockwise rotation is not locally driven by Yellowstone hotspot volcanism, but instead by extension to the south across the Wasatch fault possibly due to gravitational

  10. Epigenomic profiling of DNA methylation in paired prostate cancer versus adjacent benign tissue

    PubMed Central

    Geybels, Milan S.; Zhao, Shanshan; Wong, Chao-Jen; Bibikova, Marina; Klotzle, Brandy; Wu, Michael; Ostrander, Elaine A.; Fan, Jian-Bing; Feng, Ziding; Stanford, Janet L.

    2016-01-01

    Background Aberrant DNA methylation may promote prostate carcinogenesis. We investigated epigenome-wide DNA methylation profiles in prostate cancer (PCa) compared to adjacent benign tissue to identify differentially methylated CpG sites. Methods The study included paired PCa and adjacent benign tissue samples from 20 radical prostatectomy patients. Epigenetic profiling was done using the Infinium HumanMethylation450 BeadChip. Linear models that accounted for the paired study design and False Discovery Rate Q-values were used to evaluate differential CpG methylation. mRNA expression levels of the genes with the most differentially methylated CpG sites were analyzed. Results In total, 2,040 differentially methylated CpG sites were identified in PCa versus adjacent benign tissue (Q-value <0.001), the majority of which were hypermethylated (n = 1,946; 95%). DNA methylation profiles accurately distinguished between PCa and benign tissue samples. Twenty-seven top-ranked hypermethylated CpGs had a mean methylation difference of at least 40% between tissue types, which included 25 CpGs in 17 genes. Furthermore, for ten genes over 50% of promoter region CpGs were hypermethylated in PCa versus benign tissue. The top-ranked differentially methylated genes included three genes that were associated with both promoter hypermethylation and reduced gene expression: SCGB3A1, HIF3A, and AOX1. Analysis of The Cancer Genome Atlas (TCGA) data provided confirmatory evidence for our findings. Conclusions This study of PCa versus adjacent benign tissue showed many differentially methylated CpGs and regions in and outside gene promoter regions, which may potentially be used for the development of future epigenetic-based diagnostic tests or as therapeutic targets. PMID:26383847
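
    The abstract above reports False Discovery Rate Q-values but does not say how they were computed. As a minimal sketch, assuming the widely used Benjamini-Hochberg procedure (an assumption, not a detail confirmed by the record), adjusted values for a list of per-CpG p-values could be obtained as follows. The function name and the example p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: this is one common way to obtain FDR q-values; the
    study summarized above may have used a different estimator.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


example_p = [0.01, 0.04, 0.03, 0.5]  # made-up p-values for illustration
example_q = bh_qvalues(example_p)
```

    A site would then be called differentially methylated when its adjusted value falls below the chosen threshold (the abstract uses Q-value <0.001).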

  11. Epigenomic profiling of DNA methylation in paired prostate cancer versus adjacent benign tissue.

    PubMed

    Geybels, Milan S; Zhao, Shanshan; Wong, Chao-Jen; Bibikova, Marina; Klotzle, Brandy; Wu, Michael; Ostrander, Elaine A; Fan, Jian-Bing; Feng, Ziding; Stanford, Janet L

    2015-12-01

    Aberrant DNA methylation may promote prostate carcinogenesis. We investigated epigenome-wide DNA methylation profiles in prostate cancer (PCa) compared to adjacent benign tissue to identify differentially methylated CpG sites. The study included paired PCa and adjacent benign tissue samples from 20 radical prostatectomy patients. Epigenetic profiling was done using the Infinium HumanMethylation450 BeadChip. Linear models that accounted for the paired study design and False Discovery Rate Q-values were used to evaluate differential CpG methylation. mRNA expression levels of the genes with the most differentially methylated CpG sites were analyzed. In total, 2,040 differentially methylated CpG sites were identified in PCa versus adjacent benign tissue (Q-value < 0.001), the majority of which were hypermethylated (n = 1,946; 95%). DNA methylation profiles accurately distinguished between PCa and benign tissue samples. Twenty-seven top-ranked hypermethylated CpGs had a mean methylation difference of at least 40% between tissue types, which included 25 CpGs in 17 genes. Furthermore, for 10 genes over 50% of promoter region CpGs were hypermethylated in PCa versus benign tissue. The top-ranked differentially methylated genes included three genes that were associated with both promoter hypermethylation and reduced gene expression: SCGB3A1, HIF3A, and AOX1. Analysis of The Cancer Genome Atlas (TCGA) data provided confirmatory evidence for our findings. This study of PCa versus adjacent benign tissue showed many differentially methylated CpGs and regions in and outside gene promoter regions, which may potentially be used for the development of future epigenetic-based diagnostic tests or as therapeutic targets. © 2015 Wiley Periodicals, Inc.

  12. Harnessing genomics to improve health in the Eastern Mediterranean Region – an executive course in genomics policy

    PubMed Central

    Acharya, Tara; Rab, Mohammed Abdur; Singer, Peter A; Daar, Abdallah S

    2005-01-01

    Background While innovations in medicine, science and technology have resulted in improved health and quality of life for many people, the benefits of modern medicine continue to elude millions of people in many parts of the world. To assess the potential of genomics to address health needs in EMR, the World Health Organization's Eastern Mediterranean Regional Office and the University of Toronto Joint Centre for Bioethics jointly organized a Genomics and Public Health Policy Executive Course, held September 20th–23rd, 2003, in Muscat, Oman. The 4-day course was sponsored by WHO-EMRO with additional support from the Canadian Program in Genomics and Global Health. The overall objective of the course was to collectively explore how to best harness genomics to improve health in the region. This article presents the course findings and recommendations for genomics policy in EMR. Methods The course brought together senior representatives from academia, biotechnology companies, regulatory bodies, media, voluntary, and legal organizations to engage in discussion. Topics covered included scientific advances in genomics, followed by innovations in business models, public sector perspectives, ethics, legal issues and national innovation systems. Results A set of recommendations, summarized below, was formulated for the Regional Office, the Member States and for individuals. • Advocacy for genomics and biotechnology for political leadership; • Networking between member states to share information, expertise, training, and regional cooperation in biotechnology; coordination of national surveys for assessment of health biotechnology innovation systems, science capacity, government policies, legislation and regulations, intellectual property policies, private sector activity; • Creation in each member country of an effective National Body on genomics, biotechnology and health to: - formulate national biotechnology strategies - raise biotechnology awareness - encourage

  13. RGmatch: matching genomic regions to proximal genes in omics data integration.

    PubMed

    Furió-Tarí, Pedro; Conesa, Ana; Tarazona, Sonia

    2016-11-22

    The integrative analysis of multiple genomics data often requires that genome coordinates-based signals have to be associated with proximal genes. The relative location of a genomic region with respect to the gene (gene area) is important for functional data interpretation; hence algorithms that match regions to genes should be able to deliver insight into this information. In this work we review the tools that are publicly available for making region-to-gene associations. We also present a novel method, RGmatch, a flexible and easy-to-use Python tool that computes associations either at the gene, transcript, or exon level, applying a set of rules to annotate each region-gene association with the region location within the gene. RGmatch can be applied to any organism as long as genome annotation is available. Furthermore, we qualitatively and quantitatively compare RGmatch to other tools. RGmatch simplifies the association of a genomic region with its closest gene. At the same time, it is a powerful tool because the rules used to annotate these associations are very easy to modify according to the researcher's specific interests. Some important differences between RGmatch and other similar tools already in existence are RGmatch's flexibility, its wide range of user options, compatibility with any annotatable organism, and its comprehensive and user-friendly output.
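
    The core idea of region-to-gene association can be illustrated with a short sketch. It is not RGmatch's code or rule set: the `Gene` type, the `closest_gene` function, the closed-interval coordinates, the single-chromosome forward-strand setting, and the three location labels are all simplifying assumptions made for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # closed interval, forward strand assumed
    end: int


def closest_gene(region_start, region_end, genes):
    """Return (gene name, distance, label) for the gene nearest the region.

    Distance is 0 when the region overlaps the gene. The labels are a
    hypothetical simplification, not RGmatch's annotation rules.
    """
    best = None
    for gene in genes:
        if gene.end < region_start:
            # Region lies after the gene end.
            distance, label = region_start - gene.end, "downstream"
        elif gene.start > region_end:
            # Region lies before the gene start.
            distance, label = gene.start - region_end, "upstream"
        else:
            distance, label = 0, "overlap"
        if best is None or distance < best[1]:
            best = (gene.name, distance, label)
    return best


genes = [Gene("geneA", 100, 200), Gene("geneB", 500, 600)]  # toy annotation
hit_between = closest_gene(250, 300, genes)
hit_inside = closest_gene(150, 160, genes)
```

    A linear scan keeps the sketch short; for genome-scale annotations a sorted index or interval tree would be the more usual choice.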

  14. Genetic organization of the unc-22 IV gene and the adjacent region in Caenorhabditis elegans.

    PubMed

    Rogalski, T M; Baillie, D L

    1985-01-01

    The genetic organization of the region immediately adjacent to the unc-22 IV gene in Caenorhabditis elegans has been studied. We have identified twenty essential genes in this interval of approximately 1.5 map units on Linkage Group IV. The mutations that define these genes were positioned by recombination mapping and complementation with several deficiencies. With few exceptions, the positions obtained by these two methods agreed. Eight of the twenty essential genes identified are represented by more than one allele. Three possible internal deletions of the unc-22 gene have been located by intra-genic mapping. In addition, the right end point of a deficiency or an inversion affecting the adjacent genes let-56 and unc-22 has been positioned inside the unc-22 gene.

  15. The use of genomic coancestry matrices in the optimisation of contributions to maintain genetic diversity at specific regions of the genome.

    PubMed

    Gómez-Romano, Fernando; Villanueva, Beatriz; Fernández, Jesús; Woolliams, John A; Pong-Wong, Ricardo

    2016-01-13

    Optimal contribution methods have proved to be very efficient for controlling the rates at which coancestry and inbreeding increase and therefore, for maintaining genetic diversity. These methods have usually relied on pedigree information for estimating genetic relationships between animals. However, with the large amount of genomic information now available such as high-density single nucleotide polymorphism (SNP) chips that contain thousands of SNPs, it becomes possible to calculate more accurate estimates of relationships and to target specific regions in the genome where there is a particular interest in maximising genetic diversity. The objective of this study was to investigate the effectiveness of using genomic coancestry matrices for: (1) minimising the loss of genetic variability at specific genomic regions while restricting the overall loss in the rest of the genome; or (2) maximising the overall genetic diversity while restricting the loss of diversity at specific genomic regions. Our study shows that the use of genomic coancestry was very successful at minimising the loss of diversity and outperformed the use of pedigree-based coancestry (genetic diversity even increased in some scenarios). The results also show that genomic information allows a targeted optimisation to maintain diversity at specific genomic regions, whether they are linked or not. The level of variability maintained increased when the targeted regions were closely linked. However, such targeted management leads to an important loss of diversity in the rest of the genome and, thus, it is necessary to take further actions to constrain this loss. Optimal contribution methods also proved to be effective at restricting the loss of diversity in the rest of the genome, although the resulting rate of coancestry was higher than the constraint imposed. The use of genomic matrices when optimising contributions permits the control of genetic diversity and inbreeding at specific regions of the

  16. A regional ionospheric TEC mapping technique over China and adjacent areas on the basis of data assimilation

    NASA Astrophysics Data System (ADS)

    Aa, Ercha; Huang, Wengeng; Yu, Shimei; Liu, Siqing; Shi, Liqin; Gong, Jiancun; Chen, Yanhong; Shen, Hua

    2015-06-01

    In this paper, a regional total electron content (TEC) mapping technique over China and adjacent areas (70°E-140°E and 15°N-55°N) is developed on the basis of a Kalman filter data assimilation scheme driven by Global Navigation Satellite Systems (GNSS) data from the Crustal Movement Observation Network of China and International GNSS Service. The regional TEC maps can be generated accordingly with the spatial and temporal resolution being 1°×1° and 5 min, respectively. The accuracy and quality of the TEC mapping technique have been validated through the comparison with GNSS observations, the International Reference Ionosphere model values, the global ionosphere maps from Center for Orbit Determination of Europe, and the Massachusetts Institute of Technology Automated Processing of GPS TEC data from Madrigal database. The verification results indicate that great systematic improvements can be obtained when data are assimilated into the background model, which demonstrates the effectiveness of this technique in providing accurate regional specification of the ionospheric TEC over China and adjacent areas.
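
    The assimilation step of a Kalman filter can be sketched for a single grid cell. This is a scalar illustration under stated assumptions, not the authors' scheme: the observation is taken to measure TEC directly, and the function name and all numeric values are hypothetical.

```python
def kalman_update(background, background_var, observation, observation_var):
    """Scalar Kalman measurement update for one grid cell.

    Assumes the observation measures the state directly (H = 1).
    Returns the analysis value and its error variance.
    """
    # Gain weights the observation by the relative uncertainties.
    gain = background_var / (background_var + observation_var)
    analysis = background + gain * (observation - background)
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Hypothetical values in TEC units: model background 10 (variance 4),
# GNSS-derived observation 14 (variance 4).
tec_analysis, tec_var = kalman_update(10.0, 4.0, 14.0, 4.0)
```

    With equal variances the analysis lands midway between background and observation, and the error variance is halved, which is the sense in which assimilated data improve on the background model.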

  17. Algorithms and Complexity Results for Genome Mapping Problems.

    PubMed

    Rajaraman, Ashok; Zanetti, Joao Paulo Pereira; Manuch, Jan; Chauve, Cedric

    2017-01-01

    Genome mapping algorithms aim at computing an ordering of a set of genomic markers based on local ordering information such as adjacencies and intervals of markers. In most genome mapping models, markers are assumed to occur uniquely in the resulting map. We introduce algorithmic questions that consider repeats, i.e., markers that can have several occurrences in the resulting map. We show that, provided with an upper bound on the copy number of repeated markers and with intervals that span full repeat copies, called repeat spanning intervals, the problem of deciding if a set of adjacencies and repeat spanning intervals admits a genome representation is tractable if the target genome can contain linear and/or circular chromosomal fragments. We also show that extracting a maximum cardinality or weight subset of repeat spanning intervals given a set of adjacencies that admits a genome realization is NP-hard but fixed-parameter tractable in the maximum copy number and the number of adjacent repeats, and tractable if intervals contain a single repeated marker.
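
    For context, the repeat-free baseline that this work generalizes is simple to check: when every marker occurs exactly once, a set of adjacencies fits a genome of linear and circular fragments exactly when no marker has more than two neighbours. The sketch below covers only that baseline, not the repeat-aware algorithms of the paper; the function name and marker labels are hypothetical.

```python
from collections import defaultdict


def admits_genome(adjacencies, allow_circular=True):
    """Check whether adjacencies of unique markers fit a genome map.

    Repeat-free case only: each marker may have at most two neighbours.
    If circular fragments are not allowed, cycles are rejected too.
    """
    pairs = {frozenset(pair) for pair in adjacencies}
    if any(len(pair) != 2 for pair in pairs):
        return False  # a marker cannot be adjacent to itself
    neighbours = defaultdict(set)
    for pair in pairs:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    if any(len(ns) > 2 for ns in neighbours.values()):
        return False
    if allow_circular:
        return True
    # With degree <= 2, a component is a cycle iff all degrees equal 2.
    seen = set()
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            marker = stack.pop()
            component.append(marker)
            for nxt in neighbours[marker]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if all(len(neighbours[m]) == 2 for m in component):
            return False
    return True


path_ok = admits_genome([("a", "b"), ("b", "c")])
branch_ok = admits_genome([("a", "b"), ("a", "c"), ("a", "d")])
```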

  18. Genome-Wide Analysis in Brazilians Reveals Highly Differentiated Native American Genome Regions

    PubMed Central

    Havt, Alexandre; Nayak, Uma; Pinkerton, Relana; Farber, Emily; Concannon, Patrick; Lima, Aldo A.; Guerrant, Richard L.

    2017-01-01

    Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research exists into genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identification of suitable proxy populations and careful analysis of tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African; and a similar set of differentiation statistics for their Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs in five genome regions in the LSBL tests of Brazilian Amerindian ancestry compared to European and African; and the top 10 SNPs in eight regions comparing their Amerindian component to the closest Asian 1000 Genomes population. We found SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometry traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry, and by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas. PMID:28100790
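
    The locus-specific branch length statistic mentioned above is conventionally derived from the three pairwise Fst values among three populations. The sketch below uses that standard definition; the function name and the Fst values are hypothetical and are not taken from the study.

```python
def lsbl(fst_ab, fst_ac, fst_bc):
    """Locus-specific branch length for population A.

    Uses the standard three-population definition:
    LSBL_A = (Fst_AB + Fst_AC - Fst_BC) / 2.
    """
    return (fst_ab + fst_ac - fst_bc) / 2.0


# Made-up pairwise Fst values at one SNP for populations A, B and C.
fst_ab, fst_ac, fst_bc = 0.3, 0.2, 0.1
branch_a = lsbl(fst_ab, fst_ac, fst_bc)
branch_b = lsbl(fst_ab, fst_bc, fst_ac)
branch_c = lsbl(fst_ac, fst_bc, fst_ab)
```

    The three branch lengths sum pairwise to the original Fst values, so a long branch for one population flags a SNP whose allele frequency has diverged specifically in that population.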

  19. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    PubMed Central

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  20. Digital depth horizon compilations of the Alaskan North Slope and adjacent Arctic regions

    USGS Publications Warehouse

    Saltus, Richard W.; Bird, Kenneth J.

    2003-01-01

    Data have been digitized and combined to create four detailed depth horizon grids spanning the Alaskan North Slope and adjacent offshore areas. These map horizon compilations were created to aid in petroleum system modeling and related studies. Topography/bathymetry is extracted from a recent Arctic compilation of global onshore DEM and satellite altimetry and ship soundings offshore. The Lower Cretaceous Unconformity (LCU), the top of the Triassic Shublik Formation, and the pre-Carboniferous acoustic basement horizon grids are created from numerous seismic studies, drill hole information, and interpolation. These horizons were selected because they mark critical times in the geologic evolution of the region as it relates to petroleum. The various horizons clearly show the major tectonic elements of this region including the Brooks Range, Colville Trough, Barrow Arch, Hanna Trough, Chukchi Platform, Nuwuk Basin, Kaktovik Basin, and Canada Basin. The gridded data are available in a variety of data formats for use in regional studies.

  1. [Genome similarity of Baikal omul and sig].

    PubMed

    Bychenko, O S; Sukhanova, L V; Ukolova, S S; Skvortsov, T A; Potapov, V K; Azhikina, T L; Sverdlov, E D

    2009-01-01

    Two members of the Baikal sig family, a lake sig (Coregonus lavaretus baicalensis Dybovsky) and omul (C. autumnalis migratorius Georgi), are close relatives that diverged from the same ancestor 10-20 thousand years ago. In this work, we studied genomic polymorphism of these two fish species. The method of subtraction hybridization (SH) did not reveal the presence of extended sequences in the sig genome and their absence in the omul genome. All the fragments found by SH corresponded to polymorphous noncoding genome regions varying in mononucleotide substitutions and short deletions. Many of them are mapped close to genes of the immune system and have regions identical to the Tc-1-like transposons abundant among fish, whose transcription activity may affect the expression of adjacent genes. Thus, we showed for the first time that genetic differences between Baikal sig family members are extremely small and cannot be revealed by the SH method. This is another endorsement of the hypothesis on the close relationship between Baikal sig and omul and their evolutionarily recent divergence from a common ancestor.

  2. Attenuation of monkeypox virus by deletion of genomic regions

    USGS Publications Warehouse

    Lopera, Juan G.; Falendysz, Elizabeth A.; Rocke, Tonie E.; Osorio, Jorge E.

    2015-01-01

    Monkeypox virus (MPXV) is an emerging pathogen from Africa that causes disease similar to smallpox. Two clades with different geographic distributions and virulence have been described. Here, we utilized bioinformatic tools to identify genomic regions in MPXV containing multiple virulence genes and explored their roles in pathogenicity; two selected regions were then deleted singularly or in combination. In vitro and in vivo studies indicated that these regions play a significant role in MPXV replication, tissue spread, and mortality in mice. Interestingly, while deletion of either region led to decreased virulence in mice, one region had no effect on in vitro replication. Deletion of both regions simultaneously also reduced cell culture replication and significantly increased the attenuation in vivo over either single deletion. Attenuated MPXV with genomic deletions present a safe and efficacious tool in the study of MPX pathogenesis and in the identification of genetic factors associated with virulence.

  3. Attenuation of monkeypox virus by deletion of genomic regions.

    PubMed

    Lopera, Juan G; Falendysz, Elizabeth A; Rocke, Tonie E; Osorio, Jorge E

    2015-01-15

    Monkeypox virus (MPXV) is an emerging pathogen from Africa that causes disease similar to smallpox. Two clades with different geographic distributions and virulence have been described. Here, we utilized bioinformatic tools to identify genomic regions in MPXV containing multiple virulence genes and explored their roles in pathogenicity; two selected regions were then deleted singularly or in combination. In vitro and in vivo studies indicated that these regions play a significant role in MPXV replication, tissue spread, and mortality in mice. Interestingly, while deletion of either region led to decreased virulence in mice, one region had no effect on in vitro replication. Deletion of both regions simultaneously also reduced cell culture replication and significantly increased the attenuation in vivo over either single deletion. Attenuated MPXV with genomic deletions present a safe and efficacious tool in the study of MPX pathogenesis and in the identification of genetic factors associated with virulence. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Genomic regions underlying susceptibility to bovine tuberculosis in Holstein-Friesian cattle.

    PubMed

    Raphaka, Kethusegile; Matika, Oswald; Sánchez-Molano, Enrique; Mrode, Raphael; Coffey, Mike Peter; Riggio, Valentina; Glass, Elizabeth Janet; Woolliams, John Arthur; Bishop, Stephen Christopher; Banos, Georgios

    2017-03-23

    The significant social and economic loss as a result of bovine tuberculosis (bTB) presents a continuous challenge to cattle industries in the UK and worldwide. However, host genetic variation in cattle susceptibility to bTB provides an opportunity to select for resistant animals and further understand the genetic mechanisms underlying disease dynamics. The present study identified genomic regions associated with susceptibility to bTB using genome-wide association (GWA), regional heritability mapping (RHM) and chromosome association approaches. Phenotypes comprised de-regressed estimated breeding values of 804 Holstein-Friesian sires and pertained to three bTB indicator traits: i) positive reactors to the skin test with positive post-mortem examination results (phenotype 1); ii) positive reactors to the skin test regardless of post-mortem examination results (phenotype 2) and iii) as in (ii) plus non-reactors and inconclusive reactors to the skin tests with positive post-mortem examination results (phenotype 3). Genotypes based on the 50 K SNP DNA array were available and a total of 34,874 SNPs remained per animal after quality control. The estimated polygenic heritability for susceptibility to bTB was 0.26, 0.37 and 0.34 for phenotypes 1, 2 and 3, respectively. GWA analysis identified a putative SNP on Bos taurus autosomes (BTA) 2 associated with phenotype 1, and another on BTA 23 associated with phenotype 2. Genomic regions encompassing these SNPs were found to harbour potentially relevant annotated genes. RHM confirmed the effect of these genomic regions and identified new regions on BTA 18 for phenotype 1 and BTA 3 for phenotypes 2 and 3. Heritabilities of the genomic regions ranged between 0.05 and 0.08 across the three phenotypes. Chromosome association analysis indicated a major role of BTA 23 on susceptibility to bTB. Genomic regions and candidate genes identified in the present study provide an opportunity to further understand pathways critical to cattle

  5. Genome assemblies for 11 Yersinia pestis strains isolated in the Caucasus region

    DOE PAGES

    Zhgenti, Ekaterine; Johnson, Shannon L.; Davenport, Karen W.; ...

    2015-09-17

    Yersinia pestis, the causative agent of plague, is endemic to the Caucasus region but few reference strain genome sequences from that region are available. We present the improved draft or finished assembled genomes from 11 strains isolated in the nation of Georgia and surrounding countries.

  6. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in the variola virus genome were determined. The size of the left segment was 13.5 kbp and that of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The phylogenetic analysis and previously published data allowed us to establish the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variation among the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. These loci were shown to be localized either to noncoding genome regions or to regions of disrupted open reading frames characteristic of the ancestor virus. These loci are promising for developing a strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombination events were detectable in the genomes of the variola virus strains studied.

  7. Differential contribution of genomic regions to marked genetic variation and prediction of quantitative traits in broiler chickens.

    PubMed

    Abdollahi-Arpanahi, Rostam; Morota, Gota; Valente, Bruno D; Kranis, Andreas; Rosa, Guilherme J M; Gianola, Daniel

    2016-02-03

    Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. All genic and non-genic regions contributed to

  8. Genomic evaluation of regional dairy cattle breeds in single-breed and multibreed contexts.

    PubMed

    Jónás, D; Ducrocq, V; Fritz, S; Baur, A; Sanchez, M-P; Croiseau, P

    2017-02-01

    An important prerequisite for high prediction accuracy in genomic prediction is the availability of a large training population, which allows accurate marker effect estimation. This requirement is not fulfilled in case of regional breeds with a limited number of breeding animals. We assessed the efficiency of the current French routine genomic evaluation procedure in four regional breeds (Abondance, Tarentaise, French Simmental and Vosgienne) as well as the potential benefits when the training populations consisting of males and females of these breeds are merged to form a multibreed training population. Genomic evaluation was 5-11% more accurate than a pedigree-based BLUP in three of the four breeds, while the numerically smallest breed showed a < 1% increase in accuracy. Multibreed genomic evaluation was beneficial for two breeds (Abondance and French Simmental) with maximum gains of 5 and 8% in correlation coefficients between yield deviations and genomic estimated breeding values, when compared to the single-breed genomic evaluation results. Inflation of genomic evaluation of young candidates was also reduced. Our results indicate that genomic selection can be effective in regional breeds as well. Here, we provide empirical evidence proving that genetic distance between breeds is only one of the factors affecting the efficiency of multibreed genomic evaluation. © 2016 Blackwell Verlag GmbH.

  9. GEAR: genomic enrichment analysis of regional DNA copy number changes.

    PubMed

    Kim, Tae-Min; Jung, Yu-Chae; Rhyu, Mun-Gan; Jung, Myeong Ho; Chung, Yeun-Jun

    2008-02-01

    We developed an algorithm named GEAR (genomic enrichment analysis of regional DNA copy number changes) for functional interpretation of genome-wide DNA copy number changes identified by array-based comparative genomic hybridization. GEAR selects two types of chromosomal alterations with potential biological relevance, i.e. recurrent and phenotype-specific alterations. It then performs functional enrichment analysis using a priori selected functional gene sets to identify primary and clinical genomic signatures. The genomic signatures identified by GEAR represent functionally coordinated genomic changes, which can provide clues to the underlying molecular mechanisms related to the phenotypes of interest. GEAR can help identify key molecular functions that are activated or repressed in tumor genomes, leading to an improved understanding of tumor biology. The GEAR software and an online manual are available at http://www.systemsbiology.co.kr/GEAR/.

  10. Genomic regions associated with kyphosis in swine

    USDA-ARS's Scientific Manuscript database

    Background: A back curvature defect similar to kyphosis in humans has been observed in swine herds. The defect ranges from mild to severe curvature of the thoracic vertebrae in split carcasses and has an estimated heritability of 0.3. The objective of this study was to identify genomic regions that...

  11. Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

    PubMed Central

    Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

    2000-01-01

    The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409

  12. Polytene Chromosomes - A Portrait of Functional Organization of the Drosophila Genome.

    PubMed

    Zykova, Tatyana Yu; Levitsky, Victor G; Belyaeva, Elena S; Zhimulev, Igor F

    2018-04-01

    This mini-review is devoted to the problem of the genetic meaning of the main polytene chromosome structures: bands and interbands. Generally, densely packed chromatin forms black bands, moderately condensed regions form grey loose bands, whereas decondensed regions of the genome appear as interbands. Recent progress in the annotation of the Drosophila genome and epigenome has made it possible to compare the banding pattern and the structural organization of genes, as well as their activity. This was greatly aided by our ability to establish the borders of bands and interbands on the physical map, which allowed us to perform comprehensive side-by-side comparisons of cytological, genetic and epigenetic maps and to uncover the association between the morphological structures and the functional domains of the genome. These studies largely conclude that interbands correspond to the 5'-ends of housekeeping genes that are active across all cell types. Interbands are enriched with proteins involved in transcription and nucleosome remodeling, as well as with active histone modifications. Notably, most of the replication origins map to interband regions. As for grey loose bands adjacent to interbands, they typically host the bodies of housekeeping genes. Thus, the bipartite structure composed of an interband and an adjacent grey band functions as a standalone genetic unit. Finally, black bands harbor tissue-specific genes with narrow temporal and tissue expression profiles. Thus, the uniform and permanent activity of interbands combined with the inactivity of genes in bands forms the basis of the universal banding pattern observed in various Drosophila tissues.

  13. Content-based image retrieval by matching hierarchical attributed region adjacency graphs

    NASA Astrophysics Data System (ADS)

    Fischer, Benedikt; Thies, Christian J.; Guld, Mark O.; Lehmann, Thomas M.

    2004-05-01

    Content-based image retrieval requires a formal description of visual information. In medical applications, all relevant biological objects have to be represented by this description. Although color as the primary feature has proven successful in publicly available general-purpose retrieval systems, this description is not applicable to most medical images. Additionally, it has been shown that global features characterizing the whole image do not lead to acceptable results in the medical context or that they are only suitable for specific applications. For a general-purpose content-based comparison of medical images, local, i.e. regional, features that are collected on multiple scales must be used. A hierarchical attributed region adjacency graph (HARAG) provides such a representation and transfers image comparison to graph matching. However, building a HARAG from an image requires a restriction in size to be computationally feasible while at the same time all visually plausible information must be preserved. For this purpose, mechanisms for the reduction of the graph size are presented. Even with a reduced graph, the problem of graph matching remains NP-complete. In this paper, the Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison. Based on synthetic image material built from simple geometric objects, all visually similar regions were matched accordingly, showing the framework's general applicability to content-based image retrieval of medical images.

  14. Genome-wide DNA methylation measurements in prostate tissues uncovers novel prostate cancer diagnostic biomarkers and transcription factor binding patterns.

    PubMed

    Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M

    2017-04-17

    Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.

  15. QTLs Regulating the Contents of Antioxidants, Phenolics, and Flavonoids in Soybean Seeds Share a Common Genomic Region.

    PubMed

    Li, Man-Wah; Muñoz, Nacira B; Wong, Chi-Fai; Wong, Fuk-Ling; Wong, Kwong-Sen; Wong, Johanna Wing-Hang; Qi, Xinpeng; Li, Kwan-Pok; Ng, Ming-Sin; Lam, Hon-Ming

    2016-01-01

    Soybean seeds are a rich source of phenolic compounds, especially isoflavonoids, which are important nutraceuticals. Our study using 14 wild- and 16 cultivated-soybean accessions shows that seeds from cultivated soybeans generally contain lower total antioxidants compared to their wild counterparts, likely an unintended consequence of domestication or human selection. Using a recombinant inbred population resulting from a wild and a cultivated soybean parent and a bin map approach, we have identified an overlapping genomic region containing major quantitative trait loci (QTLs) that regulate the seed contents of total antioxidants, phenolics, and flavonoids. The QTL for seed antioxidant content contains 14 annotated genes based on the Williams 82 reference genome (Gmax1.01). None of these genes encodes functions that are related to the phenylpropanoid pathway of soybean. However, we found three putative Multidrug And Toxic Compound Extrusion (MATE) transporter genes within this QTL and one adjacent to it (GmMATE1-4). Moreover, we have identified non-synonymous changes between GmMATE1 and GmMATE2, and that GmMATE3 encodes an antisense transcript that expresses in pods. Whether the polymorphisms in GmMATE proteins are major determinants of the antioxidant contents, or whether the antisense transcripts of GmMATE3 play important regulatory roles, awaits further functional investigations.

  16. Quantifying 10 years of Improvements in Earthquake and Tsunami Monitoring in the Caribbean and Adjacent Regions

    NASA Astrophysics Data System (ADS)

    von Hillebrandt-Andrade, C.; Huerfano Moreno, V. A.; McNamara, D. E.; Saurel, J. M.

    2014-12-01

    The magnitude-9.3 Sumatra-Andaman Islands earthquake of December 26, 2004, increased global awareness of the destructive hazard of earthquakes and tsunamis. Post-event assessments of global coastline vulnerability highlighted the Caribbean as a region of high hazard and risk that was poorly monitored. Nearly 100 tsunamis have been reported for the Caribbean and adjacent regions in the past 500 years, and tsunamis continue to pose a threat to its nations, coastal areas along the Gulf of Mexico, and the Atlantic seaboard of North and South America. Significant efforts to improve monitoring capabilities have been undertaken since this time, including an expansion of the United States Geological Survey (USGS) Global Seismographic Network (GSN) (McNamara et al., 2006) and establishment of the United Nations Educational, Scientific and Cultural Organization (UNESCO) Intergovernmental Coordination Group (ICG) for the Tsunami and other Coastal Hazards Warning System for the Caribbean and Adjacent Regions (CARIBE EWS). The minimum performance standards it recommended for initial earthquake locations include: 1) Earthquake detection within 1 minute, 2) Minimum magnitude threshold = M4.5, and 3) Initial hypocenter error of <30 km. In this study, we assess current compliance with performance standards and model improvements in earthquake and tsunami monitoring capabilities in the Caribbean region since the first meeting of the UNESCO ICG-Caribe EWS in 2006. The three measures of network capability modeled in this study are: 1) minimum Mw detection threshold; 2) P-wave detection time of an automatic processing system; and 3) theoretical earthquake location uncertainty. By modeling three measures of seismic network capability, we can optimize the distribution of ICG-Caribe EWS seismic stations and select an international network that will be contributed from existing real-time broadband national networks in the region. Sea level monitoring improvements both offshore and

  17. Regulation of Sex Determination in Mice by a Non-coding Genomic Region

    PubMed Central

    Arboleda, Valerie A.; Fleming, Alice; Barseghyan, Hayk; Délot, Emmanuèle; Sinsheimer, Janet S.; Vilain, Eric

    2014-01-01

    To identify novel genomic regions that regulate sex determination, we utilized the powerful C57BL/6J-YPOS (B6-YPOS) model of XY sex reversal where mice with autosomes from the B6 strain and a Y chromosome from a wild-derived strain, Mus domesticus poschiavinus (YPOS), show complete sex reversal. In B6-YPOS, the presence of a 55-Mb congenic region on chromosome 11 protects from sex reversal in a dose-dependent manner. Using mouse genetic backcross designs and high-density SNP arrays, we narrowed the congenic region to a 1.62-Mb genomic region on chromosome 11 that confers 80% protection from B6-YPOS sex reversal when one copy is present and complete protection when two copies are present. It was previously believed that the protective congenic region originated from the 129S1/SviMJ (129) strain. However, genomic analysis revealed that this region is not derived from 129 and most likely is derived from the semi-inbred strain POSA. We show that the small 1.62-Mb congenic region that protects against B6-YPOS sex reversal is located within the Sox9 promoter and promotes the expression of Sox9, thereby driving testis development within the B6-YPOS background. Through 30 years of backcrossing, this congenic region was maintained, as it promoted male sex determination and fertility despite the female-promoting B6-YPOS genetic background. Our findings demonstrate that long-range enhancer regions are critical to developmental processes and can be used to identify the complex interplay between genome variants, epigenetics, and developmental gene regulation. PMID:24793290

  18. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome.

    PubMed

    Greally, John M

    2002-01-08

    To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival.

  19. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome

    PubMed Central

    Greally, John M.

    2002-01-01

    To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival. PMID:11756672

  20. High-Throughput resequencing of maize landraces at genomic regions associated with flowering time

    USDA-ARS's Scientific Manuscript database

    Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...

  1. Interspecific and intraspecific gene variability in a 1-Mb region containing the highest density of NBS-LRR genes found in the melon genome.

    PubMed

    González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere

    2014-12-17

    Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fusion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon. Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the

  2. SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand.

    PubMed

    Tang, Haibao; Bomhoff, Matthew D; Briones, Evan; Zhang, Liangsheng; Schnable, James C; Lyons, Eric

    2015-11-11

    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, even when no such gene is present. This capability means that synteny-based methods are far more effective than sequence similarity-based methods in identifying true-negatives, a necessity for studying gene loss and gene transposition. However, the identification of syntenic regions requires complex analyses which must be repeated for pairwise comparisons between any two species. Therefore, as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of target genomes. SynFind is capable of reporting per-gene information, useful for researchers studying specific gene families, as well as genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interest between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Regional Jurassic geologic framework of Alabama coastal waters area and adjacent Federal waters area

    USGS Publications Warehouse

    Mink, R.M.; Bearden, B.L.; Mancini, E.A.

    1989-01-01

    To date, numerous Jurassic hydrocarbon fields and pools have been discovered in the Cotton Valley Group, Haynesville Formation, Smackover Formation and Norphlet Formation in the tri-state area of Mississippi, Alabama and Florida, and in Alabama State coastal waters and adjacent Federal waters area. Petroleum traps are basement highs, salt anticlines, faulted salt anticlines and extensional faults associated with salt movement. Reservoirs include continental and marine sandstones, limestones and dolostones. Hydrocarbon types are oil, condensate and natural gas. The onshore stratigraphic and structural information can be used to establish a regional geologic framework for the Jurassic for the State coastal waters and adjacent Federal waters areas. Evaluation of the geologic information along with the hydrocarbon data from the tri-state area indicates that at least three Jurassic hydrocarbon trends (oil, oil and gas condensate, and deep natural gas) can be identified onshore. These onshore hydrocarbon trends can be projected into the Mobile area in the Central Gulf of Mexico and into the Pensacola, Destin Dome and Apalachicola areas in the Eastern Gulf of Mexico. Substantial reserves of natural gas are expected to be present in Alabama State waters and the northern portion of the Mobile area. Significant accumulations of oil and gas condensate may be encountered in the Pensacola, Destin Dome, and Apalachicola areas. © 1989.

  4. Gross rearrangements within the 5'-untranslated region of the picornaviral genomes.

    PubMed

    Pilipenko, E V; Blinov, V M; Agol, V I

    1990-06-11

    An analysis of reported nucleotide sequences revealed several cases of gross rearrangements in the 5'-untranslated region (5-UTR) of picornaviral genomes. A large (greater than 100 nt) duplication was discovered in a downstream region of poliovirus 5-UTR involved in the translational control. Properties of the poliovirus mutants with large deletions [Kuge and Nomoto (1987) J. Virol. 61, 1478-1487] show that a single copy of the appropriate repeating unit is compatible with a wild type phenotype of the virus. In contrast to poliovirus and other enterovirus genomes, human rhinovirus RNAs contain only a single copy of this repeating unit. Another similarly large repeat was found in an upstream segment of the bovine enterovirus 5-UTR. A comparison of the primary and secondary structures of cardio- and aphthovirus 5-UTRs demonstrated the existence of a large (ca. 250 nucleotides) insertion/deletion in a region preceding the poly(C) tract. The two latter rearrangements appear to involve elements of the viral genome replication machinery. Possible origin as well as evolutionary and functional implications of these structural peculiarities are discussed.

  5. Whole-genome resequencing of 292 pigeonpea accessions identifies genomic regions associated with domestication and agronomic traits.

    PubMed

    Varshney, Rajeev K; Saxena, Rachit K; Upadhyaya, Hari D; Khan, Aamir W; Yu, Yue; Kim, Changhoon; Rathore, Abhishek; Kim, Dongseon; Kim, Jihun; An, Shaun; Kumar, Vinay; Anuradha, Ghanta; Yamini, Kalinati Narasimhan; Zhang, Wei; Muniswamy, Sonnappa; Kim, Jong-So; Penmetsa, R Varma; von Wettberg, Eric; Datta, Swapan K

    2017-07-01

    Pigeonpea (Cajanus cajan), a tropical grain legume with low input requirements, is expected to continue to have an important role in supplying food and nutritional security in developing countries in Asia, Africa and the tropical Americas. From whole-genome resequencing of 292 Cajanus accessions encompassing breeding lines, landraces and wild species, we characterize genome-wide variation. On the basis of a scan for selective sweeps, we find several genomic regions that were likely targets of domestication and breeding. Using genome-wide association analysis, we identify associations between several candidate genes and agronomically important traits. Candidate genes for these traits in pigeonpea have sequence similarity to genes functionally characterized in other plants for flowering time control, seed development and pod dehiscence. Our findings will allow acceleration of genetic gains for key traits to improve yield and sustainability in pigeonpea.

  6. Interstitial telomere-like repeats in the Arabidopsis thaliana genome.

    PubMed

    Uchida, Wakana; Matsunaga, Sachihiro; Sugiyama, Ryuji; Kawano, Shigeyuki

    2002-02-01

    Eukaryotic chromosomal ends are protected by telomeres, which are thought to play an important role in ensuring the complete replication of chromosomes. On the other hand, non-functional telomere-like repeats in the interchromosomal regions (interstitial telomeric repeats; ITRs) have been reported in several eukaryotes. In this study, we identified eight ITRs in the Arabidopsis thaliana genome, each consisting of complete and degenerate 300- to 1200-bp sequences. The ITRs were grouped into three classes (class IA-B, class II, and class IIIA-E) based on the degeneracy of the telomeric repeats in ITRs. The telomeric repeats of the two ITRs in class I were conserved for the most part, whereas the single ITR in class II, and the five ITRs in class III were relatively degenerated. In addition, degenerate ITRs were surrounded by common sequences that shared 70-100% homology to each other; these are named ITR-adjacent sequences (IAS). Although the genomic regions around ITRs in class I lacked IAS, those around ITRs in class II contained IAS (IASa), and those around five ITRs in class III had nine types of IAS (IASb, c, d, e, f, g, h, i, and j). Ten IAS types in classes II and III showed no significant homology to each other. The chromosomal locations of ITRs and IAS were not category-related, but most of them were adjacent to, or part of, a centromere. These results show that the A. thaliana genome has undergone chromosomal rearrangements, such as end-fusions and segmental duplications.

  7. The complete genome of a new marine Thaumarchaea strain contains evidence of previous virus infection and a possible defense mechanism from infection

    NASA Astrophysics Data System (ADS)

    Ahlgren, N.; Parada, A. E.; Fuhrman, J. A.

    2016-02-01

    While marine viruses have been isolated from several marine bacterial phyla, no reported viruses have been isolated from mesophilic marine archaea. There is growing evidence for viruses that infect marine Thaumarchaea, an abundant phylum of mesophilic archaea that are important in C and N cycles in the ocean. We have recently sequenced the complete genome of a new Thaumarchaeota strain, SPOT01, that contains evidence of viral infection. Two independent virus-finding programs, VirSorter and phiSpy, indicate the genome contains a 20 kb region that is likely viral in origin. Manual inspection of this region, including comparison to known viral proteins, also supports that this region contains viral genes. It is unclear if this region is a viable prophage or the remnants of a previous lytic infection. Next to this region are genes for a newly recognized form of DNA modification, phosphorothioation (PT), and an adjacent operon that likely encodes a restriction endonuclease (RE). PT genes are found in a variety of bacteria and archaea, but this is the first example of PT genes in a marine archaeon. PT and adjacent RE genes in Salmonella enterica have been shown to function as a restriction modification system: non-PT-modified DNA is degraded by the PT system RE such that the host is protected from invasion of foreign DNA. The discovery of both PT and adjacent RE genes in SPOT01 is novel among marine microbes, and we hypothesize that they act to restrict infection by degrading non-PT-modified viral DNA. Recruitment of metagenomes from a near-shore site off California indicates that the putative virus and PT regions are found in roughly 25% and 2% respectively of Thaumarchaea in the field. Results from PacBio sequencing will be presented on which genomic sites are PT modified. This new genome provides compelling evidence that marine Thaumarchaea are susceptible to viral infection and possess a potential new mechanism for defense from infection.

  8. Contribution of the upper river, the estuarine region, and the adjacent sea to the heavy metal pollution in the Yangtze Estuary.

    PubMed

    Yin, Su; Wu, Yuehan; Xu, Wei; Li, Yangyang; Shen, Zhenyao; Feng, Chenghong

    2016-07-01

    To determine whether the discharge control of heavy metals in the Yangtze River basin can significantly change the pollution level in the estuary, this study analyzed the sources (upper river, the estuarine region, and the adjacent sea) of ten heavy metals (As, Cd, Co, Cr, Cu, Hg, Ni, Pb, Sb, and Zn) in dissolved and particulate phases in the surface water of the estuary during wet, normal, and dry seasons. Metal sources inferred from section fluxes agree with those in statistical analysis methods. Heavy metal pollution in the surface water of Yangtze Estuary primarily depends on the sediment suspension and the wastewater discharge from estuary cities. Upper river only constitutes the main source of dissolved heavy metals during the wet season, while the estuarine region and the adjacent sea (especially the former) dominate the dissolved metal pollution in the normal and dry seasons. Particulate metals are mainly derived from sediment suspension in the estuary and the adjacent sea, and the contribution of the upper river can be neglected. Compared with the hydrologic seasons, flood-ebb tides exert a more obvious effect on the water flow directions in the estuary. Sediment suspension, not the upper river, significantly affects the suspended particulate matter concentration in the estuary. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae)—a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations
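
    The p-distance quoted in the abstract above (0.00145 for the two Pyrus plastid genomes) is the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows. The function name `p_distance`, the toy sequences, and the choice to skip sites containing a gap or an ambiguous base are assumptions made for illustration; the abstract does not say how the authors treated such sites.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared aligned sites at which two sequences differ.

    Assumption: sites where either sequence has a gap ('-') or an
    ambiguous base ('N') are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # site not comparable under the assumption above
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differing / compared


if __name__ == "__main__":
    # Toy alignment: one substitution in ten comparable sites.
    print(p_distance("ACGTACGTAC", "ACGTACGAAC"))
```

    Applied to a whole-genome alignment, the same ratio would be computed over every comparable column; indels and inversions, which the abstract counts separately as potentially informative characters, are not captured by this measure.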

  10. Exploring objective climate classification for the Himalayan arc and adjacent regions using gridded data sources

    NASA Astrophysics Data System (ADS)

    Forsythe, N.; Blenkinsop, S.; Fowler, H. J.

    2015-05-01

    A three-step climate classification was applied to a spatial domain covering the Himalayan arc and adjacent plains regions using input data from four global meteorological reanalyses. Input variables were selected based on an understanding of the climatic drivers of regional water resource variability and crop yields. Principal component analysis (PCA) of those variables and k-means clustering on the PCA outputs revealed a reanalysis ensemble consensus for eight macro-climate zones. Spatial statistics of input variables for each zone revealed consistent, distinct climatologies. This climate classification approach has potential for enhancing assessment of climatic influences on water resources and food security as well as for characterising the skill and bias of gridded data sets, both meteorological reanalyses and climate models, for reproducing subregional climatologies. Through their spatial descriptors (area, geographic centroid, elevation mean range), climate classifications also provide metrics, beyond simple changes in individual variables, with which to assess the magnitude of projected climate change. Such sophisticated metrics are of particular interest for regions, including mountainous areas, where natural and anthropogenic systems are expected to be sensitive to incremental climate shifts.

  11. MHC class I-associated peptides derive from selective regions of the human genome.

    PubMed

    Pearson, Hillary; Daouda, Tariq; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Mader, Sylvie; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude

    2016-12-01

    MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.

  12. MHC class I–associated peptides derive from selective regions of the human genome

    PubMed Central

    Pearson, Hillary; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Thibault, Pierre

    2016-01-01

    MHC class I–associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology. PMID:27841757

  13. Comparative analysis of the 5' genomic and promoter regions between the mouse (Hdh) and human Huntington disease (HD) gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kalchman, M.; Lin, B.; Nasir, J.

    1994-09-01

    The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5' region of the two genes encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118; +1 is the ATG start codon) shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5' region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb compared with 10, 15, 7 and 0.5 kb, respectively, in HD. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targeting vectors for studying the normal function of the HD and Hdh genes in model systems.

  14. Genome-wide variation within and between wild and domestic yak.

    PubMed

    Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang

    2014-07-01

    The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remains unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.

  15. Piggy: a rapid, large-scale pan-genome analysis tool for intergenic regions in bacteria.

    PubMed

    Thorpe, Harry A; Bayliss, Sion C; Sheppard, Samuel K; Feil, Edward J

    2018-04-01

    The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).
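
    The intergenic regions that the abstract above is concerned with are the stretches of sequence lying between consecutive annotated genes. The sketch below shows one simple way to derive them from gene coordinates. It is illustrative only and is not Piggy's implementation: the function name `intergenic_regions`, the tuple layout, the 1-based inclusive coordinates, and the `min_length` parameter are all assumptions.

```python
from typing import List, Tuple

# Hypothetical record layout: (gene name, start, end), 1-based inclusive,
# all on the same contig.
Gene = Tuple[str, int, int]


def intergenic_regions(genes: List[Gene],
                       min_length: int = 1) -> List[Tuple[str, str, int, int]]:
    """Return (left_gene, right_gene, start, end) for each gap between genes.

    Overlapping or nested genes produce no gap. The left flanking gene is
    always the one whose end reaches furthest along the contig so far.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    regions: List[Tuple[str, str, int, int]] = []
    if not ordered:
        return regions

    prev_name, _, prev_end = ordered[0]
    for name, start, end in ordered[1:]:
        gap_start, gap_end = prev_end + 1, start - 1
        if gap_end - gap_start + 1 >= min_length:
            regions.append((prev_name, name, gap_start, gap_end))
        if end > prev_end:
            prev_name, prev_end = name, end
    return regions


if __name__ == "__main__":
    toy_genes = [("geneA", 1, 100), ("geneB", 151, 300),
                 ("geneC", 290, 400), ("geneD", 450, 500)]
    for region in intergenic_regions(toy_genes):
        print(region)
```

    Keeping the names of the flanking genes with each region is what would allow regions from different genomes to be matched up and compared, for example to flag divergent sequence upstream of the same gene; strand information, which a real pipeline would need to decide which side is upstream, is left out here for brevity.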

  16. Genome-wide methylation analysis identified sexually dimorphic methylated regions in hybrid tilapia

    PubMed Central

    Wan, Zi Yi; Xia, Jun Hong; Lin, Grace; Wang, Le; Lin, Valerie C. L.; Yue, Gen Hua

    2016-01-01

    Sexual dimorphism is an interesting biological phenomenon. Previous studies showed that DNA methylation might play a role in sexual dimorphism. However, the overall picture of the genome-wide methylation landscape in sexually dimorphic species remains unclear. We analyzed the DNA methylation landscape and transcriptome in hybrid tilapia (Oreochromis spp.) using whole genome bisulfite sequencing (WGBS) and RNA-sequencing (RNA-seq). We found 4,757 sexually dimorphic differentially methylated regions (DMRs), with significant clusters of DMRs located on chromosomal regions associated with sex determination. CpG methylation in promoter regions was negatively correlated with the gene expression level. MAPK/ERK pathway was upregulated in male tilapia. We also inferred active cis-regulatory regions (ACRs) in skeletal muscle tissues from WGBS datasets, revealing sexually dimorphic cis-regulatory regions. These results suggest that DNA methylation contributes to sex-specific phenotypes and serve as resources for further investigation to analyze the functions of these regions and their contributions towards sexual dimorphisms. PMID:27782217

  17. Genomic region operation kit for flexible processing of deep sequencing data.

    PubMed

    Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

    2013-01-01

    Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
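
    To make the idea of set algebra on genomic regions concrete, the sketch below implements union and intersection over lists of intervals in plain Python. It does not use or reproduce the GROK API: the names `Region`, `union` and `intersection`, the tuple layout, and the half-open coordinate convention are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical region type: (chromosome, start, end), half-open [start, end).
Region = Tuple[str, int, int]


def union(regions: List[Region]) -> List[Region]:
    """Merge overlapping or touching regions into a sorted, disjoint list."""
    merged: List[Region] = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last_chrom, last_start, last_end = merged[-1]
            merged[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersection(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the parts of the genome covered by both region sets.

    Quadratic in the number of regions, which keeps the sketch short;
    a sweep over sorted regions or an interval tree would scale better.
    """
    result: List[Region] = []
    for chrom_a, start_a, end_a in union(a):
        for chrom_b, start_b, end_b in union(b):
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                result.append((chrom_a, start, end))
    return sorted(result)


if __name__ == "__main__":
    sample_a = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    sample_b = [("chr1", 250, 400), ("chr2", 10, 60)]
    print(union(sample_a))
    print(intersection(sample_a, sample_b))
```

    A sample comparison of the kind the abstract mentions could then be phrased as an intersection of two region sets, for example regions bound by both of two transcription factors.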

  18. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    PubMed

    Ravinet, Mark; Yoshida, Kohta; Shigenobu, Shuji; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2018-05-01

    Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  19. Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation.

    PubMed

    Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

    2015-01-01

    Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes.

  20. Identification of genomic regions contributing to etoposide-induced cytotoxicity

    PubMed Central

    Bleibel, Wasim K.; Duan, Shiwei; Huang, R. Stephanie; Kistner, Emily O.; Shukla, Sunita J.; Wu, Xiaolin; Badner, Judith A.

    2009-01-01

    Etoposide is routinely used in combination-based chemotherapy for testicular cancer and small-cell lung cancer; however, myelosuppression, therapy-related leukemia and neurotoxicity limit its utility. To determine the genetic contribution to cellular sensitivity to etoposide, we evaluated cell growth inhibition in Centre d'Etude du Polymorphisme Humain lymphoblastoid cell lines from 24 multi-generational pedigrees (321 samples) following treatment with 0.02–2.5 µM etoposide for 72 h. Heritability analysis showed that genetic variation contributes significantly to the cytotoxic phenotypes (h² = 0.17–0.25, P = 4.9 × 10⁻⁵ to 7.3 × 10⁻³). Whole genome linkage scans uncovered 8 regions with peak LOD scores ranging from 1.57 to 2.55, with the most significant signals being found on chromosome 5 (LOD = 2.55) and chromosome 6 (LOD = 2.52). Linkage-directed association was performed on a subset of HapMap samples within the pedigrees to find 22 SNPs significantly associated with etoposide cytotoxicity at one or more treatment concentrations. UVRAG, a DNA repair gene, SEMA5A, SLC7A6 and PRMT7 are implicated from these unbiased studies. Our findings suggest that susceptibility to etoposide-induced cytotoxicity is heritable and using an integrated genomics approach we identified both genomic regions and SNPs associated with the cytotoxic phenotypes. PMID:19089452

  1. Identification of genomic regions contributing to etoposide-induced cytotoxicity.

    PubMed

    Bleibel, Wasim K; Duan, Shiwei; Huang, R Stephanie; Kistner, Emily O; Shukla, Sunita J; Wu, Xiaolin; Badner, Judith A; Dolan, M Eileen

    2009-03-01

    Etoposide is routinely used in combination-based chemotherapy for testicular cancer and small-cell lung cancer; however, myelosuppression, therapy-related leukemia and neurotoxicity limit its utility. To determine the genetic contribution to cellular sensitivity to etoposide, we evaluated cell growth inhibition in Centre d'Etude du Polymorphisme Humain lymphoblastoid cell lines from 24 multi-generational pedigrees (321 samples) following treatment with 0.02-2.5 µM etoposide for 72 h. Heritability analysis showed that genetic variation contributes significantly to the cytotoxic phenotypes (h² = 0.17-0.25, P = 4.9 × 10⁻⁵ to 7.3 × 10⁻³). Whole genome linkage scans uncovered 8 regions with peak LOD scores ranging from 1.57 to 2.55, with the most significant signals being found on chromosome 5 (LOD = 2.55) and chromosome 6 (LOD = 2.52). Linkage-directed association was performed on a subset of HapMap samples within the pedigrees to find 22 SNPs significantly associated with etoposide cytotoxicity at one or more treatment concentrations. UVRAG, a DNA repair gene, SEMA5A, SLC7A6 and PRMT7 are implicated from these unbiased studies. Our findings suggest that susceptibility to etoposide-induced cytotoxicity is heritable and using an integrated genomics approach we identified both genomic regions and SNPs associated with the cytotoxic phenotypes.

  2. Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region

    PubMed Central

    Jiang, Zhi J; Castoe, Todd A; Austin, Christopher C; Burbrink, Frank T; Herron, Matthew D; McGuire, Jimmy A; Parkinson, Christopher L; Pollock, David D

    2007-01-01

    Background: The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. Results: We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. Conclusion: Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. The among-lineage and

  3. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma

    PubMed Central

    Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, Tin-Lap; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang

    2016-01-01

    Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150

  4. Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.

    PubMed

    Pehkonen, Petri; Wong, Garry; Törönen, Petri

    2010-01-01

    Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.
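
    The abstract above names the Bayesian information criterion as the usual way to decide the number of segments. A minimal sketch of that baseline follows; it is not the authors' Bayesian model, and it assumes Gaussian data with piecewise-constant means. The function names, the exact dynamic program used in place of a heuristic, and the parameter count of 2k are all illustrative choices.

```python
import math

def best_splits(values, max_segments):
    """Return {k: (rss, breakpoints)} for the least-squares split into k segments."""
    n = len(values)
    # Prefix sums give the residual sum of squares of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def rss(i, j):
        total = s1[j] - s1[i]
        return max(s2[j] - s2[i] - total * total / (j - i), 0.0)

    inf = float("inf")
    # cost[k][j]: minimal RSS when values[:j] is split into k segments.
    cost = [[inf] * (n + 1) for _ in range(max_segments + 1)]
    back = [[0] * (n + 1) for _ in range(max_segments + 1)]
    cost[0][0] = 0.0
    for k in range(1, max_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + rss(i, j)
                if c < cost[k][j]:
                    cost[k][j], back[k][j] = c, i

    result = {}
    for k in range(1, max_segments + 1):
        cuts, j = [], n
        for kk in range(k, 1, -1):  # walk back through the k - 1 breakpoints
            j = back[kk][j]
            cuts.append(j)
        result[k] = (cost[k][n], sorted(cuts))
    return result

def bic(rss_value, n, k):
    """Gaussian BIC; assumed parameter count: k means, k - 1 breakpoints, one variance."""
    params = 2 * k
    return n * math.log(max(rss_value, 1e-12) / n) + params * math.log(n)

def choose_segments(values, max_segments):
    """Pick the segment count with the lowest BIC and return (k, breakpoints)."""
    n = len(values)
    splits = best_splits(values, max_segments)
    best_k = min(splits, key=lambda k: bic(splits[k][0], n, k))
    return best_k, splits[best_k][1]
```

    The exact search is cubic in the sequence length for each segment count, which is the cost that heuristic segmentation is meant to avoid.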

  5. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  6. Cancer Genomic Resources and Present Needs in the Latin American Region.

    PubMed

    Torres, Ángela; Oliver, Javier; Frecha, Cecilia; Montealegre, Ana Lorena; Quezada-Urbán, Rosalía; Díaz-Velásquez, Clara Estela; Vaca-Paniagua, Felipe; Perdomo, Sandra

    2017-01-01

    In Latin America (LA), cancer is the second leading cause of death, and little is known about the capacities and needs for the development of research in the field of cancer genomics. In order to evaluate the current capacity for and development of cancer genomics in LA, we collected the available information on genomics, including the number of next-generation sequencing (NGS) platforms, the number of cancer research institutions and research groups, publications in the last 10 years, educational programs, and related national cancer control policies. Currently, there are 221 NGS platforms and 118 research groups in LA developing cancer genomics projects. A total of 272 articles in the field of cancer genetics/genomics were published by authors affiliated with Latin American institutions. Educational programs in genomics are scarce and almost exclusively at the graduate level, and only a few concern cancer. Only 14 countries have national cancer control plans, but all of them consider secondary prevention strategies for early diagnosis, opportune treatment, and decreasing mortality, where genomic analyses could be implemented. Despite recent advances in introducing knowledge about cancer genomics and its application to LA, the region lacks development of integrated genomic research projects, improved use of NGS platforms, implementation of associated educational programs, and health policies that could have an impact on cancer care. © 2017 S. Karger AG, Basel.

  7. Genome image programs: visualization and interpretation of Escherichia coli microarray experiments.

    PubMed

    Zimmer, Daniel P; Paliy, Oleg; Thomas, Brian; Gyaneshwar, Prasad; Kustu, Sydney

    2004-08-01

    We have developed programs to facilitate analysis of microarray data in Escherichia coli. They fall into two categories: manipulation of microarray images and identification of known biological relationships among lists of genes. A program in the first category arranges spots from glass-slide DNA microarrays according to their position in the E. coli genome and displays them compactly in genome order. The resulting genome image is presented in a web browser with an image map that allows the user to identify genes in the reordered image. Another program in the first category aligns genome images from two or more experiments. These images assist in visualizing regions of the genome with common transcriptional control. Such regions include multigene operons and clusters of operons, which are easily identified as strings of adjacent, similarly colored spots. The images are also useful for assessing the overall quality of experiments. The second category of programs includes a database and a number of tools for displaying biological information about many E. coli genes simultaneously rather than one gene at a time, which facilitates identifying relationships among them. These programs have accelerated and enhanced our interpretation of results from E. coli DNA microarray experiments. Examples are given. Copyright 2004 Genetics Society of America

  8. Full-genome sequences of hepatitis B virus subgenotype D3 isolates from the Brazilian Amazon Region.

    PubMed

    Spitz, Natália; Mello, Francisco C A; Araujo, Natalia Motta

    2015-02-01

    The Brazilian Amazon Region is a highly endemic area for hepatitis B virus (HBV). However, little is known regarding the genetic variability of the strains circulating in this geographical region. Here, we describe the first full-length genomes of HBV isolated in the Brazilian Amazon Region; these genomes are also the first complete HBV subgenotype D3 genomes reported for Brazil. The genomes of the five Brazilian isolates were all 3,182 base pairs in length and the isolates were classified as belonging to subgenotype D3, subtypes ayw2 (n = 3) and ayw3 (n = 2). Phylogenetic analysis suggested that the Brazilian sequences are not likely to be closely related to European D3 sequences. Such results will contribute to further epidemiological and evolutionary studies of HBV.

  9. Analysis of Genomic Regions of Trichoderma harzianum IOC-3844 Related to Biomass Degradation

    PubMed Central

    Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

    2015-01-01

    Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes. PMID:25836973

  10. Comparative Genomics of Campylobacter iguaniorum to Unravel Genetic Regions Associated with Reptilian Hosts

    PubMed Central

    Gilbert, Maarten J.; Miller, William G.; Yee, Emma; Kik, Marja; Zomer, Aldert L.; Wagenaar, Jaap A.; Duim, Birgitta

    2016-01-01

    Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum is lacking an S-layer encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is > 50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between both taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. PMID:27604878

  11. Comparison of vesicular-arbuscular mycorrhizae in plants from disturbed and adjacent undisturbed regions of a coastal salt marsh in Clinton, Connecticut, USA

    NASA Astrophysics Data System (ADS)

    Cooke, John C.; Lefor, Michael W.

    1990-01-01

    Roots of salt marsh plant species Spartina alterniflora, S. patens, Distichlis spicata, and others were examined for the presence of vesicular-arbuscular mycorrhizal (VAM) fungi. Samples were taken from introduced planted material in a salt marsh restoration project and from native material in adjacent marsh areas along the Indian River, Clinton, Connecticut, USA. After ten years the replanted area still has sites devoid of vegetation. The salt marsh plants introduced there were devoid of VAM fungi, while high marsh species from the adjacent undisturbed region showed consistent infection, leading the authors to suggest that VAM fungal infection of planting stocks may be a factor in the success of marsh restoration.

  12. ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function

    PubMed Central

    Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira

    2015-01-01

    Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve their exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate and, above all, user-friendly web tool offering many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which the users can download the source code, and as a web tool. It was developed using the hash table approach to extract perfect and imperfect repetitive regions in a (multi)FASTA file, while retaining linear time complexity. PMID:25811026
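
    To illustrate the general idea of a hash-table repeat search, the sketch below indexes every k-mer of a sequence in a dictionary and reports perfect tandem repeats of one fixed motif length. It is not ProGeRF's implementation: the function name, the fixed motif length, and the restriction to perfect repeats are assumptions made for the example.

```python
from collections import defaultdict

def find_tandem_repeats(seq, motif_len, min_copies=2):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    # Hash table: each k-mer maps to the list of positions where it starts.
    positions = defaultdict(list)
    for i in range(len(seq) - motif_len + 1):
        positions[seq[i:i + motif_len]].append(i)

    repeats = []
    for motif, starts in positions.items():
        start_set = set(starts)
        for s in starts:
            if s - motif_len in start_set:
                continue  # an earlier copy already opens this run
            copies = 1
            while s + copies * motif_len in start_set:
                copies += 1
            if copies >= min_copies:
                repeats.append((s, motif, copies))
    return sorted(repeats)
```

    Each position is visited a constant number of times after indexing, so the scan stays linear in the sequence length for a fixed motif length. Rotations of the same motif (for example AC and CA) are reported separately in this simplified version.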

  13. Identification of Genomic Regions Associated with Phenotypic Variation between Dog Breeds using Selection Mapping

    PubMed Central

    Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Åke; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.

    2011-01-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease. PMID:22022279

  14. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    PubMed

    Vaysse, Amaury; Ratnakumar, Abhirami; Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H; Hansen, Mark S T; Lawley, Cindy T; Karlsson, Elinor K; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Ake; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T

    2011-10-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.
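
    The blocks of homozygosity longer than one megabase mentioned above can be pictured with a naive scan over one individual's SNP genotypes. This is only a sketch, not the study's method: the input layout (sorted positions plus a per-SNP heterozygosity flag) and the function name are hypothetical, and real analyses tolerate genotyping errors and work with breed-level data.

```python
def homozygous_blocks(positions, is_het, min_length=1_000_000):
    """Return (start, end) spans of consecutive homozygous SNPs of at least min_length bp."""
    blocks = []
    run_start = None  # position of the first SNP in the current homozygous run
    prev = None       # position of the previously visited SNP
    for pos, het in zip(positions, is_het):
        if het:
            # A heterozygous SNP closes the run that ended at the previous SNP.
            if run_start is not None and prev - run_start >= min_length:
                blocks.append((run_start, prev))
            run_start = None
        elif run_start is None:
            run_start = pos
        prev = pos
    # Close a run that extends to the last SNP.
    if run_start is not None and prev - run_start >= min_length:
        blocks.append((run_start, prev))
    return blocks
```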

  15. Estimation of (co)variances for genomic regions of flexible sizes: application to complex infectious udder diseases in dairy cattle

    PubMed Central

    2012-01-01

    Background: Multi-trait genomic models in a Bayesian context can be used to estimate genomic (co)variances, either for a complete genome or for genomic regions (e.g. per chromosome) for the purpose of multi-trait genomic selection or to gain further insight into the genomic architecture of related traits such as mammary disease traits in dairy cattle. Methods: Data on progeny means of six traits related to mastitis resistance in dairy cattle (general mastitis resistance and five pathogen-specific mastitis resistance traits) were analyzed using a bivariate Bayesian SNP-based genomic model with a common prior distribution for the marker allele substitution effects and estimation of the hyperparameters in this prior distribution from the progeny means data. From the Markov chain Monte Carlo samples of the allele substitution effects, genomic (co)variances were calculated on a whole-genome level, per chromosome, and in regions of 100 SNP on a chromosome. Results: Genomic proportions of the total variance differed between traits. Genomic correlations were lower than pedigree-based genetic correlations and they were highest between general mastitis and pathogen-specific traits because of the part-whole relationship between these traits. The chromosome-wise genomic proportions of the total variance differed between traits, with some chromosomes explaining higher or lower values than expected in relation to chromosome size. Few chromosomes showed pleiotropic effects and only chromosome 19 had a clear effect on all traits, indicating the presence of QTL with a general effect on mastitis resistance. The region-wise patterns of genomic variances differed between traits. Peaks indicating QTL were identified but were not very distinctive because a common prior for the marker effects was used. There was a clear difference in the region-wise patterns of genomic correlation among combinations of traits, with distinctive peaks indicating the presence of pleiotropic QTL. Conclusions:

  16. DNA sequence responsible for the amplification of adjacent genes.

    PubMed

    Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

    1987-10-01

    A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

  17. A genome-wide association study of seed protein and oil content in soybean

    PubMed Central

    2014-01-01

    Background: Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. Results: A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. Conclusions: This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. The narrower GWAS-defined genome

  18. A genome-wide association study of seed protein and oil content in soybean.

    PubMed

    Hwang, Eun-Young; Song, Qijian; Jia, Gaofeng; Specht, James E; Hyten, David L; Costa, Jose; Cregan, Perry B

    2014-01-02

    Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL.
The narrower GWAS-defined genome regions will allow more precise
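
    A note on the LD statistic in the record above: the abstract reports mean r² values but does not say how they were computed. A minimal sketch, assuming phased biallelic haplotypes coded 0/1 (the function name ld_r2 and the toy data are hypothetical, not taken from the study), uses the standard definition r² = D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA * pB:

```python
def ld_r2(haplotypes):
    """Pairwise LD (r^2) between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per haplotype, where a and b
    are the alleles at locus A and locus B coded as 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy data (hypothetical): complete LD versus complete equilibrium.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
equilibrium = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(complete_ld), ld_r2(equilibrium))
```

    Averaging such pairwise values within bins of physical distance would give a decay curve of the kind summarized in the abstract (mean LD falling to 0.2 within 360 Kbp in euchromatin versus 9,600 Kbp in heterochromatin).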

  19. Centromere-Like Regions in the Budding Yeast Genome

    PubMed Central

    Lefrançois, Philippe; Auerbach, Raymond K.; Yellman, Christopher M.; Roeder, G. Shirleen; Snyder, Michael

    2013-01-01

    Accurate chromosome segregation requires centromeres (CENs), the DNA sequences where kinetochores form, to attach chromosomes to microtubules. In contrast to most eukaryotes, which have broad centromeres, Saccharomyces cerevisiae possesses sequence-defined point CENs. Chromatin immunoprecipitation followed by sequencing (ChIP–Seq) reveals colocalization of four kinetochore proteins at novel, discrete, non-centromeric regions, especially when levels of the centromeric histone H3 variant, Cse4 (a.k.a. CENP-A or CenH3), are elevated. These regions of overlapping protein binding enhance the segregation of plasmids and chromosomes and have thus been termed Centromere-Like Regions (CLRs). CLRs form in close proximity to S. cerevisiae CENs and share characteristics typical of both point and regional CENs. CLR sequences are conserved among related budding yeasts. Many genomic features characteristic of CLRs are also associated with these conserved homologous sequences from closely related budding yeasts. These studies provide general and important insights into the origin and evolution of centromeres. PMID:23349633

  20. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    PubMed Central

    2011-01-01

    Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions The construction of the first switchgrass BAC library and comparative analysis of homoeologous harboring OsBRI1

  1. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits.

    PubMed

    Saski, Christopher A; Li, Zhigang; Feltus, Frank A; Luo, Hong

    2011-07-18

    Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
The construction of the first switchgrass BAC library and comparative analysis of homoeologous harboring OsBRI1 orthologs present a glimpse into
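
    The coverage figure in the record above can be checked with simple arithmetic. The sketch below is an illustration only: the function name library_coverage is hypothetical, and whether the authors discounted the roughly 5% of clones without inserts is an assumption, so both variants are shown.

```python
def library_coverage(n_clones, insert_fraction, mean_insert_kb, genome_size_gb):
    """Fold coverage of a genome by a clone library.

    insert_fraction is the share of clones that actually carry an insert;
    genome_size_gb is the (effective) genome size in gigabases.
    """
    total_insert_kb = n_clones * insert_fraction * mean_insert_kb
    genome_size_kb = genome_size_gb * 1_000_000  # 1 Gb = 1,000,000 kb
    return total_insert_kb / genome_size_kb


# Values from the abstract: 147,456 clones, 95% with inserts,
# 120 kb average insert, ~1.6 Gb effective genome.
with_empty_discounted = library_coverage(147_456, 0.95, 120, 1.6)
all_clones_counted = library_coverage(147_456, 1.0, 120, 1.6)
print(f"{with_empty_discounted:.1f}x, {all_clones_counted:.1f}x")
```

    Either way the result is between 10x and about 11x, consistent with the stated coverage of approximately 10 times the effective genome.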

  2. Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse

    PubMed Central

    Kortschak, R. Daniel

    2018-01-01

    The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or “churning” in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against. PMID:29677183

  3. Comparative Genomics of Campylobacter iguaniorum to Unravel Genetic Regions Associated with Reptilian Hosts.

    PubMed

    Gilbert, Maarten J; Miller, William G; Yee, Emma; Kik, Marja; Zomer, Aldert L; Wagenaar, Jaap A; Duim, Birgitta

    2016-10-05

    Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum is lacking an S-layer encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is >50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between both taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  4. Evolutionary history of the ABCB2 genomic region in teleosts

    USGS Publications Warehouse

    Palti, Y.; Rodriguez, M.F.; Gahr, S.A.; Hansen, J.D.

    2007-01-01

    Gene duplication, silencing and translocation have all been implicated in shaping the unique genomic architecture of the teleost MH regions. Previously, we demonstrated that trout possess five unlinked regions encoding MH genes. One of these regions harbors ABCB2 which in all other vertebrate classes is found in the MHC class II region. In this study, we sequenced a BAC contig for the trout ABCB2 region. Analysis of this region revealed the presence of genes homologous to those located in the human class II (ABCB2, BRD2, ψDAA), extended class II (RGL2, PHF1, SYGP1) and class III (PBX2, Notch-L) regions. The organization and syntenic relationships of this region were then compared to similar regions in humans, Tetraodon and zebrafish to learn more about the evolutionary history of this region. Our analysis indicates that this region was generated during the teleost-specific duplication event while also providing insight about potential MH paralogous regions in teleosts. © 2006 Elsevier Ltd. All rights reserved.

  5. Identification of genomic regions associated with resistance to clinical mastitis in US Holstein cattle

    USDA-ARS's Scientific Manuscript database

    The objective of this research was to identify genomic regions associated with clinical mastitis (MAST) in US Holsteins using producer-reported data. Genome-wide association studies (GWAS) were performed on deregressed PTA using GEMMA v. 0.94. Genotypes included 60,671 SNP for all predictor bulls (n...

  6. Read clouds uncover variation in complex regions of the human genome

    PubMed Central

    Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E.; West, Robert; Sidow, Arend; Batzoglou, Serafim

    2015-01-01

    Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. PMID:26286554

  7. Read clouds uncover variation in complex regions of the human genome.

    PubMed

    Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E; West, Robert; Sidow, Arend; Batzoglou, Serafim

    2015-10-01

    Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. © 2015 Bishara et al.; Published by Cold Spring Harbor Laboratory Press.

  8. PIXE analysis of elements in gastric cancer and adjacent mucosa

    NASA Astrophysics Data System (ADS)

    Liu, Qixin; Zhong, Ming; Zhang, Xiaofeng; Yan, Lingnuo; Xu, Yongling; Ye, Simao

    1990-04-01

    The elemental regional distributions in 20 resected human stomach tissues were obtained using PIXE analysis. The samples were pathologically divided into four types: normal, adjacent mucosa A, adjacent mucosa B and cancer. The targets for PIXE analysis were prepared by wet digestion with a pressure bomb system. P, K, Fe, Cu, Zn and Se were measured and statistically analysed. We found significantly higher concentrations of P, K, Cu, Zn and a higher ratio of Cu compared to Zn in cancer tissue as compared with normal tissue, but statistically no significant difference between adjacent mucosa and cancer tissue was found.

  9. In silico screening of the chicken genome for overlaps between genomic regions: microRNA genes, coding and non-coding transcriptional units, QTL, and genetic variations.

    PubMed

    Zorc, Minja; Kunej, Tanja

    2016-05-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in the chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region. Identified miRNA-related genomic hotspots in chicken can serve researchers as a
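
    The record above screens for overlaps between genomic regions using existing tools and databases; it does not describe the overlap test itself. As a rough illustration only, a minimal sketch of a coordinate overlap check, assuming 1-based inclusive coordinates (the Region type, the function names, and all coordinates below are hypothetical and not from the study):

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive
    name: str


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def find_overlaps(queries: List[Region], features: List[Region]) -> List[Tuple[str, str]]:
    """Return (query name, feature name) for every overlapping pair.

    Brute-force O(n * m) comparison; adequate for a few hundred miRNA genes.
    """
    return [(q.name, f.name) for q in queries for f in features if overlaps(q, f)]


# Hypothetical coordinates, for illustration only.
mirnas = [Region("chr1", 100, 180, "mir_A"), Region("chr2", 1200, 1280, "mir_B")]
features = [
    Region("chr1", 150, 300, "exon_X"),
    Region("chr1", 1000, 5000, "qtl_Y"),
    Region("chr2", 1000, 2000, "qtl_Z"),
]
print(find_overlaps(mirnas, features))
```

    For genome-scale feature sets, a sorted sweep or an interval tree would replace the nested loop; the pairwise test stays the same.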

  10. Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology.

    PubMed

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K P; Woo, Patrick C Y

    2015-10-22

    Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10-49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n=2), Pichia (Candida) norvegensis (n=2), Candida tropicalis (n=1) and Saccharomyces cerevisiae (n=1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study.

  11. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    PubMed

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of the two treatment-naïve patients, whose tumors belong to the same molecular subtype, are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  12. Expanding probe repertoire and improving reproducibility in human genomic hybridization

    PubMed Central

    Dorman, Stephanie N.; Shirley, Ben C.; Knoll, Joan H. M.; Rogan, Peter K.

    2013-01-01

    Diagnostic DNA hybridization relies on probes composed of single copy (sc) genomic sequences. Sc sequences in probe design ensure high specificity and avoid cross-hybridization to other regions of the genome, which could lead to ambiguous results that are difficult to interpret. We examine how the distribution and composition of repetitive sequences in the genome affect sc probe performance. A divide and conquer algorithm was implemented to design sc probes. With this approach, sc probes can include divergent repetitive elements, which hybridize to unique genomic targets under higher stringency experimental conditions. Genome-wide custom probe sets were created for fluorescent in situ hybridization (FISH) and microarray genomic hybridization. The scFISH probes were developed for detection of copy number changes within small tumour suppressor genes and oncogenes. The microarrays demonstrated increased reproducibility by eliminating cross-hybridization to repetitive sequences adjacent to probe targets. The genome-wide microarrays exhibited lower median coefficients of variation (17.8%) for two HapMap family trios. The coefficients of variation of commercial probes within 300 nt of a repetitive element were 48.3% higher than those of the nearest custom probe. Furthermore, the custom microarray called a chromosome 15q11.2q13 deletion more consistently. This method for sc probe design increases probe coverage for FISH and lowers variability in genomic microarrays. PMID:23376933

  13. Pedigree-based analysis of derivation of genome segments of an elite rice reveals key regions during its breeding.

    PubMed

    Zhou, Degui; Chen, Wei; Lin, Zechuan; Chen, Haodong; Wang, Chongrong; Li, Hong; Yu, Renbo; Zhang, Fengyun; Zhen, Gang; Yi, Junliang; Li, Kanghuo; Liu, Yaoguang; Terzaghi, William; Tang, Xiaoyan; He, Hang; Zhou, Shaochuan; Deng, Xing Wang

    2016-02-01

    Analyses of genome variations with high-throughput assays have improved our understanding of the genetic basis of crop domestication and identified the selected genome regions, but little is known about that of modern breeding, which has limited the usefulness of the many elite cultivars in further breeding. Here we deploy pedigree-based analysis of an elite rice, Huanghuazhan, to exploit key genome regions during its breeding. The cultivars in the pedigree were resequenced with 7.6× depth on average, and 2.1 million high-quality single nucleotide polymorphisms (SNPs) were obtained. Tracing the derivation of genome blocks with pedigree and information on SNPs revealed the chromosomal recombination during breeding, which showed that 26.22% of the Huanghuazhan genome consists of strictly conserved key regions. These major effect regions were further supported by QTL mapping of 260 recombinant inbred lines derived from the cross of Huanghuazhan and a very dissimilar cultivar, Shuanggui 36, and by the genome profile of eight cultivars and 36 elite lines derived from Huanghuazhan. Comparing these regions with cloned genes revealed that they include a number of key genes, which were then applied to demonstrate how Huanghuazhan was bred after 30 years of effort and to dissect the deficiency of artificial selection. We conclude that these regions will be helpful for further breeding based on this pedigree and for performing breeding by design. Our study provides genetic dissection of modern rice breeding and sheds new light on how to perform genomewide breeding by design. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  14. New Regions of the Human Genome Linked to Skin Color Variation in Some African Populations

    Cancer.gov

    In the first study of its kind, an international team of genomics researchers has identified new regions of the human genome that are associated with skin color variation in some African populations, opening new avenues for research on skin diseases and cancer in all populations.

  15. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  16. A Rapid Method of Genomic Array Analysis of Scaffold/Matrix Attachment Regions (S/MARs) Identifies a 2.5-Mb Region of Enhanced Scaffold/Matrix Attachment at a Human Neocentromere

    PubMed Central

    Sumer, Huseyin; Craig, Jeffrey M.; Sibson, Mandy; Choo, K.H. Andy

    2003-01-01

    Human neocentromeres are fully functional centromeres that arise at previously noncentromeric regions of the genome. We have tested a rapid procedure of genomic array analysis of chromosome scaffold/matrix attachment regions (S/MARs), involving the isolation of S/MAR DNA and hybridization of this DNA to a genomic BAC/PAC array. Using this procedure, we have defined a 2.5-Mb domain of S/MAR-enriched chromatin that fully encompasses a previously mapped centromere protein-A (CENP-A)-associated domain at a human neocentromere. We have independently verified this procedure using a previously established fluorescence in situ hybridization method on salt-treated metaphase chromosomes. In silico sequence analysis of the S/MAR-enriched and surrounding regions has revealed no outstanding sequence-related predisposition. This study defines the S/MAR-enriched domain of a higher eukaryotic centromere and provides a method that has broad application for the mapping of S/MAR attachment sites over large genomic regions or throughout a genome. PMID:12840048

  17. Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology

    PubMed Central

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K. P.; Woo, Patrick C. Y.

    2015-01-01

    Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10–49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n = 2), Pichia (Candida) norvegensis (n = 2), Candida tropicalis (n = 1) and Saccharomyces cerevisiae (n = 1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study. PMID:26506340

  18. Genome-wide association study of body weight in Australian Merino sheep reveals an orthologous region on OAR6 to human and bovine genomic regions affecting height and weight.

    PubMed

    Al-Mamun, Hawlader A; Kwan, Paul; Clark, Samuel A; Ferdosi, Mohammad H; Tellam, Ross; Gondro, Cedric

    2015-08-14

    Body weight (BW) is an important trait for meat production in sheep. Although over the past few years, numerous quantitative trait loci (QTL) have been detected for production traits in cattle, few QTL studies have been reported for sheep, with even fewer on meat production traits. Our objective was to perform a genome-wide association study (GWAS) with the medium-density Illumina Ovine SNP50 BeadChip to identify genomic regions and corresponding haplotypes associated with BW in Australian Merino sheep. A total of 1781 Australian Merino sheep were genotyped using the medium-density Illumina Ovine SNP50 BeadChip. Among the 53 862 single nucleotide polymorphisms (SNPs) on this array, 48 640 were used to perform a GWAS using a linear mixed model approach. Genotypes were phased with hsphase; to estimate SNP haplotype effects, linkage disequilibrium blocks were identified in the detected QTL region. Thirty-nine SNPs were associated with BW at a Bonferroni-corrected genome-wide significance threshold of 1 %. One region on sheep (Ovis aries) chromosome 6 (OAR6) between 36.15 and 38.56 Mb included 13 significant SNPs that were associated with BW; the most significant SNP was OAR6_41936490.1 (P = 2.37 × 10^-16) at 37.69 Mb with an allele substitution effect of 2.12 kg, which corresponds to 0.248 phenotypic standard deviations for BW. The region that surrounds this association signal on OAR6 contains three genes: leucine aminopeptidase 3 (LAP3), which is involved in the processing of the oxytocin precursor; NCAPG non-SMC condensin I complex, subunit G (NCAPG), which is associated with foetal growth and carcass size in cattle; and ligand dependent nuclear receptor corepressor-like (LCORL), which is associated with height in humans and cattle. The GWAS analysis detected 39 SNPs associated with BW in sheep and a major QTL region was identified on OAR6. In several other mammalian species, regions that are syntenic with this region have been found to be associated with body
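
    The figures quoted in this record can be checked with a few lines of arithmetic. The sketch below assumes that the Bonferroni correction divides the 1 % level by the 48 640 SNPs that entered the GWAS (one test per SNP); the record does not state the divisor explicitly, so treat that as an assumption. The function and variable names are illustrative only.

```python
# Worked arithmetic for the thresholds quoted in the abstract.
# Assumption: one test per SNP, so the Bonferroni divisor is 48,640.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

n_snps_tested = 48_640        # SNPs used in the GWAS (from the abstract)
alpha = 0.01                  # 1 % genome-wide level (from the abstract)
threshold = bonferroni_threshold(alpha, n_snps_tested)

top_snp_p = 2.37e-16          # P value reported for OAR6_41936490.1
effect_kg = 2.12              # reported allele substitution effect
effect_in_sd = 0.248          # same effect in phenotypic SD units
implied_sd_kg = effect_kg / effect_in_sd  # phenotypic SD implied by the two figures

print(f"per-SNP threshold: {threshold:.3e}")
print(f"top SNP passes threshold: {top_snp_p < threshold}")
print(f"implied phenotypic SD of BW: {implied_sd_kg:.2f} kg")
```

    Under that assumption the top SNP clears the per-SNP cutoff by several orders of magnitude, and the two reported effect sizes imply a phenotypic standard deviation of roughly 8.5 kg.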

  19. Genome-wide identification of RNA editing in hepatocellular carcinoma.

    PubMed

    Kang, Lin; Liu, Xiaoqiao; Gong, Zhoulin; Zheng, Hancheng; Wang, Jun; Li, Yingrui; Yang, Huanming; Hardwick, James; Dai, Hongyue; Poon, Ronnie T P; Lee, Nikki P; Mao, Mao; Peng, Zhiyu; Chen, Ronghua

    2015-02-01

    We performed whole-transcriptome and whole-genome sequencing on nine pairs of hepatocellular carcinoma (HCC) tumors and matched adjacent tissues to identify RNA editing events. Using an improved bioinformatics pipeline, we identified a mean of 26,982 editing sites per sample, of which a mean of 89.5% were canonical A→G edits. The editing rate was significantly higher in tumors than in adjacent normal tissues. Comparing tumor and normal tissues of each patient, we found 7 non-synonymous tissue-specific editing events in the coding region, including 4 tumor-specific edits and 3 normal-specific edits, as well as 292 edits varying in editing degree. Significant expression changes were found in tumors for 150 genes associated with RNA editing, with 3 of the 4 most significant genes being cancer related. Our results show that editing might be related to higher gene expression. These findings indicate that RNA editing modification may play an important role in the development of HCC. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Nature and function of insulator protein binding sites in the Drosophila genome

    PubMed Central

    Schwartz, Yuri B.; Linder-Basso, Daniela; Kharchenko, Peter V.; Tolstorukov, Michael Y.; Kim, Maria; Li, Hua-Bing; Gorchakov, Andrey A.; Minoda, Aki; Shanower, Gregory; Alekseyenko, Artyom A.; Riddle, Nicole C.; Jung, Youngsook L.; Gu, Tingting; Plachetka, Annette; Elgin, Sarah C.R.; Kuroda, Mitzi I.; Park, Peter J.; Savitsky, Mikhail; Karpen, Gary H.; Pirrotta, Vincenzo

    2012-01-01

    Chromatin insulator elements and associated proteins have been proposed to partition eukaryotic genomes into sets of independently regulated domains. Here we test this hypothesis by quantitative genome-wide analysis of insulator protein binding to Drosophila chromatin. We find distinct combinatorial binding of insulator proteins to different classes of sites and uncover a novel type of insulator element that binds CP190 but not any other known insulator proteins. Functional characterization of different classes of binding sites indicates that only a small fraction act as robust insulators in standard enhancer-blocking assays. We show that insulators restrict the spreading of the H3K27me3 mark but only at a small number of Polycomb target regions and only to prevent repressive histone methylation within adjacent genes that are already transcriptionally inactive. RNAi knockdown of insulator proteins in cultured cells does not lead to major alterations in genome expression. Taken together, these observations argue against the concept of a genome partitioned by specialized boundary elements and suggest that insulators are reserved for specific regulation of selected genes. PMID:22767387

  1. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea

    PubMed Central

    2013-01-01

    Background: The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results: Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions: Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome, suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706

  2. Characterisation of the subtelomeric regions of Giardia lamblia genome isolate WBC6.

    PubMed

    Prabhu, Anjali; Morrison, Hilary G; Martinez, Charles R; Adam, Rodney D

    2007-04-01

    Giardia trophozoites are polyploid and have five chromosomes. The chromosome homologues demonstrate considerable size heterogeneity due to variation in the subtelomeric regions. We used clones from the genome project with telomeric sequence at one end to identify six subtelomeric regions in addition to previously identified subtelomeric regions, to study the telomeric arrangement of the chromosomes. The subtelomeric regions included two retroposons, one retroposon pseudogene, and two vsp genes, in addition to the previously identified subtelomeric regions that include ribosomal DNA repeats. The presence of vsp genes in a subtelomeric region suggests that telomeric rearrangements may contribute to the generation of vsp diversity. These studies of the subtelomeric regions of Giardia may contribute to our understanding of the factors that maintain stability, while allowing diversity in chromosome structure.

  3. Silencing is noisy: population and cell level noise in telomere-adjacent genes is dependent on telomere position and sir2.

    PubMed

    Anderson, Matthew Z; Gerstein, Aleeza C; Wigen, Lauren; Baller, Joshua A; Berman, Judith

    2014-07-01

    Cell-to-cell gene expression noise is thought to be an important mechanism for generating phenotypic diversity. Furthermore, telomeric regions are major sites for gene amplification, which is thought to drive genetic diversity. Here we found that individual subtelomeric TLO genes exhibit increased variation in transcript and protein levels at both the cell-to-cell level as well as at the population-level. The cell-to-cell variation, termed Telomere-Adjacent Gene Expression Noise (TAGEN) was largely intrinsic noise and was dependent upon genome position: noise was reduced when a TLO gene was expressed at an ectopic internal locus and noise was elevated when a non-telomeric gene was expressed at a telomere-adjacent locus. This position-dependent TAGEN also was dependent on Sir2p, an NAD+-dependent histone deacetylase. Finally, we found that telomere silencing and TAGEN are tightly linked and regulated in cis: selection for either silencing or activation of a TLO-adjacent URA3 gene resulted in reduced noise at the neighboring TLO but not at other TLO genes. This provides experimental support to computational predictions that the ability to shift between silent and active chromatin states has a major effect on cell-to-cell noise. Furthermore, it demonstrates that these shifts affect the degree of expression variation at each telomere individually.

  4. The genomic complexity of primary human prostate cancer

    PubMed Central

    Berger, Michael F.; Lawrence, Michael S.; Demichelis, Francesca; Drier, Yotam; Cibulskis, Kristian; Sivachenko, Andrey Y.; Sboner, Andrea; Esgueva, Raquel; Pflueger, Dorothee; Sougnez, Carrie; Onofrio, Robert; Carter, Scott L.; Park, Kyung; Habegger, Lukas; Ambrogio, Lauren; Fennell, Timothy; Parkin, Melissa; Saksena, Gordon; Voet, Douglas; Ramos, Alex H.; Pugh, Trevor J.; Wilkinson, Jane; Fisher, Sheila; Winckler, Wendy; Mahan, Scott; Ardlie, Kristin; Baldwin, Jennifer; Simons, Jonathan W.; Kitabayashi, Naoki; MacDonald, Theresa Y.; Kantoff, Philip W.; Chin, Lynda; Gabriel, Stacey B.; Gerstein, Mark B.; Golub, Todd R.; Meyerson, Matthew; Tewari, Ashutosh; Lander, Eric S.; Getz, Gad; Rubin, Mark A.; Garraway, Levi A.

    2010-01-01

    Prostate cancer is the second most common cause of male cancer deaths in the United States. Here we present the complete sequence of seven primary prostate cancers and their paired normal counterparts. Several tumors contained complex chains of balanced rearrangements that occurred within or adjacent to known cancer genes. Rearrangement breakpoints were enriched near open chromatin, androgen receptor and ERG DNA binding sites in the setting of the ETS gene fusion TMPRSS2-ERG, but inversely correlated with these regions in tumors lacking ETS fusions. This observation suggests a link between chromatin or transcriptional regulation and the genesis of genomic aberrations. Three tumors contained rearrangements that disrupted CADM2, and four harbored events disrupting either PTEN (unbalanced events), a prostate tumor suppressor, or MAGI2 (balanced events), a PTEN interacting protein not previously implicated in prostate tumorigenesis. Thus, genomic rearrangements may arise from transcriptional or chromatin aberrancies to engage prostate tumorigenic mechanisms. PMID:21307934

  5. Regions of conservation and divergence in the 3' untranslated sequences of genomic RNA from Ross River virus isolates.

    PubMed

    Faragher, S G; Dalgarno, L

    1986-07-20

    The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.

  6. L1-associated genomic regions are deleted in somatic cells of the healthy human brain.

    PubMed

    Erwin, Jennifer A; Paquola, Apuã C M; Singer, Tatjana; Gallina, Iryna; Novotny, Mark; Quayle, Carolina; Bedrosian, Tracy A; Alves, Francisco I A; Butcher, Cheyenne R; Herdy, Joseph R; Sarkar, Anindita; Lasken, Roger S; Muotri, Alysson R; Gage, Fred H

    2016-12-01

    The healthy human brain is a mosaic of varied genomes. Long interspersed element-1 (LINE-1 or L1) retrotransposition is known to create mosaicism by inserting L1 sequences into new locations of somatic cell genomes. Using a machine learning-based, single-cell sequencing approach, we discovered that somatic L1-associated variants (SLAVs) are composed of two classes: L1 retrotransposition insertions and retrotransposition-independent L1-associated variants. We demonstrate that a subset of SLAVs comprises somatic deletions generated by L1 endonuclease cutting activity. Retrotransposition-independent rearrangements in inherited L1s resulted in the deletion of proximal genomic regions. These rearrangements were resolved by microhomology-mediated repair, which suggests that L1-associated genomic regions are hotspots for somatic copy number variants in the brain and therefore a heritable genetic contributor to somatic mosaicism. We demonstrate that SLAVs are present in crucial neural genes, such as DLG2 (also called PSD93), and affect 44-63% of the cells in the healthy brain.

  7. L1-Associated Genomic Regions are Deleted in Somatic Cells of the Healthy Human Brain

    PubMed Central

    Erwin, Jennifer A.; Paquola, Apuã C.M.; Singer, Tatjana; Gallina, Iryna; Novotny, Mark; Quayle, Carolina; Bedrosian, Tracy; Ivanio, Francisco; Butcher, Cheyenne R.; Herdy, Joseph R.; Sarkar, Anindita; Lasken, Roger S.; Muotri, Alysson R.; Gage, Fred H.

    2016-01-01

    The healthy human brain is a mosaic of varied genomes. L1 retrotransposition is known to create mosaicism by inserting L1 sequences into new locations of somatic cell genomes. Using a machine learning-based, single-cell sequencing approach, we discovered that Somatic L1-Associated Variants (SLAVs) are actually composed of two classes: L1 retrotransposition insertions and retrotransposition-independent L1-associated variants. We demonstrate that a subset of SLAVs are, in fact, somatic deletions generated by L1 endonuclease cutting activity. Retrotransposition-independent rearrangements within inherited L1s resulted in the deletion of proximal genomic regions. These rearrangements were resolved by microhomology-mediated repair, which suggests that L1-associated genomic regions are hotspots for somatic copy number variants in the brain and therefore a heritable genetic contributor to somatic mosaicism. We demonstrate that SLAVs are present in crucial neural genes, such as DLG2/PSD93, and affect 44–63% of the cells in the healthy brain. PMID:27618310

  8. DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies

    PubMed Central

    Anselmetti, Yoann; Patterson, Murray; Ponty, Yann; Bérard, Sèverine; Chauve, Cedric; Scornavacca, Celine; Daubin, Vincent; Tannier, Eric

    2017-01-01

    DeCoSTAR is a software that aims at reconstructing the organization of ancestral genes or genomes in the form of sets of neighborhood relations (adjacencies) between pairs of ancestral genes or gene domains. It can also improve the assembly of fragmented genomes by proposing evolutionary-induced adjacencies between scaffolding fragments. Ancestral genes or domains are deduced from reconciled phylogenetic trees under an evolutionary model that considers gains, losses, speciations, duplications, and transfers as possible events for gene evolution. Reconciliations are either given as input or computed with the ecceTERA package, into which DeCoSTAR is integrated. DeCoSTAR computes adjacency evolutionary scenarios using a scoring scheme based on a weighted sum of adjacency gains and breakages. Solutions, both optimal and near-optimal, are sampled according to the Boltzmann–Gibbs distribution centered around parsimonious solutions, and statistical supports on ancestral and extant adjacencies are provided. DeCoSTAR supports the features of previously contributed tools that reconstruct ancestral adjacencies, namely DeCo, DeCoLT, ART-DeCo, and DeClone. In a few minutes, DeCoSTAR can reconstruct the evolutionary history of domains inside genes, of gene fusion and fission events, or of gene order along chromosomes, for large data sets including dozens of whole genomes from all kingdoms of life. We illustrate the potential of DeCoSTAR with several applications: ancestral reconstruction of gene orders for Anopheles mosquito genomes, multidomain proteins in Drosophila, and gene fusion and fission detection in Actinobacteria. Availability: http://pbil.univ-lyon1.fr/software/DeCoSTAR (Last accessed April 24, 2017). PMID:28402423
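
    The scoring and sampling idea described in this record (a weighted sum of adjacency gains and breakages, with solutions drawn from a Boltzmann–Gibbs distribution) can be illustrated with a toy sketch. It is not DeCoSTAR's code or interface: the weights, the three scenarios, and the names `scenario_cost` and `boltzmann_probabilities` are hypothetical and chosen only to show how lower-cost scenarios receive higher sampling probability.

```python
import math
import random

# Hypothetical weights, for illustration only; not DeCoSTAR's defaults.
GAIN_WEIGHT = 2.0
BREAK_WEIGHT = 1.0

def scenario_cost(gains: int, breaks: int) -> float:
    """Weighted sum of adjacency gains and breakages."""
    return GAIN_WEIGHT * gains + BREAK_WEIGHT * breaks

def boltzmann_probabilities(costs, kT=1.0):
    """Probabilities proportional to exp(-cost / kT), normalised to sum to 1."""
    best = min(costs)
    # Subtracting the minimum cost keeps the exponentials numerically stable.
    weights = [math.exp(-(c - best) / kT) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

# (gains, breaks) for three made-up adjacency scenarios.
scenarios = [(1, 0), (1, 2), (2, 1)]
costs = [scenario_cost(g, b) for g, b in scenarios]
probs = boltzmann_probabilities(costs, kT=1.0)

# Draw a few scenarios; the most parsimonious one is sampled most often.
rng = random.Random(0)
sampled = rng.choices(range(len(scenarios)), weights=probs, k=5)

for (g, b), c, p in zip(scenarios, costs, probs):
    print(f"gains={g} breaks={b} cost={c:.1f} probability={p:.3f}")
```

    Lowering `kT` concentrates the probability mass on the parsimonious scenario, while raising it spreads the samples over near-optimal ones.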

  9. Predicted stem-loop structures and variation in nucleotide sequence of 3' noncoding regions among animal calicivirus genomes.

    PubMed

    Seal, B S; Neill, J D; Ridpath, J F

    1994-07-01

    Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.

  10. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720

  11. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  12. Apparatus and methods for impingement cooling of an undercut region adjacent a side wall of a turbine nozzle segment

    DOEpatents

    Burdgick, Steven Sebastian; Itzel, Gary Michael

    2001-01-01

    A gas turbine nozzle segment has outer and inner bands. Each band includes a side wall, a cover and an impingement plate between the cover and nozzle wall defining two cavities on opposite sides of the impingement plate. Cooling steam is supplied to one cavity for flow through apertures of the impingement plate to cool the nozzle wall. The side wall of the band and inturned flange define with the nozzle wall an undercut region. The inturned flange has a plurality of apertures for directing cooling steam to cool the side wall between adjacent nozzle segments.

  13. Evidence that local land use practices influence regional climate, vegetation, and stream flow patterns in adjacent natural areas

    USGS Publications Warehouse

    Stohlgren, T.J.; Chase, T.N.; Pielke, R.A.; Kittel, T.G.F.; Baron, Jill S.

    1998-01-01

    We present evidence that land use practices in the plains of Colorado influence regional climate and vegetation in adjacent natural areas in the Rocky Mountains in predictable ways. Mesoscale climate model simulations using the Colorado State University Regional Atmospheric Modelling System (RAMS) projected that modifications to natural vegetation in the plains, primarily due to agriculture and urbanization, could produce lower summer temperatures in the mountains. We corroborate the RAMS simulations with three independent sets of data: (i) climate records from 16 weather stations, which showed significant trends of decreasing July temperatures in recent decades; (ii) the distribution of seedlings of five dominant conifer species in Rocky Mountain National Park, Colorado, which suggested that cooler, wetter conditions occurred over roughly the same time period; and (iii) increased stream flow, normalized for changes in precipitation, during the summer months in four river basins, which also indicates cooler summer temperatures and lower transpiration at landscape scales. Combined, the mesoscale atmospheric/land-surface model, short-term trends in regional temperatures, forest distribution changes, and hydrology data indicate that the effects of land use practices on regional climate may overshadow larger-scale temperature changes commonly associated with observed increases in CO2 and other greenhouse gases.

  14. Defining window-boundaries for genomic analyses using smoothing spline techniques

    DOE PAGES

    Beissinger, Timothy M.; Rosa, Guilherme J.M.; Kaeppler, Shawn M.; ...

    2015-04-17

    High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome.
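
    The boundary rule in this record (smooth the per-marker statistic, then cut windows at inflection points) can be sketched in a few lines. Note the substitution: to stay dependency-free, this sketch uses a centered moving average as a stand-in for the cubic smoothing spline described in the abstract, and it locates inflection points as sign changes of the discrete second difference. The data are synthetic, and every function name is hypothetical rather than taken from the authors' software.

```python
import math
import random

def moving_average(values, half_width):
    """Centered moving average; the window shrinks at the ends of the series.
    Stand-in smoother: the published method fits a cubic smoothing spline."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def inflection_indices(smoothed):
    """Marker indices at which the discrete second difference changes sign."""
    second = [smoothed[i - 1] - 2 * smoothed[i] + smoothed[i + 1]
              for i in range(1, len(smoothed) - 1)]
    boundaries = []
    for j in range(1, len(second)):
        if second[j - 1] * second[j] < 0:
            boundaries.append(j + 1)  # second[j] belongs to marker j + 1
    return boundaries

def windows_from_boundaries(n_markers, boundaries):
    """Half-open (start, end) marker windows delimited by the boundaries."""
    edges = [0] + boundaries + [n_markers]
    return [(edges[k], edges[k + 1]) for k in range(len(edges) - 1)]

# Synthetic per-marker statistic: a slow wave plus noise (made-up data).
rng = random.Random(1)
statistic = [math.sin(2 * math.pi * (i + 0.5) / 50) + rng.gauss(0, 0.05)
             for i in range(200)]

smoothed = moving_average(statistic, half_width=5)
boundaries = inflection_indices(smoothed)
windows = windows_from_boundaries(len(statistic), boundaries)
print(len(windows), "windows; first few:", windows[:3])
```

    Window sizes fall out of the data rather than being fixed in advance, which is the property the abstract emphasises; the amount of smoothing (here `half_width`) still controls how many boundaries appear.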

  15. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    PubMed

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.

  16. Linear and exponential TAIL-PCR: a method for efficient and quick amplification of flanking sequences adjacent to Tn5 transposon insertion sites.

    PubMed

    Jia, Xianbo; Lin, Xinjian; Chen, Jichen

    2017-11-02

    Current genome walking methods are very time consuming, and many produce non-specific amplification products. To amplify the flanking sequences that are adjacent to Tn5 transposon insertion sites in Serratia marcescens FZSF02, we developed a genome walking method based on TAIL-PCR. This PCR method added a 20-cycle linear amplification step before the exponential amplification step to increase the concentration of the target sequences. Products of the linear amplification and the exponential amplification were diluted 100-fold to decrease the concentration of the templates that cause non-specific amplification. Fast DNA polymerase with a high extension speed was used in this method, and an amplification program was used to rapidly amplify long specific sequences. With this linear and exponential TAIL-PCR (LETAIL-PCR), we successfully obtained products larger than 2 kb from Tn5 transposon insertion mutant strains within 3 h. This method can be widely used in genome walking studies to amplify unknown sequences that are adjacent to known sequences.

  17. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes.

    PubMed

    Feltus, Frank A; Saski, Christopher A; Mockaitis, Keithanne; Haiminen, Niina; Parida, Laxmi; Smith, Zachary; Ford, James; Staton, Margaret E; Ficklin, Stephen P; Blackmon, Barbara P; Cheng, Chun-Huai; Schnell, Raymond J; Kuhn, David N; Motamayor, Juan-Carlos

    2011-07-27

    BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed.

  18. Variable length adjacent partitioning for PTS based PAPR reduction of OFDM signal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ibraheem, Zeyid T.; Rahman, Md. Mijanur; Yaakob, S. N.

    2015-05-15

    Peak-to-average power ratio (PAPR) is a major drawback in OFDM communication. It drives the power amplifier into its nonlinear operating region, resulting in loss of data integrity. As such, there is a strong motivation to find techniques to reduce PAPR. Partial Transmit Sequence (PTS) is an attractive scheme for this purpose. Judicious partitioning of the OFDM data frame into disjoint subsets is a pivotal component of any PTS scheme. Out of the existing partitioning techniques, adjacent partitioning is characterized by an attractive trade-off between cost and performance. With the aim of determining the effects of length variability of adjacent partitions, we performed an investigation into the performance of variable length adjacent partitioning (VL-AP) and fixed length adjacent partitioning in comparison with other partitioning schemes such as pseudorandom partitioning. Simulation results with different modulation and partitioning scenarios showed that fixed length adjacent partitioning had better performance compared to variable length adjacent partitioning. As expected, simulation results showed a slightly better performance of the pseudorandom partitioning technique compared to fixed and variable adjacent partitioning schemes. However, as the pseudorandom technique incurs high computational complexity, adjacent partitioning schemes were still seen as favorable candidates for PAPR reduction.
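
    As a rough illustration of the ideas in this abstract, the sketch below computes PAPR for a small frame and runs an exhaustive PTS search over adjacent partitions. It is not the authors' simulation: the function names, the naive inverse DFT, and the two-valued phase set (+1, -1) are all assumptions chosen for brevity. Passing equal entries in `sizes` gives fixed length adjacent partitioning, and unequal entries give the variable length case.

```python
import cmath
import math
from itertools import product


def idft(freq):
    """Naive inverse DFT (O(N^2)); adequate for small illustrative frames."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(time_signal):
    """Peak-to-average power ratio of a time-domain signal, in dB."""
    powers = [abs(x) ** 2 for x in time_signal]
    mean_power = sum(powers) / len(powers)
    if mean_power == 0:
        raise ValueError("signal has zero average power")
    return 10 * math.log10(max(powers) / mean_power)


def adjacent_partition(frame, sizes):
    """Split a frame into disjoint sub-blocks of consecutive subcarriers.

    Each sub-block is zero-padded to the full frame length, so the
    sub-blocks add up to the original frame.
    """
    if sum(sizes) != len(frame):
        raise ValueError("partition sizes must add up to the frame length")
    blocks, start = [], 0
    for size in sizes:
        block = [0j] * len(frame)
        block[start:start + size] = frame[start:start + size]
        blocks.append(block)
        start += size
    return blocks


def pts_best_papr(frame, sizes, phases=(1, -1)):
    """Try every phase-factor combination; return (lowest PAPR in dB, factors)."""
    sub_time = [idft(block) for block in adjacent_partition(frame, sizes)]
    best = None
    for factors in product(phases, repeat=len(sizes)):
        combined = [
            sum(f * s[t] for f, s in zip(factors, sub_time))
            for t in range(len(frame))
        ]
        value = papr_db(combined)
        if best is None or value < best[0]:
            best = (value, factors)
    return best
```

    For example, `pts_best_papr(frame, [2, 2, 2, 2])` and `pts_best_papr(frame, [1, 3, 2, 2])` compare a fixed and a variable partitioning of the same 8-subcarrier frame. The exhaustive search grows exponentially with the number of sub-blocks, which is the cost side of the trade-off the abstract mentions.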

  19. A Cryptosporidium parvum genomic region encoding hemolytic activity.

    PubMed Central

    Steele, M I; Kuhls, T L; Nida, K; Meka, C S; Halabi, I M; Mosier, D A; Elliott, W; Crawford, D L; Greenfield, R A

    1995-01-01

    Successful parasitization by Cryptosporidium parvum requires multiple disruptions in both host and protozoan cell membranes as cryptosporidial sporozoites invade intestinal epithelial cells and subsequently develop into asexual and sexual life stages. To identify cryptosporidial proteins which may play a role in these membrane alterations, hemolytic activity was used as a marker to screen a C. parvum genomic expression library. A stable hemolytic clone (H4) containing a 5.5-kb cryptosporidial genomic fragment was identified. The hemolytic activity encoded on H4 was mapped to a 1-kb region that contained a complete 690-bp open reading frame (hemA) ending in a common stop codon. A 21-kDa plasmid-encoded recombinant protein was expressed in maxicells containing H4. Subclones of H4 which contained only a portion of hemA did not induce hemolysis on blood agar or promote expression of the recombinant protein in maxicells. Reverse transcriptase-mediated PCR analysis of total RNA isolated from excysted sporozoites and the intestines of infected adult mice with severe combined immunodeficiency demonstrated that hemA is actively transcribed during the cryptosporidial life cycle. PMID:7558289

  20. Tandem repeat regions within the Burkholderia pseudomallei genome and their application for high resolution genotyping.

    PubMed

    U'Ren, Jana M; Schupp, James M; Pearson, Talima; Hornstra, Heidie; Friedman, Christine L Clark; Smith, Kimothy L; Daugherty, Rebecca R Leadem; Rhoton, Shane D; Leadem, Ben; Georgia, Shalamar; Cardon, Michelle; Huynh, Lynn Y; DeShazer, David; Harvey, Steven P; Robison, Richard; Gal, Daniel; Mayo, Mark J; Wagner, David; Currie, Bart J; Keim, Paul

    2007-03-30

    The facultative, intracellular bacterium Burkholderia pseudomallei is the causative agent of melioidosis, a serious infectious disease of humans and animals. We identified and categorized tandem repeat arrays and their distribution throughout the genome of B. pseudomallei strain K96243 in order to develop a genetic typing method for B. pseudomallei. We then screened 104 of the potentially polymorphic loci across a diverse panel of 31 isolates including B. pseudomallei, B. mallei and B. thailandensis in order to identify loci with varying degrees of polymorphism. A subset of these tandem repeat arrays was subsequently developed into a multiple-locus VNTR analysis to examine 66 B. pseudomallei and 21 B. mallei isolates from around the world, as well as 95 lineages from a serial transfer experiment encompassing ~18,000 generations. B. pseudomallei contains a preponderance of tandem repeat loci throughout its genome, many of which are duplicated elsewhere in the genome. The majority of these loci are composed of repeat motif lengths of 6 to 9 bp with 4 to 10 repeat units and are predominantly located in intergenic regions of the genome. Across geographically diverse B. pseudomallei and B. mallei isolates, the 32 VNTR loci displayed between 7 and 28 alleles, with Nei's diversity values ranging from 0.47 to 0.94. Mutation rates for these loci are comparable (>10⁻⁵ per locus per generation) to those of the most diverse tandemly repeated regions found in other less diverse bacteria. The frequency, location and duplicate nature of tandemly repeated regions within the B. pseudomallei genome indicate that these tandem repeat regions may play a role in generating and maintaining adaptive genomic variation. Multiple-locus VNTR analysis revealed extensive diversity within the global isolate set containing B. pseudomallei and B. mallei, and it detected genotypic differences within clonal lineages of both species that were identical using previous typing methods.
Given the health
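
    The Nei's diversity values quoted in this abstract summarize how evenly alleles are distributed at a locus. A minimal sketch of the usual formula, H = 1 - sum(p_i^2) over allele frequencies p_i, is given below. The function name and the optional n/(n-1) small-sample correction are assumptions; the abstract does not say which form the authors used.

```python
from collections import Counter


def nei_diversity(alleles, unbiased=False):
    """Nei's gene diversity for one locus.

    alleles: one allele call per isolate (e.g. VNTR repeat counts).
    unbiased: if True, apply the n / (n - 1) small-sample correction.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one allele call")
    # 1 minus the sum of squared allele frequencies.
    h = 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())
    if unbiased:
        if n < 2:
            raise ValueError("the unbiased estimate needs at least two calls")
        h *= n / (n - 1)
    return h
```

    A locus where every isolate carries the same allele scores 0, and the value approaches 1 as more alleles appear at similar frequencies.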

  1. RRE: a tool for the extraction of non-coding regions surrounding annotated genes from genomic datasets.

    PubMed

    Lazzarato, F; Franceschinis, G; Botta, M; Cordero, F; Calogero, R A

    2004-11-01

    RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html
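
    To make the notion of regions surrounding a coding sequence concrete, the sketch below derives upstream and downstream flank coordinates from a gene's position and strand. It is only an illustration of the coordinate logic, not RRE's parser: the function name, the 1-based inclusive coordinate convention, and the fixed flank length are assumptions.

```python
def flanking_regions(start, end, strand, flank, seq_len):
    """Return upstream/downstream flanks of a gene as (start, end) tuples.

    Coordinates are 1-based and inclusive. A flank is None when the gene
    touches the corresponding end of the sequence.
    """
    if not 1 <= start <= end <= seq_len:
        raise ValueError("gene coordinates fall outside the sequence")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if flank < 1:
        raise ValueError("flank length must be positive")

    # Flanks on the forward-strand coordinate axis, clipped to the sequence.
    left = (max(1, start - flank), start - 1) if start > 1 else None
    right = (end + 1, min(seq_len, end + flank)) if end < seq_len else None

    # On the reverse strand, upstream lies at higher coordinates.
    if strand == "+":
        return {"upstream": left, "downstream": right}
    return {"upstream": right, "downstream": left}
```

    The strand swap is the step that is easiest to get wrong: for a reverse-strand gene the promoter-side flank sits to the right of the annotated interval.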

  2. Complete sequence and analysis of the mitochondrial genome of Hemiselmis andersenii CCMP644 (Cryptophyceae).

    PubMed

    Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

    2008-05-12

    Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of an approximately 20 kbp long intergenic region comprised of numerous tandem and dispersed repeat units of 22-336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages.
Comparison of

  3. A new interpretation of deformation rates in the Snake River Plain and adjacent basin and range regions based on GPS measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    S.J. Payne; R. McCaffrey; R.W. King

    2012-04-01

    We evaluate horizontal Global Positioning System (GPS) velocities together with geologic, volcanic, and seismic data to interpret extension, shear, and contraction within the Snake River Plain and the Northern Basin and Range Province, U.S.A. We estimate horizontal surface velocities using GPS data collected at 385 sites from 1994 to 2009 and present an updated velocity field within the Stable North American Reference Frame (SNARF). Our results show an ENE-oriented extensional strain rate of 5.9 ± 0.7 × 10⁻⁹ yr⁻¹ in the Centennial Tectonic belt and an E-oriented extensional strain rate of 6.2 ± 0.3 × 10⁻⁹ yr⁻¹ in the Intermountain Seismic belt combined with the northern Great Basin. These extensional strain rates contrast with the regional north-south contraction of -2.6 ± 1.1 × 10⁻⁹ yr⁻¹ calculated in the Snake River Plain and Owyhee-Oregon Plateau over a 125 × 650 km region. Tests that include dike-opening reveal that rapid extension by dike intrusion in volcanic rift zones does not occur in the Snake River Plain at present. This slow internal deformation in the Snake River Plain is in contrast to the rapidly-extending adjacent Basin and Range provinces and implies shear along boundaries of the Snake River Plain. We estimate right-lateral shear with slip rates of 0.5-1.5 mm/yr along the northwestern boundary adjacent to the Centennial Tectonic belt and left-lateral oblique extension with slip rates of <0.5 to 1.7 mm/yr along the southeastern boundary adjacent to the Intermountain Seismic belt. The fastest lateral shearing occurs near the Yellowstone Plateau where strike-slip focal mechanisms and faults with observed strike-slip components of motion are documented. The regional GPS velocity gradients are best fit by nearby poles of rotation for the Centennial Tectonic belt, Idaho batholith, Snake River Plain, Owyhee-Oregon Plateau, and central Oregon, indicating that clockwise rotation is driven by extension to

  4. A Three-Dimensional Model of the Yeast Genome

    NASA Astrophysics Data System (ADS)

    Noble, William; Duan, Zhi-Jun; Andronescu, Mirela; Schutz, Kevin; McIlwain, Sean; Kim, Yoo Jung; Lee, Choli; Shendure, Jay; Fields, Stanley; Blau, C. Anthony

    Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or factories for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.

  5. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    PubMed Central

    2011-01-01

    Background BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. Results This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Conclusions Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed. PMID:21794110

  6. Comparative genomics of Lupinus angustifolius gene-rich regions: BAC library exploration, genetic mapping and cytogenetics

    PubMed Central

    2013-01-01

    Background The narrow-leafed lupin, Lupinus angustifolius L., is a grain legume species with a relatively compact genome. The species has 2n = 40 chromosomes and its genome size is 960 Mbp/1C. During the last decade, L. angustifolius genomic studies have achieved several milestones, such as molecular-marker development, linkage maps, and bacterial artificial chromosome (BAC) libraries. Here, these resources were integratively used to identify and sequence two gene-rich regions (GRRs) of the genome. Results The genome was screened with a probe representing the sequence of a microsatellite fragment length polymorphism (MFLP) marker linked to Phomopsis stem blight resistance. BAC clones selected by hybridization were subjected to restriction fingerprinting and contig assembly, and 232 BAC-ends were sequenced and annotated. BAC fluorescence in situ hybridization (BAC-FISH) identified eight single-locus clones. Based on physical mapping, cytogenetic localization, and BAC-end annotation, five clones were chosen for sequencing. Within the sequences of clones that hybridized in FISH to a single-locus, two large GRRs were identified. The GRRs showed strong and conserved synteny to Glycine max duplicated genome regions, illustrated by both identical gene order and parallel orientation. In contrast, in the clones with dispersed FISH signals, more than one-third of sequences were transposable elements. Sequenced, single-locus clones were used to develop 12 genetic markers, increasing the number of L. angustifolius chromosomes linked to appropriate linkage groups by five pairs. Conclusions In general, probes originating from MFLP sequences can assist genome screening and gene discovery. However, such probes are not useful for positional cloning, because they tend to hybridize to numerous loci. GRRs identified in L. angustifolius contained a low number of interspersed repeats and had a high level of synteny to the genome of the model legume G. max. Our results showed that

  7. A Genome-Wide Association Study Identifies Multiple Regions Associated with Head Size in Catfish

    PubMed Central

    Geng, Xin; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Chao; Wang, Ruijia; Sha, Jin; Zeng, Peng; Zhi, Degui; Liu, Zhanjiang

    2016-01-01

    Skull morphology is fundamental to evolution and the biological adaptation of species to their environments. With aquaculture fish species, head size is also important for economic reasons because it has a direct impact on fillet yield. However, little is known about the underlying genetic basis of head size. Catfish is the primary aquaculture species in the United States. In this study, we performed a genome-wide association study using the catfish 250K SNP array with backcross hybrid catfish to map the QTL for head size (head length, head width, and head depth). One significantly associated region on linkage group (LG) 7 was identified for head length. In addition, LGs 7, 9, and 16 contain suggestively associated regions for head length. For head width, significantly associated regions were found on LG9, and additional suggestively associated regions were identified on LGs 5 and 7. No region was found associated with head depth. Head size genetic loci were mapped in catfish to genomic regions with candidate genes involved in bone development. Comparative analysis indicated that homologs of several candidate genes are also involved in skull morphology in various other species ranging from amphibian to mammalian species, suggesting possible evolutionary conservation of those genes in the control of skull morphologies. PMID:27558670

  8. Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria

    PubMed Central

    Dy, Ron L; Pitman, Andrew R; Fineran, Peter C

    2013-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution. PMID:24251073

  9. Genome-wide copy number variation (CNV) detection in Nelore cattle reveals highly frequent variants in genome regions harboring QTLs affecting production traits.

    PubMed

    da Silva, Joaquim Manoel; Giachetto, Poliana Fernanda; da Silva, Luiz Otávio; Cintra, Leandro Carrijo; Paiva, Samuel Rezende; Yamagishi, Michel Eduardo Beleza; Caetano, Alexandre Rodrigues

    2016-06-13

    Copy number variations (CNVs) have been shown to account for substantial portions of observed genomic variation and have been associated with qualitative and quantitative traits and the onset of disease in a number of species. Information from high-resolution studies to detect, characterize and estimate population-specific variant frequencies will facilitate the incorporation of CNVs in genomic studies to identify genes affecting traits of importance. Genome-wide CNVs were detected in high-density single nucleotide polymorphism (SNP) genotyping data from 1,717 Nelore (Bos indicus) cattle, and in NGS data from eight key ancestral bulls. A total of 68,007 and 12,786 distinct CNVs were observed, respectively. Cross-comparisons of results obtained for the eight resequenced animals revealed that 92 % of the CNVs were observed in both datasets, while 62 % of all detected CNVs were observed to overlap with previously validated cattle copy number variant regions (CNVRs). Observed CNVs were used for obtaining breed-specific CNV frequencies and identification of CNVRs, which were subsequently used for gene annotation. A total of 688 of the detected CNVRs were observed to overlap with 286 non-redundant QTLs associated with important production traits in cattle. All of 34 CNVs previously reported to be associated with milk production traits in Holsteins were also observed in Nelore cattle. Comparisons of estimated frequencies of these CNVs in the two breeds revealed 14, 13, 6 and 14 regions in high (>20 %), low (<20 %) and divergent (NEL > HOL, NEL < HOL) frequencies, respectively. Obtained results significantly enriched the bovine CNV map and enabled the identification of variants that are potentially associated with traits under selection in Nelore cattle, particularly in genome regions harboring QTLs affecting production traits.

  10. Transcription of the herpes simplex virus 1 genome during productive and quiescent infection of neuronal and nonneuronal cells.

    PubMed

    Harkness, Justine M; Kader, Muhamuda; DeLuca, Neal A

    2014-06-01

    Herpes simplex virus 1 (HSV-1) can undergo a productive infection in nonneuronal and neuronal cells such that the genes of the virus are transcribed in an ordered cascade. HSV-1 can also establish a more quiescent or latent infection in peripheral neurons, where gene expression is substantially reduced relative to that in productive infection. HSV mutants defective in multiple immediate early (IE) gene functions are highly defective for later gene expression and model some aspects of latency in vivo. We compared the expression of wild-type (wt) virus and IE gene mutants in nonneuronal cells (MRC5) and adult murine trigeminal ganglion (TG) neurons using the Illumina platform for cDNA sequencing (RNA-seq). RNA-seq analysis of wild-type virus revealed that expression of the genome mostly followed the previously established kinetics, validating the method, while highlighting variations in gene expression within individual kinetic classes. The accumulation of immediate early transcripts differed between MRC5 cells and neurons, with a greater abundance in neurons. Analysis of a mutant defective in all five IE genes (d109) showed dysregulated genome-wide low-level transcription that was more highly attenuated in MRC5 cells than in TG neurons. Furthermore, a subset of genes in d109 was more abundantly expressed over time in neurons. While the majority of the viral genome became relatively quiescent, the latency-associated transcript was specifically upregulated. Unexpectedly, other genes within repeat regions of the genome, as well as the unique genes just adjacent to the repeat regions, also remained relatively active in neurons. The relative permissiveness of TG neurons to viral gene expression near the joint region is likely significant during the establishment and reactivation of latency. During productive infection, the genes of HSV-1 are transcribed in an ordered cascade. HSV can also establish a more quiescent or latent infection in peripheral neurons. HSV mutants

  11. A multiplex primer design algorithm for target amplification of continuous genomic regions.

    PubMed

    Ozturk, Ahmet Rasit; Can, Tolga

    2017-06-19

    Targeted Next Generation Sequencing (NGS) assays are cost-efficient and reliable alternatives to Sanger sequencing. For sequencing of a very large set of genes, the target enrichment approach is suitable. However, for smaller genomic regions, the target amplification method is more efficient than both the target enrichment method and Sanger sequencing. The major difficulty of the target amplification method is the preparation of amplicons, in terms of required time, equipment, and labor. Multiplex PCR (MPCR) is a good solution for these problems. We propose a novel method to design MPCR primers for a continuous genomic region, following the best practices of clinically reliable PCR design processes. On an experimental setup with 48 different combinations of factors, we have shown that multiple parameters might affect finding the first feasible solution. Increasing the length of the initial primer candidate selection sequence gives better results, whereas waiting for a longer time to find the first feasible solution does not have a significant impact. We generated MPCR primer designs for the HBB whole gene, MEFV coding regions, and human exons between 2000 bp and 2100 bp long. Our benchmarking experiments show that the proposed MPCR approach is able to produce reliable NGS assay primers for a given sequence in a reasonable amount of time.
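
    Covering a continuous genomic region by target amplification means laying overlapping amplicons end to end across it. The sketch below shows only that tiling step under simple assumptions (a fixed amplicon length and a minimum overlap between neighbours). It is not the algorithm proposed in the paper, which also has to choose primers that work together in one reaction; the function name and parameters are hypothetical.

```python
def tile_amplicons(region_start, region_end, amplicon_len, overlap):
    """Cover [region_start, region_end] with overlapping amplicon windows.

    Coordinates are 1-based and inclusive. Consecutive windows share at
    least `overlap` bases. The last window is shifted left so that it
    ends exactly at region_end.
    """
    if region_end < region_start:
        raise ValueError("region_end must not precede region_start")
    if overlap < 0 or amplicon_len <= overlap:
        raise ValueError("amplicon_len must exceed a non-negative overlap")

    step = amplicon_len - overlap
    tiles = []
    start = region_start
    while True:
        end = start + amplicon_len - 1
        if end >= region_end:
            # Final window: clamp to the region and stop.
            end = region_end
            start = max(region_start, end - amplicon_len + 1)
            tiles.append((start, end))
            break
        tiles.append((start, end))
        start += step
    return tiles
```

    In a real design each window boundary would then be adjusted to wherever acceptable primer pairs exist, which is where the search effort described in the abstract is spent.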

  12. Seismicity in Azerbaijan and Adjacent Caspian Sea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Panahi, Behrouz M.

    2006-03-23

    So far, no general view of the geodynamic evolution of the Black Sea to Caspian Sea region has been elaborated. This is associated with the geological and structural complexities of the region revealed by geophysical, geochemical, petrologic, structural, and other studies. The clash of opinions on geodynamic conditions of the Caucasus region, some of them mutually exclusive, can be explained by a simplified interpretation of the seismic data. In this paper I analyze available data on earthquake occurrences in Azerbaijan and the adjacent Caspian Sea region. The results of the analysis of macroseismic and instrumental data, seismic regime, and earthquake recurrence indicate that the level of seismicity in the region is moderate, and seismic events are concentrated in the shallow part of the lithosphere. Seismicity is mostly intra-plate, and the spatial distribution of earthquake epicenters does not correlate with the plate boundaries.

  13. aCGH Local Copy Number Aberrations Associated with Overall Copy Number Genomic Instability in Colorectal Cancer: Coordinate Involvement of the Regions Including BCR and ABL

    PubMed Central

    Bartos, Jeremy D.; Gaile, Daniel P.; McQuaid, Devin E.; Conroy, Jeffrey M.; Darbary, Huferesh; Nowak, Norma J.; Block, Annemarie; Petrelli, Nicholas J.; Mittelman, Arnold; Stoler, Daniel L.; Anderson, Garth R.

    2007-01-01

    In order to identify small regions of the genome whose specific copy number alteration is associated with high genomic instability in the form of overall genome-wide copy number aberrations, we have analyzed array-based comparative genomic hybridization (aCGH) data from 33 sporadic colorectal carcinomas. Copy number changes of a small number of specific regions were significantly correlated with elevated overall amplifications and deletions scattered throughout the entire genome. One significant region at 9q34 includes the c-ABL gene. Another region spanning 22q11–13 includes the breakpoint cluster region (BCR) of the Philadelphia chromosome. Coordinate 22q11–13 alterations were observed in nine of eleven tumors with the 9q34 alteration. Additional regions on 1q and 14q were associated with overall genome-wide copy number changes, while copy number aberrations on chromosome 7p, 7q, and 13q21.1–31.3 were found associated with this instability only in tumors from patients with a smoking history. Our analysis demonstrates there are a small number of regions of the genome where gain or loss is commonly associated with a tumor’s overall level of copy number aberrations. Our finding that BCR and ABL are located within two of the instability-associated regions, and that the involvement of these two regions occurs coordinately, suggests a system akin to the BCR-ABL translocation of CML may be involved in genomic instability in about one-third of human colorectal carcinomas. PMID:17196995

  14. megaTALs: a rare-cleaving nuclease architecture for therapeutic genome engineering.

    PubMed

    Boissel, Sandrine; Jarjour, Jordan; Astrakhan, Alexander; Adey, Andrew; Gouble, Agnès; Duchateau, Philippe; Shendure, Jay; Stoddard, Barry L; Certo, Michael T; Baker, David; Scharenberg, Andrew M

    2014-02-01

    Rare-cleaving endonucleases have emerged as important tools for making targeted genome modifications. While multiple platforms are now available to generate reagents for research applications, each existing platform has significant limitations in one or more of three key properties necessary for therapeutic application: efficiency of cleavage at the desired target site, specificity of cleavage (i.e. rate of cleavage at 'off-target' sites), and efficient/facile means for delivery to desired target cells. Here, we describe the development of a single-chain rare-cleaving nuclease architecture, which we designate 'megaTAL', in which the DNA binding region of a transcription activator-like (TAL) effector is used to 'address' a site-specific meganuclease adjacent to a single desired genomic target site. This architecture allows the generation of extremely active and hyper-specific compact nucleases that are compatible with all current viral and nonviral cell delivery methods.

  15. Engineered chromosome-based genetic mapping establishes a 3.7 Mb critical genomic region for Down syndrome-associated heart defects in mice.

    PubMed

    Liu, Chunhong; Morishima, Masae; Jiang, Xiaoling; Yu, Tao; Meng, Kai; Ray, Debjit; Pao, Annie; Ye, Ping; Parmacek, Michael S; Yu, Y Eugene

    2014-06-01

    Trisomy 21 (Down syndrome, DS) is the most common human genetic anomaly associated with heart defects. Based on evolutionary conservation, DS-associated heart defects have been modeled in mice. By generating and analyzing mouse mutants carrying different genomic rearrangements in human chromosome 21 (Hsa21) syntenic regions, we found the triplication of the Tiam1-Kcnj6 region on mouse chromosome 16 (Mmu16) resulted in DS-related cardiovascular abnormalities. In this study, we developed two tandem duplications spanning the Tiam1-Kcnj6 genomic region on Mmu16 using recombinase-mediated genome engineering, Dp(16)3Yey and Dp(16)4Yey, spanning the 2.1 Mb Tiam1-Il10rb and 3.7 Mb Ifnar1-Kcnj6 regions, respectively. We found that Dp(16)4Yey/+, but not Dp(16)3Yey/+, led to heart defects, suggesting the triplication of the Ifnar1-Kcnj6 region is sufficient to cause DS-associated heart defects. Our transcriptional analysis of Dp(16)4Yey/+ embryos showed that the Hsa21 gene orthologs located within the duplicated interval were expressed at elevated levels, reflecting the consequences of the gene dosage alterations. Therefore, we have identified a 3.7 Mb genomic region, the smallest critical genomic region, for DS-associated heart defects, and our results should set the stage for the final step to establish the identities of the causal gene(s), whose elevated expression(s) directly underlie this major DS phenotype.

  16. The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes.

    PubMed

    Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars

    2017-02-10

    The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
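
    The relative entropy measure described in this abstract can be illustrated with a minimal sketch. It assumes the measure is the Kullback-Leibler divergence (in bits) between observed trinucleotide frequencies and those expected from mononucleotide frequencies, and that trinucleotides are counted in overlapping windows; the abstract does not state either detail, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

BASES = "ACGT"


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = [b for b in seq if b in BASES]
    return sum(b in "GC" for b in counted) / len(counted)


def relative_entropy(seq: str) -> float:
    """KL divergence (bits) of observed trinucleotide frequencies from the
    frequencies expected if bases were drawn independently.

    Assumption: overlapping windows; windows with ambiguous bases are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    total = sum(mono.values())
    p = {b: mono[b] / total for b in BASES}

    tri = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if all(b in BASES for b in seq[i:i + 3])
    )
    n = sum(tri.values())

    divergence = 0.0
    for word, count in tri.items():
        observed = count / n
        expected = p[word[0]] * p[word[1]] * p[word[2]]
        divergence += observed * log2(observed / expected)
    return divergence
```

    Under these assumptions a homopolymer scores zero, because its trinucleotide usage is fully explained by its base composition, while a strictly periodic sequence such as repeated ACGT scores high.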

  17. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

    PubMed Central

    Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

    1999-01-01

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne, 1926) PMID:10471707

  18. Single Molecule Analysis of Replicated DNA Reveals the Usage of Multiple KSHV Genome Regions for Latent Replication

    PubMed Central

    Verma, Subhash C.; Lu, Jie; Cai, Qiliang; Kosiyatrakul, Settapong; McDowell, Maria E.; Schildkraut, Carl L.; Robertson, Erle S.

    2011-01-01

    Kaposi's sarcoma associated herpesvirus (KSHV), an etiologic agent of Kaposi's sarcoma, Body Cavity Based Lymphoma and Multicentric Castleman's Disease, establishes lifelong latency in infected cells. The KSHV genome tethers to the host chromosome with the help of a latency associated nuclear antigen (LANA). Additionally, LANA supports replication of the latent origins within the terminal repeats by recruiting cellular factors. Our previous studies identified and characterized another latent origin, which supported the replication of plasmids ex-vivo without LANA expression in trans. Therefore identification of an additional origin site prompted us to analyze the entire KSHV genome for replication initiation sites using single molecule analysis of replicated DNA (SMARD). Our results showed that replication of DNA can initiate throughout the KSHV genome and the usage of these regions is not conserved in two different KSHV strains investigated. SMARD also showed that the utilization of multiple replication initiation sites occurs across large regions of the genome rather than a specified sequence. The replication origin of the terminal repeats showed only a slight preference for their usage indicating that LANA dependent origin at the terminal repeats (TR) plays only a limited role in genome duplication. Furthermore, we performed chromatin immunoprecipitation for ORC2 and MCM3, which are part of the pre-replication initiation complex to determine the genomic sites where these proteins accumulate, to provide further characterization of potential replication initiation sites on the KSHV genome. The ChIP data confirmed accumulation of these pre-RC proteins at multiple genomic sites in a cell cycle dependent manner. Our data also show that both the frequency and the sites of replication initiation vary within the two KSHV genomes studied here, suggesting that initiation of replication is likely to be affected by the genomic context rather than the DNA sequences. PMID

  19. Genomic Mapping of Human DNA provides Evidence of Difference in Stretch between AT and GC rich regions

    NASA Astrophysics Data System (ADS)

    Reifenberger, Jeffrey; Dorfman, Kevin; Cao, Han

    Human DNA is not a polymer consisting of a uniform distribution of all 4 nucleic acids, but rather contains regions of high AT and high GC content. When confined, these regions could have different stretch due to the extra hydrogen bond present in the GC basepair. To measure this potential difference, human genomic DNA was nicked with NtBspQI, labeled with a cy3 like fluorophore at the nick site, stained with YOYO, loaded into a device containing an array of nanochannels, and imaged. Over 473,000 individual molecules of DNA, corresponding to roughly 30x coverage of a human genome, were collected and aligned to the human reference. Based on the known AT/GC content between aligned pairs of labels, the stretch was measured for regions of similar size but different AT/GC content. We found that regions of high GC content were consistently more stretched than regions of high AT content between pairs of labels varying in size between 2.5 kbp and 500 kbp. We measured that for every 1% increase in GC content there was roughly a 0.06% increase in stretch. While this effect is small, it is important to take into account differences in stretch between AT and GC rich regions to improve the sensitivity of detection of structural variations from genomic variations. NIH Grant: R01-HG006851.

  20. Genome sequencing of idiopathic pulmonary fibrosis in conjunction with a medical school human anatomy course.

    PubMed

    Kumar, Akash; Dougherty, Max; Findlay, Gregory M; Geisheker, Madeleine; Klein, Jason; Lazar, John; Machkovech, Heather; Resnick, Jesse; Resnick, Rebecca; Salter, Alexander I; Talebi-Liasi, Faezeh; Arakawa, Christopher; Baudin, Jacob; Bogaard, Andrew; Salesky, Rebecca; Zhou, Qian; Smith, Kelly; Clark, John I; Shendure, Jay; Horwitz, Marshall S

    2014-01-01

    Even in cases where there is no obvious family history of disease, genome sequencing may contribute to clinical diagnosis and management. Clinical application of the genome has not yet become routine, however, in part because physicians are still learning how best to utilize such information. As an educational research exercise performed in conjunction with our medical school human anatomy course, we explored the potential utility of determining the whole genome sequence of a patient who had died following a clinical diagnosis of idiopathic pulmonary fibrosis (IPF). Medical students performed dissection and whole genome sequencing of the cadaver. Gross and microscopic findings were more consistent with the fibrosing variant of nonspecific interstitial pneumonia (NSIP), as opposed to IPF per se. Variants in genes causing Mendelian disorders predisposing to IPF were not detected. However, whole genome sequencing identified several common variants associated with IPF, including a single nucleotide polymorphism (SNP), rs35705950, located in the promoter region of the gene encoding mucin glycoprotein MUC5B. The MUC5B promoter polymorphism was recently found to markedly elevate risk for IPF, though a particular association with NSIP has not been previously reported, nor has its contribution to disease risk previously been evaluated in the genome-wide context of all genetic variants. We did not identify additional predicted functional variants in a region of linkage disequilibrium (LD) adjacent to MUC5B, nor did we discover other likely risk-contributing variants elsewhere in the genome. Whole genome sequencing thus corroborates the association of rs35705950 with MUC5B dysregulation and interstitial lung disease. This novel exercise additionally served a unique mission in bridging clinical and basic science education.

  1. Adjacent slice prostate cancer prediction to inform MALDI imaging biomarker analysis

    NASA Astrophysics Data System (ADS)

    Chuang, Shao-Hui; Sun, Xiaoyan; Cazares, Lisa; Nyalwidhe, Julius; Troyer, Dean; Semmes, O. John; Li, Jiang; McKenzie, Frederic D.

    2010-03-01

    Prostate cancer is the second most common type of cancer among men in the US [1]. Traditionally, prostate cancer diagnosis is made by the analysis of prostate-specific antigen (PSA) levels and histopathological images of biopsy samples under microscopes. Proteomic biomarkers can improve upon these methods. MALDI molecular spectra imaging is used to visualize protein/peptide concentrations across biopsy samples to search for biomarker candidates. Unfortunately, traditional processing methods require histopathological examination on one slice of a biopsy sample while the adjacent slice is subjected to the tissue-destroying desorption and ionization processes of MALDI. The highest confidence tumor regions gained from the histopathological analysis are then mapped to the MALDI spectra data to estimate the regions for biomarker identification from the MALDI imaging. This paper describes a process to provide a significantly better estimate of the cancer tumor to be mapped onto the MALDI imaging spectra coordinates using the high confidence region to predict the true area of the tumor on the adjacent MALDI imaged slice.

  2. Analysis of new isolates reveals new genome organization and a hypervariable region in infectious myonecrosis virus (IMNV).

    PubMed

    Dantas, Márcia Danielle A; Chavante, Suely F; Teixeira, Dárlio Inácio A; Lima, João Paulo M S; Lanza, Daniel C F

    2015-05-04

    Infectious myonecrosis virus (IMNV) has been the cause of many losses in shrimp farming since 2002, when the first myonecrosis outbreak was reported on Brazil's northeast coast. Two additional genomes of Brazilian IMNV isolates collected in 2009 and 2013 were sequenced and analyzed in the present study. The sequencing revealed an extra 643 bp and 22 bp at the 5' and 3' ends of the IMNV genome, respectively, confirming that its actual size is at least 8226 bp. Considering these additional sequences at the genome extremities, ORF1 can start at nt 470, encoding a 1708 aa polyprotein. Computational predictions reveal two stem loops and two pseudoknots in the 5' end and a putative stem loop and a slippery motif located at the 3' end, indicating that these regions can be involved in the start and termination of translation. Through a careful phylogenetic analysis, a higher genetic variability among Brazilian isolates could be observed, compared with Indonesian IMNV isolates. It was also observed that the most variable region of the IMNV genome is located in the first half of ORF1, coinciding with a region which probably encodes the capsid protrusions. The results presented here are a starting point to elucidate the virus's translational regulation and the mechanisms involved in virulence.

  3. The yak genome and adaptation to life at high altitude.

    PubMed

    Qiu, Qiang; Zhang, Guojie; Ma, Tao; Qian, Wubin; Wang, Junyi; Ye, Zhiqiang; Cao, Changchang; Hu, Quanjun; Kim, Jaebum; Larkin, Denis M; Auvil, Loretta; Capitanu, Boris; Ma, Jian; Lewin, Harris A; Qian, Xiaoju; Lang, Yongshan; Zhou, Ran; Wang, Lizhong; Wang, Kun; Xia, Jinquan; Liao, Shengguang; Pan, Shengkai; Lu, Xu; Hou, Haolong; Wang, Yan; Zang, Xuetao; Yin, Ye; Ma, Hui; Zhang, Jian; Wang, Zhaofeng; Zhang, Yingmei; Zhang, Dawei; Yonezawa, Takahiro; Hasegawa, Masami; Zhong, Yang; Liu, Wenbin; Zhang, Yan; Huang, Zhiyong; Zhang, Shengxiang; Long, Ruijun; Yang, Huanming; Wang, Jian; Lenstra, Johannes A; Cooper, David N; Wu, Yi; Wang, Jun; Shi, Peng; Wang, Jian; Liu, Jianquan

    2012-07-01

    Domestic yaks (Bos grunniens) provide meat and other necessities for Tibetans living at high altitude on the Qinghai-Tibetan Plateau and in adjacent regions. Comparison between yak and the closely related low-altitude cattle (Bos taurus) is informative in studying animal adaptation to high altitude. Here, we present the draft genome sequence of a female domestic yak generated using Illumina-based technology at 65-fold coverage. Genomic comparisons between yak and cattle identify an expansion in yak of gene families related to sensory perception and energy metabolism, as well as an enrichment of protein domains involved in sensing the extracellular environment and hypoxic stress. Positively selected and rapidly evolving genes in the yak lineage are also found to be significantly enriched in functional categories and pathways related to hypoxia and nutrition metabolism. These findings may have important implications for understanding adaptation to high altitude in other animal species and for hypoxia-related diseases in humans.

  4. Genomic organization of the neurofibromatosis 1 gene (NF1)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Y.; O'Connell, P.; Huntsman Breidenbach, H.

    Neurofibromatosis 1 maps to chromosome band 17q11.2, and the NF1 locus has been partially characterized. Even though the full-length NF1 cDNA has been sequenced, the complete genomic structure of the NF1 gene has not been elucidated. The 5′ end of NF1 is embedded in a CpG island containing a NotI restriction site, and the remainder of the gene lies in the adjacent 350-kb NotI fragment. In our efforts to develop a comprehensive screen for NF1 mutations, we have isolated genomic DNA clones that together harbor the entire NF1 cDNA sequence. We have identified all intron-exon boundaries of the coding region and established that it is composed of 59 exons. Furthermore, we have defined the 3′-untranslated region (3′-UTR) of the NF1 gene; it spans approximately 3.5 kb of genomic DNA sequence and is continuous with the stop codon. Oligonucleotide primer pairs synthesized from exon-flanking DNA sequences were used in the polymerase chain reaction with cloned, chromosome 17-specific genomic DNA as template to amplify NF1 exons 1 through 27b and the exon containing the 3′-UTR separately. This information should be useful for implementing a comprehensive NF1 mutation screen using genomic DNA as template. 41 refs., 3 figs., 2 tabs.

  5. Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle.

    PubMed

    Doran, Anthony G; Berry, Donagh P; Creevey, Christopher J

    2014-10-01

    Four traits related to carcass performance have been identified as economically important in beef production: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Although Holstein-Friesian cattle are primarily utilized for milk production, they are also an important source of meat for beef production and export. Because of this, there is great interest in understanding the underlying genomic structure influencing these traits. Several genome-wide association studies have identified regions of the bovine genome associated with growth or carcass traits; however, little is known about the mechanisms or underlying biological pathways involved. This study aims to detect regions of the bovine genome associated with carcass performance traits (employing a panel of 54,001 SNPs) using measures of genetic merit (as predicted transmitting abilities) for 5,705 Irish Holstein-Friesian animals. Candidate genes and biological pathways were then identified for each trait under investigation. Following adjustment for false discovery (q-value < 0.05), 479 quantitative trait loci (QTL) were associated with at least one of the four carcass traits using a single SNP regression approach. Using a Bayesian approach, 46 QTL were associated (posterior probability > 0.5) with at least one of the four traits. In total, 557 unique bovine genes, which mapped to 426 human orthologs, were within 500 kb of QTL found associated with a trait using the Bayesian approach. Using this information, 24 significantly over-represented pathways were identified across all traits. The most significantly over-represented biological pathway was the peroxisome proliferator-activated receptor (PPAR) signaling pathway. A large number of genomic regions putatively associated with bovine carcass traits were detected using two different statistical approaches. Notably, several significant associations were detected in close proximity to genes with a known role in animal growth

  6. Fine organization of genomic regions tagged to the 5S rDNA locus of the bread wheat 5B chromosome.

    PubMed

    Sergeeva, Ekaterina M; Shcherban, Andrey B; Adonina, Irina G; Nesterov, Michail A; Beletsky, Alexey V; Rakitin, Andrey L; Mardanov, Andrey V; Ravin, Nikolai V; Salina, Elena A

    2017-11-14

    The multigene family encoding the 5S rRNA, one of the most important structural and functional parts of the large ribosomal subunit, is an obligate component of all eukaryotic genomes. 5S rDNA has long been a favored target for cytological and phylogenetic studies due to the inherent peculiarities of its structural organization, such as the tandem arrays of repetitive units and their high interspecific divergence. The complex polyploid nature of the genome of bread wheat, Triticum aestivum, and the technically difficult task of sequencing clusters of tandem repeats mean that the detailed organization of extended genomic regions containing 5S rRNA genes remains unclear. This is despite the recent progress made in wheat genomic sequencing. Using pyrosequencing of BAC clones, in this work we studied the organization of two distinct 5S rDNA-tagged regions of the 5BS chromosome of bread wheat. Three BAC clones containing 5S rDNA were identified in the 5BS chromosome-specific BAC library of Triticum aestivum. Using the results of pyrosequencing and assembling, we obtained six 5S rDNA-containing contigs with a total length of 140,417 bp, and two sets (pools) of individual 5S rDNA sequences belonging to separate, but closely located genomic regions on the 5BS chromosome. Both regions are characterized by the presence of approximately 70-80 copies of 5S rDNA; however, they are completely different in their structural organization. The first region contained highly diverged short-type 5S rDNA units that were disrupted by multiple insertions of transposable elements. The second region contained the more conserved long-type 5S rDNA, organized as a single tandem array. FISH using probes specific to both 5S rDNA unit types showed differences in the distribution and intensity of signals on the chromosomes of polyploid wheat species and their diploid progenitors. A detailed structural organization of two closely located 5S rDNA-tagged genomic regions on the 5BS chromosome of bread

  7. Identification guide to skates (Family Rajidae) of the Canadian Atlantic and adjacent regions

    USGS Publications Warehouse

    Sulak, Kenneth J.; MacWhirter, P. D.; Luke, K.E.; Norem, A.D.; Miller, J.M.; Cooper, J.A.; Harris, L.E.

    2009-01-01

    Ecosystem-based management requires sound information on the distribution and abundance of species both common and rare. Therefore, the accurate identification of all marine species has assumed a much greater importance. The identification of many skate species is difficult, as several are easily confused, and it has been found to be problematic in both survey data and fisheries data collection. Identification guides, in combination with training and periodic validation of taxonomic information, improve our accuracy in monitoring data required for ecosystem-based management and monitoring of populations. This guide offers a comparative synthesis of skate species known to occur in Atlantic Canada and adjacent regions. The taxonomic nomenclature and descriptions of key morphological features are based on the most up-to-date understanding of diversity among these species. Although this information will aid the user in accurate identification, some features vary geographically (such as colour) and others with life stage (most notably the proportion of tail length to body length; the presence of spines either sharper in juveniles or in some cases not yet present; and also increases in the number of tooth rows as species grow into maturity). Additional information on juvenile features is needed to facilitate problematic identifications (e.g. L. erinacea vs. L. ocellata). Information on size at maturity is still required for many of these species throughout their geographic distribution.

  8. Fast ancestral gene order reconstruction of genomes with unequal gene content.

    PubMed

    Feijão, Pedro; Araujo, Eloi

    2016-11-11

    During evolution, genomes are modified by large scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present on each ancestral genome, and then finding a set of non-conflicting adjacencies that optimizes some given function, usually trying to maximize total weight and minimize character changes in the tree. We call methods of this type homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes that arise in DCJ rearrangement scenarios. This method showed a better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content
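
    The notion of conserved adjacencies used by homology-based methods can be sketched as follows. This is only an illustration of the basic idea for genomes with equal gene content, not the authors' method: genes are signed integers on a linear chromosome, each gene has a tail ("t") and head ("h") extremity, and all names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple


def extremities(gene: int) -> Tuple[str, str]:
    """Return (entry, exit) extremities of a signed gene.

    A gene in forward orientation (+g) is read tail to head;
    a reversed gene (-g) is read head to tail.
    """
    name = abs(gene)
    if gene > 0:
        return (f"{name}t", f"{name}h")
    return (f"{name}h", f"{name}t")


def adjacencies(chromosome: List[int]) -> Set[FrozenSet[str]]:
    """Adjacencies of a linear chromosome: the exit extremity of each gene
    paired with the entry extremity of the next gene."""
    result = set()
    for left, right in zip(chromosome, chromosome[1:]):
        result.add(frozenset((extremities(left)[1], extremities(right)[0])))
    return result


# Genome B differs from genome A by an inversion of the segment (2, 3).
genome_a = [1, 2, 3, 4]
genome_b = [1, -3, -2, 4]

# Adjacencies present in both genomes are candidates for the ancestor.
shared = adjacencies(genome_a) & adjacencies(genome_b)
```

    Here the inversion breaks the two adjacencies at its boundaries but preserves the one inside the inverted segment, so only that adjacency is shared. Handling unequal gene content, as the abstract describes, needs additional machinery that this sketch does not attempt.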

  9. Seismic hazard and seismic risk assessment based on the unified scaling law for earthquakes: Himalayas and adjacent regions

    NASA Astrophysics Data System (ADS)

    Nekrasova, A. K.; Kossobokov, V. G.; Parvez, I. A.

    2015-03-01

    For the Himalayas and neighboring regions, the maps of seismic hazard and seismic risk are constructed with the use of the estimates for the parameters of the unified scaling law for earthquakes (USLE), in which the Gutenberg-Richter law for magnitude distribution of seismic events within a given area is applied in the modified version with allowance for linear dimensions of the area, namely, log N(M, L) = A + B(5 - M) + C log L, where N(M, L) is the expected annual number of the earthquakes with magnitude M in the area with linear dimension L. The spatial variations in the parameters A, B, and C for the Himalayas and adjacent regions are studied on two time intervals from 1965 to 2011 and from 1980 to 2011. The difference in A, B, and C between these two time intervals indicates that seismic activity experiences significant variations on a scale of a few decades. With a global consideration of the seismic belts of the Earth overall, the estimates of coefficient A, which determines the logarithm of the annual average frequency of the earthquakes with a magnitude of 5.0 and higher in the zone with a linear dimension of 1 degree of the Earth's meridian, differ by a factor of 30 and more and mainly fall in the interval from -1.1 to 0.5. The values of coefficient B, which describes the balance between the number of earthquakes with different magnitudes, gravitate to 0.9 and range from less than 0.6 to 1.1 and higher. The values of coefficient C, which estimates the fractal dimension of the local distribution of epicenters, vary from 0.5 to 1.4 and higher. In the Himalayas and neighboring regions, the USLE coefficients mainly fall in the intervals of -1.1 to 0.3 for A, 0.8 to 1.3 for B, and 1.0 to 1.4 for C. The calculations of the local value of the expected peak ground acceleration (PGA) from the maximal expected magnitude provided the necessary basis for mapping the seismic hazards in the studied region. When doing this, we used the local estimates of the
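
    The scaling relation quoted in this abstract can be evaluated directly. The sketch below assumes base-10 logarithms (the usual convention for Gutenberg-Richter relations) and that L is expressed in the unit for which A is calibrated, which the abstract gives as degrees of the Earth's meridian; the function name and the coefficient values in the usage line are illustrative only.

```python
import math


def usle_expected_annual_count(a: float, b: float, c: float,
                               magnitude: float, length: float) -> float:
    """Expected annual number of earthquakes N(M, L) from
    log10 N(M, L) = A + B * (5 - M) + C * log10(L).

    Assumption: base-10 logarithms; `length` is in the unit used to
    calibrate A (degrees of meridian in the abstract).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    exponent = a + b * (5.0 - magnitude) + c * math.log10(length)
    return 10.0 ** exponent


# Illustrative coefficients chosen from within the ranges quoted above.
rate_m6 = usle_expected_annual_count(a=-0.5, b=0.9, c=1.2, magnitude=6.0, length=1.0)
```

    With L equal to one unit the C term vanishes, so A alone sets the rate at magnitude 5 and each additional magnitude unit lowers the expected count by a factor of 10 to the power B.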

  10. Assembling the Setaria italica L. Beauv. genome into nine chromosomes and insights into regions affecting growth and drought tolerance

    PubMed Central

    Tsai, Kevin J.; Lu, Mei-Yeh Jade; Yang, Kai-Jung; Li, Mengyun; Teng, Yuchuan; Chen, Shihmay; Ku, Maurice S. B.; Li, Wen-Hsiung

    2016-01-01

    The diploid C4 plant foxtail millet (Setaria italica L. Beauv.) is an important crop in many parts of Africa and Asia for the vast consumption of its grain and ability to grow in harsh environments, but remains understudied in terms of complete genomic architecture. To date, there have been only two genome assembly and annotation efforts with neither assembly reaching over 86% of the estimated genome size. We have combined de novo assembly with custom reference-guided improvements on a popular cultivar of foxtail millet and have achieved a genome assembly of 477 Mbp in length, which represents over 97% of the estimated 490 Mbp. The assembly anchors over 98% of the predicted genes to the nine assembled nuclear chromosomes and contains more functional annotation gene models than previous assemblies. Our annotation has identified a large number of unique gene ontology terms related to metabolic activities, a region of chromosome 9 with several growth factor proteins, and regions syntenic with pearl millet or maize genomic regions that have been previously shown to affect growth. The new assembly and annotation for this important species can be used for detailed investigation and future innovations in growth for millet and other grains. PMID:27734962

  11. Assembling the Setaria italica L. Beauv. genome into nine chromosomes and insights into regions affecting growth and drought tolerance.

    PubMed

    Tsai, Kevin J; Lu, Mei-Yeh Jade; Yang, Kai-Jung; Li, Mengyun; Teng, Yuchuan; Chen, Shihmay; Ku, Maurice S B; Li, Wen-Hsiung

    2016-10-13

    The diploid C4 plant foxtail millet (Setaria italica L. Beauv.) is an important crop in many parts of Africa and Asia for the vast consumption of its grain and ability to grow in harsh environments, but remains understudied in terms of complete genomic architecture. To date, there have been only two genome assembly and annotation efforts with neither assembly reaching over 86% of the estimated genome size. We have combined de novo assembly with custom reference-guided improvements on a popular cultivar of foxtail millet and have achieved a genome assembly of 477 Mbp in length, which represents over 97% of the estimated 490 Mbp. The assembly anchors over 98% of the predicted genes to the nine assembled nuclear chromosomes and contains more functional annotation gene models than previous assemblies. Our annotation has identified a large number of unique gene ontology terms related to metabolic activities, a region of chromosome 9 with several growth factor proteins, and regions syntenic with pearl millet or maize genomic regions that have been previously shown to affect growth. The new assembly and annotation for this important species can be used for detailed investigation and future innovations in growth for millet and other grains.

  12. Genomic scan of selective sweeps in thin and fat tail sheep breeds for identifying of candidate regions associated with fat deposition

    PubMed Central

    2012-01-01

    Background: Identification of genomic regions that have been targets of selection for phenotypic traits is one of the most important and challenging areas of research in animal genetics. However, currently there are relatively few genomic regions identified that have been subject to positive selection. In this study, a genome-wide scan using ~50,000 Single Nucleotide Polymorphisms (SNPs) was performed in an attempt to identify genomic regions associated with fat deposition in fat-tail breeds. This trait and its modification are very important in those countries grazing these breeds. Results: Two independent experiments using either Iranian or Ovine HapMap genotyping data contrasted thin and fat tail breeds. Population differentiation using FST in Iranian thin and fat tail breeds revealed seven genomic regions. Almost all of these regions overlapped with QTLs that had previously been identified as affecting fat and carcass yield traits in beef and dairy cattle. Study of selection sweep signatures using FST in thin and fat tail breeds sampled from the Ovine HapMap project confirmed three of these regions located on Chromosomes 5, 7 and X. We found increased homozygosity in these regions in favour of fat tail breeds on chromosome 5 and X and in favour of thin tail breeds on chromosome 7. Conclusions: In this study, we were able to identify three novel regions associated with fat deposition in thin and fat tail sheep breeds. Two of these were associated with an increase of homozygosity in the fat tail breeds which would be consistent with selection for mutations affecting fat tail size several thousand years after domestication. PMID:22364287
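
    The population-differentiation scan described above can be illustrated with a minimal sketch. The abstract does not say which FST estimator was used, so this assumes the simplest textbook form, Wright's FST = (HT - HS) / HT, for one biallelic SNP and two equally weighted populations. The function names, SNP labels and threshold are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP, given the frequency of the
    same allele in two populations (equal weights assumed)."""
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # allele fixed in both populations: no differentiation
    return (h_total - h_within) / h_total


def top_fst_snps(freqs: dict, threshold: float) -> list:
    """Return the SNP labels whose FST reaches the (hypothetical) threshold."""
    return [snp for snp, (p1, p2) in freqs.items()
            if fst_two_populations(p1, p2) >= threshold]
```

    In a real scan, per-SNP values are usually averaged over windows of neighbouring SNPs before candidate regions are called; that step is omitted here.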

  13. Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus

    PubMed Central

    Brinton, Margo A.; Basu, Mausumi

    2015-01-01

    The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb nts in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures has been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times of infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510

  14. Identification and Potential Regulatory Properties of Evolutionary Conserved Regions (ECRs) at the Schizophrenia-Associated MIR137 Locus.

    PubMed

    Gianfrancesco, Olympia; Griffiths, Daniel; Myers, Paul; Collier, David A; Bubb, Vivien J; Quinn, John P

    2016-10-01

    Genome-wide association studies (GWAS) have identified a region at chromosome 1p21.3, containing the microRNA MIR137, to be among the most significant associations for schizophrenia. However, the mechanism by which genetic variation at this locus increases risk of schizophrenia is unknown. Identifying key regulatory regions around MIR137 is crucial to understanding the potential role of this gene in the aetiology of psychiatric disorders. Through alignment of vertebrate genomes, we identified seven non-coding regions at the MIR137 locus with conservation comparable to exons (>70 %). Bioinformatic analysis using the Psychiatric Genomics Consortium GWAS dataset for schizophrenia showed five of the ECRs to have genome-wide significant SNPs in or adjacent to their sequence. Analysis of available datasets on chromatin marks and histone modification data showed that three of the ECRs were predicted to be functional in the human brain, and three in development. In vitro analysis of ECR activity using reporter gene assays showed that all seven of the selected ECRs displayed transcriptional regulatory activity in the SH-SY5Y neuroblastoma cell line. This data suggests a regulatory role in the developing and adult brain for these highly conserved regions at the MIR137 schizophrenia-associated locus and further that these domains could act individually or synergistically to regulate levels of MIR137 expression.

  15. Integrated genomic and interfacility patient-transfer data reveal the transmission pathways of multidrug-resistant Klebsiella pneumoniae in a regional outbreak.

    PubMed

    Snitkin, Evan S; Won, Sarah; Pirani, Ali; Lapp, Zena; Weinstein, Robert A; Lolans, Karen; Hayden, Mary K

    2017-11-22

    Development of effective strategies to limit the proliferation of multidrug-resistant organisms requires a thorough understanding of how such organisms spread among health care facilities. We sought to uncover the chains of transmission underlying a 2008 U.S. regional outbreak of carbapenem-resistant Klebsiella pneumoniae by performing an integrated analysis of genomic and interfacility patient-transfer data. Genomic analysis yielded a high-resolution transmission network that assigned directionality to regional transmission events and discriminated between intra- and interfacility transmission when epidemiologic data were ambiguous or misleading. Examining the genomic transmission network in the context of interfacility patient transfers (patient-sharing networks) supported the role of patient transfers in driving the outbreak, with genomic analysis revealing that a small subset of patient-transfer events was sufficient to explain regional spread. Further integration of the genomic and patient-sharing networks identified one nursing home as an important bridge facility early in the outbreak, a role that was not apparent from analysis of genomic or patient-transfer data alone. Last, we found that when simulating a real-time regional outbreak, our methodology was able to accurately infer the facility at which patients acquired their infections. This approach has the potential to identify facilities with high rates of intra- or interfacility transmission, data that will be useful for triggering targeted interventions to prevent further spread of multidrug-resistant organisms.

  16. Summary of the geology and resources of uranium in the San Juan Basin and adjacent region, New Mexico, Arizona, Utah, and Colorado

    USGS Publications Warehouse

    Ridgley, Jennie L.; Green, M.W.; Pierson, C.T.; Finch, W.I.; Lupe, R.D.

    1978-01-01

    The San Juan Basin and adjacent region lie predominantly in the southeastern part of the uranium-rich Colorado Plateau of New Mexico, Arizona, Utah, and Colorado. Underlying the province are rocks of the Precambrian basement complex composed mainly of igneous and metamorphic rocks; a thickness of about 3,600 meters of generally horizontal Paleozoic, Mesozoic, and Cenozoic sedimentary rocks; and a variety of Upper Cretaceous and Cenozoic igneous rocks. Sedimentary rocks of the sequence are commonly eroded and well exposed near the present basin margins where Tertiary tectonic activity has uplifted, folded, and faulted the sequence into its present geologic configuration of basins, platforms, monoclines, and other related structural features. Sedimentary rocks of Jurassic age in the southern part of the San Juan Basin contain the largest uranium deposits in the United States, and offer the promise of additional uranium deposits. Elsewhere in the basin and the adjacent Colorado Plateau, reserves and resources of uranium are known primarily in Triassic, Jurassic, and Cretaceous strata. Only scattered occurrences of uranium are known in Paleozoic

  17. Optimizing Restriction Site Placement for Synthetic Genomes

    NASA Astrophysics Data System (ADS)

    Montes, Pablo; Memelli, Heraldo; Ward, Charles; Kim, Joondong; Mitchell, Joseph S. B.; Skiena, Steven

    Restriction enzymes are the workhorses of molecular biology. We introduce a new problem that arises in the course of our project to design virus variants to serve as potential vaccines: we wish to modify virus-length genomes to introduce large numbers of unique restriction enzyme recognition sites while preserving wild-type function by substitution of synonymous codons. We show that the resulting problem is NP-Complete, give an exponential-time algorithm, and propose effective heuristics, which we show give excellent results for five sample viral genomes. Our resulting modified genomes have several times more unique restriction sites and reduce the maximum gap between adjacent sites by three to nine-fold.
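
    The core constraint in this abstract, introducing a recognition site while changing only synonymous codons, can be sketched for a single site. This is an illustrative check, not the authors' algorithm or heuristics. It assumes the standard genetic code and an in-frame coding sequence whose length is a multiple of three. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def place_site(cds: str, site: str, pos: int):
    """Try to write `site` at 0-based offset `pos` using only synonymous
    codon substitutions. Returns the edited sequence, or None if some
    overlapping codon has no synonymous alternative matching the site."""
    end = pos + len(site)
    if pos < 0 or end > len(cds):
        return None
    out = list(cds)
    for start in range(pos // 3 * 3, end, 3):
        original = cds[start:start + 3]
        # Synonymous codons that agree with the site wherever they overlap it.
        options = [
            codon for codon, aa in CODON_TABLE.items()
            if aa == CODON_TABLE[original]
            and all(codon[i] == site[start + i - pos]
                    for i in range(3) if pos <= start + i < end)
        ]
        if not options:
            return None
        # Prefer the candidate that changes the fewest bases.
        best = min(options, key=lambda c: sum(x != y for x, y in zip(c, original)))
        out[start:start + 3] = best
    return "".join(out)
```

    Because each codon is constrained independently by the site, feasibility at one offset is a per-codon check. The hard part addressed by the paper is choosing many mutually compatible, unique sites across a whole genome.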

  18. The mitochondrial genome of Pocillopora (Cnidaria: Scleractinia) contains two variable regions: the putative D-loop and a novel ORF of unknown function.

    PubMed

    Flot, Jean-François; Tillier, Simon

    2007-10-15

    The complete mitochondrial genomes of two individuals attributed to different morphospecies of the scleractinian coral genus Pocillopora have been sequenced. Both genomes, respectively 17,415 and 17,422 nt long, share the presence of a previously undescribed ORF encoding a putative protein made up of 302 amino acids and of unknown function. Surprisingly, this ORF turns out to be the second most variable region of the mitochondrial genome (1% nucleotide sequence difference between the two individuals) after the putative control region (1.5% sequence difference). Except for the presence of this ORF and for the location of the putative control region, the mitochondrial genome of Pocillopora is organized in a fashion similar to the other scleractinian coral genomes published to date. For the first time in a cnidarian, a putative second origin of replication is described based on its secondary structure similar to the stem-loop structure of O(L), the origin of L-strand replication in vertebrates.

  19. The tiger beetles (Coleoptera, Carabidae, Cicindelinae) of Israel and adjacent lands

    PubMed Central

    Matalin, Andrey V.; Chikatunov, Vladimir I.

    2016-01-01

    Based on field studies, museum collections and literature sources, the current knowledge of the tiger beetle fauna of Israel and adjacent lands is presented. In Israel eight species occur, one of them with two subspecies, while in the Sinai Peninsula nine species of tiger beetles are now known. In the combined regions seven genera from two tribes were found. The Rift Valley, with six cicindelid species, is the most speciose region of Israel. Cylindera contorta valdenbergi and Cicindela javeti azari have localized distributions and should be considered regional endemics. A similarity analysis of the tiger beetle faunas of different regions of Israel and the Sinai Peninsula reveals two clusters of species. The first includes the Great Rift Valley and most parts of the Sinai Peninsula, and the second incorporates most regions of Israel together with Central Sinai Foothills. Five distinct adult phenological groups of tiger beetles can be distinguished in these two clusters: active all-year (three species), spring-fall (five species), summer (two species), spring-summer (one species) and spring (one species). The likely origins of the tiger beetle fauna of this area are presented. An annotated list and illustrated identification key of the Cicindelinae of Israel and adjacent lands are provided. PMID:27110198

  20. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions.

    PubMed

    Conte, Matthew A; Gammerdinger, William J; Bartie, Kerry L; Penman, David J; Kocher, Thomas D

    2017-05-02

    Tilapias are the second most farmed fishes in the world and a sustainable source of food. Like many other fish, tilapias are sexually dimorphic and sex is a commercially important trait in these fish. In this study, we developed a significantly improved assembly of the tilapia genome using the latest genome sequencing methods and show how it improves the characterization of two sex determination regions in two tilapia species. A homozygous clonal XX female Nile tilapia (Oreochromis niloticus) was sequenced to 44X coverage using Pacific Biosciences (PacBio) SMRT sequencing. Dozens of candidate de novo assemblies were generated and an optimal assembly (contig NG50 of 3.3 Mbp) was selected using principal component analysis of likelihood scores calculated from several paired-end sequencing libraries. Comparison of the new assembly to the previous O. niloticus genome assembly reveals that recently duplicated portions of the genome are now well represented. The overall number of genes in the new assembly increased by 27.3%, including a 67% increase in pseudogenes. The new tilapia genome assembly correctly represents two recent vasa gene duplication events that have been verified with BAC sequencing. A total of 146 Mbp of additional transposable element sequence are now assembled, a large proportion of which are recent insertions. Large centromeric satellite repeats are assembled and annotated in cichlid fish for the first time. Finally, the new assembly identifies the long-range structure of both a ~9 Mbp XY sex determination region on LG1 in O. niloticus, and a ~50 Mbp WZ sex determination region on LG3 in the related species O. aureus. This study highlights the use of long read sequencing to correctly assemble recent duplications and to characterize repeat-filled regions of the genome. The study serves as an example of the need for high quality genome assemblies and provides a framework for identifying sex determining genes in tilapia and related fish species.
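
    The contig NG50 statistic quoted above can be computed as in the following minimal sketch. It uses the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the estimated genome size. The function name and the toy lengths in the usage note are illustrative, not taken from the study.

```python
def ng50(contig_lengths, genome_size):
    """Return the contig NG50, or None if the contigs together cover
    less than half of the estimated genome size."""
    half = genome_size / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return None
```

    For example, contigs of length 5, 4, 3, 2 and 1 against an estimated genome size of 20 give an NG50 of 3, because 5 + 4 = 9 falls short of 10 and adding the contig of length 3 crosses it. N50 differs only in using the total assembly length in place of the estimated genome size.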

  1. Differentially Methylated Region-Representational Difference Analysis (DMR-RDA): A Powerful Method to Identify DMRs in Uncharacterized Genomes.

    PubMed

    Sasheva, Pavlina; Grossniklaus, Ueli

    2017-01-01

    Over the last years, it has become increasingly clear that environmental influences can affect the epigenomic landscape and that some epigenetic variants can have heritable, phenotypic effects. While there are a variety of methods to perform genome-wide analyses of DNA methylation in model organisms, this is still a challenging task for non-model organisms without a reference genome. Differentially methylated region-representational difference analysis (DMR-RDA) is a sensitive and powerful PCR-based technique that isolates DNA fragments that are differentially methylated between two otherwise identical genomes. The technique does not require special equipment and is independent of prior knowledge about the genome. It is even applicable to genomes that have high complexity and a large size, being the method of choice for the analysis of plant non-model systems.

  2. Genome-Wide Locations of Potential Epimutations Associated with Environmentally Induced Epigenetic Transgenerational Inheritance of Disease Using a Sequential Machine Learning Prediction Approach.

    PubMed

    Haque, M Muksitul; Holder, Lawrence B; Skinner, Michael K

    2015-01-01

    Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline transmitted epimutations. The primary epimutations identified involve altered differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure (i.e., toxicant) specific signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG / 100bp) termed CpG deserts and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50K individual sites with a 1 kb mean size and 3,233 regions that had a minimum of three adjacent sites with a mean size of 3.5 kb. A select number of the most relevant genomic features were identified with the low density CpG deserts being a critical genomic feature of the features selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified a smaller number of 1,503 regions of genome-wide predicted sites and differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the pesticides dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) exposure lineage F3 generation. Analysis of this positive validation data set

  3. Redundancy analysis allows improved detection of methylation changes in large genomic regions.

    PubMed

    Ruiz-Arenas, Carlos; González, Juan R

    2017-12-14

    DNA methylation is an epigenetic process that regulates gene expression. Methylation can be modified by environmental exposures and changes in the methylation patterns have been associated with diseases. Methylation microarrays measure methylation levels at more than 450,000 CpGs in a single experiment, and the most common analysis strategy is to perform a single probe analysis to find methylation probes associated with the outcome of interest. However, methylation changes usually occur at the regional level: for example, genomic structural variants can affect methylation patterns in regions up to several megabases in length. Existing DMR methods provide lists of Differentially Methylated Regions (DMRs) of up to only a few kilobases in length, and cannot check if a target region is differentially methylated. Therefore, these methods are not suitable to evaluate methylation changes in large regions. To address these limitations, we developed a new DMR approach based on redundancy analysis (RDA) that assesses whether a target region is differentially methylated. Using simulated and real datasets, we compared our approach to three common DMR detection methods (Bumphunter, blockFinder, and DMRcate). We found that Bumphunter underestimated methylation changes and blockFinder showed poor performance. DMRcate showed poor power in the simulated datasets and low specificity in the real data analysis. Our method showed very high performance in all simulation settings, even with small sample sizes and subtle methylation changes, while controlling type I error. Other advantages of our method are: 1) it estimates the degree of association between the DMR and the outcome; 2) it can analyze a targeted region of interest; and 3) it can evaluate the simultaneous effects of different variables. The proposed methodology is implemented in MEAL, a Bioconductor package designed to facilitate the analysis of methylation data. We propose a multivariate approach to decipher whether an

  4. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep

    PubMed Central

    Mousel, Michelle R.; Reynolds, James O.; White, Stephen N.

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P < 0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P < 1 × 10^-5) were identified including markers in or near PIK3CB (P = 2.22 × 10^-6; additive model), KCNB1 (P = 2.93 × 10^-6; dominance model), ZC3H12C (P = 3.25 × 10^-6; genotypic model), JPH1 (P = 4.68 × 10^-6; genotypic model), and MYO3B (P = 5.74 × 10^-6; recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection. PMID:26098909
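
    The four inheritance models named above differ in how a genotype is turned into regression predictors. The sketch below shows one common textbook coding, with the genotype given as a minor-allele count of 0, 1 or 2. It is an assumption made for illustration and is not necessarily the coding PLINK uses internally. The function name is hypothetical.

```python
def encode_genotype(minor_allele_count: int, model: str) -> list:
    """Encode a genotype (0, 1 or 2 copies of the minor allele) as the
    predictor column(s) used in a logistic regression."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        # One predictor: effect scales with the number of minor alleles.
        return [minor_allele_count]
    if model == "dominant":
        # One predictor: one copy is enough for the full effect.
        return [1 if minor_allele_count >= 1 else 0]
    if model == "recessive":
        # One predictor: two copies are required for any effect.
        return [1 if minor_allele_count == 2 else 0]
    if model == "genotypic":
        # Two predictors: heterozygote and minor homozygote each get
        # their own effect relative to the major homozygote.
        return [1 if minor_allele_count == 1 else 0,
                1 if minor_allele_count == 2 else 0]
    raise ValueError(f"unknown model: {model}")
```

    Under this coding a heterozygote is `[1]` in the additive and dominant models, `[0]` in the recessive model and `[1, 0]` in the genotypic model, which is why the same SNP can reach significance under one model and not another.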

  5. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep.

    PubMed

    Mousel, Michelle R; Reynolds, James O; White, Stephen N

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P < 0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P < 1 × 10^-5) were identified including markers in or near PIK3CB (P = 2.22 × 10^-6; additive model), KCNB1 (P = 2.93 × 10^-6; dominance model), ZC3H12C (P = 3.25 × 10^-6; genotypic model), JPH1 (P = 4.68 × 10^-6; genotypic model), and MYO3B (P = 5.74 × 10^-6; recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection.

  6. Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

    PubMed Central

    Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

    2008-01-01

    Background: Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results: The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22 and 336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages

  7. Variants in Several Genomic Regions Associated with Asperger Disorder

    PubMed Central

    Salyakina, D.; Ma, D.Q.; Jaworski, J.M.; Konidari, I.; Whitehead, P.L.; Henson, R.; Martinez, D.; Robinson, J.L.; Sacharow, S.; Wright, H.H.; Abramson, R.K.; Gilbert, J.R.; Cuccaro, M.L.; Pericak-Vance, M.A.

    2010-01-01

    Asperger disorder (ASP) is one of the autism spectrum disorders (ASD) and is differentiated from autism largely on the absence of clinically significant cognitive and language delays. Analysis of a homogenous subset of families with ASP may help to address the corresponding effect of genetic heterogeneity on identifying ASD genetic risk factors. To examine the hypothesis that common variation is important in ASD, we performed a genome-wide association study (GWAS) in 124 ASP families in a discovery data set and 110 ASP families in a validation data set. We prioritized the top 100 association results from both cohorts by employing a ranking strategy. Novel regions on 5q21.1 (P = 9.7 × 10^-7) and 15q22.1–q22.2 (P = 7.3 × 10^-6) were our most significant findings in the combined data set. Three chromosomal regions showing association, 3p14.2 (P = 3.6 × 10^-6), 3q25–26 (P = 6.0 × 10^-5) and 3p23 (P = 3.3 × 10^-4) overlapped linkage regions reported in Finnish ASP families, and eight association regions overlapped ASD linkage areas. Our findings suggest that ASP both shares ASD-related genetic risk factors and has genetic risk factors unique to the ASP phenotype. PMID:21182207

  8. Genome Dynamics and Evolution of the Mla (Powdery Mildew) Resistance Locus in Barley

    PubMed Central

    Wei, Fusheng; Wing, Rod A.; Wise, Roger P.

    2002-01-01

    Genes that confer defense against pathogens often are clustered in the genome and evolve via diverse mechanisms. To evaluate the organization and content of a major defense gene complex in cereals, we determined the complete sequence of a 261-kb BAC contig from barley cv Morex that spans the Mla (powdery mildew) resistance locus. Among the 32 predicted genes on this contig, 15 are associated with plant defense responses; 6 of these are associated with defense responses to powdery mildew disease but function in different signaling pathways. The Mla region is organized as three gene-rich islands separated by two nested complexes of transposable elements and a 45-kb gene-poor region. A heterochromatic-like region is positioned directly proximal to Mla and is composed of a gene-poor core with 17 families of diverse tandem repeats that overlap a hypermethylated, but transcriptionally active, gene-dense island. Paleontology analysis of long terminal repeat retrotransposons indicates that the present Mla region evolved over a period of >7 million years through a variety of duplication, inversion, and transposon-insertion events. Sequence-based recombination estimates indicate that R genes positioned adjacent to nested long terminal repeat retrotransposons, such as Mla, do not favor recombination as a means of diversification. We present a model for the evolution of the Mla region that encompasses several emerging features of large cereal genomes. PMID:12172030

  9. Remarkably Divergent Regions Punctuate the Genome Assembly of the Caenorhabditis elegans Hawaiian Strain CB4856

    PubMed Central

    Thompson, Owen A.; Snoek, L. Basten; Nijveen, Harm; Sterken, Mark G.; Volkers, Rita J. M.; Brenchley, Rachel; van’t Hof, Arjen; Bevers, Roel P. J.; Cossins, Andrew R.; Yanai, Itai; Hajnal, Alex; Schmid, Tobias; Perkins, Jaryn D.; Spencer, David; Kruglyak, Leonid; Andersen, Erik C.; Moerman, Donald G.; Hillier, LaDeana W.; Kammenga, Jan E.; Waterston, Robert H.

    2015-01-01

    The Hawaiian strain (CB4856) of Caenorhabditis elegans is one of the most divergent from the canonical laboratory strain N2 and has been widely used in developmental, population, and evolutionary studies. To enhance the utility of the strain, we have generated a draft sequence of the CB4856 genome, exploiting a variety of resources and strategies. When compared against the N2 reference, the CB4856 genome has 327,050 single nucleotide variants (SNVs) and 79,529 insertion–deletion events that result in a total of 3.3 Mb of N2 sequence missing from CB4856 and 1.4 Mb of sequence present in CB4856 but not present in N2. As previously reported, the density of SNVs varies along the chromosomes, with the arms of chromosomes showing greater average variation than the centers. In addition, we find 61 regions totaling 2.8 Mb, distributed across all six chromosomes, which have a greatly elevated SNV density, ranging from 2 to 16% SNVs. A survey of other wild isolates shows that the two alternative haplotypes for each region are widely distributed, suggesting they have been maintained by balancing selection over long evolutionary times. These divergent regions contain an abundance of genes from large rapidly evolving families encoding F-box, MATH, BATH, seven-transmembrane G-coupled receptors, and nuclear hormone receptors, suggesting that they provide selective advantages in natural environments. The draft sequence makes available a comprehensive catalog of sequence differences between the CB4856 and N2 strains that will facilitate the molecular dissection of their phenotypic differences. Our work also emphasizes the importance of going beyond simple alignment of reads to a reference genome when assessing differences between genomes. PMID:25995208
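
    Regions of elevated SNV density such as those described above can be flagged with a simple fixed-window scan. The sketch below illustrates the idea only and is not the procedure used in the study. The window size, the 0-based coordinates and the function name are assumptions; the 2% default echoes the lower bound quoted in the abstract.

```python
def divergent_windows(snv_positions, chrom_length, window=1000, min_density=0.02):
    """Return (start, end, density) for each fixed window whose SNV
    density (SNVs per base) reaches min_density. Positions are 0-based."""
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in snv_positions:
        counts[pos // window] += 1
    flagged = []
    for i, count in enumerate(counts):
        start = i * window
        end = min(start + window, chrom_length)  # last window may be short
        density = count / (end - start)
        if density >= min_density:
            flagged.append((start, end, density))
    return flagged
```

    Adjacent flagged windows would normally be merged into a single region before regions are counted or their total length is summed.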

  10. Genomic organization of the canine herpesvirus US region.

    PubMed

    Haanes, E J; Tomlinson, C C

    1998-02-01

    Canine herpesvirus (CHV) is an alpha-herpesvirus of limited pathogenicity in healthy adult dogs and infectivity of the virus appears to be largely limited to cells of canine origin. CHV's low virulence and species specificity make it an attractive candidate for a recombinant vaccine vector to protect dogs against a variety of pathogens. As part of the analysis of the CHV genome, the authors determined the complete nucleotide sequence of the CHV US region as well as portions of the flanking inverted repeats. Seven full open reading frames (ORFs) encoding proteins larger than 100 amino acids were identified within, or partially within the CHV US: cUS2, cUS3, cUS4, cUS6, cUS7, cUS8 and cUS9; which are homologs of the herpes simplex virus type-1 US2; protein kinase; gG, gD, gI, gE; and US9 genes, respectively. An eighth ORF was identified in the inverted repeat region, cIR6, a homolog of the equine herpesvirus type-1 IR6 gene. The authors identified and mapped most of the major transcripts for the predicted CHV US ORFs by Northern analysis.

  11. Short interspersed nuclear elements (SINEs) are abundant in Solanaceae and have a family-specific impact on gene structure and genome organization.

    PubMed

    Seibt, Kathrin M; Wenke, Torsten; Muders, Katja; Truberg, Bernd; Schmidt, Thomas

    2016-05-01

    Short interspersed nuclear elements (SINEs) are highly abundant non-autonomous retrotransposons that are widespread in plants. They are short in size, non-coding, show high sequence diversity, and are therefore mostly not or not correctly annotated in plant genome sequences. Hence, comparative studies on genomic SINE populations are rare. To explore the structural organization and impact of SINEs, we comparatively investigated the genome sequences of the Solanaceae species potato (Solanum tuberosum), tomato (Solanum lycopersicum), wild tomato (Solanum pennellii), and two pepper cultivars (Capsicum annuum). Based on 8.5 Gbp sequence data, we annotated 82 983 SINE copies belonging to 10 families and subfamilies on a base pair level. Solanaceae SINEs are dispersed over all chromosomes with enrichments in distal regions. Depending on the genome assemblies and gene predictions, 30% of all SINE copies are associated with genes, particularly frequent in introns and untranslated regions (UTRs). The close association with genes is family specific. More than 10% of all genes annotated in the Solanaceae species investigated contain at least one SINE insertion, and we found genes harbouring up to 16 SINE copies. We demonstrate the involvement of SINEs in gene and genome evolution including the donation of splice sites, start and stop codons and exons to genes, enlargement of introns and UTRs, generation of tandem-like duplications and transduction of adjacent sequence regions. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  12. Size cues and the adjacency principle.

    DOT National Transportation Integrated Search

    1963-11-01

    The purpose of the present study was to apply the adjacency principle to the perception of relative depth from size cues. In agreement with the adjacency principle, it was found that the size cue between adjacent objects was more effective than the s...

  13. Indel-seq: a fast-forward genetics approach for identification of trait-associated putative candidate genomic regions and its application in pigeonpea (Cajanus cajan).

    PubMed

    Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Sinha, Pallavi; Kale, Sandip M; Parupalli, Swathi; Kumar, Vinay; Chitikineni, Annapurna; Vechalapu, Suryanarayana; Sameer Kumar, Chanda Venkata; Sharma, Mamta; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Muniswamy, Sonnappa; Varshney, Rajeev K

    2017-07-01

    Identification of candidate genomic regions associated with target traits using conventional mapping methods is challenging and time-consuming. In recent years, a number of single nucleotide polymorphism (SNP)-based mapping approaches have been developed and used for identification of candidate/putative genomic regions. However, in the majority of these studies, insertion-deletions (Indels) were largely ignored. For efficient use of Indels in mapping target traits, we propose the Indel-seq approach, which is a combination of whole-genome resequencing (WGRS) and bulked segregant analysis (BSA) and relies on the Indel frequencies in extreme bulks. Deployment of the Indel-seq approach for identification of candidate genomic regions associated with fusarium wilt (FW) and sterility mosaic disease (SMD) resistance in pigeonpea has identified 16 Indels affecting 26 putative candidate genes. Of these 26 affected putative candidate genes, 24 genes showed effect in the upstream/downstream of the genic region and two genes showed effect in the genes. Validation of these 16 candidate Indels in other FW- and SMD-resistant and FW- and SMD-susceptible genotypes revealed a significant association of five Indels (three for FW and two for SMD resistance). Comparative analysis of Indel-seq with other genetic mapping approaches highlighted the importance of the approach in identification of significant genomic regions associated with target traits. Therefore, the Indel-seq approach can be used for quick and precise identification of candidate genomic regions for any target traits in any crop species. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
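
    The abstract says only that Indel-seq relies on Indel frequencies in extreme bulks. As a rough illustration of that idea, the sketch below compares the fraction of reads supporting an Indel allele in two bulks. The difference statistic, the 0.5 cut-off, the function names, and the read counts are all assumptions for illustration and are not taken from the paper.

```python
def indel_frequency(alt_reads, total_reads):
    """Fraction of reads supporting the Indel allele; None without coverage."""
    if total_reads == 0:
        return None
    return alt_reads / total_reads


def bulk_frequency_difference(resistant, susceptible):
    """Difference in Indel allele frequency between two extreme bulks.

    Each argument is an (alt_reads, total_reads) tuple for one Indel site.
    Returns None when either bulk has no reads at the site.
    """
    f_r = indel_frequency(*resistant)
    f_s = indel_frequency(*susceptible)
    if f_r is None or f_s is None:
        return None
    return f_r - f_s


# Hypothetical read counts: (alt, total) in the resistant and susceptible bulk.
toy_sites = {
    "indel_A": ((18, 20), (2, 20)),
    "indel_B": ((10, 20), (10, 20)),
    "indel_C": ((0, 0), (5, 20)),
}
toy_delta = {
    name: bulk_frequency_difference(r, s) for name, (r, s) in toy_sites.items()
}
# Keep sites whose frequency differs strongly between bulks (assumed cut-off).
candidates = [n for n, d in toy_delta.items() if d is not None and abs(d) >= 0.5]
```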

  14. Application of selection mapping to identify genomic regions associated with dairy production in sheep.

    PubMed

    Gutiérrez-Gil, Beatriz; Arranz, Juan Jose; Pong-Wong, Ricardo; García-Gámez, Elsa; Kijas, James; Wiener, Pamela

    2014-01-01

    In Europe, especially in Mediterranean areas, the sheep has been traditionally exploited as a dual purpose species, with income from both meat and milk. Modernization of husbandry methods and the establishment of breeding schemes focused on milk production have led to the development of "dairy breeds." This study investigated selective sweeps specifically related to dairy production in sheep by searching for regions commonly identified in different European dairy breeds. With this aim, genotypes from 44,545 SNP markers covering the sheep autosomes were analysed in both European dairy and non-dairy sheep breeds using two approaches: (i) identification of genomic regions showing extreme genetic differentiation between each dairy breed and a closely related non-dairy breed, and (ii) identification of regions with reduced variation (heterozygosity) in the dairy breeds using two methods. Regions detected in at least two breeds (breed pairs) by the two approaches (genetic differentiation and at least one of the heterozygosity-based analyses) were labeled as core candidate convergence regions and further investigated for candidate genes. Following this approach six regions were detected. For some of them, strong candidate genes have been proposed (e.g. ABCG2, SPP1), whereas some other genes designated as candidates based on their association with sheep and cattle dairy traits (e.g. LALBA, DGAT1A) were not associated with a detectable sweep signal. Few of the identified regions were coincident with QTL previously reported in sheep, although many of them corresponded to orthologous regions in cattle where QTL for dairy traits have been identified. Due to the limited number of QTL studies reported in sheep compared with cattle, the results illustrate the potential value of selection mapping to identify genomic regions associated with dairy traits in sheep.
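
    The two kinds of signal described above, differentiation between a breed pair and reduced heterozygosity within a breed, can be illustrated per SNP with textbook formulas. This is a sketch under stated assumptions: a biallelic marker, two equally weighted populations, and Wright's FST computed from expected heterozygosities. The study's own estimators and windowing may differ, and the function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic SNP with frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Wright's FST = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are frequencies of the same allele in each population.
    Returns 0.0 when the SNP is monomorphic overall (H_T is zero).
    """
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


# Toy allele frequencies: identical breeds, then breeds fixed for different alleles.
fst_same = fst_two_populations(0.5, 0.5)
fst_fixed = fst_two_populations(1.0, 0.0)
```

    In a sweep scan, values like these would normally be averaged over windows of neighbouring SNPs before extreme regions are called.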

  15. Application of Selection Mapping to Identify Genomic Regions Associated with Dairy Production in Sheep

    PubMed Central

    Gutiérrez-Gil, Beatriz; Arranz, Juan Jose; Pong-Wong, Ricardo; García-Gámez, Elsa; Kijas, James; Wiener, Pamela

    2014-01-01

    In Europe, especially in Mediterranean areas, the sheep has been traditionally exploited as a dual purpose species, with income from both meat and milk. Modernization of husbandry methods and the establishment of breeding schemes focused on milk production have led to the development of “dairy breeds.” This study investigated selective sweeps specifically related to dairy production in sheep by searching for regions commonly identified in different European dairy breeds. With this aim, genotypes from 44,545 SNP markers covering the sheep autosomes were analysed in both European dairy and non-dairy sheep breeds using two approaches: (i) identification of genomic regions showing extreme genetic differentiation between each dairy breed and a closely related non-dairy breed, and (ii) identification of regions with reduced variation (heterozygosity) in the dairy breeds using two methods. Regions detected in at least two breeds (breed pairs) by the two approaches (genetic differentiation and at least one of the heterozygosity-based analyses) were labeled as core candidate convergence regions and further investigated for candidate genes. Following this approach six regions were detected. For some of them, strong candidate genes have been proposed (e.g. ABCG2, SPP1), whereas some other genes designated as candidates based on their association with sheep and cattle dairy traits (e.g. LALBA, DGAT1A) were not associated with a detectable sweep signal. Few of the identified regions were coincident with QTL previously reported in sheep, although many of them corresponded to orthologous regions in cattle where QTL for dairy traits have been identified. Due to the limited number of QTL studies reported in sheep compared with cattle, the results illustrate the potential value of selection mapping to identify genomic regions associated with dairy traits in sheep. PMID:24788864

  16. Elucidating the genomic architecture of Asian EGFR-mutant lung adenocarcinoma through multi-region exome sequencing.

    PubMed

    Nahar, Rahul; Zhai, Weiwei; Zhang, Tong; Takano, Angela; Khng, Alexis J; Lee, Yin Yeng; Liu, Xingliang; Lim, Chong Hee; Koh, Tina P T; Aung, Zaw Win; Lim, Tony Kiat Hon; Veeravalli, Lavanya; Yuan, Ju; Teo, Audrey S M; Chan, Cheryl X; Poh, Huay Mei; Chua, Ivan M L; Liew, Audrey Ann; Lau, Dawn Ping Xi; Kwang, Xue Lin; Toh, Chee Keong; Lim, Wan-Teck; Lim, Bing; Tam, Wai Leong; Tan, Eng-Huat; Hillmer, Axel M; Tan, Daniel S W

    2018-01-15

    EGFR-mutant lung adenocarcinomas (LUAD) display diverse clinical trajectories and are characterized by rapid but short-lived responses to EGFR tyrosine kinase inhibitors (TKIs). Through sequencing of 79 spatially distinct regions from 16 early stage tumors, we show that despite low mutation burdens, EGFR-mutant Asian LUADs unexpectedly exhibit a complex genomic landscape with frequent and early whole-genome doubling, aneuploidy, and high clonal diversity. Multiple truncal alterations, including TP53 mutations and loss of CDKN2A and RB1, converge on cell cycle dysregulation, with late sector-specific high-amplitude amplifications and deletions that potentially beget drug resistant clones. We highlight the association between genomic architecture and clinical phenotypes, such as co-occurring truncal drivers and primary TKI resistance. Through comparative analysis with published smoking-related LUAD, we postulate that the high intra-tumor heterogeneity observed in Asian EGFR-mutant LUAD may be contributed by an early dominant driver, genomic instability, and low background mutation rates.

  17. Genomic regions controlling shape variation in the first upper molar of the house mouse

    PubMed Central

    Pantalacci, Sophie; Turner, Leslie M; Steingrimsson, Eirikur; Renaud, Sabrina

    2017-01-01

    Numerous loci of large effect have been shown to underlie phenotypic variation between species. However, loci with subtle effects are presumably more frequently involved in microevolutionary processes but have rarely been discovered. We explore the genetic basis of shape variation in the first upper molar of hybrid mice between Mus musculus musculus and M. m. domesticus. We performed the first genome-wide association study for molar shape and used 3D surface morphometrics to quantify subtle variation between individuals. We show that many loci of small effect underlie phenotypic variation, and identify five genomic regions associated with tooth shape; one region contained the gene microphthalmia-associated transcription factor Mitf that has previously been associated with tooth malformations. Using a panel of five mutant laboratory strains, we show the effect of the Mitf gene on tooth shape. This is the first report of a gene causing subtle but consistent variation in tooth shape resembling variation in nature. PMID:29091026

  18. Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences.

    PubMed

    Colonna, Vincenza; Ayub, Qasim; Chen, Yuan; Pagani, Luca; Luisi, Pierre; Pybus, Marc; Garrison, Erik; Xue, Yali; Tyler-Smith, Chris; Abecasis, Goncalo R; Auton, Adam; Brooks, Lisa D; DePristo, Mark A; Durbin, Richard M; Handsaker, Robert E; Kang, Hyun Min; Marth, Gabor T; McVean, Gil A

    2014-06-30

    Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.

  19. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome

    PubMed Central

    Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin

    2018-01-01

    Background: Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results: We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712

  20. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    PubMed

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  1. Origins of the Xylella fastidiosa Prophage-Like Regions and Their Impact in Genome Differentiation

    PubMed Central

    de Mello Varani, Alessandro; Souza, Rangel Celso; Nakaya, Helder I.; de Lima, Wanessa Cristina; Paula de Almeida, Luiz Gonzaga; Kitajima, Elliot Watanabe; Chen, Jianchi; Civerolo, Edwin; Vasconcelos, Ana Tereza Ribeiro; Van Sluys, Marie-Anne

    2008-01-01

    Xylella fastidiosa is a Gram negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1), and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrograph reveals the existence of putative viral particles with similar morphology to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from lysogenic to lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, thus suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal they group in five lineages, all possessing a tyrosine recombinase catalytic domain, and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity on the differentiation of Xylella genomes. PMID:19116666

  2. Origins of the Xylella fastidiosa prophage-like regions and their impact in genome differentiation.

    PubMed

    de Mello Varani, Alessandro; Souza, Rangel Celso; Nakaya, Helder I; de Lima, Wanessa Cristina; Paula de Almeida, Luiz Gonzaga; Kitajima, Elliot Watanabe; Chen, Jianchi; Civerolo, Edwin; Vasconcelos, Ana Tereza Ribeiro; Van Sluys, Marie-Anne

    2008-01-01

    Xylella fastidiosa is a Gram negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1), and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrograph reveals the existence of putative viral particles with similar morphology to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from lysogenic to lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, thus suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal they group in five lineages, all possessing a tyrosine recombinase catalytic domain, and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity on the differentiation of Xylella genomes.

  3. 46 CFR 148.445 - Adjacent spaces.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 46 Shipping 5 2011-10-01 2011-10-01 false Adjacent spaces. 148.445 Section 148.445 Shipping COAST... THAT REQUIRE SPECIAL HANDLING Additional Special Requirements § 148.445 Adjacent spaces. When... following requirements must be met: (a) Each space adjacent to a cargo hold must be ventilated by natural...

  4. 46 CFR 148.445 - Adjacent spaces.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 46 Shipping 5 2014-10-01 2014-10-01 false Adjacent spaces. 148.445 Section 148.445 Shipping COAST... THAT REQUIRE SPECIAL HANDLING Additional Special Requirements § 148.445 Adjacent spaces. When... following requirements must be met: (a) Each space adjacent to a cargo hold must be ventilated by natural...

  5. 46 CFR 148.445 - Adjacent spaces.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 46 Shipping 5 2013-10-01 2013-10-01 false Adjacent spaces. 148.445 Section 148.445 Shipping COAST... THAT REQUIRE SPECIAL HANDLING Additional Special Requirements § 148.445 Adjacent spaces. When... following requirements must be met: (a) Each space adjacent to a cargo hold must be ventilated by natural...

  6. 46 CFR 148.445 - Adjacent spaces.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 46 Shipping 5 2012-10-01 2012-10-01 false Adjacent spaces. 148.445 Section 148.445 Shipping COAST... THAT REQUIRE SPECIAL HANDLING Additional Special Requirements § 148.445 Adjacent spaces. When... following requirements must be met: (a) Each space adjacent to a cargo hold must be ventilated by natural...

  7. The patterns of genomic variances and covariances across genome for milk production traits between Chinese and Nordic Holstein populations.

    PubMed

    Li, Xiujin; Lund, Mogens Sandø; Janss, Luc; Wang, Chonglong; Ding, Xiangdong; Zhang, Qin; Su, Guosheng

    2017-03-15

    With the development of SNP chips, SNP information provides an efficient approach to further disentangle different patterns of genomic variances and covariances across the genome for traits of interest. Due to the interaction between genotype and environment as well as possible differences in genetic background, it is reasonable to treat the performances of a biological trait in different populations as different but genetically correlated traits. In the present study, we performed an investigation on the patterns of region-specific genomic variances, covariances and correlations between Chinese and Nordic Holstein populations for three milk production traits. Variances and covariances between Chinese and Nordic Holstein populations were estimated for genomic regions at three different levels of genome region (all SNP as one region, each chromosome as one region and every 100 SNP as one region) using a novel multi-trait random regression model which uses latent variables to model heterogeneous variance and covariance. In the scenario of the whole genome as one region, the genomic variances, covariances and correlations obtained from the new multi-trait Bayesian method were comparable to those obtained from a multi-trait GBLUP for all the three milk production traits. In the scenario of each chromosome as one region, BTA 14 and BTA 5 accounted for very large genomic variance, covariance and correlation for milk yield and fat yield, whereas no specific chromosome showed very large genomic variance, covariance and correlation for protein yield. In the scenario of every 100 SNP as one region, most regions explained <0.50% of genomic variance and covariance for milk yield and fat yield, and explained <0.30% for protein yield, while some regions could present large variance and covariance. Although overall correlations between two populations for the three traits were positive and high, a few regions still showed weakly positive or highly negative genomic correlations for

  8. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    USDA-ARS's Scientific Manuscript database

    Background: BAC-based physical maps provide for sequencing across an entire genome or selected sub-genome regions of biological interest. Using the minimum tiling path as a guide, it is possible to select specific BAC clones from prioritized genome sections such as a genetically defined QTL interv...

  9. Ebolavirus comparative genomics

    DOE PAGES

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
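
    Oligomer frequency analysis, mentioned above as the basis for grouping genomes, starts from counting overlapping k-mers. The sketch below shows that first step only. The choice of k, the handling of ambiguous bases, and the toy sequence are assumptions for illustration, not details from the paper.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def oligomer_frequencies(sequence, k=3):
    """Relative frequency of each overlapping k-mer in a nucleotide sequence.

    k-mers containing characters other than A, C, G, T (for example N) are
    skipped. Returns an empty dict if no valid k-mer exists.
    """
    seq = sequence.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= VALID_BASES
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence (not a real viral genome): dinucleotide frequencies.
toy_freqs = oligomer_frequencies("ACGTACGT", k=2)
```

    Genomes could then be compared by a distance between their frequency vectors. Which distance and clustering method the authors used is not stated in the abstract.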

  10. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia

    PubMed Central

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M.

    2016-01-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints. PMID:28175287

  11. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia.

    PubMed

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M; Ruiz-Herrera, Aurora

    2016-12-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.

  12. Sedimentary and crustal thicknesses and Poisson's ratios for the NE Tibetan Plateau and its adjacent regions based on dense seismic arrays

    NASA Astrophysics Data System (ADS)

    Wang, Weilai; Wu, Jianping; Fang, Lihua; Lai, Guijuan; Cai, Yan

    2017-03-01

    The sedimentary and crustal thicknesses and Poisson's ratios of the NE Tibetan Plateau and its adjacent regions are estimated by the h-κ stacking and CCP image of receiver functions from the data of 1,317 stations. The horizontal resolution of the obtained results is as high as 0.5° × 0.5°, which can be used for further high resolution model construction in the region. The crustal thicknesses from Airy's equilibrium are smaller than our results in the Sichuan Basin, Qilian tectonic belt, northern Alxa block and Qaidam Basin, which is consistent with the high densities in the mantle lithosphere and may indicate that the high-density lithosphere drags crust down overall. High Poisson's ratios and low velocity zones are found in the mid- and lower crust beneath eastern Qilian tectonic belt and the boundary areas of the Ordos block, indicating that partial melting may exist in these regions. Low Poisson's ratios and low-velocity anomalies are observed in the crust in the NE Tibetan Plateau, implying that the mafic lower crust is thinning or missing and that the mid- and lower crust does not exhibit melting or partial melting in the NE Tibetan Plateau, and weak flow layers are not likely to exist in this region.
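
    As an illustration of how the h-κ results above relate to Poisson's ratio, the following minimal sketch converts a Vp/Vs ratio (κ) into Poisson's ratio using the standard relation for an isotropic elastic medium, σ = 0.5 · (1 − 1/(κ² − 1)). The function name and the sample κ values are hypothetical and are not taken from the record; the abstract does not state how the authors performed this conversion.

```python
import math


def poisson_ratio(kappa: float) -> float:
    """Poisson's ratio of an isotropic elastic medium from kappa = Vp/Vs.

    Hypothetical helper for illustration; not the authors' code.
    """
    if kappa <= 1.0:
        raise ValueError("kappa (Vp/Vs) must be greater than 1")
    k2 = kappa * kappa
    return 0.5 * (1.0 - 1.0 / (k2 - 1.0))


if __name__ == "__main__":
    # Sample Vp/Vs values chosen only to show the trend: a higher kappa
    # gives a higher Poisson's ratio.
    for kappa in (1.70, math.sqrt(3.0), 1.80, 1.90):
        print(f"kappa = {kappa:.3f} -> Poisson's ratio = {poisson_ratio(kappa):.3f}")
```

    Under this relation, a Poisson solid (κ = √3) has σ = 0.25, and larger κ values map to larger σ, which is why high Poisson's ratios are commonly read as a possible indicator of partial melt.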

  13. In situ optical sequencing and structure analysis of a trinucleotide repeat genome region by localization microscopy after specific COMBO-FISH nano-probing

    NASA Astrophysics Data System (ADS)

    Stuhlmüller, M.; Schwarz-Finsterle, J.; Fey, E.; Lux, J.; Bach, M.; Cremer, C.; Hinderhofer, K.; Hausmann, M.; Hildenbrand, G.

    2015-10-01

    Trinucleotide repeat expansions (like (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as Fragile-X syndrome. How these expansions develop during cell proliferation is still not clearly understood. Therefore in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5' untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promoter region was condensed similarly to a centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results for the first time demonstrate that in situ chromatin structure measurements on the nanoscale are feasible. Due to further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated chromatin strand structural changes on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of hereditary diseases.

  14. Dynamics of re-constitution of the human nuclear proteome after cell division is regulated by NLS-adjacent phosphorylation

    PubMed Central

    Róna, Gergely; Borsos, Máté; Ellis, Jonathan J; Mehdi, Ahmed M; Christie, Mary; Környei, Zsuzsanna; Neubrandt, Máté; Tóth, Judit; Bozóky, Zoltán; Buday, László; Madarász, Emília; Bodén, Mikael; Kobe, Bostjan; Vértessy, Beáta G

    2014-01-01

    Phosphorylation by the cyclin-dependent kinase 1 (Cdk1) adjacent to nuclear localization signals (NLSs) is an important mechanism of regulation of nucleocytoplasmic transport. However, no systematic survey has yet been performed in human cells to analyze this regulatory process, and the corresponding cell-cycle dynamics have not yet been investigated. Here, we focused on the human proteome and found that numerous proteins, previously not identified in this context, are associated with Cdk1-dependent phosphorylation sites adjacent to their NLSs. Interestingly, these proteins are involved in key regulatory events of DNA repair, epigenetics, or RNA editing and splicing. This finding indicates that cell-cycle dependent events of genome editing and gene expression profiling may be controlled by nucleocytoplasmic trafficking. For in-depth investigations, we selected a number of these proteins and analyzed how point mutations, expected to modify the phosphorylation ability of the NLS segments, perturb nucleocytoplasmic localization. In each case, we found that mutations mimicking hyper-phosphorylation abolish nuclear import processes. To understand the mechanism underlying these phenomena, we performed a video microscopy-based kinetic analysis to obtain information on cell-cycle dynamics on a model protein, dUTPase. We show that the NLS-adjacent phosphorylation by Cdk1 of human dUTPase, an enzyme essential for genomic integrity, results in dynamic cell cycle-dependent distribution of the protein. Non-phosphorylatable mutants have drastically altered protein re-import characteristics into the nucleus during the G1 phase. Our results suggest a dynamic Cdk1-driven mechanism of regulation of the nuclear proteome composition during the cell cycle. PMID:25483092

  15. A Genome-Wide Association Study Identifies Genomic Regions for Virulence in the Non-Model Organism Heterobasidion annosum s.s

    PubMed Central

    Dalman, Kerstin; Himmelstrand, Kajsa; Olson, Åke; Lind, Mårten; Brandström-Durling, Mikael; Stenlid, Jan

    2013-01-01

    The dense single nucleotide polymorphism (SNP) panels needed for genome wide association (GWA) studies have hitherto been expensive to establish and use on non-model organisms. To overcome this, we used a next generation sequencing approach to both establish SNPs and to determine genotypes. We conducted a GWA study on a fungal species, analysing the virulence of Heterobasidion annosum s.s., a necrotrophic pathogen, on its hosts Picea abies and Pinus sylvestris. From a set of 33,018 SNPs in 23 haploid isolates, twelve SNP markers distributed on seven contigs were associated with virulence (P<0.0001). Four of the contigs harbour known virulence genes from other fungal pathogens and the remaining three harbour novel candidate genes. Two contigs link closely to virulence regions recognized previously by QTL mapping in the congeneric hybrid H. irregulare × H. occidentale. Our study demonstrates the efficiency of GWA studies for dissecting important complex traits of small populations of non-model haploid organisms with small genomes. PMID:23341945

  16. Global Identification and Characterization of Transcriptionally Active Regions in the Rice Genome

    PubMed Central

    Stolc, Viktor; Deng, Wei; He, Hang; Korbel, Jan; Chen, Xuewei; Tongprasit, Waraporn; Ronald, Pamela; Chen, Runsheng; Gerstein, Mark; Wang Deng, Xing

    2007-01-01

    Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome. PMID:17372628

  17. Hundreds of conserved non-coding genomic regions are independently lost in mammals

    PubMed Central

    Hiller, Michael; Schaar, Bruce T.; Bejerano, Gill

    2012-01-01

    Conserved non-protein-coding DNA elements (CNEs) often encode cis-regulatory elements and are rarely lost during evolution. However, CNE losses that do occur can be associated with phenotypic changes, exemplified by pelvic spine loss in sticklebacks. Using a computational strategy to detect complete loss of CNEs in mammalian genomes while strictly controlling for artifacts, we find >600 CNEs that are independently lost in at least two mammalian lineages, including a spinal cord enhancer near GDF11. We observed several genomic regions where multiple independent CNE loss events happened; the most extreme is the DIAPH2 locus. We show that CNE losses often involve deletions and that CNE loss frequencies are non-uniform. Similar to less pleiotropic enhancers, we find that independently lost CNEs are shorter, slightly less constrained and evolutionarily younger than CNEs without detected losses. This suggests that independently lost CNEs are less pleiotropic and that pleiotropic constraints contribute to non-uniform CNE loss frequencies. We also detected 35 CNEs that are independently lost in the human lineage and in other mammals. Our study uncovers an interesting aspect of the evolution of functional DNA in mammalian genomes. Experiments are necessary to test if these independently lost CNEs are associated with parallel phenotype changes in mammals. PMID:23042682

  18. Trends in genome-wide and region-specific genetic diversity in the Dutch-Flemish Holstein-Friesian breeding program from 1986 to 2015.

    PubMed

    Doekes, Harmen P; Veerkamp, Roel F; Bijma, Piter; Hiemstra, Sipke J; Windig, Jack J

    2018-04-11

    In recent decades, Holstein-Friesian (HF) selection schemes have undergone profound changes, including the introduction of optimal contribution selection (OCS; around 2000), a major shift in breeding goal composition (around 2000) and the implementation of genomic selection (GS; around 2010). These changes are expected to have influenced genetic diversity trends. Our aim was to evaluate genome-wide and region-specific diversity in HF artificial insemination (AI) bulls in the Dutch-Flemish breeding program from 1986 to 2015. Pedigree and genotype data (~75.5 k) of 6280 AI-bulls were used to estimate rates of genome-wide inbreeding and kinship and corresponding effective population sizes. Region-specific inbreeding trends were evaluated using regions of homozygosity (ROH). Changes in observed allele frequencies were compared to those expected under pure drift to identify putative regions under selection. We also investigated the direction of changes in allele frequency over time. Effective population size estimates for the 1986-2015 period ranged from 69 to 102. Two major breakpoints were observed in genome-wide inbreeding and kinship trends. Around 2000, inbreeding and kinship levels temporarily dropped. From 2010 onwards, they steeply increased, with pedigree-based, ROH-based and marker-based inbreeding rates as high as 1.8, 2.1 and 2.8% per generation, respectively. Accumulation of inbreeding varied substantially across the genome. A considerable fraction of markers showed changes in allele frequency that were greater than expected under pure drift. Putative selected regions harboured many quantitative trait loci (QTL) associated with a wide range of traits. In consecutive 5-year periods, allele frequencies changed more often in the same direction than in opposite directions, except when comparing the 1996-2000 and 2001-2005 periods. Genome-wide and region-specific diversity trends reflect major changes in the Dutch-Flemish HF breeding program. Introduction of
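
    To make the link between inbreeding rate and effective population size concrete, the sketch below applies the textbook relation Ne = 1 / (2ΔF), where ΔF is the rate of inbreeding per generation. This is an assumption for illustration: the abstract does not state which estimator the authors used, and the function name is hypothetical. The three input rates are the post-2010 values quoted above, so the resulting values (roughly 28, 24 and 18) are not comparable to the 69 to 102 range reported for the whole 1986-2015 period.

```python
def effective_population_size(delta_f: float) -> float:
    """Effective population size from the per-generation inbreeding rate.

    Uses the textbook relation Ne = 1 / (2 * delta_f); hypothetical helper,
    not the estimator described in the record.
    """
    if not 0.0 < delta_f < 1.0:
        raise ValueError("delta_f must be a proportion between 0 and 1")
    return 1.0 / (2.0 * delta_f)


if __name__ == "__main__":
    # Post-2010 inbreeding rates quoted in the abstract, as proportions.
    rates = {"pedigree-based": 0.018, "ROH-based": 0.021, "marker-based": 0.028}
    for label, delta_f in rates.items():
        print(f"{label}: Ne = {effective_population_size(delta_f):.1f}")
```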

  19. Genome-wide DNA methylation profile identified a unique set of differentially methylated immune genes in oral squamous cell carcinoma patients in India.

    PubMed

    Basu, Baidehi; Chakraborty, Joyeeta; Chandra, Aditi; Katarkar, Atul; Baldevbhai, Jadav Ritesh Kumar; Dhar Chowdhury, Debjit; Ray, Jay Gopal; Chaudhuri, Keya; Chatterjee, Raghunath

    2017-01-01

    Oral squamous cell carcinoma (OSCC) is one of the common malignancies in Southeast Asia. Epigenetic changes, mainly altered DNA methylation, have been implicated in many cancers. Considering the varied environmental and genotoxic exposures among the Indian population, we conducted a genome-wide DNA methylation study on paired tumor and adjacent normal tissues of ten well-differentiated OSCC patients and validated in an additional 53 well-differentiated OSCC and adjacent normal samples. Genome-wide DNA methylation analysis identified several novel differentially methylated regions associated with OSCC. Hypermethylation is primarily enriched in the CpG-rich regions, while hypomethylation is mainly in the open sea. Distinct epigenetic drifts for hypo- and hypermethylation across CpG islands suggested independent mechanisms of hypo- and hypermethylation in OSCC development. Aberrant DNA methylation in the promoter regions is concomitant with gene expression. Hypomethylation of immune genes reflects the lymphocyte infiltration into the tumor microenvironment. Comparison of methylome data with 312 TCGA HNSCC samples identified a unique set of hypomethylated promoters among the OSCC patients in India. Pathway analysis of unique hypomethylated promoters indicated that the OSCC patients in India induce an anti-tumor T cell response, with mobilization of T lymphocytes in the neoplastic environment. Survival analysis of these epigenetically regulated immune genes suggested their prominent role in OSCC progression. Our study identified a unique set of hypomethylated regions, enriched in the promoters of immune response genes, and indicated the presence of a strong immune component in the tumor microenvironment. These methylation changes may serve as potential molecular markers to define risk and to monitor the prognosis of OSCC patients in India.

  20. A Voronoi interior adjacency-based approach for generating a contour tree

    NASA Astrophysics Data System (ADS)

    Chen, Jun; Qiao, Chaofei; Zhao, Renliang

    2004-05-01

    A contour tree is a good graphical tool for representing the spatial relations of contour lines and has found many applications in map generalization, map annotation, terrain analysis, etc. A new approach for generating contour trees by introducing a Voronoi-based interior adjacency set concept is proposed in this paper. The immediate interior adjacency set is employed to identify all of the children contours of each contour without contour elevations. It has advantages over existing methods such as the point-in-polygon method and the region growing-based method. This new approach can be used for spatial data mining and knowledge discovery, such as the automatic extraction of terrain features and the construction of multi-resolution digital elevation models.

  1. Secondary structure of the 3'-noncoding region of flavivirus genomes: comparative analysis of base pairing probabilities.

    PubMed

    Rauscher, S; Flamm, C; Mandl, C W; Heinz, F X; Stadler, P F

    1997-07-01

    The prediction of the complete matrix of base pairing probabilities was applied to the 3' noncoding region (NCR) of flavivirus genomes. This approach identifies not only well-defined secondary structure elements, but also regions of high structural flexibility. Flaviviruses, many of which are important human pathogens, have a common genomic organization, but exhibit a significant degree of RNA sequence diversity in the functionally important 3'-NCR. We demonstrate the presence of secondary structures shared by all flaviviruses, as well as structural features that are characteristic for groups of viruses within the genus reflecting the established classification scheme. The significance of most of the predicted structures is corroborated by compensatory mutations. The availability of infectious clones for several flaviviruses will allow the assessment of these structural elements in processes of the viral life cycle, such as replication and assembly.

  2. Exploiting Genome Structure in Association Analysis

    PubMed Central

    Kim, Seyoung

    2014-01-01

    A genome-wide association study involves examining a large number of single-nucleotide polymorphisms (SNPs) to identify SNPs that are significantly associated with the given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporated structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping that exploits prior knowledge on linkage disequilibrium structure in the genome such as recombination rates and distances between adjacent SNPs in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between a pair of adjacent SNPs, and allows us to look for association SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association. PMID:21548809
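
    The sparsity-enforcing prior mentioned above corresponds, in its simplest form, to an ordinary lasso regression of phenotype on genotypes. The sketch below is a plain lasso solved by coordinate descent in pure Python, minimising (1/2n)·||y − Xβ||² + λ·||β||₁. It is not the stochastic block lasso of the record: the first-order Markov coupling between adjacent SNPs is deliberately left out, and all names and the toy data are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Plain lasso via cyclic coordinate descent (no intercept).

    X: list of rows (samples) of genotype codes; y: list of phenotypes.
    Illustrative only; the Markov-chain prior over adjacent SNPs described
    in the abstract is NOT implemented here.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            z_j = sum(X[i][j] ** 2 for i in range(n)) / n
            if z_j == 0.0:
                continue  # monomorphic SNP: coefficient stays at zero
            # Correlation of SNP j with the residual that excludes SNP j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam) / z_j
    return beta


if __name__ == "__main__":
    # Toy data: the phenotype depends only on the first SNP.
    X = [[1, 0], [0, 1], [1, 1], [0, 0]]
    y = [2.0, 0.0, 2.0, 0.0]
    print(lasso_coordinate_descent(X, y, lam=0.1))
```

    A larger λ drives more coefficients exactly to zero, which is the behaviour that makes an L1 penalty useful for picking a small set of candidate SNPs.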

  3. Variability among Cucurbitaceae species (melon, cucumber and watermelon) in a genomic region containing a cluster of NBS-LRR genes.

    PubMed

    Morata, Jordi; Puigdomènech, Pere

    2017-02-08

    Cucurbitaceae species contain a significantly lower number of genes coding for proteins with similarity to plant resistance genes belonging to the NBS-LRR family than other plant species of similar genome size. A large proportion of these genes are organized in clusters that appear to be hotspots of variability. The genomes of the Cucurbitaceae species measured until now are intermediate in size (between 350 and 450 Mb) and they apparently have not undergone any genome duplications besides those at the origin of eudicots. The cluster containing the largest number of NBS-LRR genes has previously been analyzed in melon and related species and showed a high degree of interspecific and intraspecific variability. It was of interest to study whether similar behavior occurred in other clusters of the same family of genes. The cluster of NBS-LRR genes located in melon chromosome 9 was analyzed and compared with the syntenic regions in other cucurbit genomes. This is the second cluster in number within this species and it contains nine sequences with a NBS-LRR annotation including two genes, Fom1 and Prv, providing resistance against Fusarium and Papaya ring-spot virus (PRSV). The variability within the melon species appears to consist essentially of single nucleotide polymorphisms. Clusters of similar genes are present in the syntenic regions of the two species of Cucurbitaceae that were sequenced, cucumber and watermelon. Most of the genes in the syntenic clusters can be aligned between species and a hypothesis of generation of the cluster is proposed. The number of genes in the watermelon cluster is similar to that in melon while a higher number of genes (12) is present in cucumber, a species with a smaller genome than melon. After comparing genome resequencing data of 115 cucumber varieties, deletion of a group of genes is observed in a group of varieties of Indian origin. Clusters of genes coding for NBS-LRR proteins in cucurbits appear to have specific variability in

  4. Comprehensive Analysis of Genome Rearrangements in Eight Human Malignant Tumor Tissues

    PubMed Central

    Wang, Chong

    2016-01-01

    Carcinogenesis is a complex multifactorial, multistage process, but the precise mechanisms are not well understood. In this study, we performed a genome-wide analysis of the copy number variation (CNV), breakpoint region (BPR) and fragile sites in 2,737 tumor samples from eight tumor entities and in 432 normal samples. CNV detection and BPR identification revealed that BPRs tended to accumulate in specific genomic regions in tumor samples whereas they were dispersed genome-wide in the normal samples. Hotspots were observed, at which segments with similar alteration in copy number overlapped and BPRs clustered adjacently. Evaluation of BPR occurrence frequency showed that at least one BPR was detected in about 15% or more of the samples for each tumor entity, whereas BPRs were detected in at most 12% of the normal samples. 127 of 2,716 tumor-relevant BPRs (termed ‘common BPRs’) also exhibited a noticeable occurrence frequency in the normal samples. Colocalization assessment identified 20,077 CNV-affecting genes, 169 of which are known tumor-related genes. The most noteworthy genes are KIAA0513, important for immunologic, synaptic and apoptotic signal pathways; the intergenic non-coding RNA RP11-115C21.2, possibly acting as an oncogene or tumor suppressor by changing the structure of chromatin; ADAM32, likely important in cancer cell proliferation and progression through ectodomain shedding of diverse growth factors; and the well-known tumor suppressor gene p53. The BPR distributions indicate that CNV mutations are likely non-random in tumor genomes. The marked recurrence of BPRs at specific regions supports common progression mechanisms in tumors. The presence of hotspots together with common BPRs, despite their small group size, implies a relation between fragile sites and cancer-gene alteration. Our data further suggest that both protein-coding and non-coding genes possessing a range of biological functions might play a causative or functional role in tumor biology. This

  5. Genomic regions with a history of divergent selection affect fitness of hybrids between two butterfly species.

    PubMed

    Gompert, Zachariah; Lucas, Lauren K; Nice, Chris C; Fordyce, James A; Forister, Matthew L; Buerkle, C Alex

    2012-07-01

    Speciation is the process by which reproductively isolated lineages arise, and is one of the fundamental means by which the diversity of life increases. Whereas numerous studies have documented an association between ecological divergence and reproductive isolation, relatively little is known about the role of natural selection in genome divergence during the process of speciation. Here, we use genome-wide DNA sequences and Bayesian models to test the hypothesis that loci under divergent selection between two butterfly species (Lycaeides idas and L. melissa) also affect fitness in an admixed population. Locus-specific measures of genetic differentiation between L. idas and L. melissa and genomic introgression in hybrids varied across the genome. The most differentiated genetic regions were characterized by elevated L. idas ancestry in the admixed population, which occurs in L. idas-like habitat, consistent with the hypothesis that local adaptation contributes to speciation. Moreover, locus-specific measures of genetic differentiation (a metric of divergent selection) were positively associated with extreme genomic introgression (a metric of hybrid fitness). Interestingly, concordance of differentiation and introgression was only partial. We discuss multiple, complementary explanations for this partial concordance. © 2012 The Author(s).

  6. Genomic regions associated with kyphosis in swine

    PubMed Central

    2010-01-01

    Background: A back curvature defect similar to kyphosis in humans has been observed in swine herds. The defect ranges from mild to severe curvature of the thoracic vertebrae in split carcasses and has an estimated heritability of 0.3. The objective of this study was to identify genomic regions that affect this trait. Results: Single nucleotide polymorphism (SNP) associations performed with 198 SNPs and microsatellite markers in a Duroc-Landrace-Yorkshire resource population (U.S. Meat Animal Research Center, USMARC resource population) of swine provided regions of association with this trait on 15 chromosomes. Positional candidate genes, especially those involved in human skeletal development pathways, were selected for SNP identification. SNPs in 16 candidate genes were genotyped in an F2 population (n = 371) and the USMARC resource herd (n = 1,257) with kyphosis scores. SNPs in KCNN2 on SSC2, RYR1 and PLOD1 on SSC6 and MYST4 on SSC14 were significantly associated with kyphosis in the resource population of swine (P ≤ 0.05). SNPs in CER1 and CDH7 on SSC1, PSMA5 on SSC4, HOXC6 and HOXC8 on SSC5, ADAMTS18 on SSC6 and SOX9 on SSC12 were significantly associated with the kyphosis trait in the F2 population of swine (P ≤ 0.05). Conclusions: These data suggest that this kyphosis trait may be affected by several loci and that these may differ by population. Carcass value could be improved by effectively removing this undesirable trait from pig populations. PMID:21176156

  7. Physical mapping of a large plant genome using global high-information-content-fingerprinting: the distal region of the wheat ancestor Aegilops tauschii chromosome 3DS

    PubMed Central

    2010-01-01

    Background: Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat. Results: We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs. Conclusions: The physical map reported here is the first physical map using fingerprinting of a complete Triticeae genome. This study

  8. Successive Two-sided Loop Jets Caused by Magnetic Reconnection between Two Adjacent Filamentary Threads

    NASA Astrophysics Data System (ADS)

    Tian, Zhanjun; Liu, Yu; Shen, Yuandeng; Elmhamdi, Abouazza; Su, Jiangtao; Liu, Ying D.; Kordi, Ayman. S.

    2017-08-01

    We present an observational analysis of two successive two-sided loop jets observed by the ground-based New Vacuum Solar Telescope and the space-borne Solar Dynamics Observatory. The two successive two-sided loop jets manifested similar evolution processes and both were associated with the interaction of two small-scale adjacent filamentary threads, and with magnetic emergence and cancellation processes at the jet’s source region. High temporal and high spatial resolution observations reveal that the two adjacent ends of the two filamentary threads are rooted in opposite magnetic polarities within the source region. The two threads approached each other, and then an obvious brightening patch was observed at the interaction position. Subsequently, a pair of hot plasma ejections was observed heading in opposite directions along the paths of the two filamentary threads at a speed typical for two-sided loop jets, of the order of 150 km s⁻¹. Close to the end of the second jet, we report the formation of a bright hot loop structure at the source region, which suggests the formation of new loops during the interaction. Based on the observational results, we propose that the observed two-sided loop jets are caused by magnetic reconnection between the two adjacent filamentary threads, which differs substantially from the previous scenario in which a two-sided loop jet is generated by magnetic reconnection between an emerging bipole and the overlying horizontal magnetic fields.

  9. Endothelin-1 stimulates colon cancer adjacent fibroblasts.

    PubMed

    Knowles, Jonathan P; Shi-Wen, Xu; Haque, Samer-ul; Bhalla, Ashish; Dashwood, Michael R; Yang, Shiyu; Taylor, Irving; Winslet, Marc C; Abraham, David J; Loizidou, Marilena

    2012-03-15

    Endothelin-1 (ET-1) is produced by and stimulates colorectal cancer cells. Fibroblasts produce tumour stroma required for cancer development. We investigated whether ET-1 stimulated processes involved in tumour stroma production by colonic fibroblasts. Primary human fibroblasts, isolated from normal tissues adjacent to colon cancers, were cultured with or without ET-1 and its antagonists. Cellular proliferation, migration and contraction were measured. Expression of enzymes involved in tumour stroma development and alterations in gene transcription were determined by Western blotting and genome microarrays. ET-1 stimulated proliferation, contraction and migration (p < 0.01 v control) and the expression of matrix degrading enzymes TIMP-1 and MMP-2, but not MMP-3. ET-1 upregulated genes for profibrotic growth factors and receptors, signalling molecules, actin modulators and extracellular matrix components. ET-1 stimulated colonic fibroblast cellular processes in vitro that are involved in developing tumour stroma. Upregulated genes were consistent with these processes. By acting as a strong stimulus for tumour stroma creation, ET-1 is proposed as a target for adjuvant cancer therapy. Copyright © 2011 UICC.

  10. Synteny conservation between the Prunus genome and both the present and ancestral Arabidopsis genomes

    PubMed Central

    Jung, Sook; Main, Dorrie; Staton, Margaret; Cho, Ilhyung; Zhebentyayeva, Tatyana; Arús, Pere; Abbott, Albert

    2006-01-01

    Background Due to the lack of availability of large genomic sequences for peach or other Prunus species, the degree of synteny conservation between the Prunus species and Arabidopsis has not been systematically assessed. Using the recently available peach EST sequences that are anchored to Prunus genetic maps and to the peach physical map, we analyzed the extent of conserved synteny between the Prunus and the Arabidopsis genomes. The reconstructed pseudo-ancestral Arabidopsis genome, which existed prior to the proposed recent polyploidy event, was also utilized in our analysis to further elucidate the evolutionary relationship. Results We analyzed the synteny conservation between the Prunus and the Arabidopsis genomes by comparing 475 peach ESTs that are anchored to Prunus genetic maps and their Arabidopsis homologs detected by sequence similarity. Microsyntenic regions were detected between all five Arabidopsis chromosomes and seven of the eight linkage groups of the Prunus reference map. An additional 1097 peach ESTs that are anchored to 431 BAC contigs of the peach physical map and their Arabidopsis homologs were also analyzed. Microsyntenic regions were detected in 77 BAC contigs. The syntenic regions from both data sets were short and contained only a couple of conserved gene pairs. The synteny between peach and Arabidopsis was fragmentary; all the Prunus linkage groups containing syntenic regions matched to more than two different Arabidopsis chromosomes, and most BAC contigs with multiple conserved syntenic regions corresponded to multiple Arabidopsis chromosomes. Using the same peach EST datasets and their Arabidopsis homologs, we also detected conserved syntenic regions in the pseudo-ancestral Arabidopsis genome. In many cases, the gene order and content of peach regions were more conserved in the ancestral genome than in the present Arabidopsis region. Statistical significance of each syntenic group was calculated using a simulated Arabidopsis genome. Conclusion We

  11. Stable isotopes in juvenile marine fishes and their invertebrate prey from the Thames Estuary, UK, and adjacent coastal regions

    NASA Astrophysics Data System (ADS)

    Leakey, Chris D. B.; Attrill, Martin J.; Jennings, Simon; Fitzsimons, Mark F.

    2008-04-01

    Estuaries are regarded as valuable nursery habitats for many commercially important marine fishes, potentially providing a thermal resource, refuge from predators and a source of abundant prey. Stable isotope analysis may be used to assess relative resource use from isotopically distinct sources. This study comprised two major components: (1) development of a spatial map and discriminant function model of stable isotope variation in selected invertebrate groups inhabiting the Thames Estuary and adjacent coastal regions; and (2) analysis of stable isotope signatures of juvenile bass (Dicentrarchus labrax), sole (Solea solea) and whiting (Merlangius merlangus) for assessment of resource use and feeding strategies. The data were also used to consider anthropogenic enrichment of the estuary and potential energetic benefits of feeding in estuarine nursery habitat. Analysis of carbon (δ13C), nitrogen (δ15N) and sulphur (δ34S) isotope data identified significant differences in the 'baseline' isotopic signatures between estuarine and coastal invertebrates, and discriminant function analysis allowed samples to be re-classified to estuarine and coastal regions with 98.8% accuracy. Using invertebrate signatures as source indicators, stable isotope data classified juvenile fishes to the region in which they fed. Feeding signals appear to reflect physiological (freshwater tolerance) and functional (mobility) differences between species. Juvenile sole were found to exist as two isotopically-discrete sub-populations, with no evidence of mixing between the two. An apparent energetic benefit of estuarine feeding was only found for sole.

  12. [Exon-intron structure of the fet5+ gene of Schizosaccharomyces pombe and physical mapping of genome encompassing regions].

    PubMed

    Shpakovskiĭ, G V; Lebedenko, E N

    1998-01-01

    Plasmid pYUK3 bearing the fet5+ gene of Schizosaccharomyces pombe was isolated from a genomic library of the fission yeast, and a detailed physical map of the whole genomic insert (ca. 9.6 kbp) was constructed. The primary structure of the fet5+ gene and its flanking regions was established. The gene contains a single 45-bp intron in its distal part. A typical TATA-box (TATAAG) was found in the 5'-noncoding region ca. 50 bp upstream of the putative start of transcription, and the 3'-noncoding region contains AT-rich palindromes, which are probably involved in termination of the fet5+ transcription. A previously unidentified gene of Sz. pombe encoding a protein with some similarity to one of the transcriptional activators from the TBP (TATA-binding protein) group of SPT factors of transcription was found in the vicinity of the fet5+ gene. Taking into account that cDNA of the fet5+ gene was isolated as a suppressor of the genetic defect of nuclear RNA polymerases I-III (Bioorg. Khim., 1997, vol. 23, No 3, pp. 234-237), this vicinity may be the first evidence of possible clustering, in the genome of the fission yeast, of genes participating in transcription regulation.

  13. A difference in the pattern of repair in a large genomic region in UV-irradiated normal human and Cockayne syndrome cells.

    PubMed

    Shanower, G A; Kantor, G J

    1997-11-01

    Xeroderma pigmentosum group C cells repair DNA damaged by ultraviolet radiation in an unusual pattern throughout the genome. They remove cyclobutane pyrimidine dimers only from the DNA of transcriptionally active chromatin regions and only from the strand that contains the transcribed strand. The repair proceeds in a manner that creates damage-free islands which are in some cases much larger than the active gene associated with them. For example, the small transcriptionally active beta-actin gene (3.5 kb) is repaired as part of a 50 kb single-stranded region. The repair responsible for creating these islands requires active transcription, suggesting that the two activities are coupled. A preferential repair pathway in normal human cells promotes repair of actively transcribed DNA strands and is coupled to transcription. It is not known if similar large islands, referred to as repair domains, are preferentially created as a result of the coupling. Data are presented showing that in normal cells, preferential repair in the beta-actin region is associated with the creation of a large, completely repaired region in the partially repaired genome. Repair at other genomic locations which contain inactive genes (insulin, 754) does not create similar large regions as quickly. In contrast, repair in Cockayne syndrome cells, which are defective in the preferential repair pathway but not in genome-overall repair, proceeds in the beta-actin region by a mechanism which does not create preferentially a large repaired region. Thus a correlation between the activity required to preferentially repair active genes and that required to create repaired domains is detected. We propose an involvement of the transcription-repair coupling factor in a coordinated repair pathway for removing DNA damage from entire transcription units.

  14. Whole-genome sequencing of a quarter-century melioidosis outbreak in temperate Australia uncovers a region of low-prevalence endemicity

    PubMed Central

    Chapple, Stephanie N. J.; Sarovich, Derek S.; Holden, Matthew T. G.; Peacock, Sharon J.; Buller, Nicky; Golledge, Clayton; Mayo, Mark; Currie, Bart J.

    2016-01-01

    Melioidosis, caused by the highly recombinogenic bacterium Burkholderia pseudomallei, is a disease with high mortality. Tracing the origin of melioidosis outbreaks and understanding how the bacterium spreads and persists in the environment are essential to protecting public and veterinary health and reducing mortality associated with outbreaks. We used whole-genome sequencing to compare isolates from a historical quarter-century outbreak that occurred between 1966 and 1991 in the Avon Valley, Western Australia, a region far outside the known range of B. pseudomallei endemicity. All Avon Valley outbreak isolates shared the same multilocus sequence type (ST-284), which has not been identified outside this region. We found substantial genetic diversity among isolates based on a comparison of genome-wide variants, with no clear correlation between genotypes and temporal, geographical or source data. We observed little evidence of recombination in the outbreak strains, indicating that genetic diversity among these isolates has primarily accrued by mutation. Phylogenomic analysis demonstrated that the isolates confidently grouped within the Australian B. pseudomallei clade, thereby ruling out introduction from a melioidosis-endemic region outside Australia. Collectively, our results point to B. pseudomallei ST-284 being present in the Avon Valley for longer than previously recognized, with its persistence and genomic diversity suggesting long-term, low-prevalence endemicity in this temperate region. Our findings provide a concerning demonstration of the potential for environmental persistence of B. pseudomallei far outside the conventional endemic regions. An expected increase in extreme weather events may reactivate latent B. pseudomallei populations in this region. PMID:28348862

  15. Impacts of Chromatin States and Long-Range Genomic Segments on Aging and DNA Methylation

    PubMed Central

    Sun, Dan; Yi, Soojin V.

    2015-01-01

    Understanding the fundamental dynamics of epigenome variation during normal aging is critical for elucidating key epigenetic alterations that affect development, cell differentiation and diseases. Advances in the field of aging and DNA methylation strongly support the aging epigenetic drift model. Although this model aligns with previous studies, the role of other epigenetic marks, such as histone modification, as well as the impact of sampling specific CpGs, must be evaluated. Ultimately, it is crucial to investigate how all CpGs in the human genome change their methylation with aging in their specific genomic and epigenomic contexts. Here, we analyze whole genome bisulfite sequencing DNA methylation maps of brain frontal cortex from individuals of diverse ages. Comparisons with blood data reveal tissue-specific patterns of epigenetic drift. By integrating chromatin state information, divergent degrees and directions of aging-associated methylation in different genomic regions are revealed. Whole genome bisulfite sequencing data also open a new door to investigate whether adjacent CpG sites exhibit coordinated DNA methylation changes with aging. We identified significant ‘aging-segments’, which are clusters of nearby CpGs that respond to aging by similar DNA methylation changes. These segments not only capture previously identified aging-CpGs but also include specific functional categories of genes with implications on epigenetic regulation of aging. For example, genes associated with development are highly enriched in positive aging segments, which are gradually hyper-methylated with aging. On the other hand, regions that are gradually hypo-methylated with aging (‘negative aging segments’) in the brain harbor genes involved in metabolism and protein ubiquitination. Given the importance of protein ubiquitination in proteome homeostasis of aging brains and neurodegenerative disorders, our finding suggests the significance of epigenetic regulation of this

  16. Adaptation to Low Salinity Promotes Genomic Divergence in Atlantic Cod (Gadus morhua L.)

    PubMed Central

    Berg, Paul R.; Jentoft, Sissel; Star, Bastiaan; Ring, Kristoffer H.; Knutsen, Halvor; Lien, Sigbjørn; Jakobsen, Kjetill S.; André, Carl

    2015-01-01

    How genomic selection enables species to adapt to divergent environments is a fundamental question in ecology and evolution. We investigated the genomic signatures of local adaptation in Atlantic cod (Gadus morhua L.) along a natural salinity gradient, ranging from 35‰ in the North Sea to 7‰ within the Baltic Sea. By utilizing a 12 K SNPchip, we simultaneously assessed neutral and adaptive genetic divergence across the Atlantic cod genome. Combining outlier analyses with a landscape genomic approach, we identified a set of directionally selected loci that are strongly correlated with habitat differences in salinity, oxygen, and temperature. Our results show that discrete regions within the Atlantic cod genome are subject to directional selection and associated with adaptation to the local environmental conditions in the Baltic- and the North Sea, indicating divergence hitchhiking and the presence of genomic islands of divergence. We report a suite of outlier single nucleotide polymorphisms within or closely located to genes associated with osmoregulation, as well as genes known to play important roles in the hydration and development of oocytes. These genes are likely to have key functions within a general osmoregulatory framework and are important for the survival of eggs and larvae, contributing to the buildup of reproductive isolation between the low-salinity adapted Baltic cod and the adjacent cod populations. Hence, our data suggest that adaptive responses to the environmental conditions in the Baltic Sea may contribute to a strong and effective reproductive barrier, and that Baltic cod can be viewed as an example of ongoing speciation. PMID:25994933

  17. Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR).

    PubMed

    Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J; Laclette, Juan P; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

    2015-05-19

    Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest.
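
    The abstract states only that the AAR value is built from sequence length and the number of antigenic regions; it does not give the formula. The sketch below is a minimal illustration under the assumption that AAR is the ratio of sequence length to the number of predicted antigenic regions. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def aar_value(sequence_length, n_antigenic_regions):
    """Return a per-protein AAR value.

    ASSUMPTION: AAR = sequence length / number of antigenic regions.
    The abstract names the two inputs but not how they are combined.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if n_antigenic_regions <= 0:
        raise ValueError("n_antigenic_regions must be positive")
    return sequence_length / n_antigenic_regions


def mean_aar(proteins):
    """Average the per-protein AAR over (length, n_regions) pairs."""
    return mean(aar_value(length, n) for length, n in proteins)


# Hypothetical proteins given as (sequence length, antigenic regions).
example_set = [(300, 6), (450, 9), (120, 3)]
print(mean_aar(example_set))
```

    Under this assumed definition, a lower value corresponds to antigenic regions that are more densely packed along the sequence.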

  18. Genome analysis of Excretory/Secretory proteins in Taenia solium reveals their Abundance of Antigenic Regions (AAR)

    PubMed Central

    Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J.; Laclette, Juan P.; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián

    2015-01-01

    Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest. PMID:25989346

  19. Aberrant gene expression in mucosa adjacent to tumor reveals a molecular crosstalk in colon cancer

    PubMed Central

    2014-01-01

    Background A colorectal tumor is not an isolated entity growing in a restricted location of the body. The patient’s gut environment constitutes the framework where the tumor evolves and this relationship promotes and includes a complex and tight correlation of the tumor with inflammation, blood vessels formation, nutrition, and gut microbiome composition. The tumor influence in the environment could promote either an anti-tumor or a pro-tumor response. Methods A set of 98 paired adjacent mucosa and tumor tissues from colorectal cancer (CRC) patients and 50 colon mucosa from healthy donors (246 samples in total) were included in this work. RNA extracted from each sample was hybridized in Affymetrix chips Human Genome U219. Functional relationships between genes were inferred by means of systems biology using both transcriptional regulation networks (ARACNe algorithm) and protein-protein interaction networks (BIANA software). Results Here we report a transcriptomic analysis revealing a number of genes activated in adjacent mucosa from CRC patients, not activated in mucosa from healthy donors. A functional analysis of these genes suggested that this active reaction of the adjacent mucosa was related to the presence of the tumor. Transcriptional and protein-interaction networks were used to further elucidate this response of normal gut to the tumor, revealing a crosstalk between proteins secreted by the tumor and receptors activated in the adjacent colon tissue; and vice versa. Remarkably, Slit family of proteins activated ROBO receptors in tumor whereas tumor-secreted proteins transduced a cellular signal finally activating AP-1 in adjacent tissue. Conclusions The systems-level approach provides new insights into the micro-ecology of colorectal tumorigenesis. Disrupting this intricate molecular network of cell-cell communication and pro-inflammatory microenvironment could be a therapeutic target in CRC patients. PMID:24597571

  20. Sequencing intractable DNA to close microbial genomes.

    PubMed

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  1. St2-80: a new FISH marker for St genome and genome analysis in Triticeae.

    PubMed

    Wang, Long; Shi, Qinghua; Su, Handong; Wang, Yi; Sha, Lina; Fan, Xing; Kang, Houyang; Zhang, Haiqin; Zhou, Yonghong

    2017-07-01

    The St genome is one of the most fundamental genomes in Triticeae. Repetitive sequences are widely used to distinguish different genomes or species. The primary objectives of this study were to (i) screen a new sequence that could easily distinguish the chromosome of the St genome from those of other genomes by fluorescence in situ hybridization (FISH) and (ii) investigate the genome constitution of some species that remain uncertain and controversial. We used degenerated oligonucleotide primer PCR (Dop-PCR), Dot-blot, and FISH to screen for a new marker of the St genome and to test the efficiency of this marker in the detection of the St chromosome at different ploidy levels. Signals produced by a new FISH marker (denoted St2-80) were present on the entire arm of chromosomes of the St genome, except in the centromeric region. On the contrary, St2-80 signals were present in the terminal region of chromosomes of the E, H, P, and Y genomes. No signal was detected in the A and B genomes, and only weak signals were detected in the terminal region of chromosomes of the D genome. St2-80 signals were obvious and stable in chromosomes of different genomes, whether diploid or polyploid. Therefore, St2-80 is a potential and useful FISH marker that can be used to distinguish the St genome from those of other genomes in Triticeae.

  2. Activation gating kinetics of GIRK channels are mediated by cytoplasmic residues adjacent to transmembrane domains.

    PubMed

    Sadja, Rona; Reuveny, Eitan

    2009-01-01

    G-protein-coupled inwardly rectifying potassium channels (GIRK/Kir3.x) are involved in neurotransmission-mediated reduction of excitability. The gating mechanism following G protein activation of these channels likely proceeds from movement of inner transmembrane helices to allow K+ ion movement through the pore of the channel. There is limited understanding of how the binding of G-protein βγ subunits to cytoplasmic regions of the channel transduces the signal to the transmembrane regions. In this study, we examined the molecular basis that governs the activation kinetics of these channels, using a chimeric approach. We identified two regions as being important in determining the kinetics of activation. One region is the bottom of the outer transmembrane helix (TM1) and the cytoplasmic domain immediately adjacent (the slide helix); and the second region is the bottom of the inner transmembrane helix (TM2) and the cytoplasmic domain immediately adjacent. Interestingly, both of these regions are sufficient in mediating the kinetics of fast activation gating. This result suggests that there is a cooperative movement of either one of these domains to allow fast and efficient activation gating of GIRK channels.

  3. On Computing Breakpoint Distances for Genomes with Duplicate Genes.

    PubMed

    Shao, Mingfu; Moret, Bernard M E

    2017-06-01

    A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of its higher level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignment, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
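
    The breakpoint distance defined above (the number of nonconserved adjacencies) can be sketched for the simple case of two genomes without duplicate genes. The code below is an illustration only, not the authors' integer-linear-program method, which addresses the harder duplicate-gene formulations. It assumes each genome is a single linear chromosome written as a list of signed integers, and it ignores chromosome ends; the function names are hypothetical.

```python
def adjacencies(genome):
    """Return the set of adjacencies in a signed linear gene order.

    An adjacency (a, b) read on one strand is the same as (-b, -a) read
    on the other strand, so each pair is stored in one canonical form.
    """
    result = set()
    for a, b in zip(genome, genome[1:]):
        result.add(min((a, b), (-b, -a)))
    return result


def breakpoint_distance(genome_a, genome_b):
    """Count adjacencies of genome_a that are not conserved in genome_b.

    ASSUMPTIONS: both genomes contain the same genes, each exactly once,
    and chromosome ends (telomeres) are not counted.
    """
    genes_a = sorted(abs(g) for g in genome_a)
    genes_b = sorted(abs(g) for g in genome_b)
    if genes_a != genes_b or len(set(genes_a)) != len(genes_a):
        raise ValueError("genomes must share the same duplicate-free gene set")
    return len(adjacencies(genome_a) - adjacencies(genome_b))


# Inverting the block (2, 3) breaks two adjacencies: (1, 2) and (3, 4).
reference = [1, 2, 3, 4, 5]
inverted = [1, -3, -2, 4, 5]
print(breakpoint_distance(reference, inverted))  # 2
```

    With duplicate genes this direct count is no longer well defined, because a matching between homologous copies has to be chosen first; that choice is what the exemplar, maximum matching, and any matching formulations fix.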

  4. Substantial genome synteny preservation among woody angiosperm species: comparative genomics of Chinese chestnut (Castanea mollissima) and plant reference genomes.

    PubMed

    Staton, Margaret; Zhebentyayeva, Tetyana; Olukolu, Bode; Fang, Guang Chen; Nelson, Dana; Carlson, John E; Abbott, Albert G

    2015-10-05

    Chinese chestnut (Castanea mollissima) has emerged as a model species for the Fagaceae family with extensive genomic resources including a physical map, a dense genetic map and quantitative trait loci (QTLs) for chestnut blight resistance. These resources enable comparative genomics analyses relative to model plants. We assessed the degree of conservation between the chestnut genome and other well annotated and assembled plant genomic sequences, focusing on the QTL regions of most interest to the chestnut breeding community. The integrated physical and genetic map of Chinese chestnut has been improved to now include 858 shared sequence-based markers. The utility of the integrated map has also been improved through the addition of 42,970 BAC (bacterial artificial chromosome) end sequences spanning over 26 million bases of the estimated 800 Mb chestnut genome. Synteny between chestnut and ten model plant species was assessed on a macro-syntenic scale using sequences from both individual probes and BAC end sequences across the chestnut physical map. Blocks of synteny with chestnut were found in all ten reference species, with the percent of the chestnut physical map that could be aligned ranging from 10 to 39 %. The integrated genetic and physical map was utilized to identify BACs that spanned the three previously identified QTL regions conferring blight resistance. The clones were pooled and sequenced, yielding 396 sequence scaffolds covering 13.9 Mbp. Comparative genomic analysis on a microsyntenic scale, using the QTL-associated genomic sequence, identified synteny from chestnut to other plant genomes ranging from 5.4 to 12.9 % of the genome sequences aligning. On both the macro- and micro-synteny levels, the peach, grape and poplar genomes were found to be the most structurally conserved with chestnut. Interestingly, these results did not strictly follow the expectation that decreased phylogenetic distance would correspond to increased levels of genome

  5. Genomic profiling of 766 cancer-related genes in archived esophageal normal and carcinoma tissues.

    PubMed

    Chen, Jing; Guo, Liping; Peiffer, Daniel A; Zhou, Lixin; Chan, Owen Tsan Mo; Bibikova, Marina; Wickham-Garcia, Eliza; Lu, Shih-Hsin; Zhan, Qimin; Wang-Rodriguez, Jessica; Jiang, Wei; Fan, Jian-Bing

    2008-05-15

    We employed the BeadArray™ technology to perform a genetic analysis in 33 formalin-fixed, paraffin-embedded (FFPE) human esophageal carcinomas, mostly squamous-cell-carcinoma (ESCC), and their adjacent normal tissues. A total of 1,432 single nucleotide polymorphisms (SNPs) derived from 766 cancer-related genes were genotyped with partially degraded genomic DNAs isolated from these samples. This directly targeted genomic profiling identified not only previously reported somatic gene amplifications (e.g., CCND1) and deletions (e.g., CDKN2A and CDKN2B) but also novel genomic aberrations. Among these novel targets, the most frequently deleted genomic regions were chromosome 3p (including tumor suppressor genes FANCD2 and CTNNB1) and chromosome 5 (including tumor suppressor gene APC). The most frequently amplified genomic region was chromosome 3q (containing DVL3, MLF1, ABCC5, BCL6, AGTR1 and known oncogenes TNK2, TNFSF10, FGF12). The chromosome 3p deletion and 3q amplification occurred coincidently in nearly all of the affected cases, suggesting a molecular mechanism for the generation of somatic chromosomal aberrations. We also detected significant differences in germline allele frequency between the esophageal cohort of our study and normal control samples from the International HapMap Project for 10 genes (CSF1, KIAA1804, IL2, PMS2, IRF7, FLT3, NTRK2, MAP3K9, ERBB2 and PRKAR1A), suggesting that they might play roles in esophageal cancer susceptibility and/or development. Taken together, our results demonstrated the utility of the BeadArray technology for high-throughput genetic analysis in FFPE tumor tissues and provided a detailed genetic profiling of cancer-related genes in human esophageal cancer. (c) 2008 Wiley-Liss, Inc.

  6. Regions of very low H3K27me3 partition the Drosophila genome into topological domains

    PubMed Central

    Flower, Rosalyn; Choo, Siew Woh

    2017-01-01

    It is now well established that eukaryote genomes have a common architectural organization into topologically associated domains (TADs) and evidence is accumulating that this organization plays an important role in gene regulation. However, the mechanisms that partition the genome into TADs and the nature of domain boundaries are still poorly understood. We have investigated boundary regions in the Drosophila genome and find that they can be identified as domains of very low H3K27me3. The genome-wide H3K27me3 profile partitions into two states; very low H3K27me3 identifies Depleted (D) domains that contain housekeeping genes and their regulators such as the histone acetyltransferase-containing NSL complex, whereas domains containing moderate-to-high levels of H3K27me3 (Enriched or E domains) are associated with regulated genes, irrespective of whether they are active or inactive. The D domains correlate with the boundaries of TADs and are enriched in a subset of architectural proteins, particularly Chromator, BEAF-32, and Z4/Putzig. However, rather than being clustered at the borders of these domains, these proteins bind throughout the H3K27me3-depleted regions and are much more strongly associated with the transcription start sites of housekeeping genes than with the H3K27me3 domain boundaries. While we have not demonstrated causality, we suggest that the D domain chromatin state, characterised by very low or absent H3K27me3 and established by housekeeping gene regulators, acts to separate topological domains thereby setting up the domain architecture of the genome. PMID:28282436

  7. Genomic mutation consequence calculator.

    PubMed

    Major, John E

    2007-11-15

    The genomic mutation consequence calculator (GMCC) is a tool that will reliably and quickly calculate the consequence of arbitrary genomic mutations. GMCC also reports supporting annotations for the specified genomic region. The particular strength of the GMCC is it works in genomic space, not simply in spliced transcript space as some similar tools do. Within gene features, GMCC can report on the effects on splice site, UTR and coding regions in all isoforms affected by the mutation. A considerable number of genomic annotations are also reported, including: genomic conservation score, known SNPs, COSMIC mutations, disease associations and others. The manual interface also offers link outs to various external databases and resources. In batch mode, GMCC returns a csv file which can easily be parsed by the end user. GMCC is intended to support the many tumor resequencing efforts, but can be useful to any study investigating genomic mutations.

  8. DNA methylation in the APOE genomic region is associated with cognitive function in African Americans.

    PubMed

    Liu, Jiaxuan; Zhao, Wei; Ware, Erin B; Turner, Stephen T; Mosley, Thomas H; Smith, Jennifer A

    2018-05-08

    Genetic variations in apolipoprotein E (APOE) and proximal genes (PVRL2, TOMM40, and APOC1) are associated with cognitive function and dementia, particularly Alzheimer's disease. Epigenetic mechanisms such as DNA methylation play a central role in the regulation of gene expression. Recent studies have found evidence that DNA methylation may contribute to the pathogenesis of dementia, but its association with cognitive function in populations without dementia remains unclear. We assessed DNA methylation levels of 48 CpG sites in the APOE genomic region in peripheral blood leukocytes collected from 289 African Americans (mean age = 67 years) from the Genetic Epidemiology Network of Arteriopathy (GENOA) study. Using linear regression, we examined the relationship between methylation in the APOE genomic region and multiple cognitive measures including learning, memory, processing speed, concentration, language and global cognitive function. We identified eight CpG sites in three genes (PVRL2, TOMM40, and APOE) that showed an inverse association between methylation level and delayed recall, a measure of memory, after adjusting for age and sex (False Discovery Rate q-value < 0.1). All eight CpGs are located in either CpG islands (CGIs) or CGI shelves, and six of them are in promoter regions. Education and APOE ε4 carrier status significantly modified the effect of methylation in cg08583001 (PVRL2) and cg22024783 (TOMM40), respectively. Together, methylation of the eight CpGs explained an additional 8.7% of the variance in delayed recall, after adjustment for age, sex, education, and APOE ε4 carrier status. Methylation was not significantly associated with any other cognitive measures. Our results suggest that methylation levels at multiple CpGs in the APOE genomic region are inversely associated with delayed recall during normal cognitive aging, even after accounting for known genetic predictors for cognition. Our findings highlight the important role of

  9. The Valdostana goat: a genome-wide investigation of the distinctiveness of its selective sweep regions.

    PubMed

    Talenti, Andrea; Bertolini, Francesca; Pagnacco, Giulio; Pilla, Fabio; Ajmone-Marsan, Paolo; Rothschild, Max F; Crepaldi, Paola

    2017-04-01

    The Valdostana goat is an alpine breed, raised only in the northern Italian region of the Aosta Valley. This breed's main purpose is to produce milk and meat, but is peculiar for its involvement in the "Batailles de Chèvres," a recent tradition of non-cruel fight tournaments. At both the genetic and genomic levels, only a very limited number of studies have been performed with this breed and there are no studies about the genomic signatures left by selection. In this work, 24 unrelated Valdostana animals were screened for runs of homozygosity to identify highly homozygous regions. Then, six different approaches (ROH comparison, Fst single SNPs and windows based, Bayesian, Rsb, and XP-EHH) were applied comparing the Valdostana dataset with 14 other Italian goat breeds to confirm regions that were different among the comparisons. A total of three regions of selection that were also unique among the Valdostana were identified and located on chromosomes 1, 7, and 12 and contained 144 genes. Enrichment analyses detected genes such as cytokines and lymphocyte/leukocyte proliferation genes involved in the regulation of the immune system. A genetic link between an aggressive challenge, cytokines, and immunity has been hypothesized in many studies both in humans and in other species. Possible hypotheses associated with the signals of selection detected could be therefore related to immune-related factors as well as with the peculiar battle competition, or other breed-specific traits, and provided insights for further investigation of these unique regions, for the understanding and safeguard of the Valdostana breed.
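
    The runs-of-homozygosity screen mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' actual pipeline, so everything here is an assumption made for illustration: genotypes are coded 0/1/2 (1 = heterozygous), the function name and the min_snps parameter are hypothetical, and any heterozygous call breaks a run. Production tools typically also tolerate some heterozygous or missing calls and apply physical-length thresholds.

```python
from typing import List, Tuple


def runs_of_homozygosity(genotypes: List[int], min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) index pairs of homozygous runs.

    Assumed coding (not from the abstract): 0 and 2 are the two homozygous
    genotypes, 1 is heterozygous. A run is kept only if it spans at least
    `min_snps` consecutive homozygous SNPs.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current homozygous run began, if any
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current run.
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs


example = [0, 0, 2, 1, 2, 2, 0, 0, 1, 0]
print(runs_of_homozygosity(example, min_snps=3))
```

    Half-open index pairs are used so that the run length is simply end minus start.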

  10. Genome-Wide Association Study of Seed Dormancy and the Genomic Consequences of Improvement Footprints in Rice (Oryza sativa L.)

    PubMed Central

    Lu, Qing; Niu, Xiaojun; Zhang, Mengchen; Wang, Caihong; Xu, Qun; Feng, Yue; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Chen, Xiaoping; Liang, Xuanqiang; Wei, Xinghua

    2018-01-01

    Seed dormancy is an important agronomic trait affecting grain yield and quality because of pre-harvest germination and is influenced by both environmental and genetic factors. However, our knowledge of the factors controlling seed dormancy remains limited. To better reveal the molecular mechanism underlying this trait, a genome-wide association study was conducted in an indica-only population consisting of 453 accessions genotyped using 5,291 SNPs. Nine known and new significant SNPs were identified on eight chromosomes. These lead SNPs explained 34.9% of the phenotypic variation, and four of them were designed as dCAPS markers in the hope of accelerating molecular breeding. Moreover, a total of 212 candidate genes was predicted and eight candidate genes showed plant tissue-specific expression in expression profile data from different public bioinformatics databases. In particular, LOC_Os03g10110, which had a maize homolog involved in embryo development, was identified as a candidate regulator for further biological function investigations. Additionally, a polymorphism information content ratio method was used to screen improvement footprints and 27 selective sweeps were identified, most of which harbored domestication-related genes. Further studies suggested that three significant SNPs were adjacent to the candidate selection signals, supporting the accuracy of our genome-wide association study (GWAS) results. These findings show that genome-wide screening for selective sweeps can be used to identify new improvement-related DNA regions, although the phenotypes are unknown. This study enhances our knowledge of the genetic variation in seed dormancy, and the new dormancy-associated SNPs will provide real benefits in molecular breeding. PMID:29354150

  11. Genome-wide association study of multiple congenital heart disease phenotypes identifies a susceptibility locus for atrial septal defect at chromosome 4p16

    PubMed Central

    Cordell, Heather J.; Bentham, Jamie; Topf, Ana; Zelenika, Diana; Heath, Simon; Mamasoula, Chrysovalanto; Cosgrove, Catherine; Blue, Gillian; Granados-Riveron, Javier; Setchfield, Kerry; Thornborough, Chris; Breckpot, Jeroen; Soemedi, Rachel; Martin, Ruairidh; Rahman, Thahira J.; Hall, Darroch; van Engelen, Klaartje; Moorman, Antoon F.M.; Zwinderman, Aelko H; Barnett, Phil; Koopmann, Tamara T.; Adriaens, Michiel E.; Varro, Andras; George, Alfred L.; dos Remedios, Christobal; Bishopric, Nanette H.; Bezzina, Connie R.; O’Sullivan, John; Gewillig, Marc; Bu’Lock, Frances A.; Winlaw, David; Bhattacharya, Shoumo; Devriendt, Koen; Brook, J. David; Mulder, Barbara J.M.; Mital, Seema; Postma, Alex V.; Lathrop, G. Mark; Farrall, Martin; Goodship, Judith A.; Keavney, Bernard D.

    2013-01-01

    We carried out a genome-wide association study (GWAS) of congenital heart disease (CHD). Our discovery cohort comprised 1,995 CHD cases and 5,159 controls, and included patients from each of the three major clinical CHD categories (septal, obstructive and cyanotic defects). When all CHD phenotypes were considered together, no regions achieved genome-wide significant association. However, a region on chromosome 4p16, adjacent to the MSX1 and STX18 genes, was associated (P=9.5×10⁻⁷) with the risk of ostium secundum atrial septal defect (ASD) in the discovery cohort (N=340 cases), and this was replicated in a further 417 ASD cases and 2,520 controls (replication P=5.0×10⁻⁵; OR in replication cohort 1.40 [95% CI 1.19-1.65]; combined P=2.6×10⁻¹⁰). Genotype accounted for ~9% of the population attributable risk of ASD. PMID:23708191

  12. A variable region within the genome of Streptococcus pneumoniae contributes to strain-strain variation in virulence.

    PubMed

    Harvey, Richard M; Stroeher, Uwe H; Ogunniyi, Abiodun D; Smith-Vaughan, Heidi C; Leach, Amanda J; Paton, James C

    2011-05-05

    The bacterial factors responsible for the variation in invasive potential between different clones and serotypes of Streptococcus pneumoniae are largely unknown. Therefore, the isolation of rare serotype 1 carriage strains in Indigenous Australian communities provided a unique opportunity to compare the genomes of non-invasive and invasive isolates of the same serotype in order to identify such factors. The human virulence status of non-invasive, intermediately virulent and highly virulent serotype 1 isolates was reflected in mice and showed that whilst both human non-invasive and highly virulent isolates were able to colonize the murine nasopharynx equally, only the human highly virulent isolates were able to invade and survive in the murine lungs and blood. Genomic sequencing comparisons between these isolates identified 8 regions >1 kb in size that were specific to only the highly virulent isolates, and included a version of the pneumococcal pathogenicity island 1 variable region (PPI-1v), phage-associated adherence factors, transporters and metabolic enzymes. In particular, a phage-associated endolysin, a putative iron/lead permease and an operon within PPI-1v exhibited niche-specific changes in expression that suggest important roles for these genes in the lungs and blood. Moreover, in vivo competition between pneumococci carrying PPI-1v derivatives representing the two identified versions of the region showed that the version of PPI-1v in the highly virulent isolates was more competitive than the version from the less virulent isolates in the nasopharyngeal tissue, blood and lungs. This study is the first to perform genomic comparisons between serotype 1 isolates with distinct virulence profiles that correlate between mice and humans, and has highlighted the important role that hypervariable genomic loci, such as PPI-1v, play in pneumococcal disease. 
    The findings of this study have important implications for understanding the processes that drive progression

  13. Invited review: Inbreeding in the genomics era: Inbreeding, inbreeding depression, and management of genomic variability.

    PubMed

    Howard, Jeremy T; Pryce, Jennie E; Baes, Christine; Maltecca, Christian

    2017-08-01

    Traditionally, pedigree-based relationship coefficients have been used to manage the inbreeding and degree of inbreeding depression that exists within a population. The widespread incorporation of genomic information in dairy cattle genetic evaluations allows for the opportunity to develop and implement methods to manage populations at the genomic level. As a result, the realized proportion of the genome that 2 individuals share can be more accurately estimated instead of using pedigree information to estimate the expected proportion of shared alleles. Furthermore, genomic information allows genome-wide relationship or inbreeding estimates to be augmented to characterize relationships for specific regions of the genome. Region-specific stretches can be used to more effectively manage areas of low genetic diversity or areas that, when homozygous, result in reduced performance across economically important traits. The use of region-specific metrics should allow breeders to more precisely manage the trade-off between the genetic value of the progeny and undesirable side effects associated with inbreeding. Methods tailored toward more effectively identifying regions affected by inbreeding and their associated use to manage the genome at the herd level, however, still need to be developed. We have reviewed topics related to inbreeding, measures of relatedness, genetic diversity and methods to manage populations at the genomic level, and we discuss future challenges related to managing populations through implementing genomic methods at the herd and population levels. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  14. Identification of genomic regions associated with feed efficiency in Nelore cattle.

    PubMed

    de Oliveira, Priscila S N; Cesar, Aline S M; do Nascimento, Michele L; Chaves, Amália S; Tizioto, Polyana C; Tullio, Rymer R; Lanna, Dante P D; Rosa, Antonio N; Sonstegard, Tad S; Mourao, Gerson B; Reecy, James M; Garrick, Dorian J; Mudadu, Maurício A; Coutinho, Luiz L; Regitano, Luciana C A

    2014-09-26

    Feed efficiency is jointly determined by productivity and feed requirements, both of which are economically relevant traits in beef cattle production systems. The objective of this study was to identify genes/QTLs associated with components of feed efficiency in Nelore cattle using Illumina BovineHD BeadChip (770 k SNP) genotypes from 593 Nelore steers. The traits analyzed included: average daily gain (ADG), dry matter intake (DMI), feed-conversion ratio (FCR), feed efficiency (FE), residual feed intake (RFI), maintenance efficiency (ME), efficiency of gain (EG), partial efficiency of growth (PEG) and relative growth rate (RGR). The Bayes B analysis was completed with GenSel software parameterized to fit fewer markers than animals. Genomic windows containing all the SNP loci in each 1 Mb that accounted for more than 1.0% of genetic variance were considered QTL regions. Candidate genes within windows that explained more than 1% of genetic variance were selected by putative function based on DAVID and Gene Ontology. Thirty-six QTL (1-Mb SNP window) were identified on chromosomes 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18, 19, 20, 21, 22, 24, 25 and 26 (UMD 3.1). The amount of genetic variance explained by individual QTL windows for feed efficiency traits ranged from 0.5% to 9.07%. Some of these QTL minimally overlapped with previously reported feed efficiency QTL for Bos taurus. The QTL regions described in this study harbor genes with biological functions related to metabolic processes, lipid and protein metabolism, generation of energy and growth. Among the positional candidate genes selected for feed efficiency are: HRH4, ALDH7A1, APOA2, LIN7C, CXADR, ADAM12 and MAP7. Some genomic regions and some positional candidate genes reported in this study have not been previously reported for feed efficiency traits in Bos indicus. Comparison with published results indicates that different QTLs and genes may be involved in the control of feed efficiency traits in this

  15. A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.

    PubMed

    Gog, Julia R; Lever, Andrew M L; Skittrall, Jordan P

    2018-01-01

    We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.
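
    To make the problem statement in this abstract concrete, here is a minimal sketch of one generic way to locate a contiguous low-scoring region without fixing its length in advance. It is not the authors' method, which the abstract does not specify. It swaps in a standard minimum-sum contiguous-segment scan (a Kadane-style pass) over scores centred on their mean, and it does not attempt the statistical scoring the abstract mentions. The function name is hypothetical.

```python
from typing import List, Tuple


def lowest_scoring_region(scores: List[float]) -> Tuple[int, int, float]:
    """Return (start, end_exclusive, total) for the contiguous segment whose
    mean-centred scores have the smallest sum.

    Illustrative stand-in only: a Kadane-style minimum-sum scan, not the
    published algorithm.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    mean = sum(scores) / len(scores)

    best_sum = float("inf")
    best_span = (0, 0)
    cur_sum = 0.0   # sum of the best segment ending at the current position
    cur_start = 0
    for i, s in enumerate(scores):
        d = s - mean
        if cur_sum >= 0:
            # Extending a non-negative prefix cannot help; restart here.
            cur_sum = d
            cur_start = i
        else:
            cur_sum += d
        if cur_sum < best_sum:
            best_sum = cur_sum
            best_span = (cur_start, i + 1)
    return best_span[0], best_span[1], best_sum


scores = [5, 5, 5, 1, 1, 1, 5, 5, 5, 5]
start, end, total = lowest_scoring_region(scores)
print(start, end, round(total, 6))
```

    Centring on the mean is what lets the length emerge from the data: positions scoring above average make a segment more expensive to extend, so the scan stops growing a region once the scores recover.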

  16. Genomic variation in Plasmodium vivax malaria reveals regions under selective pressure.

    PubMed

    Diez Benavente, Ernest; Ward, Zoe; Chan, Wilson; Mohareb, Fady R; Sutherland, Colin J; Roper, Cally; Campino, Susana; Clark, Taane G

    2017-01-01

    Although Plasmodium vivax contributes to almost half of all malaria cases outside Africa, it has been relatively neglected compared to the more deadly P. falciparum. It is known that P. vivax populations possess high genetic diversity, differing geographically potentially due to different vector species, host genetics and environmental factors. We analysed the high-quality genomic data for 46 P. vivax isolates spanning 10 countries across 4 continents. Using population genetic methods we identified hotspots of selection pressure, including the previously reported MRP1 and DHPS genes, both putative drug resistance loci. Extra copies and deletions in the promoter region of another drug resistance candidate, MDR1 gene, and duplications in the Duffy binding protein gene (PvDBP) potentially involved in erythrocyte invasion, were also identified. For surveillance applications, continental-informative markers were found in putative drug resistance loci, and we show that organellar polymorphisms could classify P. vivax populations across continents and differentiate between Plasmodia spp. This study has shown that genomic diversity that lies within and between P. vivax populations can be used to elucidate potential drug resistance and invasion mechanisms, as well as facilitate the molecular barcoding of the parasite for surveillance applications.

  17. Finding local genome rearrangements.

    PubMed

    Simonaitis, Pijus; Swenson, Krister M

    2018-01-01

    The double cut and join (DCJ) model of genome rearrangement is well studied due to its mathematical simplicity and power to account for the many events that transform gene order. These studies have mostly been devoted to the understanding of minimum length scenarios transforming one genome into another. In this paper we search instead for rearrangement scenarios that minimize the number of rearrangements whose breakpoints are unlikely due to some biological criteria. One such criterion has recently become accessible due to the advent of the Hi-C experiment, facilitating the study of 3D spatial distance between breakpoint regions. We establish a link between the minimum number of unlikely rearrangements required by a scenario and the problem of finding a maximum edge-disjoint cycle packing on a certain transformed version of the adjacency graph. This link leads to a 3/2-approximation as well as an exact integer linear programming formulation for our problem, which we prove to be NP-complete. We also present experimental results on fruit flies, showing that Hi-C data is informative when used as a criterion for rearrangements. A new variant of the weighted DCJ distance problem is addressed that ignores scenario length in its objective function. A solution to this problem provides a lower bound on the number of unlikely moves necessary when transforming one gene order into another. This lower bound aids in the study of rearrangement scenarios with respect to chromatin structure, and could eventually be used in the design of a fixed parameter algorithm with a more general objective function.

  18. Comparative Genome Analysis of Ciprofloxacin-Resistant Pseudomonas aeruginosa Reveals Genes Within Newly Identified High Variability Regions Associated With Drug Resistance Development

    PubMed Central

    Su, Hsun-Cheng; Khatun, Jainab; Kanavy, Dona M.

    2013-01-01

    The alarming rise of ciprofloxacin-resistant Pseudomonas aeruginosa has been reported in several clinical studies. Though the mutation of resistance genes and their role in drug resistance has been researched, the process by which the bacterium acquires high-level resistance is still not well understood. How does the genomic evolution of P. aeruginosa affect resistance development? Could the exposure of antibiotics to the bacteria enrich genomic variants that lead to the development of resistance, and if so, how are these variants distributed through the genome? To answer these questions, we performed 454 pyrosequencing and a whole genome analysis both before and after exposure to ciprofloxacin. The comparative sequence data revealed 93 unique resistance strain variation sites, which included a mutation in the DNA gyrase subunit A gene. We generated variation-distribution maps comparing the wild and resistant types, and isolated 19 candidates from three discrete resistance-associated high variability regions that had available transposon mutants, to perform a ciprofloxacin exposure assay. Of these region candidates with transposon disruptions, 79% (15/19) showed a reduction in the ability to gain high-level resistance, suggesting that genes within these high variability regions might enrich for certain functions associated with resistance development. PMID:23808957

  19. Generation of Chimeric RNAs by cis-splicing of adjacent genes (cis-SAGe) in mammals.

    PubMed

    Zhuo, Jian-Shu; Jing, Xiao-Yan; Du, Xin; Yang, Xiu-Qin

    2018-02-20

    Chimeric RNA molecules, possessing exons from two or more independent genes, are traditionally believed to be produced by chromosome rearrangement. However, recent studies revealed that cis-splicing of adjacent genes (cis-SAGe) is one of the major mechanisms underlying the formation of chimeric RNAs. cis-SAGe refers to intergenic splicing of directly adjacent genes with the same transcriptional orientation, resulting in read-through transcripts, termed chimeric RNAs, which contain sequences from two or more parental genes. cis-SAGe was first identified in tumor cells, and its potential role in carcinogenesis has since attracted extensive attention from a growing number of researchers. As research has progressed, cis-SAGe has been found to be ubiquitous in various normal tissues, and it might make a crucial contribution to the formation of novel genes in the evolution of genomes. In this review, we summarize the splicing pattern, expression characteristics, possible mechanisms, and significance of cis-SAGe in mammals. This review will be helpful for a general understanding of the current status and development trends of cis-SAGe research.

  20. Phylogenetic shadowing of primate sequences to find functional regions of the human genome.

    PubMed

    Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M

    2003-02-28

    Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.

  1. Comparative scaffolding and gap filling of ancient bacterial genomes applied to two ancient Yersinia pestis genomes

    PubMed Central

    Doerr, Daniel; Chauve, Cedric

    2017-01-01

    Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing rendered possible the sequencing of whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging and often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method AGapEs (ancestral gap estimation) to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets from the London and Marseilles outbreaks, for which we obtained highly improved genome assemblies for both genomes, comprised of, respectively, five and six scaffolds with 95 % of the assemblies supported by ancient reads. We analysed the genome evolution between both ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains. PMID:29114402

  2. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

    PubMed

    Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

    2017-10-17

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.

  3. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods

    PubMed Central

    Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.

    2017-01-01

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148

  4. Genomic analysis of cow mortality and milk production using a threshold-linear model.

    PubMed

    Tsuruta, S; Lourenco, D A L; Misztal, I; Lawlor, T J

    2017-09-01

    The objective of this study was to investigate the feasibility of genomic evaluation for cow mortality and milk production using a single-step methodology. Genomic relationships between cow mortality and milk production were also analyzed. Data included 883,887 (866,700) first-parity, 733,904 (711,211) second-parity, and 516,256 (492,026) third-parity records on cow mortality (305-d milk yields) of Holsteins from Northeast states in the United States. The pedigree consisted of up to 1,690,481 animals including 34,481 bulls genotyped with 36,951 SNP markers. Analyses were conducted with a bivariate threshold-linear model for each parity separately. Genomic information was incorporated as a genomic relationship matrix in the single-step BLUP. Traditional and genomic estimated breeding values (GEBV) were obtained with Gibbs sampling using fixed variances, whereas reliabilities were calculated from variances of GEBV samples. Genomic EBV were then converted into single nucleotide polymorphism (SNP) marker effects. Those SNP effects were categorized according to values corresponding to 1 to 4 standard deviations. Moving averages and variances of SNP effects were calculated for windows of 30 adjacent SNP, and Manhattan plots were created for SNP variances with the same window size. Using Gibbs sampling, the reliability for genotyped bulls for cow mortality was 28 to 30% in EBV and 70 to 72% in GEBV. The reliability for genotyped bulls for 305-d milk yields was 53 to 65% in EBV and 81 to 85% in GEBV. Correlations of SNP effects between mortality and 305-d milk yields within categories were the highest with the largest SNP effects and reached >0.7 at 4 standard deviations. All SNP regions explained less than 0.6% of the genetic variance for both traits, except regions close to the DGAT1 gene, which explained up to 2.5% for cow mortality and 4% for 305-d milk yields. Reliability for GEBV with a moderate number of genotyped animals can be calculated by Gibbs samples. Genomic
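
    The windowed summary described in this abstract (moving averages and variances of SNP effects over 30 adjacent SNP) can be sketched as follows. The abstract does not say whether windows overlap or which variance estimator was used, so this sketch assumes a window that slides one SNP at a time and the population variance. The function name is hypothetical.

```python
from statistics import fmean, pvariance
from typing import List, Tuple


def windowed_snp_stats(effects: List[float], window: int = 30) -> List[Tuple[float, float]]:
    """Return (mean, variance) of SNP effects for each window of `window`
    adjacent SNPs, in genome order.

    Assumptions (not stated in the abstract): the window slides by one SNP,
    and the variance is the population variance of effects in the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    stats = []
    for start in range(len(effects) - window + 1):
        chunk = effects[start:start + window]
        stats.append((fmean(chunk), pvariance(chunk)))
    return stats


# Toy example with a short window so the numbers are easy to check by hand.
toy_effects = [0.0, 1.0, 2.0, 3.0, 4.0]
for mean, var in windowed_snp_stats(toy_effects, window=3):
    print(round(mean, 3), round(var, 3))
```

    Plotting the second element of each pair against window position would give a Manhattan-style view of window variances similar in spirit to the one the abstract describes.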

  5. Successive Two-sided Loop Jets Caused by Magnetic Reconnection between Two Adjacent Filamentary Threads

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tian, Zhanjun; Liu, Yu; Shen, Yuandeng

    We present observational analysis of two successive two-sided loop jets observed by the ground-based New Vacuum Solar Telescope and the space-borne Solar Dynamics Observatory. The two successive two-sided loop jets manifested similar evolution processes and both were associated with the interaction of two small-scale adjacent filamentary threads, magnetic emerging, and cancellation processes at the jet’s source region. High temporal and high spatial resolution observations reveal that the two adjacent ends of the two filamentary threads are rooted in opposite magnetic polarities within the source region. The two threads approached each other, and then an obvious brightening patch is observed at the interaction position. Subsequently, a pair of hot plasma ejections are observed heading in opposite directions along the paths of the two filamentary threads at a typical speed for two-sided loop jets of the order 150 km s⁻¹. Close to the end of the second jet, we report the formation of a bright hot loop structure at the source region, which suggests the formation of new loops during the interaction. Based on the observational results, we propose that the observed two-sided loop jets are caused by magnetic reconnection between the two adjacent filamentary threads, largely different from the previous scenario that a two-sided loop jet is generated by magnetic reconnection between an emerging bipole and the overlying horizontal magnetic fields.

  6. GenomeVista

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poliakov, Alexander; Couronne, Olivier

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.

  7. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    PubMed Central

    Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

    2008-01-01

    Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client

  8. A Mitochondrial Genome of Rhyparochromidae (Hemiptera: Heteroptera) and a Comparative Analysis of Related Mitochondrial Genomes.

    PubMed

    Li, Teng; Yang, Jie; Li, Yinwan; Cui, Ying; Xie, Qiang; Bu, Wenjun; Hillis, David M

    2016-10-19

    The Rhyparochromidae, the largest family of Lygaeoidea, encompasses more than 1,850 described species, but no mitochondrial genome has been sequenced to date. Here we describe the first mitochondrial genome for Rhyparochromidae: a complete mitochondrial genome of Panaorus albomaculatus (Scott, 1874). This mitochondrial genome is comprised of 16,345 bp, and contains the expected 37 genes and control region. The majority of the control region is made up of a large tandem-repeat region, which has a novel pattern not previously observed in other insects. The tandem-repeats region of P. albomaculatus consists of 53 tandem duplications (including one partial repeat), which is the largest number of tandem repeats among all the known insect mitochondrial genomes. Slipped-strand mispairing during replication is likely to have generated this novel pattern of tandem repeats. Comparative analysis of tRNA gene families in sequenced Pentatomomorpha and Lygaeoidea species shows that the pattern of nucleotide conservation is markedly higher on the J-strand. Phylogenetic reconstruction based on mitochondrial genomes suggests that Rhyparochromidae is not the sister group to all the remaining Lygaeoidea, and supports the monophyly of Lygaeoidea.

  9. DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila

    PubMed Central

    Liu, Jun; Zimmer, Kurt; Rusch, Douglas B.; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R.

    2015-01-01

    Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC. PMID:26227968

  10. Tracking genes of ecological relevance using a genome scan in two independent regional population samples of Arabis alpina.

    PubMed

    Poncet, Bénédicte N; Herrmann, Doris; Gugerli, Felix; Taberlet, Pierre; Holderegger, Rolf; Gielly, Ludovic; Rioux, Delphine; Thuiller, Wilfried; Aubert, Serge; Manel, Stéphanie

    2010-07-01

    Understanding the genetic basis of adaptation in response to environmental variation is fundamental as adaptation plays a key role in the extension of ecological niches to marginal habitats and in ecological speciation. Based on the assumption that some genomic markers are correlated to environmental variables, we aimed to detect loci of ecological relevance in the alpine plant Arabis alpina L. sampled in two regions, the French (99 locations) and the Swiss (109 locations) Alps. We used an unusually large genome scan [825 amplified fragment length polymorphism loci (AFLPs)] and four environmental variables related to temperature, precipitation and topography. We detected linkage disequilibrium among only 3.5% of the considered AFLP loci. A population structure analysis identified no admixture in the study regions, and the French and Swiss Alps were differentiated and therefore could be considered as two independent regions. We applied generalized estimating equations (GEE) to detect ecologically relevant loci separately in the French and Swiss Alps. We identified 78 loci of ecological relevance (9%), which were mainly related to mean annual minimum temperature. Only four of these loci were common across the French and Swiss Alps. Finally, we discuss that the genomic characterization of these ecologically relevant loci, as identified in this study, opens up new perspectives for studying functional ecology in A. alpina, its relatives and other alpine plant species.

  11. QTL-seq approach identified genomic regions and diagnostic markers for rust and late leaf spot resistance in groundnut (Arachis hypogaea L.)

    USDA-ARS's Scientific Manuscript database

    Rust and late leaf spot (LLS) are the two major foliar fungal diseases in groundnut, and their co-occurrence leads to yield loss up to 50–70% in addition to the deterioration of fodder quality. To identify candidate genomic regions controlling rust and LLS resistance, we deployed whole genome re-seq...

  12. The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires

    PubMed Central

    Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel

    2016-01-01

    Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423
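
    The core genome, pan-genome, and accessory genome named in the record above can be expressed as set operations on per-strain gene lists. The following is a minimal sketch under that assumption; the strain and gene names are hypothetical placeholders, and only the counts 2,164 and 4,711 come from the record.

```python
def core_and_pan(strain_genes):
    """Return (core, pan) gene sets from a mapping of strain name to gene names.

    core: genes present in every strain; pan: genes present in any strain.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan


# Hypothetical presence lists for three strains (illustration only).
strains = {
    "strain_A": {"gyrB", "recA", "pilA"},
    "strain_B": {"gyrB", "recA", "cas9"},
    "strain_C": {"gyrB", "recA", "pilA", "lacZ"},
}
core, pan = core_and_pan(strains)
accessory = pan - core  # variable genes, present in only some strains

# Share of the pan-genome that is core, using the counts quoted in the record.
core_fraction = 2164 / 4711
```

    With the quoted counts, the core genome is a little under half of the pan-genome (about 45.9%).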

  13. Genome-environment association study suggests local adaptation to climate at the regional scale in Fagus sylvatica.

    PubMed

    Pluess, Andrea R; Frank, Aline; Heiri, Caroline; Lalagüe, Hadrien; Vendramin, Giovanni G; Oddou-Muratorio, Sylvie

    2016-04-01

    The evolutionary potential of long-lived species, such as forest trees, is fundamental for their local persistence under climate change (CC). Genome-environment association (GEA) analyses reveal if species in heterogeneous environments at the regional scale are under differential selection resulting in populations with potential preadaptation to CC within this area. In 79 natural Fagus sylvatica populations, neutral genetic patterns were characterized using 12 simple sequence repeat (SSR) markers, and genomic variation (144 single nucleotide polymorphisms (SNPs) out of 52 candidate genes) was related to 87 environmental predictors in the latent factor mixed model, logistic regressions and isolation by distance/environmental (IBD/IBE) tests. SSR diversity revealed relatedness at up to 150 m intertree distance but an absence of large-scale spatial genetic structure and IBE. In the GEA analyses, 16 SNPs in 10 genes responded to one or several environmental predictors and IBE, corrected for IBD, was confirmed. The GEA often reflected the proposed gene functions, including indications for adaptation to water availability and temperature. Genomic divergence and the lack of large-scale neutral genetic patterns suggest that gene flow allows the spread of advantageous alleles in adaptive genes. Thereby, adaptation processes are likely to take place in species occurring in heterogeneous environments, which might reduce their regional extinction risk under CC. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  14. Distribution of oil and natural-gas wells in relation to ground-water flow systems in the Great Basin region of Nevada and Utah, and adjacent states

    USGS Publications Warehouse

    Schaefer, Donald H.

    1996-01-01

    This map publication is one of several in a series concerning various aspects of the ground-water hydrology of the Great Basin in Nevada, Utah, and adjacent States.  One report in the series describes the hydrogeologic framework of the Great Basin (Plume and Carlton, 1988).  Another shows the ground-water levels for the aquifer systems of the Great Basin (Thomas and others, 1986).  A third report in the series describes the regional ground-water flow patterns in the Great Basin (Harrill and others, 1988).

  15. Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed.

    PubMed

    Hamidi Hay, E; Roberts, A

    2017-04-01

    Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 genotyped with Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity, 2 approaches were adopted: single marker genome wide association and Bayesian approach using GenSel software. The genomic prediction accuracy was low: 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome wide association study (GWAS) identified 5 loci with P-value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.

  16. The complete mitochondrial genome sequence of the western flower thrips Frankliniella occidentalis (Thysanoptera: Thripidae) contains triplicate putative control regions.

    PubMed

    Yan, Dankan; Tang, Yunxia; Xue, Xiaofeng; Wang, Minghua; Liu, Fengquan; Fan, Jiaqin

    2012-09-10

    To investigate the features of the control region (CR) and the gene rearrangement in the mitochondrial (mt) genome of Thysanoptera insects, we sequenced the whole mt genome of the western flower thrips Frankliniella occidentalis (Thysanoptera: Thripidae). The mt genome is a circular molecule with 14,889 nucleotides and an A+T content of 76.6%, and it has triplicate putative CRs. We propose that tandem duplication and deletion account for the evolution of the CR and the gene translocations. Intramitochondrial recombination is a plausible model for the gene inversions. We discuss the excessive duplicate CR sequences and the transcription of the rRNA genes, which are distant from one another and from the CR. Finally, we address the significance of the complicated mt genomes in Thysanoptera for the evolution of the CR and the gene arrangement of the mt genome. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.

  17. Comparative Analysis of the Base Compositions of the Pre-mRNA 3′ Cleaved-Off Region and the mRNA 3′ Untranslated Region Relative to the Genomic Base Composition in Animals and Plants

    PubMed Central

    Li, Xiu-Qing

    2014-01-01

    The precursor messenger RNA (pre-mRNA) three-prime cleaved-off region (3′COR) and the mRNA three-prime untranslated region (3′UTR) play critical roles in regulating gene expression. The differences in base composition between these regions and the corresponding genomes are still largely uncharacterized in animals and plants. In this study, the base compositions of non-redundant 3′CORs and 3′UTRs were compared with the corresponding whole genomes of eleven animals, four dicotyledonous plants, and three monocotyledonous (cereal) plants. Among the four bases (A, C, G, and U for adenine, cytosine, guanine, and uracil, respectively), U (which corresponds to T, for thymine, in DNA) was the most frequent, A the second most frequent, G the third most frequent, and C the least frequent in most of the species in both the 3′COR and 3′UTR regions. In comparison with the whole genomes, in both regions the U content was usually the most overrepresented (particularly in the monocotyledonous plants), and the C content was the most underrepresented. The order obtained for the species groups, when ranked from high to low according to the U contents in the 3′COR and 3′UTR was as follows: dicotyledonous plants, monocotyledonous plants, non-mammal animals, and mammals. In contrast, the genomic T content was highest in dicotyledonous plants, lowest in monocotyledonous plants, and intermediate in animals. These results suggest the following: 1) there is a mechanism operating in both animals and plants which is biased toward U and against C in the 3′COR and 3′UTR; 2) the 3′UTR and 3′COR, as functional units, minimized the difference between dicotyledonous and monocotyledonous plants, while the dicotyledonous and monocotyledonous genomes evolved into two extreme groups in terms of base composition. PMID:24941005

  18. Non-coding genomic regions possessing enhancer and silencer potential are associated with healthy aging and exceptional survival.

    PubMed

    Kim, Sangkyu; Welsh, David A; Myers, Leann; Cherry, Katie E; Wyckoff, Jennifer; Jazwinski, S Michal

    2015-02-28

    We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13-14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity.

  19. Non-coding genomic regions possessing enhancer and silencer potential are associated with healthy aging and exceptional survival

    PubMed Central

    Kim, Sangkyu; Welsh, David A.; Myers, Leann; Cherry, Katie E.; Wyckoff, Jennifer; Jazwinski, S. Michal

    2015-01-01

    We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13–14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity. PMID:25682868

  20. Psoralen interstrand cross-link repair is specifically altered by an adjacent triple-stranded structure

    PubMed Central

    Guillonneau, F.; Guieysse, A. L.; Nocentini, S.; Giovannangeli, C.; Praseuth, D.

    2004-01-01

    Targeting DNA-damaging agents to specific DNA sites by using sequence-specific DNA ligands has been successful in directing genomic modifications. The understanding of repair processing of such targeted damage and the influence of the adjacent complex is largely unknown. In this way, directed interstrand cross-links (ICLs) have already been generated by psoralen targeting. The mechanisms responsible for ICL removal are far from being understood in mammalian cells, with the proposed involvement of both mutagenic and recombinogenic pathways. Here, a unique ICL was introduced at a selected site by photoactivation of a psoralen moiety with the use of psoralen conjugates of triplex-forming oligonucleotides. The processing of psoralen ICL was evaluated in vitro and in cells for two types of cross-linked substrates, either containing a psoralen ICL alone or with an adjacent triple-stranded structure. We show that the presence of a neighbouring triplex structure interferes with different stages of psoralen ICL processing: (i) the ICL-induced DNA repair synthesis in HeLa cell extracts is inhibited by the triplex structure, as measured by the efficiency of ‘true’ and futile repair synthesis, stopping at the ICL site; (ii) in HeLa cells, the ICL removal via a nucleotide excision repair (NER) pathway is delayed in the presence of a neighbouring triplex; and (iii) the binding to ICL of recombinant xeroderma pigmentosum A protein, which is involved in pre-incision recruitment of NER factors is impaired by the presence of the third DNA strand. These data characterize triplex-induced modulation of ICL repair pathways at specific steps, which might have implications for the controlled induction of targeted genomic modifications and for the associated cellular responses. PMID:14966263

  1. Core and region-enriched networks of behaviorally regulated genes and the singing genome

    PubMed Central

    Whitney, Osceola; Pfenning, Andreas R.; Howard, Jason T.; Blatti, Charles A; Liu, Fang; Ward, James M.; Wang, Rui; Audet, Jean-Nicolas; Kellis, Manolis; Mukherjee, Sayan; Sinha, Saurabh; Hartemink, Alexander J.; West, Anne E.; Jarvis, Erich D.

    2015-01-01

    Songbirds represent an important model organism for elucidating molecular mechanisms that link genes with complex behaviors, in part because they have discrete vocal learning circuits that have parallels with those that mediate human speech. We found that ~10% of the genes in the avian genome were regulated by singing, and we found a striking regional diversity of both basal and singing-induced programs in the four key song nuclei of the zebra finch, a vocal learning songbird. The region-enriched patterns were a result of distinct combinations of region-enriched transcription factors (TFs), their binding motifs, and presinging acetylation of histone 3 at lysine 27 (H3K27ac) enhancer activity in the regulatory regions of the associated genes. RNA interference manipulations validated the role of the calcium-response transcription factor (CaRF) in regulating genes preferentially expressed in specific song nuclei in response to singing. Thus, differential combinatorial binding of a small group of activity-regulated TFs and predefined epigenetic enhancer activity influences the anatomical diversity of behaviorally regulated gene networks. PMID:25504732

  2. Characterization of the genomic organization of the region bordering the centromere of chromosome V of Podospora anserina by direct sequencing.

    PubMed

    Silar, Philippe; Barreau, Christian; Debuchy, Robert; Kicka, Sébastien; Turcq, Béatrice; Sainsard-Chanet, Annie; Sellem, Carole H; Billault, Alain; Cattolico, Laurence; Duprat, Simone; Weissenbach, Jean

    2003-08-01

    A Podospora anserina BAC library of 4800 clones has been constructed in the vector pBHYG allowing direct selection in fungi. Screening of the BAC collection for centromeric sequences of chromosome V allowed the recovery of clones localized on either side of the centromere, but no BAC clone was found to contain the centromere. Seven BAC clones containing 322,195 and 156,244 bp from either side of the centromeric region were sequenced and annotated. One 5S rRNA gene, 5 tRNA genes, and 163 putative coding sequences (CDS) were identified. Among these, only six CDS seem specific to P. anserina. The gene density in the centromeric region is approximately one gene every 2.8 kb. Extrapolation of this gene density to the whole genome of P. anserina suggests that the genome contains about 11,000 genes. Synteny analyses between P. anserina and Neurospora crassa show that co-linearity extends at the most to a few genes, suggesting rapid genome rearrangements between these two species.
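
    The gene-density figure in the record above can be reproduced from the numbers it quotes. This sketch assumes the density counts the rRNA and tRNA genes together with the CDS, which the record does not state explicitly; counting the 163 CDS alone would give roughly 2.9 kb per gene instead. The implied genome size is back-calculated here and is not a value given in the record.

```python
# Sequenced bases on the two sides of the centromeric region (from the record).
sequenced_bp = 322_195 + 156_244

# Assumption: density counts the 5S rRNA gene, the 5 tRNA genes, and the 163 CDS.
annotated_genes = 1 + 5 + 163

bp_per_gene = sequenced_bp / annotated_genes
kb_per_gene = bp_per_gene / 1000

# Genome size implied by the record's estimate of about 11,000 genes
# at this density (a derived figure, not one reported in the record).
estimated_genes = 11_000
implied_genome_mb = estimated_genes * bp_per_gene / 1e6

print(f"{kb_per_gene:.1f} kb per gene; implied genome of about {implied_genome_mb:.1f} Mb")
```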

  3. SINEs, evolution and genome structure in the opossum.

    PubMed

    Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

    2007-07-01

    Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

  4. Comparative sequence analysis of a region on human chromosome 13q14, frequently deleted in B-cell chronic lymphocytic leukemia, and its homologous region on mouse chromosome 14.

    PubMed

    Kapanadze, B; Makeeva, N; Corcoran, M; Jareborg, N; Hammarsund, M; Baranova, A; Zabarovsky, E; Vorontsova, O; Merup, M; Gahrton, G; Jansson, M; Yankovsky, N; Einhorn, S; Oscier, D; Grandér, D; Sangfelt, O

    2000-12-15

    Previous studies have indicated the presence of a putative tumor suppressor gene on human chromosome 13q14, commonly deleted in patients with B-cell chronic lymphocytic leukemia (B-CLL). We have recently identified a minimally deleted region encompassing parts of two adjacent genes, termed LEU1 and LEU2 (leukemia-associated genes 1 and 2), and several additional transcripts. In addition, 50 kb centromeric to this region we have identified another gene, LEU5/RFP2. To elucidate further the complex genomic organization of this region, we have identified, mapped, and sequenced the homologous region in the mouse. Fluorescence in situ hybridization analysis demonstrated that the region maps to mouse chromosome 14. The overall organization and gene order in this region were found to be highly conserved in the mouse. Sequence comparison between the human deletion hotspot region and its homologous mouse region revealed a high degree of sequence conservation with an overall score of 74%. However, our data also show that in terms of transcribed sequences, only two of those, human LEU2 and LEU5/RFP2, are clearly conserved, strengthening the case for these genes as putative candidate B-CLL tumor suppressor genes.

  5. Novel bacteriophages containing a genome of another bacteriophage within their genomes.

    PubMed

    Swanson, Maud M; Reavy, Brian; Makarova, Kira S; Cock, Peter J; Hopkins, David W; Torrance, Lesley; Koonin, Eugene V; Taliansky, Michael

    2012-01-01

    A novel bacteriophage infecting Staphylococcus pasteuri was isolated during a screen for phages in Antarctic soils. The phage named SpaA1 is morphologically similar to phages of the family Siphoviridae. The 42,784 bp genome of SpaA1 is a linear, double-stranded DNA molecule with 3' protruding cohesive ends. The SpaA1 genome encompasses 63 predicted protein-coding genes which cluster within three regions of the genome, each of apparently different origin, in a mosaic pattern. In two of these regions, the gene sets resemble those in prophages of Bacillus thuringiensis kurstaki str. T03a001 (genes involved in DNA replication/transcription, cell entry and exit) and B. cereus AH676 (additional regulatory and recombination genes), respectively. The third region represents an almost complete genome (except for the short terminal segments) of a distinct bacteriophage, MZTP02. Nearly the same gene module was identified in prophages of B. thuringiensis serovar monterrey BGSC 4AJ1 and B. cereus Rock4-2. These findings suggest that MZTP02 can be shuttled between genomes of other bacteriophages and prophages, leading to the formation of chimeric genomes. The presence of a complete phage genome in the genome of other phages apparently has not been described previously and might represent a 'fast track' route of virus evolution and horizontal gene transfer. Another phage (BceA1) nearly identical in sequence to SpaA1, and also including the almost complete MZTP02 genome within its own genome, was isolated from a bacterium of the B. cereus/B. thuringiensis group. Remarkably, both SpaA1 and BceA1 phages can infect B. cereus and B. thuringiensis, but only one of them, SpaA1, can infect S. pasteuri. This finding is best compatible with a scenario in which MZTP02 was originally contained in BceA1 infecting Bacillus spp, the common hosts for these two phages, followed by emergence of SpaA1 infecting S. pasteuri.

  6. Genomic structure and paralogous regions of the inversion breakpoint occurring between human chromosome 3p12.3 and orangutan chromosome 2.

    PubMed

    Yue, Y; Grossmann, B; Tsend-Ayush, E; Grützner, F; Ferguson-Smith, M A; Yang, F; Haaf, T

    2005-01-01

    Intrachromosomal duplications play a significant role in human genome pathology and evolution. To better understand the molecular basis of evolutionary chromosome rearrangements, we performed molecular cytogenetic and sequence analyses of the breakpoint region that distinguishes human chromosome 3p12.3 and orangutan chromosome 2. FISH with region-specific BAC clones demonstrated that the breakpoint-flanking sequences are duplicated intrachromosomally on orangutan 2 and human 3q21 as well as at many pericentromeric and subtelomeric sites throughout the genomes. Breakage and rearrangement of the human 3p12.3-homologous region in the orangutan lineage were associated with a partial loss of duplicated sequences in the breakpoint region. Consistent with our FISH mapping results, computational analysis of the human chromosome 3 genomic sequence revealed three 3p12.3-paralogous sequence blocks on human chromosome 3q21 and smaller blocks on the short arm end 3p26-->p25. This is consistent with the view that sequences from an ancestral site at 3q21 were duplicated at 3p12.3 in a common ancestor of orangutan and humans. Our results show that evolutionary chromosome rearrangements are associated with microduplications and microdeletions, contributing to the DNA differences between closely related species. Copyright (c) 2005 S. Karger AG, Basel.

  7. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    PubMed

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  8. A genome-wide association study implicates the APOE locus in nonpathological cognitive ageing.

    PubMed

    Davies, G; Harris, S E; Reynolds, C A; Payton, A; Knight, H M; Liewald, D C; Lopez, L M; Luciano, M; Gow, A J; Corley, J; Henderson, R; Murray, C; Pattie, A; Fox, H C; Redmond, P; Lutz, M W; Chiba-Falek, O; Linnertz, C; Saith, S; Haggarty, P; McNeill, G; Ke, X; Ollier, W; Horan, M; Roses, A D; Ponting, C P; Porteous, D J; Tenesa, A; Pickles, A; Starr, J M; Whalley, L J; Pedersen, N L; Pendleton, N; Visscher, P M; Deary, I J

    2014-01-01

    Cognitive decline is a feared aspect of growing old. It is a major contributor to lower quality of life and loss of independence in old age. We investigated the genetic contribution to individual differences in nonpathological cognitive ageing in five cohorts of older adults. We undertook a genome-wide association analysis using 549 692 single-nucleotide polymorphisms (SNPs) in 3511 unrelated adults in the Cognitive Ageing Genetics in England and Scotland (CAGES) project. These individuals have detailed longitudinal cognitive data from which phenotypes measuring each individual's cognitive changes were constructed. One SNP--rs2075650, located in TOMM40 (translocase of the outer mitochondrial membrane 40 homolog)--had a genome-wide significant association with cognitive ageing (P=2.5 × 10(-8)). This result was replicated in a meta-analysis of three independent Swedish cohorts (P=2.41 × 10(-6)). An Apolipoprotein E (APOE) haplotype (adjacent to TOMM40), previously associated with cognitive ageing, had a significant effect on cognitive ageing in the CAGES sample (P=2.18 × 10(-8); females, P=1.66 × 10(-11); males, P=0.01). Fine SNP mapping of the TOMM40/APOE region identified both APOE (rs429358; P=3.66 × 10(-11)) and TOMM40 (rs11556505; P=2.45 × 10(-8)) as loci that were associated with cognitive ageing. Imputation and conditional analyses in the discovery and replication cohorts strongly suggest that this effect is due to APOE (rs429358). Functional genomic analysis indicated that SNPs in the TOMM40/APOE region have a functional, regulatory non-protein-coding effect. The APOE region is significantly associated with nonpathological cognitive ageing. The identity and mechanism of one or multiple causal variants remain unclear.

  9. Genomic analysis of the chromosome 15q11-q13 Prader-Willi syndrome region and characterization of transcripts for GOLGA8E and WHCD1L1 from the proximal breakpoint region.

    PubMed

    Jiang, Yong-Hui; Wauki, Kekio; Liu, Qian; Bressler, Jan; Pan, Yanzhen; Kashork, Catherine D; Shaffer, Lisa G; Beaudet, Arthur L

    2008-01-28

    Prader-Willi syndrome (PWS) is a neurobehavioral disorder characterized by neonatal hypotonia, childhood obesity, dysmorphic features, hypogonadism, mental retardation, and behavioral problems. Although PWS is most often caused by a paternal interstitial deletion of a 6-Mb region of chromosome 15q11-q13, the identity of the exact protein coding or noncoding RNAs whose deficiency produces the PWS phenotype is uncertain. There are also reports describing a PWS-like phenotype in a subset of patients with full mutations in the FMR1 (fragile X mental retardation 1) gene. Taking advantage of the human genome sequence, we have performed extensive sequence analysis and molecular studies for the PWS candidate region. We have characterized transcripts for the first time for two UCSC Genome Browser predicted protein-coding genes, GOLGA8E (golgin subfamily a, 8E) and WHDC1L1 (WAS protein homology region containing 1-like 1) and have further characterized two previously reported genes, CYFIP1 and NIPA2; all four genes are in the region close to the proximal/centromeric deletion breakpoint (BP1). GOLGA8E belongs to the golgin subfamily of coiled-coil proteins associated with the Golgi apparatus. Six out of 16 golgin subfamily proteins in the human genome have been mapped in the chromosome 15q11-q13 and 15q24-q26 regions. We have also identified more than 38 copies of GOLGA8E-like sequence in the 15q11-q14 and 15q23-q26 regions which supports the presence of a GOLGA8E-associated low copy repeat (LCR). Analysis of the 15q11-q13 region by PFGE also revealed a polymorphic region between BP1 and BP2. WHDC1L1 is a novel gene with similarity to mouse Whdc1 (WAS protein homology region 2 domain containing 1) and human JMY protein (junction-mediating and regulatory protein). Expression analysis of cultured human cells and brain tissues from PWS patients indicates that CYFIP1 and NIPA2 are biallelically expressed. However, we were not able to determine the allele-specific expression

  10. Genomic variation in Plasmodium vivax malaria reveals regions under selective pressure

    PubMed Central

    Diez Benavente, Ernest; Ward, Zoe; Chan, Wilson; Mohareb, Fady R.; Sutherland, Colin J.; Roper, Cally; Campino, Susana

    2017-01-01

    Background: Although Plasmodium vivax contributes to almost half of all malaria cases outside Africa, it has been relatively neglected compared to the more deadly P. falciparum. It is known that P. vivax populations possess high genetic diversity, differing geographically potentially due to different vector species, host genetics and environmental factors. Results: We analysed the high-quality genomic data for 46 P. vivax isolates spanning 10 countries across 4 continents. Using population genetic methods we identified hotspots of selection pressure, including the previously reported MRP1 and DHPS genes, both putative drug resistance loci. Extra copies and deletions in the promoter region of another drug resistance candidate, MDR1 gene, and duplications in the Duffy binding protein gene (PvDBP) potentially involved in erythrocyte invasion, were also identified. For surveillance applications, continental-informative markers were found in putative drug resistance loci, and we show that organellar polymorphisms could classify P. vivax populations across continents and differentiate between Plasmodia spp. Conclusions: This study has shown that genomic diversity that lies within and between P. vivax populations can be used to elucidate potential drug resistance and invasion mechanisms, as well as facilitate the molecular barcoding of the parasite for surveillance applications. PMID:28493919

  11. Draft genome sequence of bitter gourd (Momordica charantia), a vegetable and medicinal plant in tropical and subtropical regions.

    PubMed

    Urasaki, Naoya; Takagi, Hiroki; Natsume, Satoshi; Uemura, Aiko; Taniai, Naoki; Miyagi, Norimichi; Fukushima, Mai; Suzuki, Shouta; Tarora, Kazuhiko; Tamaki, Moritoshi; Sakamoto, Moriaki; Terauchi, Ryohei; Matsumura, Hideo

    2017-02-01

    Bitter gourd (Momordica charantia) is an important vegetable and medicinal plant in tropical and subtropical regions globally. In this study, the draft genome sequence of a monoecious bitter gourd inbred line, OHB3-1, was analyzed. Through Illumina sequencing and de novo assembly, scaffolds of 285.5 Mb in length were generated, corresponding to ∼84% of the estimated genome size of bitter gourd (339 Mb). In this draft genome sequence, 45,859 protein-coding gene loci were identified, and transposable elements accounted for 15.3% of the whole genome. According to synteny mapping and phylogenetic analysis of conserved genes, bitter gourd was more related to watermelon (Citrullus lanatus) than to cucumber (Cucumis sativus) or melon (C. melo). Using RAD-seq analysis, 1507 marker loci were genotyped in an F2 progeny of two bitter gourd lines, resulting in an improved linkage map, comprising 11 linkage groups. By anchoring RAD tag markers, 255 scaffolds were assigned to the linkage map. Comparative analysis of genome sequences and predicted genes determined that putative trypsin-inhibitor and ribosome-inactivating genes were distinctive in the bitter gourd genome. These genes could characterize the bitter gourd as a medicinal plant. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  12. Autopolyploidy genome duplication preserves other ancient genome duplications in Atlantic salmon (Salmo salar).

    PubMed

    Christensen, Kris A; Davidson, William S

    2017-01-01

    Salmonids (e.g. Atlantic salmon, Pacific salmon, and trouts) have a long legacy of genome duplication. In addition to three ancient genome duplications that all teleosts are thought to share, salmonids have had one additional genome duplication. We explored a methodology for untangling these duplications from each other to better understand them in Atlantic salmon. In this methodology, homeologous regions (paralogous/duplicated genomic regions originating from a whole genome duplication) from the most recent genome duplication were assumed to have duplicated genes at greater density and have greater sequence similarity. This assumption was used to differentiate duplicated gene pairs in Atlantic salmon that are either from the most recent genome duplication or from earlier duplications. From a comparison with multiple vertebrate species, it is clear that Atlantic salmon have retained more duplicated genes from ancient genome duplications than other vertebrates--often at higher density in the genome and containing fewer synonymous mutations. It may be that polysomic inheritance is the mechanism responsible for maintaining ancient gene duplicates in salmonids. Polysomic inheritance (when multiple chromosomes pair during meiosis) is thought to be relatively common in salmonids compared to other vertebrate species. These findings illuminate how genome duplications may not only increase the number of duplicated genes, but may also be involved in the maintenance of them from previous genome duplications as well.

  13. A segment of the apospory-specific genomic region is highly microsyntenic not only between the apomicts Pennisetum squamulatum and buffelgrass, but also with a rice chromosome 11 centromeric-proximal genomic region.

    PubMed

    Gualtieri, Gustavo; Conner, Joann A; Morishige, Daryl T; Moore, L David; Mullet, John E; Ozias-Akins, Peggy

    2006-03-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory.

  14. Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes

    PubMed Central

    Cannon, Steven B.; Sterck, Lieven; Rombauts, Stephane; Sato, Shusei; Cheung, Foo; Gouzy, Jérôme; Wang, Xiaohong; Mudge, Joann; Vasdewani, Jayprakash; Schiex, Thomas; Spannagl, Manuel; Monaghan, Erin; Nicholson, Christine; Humphray, Sean J.; Schoof, Heiko; Mayer, Klaus F. X.; Rogers, Jane; Quétier, Francis; Oldroyd, Giles E.; Debellé, Frédéric; Cook, Douglas R.; Retzel, Ernest F.; Roe, Bruce A.; Town, Christopher D.; Tabata, Satoshi; Van de Peer, Yves; Young, Nevin D.

    2006-01-01

    Genome sequencing of the model legumes, Medicago truncatula and Lotus japonicus, provides an opportunity for large-scale sequence-based comparison of two genomes in the same plant family. Here we report synteny comparisons between these species, including details about chromosome relationships, large-scale synteny blocks, microsynteny within blocks, and genome regions lacking clear correspondence. The Lotus and Medicago genomes share a minimum of 10 large-scale synteny blocks, each with substantial collinearity and frequently extending the length of whole chromosome arms. The proportion of genes syntenic and collinear within each synteny block is relatively homogeneous. Medicago–Lotus comparisons also indicate similar and largely homogeneous gene densities, although gene-containing regions in Mt occupy 20–30% more space than Lj counterparts, primarily because of larger numbers of Mt retrotransposons. Because the interpretation of genome comparisons is complicated by large-scale genome duplications, we describe synteny, synonymous substitutions and phylogenetic analyses to identify and date a probable whole-genome duplication event. There is no direct evidence for any recent large-scale genome duplication in either Medicago or Lotus but instead a duplication predating speciation. Phylogenetic comparisons place this duplication within the Rosid I clade, clearly after the split between legumes and Salicaceae (poplar). PMID:17003129

  15. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    PubMed

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Genomic Alterations in Biliary Atresia Suggests Region of Potential Disease Susceptibility in 2q37.3

    PubMed Central

    Leyva-Vega, Melissa; Gerfen, Jennifer; Thiel, Brian D.; Jurkiewicz, Dorota; Rand, Elizabeth B.; Pawlowska, Joanna; Kaminska, Diana; Russo, Pierre; Gai, Xiaowu; Krantz, Ian D.; Kamath, Binita M.; Hakonarson, Hakon; Haber, Barbara A.; Spinner, Nancy B.

    2010-01-01

    Biliary atresia (BA) is a progressive, idiopathic obliteration of the extrahepatic biliary system occurring exclusively in the neonatal period. It is the most common disease leading to liver transplantation in children. The etiology of BA is unknown, although infectious, immune and genetic causes have been suggested. While the recurrence of BA in families is not common, there are more than 30 multiplex families reported and an underlying genetic susceptibility has been hypothesized. We screened a cohort of 35 BA patients for genomic alterations that might confer susceptibility to BA. DNA was genotyped on the Illumina Quad550 platform, which analyzes over 550,000 single nucleotide polymorphisms (SNPs) for genomic deletions and duplications. Areas of increased and decreased copy number were compared to those found in control populations. In order to identify regions that could serve as susceptibility factors for BA, we searched for regions that were found in BA patients, but not in controls. We identified two unrelated BA patients with overlapping heterozygous deletions of 2q37.3. Patient 1 had a 1.76 Mb (280 SNP), heterozygous deletion containing thirty genes. Patient 2 had a 5.87 Mb (1,346 SNP) heterozygous deletion containing fifty-five genes. The overlapping 1.76 Mb deletion on chromosome 2q37.3 from 240,936,900 to 242,692,820 constitutes the critical region and the genes within this region could be candidates for susceptibility to BA. PMID:20358598
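
    The critical-region reasoning in this abstract (two deletions, with the shared segment taken as the candidate interval) can be sketched as a simple interval intersection. This is a minimal illustration, not the authors' pipeline: the Patient 1 coordinates are the ones quoted above, while the Patient 2 coordinates are hypothetical, chosen only to give a 5.87 Mb deletion that contains the Patient 1 interval. Lengths are computed as end minus start, which is an assumed convention.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in base pairs on a single chromosome


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the segment shared by two intervals, or None if they do not overlap."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


# Patient 1: chromosome 2 coordinates quoted in the abstract.
patient1 = (240_936_900, 242_692_820)
# Patient 2: HYPOTHETICAL coordinates (the abstract gives only the 5.87 Mb size).
patient2 = (236_822_820, 242_692_820)

critical = overlap(patient1, patient2)
critical_mb = round((critical[1] - critical[0]) / 1e6, 2)
print(critical, critical_mb)
```

    With these inputs the shared segment is the Patient 1 interval itself, 1,755,920 bp long, which rounds to the 1.76 Mb reported in the abstract.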

  17. The evolution of vertebrate somatostatin receptors and their gene regions involves extensive chromosomal rearrangements

    PubMed Central

    2012-01-01

    Background: Somatostatin and its related neuroendocrine peptides have a wide variety of physiological functions that are mediated by five somatostatin receptors with gene names SSTR1-5 in mammals. To resolve their evolution in vertebrates we have investigated the SSTR genes and a large number of adjacent gene families by phylogeny and conserved synteny analyses in a broad range of vertebrate species. Results: We find that the SSTRs form two families that belong to distinct paralogons. We observe not only chromosomal similarities reflecting the paralogy relationships between the SSTR-bearing chromosome regions, but also extensive rearrangements between these regions in teleost fish genomes, including fusions and translocations followed by reshuffling through intrachromosomal rearrangements. These events obscure the paralogy relationships but are still tractable thanks to the many genomes now available. We have identified a previously unrecognized SSTR subtype, SSTR6, previously misidentified as either SSTR1 or SSTR4. Conclusions: Two ancestral SSTR-bearing chromosome regions were duplicated in the two basal vertebrate tetraploidizations (2R). One of these ancestral SSTR genes generated SSTR2, -3 and -5, the other gave rise to SSTR1, -4 and -6. Subsequently SSTR6 was lost in tetrapods and SSTR4 in teleosts. Our study shows that extensive chromosomal rearrangements have taken place between related chromosome regions in teleosts, but that these events can be resolved by investigating several distantly related species. PMID:23194088

  18. The bonobo genome compared with the chimpanzee and human genomes

    PubMed Central

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

    2012-01-01

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832

  19. Genome scan study of prostate cancer in Arabs: identification of three genomic regions with multiple prostate cancer susceptibility loci in Tunisians.

    PubMed

    Shan, Jingxuan; Al-Rumaihi, Khalid; Rabah, Danny; Al-Bozom, Issam; Kizhakayil, Dhanya; Farhat, Karim; Al-Said, Sami; Kfoury, Hala; Dsouza, Shoba P; Rowe, Jillian; Khalak, Hanif G; Jafri, Shahzad; Aigha, Idil I; Chouchane, Lotfi

    2013-05-13

    Large databases focused on genetic susceptibility to prostate cancer have been accumulated from population studies of different ancestries, including Europeans and African-Americans. Arab populations, however, have been only rarely studied. Using Affymetrix Genome-Wide Human SNP Array 6, we conducted a genome-wide association study (GWAS) in which 534,781 single nucleotide polymorphisms (SNPs) were genotyped in 221 Tunisians (90 prostate cancer patients and 131 age-matched healthy controls). TaqMan SNP Genotyping Assays on 11 prostate cancer associated SNPs were performed in a distinct cohort of 337 individuals of Arab ancestry living in Qatar and Saudi Arabia (155 prostate cancer patients and 182 age-matched controls). In-silico expression quantitative trait locus (eQTL) analysis along with mRNA quantification of nearby genes was performed to identify loci potentially cis-regulated by the identified SNPs. Three chromosomal regions, encompassing 14 SNPs, are significantly associated with prostate cancer risk in the Tunisian population (P = 1 × 10(-4) to P = 1 × 10(-5)). In addition to SNPs located on chromosome 17q21, previously found associated with prostate cancer in Western populations, two novel chromosomal regions are revealed on chromosome 9p24 and 22q13. eQTL analysis and mRNA quantification indicate that the prostate cancer associated SNPs of chromosome 17 could enhance the expression of STAT5B gene. Our findings, identifying novel GWAS prostate cancer susceptibility loci, indicate that prostate cancer genetic risk factors could be ethnic specific.

  20. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

    PubMed

    Civáň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

    2014-04-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.

  1. Analyses of Charophyte Chloroplast Genomes Help Characterize the Ancestral Chloroplast Genome of Land Plants

    PubMed Central

    Civáň, Peter; Foster, Peter G.; Embley, Martin T.; Séneca, Ana; Cox, Cymon J.

    2014-01-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes. PMID:24682153

  2. NAHR-mediated copy-number variants in a clinical population: mechanistic insights into both genomic disorders and Mendelizing traits.

    PubMed

    Dittwald, Piotr; Gambin, Tomasz; Szafranski, Przemyslaw; Li, Jian; Amato, Stephen; Divon, Michael Y; Rodríguez Rojas, Lisa Ximena; Elton, Lindsay E; Scott, Daryl A; Schaaf, Christian P; Torres-Martinez, Wilfredo; Stevens, Abby K; Rosenfeld, Jill A; Agadi, Satish; Francis, David; Kang, Sung-Hae L; Breman, Amy; Lalani, Seema R; Bacino, Carlos A; Bi, Weimin; Milosavljevic, Aleksandar; Beaudet, Arthur L; Patel, Ankita; Shaw, Chad A; Lupski, James R; Gambin, Anna; Cheung, Sau Wai; Stankiewicz, Pawel

    2013-09-01

    We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.
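
    The "concentration of the HR hot spot motif" mentioned in this abstract can be illustrated by scanning a sequence for 5'-CCNCCNTNNCCNC-3', where N stands for any nucleotide. The sketch below is only an assumption about how such a count might be made and is not the authors' method: the function names are hypothetical, only the forward strand is scanned, and the example sequences are made up.

```python
import re

MOTIF = "CCNCCNTNNCCNC"  # motif quoted in the abstract; N = any of A, C, G, T


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Translate the motif into a regex; a lookahead lets overlapping hits be found."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    return re.compile(f"(?=({body}))")


def find_motif(seq: str, motif: str = MOTIF) -> list:
    """Return 0-based start positions of every forward-strand match in seq."""
    return [m.start() for m in motif_regex(motif).finditer(seq.upper())]


def motif_density_per_kb(seq: str, motif: str = MOTIF) -> float:
    """Number of matches per 1,000 bases (0.0 for an empty sequence)."""
    return 1000.0 * len(find_motif(seq, motif)) / len(seq) if seq else 0.0


# Made-up sequence containing one match that starts at position 3.
example = "AAACCTCCGTGGCCACTT"
print(find_motif(example))
```

    A fuller analysis would presumably also scan the reverse complement and normalise by repeat length, which is what the density helper gestures at.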

  3. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    PubMed

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known

  4. Variability of adjacency effects in sky reflectance measurements.

    PubMed

    Groetsch, Philipp M M; Gege, Peter; Simis, Stefan G H; Eleveld, Marieke A; Peters, Steef W M

    2017-09-01

    Sky reflectance R_sky(λ) is used to correct in situ reflectance measurements in the remote detection of water color. We analyzed the directional and spectral variability in R_sky(λ) due to adjacency effects against an atmospheric radiance model. The analysis is based on one year of semi-continuous R_sky(λ) observations that were recorded in two azimuth directions. Adjacency effects contributed to R_sky(λ) dependence on season and viewing angle, predominantly in the near-infrared (NIR). For our test area, adjacency effects spectrally resembled a generic vegetation spectrum. The adjacency effect was weakly dependent on the magnitude of Rayleigh- and aerosol-scattered radiance. The reflectance differed between viewing directions by 5.4±6.3% for adjacency effects and by 21.0±19.8% for Rayleigh- and aerosol-scattered R_sky(λ) in the NIR. Under which conditions in situ water reflectance observations require dedicated correction for adjacency effects is discussed. We provide an open source implementation of our method to aid identification of such conditions.

  5. Fully automatic lesion segmentation in breast MRI using mean-shift and graph-cuts on a region adjacency graph.

    PubMed

    McClymont, Darryl; Mehnert, Andrew; Trakic, Adnan; Kennedy, Dominic; Crozier, Stuart

    2014-04-01

    To present and evaluate a fully automatic method for segmentation (i.e., detection and delineation) of suspicious tissue in breast MRI. The method, based on mean-shift clustering and graph-cuts on a region adjacency graph, was developed and its parameters tuned using multimodal (T1, T2, DCE-MRI) clinical breast MRI data from 35 subjects (training data). It was then tested using two data sets. Test set 1 comprises data for 85 subjects (93 lesions) acquired using the same protocol and scanner system used to acquire the training data. Test set 2 comprises data for eight subjects (nine lesions) acquired using a similar protocol but a different vendor's scanner system. Each lesion was manually delineated in three-dimensions by an experienced breast radiographer to establish segmentation ground truth. The regions of interest identified by the method were compared with the ground truth and the detection and delineation accuracies quantitatively evaluated. One hundred percent of the lesions were detected with a mean of 4.5 ± 1.2 false positives per subject. This false-positive rate is nearly 50% better than previously reported for a fully automatic breast lesion detection system. The median Dice coefficient for Test set 1 was 0.76 (interquartile range, 0.17), and 0.75 (interquartile range, 0.16) for Test set 2. The results demonstrate the efficacy and accuracy of the proposed method as well as its potential for direct application across different MRI systems. It is (to the authors' knowledge) the first fully automatic method for breast lesion detection and delineation in breast MRI.
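
    The Dice coefficient reported above measures the overlap between a predicted region and the ground-truth delineation. A minimal sketch, assuming each segmentation is represented as a set of voxel coordinates (a representation chosen here for illustration, not taken from the paper):

```python
def dice_coefficient(predicted: set, truth: set) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    if not predicted and not truth:
        # Two empty segmentations agree perfectly by convention (an assumption).
        return 1.0
    overlap = len(predicted & truth)
    return 2.0 * overlap / (len(predicted) + len(truth))


# Hypothetical toy example: three voxels each, two of them shared.
predicted_voxels = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
truth_voxels = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
score = dice_coefficient(predicted_voxels, truth_voxels)
```

    A value of 1 means identical regions and 0 means no overlap, so the medians of 0.76 and 0.75 quoted in the abstract indicate substantial but imperfect agreement with the manual delineations.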

  6. The human genome: a multifractal analysis

    PubMed Central

    2011-01-01

    Background: Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode. Results: We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content. In contrast, no or low relationship was found for LINE, MIR, MER, LTRs elements and DNA regions poor in genetic information. Gene function, cluster of orthologous genes, metabolic pathways, and exons tended to increase their frequencies with ranges of multifractality and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and classification for human chromosomes are proposed. Conclusions: Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization where many regions coexist that are far from equilibrium and 2) this non-linear organization has significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful. PMID:21999602

  7. Flow and transport within a coastal aquifer adjacent to a stratified water body

    NASA Astrophysics Data System (ADS)

    Oz, Imri; Yechieli, Yoseph; Eyal, Shalev; Gavrieli, Ittai; Gvirtzman, Haim

    2016-04-01

    The existence of a freshwater-saltwater interface and the circulation flow of saltwater beneath the interface is a well-known phenomenon found at coastal aquifers. This flow is a natural phenomenon that occurs due to density differences between fresh groundwater and the saltwater body. The goals of this research are to use analytical, numerical, and physical models in order to examine the configuration of the freshwater-saltwater interface and the density-driven flow patterns within a coastal aquifer adjacent to long-term stratified saltwater bodies (e.g. meromictic lake). Such hydrological systems are unique, as they consist of three different water types: the regional fresh groundwater, and low and high salinity brines forming the upper and lower water layers of the stratified water body, respectively. This research also aims to examine the influence of such stratification on hydrogeological processes within the coastal aquifer. The coastal aquifer adjacent to the Dead Sea, under its possible future meromictic conditions, serves as an ideal example to examine these processes. The results show that adjacent to a stratified saltwater body three interfaces between three different water bodies are formed, and that a complex flow system, controlled by the density differences, is created, where three circulation cells are developed. These results are significantly different from the classic circulation cell that is found adjacent to non-stratified water bodies (lakes or oceans). In order to obtain a more generalized insight into the groundwater behavior adjacent to a stratified water body, we used the numerical model to perform sensitivity analysis. The hydrological system was found to be sensitive to three dimensionless parameters: dimensionless density (i.e. the relative density of the three water bodies); dimensionless thickness (i.e. the ratio between the relative thickness of the upper layer and the whole thickness of the lake); and dimensionless flux. The results

  8. Matrix Intensification Alters Avian Functional Group Composition in Adjacent Rainforest Fragments

    PubMed Central

    Deikumah, Justus P.; McAlpine, Clive A.; Maron, Martine

    2013-01-01

    Conversion of farmland land-use matrices to surface mining is an increasing threat to the habitat quality of forest remnants and their constituent biota, with consequences for ecosystem functionality. We evaluated the effects of matrix type on bird community composition and the abundance and evenness within avian functional groups in south-west Ghana. We hypothesized that surface mining near remnants may result in a shift in functional composition of avifaunal communities, potentially disrupting ecological processes within tropical forest ecosystems. Matrix intensification and proximity to the remnant edge strongly influenced the abundance of members of several functional guilds. Obligate frugivores, strict terrestrial insectivores, lower and upper strata birds, and insect gleaners were most negatively affected by adjacent mining matrices, suggesting certain ecosystem processes such as seed dispersal may be disrupted by landscape change in this region. Evenness of these functional guilds was also lower in remnants adjacent to surface mining, regardless of the distance from remnant edge, with the exception of strict terrestrial insectivores. These shifts suggest matrix intensification can influence avian functional group composition and related ecosystem-level processes in adjacent forest remnants. The management of matrix habitat quality near and within mine concessions is important for improving efforts to preserve avian biodiversity in landscapes undergoing intensification such as through increased surface mining. PMID:24058634

  9. Matrix intensification alters avian functional group composition in adjacent rainforest fragments.

    PubMed

    Deikumah, Justus P; McAlpine, Clive A; Maron, Martine

    2013-01-01

    Conversion of farmland land-use matrices to surface mining is an increasing threat to the habitat quality of forest remnants and their constituent biota, with consequences for ecosystem functionality. We evaluated the effects of matrix type on bird community composition and the abundance and evenness within avian functional groups in south-west Ghana. We hypothesized that surface mining near remnants may result in a shift in functional composition of avifaunal communities, potentially disrupting ecological processes within tropical forest ecosystems. Matrix intensification and proximity to the remnant edge strongly influenced the abundance of members of several functional guilds. Obligate frugivores, strict terrestrial insectivores, lower and upper strata birds, and insect gleaners were most negatively affected by adjacent mining matrices, suggesting certain ecosystem processes such as seed dispersal may be disrupted by landscape change in this region. Evenness of these functional guilds was also lower in remnants adjacent to surface mining, regardless of the distance from remnant edge, with the exception of strict terrestrial insectivores. These shifts suggest matrix intensification can influence avian functional group composition and related ecosystem-level processes in adjacent forest remnants. The management of matrix habitat quality near and within mine concessions is important for improving efforts to preserve avian biodiversity in landscapes undergoing intensification such as through increased surface mining.

  10. Dynamic Nucleotide Mutation Gradients and Control Region Usage in Squamate Reptile Mitochondrial Genomes

    PubMed Central

    Castoe, T.A.; Gu, W.; de Koning, A.P.J.; Daza, J.M.; Jiang, Z.J.; Parkinson, C.L.; Pollock, D.D.

    2010-01-01

    Gradients of nucleotide bias and substitution rates occur in vertebrate mitochondrial genomes due to the asymmetric nature of the replication process. The evolution of these gradients has previously been studied in detail in primates, but not in other vertebrate groups. From the primate study, the strengths of these gradients are known to evolve in ways that can substantially alter the substitution process, but it is unclear how rapidly they evolve over evolutionary time or how different they may be in different lineages or groups of vertebrates. Given the importance of mitochondrial genomes in phylogenetics and molecular evolutionary research, a better understanding of how asymmetric mitochondrial substitution gradients evolve would contribute key insights into how this gradient evolution may mislead evolutionary inferences, and how it may also be incorporated into new evolutionary models. Most snake mitochondrial genomes have an additional interesting feature, 2 nearly identical control regions, which vary among different species in the extent that they are used as origins of replication. Given the expanded sampling of complete snake genomes currently available, together with 2 additional snakes sequenced in this study, we reexamined gradient strength and CR usage in alethinophidian snakes as well as several lizards that possess dual CRs. Our results suggest that nucleotide substitution gradients (and corresponding nucleotide bias) and CR usage are highly labile over the ∼200 m.y. of squamate evolution, and demonstrate greater overall variability than previously shown in primates. The evidence for the existence of such gradients, and their ability to evolve rapidly and converge among unrelated species suggests that gradient dynamics could easily mislead phylogenetic and molecular evolutionary inferences, and argues strongly that these dynamics should be incorporated into phylogenetic models. PMID:20215734

  11. Spatiotemporal evolution of Calophaca (Fabaceae) reveals multiple dispersals in the Central Asian mountains and adjacent regions

    Treesearch

    Ming-Li Zhang; Zhi-Bin Wen; Peter W. Fritsch; Stewart C. Sanderson

    2015-01-01

    The Central Asian flora plays a significant role in Eurasia and the Northern Hemisphere. Calophaca, a member of this flora, includes eight currently recognized species, and is centered in Central Asia, with some taxa extending into adjacent areas. A phylogenetic analysis of the genus utilizing nuclear ribosomal ITS and plastid trnS-trnG and rbcL sequences was carried...

  12. A survey of copy number variation in the porcine genome detected from whole-genome sequence

    USDA-ARS's Scientific Manuscript database

    An important challenge to post-genomic biology is relating observed phenotypic variation to the underlying genotypic variation. Genome-wide association studies (GWAS) have made thousands of connections between single nucleotide polymorphisms (SNPs) and phenotypes, implicating regions of the genome t...

  13. First complete genome sequences of genogroup V, genotype 3 porcine sapoviruses: common 5'-terminal genomic feature of sapoviruses.

    PubMed

    Oka, Tomoichiro; Doan, Yen Hai; Shimoike, Takashi; Haga, Kei; Takizawa, Takenori

    2017-12-01

    Sapoviruses (SaVs) are enteric viruses and have been detected in various mammals. They are divided into multiple genogroups and genotypes based on the entire major capsid protein (VP1) encoding region sequences. In this study, we determined the first complete genome sequences of two genogroup V, genotype 3 (GV.3) SaV strains detected from swine fecal samples, in combination with Illumina MiSeq sequencing of the libraries prepared from viral RNA and PCR products. The lengths of the viral genome (7494 nucleotides [nt] excluding polyA tail) and short 5'-untranslated region (14 nt) as well as two predicted open reading frames are similar to those of other SaVs. The amino acid differences between the two porcine SaVs are most frequent in the central region of the VP1-encoding region. A stem-loop structure which was predicted in the first 41 nt of the 5'-terminal region of GV.3 SaVs and the other available complete genome sequences of SaVs may have a critical role in viral genome replication. Our study provides complete genome sequences of rarely reported GV.3 SaV strains and highlights the common 5'-terminal genomic feature of SaVs detected from different mammalian species.

  14. Genome-wide analysis of murine renal distal convoluted tubular cells for the target genes of mineralocorticoid receptor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ueda, Kohei; Fujiki, Katsunori; Shirahige, Katsuhiko

    Highlights: • We define a target gene of MR as that with MR-binding to the adjacent region of DNA. • We use ChIP-seq analysis in combination with microarray. • We, for the first time, explore the genome-wide binding profile of MR. • We reveal 5 genes as the direct target genes of MR in the renal epithelial cell-line. - Abstract: Background and objective: Mineralocorticoid receptor (MR) is a member of nuclear receptor family proteins and contributes to fluid homeostasis in the kidney. Although aldosterone-MR pathway induces several gene expressions in the kidney, it is often unclear whether the gene expressions are accompanied by direct regulations of MR through its binding to the regulatory region of each gene. The purpose of this study is to identify the direct target genes of MR in a murine distal convoluted tubular epithelial cell-line (mDCT). Methods: We analyzed the DNA samples of mDCT cells overexpressing 3xFLAG-hMR after treatment with 10⁻⁷ M aldosterone for 1 h by chromatin immunoprecipitation with deep-sequence (ChIP-seq) and mRNA of the cell-line with treatment of 10⁻⁷ M aldosterone for 3 h by microarray. Results: 3xFLAG-hMR overexpressed in mDCT cells accumulated in the nucleus in response to 10⁻⁹ M aldosterone. Twenty-five genes were indicated as the candidate target genes of MR by ChIP-seq and microarray analyses. Five genes, Sgk1, Fkbp5, Rasl12, Tns1 and Tsc22d3 (Gilz), were validated as the direct target genes of MR by quantitative RT-qPCR and ChIP-qPCR. MR binding regions adjacent to Ctgf and Serpine1 were also validated. Conclusions: We, for the first time, captured the genome-wide distribution of MR in mDCT cells and, furthermore, identified five MR target genes in the cell-line. These results will contribute to further studies on the mechanisms of kidney diseases.

  15. Transposon Mutagenesis of the Zika Virus Genome Highlights Regions Essential for RNA Replication and Restricted for Immune Evasion.

    PubMed

    Fulton, Benjamin O; Sachs, David; Schwarz, Megan C; Palese, Peter; Evans, Matthew J

    2017-08-01

    The molecular constraints affecting Zika virus (ZIKV) evolution are not well understood. To investigate ZIKV genetic flexibility, we used transposon mutagenesis to add 15-nucleotide insertions throughout the ZIKV MR766 genome and subsequently deep sequenced the viable mutants. Few ZIKV insertion mutants replicated, which likely reflects a high degree of functional constraints on the genome. The NS1 gene exhibited distinct mutational tolerances at different stages of the screen. This result may define regions of the NS1 protein that are required for the different stages of the viral life cycle. The ZIKV structural genes showed the highest degree of insertional tolerance. Although the envelope (E) protein exhibited particular flexibility, the highly conserved envelope domain II (EDII) fusion loop of the E protein was intolerant of transposon insertions. The fusion loop is also a target of pan-flavivirus antibodies that are generated against other flaviviruses and neutralize a broad range of dengue virus and ZIKV isolates. The genetic restrictions identified within the epitopes in the EDII fusion loop likely explain the sequence and antigenic conservation of these regions in ZIKV and among multiple flaviviruses. Thus, our results provide insights into the genetic restrictions on ZIKV that may affect the evolution of this virus. IMPORTANCE Zika virus recently emerged as a significant human pathogen. Determining the genetic constraints on Zika virus is important for understanding the factors affecting viral evolution. We used a genome-wide transposon mutagenesis screen to identify where mutations were tolerated in replicating viruses. We found that the genetic regions involved in RNA replication were mostly intolerant of mutations. The genes coding for structural proteins were more permissive to mutations. Despite the flexibility observed in these regions, we found that epitopes bound by broadly reactive antibodies were genetically constrained. 
This finding may explain

  16. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems

    PubMed Central

    Wade, Len J.; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K.; Reddy, K. R. Kamalnath; Varma, C. Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R.; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H. E.; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L.; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located; on chromosome 1 (39.7–40.7 Mb) and on chromosome 8 (20.3–21.9 Mb). Across experiments, the soil type/ growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems points to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  17. Draft Genome Sequence of Bacillus sp. GZT, a 2,4,6-Tribromophenol-Degrading Strain Isolated from the River Sludge of an Electronic Waste-Dismantling Region

    PubMed Central

    Liang, Zhishu; Li, Guiying; Das, Ranjit

    2016-01-01

    Here, we report the draft genome sequence of Bacillus sp. strain GZT, a 2,4,6-tribromophenol (TBP)-degrading bacterium previously isolated from an electronic waste-dismantling region. The draft genome sequence is 5.18 Mb and has a G+C content of 35.1%. This is the first genome report of a brominated flame retardant-degrading strain. PMID:27257197

  18. Using Pool-seq to Search for Genomic Regions Affected by Hybrid Inviability in the copepod T. californicus.

    PubMed

    Lima, Thiago G; Willett, Christopher S

    2018-05-11

    The formation of reproductive barriers between allopatric populations involves the accumulation of incompatibilities that lead to intrinsic postzygotic isolation. The evolution of these incompatibilities is usually explained by the Dobzhansky-Muller model, where epistatic interactions that arise within the diverging populations, lead to deleterious interactions when they come together in a hybrid genome. These incompatibilities can lead to hybrid inviability, killing individuals with certain genotypic combinations, and causing the population's allele frequency to deviate from Mendelian expectations. Traditionally, hybrid inviability loci have been detected by genotyping individuals at different loci across the genome. However, this method becomes time consuming and expensive as the number of markers or individuals increases. Here, we test if a Pool-seq method can be used to scan the genome of F2 hybrids to detect genomic regions that are affected by hybrid inviability. We survey the genome of hybrids between 2 populations of the copepod Tigriopus californicus, and show that this method has enough power to detect even small changes in allele frequency caused by hybrid inviability. We show that allele frequency estimates in Pool-seq can be affected by the sampling of alleles from the pool of DNA during the library preparation and sequencing steps and that special considerations must be taken when aligning hybrid reads to a reference when the populations/species are divergent.
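
    To make the idea of a deviation from Mendelian expectations concrete, the sketch below compares read counts for the two parental alleles at one site against an expected frequency of 0.5, using a one-degree-of-freedom chi-square test. The expected value of 0.5, the function name, and the treatment of reads as independent draws are all assumptions made for illustration; as the abstract itself notes, sampling during library preparation and sequencing adds variance that this simple test ignores.

```python
import math


def mendelian_deviation(reads_a: int, reads_b: int, expected_a: float = 0.5):
    """Return (observed frequency of allele A, chi-square statistic, p-value).

    reads_a / reads_b are read counts supporting each parental allele at one
    site; expected_a is the assumed Mendelian frequency of allele A.
    """
    total = reads_a + reads_b
    if total == 0:
        raise ValueError("no reads at this site")
    if not 0.0 < expected_a < 1.0:
        raise ValueError("expected_a must lie strictly between 0 and 1")
    exp_a = total * expected_a
    exp_b = total - exp_a
    chi2 = (reads_a - exp_a) ** 2 / exp_a + (reads_b - exp_b) ** 2 / exp_b
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return reads_a / total, chi2, p_value
```

    Because pooled reads are not independent samples of individuals, p-values from this test would tend to be too small in practice; a real analysis would need a variance model that accounts for pool size and sequencing depth.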

  19. Genome rearrangement shapes Prochlorococcus ecological adaptation.

    PubMed

    Yan, Wei; Wei, Shuzhen; Wang, Qiong; Xiao, Xilin; Zeng, Qinglu; Jiao, Nianzhi; Zhang, Rui

    2018-06-18

    Prochlorococcus is the most abundant and smallest known free-living photosynthetic microorganism and is a key player in marine ecosystems and biogeochemical cycles. Prochlorococcus can be broadly divided into high-light-adapted (HL) and low-light-adapted (LL) clades. In this study, we isolated two low-light-adapted I (LLI) strains from the western Pacific Ocean and obtained their genomic data. We reconstructed Prochlorococcus evolution based on genome rearrangement. Our results showed that genome rearrangement might have played an important role in Prochlorococcus evolution. We also found that the Prochlorococcus clades with streamlined genomes maintained relatively high synteny throughout most of their genomes, and several regions served as rearrangement hotspots. Backbone analysis showed that different clades shared a conserved backbone but also had clade-specific regions, and the genes in these regions were associated with ecological adaptations. Importance: Prochlorococcus, the most abundant and smallest known free-living photosynthetic microorganism, plays a key role in marine ecosystems and biogeochemical cycles. The Prochlorococcus genome evolution is a fundamental question related to how Prochlorococcus clades adapted to different ecological niches. Recent studies revealed that the gene gain and loss is crucial to the clade differentiation. The significance of our research is that we interpreted the Prochlorococcus genome evolution from the perspective of genome structure, and associated the genome rearrangement with the Prochlorococcus clade differentiation and subsequent ecological adaptation. Copyright © 2018 Yan et al.

  20. Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome.

    PubMed

    Wang, Yi; Liu, Xianju; Ren, Chong; Zhong, Gan-Yuan; Yang, Long; Li, Shaohua; Liang, Zhenchang

    2016-04-21

    CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of humans, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific target sites for CRISPR/Cas9 have been computationally identified for several annual model and crop species, but such sites have not been reported for perennial, woody fruit species. In this study, we identified and characterized five types of CRISPR/Cas9 target sites in the widely cultivated grape species Vitis vinifera and developed a user-friendly database for editing grape genomes in the future. A total of 35,767,960 potential CRISPR/Cas9 target sites were identified from grape genomes in this study. Among them, 22,597,817 target sites were mapped to specific genomic locations and 7,269,788 were found to be highly specific. Protospacers and PAMs were found to distribute uniformly and abundantly in the grape genomes. They were present in all the structural elements of genes with the coding region having the highest abundance. Five PAM types, TGG, AGG, GGG, CGG and NGG, were observed. With the exception of the NGG type, they were abundantly present in the grape genomes. Synteny analysis of similar genes revealed that the synteny of protospacers matched the synteny of homologous genes. A user-friendly database containing protospacers and detailed information of the sites was developed and is available for public use at the Grape-CRISPR website ( http://biodb.sdau.edu.cn/gc/index.html ). Grape genomes harbour millions of potential CRISPR/Cas9 target sites. These sites are widely distributed among and within chromosomes with predominant abundance in the coding regions of genes. We developed a publicly-accessible Grape-CRISPR database for facilitating the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes. 
Among
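
    The computational identification of target sites described in this record can be pictured as a scan for a protospacer followed by an NGG PAM. The sketch below is a minimal illustration under stated assumptions: a 20-nt protospacer (a common choice for Cas9 from Streptococcus pyogenes, not stated in the abstract), forward strand only, and no specificity filtering. The function name is hypothetical and this is not the Grape-CRISPR pipeline.

```python
import re

PROTOSPACER_LEN = 20  # assumed length; the abstract does not give one

# Lookahead so that overlapping candidate sites are all reported.
SITE_PATTERN = re.compile(
    r"(?=([ACGT]{%d})([ACGT]GG))" % PROTOSPACER_LEN
)


def find_target_sites(seq: str) -> list:
    """Return (start, protospacer, pam) tuples for every protospacer + NGG site.

    start is the 0-based position of the protospacer on the forward strand.
    """
    sites = []
    for match in SITE_PATTERN.finditer(seq.upper()):
        sites.append((match.start(), match.group(1), match.group(2)))
    return sites


def count_by_pam(sites: list) -> dict:
    """Tally sites by concrete PAM sequence (AGG, CGG, GGG or TGG)."""
    counts = {}
    for _, _, pam in sites:
        counts[pam] = counts.get(pam, 0) + 1
    return counts
```

    Mapping each candidate back to the genome and discarding protospacers with near-identical copies elsewhere would be the next step toward the "highly specific" subset the abstract mentions; that filtering is not shown here.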

  1. Interpreting Mammalian Evolution using Fugu Genome Comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stubbs, L; Ovcharenko, I; Loots, G G

    2004-04-02

    Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.

  2. Genome-Wide Expression Profiling of Complex Regional Pain Syndrome

    PubMed Central

    Jin, Eun-Heui; Zhang, Enji; Ko, Youngkwon; Sim, Woo Seog; Moon, Dong Eon; Yoon, Keon Jung; Hong, Jang Hee; Lee, Won Hyung

    2013-01-01

    Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not been reported yet. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene that, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10⁻⁴). The up-regulation of MMP9 gene in the blood may be related to the pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of the differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression. PMID:24244504

  3. Genome-wide association analysis of milk yield traits in Nordic Red Cattle using imputed whole genome sequence variants.

    PubMed

    Iso-Touru, T; Sahana, G; Guldbrandtsen, B; Lund, M S; Vilkki, J

    2016-03-22

    The Nordic Red Cattle consisting of three different populations from Finland, Sweden and Denmark are under a joint breeding value estimation system. The long history of recording of production and health traits offers a great opportunity to study production traits and identify causal variants behind them. In this study, we used whole genome sequence level data from 4280 progeny tested Nordic Red Cattle bulls to scan the genome for loci affecting milk, fat and protein yields. Using a genome-wise significance threshold, regions on Bos taurus chromosomes 5, 14, 23, 25 and 26 were associated with fat yield. Regions on chromosomes 5, 14, 16, 19, 20 and 25 were associated with milk yield and chromosomes 5, 14 and 25 had regions associated with protein yield. Significantly associated variations were found in 227 genes for fat yield, 72 genes for milk yield and 30 genes for protein yield. Ingenuity Pathway Analysis was used to identify networks connecting these genes displaying significant hits. When compared to previously mapped genomic regions associated with fertility, significantly associated variations were found in 5 genes common for fat yield and fertility, thus linking these two traits via biological networks. This is the first time when whole genome sequence data is utilized to study genomic regions affecting milk production in the Nordic Red Cattle population. Sequence level data offers the possibility to study quantitative traits in detail but still cannot unambiguously reveal which of the associated variations is causative. Linkage disequilibrium creates difficulties to pinpoint the causative genes and variations. One solution to overcome these difficulties is the identification of the functional gene networks and pathways to reveal important interacting genes as candidates for the observed effects. This information on target genomic regions may be exploited to improve genomic prediction.

  4. [Phylogenetic analysis of genomes of Vibrio cholerae strains isolated on the territory of Rostov region].

    PubMed

    Kuleshov, K V; Markelov, M L; Dedkov, V G; Vodop'ianov, A S; Kermanov, A V; Pisanov, R V; Kruglikov, V D; Mazrukho, A B; Maleev, V V; Shipulin, G A

    2013-01-01

    The aim was to determine the origin of 2 Vibrio cholerae strains isolated on the territory of Rostov region by using full genome sequencing data. Toxigenic strain 2011EL-301 V. cholerae O1 El Tor Inaba No. 301 (ctxAB+, tcpA+) and nontoxigenic strain V. cholerae O1 Ogawa P-18785 (ctxAB-, tcpA+) were studied. Sequencing was carried out on the MiSeq platform. Phylogenetic analysis of the genomes obtained was based on comparison of the conserved part of the studied genomes and 54 previously sequenced genomes. The genome of strain 2011EL-301 was represented by 164 contigs with an average coverage of 100 and an N50 of 132 kb; that of strain P-18785 by 159 contigs with a coverage of 69 and an N50 of 83 kb. The contigs obtained for strain 2011EL-301 were deposited in the DDBJ/EMBL/GenBank databases under accession code AJFN02000000, and those for strain P-18785 under ANHS00000000. In total, 716 protein-coding orthologous genes were detected. Based on phylogenetic analysis, strain P-18785 belongs to the PG-1 subgroup (a group of predecessor strains of the 7th pandemic). Strain 2011EL-301 belongs to the group of 7th-pandemic strains and falls into the cluster with later isolates that are associated with cases of cholera in South Africa and cases of import of cholera to the USA from Pakistan. The data obtained make it possible to establish phylogenetic connections with V. cholerae strains isolated earlier.
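
    The N50 values quoted for the two assemblies summarize contig length: N50 is the contig length at which, counting from the longest contig down, at least half of the assembled bases have been covered. The sketch below is a minimal illustration of that definition only; the function name and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig to the shortest, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in kb), not data from the abstract.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
```

    A higher N50 at a similar contig count generally indicates a more contiguous assembly, which is how the two reported values (132 kb and 83 kb) would usually be read.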

  5. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    PubMed

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  6. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  7. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.

    PubMed

    Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H; Dewey, Colin; Keleş, Sündüz

    2011-07-01

    Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
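
    The idea of allocating multi-reads as fractional counts can be sketched as follows. The abstract does not specify the weighting scheme, so this example assumes a simple one-pass rule: each multi-read is split across its candidate bins in proportion to the uni-read count already observed in each bin, plus a pseudocount. The function name, the bin labels, and the pseudocount are all hypothetical and should not be read as the authors' method.

```python
from collections import defaultdict


def allocate_reads(uni_reads, multi_reads, pseudocount=1.0):
    """Combine uni-reads and multi-reads into per-bin fractional counts.

    uni_reads   -- list of bin ids, one per uniquely mapped read
    multi_reads -- list of lists; each inner list holds the candidate
                   bin ids of one multi-mapping read
    pseudocount -- assumed smoothing term so bins without uni-reads
                   can still receive a share
    """
    counts = defaultdict(float)
    for bin_id in uni_reads:
        counts[bin_id] += 1.0

    # Weights are based on uni-read evidence only (a simplifying assumption).
    uni_counts = dict(counts)

    for candidates in multi_reads:
        weights = [uni_counts.get(b, 0.0) + pseudocount for b in candidates]
        total = sum(weights)
        for bin_id, weight in zip(candidates, weights):
            # Each multi-read contributes exactly one count in total.
            counts[bin_id] += weight / total
    return dict(counts)


# Hypothetical toy input: three uni-reads in bin "A", one in "B",
# and one multi-read that aligns equally well to "A" and "C".
result = allocate_reads(["A", "A", "A", "B"], [["A", "C"]])
```

    Because every multi-read adds exactly one count in total, the overall depth rises by the number of multi-reads, which is consistent with the abstract's statement that including them increases sequencing depth.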

  8. Targeted Sequencing of Venom Genes from Cone Snail Genomes Improves Understanding of Conotoxin Molecular Evolution

    PubMed Central

    Mahardika, Gusti N

    2018-01-01

    To expand our capacity to discover venom sequences from the genomes of venomous organisms, we applied targeted sequencing techniques to selectively recover venom gene superfamilies and nontoxin loci from the genomes of 32 cone snail species (family, Conidae), a diverse group of marine gastropods that capture their prey using a cocktail of neurotoxic peptides (conotoxins). We were able to successfully recover conotoxin gene superfamilies across all species with high confidence (> 100× coverage) and used these data to provide new insights into conotoxin evolution. First, we found that conotoxin gene superfamilies are composed of one to six exons and are typically short in length (mean = ∼85 bp). Second, we expanded our understanding of the following genetic features of conotoxin evolution: 1) positive selection, where exons coding the mature toxin region were often three times more divergent than their adjacent noncoding regions, 2) expression regulation, with comparisons to transcriptome data showing that cone snails only express a fraction of the genes available in their genome (24–63%), and 3) extensive gene turnover, where Conidae species varied from 120 to 859 conotoxin gene copies. Finally, using comparative phylogenetic methods, we found that while diet specificity did not predict patterns of conotoxin evolution, dietary breadth was positively correlated with total conotoxin gene diversity. Overall, the targeted sequencing technique demonstrated here has the potential to radically increase the pace at which venom gene families are sequenced and studied, reshaping our ability to understand the impact of genetic changes on ecologically relevant phenotypes and subsequent diversification. PMID:29514313

  9. 47 CFR 101.1421 - Coordination of adjacent area MVDDS stations.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... SPECIAL RADIO SERVICES FIXED MICROWAVE SERVICES Multichannel Video Distribution and Data Service Rules for... compatible with adjacent and co-channel operations in the adjacent areas on all its frequencies; and (2... adjacent and co-channel operations in adjacent areas. (b) Harmful interference to public safety stations...

  10. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  11. Joint genome-wide association study for milk fatty acid traits in Chinese and Danish Holstein populations.

    PubMed

    Li, X; Buitenhuis, A J; Lund, M S; Li, C; Sun, D; Zhang, Q; Poulsen, N A; Su, G

    2015-11-01

    The identification of causal genes or genomic regions associated with fatty acids (FA) will enhance our understanding of the pathways underlying FA synthesis and provide opportunities for changing milk fat composition through a genetic approach. The linkage disequilibrium between adjacent markers is highly consistent between the Chinese and Danish Holstein populations, such that a joint genome-wide association study (GWAS) can be performed. In this study, a joint GWAS was performed for 16 milk FA traits based on data of 784 Chinese and 371 Danish Holstein cows genotyped by a high-density bovine single nucleotide polymorphism (SNP) array. A total of 486,464 SNP markers on 29 bovine autosomes were used. Bonferroni corrections were applied to adjust the significance thresholds for multiple testing at the genome- and chromosome-wide levels. According to the analysis of either the Chinese or Danish data individually, the total numbers of overlapping SNP that were significant at the chromosome level were 94 for C14:1, 208 for the C14 index, and 1 for C18:0. Joint analysis using the combined data of the 2 populations detected greater numbers of significant SNP compared with either of the individual populations alone for 7 and 10 traits at the genome- and chromosome-wide significance levels, respectively. Greater numbers of significant SNP were detected for C18:0 and the C18 index in the Chinese population compared with the joint analysis. Sixty-five significant SNP across all traits had significantly different effects in the 2 populations. Ten FA were influenced by a quantitative trait locus (QTL) region including DGAT1. Both C14:1 and the C14 index were influenced by a QTL region including SCD1 in the combined population. Other QTL regions also showed significant associations with the studied FA. A large region (14.9-24.9 Mbp) in BTA26 significantly influenced C14:1 and the C14 index in both populations, most likely due to the SNP in SCD1. A QTL region (69.97-73.69 Mbp
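
    As a rough illustration of the Bonferroni adjustment mentioned in this record, the per-test significance threshold is the nominal level divided by the number of tests. The abstract gives the marker count (486,464) but not the nominal level or the per-chromosome marker counts, so the alpha of 0.05 and the chromosome marker count below are assumptions made purely for the sketch.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold under a Bonferroni correction.

    alpha is the nominal family-wise error rate; 0.05 is an assumed
    default, not a value stated in the abstract.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Genome-wide level: all 486,464 SNP markers reported in the abstract.
genome_wide = bonferroni_threshold(486_464)

# Chromosome-wide level: uses only the markers on one chromosome.
# 20,000 is a hypothetical count chosen for illustration.
chromosome_wide = bonferroni_threshold(20_000)
```

    Under these assumptions the genome-wide threshold is about 1.03e-7, and any chromosome-wide threshold is less stringent because fewer tests are counted, which matches the abstract's report of more significant traits at the chromosome-wide level.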

  12. Genome-wide association study and accuracy of genomic prediction for teat number in Duroc pigs using genotyping-by-sequencing.

    PubMed

    Tan, Cheng; Wu, Zhenfang; Ren, Jiangli; Huang, Zhuolin; Liu, Dewu; He, Xiaoyan; Prakapenka, Dzianis; Zhang, Ran; Li, Ning; Da, Yang; Hu, Xiaoxiang

    2017-03-29

    The number of teats in pigs is related to a sow's ability to rear piglets to weaning age. Several studies have identified genes and genomic regions that affect teat number in swine but few common results were reported. The objective of this study was to identify genetic factors that affect teat number in pigs, evaluate the accuracy of genomic prediction, and evaluate the contribution of significant genes and genomic regions to genomic broad-sense heritability and prediction accuracy using 41,108 autosomal single nucleotide polymorphisms (SNPs) from genotyping-by-sequencing on 2936 Duroc boars. Narrow-sense heritability and dominance heritability of teat number estimated by genomic restricted maximum likelihood were 0.365 ± 0.030 and 0.035 ± 0.019, respectively. The accuracy of genomic predictions, calculated as the average correlation between the genomic best linear unbiased prediction and phenotype in a tenfold validation study, was 0.437 ± 0.064 for the model with additive and dominance effects and 0.435 ± 0.064 for the model with additive effects only. Genome-wide association studies (GWAS) using three methods of analysis identified 85 significant SNP effects for teat number on chromosomes 1, 6, 7, 10, 11, 12 and 14. The region between 102.9 and 106.0 Mb on chromosome 7, which was reported in several studies, had the most significant SNP effects in or near the PTGR2, FAM161B, LIN52, VRTN, FCF1, AREL1 and LRRC74A genes. This region accounted for 10.0% of the genomic additive heritability and 8.0% of the accuracy of prediction. The second most significant chromosome region not reported by previous GWAS was the region between 77.7 and 79.7 Mb on chromosome 11, where SNPs in the FGF14 gene had the most significant effect and accounted for 5.1% of the genomic additive heritability and 5.2% of the accuracy of prediction. The 85 significant SNPs accounted for 28.5 to 28.8% of the genomic additive heritability and 35.8 to 36.8% of the accuracy of

  13. Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions

    PubMed Central

    Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile JW; de Moor, Marleen HM; Madden, Pamela AF; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

    2013-01-01

    Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10⁻⁶, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptibility locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697

  14. The Complete Mitochondrial Genome of Ctenoptilum vasava (Lepidoptera: Hesperiidae: Pyrginae) and Its Phylogenetic Implication

    PubMed Central

    Hao, Jiasheng; Sun, Qianqian; Zhao, Huabin; Sun, Xiaoyan; Gai, Yonghua; Yang, Qun

    2012-01-01

    We here report the first complete mitochondrial (mt) genome of a skipper, Ctenoptilum vasava Moore, 1865 (Lepidoptera: Hesperiidae: Pyrginae). The mt genome of the skipper is a circular molecule of 15,468 bp, containing 2 ribosomal RNA genes, 24 putative transfer RNA (tRNA) genes, including an extra copy of trnS (AGN) and a tRNA-like insertion trnL (UUR), 13 protein-coding genes and an AT-rich region. All protein-coding genes (PCGs) are initiated by ATN codons and terminated by the typical stop codon TAA or TAG, except for COII which ends with a single T. The intergenic spacer sequence between trnS (AGN) and ND1 genes also contains the ATACTAA motif. The AT-rich region of 429 bp is comprised of nonrepetitive sequences, including the motif ATAGA followed by a 19 bp poly-T stretch, a microsatellite-like (AT)3 (TA)9 element next to the ATTTA motif, and an 11 bp poly-A adjacent to tRNAs. Phylogenetic analyses (ML and BI methods) showed that Papilionoidea is not a natural group, and Hesperioidea is placed within the Papilionoidea as a sister to ((Pieridae + Lycaenidae) + Nymphalidae) while Papilionoidae is paraphyletic to Hesperioidea. This result is remarkably different from the traditional view where Papilionoidea and Hesperioidea are considered as two distinct superfamilies. PMID:22577351

  15. QTL-seq approach identified genomic regions and diagnostic markers for rust and late leaf spot resistance in groundnut (Arachis hypogaea L.).

    PubMed

    Pandey, Manish K; Khan, Aamir W; Singh, Vikas K; Vishwakarma, Manish K; Shasidhar, Yaduru; Kumar, Vinay; Garg, Vanika; Bhat, Ramesh S; Chitikineni, Annapurna; Janila, Pasupuleti; Guo, Baozhu; Varshney, Rajeev K

    2017-08-01

    Rust and late leaf spot (LLS) are the two major foliar fungal diseases in groundnut, and their co-occurrence leads to significant yield loss in addition to the deterioration of fodder quality. To identify candidate genomic regions controlling resistance to rust and LLS, a whole-genome resequencing (WGRS)-based approach referred to as 'QTL-seq' was deployed. A total of 231.67 Gb raw and 192.10 Gb of clean sequence data were generated through WGRS of the resistant parent and the resistant and susceptible bulks for rust and LLS. Sequence analysis of bulks for rust and LLS with reference-guided resistant parent assembly identified 3136 single-nucleotide polymorphisms (SNPs) for rust and 66 SNPs for LLS with a read depth of ≥7 in the identified genomic region on pseudomolecule A03. Detailed analysis identified 30 nonsynonymous SNPs affecting 25 candidate genes for rust resistance, and 14 intronic and three synonymous SNPs affecting nine candidate genes for LLS resistance. Subsequently, allele-specific diagnostic markers were identified for three SNPs for rust resistance and one SNP for LLS resistance. Genotyping of one RIL population (TAG 24 × GPBD 4) with these four diagnostic markers revealed higher phenotypic variation for these two diseases. These results suggest the usefulness of the QTL-seq approach in precise and rapid identification of candidate genomic regions and development of diagnostic markers for breeding applications. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  16. Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.

    PubMed

    Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng

    2017-01-01

    Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.

  17. Deep whole-genome sequencing of 90 Han Chinese genomes.

    PubMed

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the

  18. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    USDA-ARS's Scientific Manuscript database

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  19. Widespread of horizontal gene transfer in the human genome.

    PubMed

    Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

    2017-04-04

    A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.

  20. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome.

  1. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    PubMed Central

    2011-01-01

    Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii-the diploid source of the wheat D genome, and with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was

  2. DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

    PubMed

    El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

    2007-01-01

    We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.

  3. Re-evaluating the Localization of Sperm-Retained Histones Revealed the Modification-Dependent Accumulation in Specific Genome Regions.

    PubMed

    Yamaguchi, Kosuke; Hada, Masashi; Fukuda, Yuko; Inoue, Erina; Makino, Yoshinori; Katou, Yuki; Shirahige, Katsuhiko; Okada, Yuki

    2018-06-26

    The question of whether retained histones in the sperm genome localize to gene-coding regions or gene deserts has been debated for years. Previous contradictory observations are likely caused by the non-uniform sensitivity of sperm chromatin to micrococcal nuclease (MNase) digestion. Sperm chromatin has a highly condensed but heterogeneous structure and is composed of 90%∼99% protamines and 1%∼10% histones. In this study, we utilized nucleoplasmin (NPM) to improve the solubility of sperm chromatin by removing protamines in vitro. NPM treatment efficiently solubilized histones while maintaining quality and quantity. Chromatin immunoprecipitation sequencing (ChIP-seq) analyses using NPM-treated sperm demonstrated the predominant localization of H4 to distal intergenic regions, whereas modified histones exhibited a modification-dependent preferential enrichment in specific genomic elements, such as H3K4me3 at CpG-rich promoters and H3K9me3 in satellite repeats, respectively, implying the existence of machinery protecting modified histones from eviction. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  4. A Segment of the Apospory-Specific Genomic Region Is Highly Microsyntenic Not Only between the Apomicts Pennisetum squamulatum and Buffelgrass, But Also with a Rice Chromosome 11 Centromeric-Proximal Genomic Region

    PubMed Central

    Gualtieri, Gustavo; Conner, Joann A.; Morishige, Daryl T.; Moore, L. David; Mullet, John E.; Ozias-Akins, Peggy

    2006-01-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory. PMID:16415213

  5. Organizational heterogeneity of vertebrate genomes.

    PubMed

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
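
    The k-mer scoring idea behind Compositional Spectra analysis in the record above can be illustrated with a minimal sketch. It only counts exact occurrences of fixed-length words and normalizes them to frequencies; the function names `kmer_spectrum` and `relative_spectrum` are hypothetical, and this is not necessarily the scoring scheme the authors used.

```python
from collections import Counter


def kmer_spectrum(seq: str, k: int) -> Counter:
    """Count overlapping k-mers in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Ignore windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def relative_spectrum(counts: Counter) -> dict:
    """Convert raw k-mer counts into relative frequencies (empty if no counts)."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


if __name__ == "__main__":
    spectrum = kmer_spectrum("ACGTACNGT", 2)
    print(relative_spectrum(spectrum))
```

    Comparing such frequency vectors between regions or genomes (for example with a distance measure of the reader's choice) is one simple way to express compositional differences.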

  6. Comparative Genomic Analysis for Genetic Variation in Sacbrood Virus of Apis cerana and Apis mellifera Honeybees From Different Regions of Vietnam.

    PubMed

    Reddy, Kondreddy Eswar; Thu, Ha Thi; Yoo, Mi Sun; Ramya, Mummadireddy; Reddy, Bheemireddy Anjana; Lien, Nguyen Thi Kim; Trang, Nguyen Thi Phuong; Duong, Bui Thi Thuy; Lee, Hyun-Jeong; Kang, Seung-Won; Quyen, Dong Van

    2017-09-01

    Sacbrood virus (SBV) is one of the most common viral infections of honeybees. The entire genome sequence for nine SBV infecting honeybees, Apis cerana and Apis mellifera, in Vietnam, namely AcSBV-Viet1, AcSBV-Viet2, AcSBV-Viet3, AmSBV-Viet4, AcSBV-Viet5, AmSBV-Viet6, AcSBV-Viet7, AcSBV-Viet8, and AcSBV-Viet9, was determined. These sequences were aligned with seven previously reported complete genome sequences of SBV from other countries, and various genomic regions were compared. The Vietnamese SBVs (VN-SBVs) shared 91-99% identity with each other, and shared 89-94% identity with strains from other countries. The open reading frames (ORFs) of the VN-SBV genomes differed greatly from those of SBVs from other countries, especially in their VP1 sequences. The AmSBV-Viet6 and AcSBV-Viet9 genome encodes 17 more amino acids within this region than the other VN-SBVs. In a phylogenetic analysis, the strains AmSBV-Viet4, AcSBV-Viet2, and AcSBV-Viet3 were clustered in group with AmSBV-UK, AmSBV-Kor21, and AmSBV-Kor19 strains. Whereas, the strains AmSBV-Viet6 and AcSBV-Viet7 clustered separately with the AcSBV strains from Korea and AcSBV-VietSBM2. And the strains AcSBV-Viet8, AcSBV-Viet1, AcSBV-Viet5, and AcSBV-Viet9 clustered with the AcSBV-India, AcSBV-Kor and AcSBV-VietSBM2. In a Simplot graph, the VN-SBVs diverged stronger in their ORF regions than in their 5' or 3' untranslated regions. The VN-SBVs possess genetic characteristics which are more similar to the Asian AcSBV strains than to AmSBV-UK strain. Taken together, our data indicate that host specificity, geographic distance, and viral cross-infections between different bee species may explain the genetic diversity among the VN-SBVs in A. cerana and A. mellifera and other SBV strains. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America.

  7. Comparative Genomic Analysis for Genetic Variation in Sacbrood Virus of Apis cerana and Apis mellifera Honeybees From Different Regions of Vietnam

    PubMed Central

    Reddy, Kondreddy Eswar; Thu, Ha Thi; Yoo, Mi Sun; Ramya, Mummadireddy; Reddy, Bheemireddy Anjana; Lien, Nguyen Thi Kim; Trang, Nguyen Thi Phuong; Duong, Bui Thi Thuy; Lee, Hyun-Jeong; Kang, Seung-Won

    2017-01-01

    Abstract Sacbrood virus (SBV) is one of the most common viral infections of honeybees. The entire genome sequence for nine SBV infecting honeybees, Apis cerana and Apis mellifera, in Vietnam, namely AcSBV-Viet1, AcSBV-Viet2, AcSBV-Viet3, AmSBV-Viet4, AcSBV-Viet5, AmSBV-Viet6, AcSBV-Viet7, AcSBV-Viet8, and AcSBV-Viet9, was determined. These sequences were aligned with seven previously reported complete genome sequences of SBV from other countries, and various genomic regions were compared. The Vietnamese SBVs (VN-SBVs) shared 91–99% identity with each other, and shared 89–94% identity with strains from other countries. The open reading frames (ORFs) of the VN-SBV genomes differed greatly from those of SBVs from other countries, especially in their VP1 sequences. The AmSBV-Viet6 and AcSBV-Viet9 genome encodes 17 more amino acids within this region than the other VN-SBVs. In a phylogenetic analysis, the strains AmSBV-Viet4, AcSBV-Viet2, and AcSBV-Viet3 were clustered in group with AmSBV-UK, AmSBV-Kor21, and AmSBV-Kor19 strains. Whereas, the strains AmSBV-Viet6 and AcSBV-Viet7 clustered separately with the AcSBV strains from Korea and AcSBV-VietSBM2. And the strains AcSBV-Viet8, AcSBV-Viet1, AcSBV-Viet5, and AcSBV-Viet9 clustered with the AcSBV-India, AcSBV-Kor and AcSBV-VietSBM2. In a Simplot graph, the VN-SBVs diverged stronger in their ORF regions than in their 5′ or 3′ untranslated regions. The VN-SBVs possess genetic characteristics which are more similar to the Asian AcSBV strains than to AmSBV-UK strain. Taken together, our data indicate that host specificity, geographic distance, and viral cross-infections between different bee species may explain the genetic diversity among the VN-SBVs in A. cerana and A. mellifera and other SBV strains. PMID:29117376

  8. A fungal avirulence factor encoded in a highly plastic genomic region triggers partial resistance to septoria tritici blotch.

    PubMed

    Meile, Lukas; Croll, Daniel; Brunner, Patrick C; Plissonneau, Clémence; Hartmann, Fanny E; McDonald, Bruce A; Sánchez-Vallet, Andrea

    2018-04-25

    Cultivar-strain specificity in the wheat-Zymoseptoria tritici pathosystem determines the infection outcome and is controlled by resistance genes on the host side, many of which have been identified. On the pathogen side, however, the molecular determinants of specificity remain largely unknown. We used genetic mapping, targeted gene disruption and allele swapping to characterise the recognition of the new avirulence factor Avr3D1. We then combined population genetic and comparative genomic analyses to characterise the evolutionary trajectory of Avr3D1. Avr3D1 is specifically recognised by wheat cultivars harbouring the Stb7 resistance gene, triggering a strong defence response without preventing pathogen infection and reproduction. Avr3D1 resides in a cluster of putative effector genes located in a genome region populated by independent transposable element insertions. The gene was present in all 132 investigated strains and is highly polymorphic, with 30 different protein variants identified. We demonstrated that specific amino acid substitutions in Avr3D1 led to evasion of recognition. These results demonstrate that quantitative resistance and gene-for-gene interactions are not mutually exclusive. Localising avirulence genes in highly plastic genomic regions probably facilitates accelerated evolution that enables escape from recognition by resistance proteins. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

  9. 33 CFR 80.1395 - Puget Sound and adjacent waters.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 33 Navigation and Navigable Waters 1 2014-07-01 2014-07-01 false Puget Sound and adjacent waters... INTERNATIONAL NAVIGATION RULES COLREGS DEMARCATION LINES Thirteenth District § 80.1395 Puget Sound and adjacent waters. The 72 COLREGS shall apply on all waters of Puget Sound and adjacent waters, including Lake Union...

  10. 33 CFR 80.1395 - Puget Sound and adjacent waters.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 33 Navigation and Navigable Waters 1 2012-07-01 2012-07-01 false Puget Sound and adjacent waters... INTERNATIONAL NAVIGATION RULES COLREGS DEMARCATION LINES Thirteenth District § 80.1395 Puget Sound and adjacent waters. The 72 COLREGS shall apply on all waters of Puget Sound and adjacent waters, including Lake Union...

  11. 33 CFR 80.1395 - Puget Sound and adjacent waters.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 33 Navigation and Navigable Waters 1 2013-07-01 2013-07-01 false Puget Sound and adjacent waters... INTERNATIONAL NAVIGATION RULES COLREGS DEMARCATION LINES Thirteenth District § 80.1395 Puget Sound and adjacent waters. The 72 COLREGS shall apply on all waters of Puget Sound and adjacent waters, including Lake Union...

  12. 33 CFR 80.1395 - Puget Sound and adjacent waters.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 33 Navigation and Navigable Waters 1 2010-07-01 2010-07-01 false Puget Sound and adjacent waters... INTERNATIONAL NAVIGATION RULES COLREGS DEMARCATION LINES Thirteenth District § 80.1395 Puget Sound and adjacent waters. The 72 COLREGS shall apply on all waters of Puget Sound and adjacent waters, including Lake Union...

  13. 33 CFR 80.1395 - Puget Sound and adjacent waters.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 33 Navigation and Navigable Waters 1 2011-07-01 2011-07-01 false Puget Sound and adjacent waters... INTERNATIONAL NAVIGATION RULES COLREGS DEMARCATION LINES Thirteenth District § 80.1395 Puget Sound and adjacent waters. The 72 COLREGS shall apply on all waters of Puget Sound and adjacent waters, including Lake Union...

  14. Using Markov chains of nucleotide sequences as a possible precursor to predict functional roles of human genome: a case study on inactive chromatin regions.

    PubMed

    Lee, K-E; Lee, E-J; Park, H-S

    2016-08-30

    Recent advances in computational epigenetics have provided new opportunities to evaluate n-gram probabilistic language models. In this paper, we describe a systematic genome-wide approach for predicting functional roles in inactive chromatin regions by using a sequence-based Markovian chromatin map of the human genome. We demonstrate that Markov chains of sequences can be used as a precursor to predict functional roles in heterochromatin regions and provide an example comparing two publicly available chromatin annotations of large-scale epigenomics projects: ENCODE project consortium and Roadmap Epigenomics consortium.
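
    As a rough illustration of what a Markov chain over nucleotide sequences involves, the sketch below estimates first-order transition probabilities from a sequence and scores another sequence under them. It is an assumption-laden toy: the function names, the first-order choice, and the add-one pseudocount are all hypothetical and are not taken from the record above.

```python
import math

ALPHABET = "ACGT"


def transition_probs(seq: str, pseudocount: float = 1.0) -> dict:
    """Estimate first-order transition probabilities P(next base | current base)."""
    # Start every cell at the pseudocount so unseen transitions keep nonzero mass.
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    seq = seq.upper()
    for a, b in zip(seq, seq[1:]):
        if a in counts and b in counts[a]:  # skip pairs with ambiguous bases
            counts[a][b] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs


def log_likelihood(seq: str, probs: dict) -> float:
    """Sum log transition probabilities along seq, ignoring ambiguous bases."""
    seq = seq.upper()
    return sum(
        math.log(probs[a][b])
        for a, b in zip(seq, seq[1:])
        if a in probs and b in probs[a]
    )


if __name__ == "__main__":
    model = transition_probs("ACACAC")
    print(model["A"]["C"], log_likelihood("ACAC", model))
```

    Comparing the log-likelihood of a region under models trained on different annotated classes is one common way such chains are used for classification; whether the study above proceeds this way is not stated in the abstract.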

  15. Cryopreserved embryo transfer: adjacent or non-adjacent to failed fresh long GnRH-agonist protocol IVF cycle.

    PubMed

    Volodarsky-Perel, Alexander; Eldar-Geva, Talia; Holzer, Hananel E G; Schonberger, Oshrat; Reichman, Orna; Gal, Michael

    2017-03-01

    The optimal time to perform cryopreserved embryo transfer (CET) after a failed oocyte retrieval-embryo transfer (OR-ET) cycle is unknown. Similar clinical pregnancy rates were recently reported in immediate and delayed CET, performed after failed fresh OR-ET, in cycles with the gonadotrophin-releasing hormone (GnRH) antagonist protocol. This study compared outcomes of CET performed adjacently (<50 days, n = 67) and non-adjacently (≥50 to 120 days, n = 62) to the last OR-day of cycles with the GnRH agonist down-regulation protocol. Additional inclusion criteria were patients' age 20-38 years, the transfer of only 1-2 cryopreserved embryos, one treatment cycle per patient and artificial preparation for CET. Significantly higher implantation, clinical pregnancy and live birth rates were found in the non-adjacent group than in the adjacent group: 30.5% versus 11.3% (P = 0.001), 41.9% versus 17.9% (P = 0.003) and 32.3% versus 13.4% (P = 0.01), respectively. These results support the postponement of CET after a failed OR-ET for at least one menstrual cycle, when a preceding long GnRH-agonist protocol is used. Copyright © 2016 Reproductive Healthcare Ltd. Published by Elsevier Ltd. All rights reserved.

  16. Comparative Genomics in Drosophila.

    PubMed

    Oti, Martin; Pane, Attilio; Sammeth, Michael

    2018-01-01

    Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have tremendously contributed to unveil the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes from 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data that were produced over the past 15 years is far from being fully explored. In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Setting off from alignments of the entire genomic sequences, the degree of conservation can be separately evaluated for every region of the genome, providing already first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed nevertheless apparently noncoding regions, which were once considered "junk" DNA. 
Eventually, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding

  17. Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes).

    PubMed

    Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

    2011-09-01

    Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.

  18. Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes)

    PubMed Central

    Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

    2011-01-01

    Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341

  19. Genomic Organization of the Murine Miller–Dieker/Lissencephaly Region: Conservation of Linkage with the Human Region

    PubMed Central

    Hirotsune, Shinji; Pack, Svetlana D.; Chong, Samuel S.; Robbins, Christiane M.; Pavan, William J.; Ledbetter, David H.; Wynshaw-Boris, Anthony

    1997-01-01

    Several human syndromes are associated with haploinsufficiency of chromosomal regions secondary to microdeletions. Isolated lissencephaly sequence (ILS), a human developmental disease characterized by a smooth cerebral surface (classical lissencephaly) and microscopic evidence of incomplete neuronal migration, is often associated with small deletions or translocations at chromosome 17p13.3. Miller–Dieker syndrome (MDS) is associated with larger deletions of 17p13.3 and consists of classical lissencephaly with additional phenotypes including facial abnormalities. We have isolated the murine homologs of three genes located inside and outside the MDS region: Lis1, Mnt/Rox, and 14-3-3ε. These genes are all located on mouse chromosome 11B2, as determined by metaphase FISH, and the relative order and approximate gene distance was determined by interphase FISH analysis. The transcriptional orientation and intergenic distance of Lis1 and Mnt/Rox were ascertained by fragmentation analysis of a mouse yeast artificial chromosome containing both genes. To determine the distance and orientation of 14-3-3ε with respect to Lis1 and Mnt/Rox, we introduced a super-rare cutter site (VDE) that is unique in the mouse genome into 14-3-3ε by gene targeting. Using the introduced VDE site, the orientation of this gene was determined by pulsed field gel electrophoresis and Southern blot analysis. Our results demonstrate that the MDS region is conserved between human and mouse. This conservation of linkage suggests that the mouse can be used to model microdeletions that occur in ILS and MDS. PMID:9199935

  20. Comparative genomics reveals cotton-specific virulence factors in flexible genomic regions in Verticillium dahliae and evidence of horizontal gene transfer from Fusarium.

    PubMed

    Chen, Jie-Yin; Liu, Chun; Gui, Yue-Jing; Si, Kai-Wei; Zhang, Dan-Dan; Wang, Jie; Short, Dylan P G; Huang, Jin-Qun; Li, Nan-Yang; Liang, Yong; Zhang, Wen-Qi; Yang, Lin; Ma, Xue-Feng; Li, Ting-Gang; Zhou, Lei; Wang, Bao-Li; Bao, Yu-Ming; Subbarao, Krishna V; Zhang, Geng-Yun; Dai, Xiao-Feng

    2018-01-01

    Verticillium dahliae isolates are most virulent on the host from which they were originally isolated. Mechanisms underlying these dominant host adaptations are currently unknown. We sequenced the genome of V. dahliae Vd991, which is highly virulent on its original host, cotton, and performed comparisons with the reference genomes of JR2 (from tomato) and VdLs.17 (from lettuce). Pathogenicity-related factor prediction, orthology and multigene family classification, transcriptome analyses, phylogenetic analyses, and pathogenicity experiments were performed. The Vd991 genome harbored several exclusive, lineage-specific (LS) genes within LS regions (LSRs). Deletion mutants of the seven genes within one LSR (G-LSR2) in Vd991 were less virulent only on cotton. Integration of G-LSR2 genes individually into JR2 and VdLs.17 resulted in significantly enhanced virulence on cotton but did not affect virulence on tomato or lettuce. Transcription levels of the seven LS genes in Vd991 were higher during the early stages of cotton infection, as compared with other hosts. Phylogenetic analyses suggested that G-LSR2 was acquired from Fusarium oxysporum f. sp. vasinfectum through horizontal gene transfer. Our results provide evidence that horizontal gene transfer from Fusarium to Vd991 contributed significantly to its adaptation to cotton and may represent a significant mechanism in the evolution of an asexual plant pathogen. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  1. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae.

    PubMed

    Choi, Kyoung Su; Park, Kyu Tae; Park, SeonJoo

    2017-11-16

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region.

  2. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae

    PubMed Central

    Park, Kyu Tae

    2017-01-01

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region. PMID:29144427

  3. Non-Homologous End Joining and Homology Directed DNA Repair Frequency of Double-Stranded Breaks Introduced by Genome Editing Reagents.

    PubMed

    Zaboikin, Michail; Zaboikina, Tatiana; Freter, Carl; Srinivasakumar, Narasimhachar

    2017-01-01

    Genome editing using transcription-activator like effector nucleases or RNA guided nucleases allows one to precisely engineer desired changes within a given target sequence. The genome editing reagents introduce double stranded breaks (DSBs) at the target site which can then undergo DNA repair by non-homologous end joining (NHEJ) or homology directed recombination (HDR) when a template DNA molecule is available. NHEJ repair results in indel mutations at the target site. As PCR amplified products from mutant target regions are likely to exhibit different melting profiles than PCR products amplified from wild type target region, we designed a high resolution melting analysis (HRMA) for rapid identification of efficient genome editing reagents. We also designed TaqMan assays using probes situated across the cut site to discriminate wild type from mutant sequences present after genome editing. The experiments revealed that the sensitivity of the assays to detect NHEJ-mediated DNA repair could be enhanced by selection of transfected cells to reduce the contribution of unmodified genomic DNA from untransfected cells to the DNA melting profile. The presence of donor template DNA lacking the target sequence at the time of genome editing further enhanced the sensitivity of the assays for detection of mutant DNA molecules by excluding the wild-type sequences modified by HDR. A second TaqMan probe that bound to an adjacent site, outside of the primary target cut site, was used to directly determine the contribution of HDR to DNA repair in the presence of the donor template sequence. The TaqMan qPCR assay, designed to measure the contribution of NHEJ and HDR in DNA repair, corroborated the results from HRMA. The data indicated that genome editing reagents can produce DSBs at high efficiency in HEK293T cells but a significant proportion of these are likely masked by reversion to wild type as a result of HDR. Supplying a donor plasmid to provide a template for HDR (that

  4. Seismicity and S-wave velocity structure of the crust and the upper mantle in the Baikal rift and adjacent regions

    NASA Astrophysics Data System (ADS)

    Seredkina, Alena; Kozhevnikov, Vladimir; Melnikova, Valentina; Solovey, Oksana

    2016-12-01

    Correlations between seismicity, the seismotectonic deformation (STD) field and the velocity structure of the crust and the upper mantle in the Baikal rift and the adjacent areas of the Siberian platform and the Mongol-Okhotsk fold belt have been investigated. The 3D S-wave velocity structure down to depths of 500 km has been modeled using a representative sample of Rayleigh wave group velocity dispersion curves (about 3200 paths) at periods from 10 to 250 s. The STD pattern has been reconstructed from mechanisms of large earthquakes, and is in good agreement with GPS and structural data. Analysis of the results has shown that most large shallow earthquakes fall in regions of low S-wave velocities in the uppermost mantle (western Mongolia and areas of recent mountain building in southern Siberia) and in zones of relatively high lateral variations in these velocities (northeastern flank of the Baikal rift). In the first case the dominant STD regime is compression manifested in a mixture of thrust and strike-slip deformations. In the second case we observe a general predominance of extension.

  5. Genome-wide association study in Asia-adapted tropical maize reveals novel and explored genomic regions for sorghum downy mildew resistance.

    PubMed

    Rashid, Zerka; Singh, Pradeep Kumar; Vemuri, Hindu; Zaidi, Pervez Haider; Prasanna, Boddupalli Maruthi; Nair, Sudha Krishnan

    2018-01-10

    Globally, downy mildews are among the important foliar diseases of maize that cause significant yield losses. We conducted a genome-wide association study for sorghum downy mildew (SDM; Peronosclerospora sorghi) resistance in a panel of 368 inbred lines adapted to the Asian tropics. High density SNPs from Genotyping-by-sequencing were used in GWAS after controlling for population structure and kinship in the panel using a single locus mixed model. The study identified a set of 26 SNPs that were significantly associated with SDM resistance, with Bonferroni corrected P values ≤ 0.05. Among all the identified SNPs, the minor alleles were found to be favorable to SDM resistance in the mapping panel. Trend regression analysis with 16 independent genetic variants including 12 SNPs and four haplotype blocks identified SNP S2_6154311 on chromosome 2 with P value 2.61E-24 and contributing 26.7% of the phenotypic variation. Six of the SNPs/haplotypes were within the same chromosomal bins as the QTLs for SDM resistance mapped in previous studies. Apart from this, eight novel genomic regions for SDM resistance were identified in this study; they need further validation before being applied in the breeding pipeline. Ten SNPs identified in this study were co-located in reported mildew resistance genes.

  6. Genome-wide evidence for local DNA methylation spreading from small RNA-targeted sequences in Arabidopsis.

    PubMed

    Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi

    2011-09-01

    Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.

  7. Repair of DNA double-strand breaks by templated nucleotide sequence insertions derived from distant regions of the genome.

    PubMed

    Onozawa, Masahiro; Zhang, Zhenhua; Kim, Yoo Jung; Goldberg, Liat; Varga, Tamas; Bergsagel, P Leif; Kuehl, W Michael; Aplan, Peter D

    2014-05-27

    We used the I-SceI endonuclease to produce DNA double-strand breaks (DSBs) and observed that a fraction of these DSBs were repaired by insertion of sequences, which we termed "templated sequence insertions" (TSIs), derived from distant regions of the genome. These TSIs were derived from genic, retrotransposon, or telomere sequences and were not deleted from the donor site in the genome, leading to the hypothesis that they were derived from reverse-transcribed RNA. Cotransfection of RNA and an I-SceI expression vector demonstrated insertion of RNA-derived sequences at the DNA-DSB site, and TSIs were suppressed by reverse-transcriptase inhibitors. Both observations support the hypothesis that TSIs were derived from RNA templates. In addition, similar insertions were detected at sites of DNA DSBs induced by transcription activator-like effector nuclease proteins. Whole-genome sequencing of myeloma cell lines revealed additional TSIs, demonstrating that repair of DNA DSBs via insertion was not restricted to experimentally produced DNA DSBs. Analysis of publicly available databases revealed that many of these TSIs are polymorphic in the human genome. Taken together, these results indicate that insertional events should be considered as alternatives to gross chromosomal rearrangements in the interpretation of whole-genome sequence data and that this mutagenic form of DNA repair may play a role in genetic disease, exon shuffling, and mammalian evolution.

  8. Genome scans on experimentally evolved populations reveal candidate regions for adaptation to plant resistance in the potato cyst nematode Globodera pallida.

    PubMed

    Eoche-Bosy, D; Gautier, M; Esquibet, M; Legeai, F; Bretaudeau, A; Bouchez, O; Fournet, S; Grenier, E; Montarry, J

    2017-09-01

    Improving resistance durability requires being able to predict the adaptation speed of pathogen populations. Identifying the genetic bases of pathogen adaptation to plant resistances is a useful step to better understand and anticipate this phenomenon. Globodera pallida is a major pest of potato crops for which a resistance QTL, GpaV vrn, has been identified in Solanum vernei. However, its durability is threatened as G. pallida populations are able to adapt to the resistance in a few generations. The aim of this study was to investigate the genomic regions involved in the resistance breakdown by coupling experimental evolution and high-density genome scan. We performed a whole-genome resequencing of pools of individuals (Pool-Seq) belonging to G. pallida lineages derived from two independent populations having experimentally evolved on susceptible and resistant potato cultivars. About 1.6 million SNPs were used to perform the genome scan using a recent model testing for adaptive differentiation and association to population-specific covariables. We identified 275 outliers and 31 of them, which also showed a significant reduction in diversity in adapted lineages, were investigated for their genic environment. Some candidate genomic regions contained genes putatively encoding effectors and were enriched in SPRYSECs, known in cyst nematodes to be involved in pathogenicity and in (a)virulence. Validated candidate SNPs will provide a useful molecular tool to follow frequencies of virulence alleles in natural G. pallida populations and define efficient strategies of use of potato resistances maximizing their durability. © 2017 John Wiley & Sons Ltd.

  9. Genome-scale prediction of proteins with long intrinsically disordered regions.

    PubMed

    Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

    2014-01-01

    Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
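
    The LDR definition quoted above (30 or more consecutive disordered residues) can be illustrated with a short check. This is only a sketch of the definition, not of SLIDER itself, which predicts LDR-containing proteins directly from sequence features with logistic regression. The per-residue mask format ('D' for disordered, anything else for ordered) and the function name are assumptions made for illustration.

```python
def has_long_disordered_region(disorder_mask: str, min_len: int = 30) -> bool:
    """Return True if the mask holds a run of at least `min_len` consecutive 'D's.

    `disorder_mask` is a hypothetical per-residue annotation: 'D' marks a
    disordered residue, any other character marks an ordered one.
    """
    run = 0  # length of the current run of disordered residues
    for state in disorder_mask:
        run = run + 1 if state == "D" else 0
        if run >= min_len:
            return True
    return False


# A protein with one 30-residue disordered stretch qualifies ...
with_ldr = "O" * 10 + "D" * 30 + "O" * 5
# ... while two 29-residue stretches split by an ordered residue do not.
without_ldr = "D" * 29 + "O" + "D" * 29
```

    Under this definition the total amount of disorder is irrelevant; only the longest uninterrupted run matters, which is why the second mask above does not count despite having more disordered residues.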

  10. Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.

    PubMed

    Wu, Jia Qian; Du, Jiang; Rozowsky, Joel; Zhang, Zhengdong; Urban, Alexander E; Euskirchen, Ghia; Weissman, Sherman; Gerstein, Mark; Snyder, Michael

    2008-01-03

    Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced. We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins. We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.

  11. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    PubMed

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  12. Cortical Gray and Adjacent White Matter Demonstrate Synchronous Maturation in Very Preterm Infants.

    PubMed

    Smyser, Tara A; Smyser, Christopher D; Rogers, Cynthia E; Gillespie, Sarah K; Inder, Terrie E; Neil, Jeffrey J

    2016-08-01

    Spatial and functional gradients of development have been described for the maturation of cerebral gray and white matter using histological and radiological approaches. We evaluated these patterns in very preterm (VPT) infants using diffusion tensor imaging. Data were obtained from 3 groups: 1) 22 VPT infants without white matter injury (WMI), of whom all had serial MRI studies during the neonatal period, 2) 19 VPT infants with WMI, of whom 3 had serial MRI studies and 3) 12 healthy, term-born infants. Regions of interest were placed in the cortical gray and adjacent white matter in primary motor, primary visual, visual association, and prefrontal regions. From the MRI data at term-equivalent postmenstrual age, differences in mean diffusivity were found in all areas between VPT infants with WMI and the other 2 groups. In contrast, minimal differences in fractional anisotropy were found between the 3 groups. These findings suggest that cortical maturation is delayed in VPT infants with WMI when compared with term control infants and VPT infants without WMI. From the serial MRI data from VPT infants, synchronous development between gray and white matter was evident in all areas and all groups, with maturation in primary motor and sensory regions preceding that of association areas. This finding highlights the regionally varying but locally synchronous nature of the development of cortical gray matter and its adjacent white matter. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Mind the gap! The mitochondrial control region and its power as a phylogenetic marker in echinoids.

    PubMed

    Bronstein, Omri; Kroh, Andreas; Haring, Elisabeth

    2018-05-30

    In Metazoa, mitochondrial markers are the most commonly used targets for inferring species-level molecular phylogenies due to their extremely low rate of recombination, maternal inheritance, ease of use and fast substitution rate in comparison to nuclear DNA. The mitochondrial control region (CR) is the main non-coding area of the mitochondrial genome and contains the mitochondrial origin of replication and transcription. While sequences of the cytochrome oxidase subunit 1 (COI) and 16S rRNA genes are the prime mitochondrial markers in phylogenetic studies, the highly variable CR is typically ignored and not targeted in such analyses. However, the higher substitution rate of the CR can be harnessed to infer the phylogeny of closely related species, and the use of a non-coding region alleviates biases resulting from both directional and purifying selection. Additionally, complete mitochondrial genome assemblies utilizing next generation sequencing (NGS) data often show exceptionally low coverage at specific regions, including the CR. This can only be resolved by targeted sequencing of this region. Here we provide novel sequence data for the echinoid mitochondrial control region in over 40 species across the echinoid phylogenetic tree. We demonstrate the advantages of directly targeting the CR and adjacent tRNAs to facilitate complementing low coverage NGS data from complete mitochondrial genome assemblies. Finally, we test the performance of this region as a phylogenetic marker both in the lab and in phylogenetic analyses, and demonstrate its superior performance over the other available mitochondrial markers in echinoids. Our target region of the mitochondrial CR (1) facilitates the first thorough investigation of this region across a wide range of echinoid taxa, (2) provides a tool for complementing missing data in NGS experiments, and (3) identifies the CR as a powerful, novel marker for phylogenetic inference in echinoids due to its high variability, lack of

  14. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

    PubMed

    Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.

  15. Virtual Genomes in Flux: An Interplay of Neutrality and Adaptability Explains Genome Expansion and Streamlining

    PubMed Central

    Cuypers, Thomas D.; Hogeweg, Paulien

    2012-01-01

    The picture that emerges from phylogenetic gene content reconstructions is that genomes evolve in a dynamic pattern of rapid expansion and gradual streamlining. Ancestral organisms have been estimated to possess remarkably rich gene complements, although gene loss is a driving force in subsequent lineage adaptation and diversification. Here, we study genome dynamics in a model of virtual cells evolving to maintain homeostasis. We observe a pattern of an initial rapid expansion of the genome and a prolonged phase of mutational load reduction. Generally, load reduction is achieved by the deletion of redundant genes, generating a streamlining pattern. Load reduction can also occur as a result of the generation of highly neutral genomic regions. These regions can expand and contract in a neutral fashion. Our study suggests that genome expansion and streamlining are generic patterns of evolving systems. We propose that the complex genotype to phenotype mapping in virtual cells as well as in their biological counterparts drives genome size dynamics, due to an emerging interplay between adaptation, neutrality, and evolvability. PMID:22234601

  16. Defining Genomic Changes in Triple-Negative Breast Cancer in Women of African Descent

    DTIC Science & Technology

    2012-06-01

    African and African-American breast cancer cases. Gene Expression Array Studies The 31 triple negative Kijabe samples were... American Adjacent Normal Breast Tissue PI: Pegram & Baumbach Defining Genomic Changes in Triple Negative Breast Cancer in Women of African ...Tissues from African-American and East African Patients with Triple Negative Breast

  17. The complete chloroplast genome of Sinopodophyllum hexandrum (Berberidaceae).

    PubMed

    Li, Huie; Guo, Qiqiang

    2016-07-01

    The complete chloroplast (cp) genome of the Sinopodophyllum hexandrum (Berberidaceae) was determined in this study. The circular genome is 157,940 bp in size, and comprises a pair of inverted repeat (IR) regions of 26,077 bp each, a large single-copy (LSC) region of 86,460 bp and a small single-copy (SSC) region of 19,326 bp. The GC content of the whole cp genome was 38.5%. A total of 133 genes were identified, including 88 protein-coding genes, 37 tRNA genes and eight rRNA genes. The whole cp genome consists of 114 unique genes, and 19 genes are duplicated in the IR regions. The phylogenetic analysis revealed that S. hexandrum is closely related to Nandina domestica within the family Berberidaceae.
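
    The figures reported above can be cross-checked with simple arithmetic: in a quadripartite chloroplast genome the total length is the LSC plus the SSC plus two copies of the IR, and the gene total is the sum of the protein-coding, tRNA and rRNA counts. The sketch below uses only the numbers given in this abstract; the function and variable names are illustrative assumptions.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular cp genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported for Sinopodophyllum hexandrum.
total_bp = quadripartite_length(lsc=86_460, ssc=19_326, ir=26_077)

# Gene counts as reported: protein-coding + tRNA + rRNA.
total_genes = 88 + 37 + 8

# Unique genes plus genes duplicated in the IR regions.
unique_plus_duplicated = 114 + 19

print(total_bp, total_genes, unique_plus_duplicated)
```

    The three results agree with the reported genome size of 157,940 bp and the reported total of 133 genes, so the abstract's numbers are internally consistent.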

  18. Development of improved connection details for adjacent prestressed member bridges.

    DOT National Transportation Integrated Search

    2017-06-01

    Adjacent prestressed member girder bridges are economical systems for short spans and generally come in two types: adjacent box beam bridges and adjacent voided slab bridges. Each type provides the advantages of having low clearances because of their...

  19. Comparative analyses of CTCF and BORIS occupancies uncover two distinct classes of CTCF binding genomic regions.

    PubMed

    Pugacheva, Elena M; Rivero-Hinojosa, Samuel; Espinoza, Celso A; Méndez-Catalá, Claudia Fabiola; Kang, Sungyun; Suzuki, Teruhiko; Kosaka-Suzuki, Natsuki; Robinson, Susan; Nagarajan, Vijayaraj; Ye, Zhen; Boukaba, Abdelhalim; Rasko, John E J; Strunnikov, Alexander V; Loukinov, Dmitri; Ren, Bing; Lobanenkov, Victor V

    2015-08-14

    CTCF and BORIS (CTCFL), two paralogous mammalian proteins sharing nearly identical DNA binding domains, are thought to function in a mutually exclusive manner in DNA binding and transcriptional regulation. Here we show that these two proteins co-occupy a specific subset of regulatory elements consisting of clustered CTCF binding motifs (termed 2xCTSes). BORIS occupancy at 2xCTSes is largely invariant in BORIS-positive cancer cells, with the genomic pattern recapitulating the germline-specific BORIS binding to chromatin. In contrast to the single-motif CTCF target sites (1xCTSes), the 2xCTS elements are preferentially found at active promoters and enhancers, both in cancer and germ cells. 2xCTSes are also enriched in genomic regions that escape histone to protamine replacement in human and mouse sperm. Depletion of the BORIS gene leads to altered transcription of a large number of genes and the differentiation of K562 cells, while the ectopic expression of this CTCF paralog leads to specific changes in transcription in MCF7 cells. We discover two functionally and structurally different classes of CTCF binding regions, 2xCTSes and 1xCTSes, revealed by their predisposition to bind BORIS. We propose that 2xCTSes play key roles in the transcriptional program of cancer and germ cells.

  20. Genomic adaptation of admixed dairy cattle in East Africa

    PubMed Central

    Kim, Eui-Soo; Rothschild, Max F.

    2014-01-01

    Dairy cattle in East Africa imported from the U.S. and Europe have been adapted to new environments. In small local farms, cattle have generally been maintained by crossbreeding that could increase survivability under a severe environment. Eventually, genomic ancestry of a specific breed will be nearly fixed in genomic regions of local breeds or crossbreds when it is advantageous for survival or production in harsh environments. To examine this situation, 25 Friesians and 162 local cattle produced by crossbreeding of dairy breeds in Kenya were sampled and genotyped using 50K SNPs. Using principal component analysis (PCA), the admixed local cattle were found to consist of several imported breeds, including Guernsey, Norwegian Red, and Holstein. To infer the influence of parental breeds on genomic regions, local ancestry mapping was performed based on the similarity of haplotypes. As a consequence, it appears that no genomic region has been under the complete influence of a specific parental breed. Nonetheless, the ancestry of Holstein-Friesians was substantial in most genomic regions (>80%). Furthermore, we examined the frequency of the most common haplotypes from parental breeds that have changed substantially in Kenyan crossbreds during admixture. The frequency of these haplotypes from parental breeds, which were likely to be selected in temperate regions, has deviated considerably from expected frequency in 11 genomic regions. Additionally, extended haplotype homozygosity (EHH) based methods were applied to identify the regions responding to recent selection in crossbreds, called candidate regions, resulting in seven regions that appeared to be affected by Holstein-Friesians. However, some signatures of selection were less dependent on Holsteins-Friesians, suggesting evidence of adaptation in East Africa. 
    The analysis of local ancestry is a useful approach to understand the detailed genomic structure and may reveal regions of the genome required for specialized

  1. From Genes to Milk: Genomic Organization and Epigenetic Regulation of the Mammary Transcriptome

    PubMed Central

    Lemay, Danielle G.; Pollard, Katherine S.; Martin, William F.; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J. Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. 
    Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival

  2. Genomics of the hop pseudo-autosomal regions

    USDA-ARS's Scientific Manuscript database

    Hop is one of the few crop species with female and male plants, with sex being determined by either XX or XY chromosomes. Hop cones are only produced in female hops with or without fertilization. This has led to most genomic research being directed toward female plants. Very little work has been don...

  3. Improved connection details for adjacent prestressed bridge beams.

    DOT National Transportation Integrated Search

    2015-03-01

    Bridges with adjacent box beams and voided slabs are simply and rapidly constructed, and are well suited to short to medium spans. The traditional connection between the adjacent members is a shear key filled with a conventional non-shrink grout...

  4. Complete Chloroplast Genome Sequences of Important Oilseed Crop Sesamum indicum L

    PubMed Central

    Yi, Dong-Keun; Kim, Ki-Joong

    2012-01-01

    Sesamum indicum is an important crop plant species for yielding oil. The complete chloroplast (cp) genome of S. indicum (GenBank acc no. JN637766) is 153,324 bp in length, and has a pair of inverted repeat (IR) regions consisting of 25,141 bp each. The lengths of the large single copy (LSC) and the small single copy (SSC) regions are 85,170 bp and 17,872 bp, respectively. Comparative cp DNA sequence analyses of S. indicum with other cp genomes reveal that the genome structure, gene order, gene and intron contents, AT contents, codon usage, and transcription units are similar to the typical angiosperm cp genomes. Nucleotide diversity of the IR region between Sesamum and three other cp genomes is much lower than that of the LSC and SSC regions in both the coding region and noncoding region. In summary, the regional constraints strongly affect the sequence evolution of the cp genomes, while the functional constraints weakly affect the sequence evolution of cp genomes. Five short inversions associated with short palindromic sequences that form stem-loop structures were observed in the chloroplast genome of S. indicum. Twenty-eight different simple sequence repeat (SSR) loci have been detected in the chloroplast genome of S. indicum. Almost all of the SSR loci were composed of A or T, so this may also contribute to the A-T richness of the cp genome of S. indicum. Seven large repeated loci in the chloroplast genome of S. indicum were also identified and these loci are useful for developing S. indicum-specific cp genome vectors. The complete cp DNA sequences of S. indicum reported in this paper are prerequisite to modifying this important oilseed crop by cp genetic engineering techniques. PMID:22606240
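
    The region lengths quoted in this record can be cross-checked with a little arithmetic: a typical circular chloroplast genome consists of one LSC, one SSC, and two IR copies. The sketch below is only an illustrative consistency check; the function name is hypothetical and not taken from the paper.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 85_170   # large single copy region
SSC = 17_872   # small single copy region
IR = 25_141    # one inverted repeat copy (the genome carries two)


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome built from LSC + SSC + two IR copies.

    Hypothetical helper for illustration only.
    """
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total)  # 153324, matching the reported genome length of 153,324 bp
```

    The same check applies to any record that reports LSC, SSC, and IR lengths together with a total genome length.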

  5. Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value.

    PubMed

    Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-Hyeon

    2017-03-01

    Holsteins are known as the world's highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values for each individual based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported in the genetic association with milk traits of Holstein. This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.

  6. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection.

    PubMed

    Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter

    2017-05-12

    A better understanding of the genetic architecture of complex traits can contribute to improving genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P < 0.05). GFBLUP provides a framework for integrating multiple layers of biological knowledge to provide novel insights into the biological basis of complex traits.

  7. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome.

    PubMed

    Singh, Vinod Kumar; Krishnamachari, Annangarachari

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that an autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted as ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  8. Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds.

    PubMed

    Zhang, Qianqian; Guldbrandtsen, Bernt; Thomasen, Jørn Rind; Lund, Mogens Sandø; Sahana, Goutam

    2016-09-01

    Longevity is an important economic trait in dairy production. Improvements in longevity could increase the average number of lactations per cow, thereby affecting the profitability of the dairy cattle industry. Improved longevity for cows reduces the replacement cost of stock and enables animals to achieve the highest production period. Moreover, longevity is an indirect indicator of animal welfare. Using whole-genome sequencing variants in 3 dairy cattle breeds, we carried out an association study and identified 7 genomic regions in Holstein and 5 regions in Red Dairy Cattle that were associated with longevity. Meta-analyses of 3 breeds revealed 2 significant genomic regions, located on chromosomes 6 (META-CHR6-88MB) and 18 (META-CHR18-58MB). META-CHR6-88MB overlaps with 2 known genes: neuropeptide G-protein coupled receptor (NPFFR2; 89,052,210-89,059,348 bp) and vitamin D-binding protein precursor (GC; 88,695,940-88,739,180 bp). The NPFFR2 gene was previously identified as a candidate gene for mastitis resistance. META-CHR18-58MB overlaps with zinc finger protein 717 (ZNF717; 58,130,465-58,141,877 bp) and zinc finger protein 613 (ZNF613; 58,115,782-58,117,110 bp), which have been associated with calving difficulties. Information on longevity-associated genomic regions could be used to find causal genes/variants influencing longevity and exploited to improve the reliability of genomic prediction. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
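
    The statement that an associated region "overlaps" known genes reduces to an interval comparison on chromosome coordinates. The sketch below uses the two chromosome 6 gene coordinates quoted in this record; the window bounds are hypothetical (the abstract does not give the exact extent of META-CHR6-88MB), and the helper names are illustrative only.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    start: int  # bp, inclusive
    end: int    # bp, inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals on the same chromosome share at least one base."""
    return a.start <= b.end and b.start <= a.end


def gap(a: Interval, b: Interval) -> int:
    """Number of bases strictly between two intervals (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


# Gene coordinates on chromosome 6 as quoted in the abstract above.
NPFFR2 = Interval("NPFFR2", 89_052_210, 89_059_348)
GC = Interval("GC", 88_695_940, 88_739_180)

# ASSUMPTION: hypothetical window bounds, chosen only for illustration.
window = Interval("hypothetical_window", 88_600_000, 89_100_000)

hits = [g.name for g in (NPFFR2, GC) if overlaps(window, g)]
print(hits)             # ['NPFFR2', 'GC']
print(gap(GC, NPFFR2))  # 313029 bases separate the two genes
```

    Closed (inclusive) coordinates are assumed here; formats that use half-open intervals would need the comparison adjusted accordingly.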

  9. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  10. Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

    PubMed

    Lee, Hayan; Schatz, Michael C

    2012-08-15

    Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
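
    The intuition behind mappability can be illustrated with a much cruder quantity than the GMS: the fraction of k-length substrings of a sequence that occur exactly once. The sketch below is not the GMS and not the Genome Mappability Analyzer; it is a toy proxy that ignores sequencing errors, mismatches, base qualities, and the reverse strand, and its function name is hypothetical.

```python
from collections import Counter


def unique_kmer_fraction(seq: str, k: int) -> float:
    """Fraction of k-mer start positions whose k-mer occurs exactly once in seq.

    Toy proxy for mappability (exact matches, forward strand only);
    NOT the Genome Mappability Score described in the record above.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


print(unique_kmer_fraction("ACGTACGT", 4))  # 0.6: ACGT occurs twice, 3 of 5 k-mers are unique
print(unique_kmer_fraction("AAAAAA", 3))    # 0.0: a pure repeat has no unique k-mers
```

    Repetitive sequence drives the fraction toward zero, which is the same qualitative effect that makes repeat-rich regions hard to resolve with short reads.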

  11. Genome-wide uniformity of human ‘open’ pre-initiation complexes

    PubMed Central

    Lai, William K.M.; Pugh, B. Franklin

    2017-01-01

    Transcription of protein-coding and noncoding DNA occurs pervasively throughout the mammalian genome. Their sites of initiation are generally inferred from transcript 5′ ends and are thought to be either locally dispersed or focused. How these two modes of initiation relate is unclear. Here, we apply permanganate treatment and chromatin immunoprecipitation (PIP-seq) of initiation factors to identify the precise location of melted DNA separately associated with the preinitiation complex (PIC) and the adjacent paused complex (PC). This approach revealed the two known modes of transcription initiation. However, in contrast to prevailing views, they co-occurred within the same promoter region: initiation originating from a focused PIC, and broad nucleosome-linked initiation. PIP-seq allowed transcriptional orientation of Pol II to be determined, which may be useful near promoters where sufficient sense/anti-sense transcript mapping information is lacking. PIP-seq detected divergently oriented Pol II at both coding and noncoding promoters, as well as at enhancers. Their occupancy levels were not necessarily coupled in the two orientations. DNA sequence and shape analysis of initiation complex sites suggest that both sequence and shape contribute to specificity, but in a context-restricted manner. That is, initiation sites have the locally “best” initiator (INR) sequence and/or shape. These findings reveal a common core to pervasive Pol II initiation throughout the human genome. PMID:27927716

  12. Characterization of the complete chloroplast genome of Platycarya strobilacea (Juglandaceae)

    Treesearch

    Jing Yan; Kai Han; Shuyun Zeng; Peng Zhao; Keith Woeste; Jianfang Li; Zhan-Lin Liu

    2017-01-01

    The whole chloroplast genome (cp genome) sequence of Platycarya strobilacea was characterized from Illumina pair-end sequencing data. The complete cp genome was 160,994 bp in length and contained a large single copy region (LSC) of 90,225 bp and a small single copy region (SSC) of 18,371 bp, which were separated by a pair of inverted repeat regions...

  13. Region 9 Tribal Lands

    EPA Pesticide Factsheets

    Dataset of all Indian Reservations in US EPA Region 9 (California, Arizona and Nevada) with some reservation border areas of adjacent states included (adjacent areas of Colorado, New Mexico and Utah). Reservation boundaries are compiled from multiple sources and are derived from several different source scales. Information such as reservation type, primary tribe name are included with the feature dataset. Public Domain Allotments are not included in this data set.

  14. Functional annotation of HOT regions in the human genome: implications for human disease and cancer

    PubMed Central

    Li, Hao; Chen, Hebing; Liu, Feng; Ren, Chao; Wang, Shengqi; Bo, Xiaochen; Shu, Wenjie

    2015-01-01

    Advances in genome-wide association studies (GWAS) and large-scale sequencing studies have resulted in an impressive and growing list of disease- and trait-associated genetic variants. Most studies have emphasised the discovery of genetic variation in coding sequences, however, the noncoding regulatory effects responsible for human disease and cancer biology have been substantially understudied. To better characterise the cis-regulatory effects of noncoding variation, we performed a comprehensive analysis of the genetic variants in HOT (high-occupancy target) regions, which are considered to be one of the most intriguing findings of recent large-scale sequencing studies. We observed that GWAS variants that map to HOT regions undergo a substantial net decrease and illustrate development-specific localisation during haematopoiesis. Additionally, genetic risk variants are disproportionally enriched in HOT regions compared with LOT (low-occupancy target) regions in both disease-relevant and cancer cells. Importantly, this enrichment is biased toward disease- or cancer-specific cell types. Furthermore, we observed that cancer cells generally acquire cancer-specific HOT regions at oncogenes through diverse mechanisms of cancer pathogenesis. Collectively, our findings demonstrate the key roles of HOT regions in human disease and cancer and represent a critical step toward further understanding disease biology, diagnosis, and therapy. PMID:26113264

  15. Functional annotation of HOT regions in the human genome: implications for human disease and cancer.

    PubMed

    Li, Hao; Chen, Hebing; Liu, Feng; Ren, Chao; Wang, Shengqi; Bo, Xiaochen; Shu, Wenjie

    2015-06-26

    Advances in genome-wide association studies (GWAS) and large-scale sequencing studies have resulted in an impressive and growing list of disease- and trait-associated genetic variants. Most studies have emphasised the discovery of genetic variation in coding sequences, however, the noncoding regulatory effects responsible for human disease and cancer biology have been substantially understudied. To better characterise the cis-regulatory effects of noncoding variation, we performed a comprehensive analysis of the genetic variants in HOT (high-occupancy target) regions, which are considered to be one of the most intriguing findings of recent large-scale sequencing studies. We observed that GWAS variants that map to HOT regions undergo a substantial net decrease and illustrate development-specific localisation during haematopoiesis. Additionally, genetic risk variants are disproportionally enriched in HOT regions compared with LOT (low-occupancy target) regions in both disease-relevant and cancer cells. Importantly, this enrichment is biased toward disease- or cancer-specific cell types. Furthermore, we observed that cancer cells generally acquire cancer-specific HOT regions at oncogenes through diverse mechanisms of cancer pathogenesis. Collectively, our findings demonstrate the key roles of HOT regions in human disease and cancer and represent a critical step toward further understanding disease biology, diagnosis, and therapy.

  16. Mitochondrial and chloroplast phylogeography of Picea crassifolia Kom. (Pinaceae) in the Qinghai-Tibetan Plateau and adjacent highlands.

    PubMed

    Meng, Lihua; Yang, Rui; Abbott, Richard J; Miehe, Georg; Hu, Tianhua; Liu, Jianquan

    2007-10-01

    The disjunct distribution of forests in the Qinghai-Tibetan Plateau (QTP) and adjacent Helan Shan and Daqing Shan highlands provides an excellent model to examine vegetation shifts, glacial refugia and gene flow of key species in this complex landscape region in response to past climatic oscillations and human disturbance. In this study, we examined maternally inherited mitochondrial DNA (nad1 intron b/c and nad5 intron 1) and paternally inherited chloroplast DNA (trnC-trnD) sequence variation within a dominant forest species, Picea crassifolia Kom. We recovered nine mitotypes and two chlorotypes in a survey of 442 individuals from 32 populations sampled throughout the species' range. Significant mitochondrial DNA population subdivision was detected (G(ST) = 0.512; N(ST) = 0.679), suggesting low levels of recurrent gene flow through seeds among populations and significant phylogeographical structure (N(ST) > G(ST), P < 0.05). Plateau haplotypes differed in sequence from those in the adjacent highlands, suggesting a long period of allopatric fragmentation between the species in the two regions and the presence of independent refugia in each region during Quaternary glaciations. On the QTP platform, all but one of the disjunct populations surveyed were fixed for the same mitotype, while most populations at the plateau edge contained more than one haplotype with the mitotype that was fixed in plateau platform populations always present at high frequency. This distribution pattern suggests that present-day disjunct populations on the QTP platform experienced a common recolonization history. The same phylogeographical pattern, however, was not detected for paternally inherited chloroplast DNA haplotypes. Two chlorotypes were distributed throughout the range of the species with little geographical population differentiation (G(ST) = N(ST) = 0.093). This provides evidence for highly efficient pollen-mediated gene flow among isolated forest patches, both within and between

  17. Accounting for discovery bias in genomic prediction

    USDA-ARS's Scientific Manuscript database

    Our objective was to evaluate an approach to mitigating discovery bias in genomic prediction. Accuracy may be improved by placing greater emphasis on regions of the genome expected to be more influential on a trait. Methods emphasizing regions result in a phenomenon known as “discovery bias” if info...

  18. The complete mitochondrial genome and its remarkable secondary structure for a stonefly Acroneuria hainana Wu (Insecta: Plecoptera, Perlidae).

    PubMed

    Huang, Mingchao; Wang, Yuyu; Liu, Xingyue; Li, Weihai; Kang, Zehui; Wang, Kai; Li, Xuankun; Yang, Ding

    2015-02-15

    The Plecoptera (stoneflies) is a hemimetabolous order of insects, whose larvae are usually used as indicators for fresh water biomonitoring. Herein, we describe the complete mitochondrial (mt) genome of a stonefly species, namely Acroneuria hainana Wu belonging to the family Perlidae. This mt genome contains 13 PCGs, 22 tRNA-coding genes and 2 rRNA-coding genes that are conserved in most insect mt genomes, and its gene order is identical to the ancestral insect gene order. However, there are three special initiation codons of ND1, ND5 and COI in PCGs: TTG, GTG and CGA, coding for L, V and R, respectively. Additionally, the 899-bp control region, with 73.30% A+T content, has two long repeated sequences which are found at the 3'-end close to the tRNA(Ile) gene. Both of them can be folded into a stem-loop structure, whose adjacent upstream and downstream sequences can also be folded into stem-loop structures. It is presumed that the four special structures in series could be associated with the D-loop replication. It might be able to adjust the replication speed of two replicate directions. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Complete mitochondrial genome sequences of the northern spotted owl (Strix occidentalis caurina) and the barred owl (Strix varia; Aves: Strigiformes: Strigidae) confirm the presence of a duplicated control region

    PubMed Central

    Henderson, James B.; Sellas, Anna B.; Fuchs, Jérôme; Bowie, Rauri C.K.; Dumbacher, John P.

    2017-01-01

    We report here the successful assembly of the complete mitochondrial genomes of the northern spotted owl (Strix occidentalis caurina) and the barred owl (S. varia). We utilized sequence data from two sequencing methodologies, Illumina paired-end sequence data with insert lengths ranging from approximately 250 nucleotides (nt) to 9,600 nt and read lengths from 100–375 nt and Sanger-derived sequences. We employed multiple assemblers and alignment methods to generate the final assemblies. The circular genomes of S. o. caurina and S. varia are comprised of 19,948 nt and 18,975 nt, respectively. Both code for two rRNAs, twenty-two tRNAs, and thirteen polypeptides. They both have duplicated control region sequences with complex repeat structures. We were not able to assemble the control regions solely using Illumina paired-end sequence data. By fully spanning the control regions, Sanger-derived sequences enabled accurate and complete assembly of these mitochondrial genomes. These are the first complete mitochondrial genome sequences of owls (Aves: Strigiformes) possessing duplicated control regions. We searched the nuclear genome of S. o. caurina for copies of mitochondrial genes and found at least nine separate stretches of nuclear copies of gene sequences originating in the mitochondrial genome (Numts). The Numts ranged from 226–19,522 nt in length and included copies of all mitochondrial genes except tRNAPro, ND6, and tRNAGlu. Strix occidentalis caurina and S. varia exhibited an average of 10.74% (8.68% uncorrected p-distance) divergence across the non-tRNA mitochondrial genes. PMID:29038757
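
    The "uncorrected p-distance" quoted in this record is the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that sites with a gap or an ambiguous N in either sequence are skipped; the function name and the toy sequences are hypothetical and unrelated to the owl data.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: differing sites divided by compared sites.

    Assumes a and b are pre-aligned (equal length). Sites where either
    sequence has a gap ('-') or 'N' are excluded from the comparison.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


# Toy aligned sequences (hypothetical): 2 differences over 10 sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```

    Model-corrected divergence estimates are generally at least as large as the p-distance because they account for multiple substitutions at the same site, which is consistent with the 10.74% versus 8.68% figures reported above.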

  20. Identification of coding and non-coding mutational hotspots in cancer genomes.

    PubMed

    Piraino, Scott W; Furney, Simon J

    2017-01-05

    The identification of mutations that play a causal role in tumour development, so called "driver" mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes. To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis. We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. This framework identifies known and novel coding and non-coding mutational hotspots and can be used to differentiate candidate driver regions from

  1. Regional centromeres in the yeast Candida lusitaniae lack pericentromeric heterochromatin

    PubMed Central

    Kapoor, Shivali; Zhu, Lisha; Froyd, Cara; Liu, Tao; Rusche, Laura N.

    2015-01-01

    Point centromeres are specified by a short consensus sequence that seeds kinetochore formation, whereas regional centromeres lack a conserved sequence and instead are epigenetically inherited. Regional centromeres are generally flanked by heterochromatin that ensures high levels of cohesin and promotes faithful chromosome segregation. However, it is not known whether regional centromeres require pericentromeric heterochromatin. In the yeast Candida lusitaniae, we identified a distinct type of regional centromere that lacks pericentromeric heterochromatin. Centromere locations were determined by ChIP-sequencing of two key centromere proteins, Cse4 and Mif2, and are consistent with bioinformatic predictions. The centromeric DNA sequence was unique for each chromosome and spanned 4–4.5 kbp, consistent with regional epigenetically inherited centromeres. However, unlike other regional centromeres, there was no evidence of pericentromeric heterochromatin in C. lusitaniae. In particular, flanking genes were expressed at a similar level to the rest of the genome, and a URA3 reporter inserted adjacent to a centromere was not repressed. In addition, regions flanking the centromeric core were not associated with hypoacetylated histones or a sirtuin deacetylase that generates heterochromatin in other yeast. Interestingly, the centromeric chromatin had a distinct pattern of histone modifications, being enriched for methylated H3K79 and H3R2 but lacking methylation of H3K4, which is found at other regional centromeres. Thus, not all regional centromeres require flanking heterochromatin. PMID:26371315

  2. Interactive Exploration on Large Genomic Datasets.

    PubMed

    Tu, Eric

    2016-01-01

    The prevalence of large genomics datasets has made the need to explore this data more important. Large sequencing projects like the 1000 Genomes Project [1], which reconstructed the genomes of 2,504 individuals sampled from 26 populations, have produced over 200TB of publicly available data. Meanwhile, existing genomic visualization tools have been unable to scale with the growing amount of larger, more complex data. This difficulty is acute when viewing large regions (over 1 megabase, or 1,000,000 bases of DNA), or when concurrently viewing multiple samples of data. While genomic processing pipelines have shifted towards using distributed computing techniques, such as with ADAM [4], genomic visualization tools have not. In this work we present Mango, a scalable genome browser built on top of ADAM that can run both locally and on a cluster. Mango presents a combination of different optimizations that can be combined in a single application to drive novel genomic visualization techniques over terabytes of genomic data. By building visualization on top of a distributed processing pipeline, we can perform visualization queries over large regions that are not possible with current tools, and decrease the time for viewing large data sets. Mango is part of the Big Data Genomics project at University of California-Berkeley [25] and is published under the Apache 2 license. Mango is available at https://github.com/bigdatagenomics/mango.

  3. Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition.

    PubMed

    Rangannan, Vetriselvi; Bansal, Manju

    2009-12-01

    The rapid increase in genome sequence information has necessitated the annotation of functional elements, particularly those occurring in the non-coding regions, in the genomic context. The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed, but it is difficult to determine experimentally. Hence an in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. In this analysis, we demonstrate that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. We have therefore obtained a set of free energy threshold values for genomic DNA with varying GC content and used them as generic criteria for predicting promoter regions in several microbial genomes, using an in-house developed tool, PromPredict. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (50.8% GC) and B. subtilis (43.5% GC), sensitivities of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC), a sensitivity of 100% and a precision of 49% were obtained.
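    The sensitivity and precision figures quoted in this record presumably follow the standard definitions; the abstract does not spell them out, so that is an assumption. A minimal sketch of those definitions follows. The function names and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real TSSs that received a promoter prediction."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of promoter predictions that correspond to a real TSS."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only (not data from the study).
example_sensitivity = sensitivity(true_pos=90, false_neg=10)
example_precision = precision(true_pos=90, false_pos=60)
```

    With these illustrative counts, 90 of 100 real sites are recovered while only 90 of 150 predictions are correct, which shows how a predictor can combine high sensitivity with moderate precision.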

  4. A genome-wide association study for body weight in Japanese Thoroughbred racehorses clarifies candidate regions on chromosomes 3, 9, 15, and 18

    PubMed Central

    TOZAKI, Teruaki; KIKUCHI, Mio; KAKOI, Hironaga; HIROTA, Kei-ichi; NAGATA, Shun-ichi

    2017-01-01

    Body weight is an important trait to confirm growth and development in humans and animals. In Thoroughbred racehorses, it is measured in the postnatal, training, and racing periods to evaluate growth and training degrees. The body weight of mature Thoroughbred racehorses generally ranges from 400 to 600 kg, and this broad range is likely influenced by environmental and genetic factors. Therefore, a genome-wide association study (GWAS) using the Equine SNP70 BeadChip was performed to identify the genomic regions associated with body weight in Japanese Thoroughbred racehorses using 851 individuals. The average body weight of these horses was 473.9 kg (standard deviation: 28.0) at the age of 3, and GWAS identified statistically significant SNPs on chromosomes 3 (BIEC2_808466, P=2.32E-14), 9 (BIEC2_1105503, P=1.03E-7), 15 (BIEC2_322669, P=9.50E-6), and 18 (BIEC2_417274, P=1.44E-14), which were associated with body weight as a quantitative trait. The genomic regions on chromosomes 3, 9, 15, and 18 included ligand-dependent nuclear receptor corepressor-like protein (LCORL), zinc finger and AT hook domain containing (ZFAT), tribbles pseudokinase 2 (TRIB2), and myostatin (MSTN), respectively, as candidate genes. LCORL and ZFAT are associated with withers height in horses, whereas MSTN affects muscle mass. Thus, the genomic regions identified in this study seem to affect the body weight of Thoroughbred racehorses. Although this information is useful for breeding and growth management of the horses, the production of genetically modified animals and gene doping (abuse/misuse of gene therapy) should be prohibited to maintain horse racing integrity. PMID:29270069

  5. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    USDA-ARS's Scientific Manuscript database

    Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...

  6. The primary structures of two yeast enolase genes. Homology between the 5' noncoding flanking regions of yeast enolase and glyceraldehyde-3-phosphate dehydrogenase genes.

    PubMed

    Holland, M J; Holland, J P; Thill, G P; Jackson, K A

    1981-02-10

    Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. Finally there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5

  7. GRIL: genome rearrangement and inversion locator.

    PubMed

    Darling, Aaron E; Mau, Bob; Blattner, Frederick R; Perna, Nicole T

    2004-01-01

    GRIL is a tool to automatically identify collinear regions in a set of bacterial-size genome sequences. GRIL uses three basic steps. First, regions of high sequence identity are located. Second, some of these regions are filtered based on user-specified criteria. Finally, the remaining regions of sequence identity are used to define significant collinear regions among the sequences. By locating collinear regions of sequence, GRIL provides a basis for multiple genome alignment using current alignment systems. GRIL also provides a basis for using current inversion distance tools to infer phylogeny. GRIL is implemented in C++ and runs on any x86-based Linux or Windows platform. It is available from http://asap.ahabs.wisc.edu/gril
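    The three steps named in this record (locate matches, filter them, define collinear regions) can be illustrated with a toy sketch. This is not GRIL's actual algorithm, which the abstract does not describe in detail. The match representation, the `min_length` and `max_gap` parameters, and the function names are all assumptions made for illustration, and only same-orientation matches are handled (inversions are ignored).

```python
from typing import List, Tuple

# A match is (start in genome A, start in genome B, length); hypothetical format.
Match = Tuple[int, int, int]


def filter_matches(matches: List[Match], min_length: int) -> List[Match]:
    """Step 2 (assumed criterion): drop matches shorter than min_length."""
    return [m for m in matches if m[2] >= min_length]


def collinear_blocks(matches: List[Match], max_gap: int) -> List[List[Match]]:
    """Step 3 (toy version): chain matches that keep the same order in both
    genomes and lie within max_gap of the previous match in each genome."""
    blocks: List[List[Match]] = []
    for m in sorted(matches):
        if blocks:
            prev = blocks[-1][-1]
            gap_a = m[0] - (prev[0] + prev[2])
            gap_b = m[1] - (prev[1] + prev[2])
            if 0 <= gap_a <= max_gap and 0 <= gap_b <= max_gap:
                blocks[-1].append(m)
                continue
        blocks.append([m])
    return blocks


# Step 1 is assumed to have been done already; these matches are made up.
raw = [(0, 100, 50), (60, 160, 40), (120, 10, 30), (200, 5, 5)]
blocks = collinear_blocks(filter_matches(raw, min_length=20), max_gap=20)
```

    Here the first two matches chain into one block because they advance together in both genomes, while the third starts a new block because its position in genome B jumps backwards.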

  8. Genome of Horsepox Virus

    PubMed Central

    Tulman, E. R.; Delhon, G.; Afonso, C. L.; Lu, Z.; Zsak, L.; Sandybaev, N. T.; Kerembekova, U. Z.; Zaitsev, V. L.; Kutish, G. F.; Rock, D. L.

    2006-01-01

    Here we present the genomic sequence of horsepox virus (HSPV) isolate MNR-76, an orthopoxvirus (OPV) isolated in 1976 from diseased Mongolian horses. The 212-kbp genome contained 7.5-kbp inverted terminal repeats and lacked extensive terminal tandem repetition. HSPV contained 236 open reading frames (ORFs) with similarity to those in other OPVs, with those in the central 100-kbp region most conserved relative to other OPVs. Phylogenetic analysis of the conserved region indicated that HSPV is closely related to sequenced isolates of vaccinia virus (VACV) and rabbitpox virus, clearly grouping together these VACV-like viruses. Fifty-four HSPV ORFs likely represented fragments of 25 orthologous OPV genes, including in the central region the only known fragmented form of an OPV ribonucleotide reductase large subunit gene. In terminal genomic regions, HSPV lacked full-length homologues of genes variably fragmented in other VACV-like viruses but was unique in fragmentation of the homologue of VACV strain Copenhagen B6R, a gene intact in other known VACV-like viruses. Notably, HSPV contained in terminal genomic regions 17 kbp of OPV-like sequence absent in known VACV-like viruses, including fragments of genes intact in other OPVs and approximately 1.4 kb of sequence present only in cowpox virus (CPXV). HSPV also contained seven full-length genes fragmented or missing in other VACV-like viruses, including intact homologues of the CPXV strain GRI-90 D2L/I4R CrmB and D13L CD30-like tumor necrosis factor receptors, D3L/I3R and C1L ankyrin repeat proteins, B19R kelch-like protein, D7L BTB/POZ domain protein, and B22R variola virus B22R-like protein. These results indicated that HSPV contains unique genomic features likely contributing to a unique virulence/host range phenotype. They also indicated that while closely related to known VACV-like viruses, HSPV contains additional, potentially ancestral sequences absent in other VACV-like viruses. PMID:16940536

  9. The Basque Paradigm: Genetic Evidence of a Maternal Continuity in the Franco-Cantabrian Region since Pre-Neolithic Times

    PubMed Central

    Behar, Doron M.; Harmant, Christine; Manry, Jeremy; van Oven, Mannis; Haak, Wolfgang; Martinez-Cruz, Begoña; Salaberria, Jasone; Oyharçabal, Bernard; Bauduer, Frédéric; Comas, David; Quintana-Murci, Lluis

    2012-01-01

    Different lines of evidence point to the resettlement of much of western and central Europe by populations from the Franco-Cantabrian region during the Late Glacial and Postglacial periods. In this context, the study of the genetic diversity of contemporary Basques, a population located at the epicenter of the Franco-Cantabrian region, is particularly useful because they speak a non-Indo-European language that is considered to be a linguistic isolate. In contrast with genome-wide analysis and Y chromosome data, where the problem of poor time estimates remains, a new timescale has been established for the human mtDNA and makes this genome the most informative marker for studying European prehistory. Here, we aim to increase knowledge of the origins of the Basque people and, more generally, of the role of the Franco-Cantabrian refuge in the postglacial repopulation of Europe. We thus characterize the maternal ancestry of 908 Basque and non-Basque individuals from the Basque Country and immediate adjacent regions and, by sequencing 420 complete mtDNA genomes, we focused on haplogroup H. We identified six mtDNA haplogroups, H1j1, H1t1, H2a5a1, H1av1, H3c2a, and H1e1a1, which are autochthonous to the Franco-Cantabrian region and, more specifically, to Basque-speaking populations. We detected signals of the expansion of these haplogroups at ∼4,000 years before present (YBP) and estimated their separation from the pan-European gene pool at ∼8,000 YBP, antedating the Indo-European arrival to the region. Our results clearly support the hypothesis of a partial genetic continuity of contemporary Basques with the preceding Paleolithic/Mesolithic settlers of their homeland. PMID:22365151

  10. Genomic organization of the human heparan sulfate-N-deacetylase/N-sulfotransferase gene: Exclusion from a causative role in the pathogenesis of Treacher Collins syndrome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gladwin, A.J.; Dixon, J.; Loftus, S.K.

    Heparan sulfate-N-deacetylase/N-sulfotransferase (HSST) catalyzes both the N-deacetylation and the N-sulfation of heparan sulfate. Previous studies have resulted in the isolation of the human HSST gene from within the Treacher Collins syndrome locus (TCOF1) critical region on 5q. In the present study, the genomic organization of the HSST gene has been elucidated, and the 14 exons identified have been tested for TCOF1-specific mutations. As a result of these studies, mutations within the coding sequence and adjacent splice junctions of HSST can be excluded from a causative role in the pathogenesis of Treacher Collins syndrome. 13 refs., 1 fig., 2 tabs.

  11. Use of whole-genome sequencing to trace, control and characterize the regional expansion of extended-spectrum β-lactamase producing ST15 Klebsiella pneumoniae.

    PubMed

    Zhou, Kai; Lokate, Mariette; Deurenberg, Ruud H; Tepper, Marga; Arends, Jan P; Raangs, Erwin G C; Lo-Ten-Foe, Jerome; Grundmann, Hajo; Rossen, John W A; Friedrich, Alexander W

    2016-02-11

    The study describes the transmission of a CTX-M-15-producing ST15 Klebsiella pneumoniae between patients treated in a single center and the subsequent inter-institutional spread by patient referral occurring between May 2012 and September 2013. A suspected epidemiological link between clinical K. pneumoniae isolates was supported by patient contact tracing and genomic phylogenetic analysis from May to November 2012. By May 2013, a patient treated in three institutions in two cities was involved in an expanding cluster caused by this high-risk clone (HiRiC) (local expansion, CTX-M-15 producing, and containing hypervirulence factors). A clone-specific multiplex PCR was developed for patient screening by which another patient was identified in September 2013. Genomic phylogenetic analysis including published ST15 genomes revealed a close homology with isolates previously found in the USA. Environmental contamination and lack of consistent patient screening were identified as being responsible for the clone dissemination. The investigation addresses the advantages of whole-genome sequencing in the early detection of HiRiC with a high propensity of nosocomial transmission and prolonged circulation in the regional patient population. Our study suggests the necessity for inter-institutional/regional collaboration for infection/outbreak management of K. pneumoniae HiRiCs.

  12. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  13. Defining functional DNA elements in the human genome

    PubMed Central

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  14. GENOMIC DIVERSITY AND THE MICROENVIRONMENT AS DRIVERS OF PROGRESSION IN DCIS

    DTIC Science & Technology

    2017-10-01

    stains, including quantitative analysis, 7) Identification of upstaged DCIS cases for the radiology aim, 8) Development of image analysis methods for...goals of the project? Aim 1. Determine whether genetic diversity of DCIS is greater in DCIS with adjacent invasive disease compared to DCIS without... compared to DCIS without IDC. Since genomics is not the sole driver of tumor behavior, we will phenotypically characterize DCIS and its

  15. Learning Non-Adjacent Regularities at Age 0 ; 7

    ERIC Educational Resources Information Center

    Gervain, Judit; Werker, Janet F.

    2013-01-01

    One important mechanism suggested to underlie the acquisition of grammar is rule learning. Indeed, infants aged 0 ; 7 are able to learn rules based on simple identity relations (adjacent repetitions, ABB: "wo fe fe" and non-adjacent repetitions, ABA: "wo fe wo", respectively; Marcus et al., 1999). One unexplored issue is…

  16. Cloning, genomic organization, and chromosomal localization of human citrate transport protein to the DiGeorge/velocardiofacial syndrome minimal critical region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldmuntz, E.; Budarf, M.L.; Wang, Zhili

    1996-04-15

    DiGeorge syndrome (DGS) and velocardiofacial syndrome have been shown to be associated with microdeletions of chromosomal region 22q11. More recently, patients with conotruncal anomaly face syndrome and some nonsyndromic patients with isolated forms of conotruncal cardiac defects have been found to have 22q11 microdeletions as well. The commonly deleted region, called the DiGeorge chromosomal region (DGCR), spans approximately 1.2 Mb and is estimated to contain at least 30 genes. We report a computational approach for gene identification that makes use of large-scale sequencing of cosmids from a contig spanning the DGCR. Using this methodology, we have mapped the human homolog of a rodent citrate transport protein to the DGCR. We have isolated a partial cDNA containing the complete open reading frame and have determined the genomic structure by comparing the genomic sequence from the cosmid to the sequence of the cDNA clone. Whether the citrate transport protein can be implicated in the biological etiology of DGS or other 22q11 microdeletion syndromes remains to be defined. 36 refs., 3 figs., 1 tab.

  17. Genome organization of epidemic Acinetobacter baumannii strains.

    PubMed

    Di Nocera, Pier Paolo; Rocco, Francesco; Giannouli, Maria; Triassi, Maria; Zarrilli, Raffaele

    2011-10-10

    Acinetobacter baumannii is an opportunistic pathogen responsible for hospital-acquired infections. A. baumannii epidemics described world-wide were caused by few genotypic clusters of strains. The occurrence of epidemics caused by multi-drug resistant strains assigned to novel genotypes has been reported over the last few years. In the present study, we compared whole genome sequences of three A. baumannii strains assigned to genotypes ST2, ST25 and ST78, representative of the most frequent genotypes responsible for epidemics in several Mediterranean hospitals, and four complete genome sequences of A. baumannii strains assigned to genotypes ST1, ST2 and ST77. Comparative genome analysis showed extensive synteny and identified 3068 coding regions which are conserved, at the same chromosomal position, in all A. baumannii genomes. Genome alignments also identified 63 DNA regions, ranging in size from 4 to 126 kb, all defined as genomic islands, which were present in some genomes, but were either missing or replaced by non-homologous DNA sequences in others. Some islands are involved in resistance to drugs and metals, others carry genes encoding surface proteins or enzymes involved in specific metabolic pathways, and others correspond to prophage-like elements. Accessory DNA regions encode 12 to 19% of the potential gene products of the analyzed strains. The analysis of a collection of epidemic A. baumannii strains showed that some islands were restricted to specific genotypes. The definition of the genome components of A. baumannii provides a scaffold to rapidly evaluate the genomic organization of novel clinical A. baumannii isolates. Changes in island profiling will be useful in genomic epidemiology of A. baumannii populations.

  18. Reference-free comparative genomics of 174 chloroplasts.

    PubMed

    Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R; Yu, Jun; Cannon, Charles H

    2012-01-01

    Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ~18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication. Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and
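    The "tip" versus "group" distinction in this record can be illustrated at the kmer level. The abstract applies these classes to de novo contigs, so the sketch below is only an analogy, not the authors' pipeline. The function names, the extra "core" label for kmers present in every genome, and the toy sequences are assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_kmers(genomes: Dict[str, str], k: int) -> Dict[str, str]:
    """Label each kmer by how many genomes contain it.

    'tip'   = found in exactly one genome
    'group' = found in more than one but not all genomes
    'core'  = found in every genome (label assumed here, not from the abstract)
    """
    owners: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in genomes.items():
        for km in kmers(seq, k):
            owners[km].add(name)

    labels: Dict[str, str] = {}
    for km, names in owners.items():
        if len(names) == 1:
            labels[km] = "tip"
        elif len(names) < len(genomes):
            labels[km] = "group"
        else:
            labels[km] = "core"
    return labels


# Toy sequences, made up for illustration.
toy = {"a": "AACGT", "b": "AACGA", "c": "AACTT"}
labels = classify_kmers(toy, k=3)
```

    In this toy input, "AAC" occurs in all three sequences, "ACG" in two of them, and the remaining 3-mers in one each.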

  19. The evolution of sex ratio distorter suppression affects a 25 cM genomic region in the butterfly Hypolimnas bolina.

    PubMed

    Hornett, Emily A; Moran, Bruce; Reynolds, Louise A; Charlat, Sylvain; Tazzyman, Samuel; Wedell, Nina; Jiggins, Chris D; Hurst, Greg D D

    2014-12-01

    Symbionts that distort their host's sex ratio by favouring the production and survival of females are common in arthropods. Their presence produces intense Fisherian selection to return the sex ratio to parity, typified by the rapid spread of host 'suppressor' loci that restore male survival/development. In this study, we investigated the genomic impact of a selective event of this kind in the butterfly Hypolimnas bolina. Through linkage mapping, we first identified a genomic region that was necessary for males to survive Wolbachia-induced male-killing. We then investigated the genomic impact of the rapid spread of suppression, which converted the Samoan population of this butterfly from a 100:1 female-biased sex ratio in 2001 to a 1:1 sex ratio by 2006. Models of this process revealed the potential for a chromosome-wide effect. To measure the impact of this episode of selection directly, the pattern of genetic variation before and after the spread of suppression was compared. Changes in allele frequencies were observed over a 25 cM region surrounding the suppressor locus, with a reduction in overall diversity observed at loci that co-segregate with the suppressor. These changes exceeded those expected from drift and occurred alongside the generation of linkage disequilibrium. The presence of novel allelic variants in 2006 suggests that the suppressor was likely to have been introduced via immigration rather than through de novo mutation. In addition, further sampling in 2010 indicated that many of the introduced variants were lost or had declined in frequency since 2006. We hypothesize that this loss may have resulted from a period of purifying selection, removing deleterious material that introgressed during the initial sweep. Our observations of the impact of suppression of sex ratio distorting activity reveal a very wide genomic imprint, reflecting its status as one of the strongest selective forces in nature.

  20. A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.

    PubMed

    Moraes, Fernanda; Góes, Andréa

    2016-05-06

    The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of the genome arose. The ENCODE project was initiated in 2003 and aimed to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. Besides, a gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, the HGP and ENCODE projects accomplished a great map of the human genome, but the data generated still require further in-depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016.

  1. Lost region in amyloid precursor protein (APP) through TALEN-mediated genome editing alters mitochondrial morphology.

    PubMed

    Wang, Yajie; Wu, Fengyi; Pan, Haining; Zheng, Wenzhong; Feng, Chi; Wang, Yunfu; Deng, Zixin; Wang, Lianrong; Luo, Jie; Chen, Shi

    2016-02-29

    Alzheimer's disease (AD) is characterized by amyloid-β (Aβ) deposition in the brain. Aβ plaques are produced through sequential β/γ cleavage of amyloid precursor protein (APP), of which there are three main APP isoforms: APP695, APP751 and APP770. KPI-APPs (APP751 and APP770) are known to be elevated in AD, but the reason remains unclear. Transcription activator-like (TAL) effector nucleases (TALENs) induce mutations with high efficiency at specific genomic loci, and it is thus possible to knock out specific regions using TALENs. In this study, we designed and expressed TALENs specific for the C-terminus of APP in HeLa cells, in which KPI-APPs are predominantly expressed. The KPI-APP mutants lack a 12-aa region that encompasses a 5-aa trans-membrane (TM) region and 7-aa juxta-membrane (JM) region. The mutated KPI-APPs exhibited decreased mitochondrial localization. In addition, mitochondrial morphology was altered, resulting in an increase in spherical mitochondria in the mutant cells through the disruption of the balance between fission and fusion. Mitochondrial dysfunction, including decreased ATP levels, disrupted mitochondrial membrane potential, increased ROS generation and impaired mitochondrial dehydrogenase activity, was also found. These results suggest that specific regions of KPI-APPs are important for mitochondrial localization and function.

  2. Culicoides biting midges (Diptera, Ceratopogonidae) in various climatic zones of Russia and adjacent lands.

    PubMed

    Sprygin, A V; Fiodorova, O A; Babin, Yu Yu; Elatkin, N P; Mathieu, B; England, M E; Kononov, A V

    2014-12-01

    Culicoides biting midges play an important role in the epidemiology of many vector-borne infections, including bluetongue virus, an internationally important virus of ruminants. The territory of the Russian Federation includes regions with diverse climatic conditions and a wide range of habitats suitable for Culicoides. This review summarizes available data on Culicoides studied in the Russian Federation covering geographically different regions, as well as findings from adjacent countries. Previous literature on species composition, ranges of dominant species, breeding sites, and host preferences is reviewed and suggestions made for future studies to elucidate vector-virus relationships. © 2014 The Society for Vector Ecology.

  3. Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value

    PubMed Central

    Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-hyeon

    2017-01-01

    Objective: Holsteins are known as the world’s highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods: This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results: We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported in the genetic association with milk traits of Holstein. Conclusion: This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins. PMID:26954162

  4. Thermokarst in pingos and adjacent collapse scar bogs in interior Alaska

    NASA Astrophysics Data System (ADS)

    Douglas, T. A.; Turetsky, M. R.

    2017-12-01

    A region of discontinuous permafrost 50 kilometers southeast of Fairbanks, Alaska exhibits rapid thermokarst and landscape change. The area contains a dozen pingos (hydrolaccoliths), mounds of ice covered by earth material typically 100 meters across and 20 meters above the surrounding ground surface. The pingos have sunken craters in their centers formed through melting and collapse of an inner ice lens core. Adjacent to the pingos are collapse scar bogs in various states of formation and ice wedge terrain undergoing thaw subsidence to polygons and thermokarst mounds (baydzherakhs). With a mean annual temperature of -1 degree C the area contains warm ecosystem-protected permafrost vulnerable to thaw. We analyzed historical imagery to the 1970s to track water features in a subset of pingos. The craters have expanded over the past few decades suggesting melting and collapse of the ice cored center and potential permafrost degradation along pingo margins. Collapse scar bogs in adjacent low-elevation terrain are roughly the same size as the pingos but have little vertical elevation gradient compared to the surrounding terrain. Electrical resistivity tomography (ERT) measurements, high resolution GPS surveys, SIPRE coring, and thaw depth probing were focused along nine 400 meter transects across three of the pingos to identify relationships between geophysical properties, permafrost composition, seasonal thaw, and ecological state. A large (40 meters across and 20 meters thick) lens shaped region of thawed permafrost is evident in the ERT results about 10 meters below the ground surface in the center of one pingo we surveyed in detail. This is believed to be the original ice cored region of the pingo that has melted. A thin (1-5 meters thick) layer of permafrost is present above this thawed region while the rampart margins surrounding the pingo are underlain by thick (10-30 m) permafrost. The pingo and thermokarst features reside in a location where rapid permafrost

  5. Identification of genome regions determining semen quality in Holstein-Friesian bulls using information theory.

    PubMed

    Borowska, Alicja; Szwaczkowski, Tomasz; Kamiński, Stanisław; Hering, Dorota M; Kordan, Władysław; Lecewicz, Marek

    2018-05-01

    Use of information theory can be an alternative statistical approach to detect genome regions and candidate genes that are associated with livestock traits. The aim of this study was to verify the validity of the SNPs effects on some semen quality variables of bulls using entropy analysis. Records from 288 Holstein-Friesian bulls from one AI station were included. The following semen quality variables were analyzed: CASA kinematic variables of sperm (total motility, average path velocity, straight line velocity, curvilinear velocity, amplitude of lateral head displacement, beat cross frequency, straightness, linearity), sperm membrane integrity (plasmalemma, mitochondrial function), sperm ATP content. Molecular data included 48,192 SNPs. After filtering (call rate = 0.95 and MAF = 0.05), 34,794 SNPs were included in the entropy analysis. The entropy and conditional entropy were estimated for each SNP. Conditional entropy quantifies the remaining uncertainty about values of the variable with the knowledge of SNP. The most informative SNPs for each variable were determined. The computations were performed using the R statistical package. A majority of the loci had relatively small contributions. The most informative SNPs for all variables were mainly located on chromosomes 3, 4, 5 and 16. The results from the study indicate that important genome regions and candidate genes that determine semen quality variables in bulls are located on a number of chromosomes. Some detected clusters of SNPs were located in RNA (U6 and 5S_rRNA) for all the variables for which analysis occurred. Associations between PARK2 as well as GALNT13 genes and some semen characteristics were also detected. Copyright © 2018 Elsevier B.V. All rights reserved.
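
    The entropy and conditional entropy mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' R code: the function names, the toy genotypes and the two-class ("low"/"high") discretization of the trait are all assumptions made for illustration. The difference H(Y) - H(Y|X) is the uncertainty about the trait removed by knowing the SNP genotype.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(Y), in bits, of a discrete sample."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(values, genotypes):
    """H(Y | X): uncertainty left in the trait once the SNP genotype is known."""
    n = len(values)
    groups = {}
    for y, x in zip(values, genotypes):
        groups.setdefault(x, []).append(y)
    # Weight the entropy within each genotype class by the class frequency.
    return sum(len(ys) / n * entropy(ys) for ys in groups.values())


# Hypothetical toy data: a trait discretized into two classes for 8 animals.
trait = ["low", "low", "high", "high", "low", "high", "high", "low"]
snp_a = ["AA", "AA", "GG", "GG", "AA", "GG", "GG", "AA"]  # tracks the trait exactly
snp_b = ["AA", "AG", "AA", "AG", "AA", "AG", "AA", "AG"]  # unrelated to the trait

h = entropy(trait)
gain_a = h - conditional_entropy(trait, snp_a)
gain_b = h - conditional_entropy(trait, snp_b)
print(h, gain_a, gain_b)
```

    In this toy case the first SNP removes all uncertainty about the trait class and the second removes none; ranking SNPs by this difference is one plausible reading of "most informative".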

  6. A Gene-Oriented Haplotype Comparison Reveals Recently Selected Genomic Regions in Temperate and Tropical Maize Germplasm

    PubMed Central

    Zhang, Jie; Li, Yongxiang; Zheng, Jun; Zhang, Hongwei; Yang, Xiaohong; Wang, Jianhua; Wang, Guoying

    2017-01-01

    The extensive genetic variation present in maize (Zea mays) germplasm makes it possible to detect signatures of positive artificial selection that occurred during temperate and tropical maize improvement. Here we report an analysis of 532,815 polymorphisms from a maize association panel consisting of 368 diverse temperate and tropical inbred lines. We developed a gene-oriented approach adapting exonic polymorphisms to identify recently selected alleles by comparing haplotypes across the maize genome. This analysis revealed evidence of selection for more than 1100 genomic regions during recent improvement, and included regulatory genes and key genes with visible mutant phenotypes. We find that selected candidate target genes in temperate maize are enriched in biosynthetic processes, and further examination of these candidates highlights two cases, sucrose flux and oil storage, in which multiple genes in a common pathway can be cooperatively selected. Finally, based on available parallel gene expression data, we hypothesize that some genes were selected for regulatory variations, resulting in altered gene expression. PMID:28099470

  7. Benchmarking database performance for genomic data.

    PubMed

    Khushi, Matloob

    2015-06-01

    Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing various genomic operations, such as identifying overlapping/non-overlapping regions or nearest gene annotations, is a common research need. The data can be saved in a database system for easy management; however, there is no comprehensive database built-in algorithm at present to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm, pairwise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported, and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
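
    As an illustration of what an SQL-based overlap query looks like, here is a minimal sketch using Python's built-in sqlite3 module. It is not the RegMap algorithm, whose details are not given in the abstract, and SQLite stands in for the PostgreSQL and MySQL systems that were benchmarked. The table names, column names and coordinates are hypothetical, and half-open intervals [start_pos, end_pos) are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE peaks (chrom TEXT, start_pos INTEGER, end_pos INTEGER, name TEXT);
    CREATE TABLE genes (chrom TEXT, start_pos INTEGER, end_pos INTEGER, name TEXT);
""")
# Hypothetical regions, half-open coordinates.
conn.executemany("INSERT INTO peaks VALUES (?, ?, ?, ?)", [
    ("chr1", 100, 200, "peak1"),
    ("chr1", 500, 600, "peak2"),
    ("chr2", 100, 200, "peak3"),
])
conn.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", [
    ("chr1", 150, 400, "geneA"),
    ("chr1", 700, 900, "geneB"),
    ("chr2", 50, 120, "geneC"),
])

# Two half-open intervals on the same chromosome overlap exactly when
# each one starts before the other one ends.
overlaps = conn.execute("""
    SELECT p.name, g.name
    FROM peaks AS p
    JOIN genes AS g
      ON p.chrom = g.chrom
     AND p.start_pos < g.end_pos
     AND g.start_pos < p.end_pos
    ORDER BY p.name
""").fetchall()
print(overlaps)
```

    A plain join like this compares every pair of regions on a chromosome, which is presumably the cost that a dedicated algorithm and database indexes aim to reduce.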

  8. Construction of a map-based reference genome sequence for barley, Hordeum vulgare L.

    PubMed Central

    Beier, Sebastian; Himmelbach, Axel; Colmsee, Christian; Zhang, Xiao-Qi; Barrero, Roberto A.; Zhang, Qisen; Li, Lin; Bayer, Micha; Bolser, Daniel; Taudien, Stefan; Groth, Marco; Felder, Marius; Hastie, Alex; Šimková, Hana; Staňková, Helena; Vrána, Jan; Chan, Saki; Muñoz-Amatriaín, María; Ounit, Rachid; Wanamaker, Steve; Schmutzer, Thomas; Aliyeva-Schnorr, Lala; Grasso, Stefano; Tanskanen, Jaakko; Sampath, Dharanya; Heavens, Darren; Cao, Sujie; Chapman, Brett; Dai, Fei; Han, Yong; Li, Hua; Li, Xuan; Lin, Chongyun; McCooke, John K.; Tan, Cong; Wang, Songbo; Yin, Shuya; Zhou, Gaofeng; Poland, Jesse A.; Bellgard, Matthew I.; Houben, Andreas; Doležel, Jaroslav; Ayling, Sarah; Lonardi, Stefano; Langridge, Peter; Muehlbauer, Gary J.; Kersey, Paul; Clark, Matthew D.; Caccamo, Mario; Schulman, Alan H.; Platzer, Matthias; Close, Timothy J.; Hansson, Mats; Zhang, Guoping; Braumann, Ilka; Li, Chengdao; Waugh, Robbie; Scholz, Uwe; Stein, Nils; Mascher, Martin

    2017-01-01

    Barley (Hordeum vulgare L.) is a cereal grass mainly used as animal fodder and raw material for the malting industry. The map-based reference genome sequence of barley cv. ‘Morex’ was constructed by the International Barley Genome Sequencing Consortium (IBSC) using hierarchical shotgun sequencing. Here, we report the experimental and computational procedures to (i) sequence and assemble more than 80,000 bacterial artificial chromosome (BAC) clones along the minimum tiling path of a genome-wide physical map, (ii) find and validate overlaps between adjacent BACs, (iii) construct 4,265 non-redundant sequence scaffolds representing clusters of overlapping BACs, and (iv) order and orient these BAC clusters along the seven barley chromosomes using positional information provided by dense genetic maps, an optical map and chromosome conformation capture sequencing (Hi-C). Integrative access to these sequence and mapping resources is provided by the barley genome explorer (BARLEX). PMID:28448065

  9. A linear mitochondrial genome of Cyclospora cayetanensis (Eimeriidae, Eucoccidiorida, Coccidiasina, Apicomplexa) suggests the ancestral start position within mitochondrial genomes of eimeriid coccidia.

    PubMed

    Ogedengbe, Mosun E; Qvarnstrom, Yvonne; da Silva, Alexandre J; Arrowood, Michael J; Barta, John R

    2015-05-01

    The near complete mitochondrial genome for Cyclospora cayetanensis is 6184 bp in length with three protein-coding genes (Cox1, Cox3, CytB) and numerous lsrDNA and ssrDNA fragments. Gene arrangements were conserved with other coccidia in the Eimeriidae, but the C. cayetanensis mitochondrial genome is not circular-mapping. Terminal transferase tailing and nested PCR completed the 5'-terminus of the genome starting with a 21 bp A/T-only region that forms a potential stem-loop. Regions homologous to the C. cayetanensis mitochondrial genome 5'-terminus are found in all eimeriid mitochondrial genomes available and suggest this may be the ancestral start of eimeriid mitochondrial genomes. Copyright © 2015 Australian Society for Parasitology Inc. All rights reserved.

  10. Indexcov: fast coverage quality control for whole-genome sequencing.

    PubMed

    Pedersen, Brent S; Collins, Ryan L; Talkowski, Michael E; Quinlan, Aaron R

    2017-11-01

    The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large-scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license. © The Authors 2017. Published by Oxford University Press.
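
    The idea of comparing consecutive index entries can be sketched as follows. This is a simplified illustration, not indexcov's implementation: it assumes the linear index has already been parsed into a list of virtual file offsets, one per 16,384-bp window as in the BAI format, and it treats the number of compressed bytes between consecutive windows as the depth proxy. The function name and the offsets are hypothetical.

```python
from statistics import median

TILE_WIDTH = 16_384  # base pairs covered by one linear-index window in a BAI file


def scaled_depth(virtual_offsets):
    """Per-window depth proxy, scaled so that the median window equals 1.0."""
    # A virtual offset stores the compressed-file offset in its upper 48 bits;
    # the lower 16 bits (position inside the decompressed block) are dropped here.
    file_offsets = [v >> 16 for v in virtual_offsets]
    sizes = [b - a for a, b in zip(file_offsets, file_offsets[1:])]
    mid = median(sizes)  # assumed non-zero for this sketch
    return [s / mid for s in sizes]


# Hypothetical offsets: the fourth window holds half as much data as the others.
offsets = [o << 16 for o in (0, 1000, 2000, 3000, 3500, 4500)]
depth = scaled_depth(offsets)
print(depth)
```

    A run of windows near 0.5 or 1.5 on an autosome would be the kind of signal that points to a large deletion or duplication, and the X and Y profiles give a hint of sample sex.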

  11. Integrated Analysis of Genome-wide Copy Number Alterations and Gene Expression in MSS, CIMP-negative Colon Cancer

    PubMed Central

    Loo, Lenora WM; Tiirikainen, Maarit; Cheng, Iona; Lum-Jones, Annette; Seifried, Ann; Church, James M; Gryfe, Robert; Weisenberger, Daniel J; Lindor, Noralane M; Gallinger, Steven; Haile, Robert W; Duggan, David J; Thibodeau, Stephen N; Casey, Graham; Le Marchand, Loïc

    2014-01-01

    Microsatellite stable (MSS), CpG island methylator phenotype (CIMP)-negative colorectal tumors, the most prevalent molecular subtype of colorectal cancer, are associated with extensive copy number alteration (CNA) events and aneuploidy. We report on the identification of characteristic recurrent CNA (with frequency >25%) events and associated gene expression profiles for a total of 40 paired tumor and adjacent normal colon tissues using genome-wide microarrays. We observed recurrent CNAs, namely gains at 1q, 7p, 7q, 8p12-11, 8q, 12p13, 13q, 20p, 20q, Xp, and Xq and losses at 1p36, 1p31, 1p21, 4p15-12, 4q12-35, 5q21-22, 6q26, 8p, 14q, 15q11-12, 17p, 18p, 18q, 21q21-22, and 22q. Within these genomic regions we identified 356 genes with significant differential expression (P<0.0001 and ±1.5 fold change) in the tumor compared to adjacent normal tissue. Gene ontology and pathway analyses indicated that many of these genes were involved in functional mechanisms that regulate cell cycle, cell death, and metabolism. An amplicon present in >70% of the tumor samples at 20q11-20q13 contained several cancer-related genes (AHCY, POFUT1, RPN2, TH1L and PRPF6) that were up-regulated and demonstrated a significant linear correlation (P<0.05) for gene dosage and gene expression. Copy number loss at 8p, a CNA associated with adenocarcinoma and poor prognosis, was observed in >50% of the tumor samples and demonstrated a significant linear correlation for gene dosage and gene expression for two potential tumor suppressor genes, MTUS1 (8p22) and PPP2CB (8p12). The results from our integration analysis illustrate the complex relationship between genomic alterations and gene expression in colon cancer. PMID:23341073

  12. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

  13. Genome-wide association study of Alzheimer's disease.

    PubMed

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; Dekosky, S T; Lopez, O L; Barmada, M M

    2012-05-15

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ~2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69-180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P = 3.05E-07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.

  14. Genome-wide association study of Alzheimer's disease

    PubMed Central

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; DeKosky, S T; Lopez, O L; Barmada, M M

    2012-01-01

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ∼2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69–180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P=3.05E–07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples. PMID:22832961

  15. A systematic review of definitions and classification systems of adjacent segment pathology.

    PubMed

    Kraemer, Paul; Fehlings, Michael G; Hashimoto, Robin; Lee, Michael J; Anderson, Paul A; Chapman, Jens R; Raich, Annie; Norvell, Daniel C

    2012-10-15

    Systematic review. To undertake a systematic review to determine how "adjacent segment degeneration," "adjacent segment disease," or clinical pathological processes that serve as surrogates for adjacent segment pathology are classified and defined in the peer-reviewed literature. Adjacent segment degeneration and adjacent segment disease are terms referring to degenerative changes known to occur after reconstructive spine surgery, most commonly at an immediately adjacent functional spinal unit. These can include disc degeneration, instability, spinal stenosis, facet degeneration, and deformity. The true incidence and clinical impact of degenerative changes at the adjacent segment is unclear because there is lack of a universally accepted classification system that rigorously addresses clinical and radiological issues. A systematic review of the English language literature was undertaken and articles were classified using the Grades of Recommendation Assessment, Development, and Evaluation criteria. Results: Seven classification systems of spinal degeneration, including degeneration at the adjacent segment, were identified. None have been evaluated for reliability or validity specific to patients with degeneration at the adjacent segment. The ways in which terms related to adjacent segment "degeneration" or "disease" are defined in the peer-reviewed literature are highly variable. On the basis of the systematic review presented in this article, no formal classification system for either cervical or thoracolumbar adjacent segment disorders currently exists. No recommendations regarding the use of current classification of degeneration at any segments can be made based on the available literature. A new comprehensive definition for adjacent segment pathology (ASP, the now preferred terminology) has been proposed in this Focus Issue, which reflects the diverse pathology observed at functional spinal units adjacent to previous spinal reconstruction and balances

  16. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  17. Identification of genomic region controlling resistance to aflatoxin contamination in a peanut recombinant inbred line population (Tifrunner x GT-C20)

    USDA-ARS's Scientific Manuscript database

    Aflatoxin contamination of peanut is a significant threat to global food safety. In this study we performed quantitative trait loci (QTL) analysis to identify peanut genomic regions contributing to aflatoxin contamination resistance in a recombinant inbred line (RIL) population derived from the Tifr...

  18. A genome-wide activity assessment of terminator regions in Saccharomyces cerevisiae provides a "terminatome" toolbox.

    PubMed

    Yamanishi, Mamoru; Ito, Yoichiro; Kintaka, Reiko; Imamura, Chie; Katahira, Satoshi; Ikeuchi, Akinori; Moriya, Hisao; Matsuyama, Takashi

    2013-06-21

    The terminator regions of eukaryotes encode functional elements in the 3' untranslated region (3'-UTR) that influence the 3'-end processing of mRNA, mRNA stability, and translational efficiency, which can modulate protein production. However, the contribution of these terminator regions to gene expression remains unclear, and therefore their utilization in metabolic engineering or synthetic genetic circuits has been limited. Here, we comprehensively evaluated the activity of 5302 terminator regions from a total of 5880 genes in the budding yeast Saccharomyces cerevisiae by inserting each terminator region downstream of the P_TDH3-green fluorescent protein (GFP) reporter gene and measuring the fluorescent intensity of GFP. Terminator region activities relative to that of the PGK1 standard terminator ranged from 0.036 to 2.52, with a mean of 0.87. We thus could isolate the most and least active terminator regions. The activities of the terminator regions showed a positive correlation with mRNA abundance, indicating that the terminator region is a determinant of mRNA abundance. The least active terminator regions tended to encode longer 3'-UTRs, suggesting the existence of active degradation mechanisms for those mRNAs. The terminator regions of ribosomal protein genes tended to be the most active, suggesting the existence of a common regulator of those genes. The "terminatome" (the genome-wide set of terminator regions) thus not only provides valuable information to understand the modulatory roles of terminator regions on gene expression but also serves as a useful toolbox for the development of metabolically and genetically engineered yeast.

  19. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    PubMed

    Vercoe, Reuben B; Chang, James T; Dy, Ron L; Taylor, Corinda; Gristwood, Tamzin; Clulow, James S; Richter, Corinna; Przybilski, Rita; Pitman, Andrew R; Fineran, Peter C

    2013-04-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  20. Delayed Acquisition of Non-Adjacent Vocalic Distributional Regularities

    ERIC Educational Resources Information Center

    Gonzalez-Gomez, Nayeli; Nazzi, Thierry

    2016-01-01

    The ability to compute non-adjacent regularities is key in the acquisition of a new language. In the domain of phonology/phonotactics, sensitivity to non-adjacent regularities between consonants has been found to appear between 7 and 10 months. The present study focuses on the emergence of a posterior-anterior (PA) bias, a regularity involving two…

  1. From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

    PubMed

    Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

    2016-04-01

    The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
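
    The k-mer screening described in the abstract above (exact 17-mers occurring more than three times) can be sketched in a few lines. This is an illustrative sketch rather than the authors' pipeline: the function name and the toy sequence are hypothetical, only the forward strand is scanned, and reverse complements and overlapping low-complexity runs are not treated specially.

```python
from collections import Counter


def repeated_kmers(sequence, k=17, min_count=4):
    """Return every exact k-mer that occurs at least min_count times.

    The defaults mirror the abstract's "17-mers occurring more than three times".
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Hypothetical toy sequence; k is lowered to 5 so that the repeat is easy to see.
toy = "ACGTTACGTTGGACGTTCCACGTT"
hits = repeated_kmers(toy, k=5, min_count=4)
print(hits)
```

    On a real mitochondrial genome the same call with the default arguments would flag the candidate repeat families, which could then be matched against annotated regions of interest.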

  2. Evolution of bird genomes-a transposon's-eye view.

    PubMed

    Kapusta, Aurélie; Suh, Alexander

    2017-02-01

    Birds, the most species-rich monophyletic group of land vertebrates, have been subject to some of the most intense sequencing efforts to date, making them an ideal case study for recent developments in genomics research. Here, we review how our understanding of bird genomes has changed with the recent sequencing of more than 75 species from all major avian taxa. We illuminate avian genome evolution from a previously neglected perspective: their repetitive genomic parasites, transposable elements (TEs) and endogenous viral elements (EVEs). We show that (1) birds are unique among vertebrates in terms of their genome organization; (2) information about the diversity of avian TEs and EVEs is changing rapidly; (3) flying birds have smaller genomes yet more TEs than flightless birds; (4) current second-generation genome assemblies fail to capture the variation in avian chromosome number and genome size determined with cytogenetics; (5) the genomic microcosm of bird-TE "arms races" has yet to be explored; and (6) upcoming third-generation genome assemblies suggest that birds exhibit stability in gene-rich regions and instability in TE-rich regions. We emphasize that integration of cytogenetics and single-molecule technologies with repeat-resolved genome assemblies is essential for understanding the evolution of (bird) genomes. © 2016 New York Academy of Sciences.

  3. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs

    PubMed Central

    Freedman, Adam H.; Schweizer, Rena M.; Ortega-Del Vecchyo, Diego; Han, Eunjung; Davis, Brian W.; Gronau, Ilan; Silva, Pedro M.; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R.; Parker, Heidi G.; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D.; Harkins, Timothy T.; Nelson, Stanley F.; Marques-Bonet, Tomas; Ostrander, Elaine A.; Wayne, Robert K.; Novembre, John

    2016-01-01

    Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers. PMID:26943675

  4. End-to-end crosstalk within the hepatitis C virus genome mediates the conformational switch of the 3′X-tail region

    PubMed Central

    Romero-López, Cristina; Barroso-delJesus, Alicia; García-Sacristán, Ana; Briones, Carlos; Berzal-Herranz, Alfredo

    2014-01-01

    The hepatitis C virus (HCV) RNA genome contains multiple structurally conserved domains that make long-distance RNA–RNA contacts important in the establishment of viral infection. Microarray antisense oligonucleotide assays, improved dimethyl sulfate probing methods and 2′ acylation chemistry (selective 2′-hydroxyl acylation and primer extension, SHAPE) showed the folding of the genomic RNA 3′ end to be regulated by the internal ribosome entry site (IRES) element via direct RNA–RNA interactions. The essential cis-acting replicating element (CRE) and the 3′X-tail region adopted different 3D conformations in the presence and absence of the genomic RNA 5′ terminus. Further, the structural transition in the 3′X-tail from the replication-competent conformer (consisting of three stem-loops) to the dimerizable form (with two stem-loops) was found to depend on the presence of both the IRES and the CRE elements. Complex interplay between the IRES, the CRE and the 3′X-tail region would therefore appear to occur. The preservation of this RNA–RNA interacting network, and the maintenance of the proper balance between different contacts, may play a crucial role in the switch between different steps of the HCV cycle. PMID:24049069

  5. Adjacent Segment Disease After Cervical Spine Fusion: Evaluation of a 70 Patient Long-Term Follow-Up.

    PubMed

    Alhashash, Mohamed; Shousha, Mootaz; Boehm, Heinrich

    2018-05-01

    A retrospective study of 70 patients undergoing surgical treatment for adjacent segment disease (ASD) after anterior cervical decompression and fusion (ACDF). To analyze the risk factors for the development of ASD in patients who underwent ACDF. ACDF has provided a high rate of clinical success for the cervical degenerative disc disease; nevertheless, adjacent segment degeneration has been reported as a complication at the adjacent level secondary to the rigid fixation. Between January 2005 and December 2012, 70 consecutive patients underwent surgery for ASD after ACDF in our institution. In all patients thorough clinical and radiological examination was performed preoperatively, postoperatively, and at the final follow-up. The clinical data included the Neck Disability Index (NDI) and the Visual Analogue Scale (VAS). The radiological evaluation included x-rays and magnetic resonance imaging (MRI) for all patients. The duration of follow up after the adjacent segment operation ranged from 3 to 10 years. Surgery for ASD was performed after a mean period of 32 months from the primary ACDF. ASD occurred after single level ACDF in 54% of cases, most commonly after C5/6 fusion (28%). Risk factors for ASD were found to be preexisting radiological signs of degeneration at the primary surgery (74%) and bad sagittal profile after the primary ACDF (90%). ASD occurred predominantly in the middle cervical region (C4-6); especially in patients with preexisting evidence of radiological degeneration in the adjacent segment at the time of primary cervical fusion, notably when this surgery failed to restore or maintain the cervical lordosis. Level of Evidence: 4.

  6. Sequence-specific epigenetic effects of the maternal somatic genome on developmental rearrangements of the zygotic genome in Paramecium primaurelia.

    PubMed Central

    Meyer, E; Butler, A; Dubrana, K; Duharcourt, S; Caron, F

    1997-01-01

    In ciliates, the germ line genome is extensively rearranged during the development of the somatic macronucleus from a mitotic product of the zygotic nucleus. Germ line chromosomes are fragmented in specific regions, and a large number of internal sequence elements are eliminated. It was previously shown that transformation of the vegetative macronucleus of Paramecium primaurelia with a plasmid containing a subtelomeric surface antigen gene can affect the processing of the homologous germ line genomic region during development of a new macronucleus in sexual progeny of transformed clones. The gene and telomere-proximal flanking sequences are deleted from the new macronuclear genome, although the germ line genome remains wild type. Here we show that plasmids containing nonoverlapping segments of the same genomic region are able to induce similar terminal deletions; the locations of deletion end points depend on the particular sequence used. Transformation of the maternal macronucleus with a sequence internal to a macronuclear chromosome also causes the occurrence of internal deletions between short direct repeats composed of alternating thymines and adenines. The epigenetic influence of maternal macronuclear sequences on developmental rearrangements of the zygotic genome thus appears to be both sequence specific and general, suggesting that this trans-nucleus effect is mediated by pairing of homologous sequences. PMID:9199294

  7. Characterization of the complete mitochondrial genome of the Grey-backed Shrike, Lanius tephronotus (Aves: Passeriformes): the first representative of the family Laniidae with a novel CAA stop codon at the end of cox2 gene.

    PubMed

    Qian, Chaoju; Yan, Xia; Guo, Zhichun; Wang, Yuanxiu; Li, Xixi; Yang, Jianke; Kan, Xianzhao

    2013-08-01

    The complete Grey-backed Shrike mitochondrial genome has been sequenced and found to be 16,820 bp in length, consisting of 37 encoded genes: 13 protein-coding genes, 2 ribosomal RNA genes, and 22 transfer RNA genes. In addition, a single control region was also observed. Compared with other reported Passeriformes mtgenome sequences, the three bases CAA were detected at the end of the Lanius tephronotus cox2 gene, with T as the adjacent downstream base. The first base of CAA probably undergoes a C-to-U transcript editing event, resulting in a normal stop codon UAA.

  8. Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

    PubMed Central

    Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

    2012-01-01

    Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
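
    The counting step in this abstract (how many SNPs in a segment are nominally significant, and whether that is more than expected by chance) can be illustrated with a plain binomial tail probability. This is only a simplified sketch: the study used a permutation-based test that takes linkage into account, whereas the code below assumes independent SNPs and will therefore tend to be anti-conservative on real data. The threshold `alpha=0.05` and the function names are assumptions made for illustration.

```python
from math import exp, lgamma, log, log1p


def binomial_tail(n_snps, n_hits, alpha=0.05):
    """P(X >= n_hits) for X ~ Binomial(n_snps, alpha), computed in log space."""
    total = 0.0
    for k in range(n_hits, n_snps + 1):
        log_term = (lgamma(n_snps + 1) - lgamma(k + 1) - lgamma(n_snps - k + 1)
                    + k * log(alpha) + (n_snps - k) * log1p(-alpha))
        total += exp(log_term)
    return min(1.0, total)


def segment_enrichment(p_values, alpha=0.05):
    """Count nominally significant SNPs in one segment and score the excess.

    Assumes independent tests; the abstract's permutation scheme is not
    reproduced here.
    """
    hits = sum(p < alpha for p in p_values)
    return hits, binomial_tail(len(p_values), hits, alpha)


if __name__ == "__main__":
    # Hypothetical per-SNP p-values for one chromosome segment.
    print(segment_enrichment([0.01, 0.20, 0.03, 0.90, 0.04, 0.60]))
```

    Replacing `binomial_tail` with an empirical null built from phenotype permutations would bring the sketch closer to what the abstract describes.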

  9. Megabase replication domains along the human genome: relation to chromatin structure and genome organisation.

    PubMed

    Audit, Benjamin; Zaghloul, Lamia; Baker, Antoine; Arneodo, Alain; Chen, Chun-Long; d'Aubenton-Carafa, Yves; Thermes, Claude

    2013-01-01

    In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.
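
    A strand compositional asymmetry profile of the kind mentioned in this abstract can be sketched as a windowed skew. The abstract does not give a formula; the code below assumes the commonly used definition S = (T-A)/(T+A) + (G-C)/(G+C) computed in non-overlapping windows, and the window size and function name are hypothetical choices for illustration.

```python
def skew_profile(seq, window=1000):
    """Return the total skew S = S_TA + S_GC for each non-overlapping window.

    Assumed definition (not stated in the abstract):
        S_TA = (T - A) / (T + A),  S_GC = (G - C) / (G + C)
    A component is set to 0.0 when its denominator is zero.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        a, c, g, t = (chunk.count(base) for base in "ACGT")
        s_ta = (t - a) / (t + a) if (t + a) else 0.0
        s_gc = (g - c) / (g + c) if (g + c) else 0.0
        profile.append(s_ta + s_gc)
    return profile


if __name__ == "__main__":
    # Toy sequence: a T/G-rich window followed by an A/C-rich window.
    print(skew_profile("TTTTGGGG" + "AAAACCCC", window=8))
```

    On real chromosomes the windows would be far larger, and N-shaped domains would show up as an abrupt upward jump followed by a slow linear decrease toward the next jump.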

  10. Phylogenetic Invariants for Metazoan Mitochondrial Genome Evolution.

    PubMed

    Sankoff; Blanchette

    1998-01-01

    The method of phylogenetic invariants was developed to apply to aligned sequence data generated, according to a stochastic substitution model, for N species related through an unknown phylogenetic tree. The invariants are functions of the probabilities of the observable N-tuples, which are identically zero, over all choices of branch length, for some trees. Evaluating the invariants associated with all possible trees, using observed N-tuple frequencies over all sequence positions, enables us to rapidly infer the generating tree. An aspect of evolution at the genomic level much studied recently is the rearrangements of gene order along the chromosome from one species to another. Instead of the substitutions responsible for sequence evolution, we examine the non-local processes responsible for genome rearrangements such as inversion of arbitrarily long segments of chromosomes. By treating the potential adjacency of each possible pair of genes as a "position", an appropriate "substitution" model can be recognized as governing the rearrangement process, and a probabilistically principled phylogenetic inference can be set up. We calculate the invariants for this process for N=5, and apply them to mitochondrial genome data from coelomate metazoans, showing how they resolve key aspects of branching order.
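
    The idea of treating each possible gene pair as a "position" can be made concrete by encoding a gene order as a vector of binary adjacency characters. The sketch below is an illustration under simplifying assumptions that are not taken from the abstract: gene orientation (strand) is ignored, the chromosome is treated as circular by default, and the function name is hypothetical.

```python
from itertools import combinations


def adjacency_characters(order, genes, circular=True):
    """Map every unordered gene pair to 1 if the two genes are adjacent, else 0.

    `order` is one genome's gene order; `genes` is the shared gene set.
    Orientation is ignored (a simplifying assumption).
    """
    adjacent = {frozenset(pair) for pair in zip(order, order[1:])}
    if circular and len(order) > 2:
        adjacent.add(frozenset((order[-1], order[0])))
    return {pair: int(frozenset(pair) in adjacent)
            for pair in combinations(sorted(genes), 2)}


if __name__ == "__main__":
    genes = ["a", "b", "c", "d"]
    # Two hypothetical genomes that differ by a rearrangement.
    print(adjacency_characters(["a", "b", "c", "d"], genes))
    print(adjacency_characters(["a", "c", "b", "d"], genes))
```

    Each gene pair then behaves like an alignment column with two states, which is what allows a substitution-style model to be applied to rearrangement data.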

  11. Semiconductor laser having a non-absorbing passive region with beam guiding

    NASA Technical Reports Server (NTRS)

    Botez, Dan (Inventor)

    1986-01-01

    A laser comprises a semiconductor body having a pair of end faces and including an active region comprising adjacent active and guide layers which is spaced a distance from the end face and a passive region comprising adjacent non-absorbing guide and mode control layers which extends between the active region and the end face. The combination of the guide and mode control layers provides a weak positive index waveguide in the lateral direction thereby providing lateral mode control in the passive region between the active region and the end face.

  12. Comparison of Genome-Wide Binding of MyoD in Normal Human Myogenic Cells and Rhabdomyosarcomas Identifies Regional and Local Suppression of Promyogenic Transcription Factors

    PubMed Central

    MacQuarrie, Kyle L.; Yao, Zizhen; Fong, Abraham P.; Diede, Scott J.; Rudzinski, Erin R.; Hawkins, Douglas S.

    2013-01-01

    Rhabdomyosarcoma is a pediatric tumor of skeletal muscle that expresses the myogenic basic helix-loop-helix protein MyoD but fails to undergo terminal differentiation. Prior work has determined that DNA binding by MyoD occurs in the tumor cells, but myogenic targets fail to activate. Using MyoD chromatin immunoprecipitation coupled to high-throughput sequencing and gene expression analysis in both primary human muscle cells and RD rhabdomyosarcoma cells, we demonstrate that MyoD binds in a similar genome-wide pattern in both tumor and normal cells but binds poorly at a subset of myogenic genes that fail to activate in the tumor cells. Binding differences are found both across genomic regions and locally at specific sites that are associated with binding motifs for RUNX1, MEF2C, JDP2, and NFIC. These factors are expressed at lower levels in RD cells than muscle cells and rescue myogenesis when expressed in RD cells. MEF2C is located in a genomic region that exhibits poor MyoD binding in RD cells, whereas JDP2 exhibits local DNA hypermethylation in its promoter in both RD cells and primary tumor samples. These results demonstrate that regional and local silencing of differentiation factors contributes to the differentiation defect in rhabdomyosarcomas. PMID:23230269

  13. Somatic Mutation Patterns in Hemizygous Genomic Regions Unveil Purifying Selection during Tumor Evolution

    PubMed Central

    Basu, Swaraj; Larsson, Erik

    2016-01-01

    Identification of cancer driver genes using somatic mutation patterns indicative of positive selection has become a major goal in cancer genomics. However, cancer cells additionally depend on a large number of genes involved in basic cellular processes. While such genes should in theory be subject to strong purifying (negative) selection against damaging somatic mutations, these patterns have been elusive and purifying selection remains inadequately explored in cancer. Here, we hypothesized that purifying selection should be evident in hemizygous genomic regions, where damaging mutations cannot be compensated for by healthy alleles. Using a 7,781-sample pan-cancer dataset, we first confirmed this in POLR2A, an essential gene where hemizygous deletions are known to confer elevated sensitivity to pharmacological suppression. We next used this principle to identify several genes and pathways that show patterns indicative of purifying selection to avoid deleterious mutations. These include the POLR2A interacting protein INTS10 as well as genes involved in mRNA splicing, nonsense-mediated mRNA decay and other RNA processing pathways. Many of these genes belong to large protein complexes, and strong overlaps were observed with recent functional screens for gene essentiality in human cells. Our analysis supports that purifying selection acts to preserve the remaining function of many hemizygously deleted essential genes in tumors, indicating vulnerabilities that might be exploited by future therapeutic strategies. PMID:28027311

  14. Identification of accessory genome regions in poultry Clostridium perfringens isolates carrying the netB plasmid.

    PubMed

    Lepp, D; Gong, J; Songer, J G; Boerlin, P; Parreira, V R; Prescott, J F

    2013-03-01

    Necrotic enteritis (NE) is an economically important disease of poultry caused by certain Clostridium perfringens type A strains. NE pathogenesis involves the NetB toxin, which is encoded on a large conjugative plasmid within a 42-kb pathogenicity locus. Recent multilocus sequence type (MLST) studies have identified two predominant NE-associated clonal groups, suggesting that host genes are also involved in NE pathogenesis. We used microarray comparative genomic hybridization (CGH) to assess the gene content of 54 poultry isolates from birds that were healthy or that suffered from NE. A total of 400 genes were variably present among the poultry isolates and nine nonpoultry strains, many of which had putative functions related to nutrient uptake and metabolism and cell wall and capsule biosynthesis. The variable genes were organized into 142 genomic regions, 49 of which contained genes significantly associated with netB-positive isolates. These regions included three previously identified NE-associated loci as well as several apparent fitness-related loci, such as a carbohydrate ABC transporter, a ferric-iron siderophore uptake system, and an adhesion locus. Additional loci were related to plasmid maintenance. Cluster analysis of the CGH data grouped all of the netB-positive poultry isolates into two major groups, separated according to two prevalent clonal groups based on MLST analysis. This study identifies chromosomal loci associated with netB-positive poultry strains, suggesting that the chromosomal background can confer a selective advantage to NE-causing strains, possibly through mechanisms involving iron acquisition, carbohydrate metabolism, and plasmid maintenance.

  15. The complete chloroplast genome sequence of Dendrobium nobile.

    PubMed

    Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu

    2016-11-01

    The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered species used in traditional Chinese medicine and of important economic value, is presented in this article. The total genome size is 150,793 bp, comprising a large single copy (LSC) region (84,939 bp) and a small single copy (SSC) region (13,310 bp), which are separated by two inverted repeat (IR) regions (26,272 bp each). The overall GC content of the plastid genome is 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.
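
    The region sizes reported in this abstract can be checked against the total length, since a quadripartite plastome consists of the LSC, the SSC and two IR copies. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    # Values as reported for Dendrobium nobile in the abstract above.
    total = plastome_length(lsc=84_939, ssc=13_310, ir=26_272)
    print(total, total == 150_793)
```

    The same check is a quick way to catch transcription errors when region sizes are copied between annotation files and manuscripts.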

  16. Complete mitochondrial genome of endangered Yellow-shouldered Amazon (Amazona barbadensis): two control region copies in parrot species of the Amazona genus.

    PubMed

    Urantowka, Adam Dawid; Hajduk, Kacper; Kosowska, Barbara

    2013-08-01

    Amazona barbadensis is an endangered species of parrot living in northern coastal Venezuela and on several Caribbean islands. In this study, we sequenced the full mitochondrial genome of this species. The total length of the mitogenome was 18,983 bp and it contained 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, a duplicated control region, and degenerate copies of the ND6 and tRNA (Glu) genes. The high degree of identity between the two copies of the control region suggests their coincident evolution and functionality. Comparative analysis of both control region sequences from four Amazona species revealed 89.1% identity over a region of 1300 bp and indicates the presence of distinctive parts in the two control region copies.

  17. CRISPR/Cas9 for genome editing: progress, implications and challenges.

    PubMed

    Zhang, Feng; Wen, Yan; Guo, Xiong

    2014-09-15

    The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements and facilitating the elucidation of target gene function in biology and disease. CRISPR/Cas9 comprises a nonspecific Cas9 nuclease and a set of programmable sequence-specific CRISPR RNAs (crRNAs), which can guide Cas9 to cleave DNA and generate double-strand breaks at target sites. The subsequent cellular DNA repair process leads to the desired insertions, deletions or substitutions at target sites. The specificity of CRISPR/Cas9-mediated DNA cleavage requires a target sequence matching the crRNA and a protospacer adjacent motif located downstream of the target sequence. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the clinical therapeutic potential of CRISPR/Cas9 in the future. © The Author 2014. Published by Oxford University Press. All rights reserved.
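
    The targeting requirement described in this abstract (a target sequence with a protospacer adjacent motif immediately downstream) can be sketched as a simple sequence scan. The abstract does not specify the motif or target length; the code below assumes the widely used 5'-NGG-3' PAM and a 20-nt protospacer, scans the forward strand only, and uses a hypothetical function name.

```python
import re


def find_targets(seq, protospacer_len=20, pam_regex="[ACGT]GG"):
    """Return (start, protospacer, PAM) for every site followed by a PAM.

    Assumptions (not from the abstract): NGG PAM, 20-nt protospacer,
    forward strand only. A lookahead is used so overlapping sites are kept.
    """
    seq = seq.upper()
    pattern = re.compile(f"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence: 20 A's followed by a TGG motif.
    for start, protospacer, pam in find_targets("A" * 20 + "TGG"):
        print(start, protospacer, pam)
```

    A practical guide-design tool would also scan the reverse complement and score candidate sites for off-target matches, which this sketch does not attempt.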

  18. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  19. Genome-Wide Analysis of the Arabidopsis Replication Timing Program

    PubMed Central

    Brooks, Ashley M.; Wheeler, Emily; LeBlanc, Chantal; Lee, Tae-Jin; Martienssen, Robert A.; Thompson, William F.

    2018-01-01

    Eukaryotes use a temporally regulated process, known as the replication timing program, to ensure that their genomes are fully and accurately duplicated during S phase. Replication timing programs are predictive of genomic features and activity and are considered to be functional readouts of chromatin organization. Although replication timing programs have been described for yeast and animal systems, much less is known about the temporal regulation of plant DNA replication or its relationship to genome sequence and chromatin structure. We used the thymidine analog, 5-ethynyl-2′-deoxyuridine, in combination with flow sorting and Repli-Seq to describe, at high-resolution, the genome-wide replication timing program for Arabidopsis (Arabidopsis thaliana) Col-0 suspension cells. We identified genomic regions that replicate predominantly during early, mid, and late S phase, and correlated these regions with genomic features and with data for chromatin state, accessibility, and long-distance interaction. Arabidopsis chromosome arms tend to replicate early while pericentromeric regions replicate late. Early and mid-replicating regions are gene-rich and predominantly euchromatic, while late regions are rich in transposable elements and primarily heterochromatic. However, the distribution of chromatin states across the different times is complex, with each replication time corresponding to a mixture of states. Early and mid-replicating sequences interact with each other and not with late sequences, but early regions are more accessible than mid regions. The replication timing program in Arabidopsis reflects a bipartite genomic organization with early/mid-replicating regions and late regions forming separate, noninteracting compartments. The temporal order of DNA replication within the early/mid compartment may be modulated largely by chromatin accessibility. PMID:29301956

  20. Genome-wide signatures of flowering adaptation to climate temperature: Regional analyses in a highly diverse native range of Arabidopsis thaliana.

    PubMed

    Tabas-Madrid, Daniel; Méndez-Vigo, Belén; Arteaga, Noelia; Marcer, Arnald; Pascual-Montano, Alberto; Weigel, Detlef; Xavier Picó, F; Alonso-Blanco, Carlos

    2018-03-08

    Current global change is fueling interest in understanding the genetic and molecular mechanisms of plant adaptation to climate. In particular, altered flowering time is a common strategy for escape from unfavourable climate temperature. In order to determine the genomic bases underlying flowering time adaptation to this climatic factor, we have systematically analysed a collection of 174 highly diverse Arabidopsis thaliana accessions from the Iberian Peninsula. Analyses of 1.88 million single nucleotide polymorphisms provide evidence for a spatially heterogeneous contribution of demographic and adaptive processes to geographic patterns of genetic variation. Mountains appear to be allele dispersal barriers, whereas the relationship between flowering time and temperature depended on the precise temperature range. Environmental genome-wide associations supported an overall genome adaptation to temperature, with 9.4% of the genes showing significant associations. Furthermore, phenotypic genome-wide associations provided a catalogue of candidate genes underlying flowering time variation. Finally, comparison of environmental and phenotypic genome-wide associations identified known (Twin Sister of FT, FRIGIDA-like 1, and Casein Kinase II Beta chain 1) and new (Epithiospecifier Modifier 1 and Voltage-Dependent Anion Channel 5) genes as candidates for adaptation to climate temperature by altered flowering time. Thus, this regional collection provides an excellent resource to address the spatial complexity of climate adaptation in annual plants. © 2018 John Wiley & Sons Ltd.

  1. Evolutionary genomics of animal personality.

    PubMed

    van Oers, Kees; Mueller, Jakob C

    2010-12-27

    Research on animal personality can be approached from both a phenotypic and a genetic perspective. While using a phenotypic approach one can measure present selection on personality traits and their combinations. However, this approach cannot reconstruct the historical trajectory that was taken by evolution. Therefore, it is essential for our understanding of the causes and consequences of personality diversity to link phenotypic variation in personality traits with polymorphisms in genomic regions that code for this trait variation. Identifying genes or genome regions that underlie personality traits will open exciting possibilities to study natural selection at the molecular level, gene-gene and gene-environment interactions, pleiotropic effects and how gene expression shapes personality phenotypes. In this paper, we will discuss how genome information revealed by already established approaches and some more recent techniques such as high-throughput sequencing of genomic regions in a large number of individuals can be used to infer micro-evolutionary processes, historical selection and finally the maintenance of personality trait variation. We will do this by reviewing recent advances in molecular genetics of animal personality, but will also use advanced human personality studies as case studies of how molecular information may be used in animal personality research in the near future.

  2. A survey of genome-wide single nucleotide polymorphisms through genome resequencing in the Périgord black truffle (Tuber melanosporum Vittad.).

    PubMed

    Payen, Thibaut; Murat, Claude; Gigant, Anaïs; Morin, Emmanuelle; De Mita, Stéphane; Martin, Francis

    2015-09-01

    The Périgord black truffle (Tuber melanosporum Vittad.), considered a gastronomic delicacy worldwide, is an ectomycorrhizal filamentous fungus that is ecologically important in Mediterranean French, Italian and Spanish woodlands. In this study, we developed a novel resource of single nucleotide polymorphisms (SNPs) for T. melanosporum using Illumina high-throughput resequencing. The genomes of six T. melanosporum geographical accessions were sequenced to a depth of approximately 20×. These geographical accessions were selected from different populations within the northern and southern regions of the geographical species distribution. Approximately 80% of the reads for each of the six resequenced geographical accessions mapped against the reference T. melanosporum genome assembly, estimating the core genome size of this organism to be approximately 110 Mbp. A total of 442 326 SNPs, corresponding to 3540 SNPs/Mbp, were identified as being included in all seven genomes. The SNPs occurred more frequently in repeated sequences (85%), although 4501 SNPs were also identified in the coding regions of 2587 genes. Using the ratio of nonsynonymous mutations per nonsynonymous site (pN) to synonymous mutations per synonymous site (pS) and Tajima's D index scanning the whole genome, we were able to identify genomic regions and genes potentially subjected to positive or purifying selection. The SNPs identified represent a valuable resource for future population genetics and genomics studies. © 2015 John Wiley & Sons Ltd.
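
    Of the two statistics named in this abstract, Tajima's D is straightforward to sketch for a small alignment of haploid sequences. The code below uses the standard formula (Tajima 1989) and is not the authors' genome-scanning pipeline; the function name is hypothetical, sites with more than two alleles are counted as a single segregating site, and the pN/pS ratio is not computed here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences, or None if undefined.

    Assumptions: no gaps or missing data; any column with more than one
    distinct symbol counts as one segregating site.
    """
    n = len(seqs)
    sites = [col for col in zip(*seqs) if len(set(col)) > 1]
    s = len(sites)
    if n < 4 or s == 0:
        return None  # the variance term is zero or undefined for tiny samples
    n_pairs = n * (n - 1) / 2
    # Mean number of pairwise differences (pi), summed over segregating sites.
    pi = sum(sum(x != y for x, y in combinations(col, 2)) for col in sites) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Hypothetical toy alignments: rare variants versus balanced variants.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
    print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

    An excess of rare variants gives negative values, while an excess of intermediate-frequency variants gives positive values; in a genome scan the statistic would be computed in sliding windows.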

  3. The genomes of three Bradyrhizobium sp. isolated from root nodules of Lupinus albescens grown in extremely poor soils display important genes for resistance to environmental stress.

    PubMed

    Granada, Camille E; Vargas, Luciano K; Sant'Anna, Fernando Hayashi; Balsanelli, Eduardo; Baura, Valter Antonio de; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Falcon, Tiago; Passaglia, Luciane M P

    2018-05-17

    Lupinus albescens is a resistant cover plant that establishes symbiotic relationships with bacteria belonging to the Bradyrhizobium genus. This symbiosis helps the development of these plants in adverse environmental conditions, such as the ones found in arenized areas of Southern Brazil. This work studied three Bradyrhizobium sp. (AS23, NAS80 and NAS96) isolated from L. albescens plants that grow in extremely poor soils (arenized areas and adjacent grasslands). The genomes of these three strains were sequenced on the Ion Torrent platform using the IonXpress library preparation kit, and presented a total number of bases of 1,230,460,823 for AS23, 1,320,104,022 for NAS80, and 1,236,105,093 for NAS96. The genome comparison with the closest strains, Bradyrhizobium japonicum USDA6 and Bradyrhizobium diazoefficiens USDA110, showed important variable regions (with less than 80% similarity). Genes encoding factors for resistance/tolerance to heavy metals, flagellar motility, response to osmotic and oxidative stresses, and heat shock proteins (present only in the three sequenced genomes) could be responsible for the ability of these microorganisms to survive in inhospitable environments. Knowledge about these genomes will provide a foundation for future development of an inoculant bioproduct that should optimize the recovery of degraded soils using cover crops.

  4. The complete chloroplast genome of the Dendrobium strongylanthum (Orchidaceae: Epidendroideae).

    PubMed

    Li, Jing; Chen, Chen; Wang, Zhe-Zhi

    2016-07-01

    Complete chloroplast genome sequences are very useful for studying the phylogeny and evolution of species. In this study, the complete chloroplast genome of Dendrobium strongylanthum was constructed from whole-genome Illumina sequencing data. The chloroplast genome is 153 058 bp in length with 37.6% GC content and contains two inverted repeats (IRs) of 26 316 bp each. The IR regions are separated by a large single-copy region (LSC, 85 836 bp) and a small single-copy region (SSC, 14 590 bp). A total of 130 chloroplast genes were successfully annotated, including 84 protein-coding genes, 38 tRNA genes, and eight rRNA genes. Phylogenetic analyses showed that the chloroplast genome of Dendrobium strongylanthum is related to that of Dendrobium officinale.
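
    The region lengths reported in this abstract can be checked against the stated total with a short calculation. The sketch below assumes the usual quadripartite layout (LSC + SSC + two IR copies); the function name is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


# Values taken from the Dendrobium strongylanthum abstract.
total = quadripartite_length(lsc=85836, ssc=14590, ir=26316)
print(total)  # 153058, matching the reported genome length
```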

  5. The complete chloroplast genome of a medicinal plant Epimedium koreanum Nakai (Berberidaceae).

    PubMed

    Lee, Jung-Hoon; Kim, Kyunghee; Kim, Na-Rae; Lee, Sang-Choon; Yang, Tae-Jin; Kim, Young-Dong

    2016-11-01

    Epimedium koreanum is a perennial medicinal plant distributed in Eastern Asia. The complete chloroplast genome sequence of E. koreanum was obtained by de novo assembly using whole-genome next-generation sequences. The chloroplast genome of E. koreanum was 157 218 bp in length and separated into four distinct regions: a large single-copy region (89 600 bp), a small single-copy region (17 222 bp) and a pair of inverted repeat regions (25 198 bp each). The genome contained a total of 112 genes, including 78 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. Phylogenetic analysis with the reported chloroplast genomes revealed that E. koreanum is most closely related to Berberis bealei, a traditional medicinal plant in the Berberidaceae family.

  6. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae.

    PubMed

    Guisinger, Mary M; Chumley, Timothy W; Kuehl, Jennifer V; Boore, Jeffrey L; Jansen, Robert K

    2010-02-01

    Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.

  7. The ground beetles (Coleoptera: Carabidae) of the Strandzha Mountain and adjacent coastal territories (Bulgaria and Turkey)

    PubMed Central

    Guéorguiev, Borislav

    2016-01-01

    Background: The knowledge of the ground-beetle fauna of Strandzha is currently incomplete, and is largely based on data from the Bulgarian part of the region and on records resulting from casual collecting. This study represents a critical revision of the available literature, museum collections and a three-year field study of the carabid beetles of the Bulgarian and Turkish parts of Strandzha Mountain and the adjacent Black Sea Coast territories. New information: A total of 328 species and subspecies of Carabidae, belonging to 327 species from the region of Strandzha Mountain and adjacent seacoast area, have been listed. Of these, 77 taxa represent new records for the Bulgarian part of the region, and 110 taxa new records for the Turkish part of the studied region. Two taxa, one subgenus (Haptotapinus Reitter, 1886) and one species (Pterostichus crassiusculus), are new to the fauna of Bulgaria. Based on a misidentification, the species Apotomus testaceus is excluded from the list of the Bulgarian fauna. Seven species (Carabus violaceus azurescens, Apotomus rufus, Platynus proximus, Molops alpestris kalofericus, M. dilatatus angulicollis, Pterostichus merklii, and Calathus metallicus) are treated as doubtful for the regional fauna, and one (Apotomus rufus) also for the Bulgarian fauna. Altogether, 43 taxa collected in the Turkish part of the region are new for European Turkey. New taxa for Turkey are the genera Myas and Oxypselaphus, the subgenus Feronidius, and nine species and subspecies (Carabus granulatus granulatus, Dyschirius tristis, Bembidion normannum apfelbecki, B. subcostatum vau, Acupalpus exiguus, Myas chalybaeus, Oxypselaphus obscurus, Pterostichus leonisi, Pt. melas). In addition, there are a further seven species that are here confirmed for Turkey. PMID:27099564

  8. Genome-wide association study identifies three new melanoma susceptibility loci.

    PubMed

    Barrett, Jennifer H; Iles, Mark M; Harland, Mark; Taylor, John C; Aitken, Joanne F; Andresen, Per Arne; Akslen, Lars A; Armstrong, Bruce K; Avril, Marie-Francoise; Azizi, Esther; Bakker, Bert; Bergman, Wilma; Bianchi-Scarrà, Giovanna; Bressac-de Paillerets, Brigitte; Calista, Donato; Cannon-Albright, Lisa A; Corda, Eve; Cust, Anne E; Dębniak, Tadeusz; Duffy, David; Dunning, Alison M; Easton, Douglas F; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Giles, Graham G; Hansson, Johan; Hocevar, Marko; Höiom, Veronica; Hopper, John L; Ingvar, Christian; Janssen, Bart; Jenkins, Mark A; Jönsson, Göran; Kefford, Richard F; Landi, Giorgio; Landi, Maria Teresa; Lang, Julie; Lubiński, Jan; Mackie, Rona; Malvehy, Josep; Martin, Nicholas G; Molven, Anders; Montgomery, Grant W; van Nieuwpoort, Frans A; Novakovic, Srdjan; Olsson, Håkan; Pastorino, Lorenza; Puig, Susana; Puig-Butille, Joan Anton; Randerson-Moor, Juliette; Snowden, Helen; Tuominen, Rainer; Van Belle, Patricia; van der Stoep, Nienke; Whiteman, David C; Zelenika, Diana; Han, Jiali; Fang, Shenying; Lee, Jeffrey E; Wei, Qingyi; Lathrop, G Mark; Gillanders, Elizabeth M; Brown, Kevin M; Goldstein, Alisa M; Kanetsky, Peter A; Mann, Graham J; Macgregor, Stuart; Elder, David E; Amos, Christopher I; Hayward, Nicholas K; Gruis, Nelleke A; Demenais, Florence; Bishop, Julia A Newton; Bishop, D Timothy

    2011-10-09

    We report a genome-wide association study for melanoma that was conducted by the GenoMEL Consortium. Our discovery phase included 2,981 individuals with melanoma and 1,982 study-specific control individuals of European ancestry, as well as an additional 6,426 control subjects from French or British populations, all of whom were genotyped for 317,000 or 610,000 single-nucleotide polymorphisms (SNPs). Our analysis replicated previously known melanoma susceptibility loci. Seven new regions with at least one SNP with P < 10⁻⁵ and further local imputed or genotyped support were selected for replication using two other genome-wide studies (from Australia and Texas, USA). Additional replication came from case-control series from the UK and The Netherlands. Variants at three of the seven loci replicated at P < 10⁻³: an SNP in ATM (rs1801516, overall P = 3.4 × 10⁻⁹), an SNP in MX2 (rs45430, P = 2.9 × 10⁻⁹) and an SNP adjacent to CASP8 (rs13016963, P = 8.6 × 10⁻¹⁰). A fourth locus near CCND1 remains of potential interest, showing suggestive but inconclusive evidence of replication (rs1485993, overall P = 4.6 × 10⁻⁷ under a fixed-effects model and P = 1.2 × 10⁻³ under a random-effects model). These newly associated variants showed no association with nevus or pigmentation phenotypes in a large British case-control series.

  9. A Single Multiplex crRNA Array for FnCpf1-Mediated Human Genome Editing.

    PubMed

    Sun, Huihui; Li, Fanfan; Liu, Jie; Yang, Fayu; Zeng, Zhenhai; Lv, Xiujuan; Tu, Mengjun; Liu, Yeqing; Ge, Xianglian; Liu, Changbao; Zhao, Junzhao; Zhang, Zongduan; Qu, Jia; Song, Zongming; Gu, Feng

    2018-06-15

    Cpf1 has been harnessed as a tool for genome manipulation in various species because of its simplicity and high efficiency. Our recent study demonstrated that FnCpf1 could be utilized for human genome editing with notable advantages for target sequence selection due to the flexibility of the protospacer adjacent motif (PAM) sequence. Multiplex genome editing provides a powerful tool for targeting members of multigene families, dissecting gene networks, modeling multigenic disorders in vivo, and applying gene therapy. However, there are no reports at present that show FnCpf1-mediated multiplex genome editing via a single customized CRISPR RNA (crRNA) array. In the present study, we utilize a single customized crRNA array to simultaneously target multiple genes in human cells. In addition, we also demonstrate that a single customized crRNA array to target multiple sites in one gene could be achieved. Collectively, FnCpf1, a powerful genome-editing tool for multiple genomic targets, can be harnessed for effective manipulation of the human genome. Copyright © 2018 The American Society of Gene and Cell Therapy. Published by Elsevier Inc. All rights reserved.

  10. A periodic pattern of SNPs in the human genome

    PubMed Central

    Madsen, Bo Eskerod; Villesen, Palle; Wiuf, Carsten

    2007-01-01

    By surveying a filtered, high-quality set of SNPs in the human genome, we have found that SNPs positioned 1, 2, 4, 6, or 8 bp apart are more frequent than SNPs positioned 3, 5, 7, or 9 bp apart. The observed pattern is not restricted to genomic regions that are known to cause sequencing or alignment errors, for example, transposable elements (SINE, LINE, and LTR), tandem repeats, and large duplicated regions. However, we found that the pattern is almost entirely confined to what we define as “periodic DNA.” Periodic DNA is a genomic region with a high degree of periodicity in nucleotide usage. It turned out that periodic DNA is mainly small regions (average length 16.9 bp), widely distributed in the genome. Furthermore, periodic DNA has a 1.8 times higher SNP density than the rest of the genome and SNPs inside periodic DNA have a significantly higher genotyping error rate than SNPs outside periodic DNA. Our results suggest that not all SNPs in the human genome are created by independent single nucleotide mutations, and that care should be taken in analysis of SNPs from periodic DNA. The latter may have important consequences for SNP and association studies. PMID:17673700
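
    The kind of distance tally described in this abstract can be sketched as follows. This is an assumption-laden illustration, not the authors' method: it counts only gaps between neighbouring SNPs on one chromosome, and the function name and coordinates are hypothetical.

```python
from collections import Counter


def adjacent_snp_distances(positions, max_distance=9):
    """Count gaps (in bp) between neighbouring SNP positions, up to max_distance."""
    ordered = sorted(set(positions))
    gaps = (b - a for a, b in zip(ordered, ordered[1:]))
    return Counter(gap for gap in gaps if gap <= max_distance)


# Hypothetical SNP coordinates on a single chromosome.
snp_positions = [100, 102, 106, 107, 250, 258]
histogram = adjacent_snp_distances(snp_positions)
print(sorted(histogram.items()))  # [(1, 1), (2, 1), (4, 1), (8, 1)]
```

    Comparing the counts at even and odd distances across a whole genome would be the next step; the 143 bp gap in the toy data is dropped because it exceeds `max_distance`.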

  11. Exploring lateral genetic transfer among microbial genomes using TF-IDF.

    PubMed

    Cong, Yingnan; Chan, Yao-Ban; Ragan, Mark A

    2016-07-25

    Many microbes can acquire genetic material from their environment and incorporate it into their genome, a process known as lateral genetic transfer (LGT). Computational approaches have been developed to detect genomic regions of lateral origin, but typically lack sensitivity, ability to distinguish donor from recipient, and scalability to very large datasets. To address these issues we have introduced an alignment-free method based on ideas from document analysis, term frequency-inverse document frequency (TF-IDF). Here we examine the performance of TF-IDF on three empirical datasets: 27 genomes of Escherichia coli and Shigella, 110 genomes of enteric bacteria, and 143 genomes across 12 bacterial and three archaeal phyla. We investigate the effect of k-mer size, gap size and delineation of groups on the inference of genomic regions of lateral origin, finding an interplay among these parameters and sequence divergence. Because TF-IDF identifies donor groups and delineates regions of lateral origin within recipient genomes, aggregating these regions by gene enables us to explore, for the first time, the mosaic nature of lateral genes including the multiplicity of biological sources, ancestry of transfer and over-writing by subsequent transfers. We carry out Gene Ontology enrichment tests to investigate which biological processes are potentially affected by LGT.
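
    To make the TF-IDF idea concrete, here is a minimal sketch that scores k-mers by textbook TF-IDF, treating each group of genomes as a "document". It is not the authors' implementation: the function names, the toy sequences and the choice of k are all hypothetical.

```python
from collections import Counter
from math import log


def kmers(sequence, k):
    """All overlapping k-mers of a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_tfidf(groups, k=4):
    """Map each group name to {k-mer: TF-IDF score}.

    groups: dict of group name -> concatenated sequence (the "document").
    """
    counts = {name: Counter(kmers(seq, k)) for name, seq in groups.items()}
    n_docs = len(counts)

    # Document frequency: in how many groups does each k-mer occur?
    doc_freq = Counter()
    for group_counts in counts.values():
        doc_freq.update(group_counts.keys())

    scores = {}
    for name, group_counts in counts.items():
        total = sum(group_counts.values())
        scores[name] = {
            kmer: (count / total) * log(n_docs / doc_freq[kmer])
            for kmer, count in group_counts.items()
        }
    return scores


# Hypothetical toy groups; real input would be whole genomes.
toy_groups = {"groupA": "ACGTACGT", "groupB": "ACGTTTTT"}
toy_scores = kmer_tfidf(toy_groups, k=4)
```

    A k-mer shared by every group (here `ACGT`) scores zero, whereas a k-mer frequent in one group and absent elsewhere (here `TTTT` in `groupB`) scores high, which is the property that makes such k-mers informative about a region's likely source.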

  12. Discovering genetic variants in Crohn's disease by exploring genomic regions enriched of weak association signals.

    PubMed

    D'Addabbo, Annarita; Palmieri, Orazio; Maglietta, Rosalia; Latiano, Anna; Mukherjee, Sayan; Annese, Vito; Ancona, Nicola

    2011-08-01

    A meta-analysis has re-analysed previous genome-wide association scans, definitively confirming eleven genes and identifying a further 21 new loci. However, the identified genes/loci still explain only a minority of the genetic predisposition to Crohn's disease. We aimed to identify genes weakly involved in disease predisposition by analysing chromosomal regions enriched in single nucleotide polymorphisms with modest statistical association. We utilized the WTCCC data set, evaluating 1748 CD cases and 2938 controls. The identification of candidate genes/loci was performed by a two-step procedure: first, chromosomal regions enriched in weak association signals were localized; subsequently, weak signals clustered in gene regions were identified. The statistical significance was assessed by non-parametric permutation tests. The cytoband enrichment analysis highlighted 44 regions (P≤0.05) enriched with single nucleotide polymorphisms significantly associated with the trait, including 23 out of 31 previously confirmed and replicated genes. Importantly, we highlight a further 20 novel chromosomal regions carrying approximately one hundred genes/loci with modest association. Amongst these we find compelling functional candidate genes such as MAPT, GRB2 and CREM, LCT, and IL12RB2. Our study suggests a different statistical perspective to discover genes weakly associated with a given trait, although further confirmatory functional studies are needed. Copyright © 2011 Editrice Gastroenterologica Italiana S.r.l. All rights reserved.

  13. Seismic anisotropy beneath the southeastern margin of the Tibetan Plateau and adjacent regions revealed by shear-wave splitting analyses

    NASA Astrophysics Data System (ADS)

    Gao, S. S.; Kong, F.; Wu, J.; Liu, L.; Liu, K. H.

    2017-12-01

    Seismic azimuthal anisotropy is measured at 83 stations situated at the southeastern margin of the Tibetan Plateau and adjacent regions based on shear-wave splitting analyses. A total of 1701 individual pairs of splitting parameters (fast polarization orientations and splitting delay times) are obtained using the PKS, SKKS, and SKS phases. The splitting parameters from 21 stations exhibit systematic back-azimuthal variations with a 90° periodicity, which is consistent with a two-layer anisotropy model. The resulting upper-layer splitting parameters computed based on a grid-search algorithm are comparable with crustal anisotropy measurements obtained independently based on the sinusoidal moveout of P-to-S conversions from the Moho. The fast orientations of the upper layer anisotropy, which is mostly parallel with major shear zones, are associated with crustal fabrics with a vertical foliation plane. The lower layer anisotropy and the station averaged splitting parameters at stations with azimuthally invariant splitting parameters can be adequately explained by the differential movement between the lithosphere and asthenosphere. The NW-SE fast orientations obtained in the northern part of the study area probably reflect the southeastward extruded mantle flow from central Tibet. In contrast, the NE-SW to E-W fast orientations observed in the southern part of the study area are most likely related to the northeastward to eastward mantle flow induced by the subduction of the Burma microplate.

  14. Mitochondrial Genome Analyses Suggest Multiple Trichuris Species in Humans, Baboons, and Pigs from Different Geographical Regions.

    PubMed

    Hawash, Mohamed B F; Andersen, Lee O; Gasser, Robin B; Stensvold, Christen Rune; Nejsum, Peter

    2015-01-01

    The whipworms Trichuris trichiura and Trichuris suis are two parasitic nematodes of humans and pigs, respectively. Although whipworms in human and non-human primates historically have been referred to as T. trichiura, recent reports suggest that several Trichuris spp. are found in primates. We sequenced and annotated complete mitochondrial genomes of Trichuris recovered from a human in Uganda, an olive baboon in the US, a hamadryas baboon in Denmark, and two pigs from Denmark and Uganda. Comparative analyses using other published mitochondrial genomes of Trichuris recovered from a human and a porcine host in China and from a François' leaf monkey (China) were performed, including phylogenetic analyses and pairwise genetic and amino acid distances. Genetic and protein distances between human Trichuris in Uganda and China were high (~19% and 15%, respectively), suggesting that they represented different species. Trichuris from the olive baboon in the US was genetically related to human Trichuris in China, while the other from the hamadryas baboon in Denmark was nearly identical to human Trichuris from Uganda. Baboon-derived Trichuris was genetically distinct from Trichuris from the François' leaf monkey, suggesting multiple whipworm species circulating among non-human primates. The genetic and protein distances between pig Trichuris from Denmark and other regions were roughly 9% and 6%, respectively, while Chinese and Ugandan whipworms were more closely related. Our results indicate that Trichuris species infecting humans and pigs are phylogenetically distinct across geographical regions, which might have important implications for the implementation of suitable and effective control strategies in different regions. Moreover, we provide support for the hypothesis that Trichuris infecting primates represents a complex of cryptic species with some species being able to infect both humans and non-human primates.

  15. Mitochondrial Genome Analyses Suggest Multiple Trichuris Species in Humans, Baboons, and Pigs from Different Geographical Regions

    PubMed Central

    Hawash, Mohamed B. F.; Andersen, Lee O.; Gasser, Robin B.; Stensvold, Christen Rune; Nejsum, Peter

    2015-01-01

    Background: The whipworms Trichuris trichiura and Trichuris suis are two parasitic nematodes of humans and pigs, respectively. Although whipworms in human and non-human primates historically have been referred to as T. trichiura, recent reports suggest that several Trichuris spp. are found in primates. Methods and Findings: We sequenced and annotated complete mitochondrial genomes of Trichuris recovered from a human in Uganda, an olive baboon in the US, a hamadryas baboon in Denmark, and two pigs from Denmark and Uganda. Comparative analyses using other published mitochondrial genomes of Trichuris recovered from a human and a porcine host in China and from a François' leaf monkey (China) were performed, including phylogenetic analyses and pairwise genetic and amino acid distances. Genetic and protein distances between human Trichuris in Uganda and China were high (~19% and 15%, respectively), suggesting that they represented different species. Trichuris from the olive baboon in the US was genetically related to human Trichuris in China, while the other from the hamadryas baboon in Denmark was nearly identical to human Trichuris from Uganda. Baboon-derived Trichuris was genetically distinct from Trichuris from the François' leaf monkey, suggesting multiple whipworm species circulating among non-human primates. The genetic and protein distances between pig Trichuris from Denmark and other regions were roughly 9% and 6%, respectively, while Chinese and Ugandan whipworms were more closely related. Conclusion and Significance: Our results indicate that Trichuris species infecting humans and pigs are phylogenetically distinct across geographical regions, which might have important implications for the implementation of suitable and effective control strategies in different regions. Moreover, we provide support for the hypothesis that Trichuris infecting primates represents a complex of cryptic species with some species being able to infect both humans and non-human primates.

  16. Objectifying the adjacent and opposite angles: a cultural historical analysis

    NASA Astrophysics Data System (ADS)

    Daher, Wajeeh; Musallam, Nadera

    2018-02-01

    The angle topic is central to the development of geometric knowledge. Two of the basic concepts associated with this topic are the adjacent and opposite angles. The goal of the present study is to analyze, based on the cultural historical semiotics framework, how high-achieving seventh grade students objectify the concepts of adjacent and opposite angles. We videoed the learning of a group of three high-achieving students who used technology, specifically GeoGebra, to explore geometric relations related to these concepts. To analyze students' objectification of these concepts, we used the categories of objectification of knowledge (attention and awareness) and the categories of generalization (factual, contextual and symbolic), developed by Radford. The research results indicate that the teacher's and students' verbal and visual signs, together with the software's dynamic tools, mediated the students' objectification of the concepts of adjacent and opposite angles. Specifically, eye and gesture perception were part of the semiosis cycles in which the participating students were engaged and which related to the mathematical signs that signified the adjacent and the opposite angles. Moreover, the teacher's suggestions, requests and questions included or suggested semiotic signs and tools, including verbal signs that helped the students pay attention to, be aware of and objectify the concepts of adjacent and opposite angles.

  17. Genome sequences of wild and domestic bactrian camels

    PubMed Central

    Jirimutu; Wang, Zhen; Ding, Guohui; Chen, Gangliang; Sun, Yamin; Sun, Zhihong; Zhang, Heping; Wang, Lei; Hasi, Surong; Zhang, Yan; Li, Jianmei; Shi, Yixiang; Xu, Ze; He, Chuan; Yu, Siriguleng; Li, Shengdi; Zhang, Wenbin; Batmunkh, Mijiddorj; Ts, Batsukh; Narenbatu; Unierhu; Bat-Ireedui, Shirzana; Gao, Hongwei; Baysgalan, Banzragch; Li, Qing; Jia, Zhiling; Turigenbayila; Subudenggerile; Narenmanduhu; Wang, Zhaoxia; Wang, Juan; Pan, Lei; Chen, Yongcan; Ganerdene, Yaichil; Dabxilt; Erdemt; Altansha; Altansukh; Liu, Tuya; Cao, Minhui; Aruuntsever; Bayart; Hosblig; He, Fei; Zha-ti, A; Zheng, Guangyong; Qiu, Feng; Sun, Zikui; Zhao, Lele; Zhao, Wenjing; Liu, Baohong; Li, Chao; Chen, Yunqin; Tang, Xiaoyan; Guo, Chunyan; Liu, Wei; Ming, Liang; Temuulen; Cui, Aiying; Li, Yi; Gao, Junhui; Li, Jing; Wurentaodi; Niu, Shen; Sun, Tao; Zhai, Zhengxiao; Zhang, Min; Chen, Chen; Baldan, Tunteg; Bayaer, Tuman; Li, Yixue; Meng, He

    2012-01-01

    Bactrian camels serve as an important means of transportation in the cold desert regions of China and Mongolia. Here we present a 2.01 Gb draft genome sequence from both a wild and a domestic bactrian camel. We estimate the camel genome to be 2.38 Gb, containing 20,821 protein-coding genes. Our phylogenomics analysis reveals that camels shared common ancestors with other even-toed ungulates about 55–60 million years ago. Rapidly evolving genes in the camel lineage are significantly enriched in metabolic pathways, and these changes may underlie the insulin resistance typically observed in these animals. We estimate the genome-wide heterozygosity rates in both wild and domestic camels to be 1.0 × 10⁻³. However, genomic regions with significantly lower heterozygosity are found in the domestic camel, and olfactory receptors are enriched in these regions. Our comparative genomics analyses may also shed light on the genetic basis of the camel's remarkable salt tolerance and unusual immune system. PMID:23149746

  18. Recurrent DNA inversion rearrangements in the human genome

    PubMed Central

    Flores, Margarita; Morales, Lucía; Gonzaga-Jauregui, Claudia; Domínguez-Vidaña, Rocío; Zepeda, Cinthya; Yañez, Omar; Gutiérrez, María; Lemus, Tzitziki; Valle, David; Avila, Ma. Carmen; Blanco, Daniel; Medina-Ruiz, Sofía; Meza, Karla; Ayala, Erandi; García, Delfino; Bustos, Patricia; González, Víctor; Girard, Lourdes; Tusie-Luna, Teresa; Dávila, Guillermo; Palacios, Rafael

    2007-01-01

    Several lines of evidence suggest that reiterated sequences in the human genome are targets for nonallelic homologous recombination (NAHR), which facilitates genomic rearrangements. We have used a PCR-based approach to identify breakpoint regions of rearranged structures in the human genome. In particular, we have identified intrachromosomal identical repeats that are located in reverse orientation, which may lead to chromosomal inversions. A bioinformatic workflow pathway to select appropriate regions for analysis was developed. Three such regions overlapping with known human genes, located on chromosomes 3, 15, and 19, were analyzed. The relative proportion of wild-type to rearranged structures was determined in DNA samples from blood obtained from different, unrelated individuals. The results obtained indicate that recurrent genomic rearrangements occur at relatively high frequency in somatic cells. Interestingly, the rearrangements studied were significantly more abundant in adults than in newborn individuals, suggesting that such DNA rearrangements might start to appear during embryogenesis or fetal life and continue to accumulate after birth. The relevance of our results in regard to human genomic variation is discussed. PMID:17389356

  19. MinGenome: An In Silico Top-Down Approach for the Synthesis of Minimized Genomes.

    PubMed

    Wang, Lin; Maranas, Costas D

    2018-02-16

    Genome minimized strains offer advantages as production chassis by reducing transcriptional cost, eliminating competing functions and limiting unwanted regulatory interactions. Existing approaches for identifying stretches of DNA to remove are largely ad hoc based on information on presumably dispensable regions through experimentally determined nonessential genes and comparative genomics. Here we introduce a versatile genome reduction algorithm MinGenome that implements a mixed-integer linear programming (MILP) algorithm to identify in size descending order all dispensable contiguous sequences without affecting the organism's growth or other desirable traits. Known essential genes or genes that cause significant fitness or performance loss can be flagged and their deletion can be prohibited. MinGenome also preserves needed transcription factors and promoter regions ensuring that retained genes will be properly transcribed while also avoiding the simultaneous deletion of synthetic lethal pairs. The potential benefit of removing even larger contiguous stretches of DNA if only one or two essential genes (to be reinserted elsewhere) are within the deleted sequence is explored. We applied the algorithm to design a minimized E. coli strain and found that we were able to recapitulate the long deletions identified in previous experimental studies and discover alternative combinations of deletions that have not yet been explored in vivo.
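
    The notion of listing dispensable contiguous stretches in descending size order can be illustrated with a much simpler stand-in than the MILP formulation described in this abstract. The sketch below is a plain linear scan, not MinGenome: it only treats flagged essential genes as barriers and ignores growth constraints, promoters, transcription factors and synthetic lethal pairs. All names and coordinates are hypothetical.

```python
def dispensable_stretches(genes, essential):
    """Return maximal runs of consecutive non-essential genes, longest first.

    genes: list of (name, start, end) tuples ordered along the chromosome.
    essential: set of gene names that must not be deleted.
    Each result is (span_in_bp, [gene names]).
    """
    runs, current = [], []
    for gene in genes:
        if gene[0] in essential:
            # An essential gene closes the current deletable run.
            if current:
                runs.append(current)
            current = []
        else:
            current.append(gene)
    if current:
        runs.append(current)

    # Span runs from the start of the first gene to the end of the last.
    spans = [(run[-1][2] - run[0][1], [g[0] for g in run]) for run in runs]
    return sorted(spans, key=lambda item: item[0], reverse=True)


# Hypothetical gene layout with one essential gene.
toy_genes = [
    ("g1", 0, 1000), ("g2", 1200, 2000), ("g3", 2100, 3000),
    ("g4", 3100, 5000), ("g5", 5200, 6000),
]
candidates = dispensable_stretches(toy_genes, essential={"g3"})
print(candidates)  # [(2900, ['g4', 'g5']), (2000, ['g1', 'g2'])]
```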

  20. Lessons learned from the initial sequencing of the pig genome: comparative analysis of an 8 Mb region of pig chromosome 17

    PubMed Central

    Hart, Elizabeth A; Caccamo, Mario; Harrow, Jennifer L; Humphray, Sean J; Gilbert, James GR; Trevanion, Steve; Hubbard, Tim; Rogers, Jane; Rothschild, Max F

    2007-01-01

    Background: We describe here the sequencing, annotation and comparative analysis of an 8 Mb region of pig chromosome 17, which provides a useful test region to assess coverage and quality for the pig genome sequencing project. We report our findings comparing the annotation of draft sequence assembled at different depths of coverage. Results: Within this region we annotated 71 loci, of which 53 are orthologous to human known coding genes. When compared to the syntenic regions in human (20q13.13-q13.33) and mouse (chromosome 2, 167.5 Mb-178.3 Mb), this region was found to be highly conserved with respect to gene order. The most notable difference between the three species is the presence of a large expansion of zinc finger coding genes and pseudogenes on mouse chromosome 2 between Edn3 and Phactr3 that is absent from pig and human. All of our annotation has been made publicly available in the Vertebrate Genome Annotation browser, VEGA. We assessed the impact of coverage on sequence assembly across this region and found, as expected, that increased sequence depth resulted in fewer, longer contigs. One-third of our annotated loci could not be fully re-aligned back to the low coverage version of the sequence, principally because the transcripts are fragmented over several contigs. Conclusion: We have demonstrated the considerable advantages of sequencing at increased read depths and discuss the implications that lower coverage sequence may have on subsequent comparative and functional studies, particularly those involving complex loci such as GNAS. PMID:17705864

  1. Reference-Free Comparative Genomics of 174 Chloroplasts

    PubMed Central

    Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R.; Yu, Jun; Cannon, Charles H.

    2012-01-01

    Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ∼18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication. Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied
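    The distribution of kmers across genomes can be sketched as follows. The abstract applies the labels "tip" and "group" to assembled contigs; here, purely for illustration, the same labels are applied to individual kmers, with "group" taken to mean present in two or more genomes. Function names and the toy sequences are assumptions.

```python
from collections import defaultdict
from typing import Dict, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_kmers(genomes: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Split kmers into 'tip' (one genome) and 'group' (two or more genomes)."""
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    classes: Dict[str, Set[str]] = {"tip": set(), "group": set()}
    for kmer, names in owners.items():
        classes["tip" if len(names) == 1 else "group"].add(kmer)
    return classes


if __name__ == "__main__":
    toy = {"g1": "ACGT", "g2": "CGTA"}  # toy sequences, not real chloroplasts
    print(classify_kmers(toy, 3))
```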

  2. Copy Number Variations in Tilapia Genomes.

    PubMed

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated with population types (R² > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.

  3. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non

  4. Image-guided genomic analysis of tissue response to laser-induced thermal stress

    NASA Astrophysics Data System (ADS)

    Mackanos, Mark A.; Helms, Mike; Kalish, Flora; Contag, Christopher H.

    2011-05-01

    The cytoprotective response to thermal injury is characterized by transcriptional activation of "heat shock proteins" (hsp) and proinflammatory proteins. Expression of these proteins may predict cellular survival. Microarray analyses were performed to identify spatially distinct gene expression patterns responding to thermal injury. Laser injury zones were identified by expression of a transgene reporter comprised of the 70 kD hsp gene and the firefly luciferase coding sequence. Zones included the laser spot, the surrounding region where hsp70-luc expression was increased, and a region adjacent to the surrounding region. A total of 145 genes were up-regulated in the laser irradiated region, while 69 were up-regulated in the adjacent region. At 7 hours the chemokine Cxcl3 was the highest expressed gene in the laser spot (24 fold) and adjacent region (32 fold). Chemokines were the most common up-regulated genes identified. Microarray gene expression was successfully validated using qRT-polymerase chain reaction for selected genes of interest. The early response genes are likely involved in cytoprotection and initiation of the healing response. Their regulatory elements will benefit creating the next generation reporter mice and controlling expression of therapeutic proteins. The identified genes serve as drug development targets that may prevent acute tissue damage and accelerate healing.

  5. Phytoparasitic Nematodes Adjacent to Established Strawberry Plantations

    PubMed Central

    Crow, R. V.; MacDonald, D. H.

    1978-01-01

    Plant-nematode populations associated with uncultivated vegetation, adjacent strawberry plants, and alternate crop sites were studied at three locations in Minnesota. At one site (Forest Lake), Paratylenchus projectus, Meloidogyne hapla, and Pratylenchus tenuis were frequently associated with the roots of native vegetation. These nematode species were also present in adjacent strawberry beds. Among alternate crops observed, oats and muskmelon usually supported the fewest nematodes although moderate densities of Xiphinema americanum and P. tenuis were found at one location in plots planted to oats. Pratylenchus tenuis was also found on rye at one location. PMID:19305841

  6. Optimization of genome editing through CRISPR-Cas9 engineering.

    PubMed

    Zhang, Jian-Hua; Adikaram, Poorni; Pandey, Mritunjay; Genis, Allison; Simonds, William F

    2016-04-01

    CRISPR (Clustered Regularly-Interspaced Short Palindromic Repeats)-Cas9 (CRISPR associated protein 9) has rapidly become the most promising genome editing tool with great potential to revolutionize medicine. Through guidance of a 20 nucleotide RNA (gRNA), CRISPR-Cas9 finds and cuts target protospacer DNA precisely 3 base pairs upstream of a PAM (Protospacer Adjacent Motif). The broken DNA ends are repaired by either NHEJ (Non-Homologous End Joining) resulting in small indels, or by HDR (Homology Directed Repair) for precise gene or nucleotide replacement. Theoretically, CRISPR-Cas9 could be used to modify any genomic sequences, thereby providing a simple, easy, and cost effective means of genome wide gene editing. However, the off-target activity of CRISPR-Cas9 that cuts DNA sites with imperfect matches with gRNA has been of significant concern because clinical applications require 100% accuracy. Additionally, CRISPR-Cas9 has unpredictable efficiency among different DNA target sites and the PAM requirements greatly restrict its genome editing frequency. A large number of efforts have been made to address these impeding issues, but much more is needed to fully realize the medical potential of CRISPR-Cas9. In this article, we summarize the existing problems and current advances of the CRISPR-Cas9 technology and provide perspectives for the ultimate perfection of Cas9-mediated genome editing.
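    The geometry described in this abstract (a 20-nucleotide protospacer, a PAM immediately downstream, and a cut 3 base pairs upstream of the PAM) can be sketched as below. The abstract does not name the PAM sequence; the sketch assumes the NGG motif commonly associated with Streptococcus pyogenes Cas9 and scans the forward strand only. The function name and return layout are hypothetical.

```python
import re
from typing import List, Tuple


def find_cut_sites(seq: str, guide_len: int = 20) -> List[Tuple[str, str, int]]:
    """Return (protospacer, pam, cut_index) for each forward-strand NGG PAM.

    cut_index is 0-based: the cut falls between seq[cut_index - 1] and
    seq[cut_index], i.e. 3 bp upstream of the PAM.
    Assumption: PAM is NGG; reverse-strand sites are not reported.
    """
    s = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", s):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough sequence upstream for a full protospacer
        protospacer = s[pam_start - guide_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


if __name__ == "__main__":
    toy = "A" * 20 + "TGG" + "CC"  # toy sequence, not a real target
    print(find_cut_sites(toy))
```

    Scoring off-target sites with imperfect guide matches, which the abstract highlights as a central concern, is deliberately left out of this sketch.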

  7. Optimization of genome editing through CRISPR-Cas9 engineering

    PubMed Central

    Zhang, Jian-Hua; Adikaram, Poorni; Pandey, Mritunjay; Genis, Allison; Simonds, William F.

    2016-01-01

    CRISPR (Clustered Regularly-Interspaced Short Palindromic Repeats)-Cas9 (CRISPR associated protein 9) has rapidly become the most promising genome editing tool with great potential to revolutionize medicine. Through guidance of a 20 nucleotide RNA (gRNA), CRISPR-Cas9 finds and cuts target protospacer DNA precisely 3 base pairs upstream of a PAM (Protospacer Adjacent Motif). The broken DNA ends are repaired by either NHEJ (Non-Homologous End Joining) resulting in small indels, or by HDR (Homology Directed Repair) for precise gene or nucleotide replacement. Theoretically, CRISPR-Cas9 could be used to modify any genomic sequences, thereby providing a simple, easy, and cost effective means of genome wide gene editing. However, the off-target activity of CRISPR-Cas9 that cuts DNA sites with imperfect matches with gRNA has been of significant concern because clinical applications require 100% accuracy. Additionally, CRISPR-Cas9 has unpredictable efficiency among different DNA target sites and the PAM requirements greatly restrict its genome editing frequency. A large number of efforts have been made to address these impeding issues, but much more is needed to fully realize the medical potential of CRISPR-Cas9. In this article, we summarize the existing problems and current advances of the CRISPR-Cas9 technology and provide perspectives for the ultimate perfection of Cas9-mediated genome editing. PMID:27340770

  8. Self-organizing approach for meta-genomes.

    PubMed

    Zhu, Jianfeng; Zheng, Wei-Mou

    2014-12-01

    We extend the self-organizing approach for annotation of a bacterial genome to analyze the raw sequencing data of the human gut metagenome without sequence assembling. The original approach divides the genomic sequence of a bacterium into non-overlapping segments of equal length and assigns to each segment one of seven 'phases', among which one is for the noncoding regions, three for the direct coding regions to indicate the three possible codon positions of the segment starting site, and three for the reverse coding regions. The noncoding phase and the six coding phases are described by two frequency tables of the 64 triplet types or 'codon usages'. A set of codon usages can be used to update the phase assignment and vice versa. An iteration after an initialization leads to a convergent phase assignment to give an annotation of the genome. In the extension of the approach to a metagenome, we consider a mixture model of a number of categories described by different codon usages. The Illumina Genome Analyzer sequencing data of the total DNA from faecal samples are then examined to understand the diversity of the human gut microbiome. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Whole genome sequencing and comparative genomics of closely related Fusarium Head Blight fungi: Fusarium graminearum, F. meridionale and F. asiaticum.

    PubMed

    Walkowiak, Sean; Rowland, Owen; Rodrigue, Nicolas; Subramaniam, Rajagopal

    2016-12-09

    The Fusarium graminearum species complex is composed of many distinct fungal species that cause several diseases in economically important crops, including Fusarium Head Blight of wheat. Despite being closely related, these species and individuals within species have distinct phenotypic differences in toxin production and pathogenicity, with some isolates reported as non-pathogenic on certain hosts. In this report, we compare genomes and gene content of six new isolates from the species complex, including the first available genomes of F. asiaticum and F. meridionale, with four other genomes reported in previous studies. A comparison of genome structure and gene content revealed a 93-99% overlap across all ten genomes. We identified more than 700 k base pairs (kb) of single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) within common regions of the genome, which validated the species and genetic populations reported within species. We constructed a non-redundant pan gene list containing 15,297 genes from the ten genomes and among them 1827 genes or 12% were absent in at least one genome. These genes were co-localized in telomeric regions and select regions within chromosomes with a corresponding increase in SNPs and indels. Many are also predicted to encode for proteins involved in secondary metabolism and other functions associated with disease. Genes that were common between isolates contained high levels of nucleotide variation and may be pseudogenes, allelic, or under diversifying selection. The genomic resources we have contributed will be useful for the identification of genes that contribute to the phenotypic variation and niche specialization that have been reported among members of the F. graminearum species complex.

  10. Integrating the genomic architecture of human nucleolar organizer regions with the biophysical properties of nucleoli.

    PubMed

    Mangan, Hazel; Gailín, Michael Ó; McStay, Brian

    2017-12-01

    Nucleoli are the sites of ribosome biogenesis and the largest membraneless subnuclear structures. They are intimately linked with growth and proliferation control and function as sensors of cellular stress. Nucleoli form around arrays of ribosomal gene (rDNA) repeats also called nucleolar organizer regions (NORs). In humans, NORs are located on the short arms of all five human acrocentric chromosomes. Multiple NORs contribute to the formation of large heterochromatin-surrounded nucleoli observed in most human cells. Here we will review recent findings about their genomic architecture. The dynamic nature of nucleoli began to be appreciated with the advent of photodynamic experiments using fluorescent protein fusions. We review more recent data on nucleoli in Xenopus germinal vesicles (GVs) which has revealed a liquid droplet-like behavior that facilitates nucleolar fusion. Further analysis in both Xenopus GVs and Drosophila embryos indicates that the internal organization of nucleoli is generated by a combination of liquid-liquid phase separation and active processes involving rDNA. We will attempt to integrate these recent findings with the genomic architecture of human NORs to advance our understanding of how nucleoli form and respond to stress in human cells. © 2017 Federation of European Biochemical Societies.

  11. Genomic Diversity of Lactobacillus salivarius

    PubMed Central

    Raftis, Emma J.; Salvetti, Elisa; Torriani, Sandra; Felis, Giovanna E.; O'Toole, Paul W.

    2011-01-01

    Strains of Lactobacillus salivarius are increasingly employed as probiotic agents for humans or animals. Despite the diversity of environmental sources from which they have been isolated, the genomic diversity of L. salivarius has been poorly characterized, and the implications of this diversity for strain selection have not been examined. To tackle this, we applied comparative genomic hybridization (CGH) and multilocus sequence typing (MLST) to 33 strains derived from humans, animals, or food. The CGH, based on total genome content, including small plasmids, identified 18 major regions of genomic variation, or hot spots for variation. Three major divisions were thus identified, with only a subset of the human isolates constituting an ecologically discernible group. Omission of the small plasmids from the CGH or analysis by MLST provided broadly concordant fine divisions and separated human-derived and animal-derived strains more clearly. The two gene clusters for exopolysaccharide (EPS) biosynthesis corresponded to regions of significant genomic diversity. The CGH-based groupings of these regions did not correlate with levels of production of bound or released EPS. Furthermore, EPS production was significantly modulated by available carbohydrate. In addition to proving difficult to predict from the gene content, EPS production levels correlated inversely with production of biofilms, a trait considered desirable in probiotic commensals. L. salivarius displays a high level of genomic diversity, and while selection of L. salivarius strains for probiotic use can be informed by CGH or MLST, it also requires pragmatic experimental validation of desired phenotypic traits. PMID:21131523

  12. Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform.

    PubMed

    Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Wang, Zhenyi; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Cheng, Rui; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-Chen; Paterson, Andrew H; Wang, Xiyin

    2017-05-01

    Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). © 2017 American Society of Plant Biologists. All Rights Reserved.

  13. GenomicTools: a computational platform for developing high-throughput analytics in genomics.

    PubMed

    Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

    2012-01-15

    Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
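    To make "operations between sets of genomic regions" concrete, the sketch below intersects two region sets with a two-pointer sweep. It is a generic illustration in Python and does not reproduce the GenomicTools command-line tools or C++ API. It assumes half-open coordinates and that regions within each input set do not overlap one another; the type alias and function name are hypothetical.

```python
from typing import List, Tuple

# Hypothetical region record: (chromosome, start, end), half-open [start, end)
Region = Tuple[str, int, int]


def intersect(a: List[Region], b: List[Region]) -> List[Region]:
    """Pairwise overlaps between two region sets.

    Assumes regions inside each set are mutually non-overlapping.
    """
    a, b = sorted(a), sorted(b)
    out: List[Region] = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever set is behind in chromosome order.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            out.append((chrom_a, start, end))
        # Drop the region that ends first; it cannot overlap anything later.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    peaks = [("chr1", 0, 10), ("chr1", 20, 30)]
    genes = [("chr1", 5, 25), ("chr2", 0, 5)]
    print(intersect(peaks, genes))
```

    Because both inputs are consumed in sorted order, only the current region from each set needs to be held at any time, which is one common way to keep memory use low on large datasets.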

  14. Correlative anatomy for the electrophysiologist: ablation for atrial fibrillation. Part II: regional anatomy of the atria and relevance to damage of adjacent structures during AF ablation.

    PubMed

    Macedo, Paula G; Kapa, Suraj; Mears, Jennifer A; Fratianni, Amy; Asirvatham, Samuel J

    2010-07-01

    Ablation procedures for atrial fibrillation have become an established and increasingly used option for managing patients with symptomatic arrhythmia. The anatomic structures relevant to the pathogenesis of atrial fibrillation and ablation procedures are varied and include the pulmonary veins, other thoracic veins, the left atrial myocardium, and autonomic ganglia. Exact regional anatomic knowledge of these structures is essential to allow correlation with fluoroscopy and electrograms and, importantly, to avoid complications from damage of adjacent structures within the chest. We present this information as a series of 2 articles. In a prior issue, we have discussed the thoracic vein anatomy relevant to paroxysmal atrial fibrillation. In the present article, we focus on the atria themselves, the autonomic ganglia, and anatomic issues relevant for minimizing complications during atrial fibrillation ablation.

  15. Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.

    PubMed

    Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A

    2011-06-04

    Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.

  16. Complete mitochondrial genome of Skylark, Alauda arvensis (Aves: Passeriformes): the first representative of the family Alaudidae with two extensive heteroplasmic control regions.

    PubMed

    Qian, Chaoju; Wang, Yuanxiu; Guo, Zhichun; Yang, Jianke; Kan, Xianzhao

    2013-06-01

    The circular mitochondrial genome of Alauda arvensis is 17,018 bp in length, containing 13 protein-coding genes (PCGs), 2 ribosomal RNA genes, 22 transfer RNA (tRNA) genes, and 2 extensive heteroplasmic control regions. All of the genes are encoded on the H-strand, with the exceptions of one PCG (nad6) and eight tRNA genes (tRNA(Gln), tRNA(Ala), tRNA(Asn), tRNA(Cys), tRNA(Tyr), tRNA(Ser(UCN)), tRNA(Pro), and tRNA(Glu)), as found in other birds' mitochondrial genomes. All of these PCGs initiate with ATG but terminate with six types of stop codons. All tRNA genes have the potential to fold into a typical clover-leaf structure. Two extensive heteroplasmic control regions were found, and more interestingly, a minisatellite of 37 nucleotides (5'-TCAATCCCATTGATTTCATTATATTAGTATAAAGAAA-3') with 6 tandem repeats was detected at the end of CR2.
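    A minisatellite such as the one reported here can be checked with a short tandem-repeat counter. The 37-nucleotide unit below is copied from the abstract; the flanked test sequence is a synthetic toy construct, not the actual CR2 sequence, and the function name is hypothetical.

```python
def max_tandem_copies(seq: str, unit: str) -> int:
    """Largest number of back-to-back exact copies of unit found in seq."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    n = len(unit)
    best = 0
    for i in range(len(seq) - n + 1):
        copies = 0
        # Count consecutive exact copies starting at position i.
        while seq.startswith(unit, i + copies * n):
            copies += 1
        best = max(best, copies)
    return best


# Repeat unit as given in the abstract (37 nt).
UNIT = "TCAATCCCATTGATTTCATTATATTAGTATAAAGAAA"

if __name__ == "__main__":
    toy = "GG" + UNIT * 6 + "TT"  # synthetic flanks, for illustration only
    print(len(UNIT), max_tandem_copies(toy, UNIT))
```

    Only exact copies are counted; real minisatellites often carry point differences between repeats, which would need an approximate-matching approach.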

  17. "Harnessing genomics to improve health in Africa" - an executive course to support genomics policy.

    PubMed

    Smith, Alyna C; Mugabe, John; Singer, Peter A; Daar, Abdallah S

    2005-01-24

    BACKGROUND: Africa in the twenty-first century is faced with a heavy burden of disease, combined with ill-equipped medical systems and underdeveloped technological capacity. A major challenge for the international community is to bring scientific and technological advances like genomics to bear on the health priorities of poorer countries. The New Partnership for Africa's Development has identified science and technology as a key platform for Africa's renewal. Recognizing the timeliness of this issue, the African Centre for Technology Studies and the University of Toronto Joint Centre for Bioethics co-organized a course on Genomics and Public Health Policy in Nairobi, Kenya, the first of a series of similar courses to take place in the developing world. This article presents the findings and recommendations that emerged from this process, recommendations which suggest that a regional approach to developing sound science and technology policies is the key to harnessing genome-related biotechnology to improve health and contribute to human development in Africa. METHODS: The objectives of the course were to familiarize participants with the current status and implications of genomics for health in Africa; to provide frameworks for analyzing and debating the policy and ethical questions; and to begin developing a network across different sectors by sharing perspectives and building relationships. To achieve these goals the course brought together a diverse group of stakeholders from academic research centres, the media, non-governmental, voluntary and legal organizations to stimulate multi-sectoral debate around issues of policy. Topics included scientific advances in genomics innovation systems and business models, international regulatory frameworks, as well as ethical and legal issues. RESULTS: Seven main recommendations emerged: establish a network for sustained dialogue among participants; identify champions among politicians; use the New Plan for African

  18. Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed

    USDA-ARS's Scientific Manuscript database

    Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination (CGC) c...

  19. Markov models of genome segmentation

    NASA Astrophysics Data System (ADS)

    Thakur, Vivek; Azad, Rajeev K.; Ramaswamy, Ram

    2007-01-01

    We introduce Markov models for segmentation of symbolic sequences, extending a segmentation procedure based on the Jensen-Shannon divergence that has been introduced earlier. Higher-order Markov models are more sensitive to the details of local patterns and in application to genome analysis, this makes it possible to segment a sequence at positions that are biologically meaningful. We show the advantage of higher-order Markov-model-based segmentation procedures in detecting compositional inhomogeneity in chimeric DNA sequences constructed from genomes of diverse species, and in application to the E. coli K12 genome, boundaries of genomic islands, cryptic prophages, and horizontally acquired regions are accurately identified.
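    The Jensen-Shannon segmentation step that this abstract builds on can be sketched for the simplest case, where each side of a candidate cut is described by its symbol composition alone (a zeroth-order model). The higher-order Markov models introduced in the paper are not implemented here, and no significance test or recursive splitting is included. Function names are hypothetical.

```python
import math
from collections import Counter
from typing import Tuple


def entropy(counts: Counter) -> float:
    """Shannon entropy (bits) of the symbol distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def best_split(seq: str) -> Tuple[int, float]:
    """Cut position maximizing the length-weighted Jensen-Shannon divergence.

    Returns (position, divergence); the cut separates seq[:position] from
    seq[position:]. Zeroth-order composition only.
    """
    n = len(seq)
    left: Counter = Counter()
    right = Counter(seq)
    h_all = entropy(right)
    best_pos, best_js = 0, 0.0
    for i in range(1, n):
        sym = seq[i - 1]
        left[sym] += 1   # move one symbol from the right part to the left
        right[sym] -= 1
        js = h_all - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        if js > best_js:
            best_pos, best_js = i, js
    return best_pos, best_js


if __name__ == "__main__":
    print(best_split("AAAAAAAATTTTTTTT"))
```

    A full segmentation would apply this step recursively to each resulting segment and stop when the divergence is no longer statistically significant.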

  20. Genomic Anatomy of a Premier Major Histocompatibility Complex Paralogous Region on Chromosome 1q21–q22

    PubMed Central

    Shiina, Takashi; Ando, Asako; Suto, Yumiko; Kasai, Fumio; Shigenari, Atsuko; Takishima, Nobusada; Kikkawa, Eri; Iwata, Kyoko; Kuwano, Yuko; Kitamura, Yuka; Matsuzawa, Yumiko; Sano, Kazumi; Nogami, Masahiro; Kawata, Hisako; Li, Suyun; Fukuzumi, Yasuhito; Yamazaki, Masaaki; Tashiro, Hiroyuki; Tamiya, Gen; Kohda, Atsushi; Okumura, Katsuzumi; Ikemura, Toshimichi; Soeda, Eiichi; Mizuki, Nobuhisa; Kimura, Minoru; Bahram, Seiamak; Inoko, Hidetoshi

    2001-01-01

    Human chromosomes 1q21–q25, 6p21.3–22.2, 9q33–q34, and 19p13.1–p13.4 carry clusters of paralogous loci, to date best defined by the flagship 6p MHC region. They have presumably been created by two rounds of large-scale genomic duplications around the time of vertebrate emergence. Phylogenetically, the 1q21–25 region seems most closely related to the 6p21.3 MHC region, as it is only the MHC paralogous region that includes bona fide MHC class I genes, the CD1 and MR1 loci. Here, to clarify the genomic structure of this model MHC paralogous region as well as to gain insight into the evolutionary dynamics of the entire quadriplication process, a detailed analysis of a critical 1.7 megabase (Mb) region was performed. To this end, a composite, deep, YAC, BAC, and PAC contig encompassing all five CD1 genes and linking the centromeric +P5 locus to the telomeric KRTC7 locus was constructed. Within this contig a 1.1-Mb BAC and PAC core segment joining CD1D to FCER1A was fully sequenced and thoroughly analyzed. This led to the mapping of a total of 41 genes (12 expressed genes, 12 possibly expressed genes, and 17 pseudogenes), among which 31 were novel. The latter include 20 olfactory receptor (OR) genes, 9 of which are potentially expressed. Importantly, CD1, SPTA1, OR, and FCER1A belong to multigene families, which have paralogues in the other three regions. Furthermore, it is noteworthy that 12 of the 13 expressed genes in the 1q21–q22 region around the CD1 loci are immunologically relevant. In addition to CD1A-E, these include SPTA1, MNDA, IFI-16, AIM2, BL1A, FY and FCER1A. This functional convergence of structurally unrelated genes is reminiscent of the 6p MHC region, and perhaps represents the emergence of yet another antigen presentation gene cluster, in this case dedicated to lipid/glycolipid antigens rather than antigen-derived peptides. [The nucleotide sequence data reported in this paper have been submitted to the DDBJ, EMBL, and GenBank databases under