De Novo Origin of Human Protein-Coding Genes
Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping
2011-01-01
The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831
Basu, Swaraj; Larsson, Erik
2018-05-31
Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
The contribution of de novo coding mutations to autism spectrum disorder
Iossifov, Ivan; O’Roak, Brian J.; Sanders, Stephan J.; Ronemus, Michael; Krumm, Niklas; Levy, Dan; Stessman, Holly A.; Witherspoon, Kali; Vives, Laura; Patterson, Karynne E.; Smith, Joshua D.; Paeper, Bryan; Nickerson, Deborah A.; Dea, Jeanselle; Dong, Shan; Gonzalez, Luis E.; Mandell, Jefferey D.; Mane, Shrikant M.; Murtha, Michael T.; Sullivan, Catherine A.; Walker, Michael F.; Waqar, Zainulabedin; Wei, Liping; Willsey, A. Jeremy; Yamrom, Boris; Lee, Yoon-ha; Grabowska, Ewa; Dalkic, Ertugrul; Wang, Zihua; Marks, Steven; Andrews, Peter; Leotta, Anthony; Kendall, Jude; Hakker, Inessa; Rosenbaum, Julie; Ma, Beicong; Rodgers, Linda; Troge, Jennifer; Narzisi, Giuseppe; Yoon, Seungtai; Schatz, Michael C.; Ye, Kenny; McCombie, W. Richard; Shendure, Jay; Eichler, Evan E.; State, Matthew W.; Wigler, Michael
2015-01-01
We sequenced exomes from more than 2,500 simplex families each having a child with an autistic spectrum disorder (ASD). By comparing affected to unaffected siblings, we estimate that 13% of de novo (DN) missense mutations and 42% of DN likely gene-disrupting (LGD) mutations contribute to 12% and 9% of diagnoses, respectively. Including copy number variants, coding DN mutations contribute to about 30% of all simplex and 45% of female diagnoses. Virtually all LGD mutations occur opposite wild-type alleles. LGD targets in affected females significantly overlap the targets in males of lower IQ, but neither overlaps significantly with targets in males of higher IQ. We estimate that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to causative missense mutation. LGD targets in the joint class overlap with published targets for intellectual disability and schizophrenia, and are enriched for chromatin modifiers, FMRP-associated genes and embryonically expressed genes. Virtually all significance for the latter comes from affected females. PMID:25363768
Tsai, Yi-Ming; Chang, An; Kuo, Chih-Horng
2018-06-01
Genome reduction is a recurring theme of symbiont evolution. The genus Spiroplasma contains species that are mostly facultative insect symbionts. The typical genome sizes of those species within the Apis clade were estimated to be ∼1.0-1.4 Mb. Intriguingly, Spiroplasma clarkii was found to have a genome size that is >30% larger than the median of other species within the same clade. To investigate the molecular evolution events that led to the genome expansion of this bacterium, we determined its complete genome sequence and inferred the evolutionary origin of each protein-coding gene based on the phylogenetic distribution of homologs. Among the 1,346 annotated protein-coding genes, 641 originated from within the Apis clade while 233 were putatively acquired from outside of the clade (including 91 high-confidence candidates). Additionally, 472 were specific to S. clarkii without homologs in the current database (i.e., the origins remained unknown). The acquisition of protein-coding genes, rather than mobile genetic elements, appeared to be a major contributing factor to genome expansion. Notably, >50% of the high-confidence acquired genes are related to carbohydrate transport and metabolism, suggesting that these acquired genes contributed to the expansion of both genome size and metabolic capability. The findings of this work provide an interesting case against the general evolutionary trend observed among symbiotic bacteria and further demonstrate the flexibility of Spiroplasma genomes. For future studies, investigation of the functional integration of these acquired genes, as well as inference of their contribution to fitness, could improve our knowledge of symbiont evolution.
The impact of rare variation on gene expression across tissues.
Li, Xin; Kim, Yungil; Tsang, Emily K; Davis, Joe R; Damani, Farhan N; Chiang, Colby; Hess, Gaelen T; Zappala, Zachary; Strober, Benjamin J; Scott, Alexandra J; Li, Amy; Ganna, Andrea; Bassik, Michael C; Merker, Jason D; Hall, Ira M; Battle, Alexis; Montgomery, Stephen B
2017-10-11
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
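The idea of a multi-tissue expression outlier described above can be sketched in a few lines. This is only an illustration under stated assumptions: the per-tissue Z-score, the median summary across tissues, the threshold of 2, and the function name `median_z_outliers` are all hypothetical choices made here, not details given in the abstract, and the data are synthetic.

```python
from statistics import mean, median, stdev

def median_z_outliers(expr, threshold=2.0):
    """Flag individuals whose median Z-score across tissues is extreme.

    expr maps tissue -> {individual: expression value} for a single gene.
    The threshold and the median summary are illustrative assumptions.
    """
    z_by_ind = {}
    for tissue, values in expr.items():
        xs = list(values.values())
        mu, sd = mean(xs), stdev(xs)
        if sd == 0:
            continue  # no variation in this tissue, so Z-scores are undefined
        for ind, x in values.items():
            z_by_ind.setdefault(ind, []).append((x - mu) / sd)
    med = {ind: median(zs) for ind, zs in z_by_ind.items()}
    return {ind: z for ind, z in med.items() if abs(z) >= threshold}

# Synthetic example: ten individuals, one of them strongly underexpressed.
individuals = [f"ind{i}" for i in range(10)]
base = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 2.0]
expr = {
    "tissue_a": dict(zip(individuals, base)),
    "tissue_b": dict(zip(individuals, [v * 2 for v in base])),
}
outliers = median_z_outliers(expr)
print(outliers)
```

A negative median Z-score corresponds to an underexpression outlier and a positive one to an overexpression outlier; a real analysis would also correct expression values for technical and biological covariates first.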
Auer, Paul L; Nalls, Mike; Meschia, James F; Worrall, Bradford B; Longstreth, W T; Seshadri, Sudha; Kooperberg, Charles; Burger, Kathleen M; Carlson, Christopher S; Carty, Cara L; Chen, Wei-Min; Cupples, L Adrienne; DeStefano, Anita L; Fornage, Myriam; Hardy, John; Hsu, Li; Jackson, Rebecca D; Jarvik, Gail P; Kim, Daniel S; Lakshminarayan, Kamakshi; Lange, Leslie A; Manichaikul, Ani; Quinlan, Aaron R; Singleton, Andrew B; Thornton, Timothy A; Nickerson, Deborah A; Peters, Ulrike; Rich, Stephen S
2015-07-01
Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk. The objective was to investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome. The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013. The main outcomes were the discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and the determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis).
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10⁻⁸) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10⁻⁷) with a fatty acid metabolism mechanism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke). Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
The contribution of de novo coding mutations to autism spectrum disorder.
Iossifov, Ivan; O'Roak, Brian J; Sanders, Stephan J; Ronemus, Michael; Krumm, Niklas; Levy, Dan; Stessman, Holly A; Witherspoon, Kali T; Vives, Laura; Patterson, Karynne E; Smith, Joshua D; Paeper, Bryan; Nickerson, Deborah A; Dea, Jeanselle; Dong, Shan; Gonzalez, Luis E; Mandell, Jeffrey D; Mane, Shrikant M; Murtha, Michael T; Sullivan, Catherine A; Walker, Michael F; Waqar, Zainulabedin; Wei, Liping; Willsey, A Jeremy; Yamrom, Boris; Lee, Yoon-ha; Grabowska, Ewa; Dalkic, Ertugrul; Wang, Zihua; Marks, Steven; Andrews, Peter; Leotta, Anthony; Kendall, Jude; Hakker, Inessa; Rosenbaum, Julie; Ma, Beicong; Rodgers, Linda; Troge, Jennifer; Narzisi, Giuseppe; Yoon, Seungtai; Schatz, Michael C; Ye, Kenny; McCombie, W Richard; Shendure, Jay; Eichler, Evan E; State, Matthew W; Wigler, Michael
2014-11-13
Whole exome sequencing has proven to be a powerful tool for understanding the genetic architecture of human disease. Here we apply it to more than 2,500 simplex families, each having a child with an autistic spectrum disorder. By comparing affected to unaffected siblings, we show that 13% of de novo missense mutations and 43% of de novo likely gene-disrupting (LGD) mutations contribute to 12% and 9% of diagnoses, respectively. Including copy number variants, coding de novo mutations contribute to about 30% of all simplex and 45% of female diagnoses. Almost all LGD mutations occur opposite wild-type alleles. LGD targets in affected females significantly overlap the targets in males of lower intelligence quotient (IQ), but neither overlaps significantly with targets in males of higher IQ. We estimate that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation. LGD targets in the joint class overlap with published targets for intellectual disability and schizophrenia, and are enriched for chromatin modifiers, FMRP-associated genes and embryonically expressed genes. Most of the significance for the latter comes from affected females.
Le Scouarnec, Solena; Karakachoff, Matilde; Gourraud, Jean-Baptiste; Lindenbaum, Pierre; Bonnaud, Stéphanie; Portero, Vincent; Duboscq-Bidot, Laëtitia; Daumy, Xavier; Simonet, Floriane; Teusan, Raluca; Baron, Estelle; Violleau, Jade; Persyn, Elodie; Bellanger, Lise; Barc, Julien; Chatel, Stéphanie; Martins, Raphaël; Mabo, Philippe; Sacher, Frédéric; Haïssaguerre, Michel; Kyndt, Florence; Schmitt, Sébastien; Bézieau, Stéphane; Le Marec, Hervé; Dina, Christian; Schott, Jean-Jacques; Probst, Vincent; Redon, Richard
2015-05-15
The Brugada syndrome (BrS) is a rare heritable cardiac arrhythmia disorder associated with ventricular fibrillation and sudden cardiac death. Mutations in the SCN5A gene have been causally related to BrS in 20-30% of cases. Twenty other genes have been described as involved in BrS, but their overall contribution to disease prevalence is still unclear. This study aims to estimate the burden of rare coding variation in arrhythmia-susceptibility genes among a large group of patients with BrS. We have developed a custom kit to capture and sequence the coding regions of 45 previously reported arrhythmia-susceptibility genes and applied this kit to 167 index cases presenting with a Brugada pattern on the electrocardiogram as well as 167 individuals aged over 65 years and showing no history of cardiac arrhythmia. By applying burden tests, a significant enrichment in rare coding variation (with a minor allele frequency below 0.1%) was observed only for SCN5A, with rare coding variants carried by 20.4% of cases with BrS versus 2.4% of control individuals (P = 1.4 × 10⁻⁷). No significant enrichment was observed for any other arrhythmia-susceptibility gene, including SCN10A and CACNA1C. These results indicate that, except for SCN5A, rare coding variation in previously reported arrhythmia-susceptibility genes does not contribute significantly to the occurrence of BrS in a population with European ancestry. Extreme caution should thus be taken when interpreting genetic variation in a molecular diagnostic setting, since rare coding variants were observed to a similar extent among cases and controls for most previously reported BrS-susceptibility genes. © The Author 2015. Published by Oxford University Press.
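To make the SCN5A carrier comparison above concrete, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 carrier table. It is not the burden test used in the study, whose exact form the abstract does not specify. The carrier counts (34 and 4) are assumptions obtained by rounding 20.4% and 2.4% of 167, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

n_cases = n_controls = 167
# Assumed counts, reconstructed by rounding the reported percentages.
case_carriers = round(0.204 * n_cases)        # 34
control_carriers = round(0.024 * n_controls)  # 4

p_value = fisher_exact_two_sided(
    case_carriers, n_cases - case_carriers,
    control_carriers, n_controls - control_carriers,
)
print(f"carriers: {case_carriers} vs {control_carriers}, p = {p_value:.2e}")
```

Because the test and the counts are both reconstructed here, the resulting p-value should not be expected to match the published figure exactly.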
Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.
2013-01-01
How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629
MicroRNAs in genetic disease: rethinking the dosage.
Henrion-Caude, Alexandra; Girard, Muriel; Amiel, Jeanne
2012-08-01
To date, the general assumption was that most mutations affected protein-coding genes only. Thus, only a few reports have illustrated that mutations may also occur in non-protein-coding genes such as microRNAs (miRNAs). We report progress in delineating their contribution as phenotypic modulators, genetic switches and fine-tuners of gene expression. We reasoned that surveying their contribution to genetic disease may provide a framework for understanding the requirements for devising miRNA-based therapy strategies, in particular the restoration of an appropriate dosage. Gain and loss of miRNA function call for antagonizing or supplying the miRNA, respectively. We further categorized human disease according to the different ways in which the miRNA was altered, arising either de novo or inherited, whether as a Mendelian or as an epistatic trait, uncovering its role in epigenetics. We discuss how improving our knowledge of the contribution of miRNAs to genetic disease may help in devising appropriate gene therapy strategies.
Usein, C R; Damian, M; Tatu-Chitoiu, D; Capusa, C; Fagaras, R; Tudorache, D; Nica, M; Le Bouguénec, C
2001-01-01
A total of 78 E. coli strains isolated from adults with different types of urinary tract infections were screened by polymerase chain reaction for prevalence of genetic regions coding for virulence factors. The targeted genetic determinants were those coding for type 1 fimbriae (fimH), pili associated with pyelonephritis (pap), S and F1C fimbriae (sfa and foc), afimbrial adhesins (afa), hemolysin (hly), cytotoxic necrotizing factor (cnf), and aerobactin (aer). Among the studied strains, the prevalence of genes coding for fimbrial adhesive systems was 86%, 36%, and 23% for fimH, pap, and sfa/foc, respectively. The operons coding for Afa afimbrial adhesins were identified in 14% of strains. The hly and cnf genes coding for toxins were amplified in 23% and 13% of strains, respectively. A prevalence of 54% was found for the aer gene. The various combinations of detected genes were designated as virulence patterns. The strains isolated from the hospitalized patients displayed a greater number of virulence genes and a greater diversity of gene associations than the strains isolated from the ambulatory subjects. A rapid assessment of bacterial pathogenicity characteristics may contribute to a better medical approach to patients with urinary tract infections.
San Diego Supercomputer Center
Variants in Non-Coding DNA Contribute to Inherited Autism Risk. Gene mutations appearing for the first time contribute to approximately one-third of cases of autism spectrum
Non-coding variants contribute to the clinical heterogeneity of TTR amyloidosis.
Iorio, Andrea; De Lillo, Antonella; De Angelis, Flavio; Di Girolamo, Marco; Luigetti, Marco; Sabatelli, Mario; Pradotto, Luca; Mauro, Alessandro; Mazzeo, Anna; Stancanelli, Claudia; Perfetto, Federico; Frusconi, Sabrina; My, Filomena; Manfellotto, Dario; Fuciarelli, Maria; Polimanti, Renato
2017-09-01
Coding mutations in the TTR gene cause a rare hereditary form of systemic amyloidosis, which has a complex genotype-phenotype correlation. We investigated the role of non-coding variants in regulating TTR gene expression and consequently amyloidosis symptoms. We evaluated the genotype-phenotype correlation considering the clinical information of 129 Italian patients with TTR amyloidosis. Then, we re-sequenced the TTR gene to investigate how non-coding variants affect TTR expression and, consequently, phenotypic presentation in carriers of amyloidogenic mutations. Polygenic scores for genetically determined TTR expression were constructed using data from our re-sequencing analysis and the GTEx (Genotype-Tissue Expression) project. We confirmed a strong phenotypic heterogeneity across coding mutations causing TTR amyloidosis. Considering the effects of non-coding variants on TTR expression, we identified three patient clusters with specific expression patterns associated with certain phenotypic presentations, including late onset, autonomic neurological involvement, and gastrointestinal symptoms. This study provides novel data regarding the role of non-coding variation and gene expression profiles in patients affected by TTR amyloidosis, and puts forth an approach that could be used to investigate the mechanisms underlying the genotype-phenotype correlation of the disease.
Long non-coding RNAs and mRNAs profiling during spleen development in pig.
Che, Tiandong; Li, Diyan; Jin, Long; Fu, Yuhua; Liu, Yingkai; Liu, Pengliang; Wang, Yixin; Tang, Qianzi; Ma, Jideng; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou
2018-01-01
Genome-wide transcriptomic studies in humans and mice have become extensive and mature. However, a comprehensive and systematic understanding of protein-coding genes and long non-coding RNAs (lncRNAs) expressed during pig spleen development has not been achieved. LncRNAs are known to participate in regulatory networks for an array of biological processes. Here, we constructed 18 RNA libraries from developing fetal pig spleen (55 days before birth), postnatal pig spleens (0, 30, 180 days and 2 years after birth), and the samples from the 2-year-old Wild Boar. A total of 15,040 lncRNA transcripts were identified among these samples. We found that the temporal expression pattern of lncRNAs was more restricted than observed for protein-coding genes. Time-series analysis showed two large modules for protein-coding genes and lncRNAs. The up-regulated module was enriched for genes related to immune and inflammatory function, while the down-regulated module was enriched for cell proliferation processes such as cell division and DNA replication. Co-expression networks indicated the functional relatedness between protein-coding genes and lncRNAs, which were enriched for similar functions over the series of time points examined. We identified numerous differentially expressed protein-coding genes and lncRNAs in all five developmental stages. Notably, ceruloplasmin precursor (CP), a protein-coding gene participating in antioxidant and iron transport processes, was differentially expressed in all stages. This study provides the first catalog of the developing pig spleen, and contributes to a fuller understanding of the molecular mechanisms underpinning mammalian spleen development.
Ming-Xing, Lu; Zhi-Teng, Chen; Wei-Wei, Yu; Yu-Zhou, Du
2017-03-01
We report the complete mitochondrial genome (mitogenome) of a spiraling whitefly, Aleurodicus dispersus (Hemiptera: Aleyrodidae). The 16 170 bp long genome consists of 13 protein-coding genes, 20 transfer RNAs, 2 ribosomal RNAs, and a control region. The A. dispersus mitogenome also includes a cytb-like non-coding region and shows several variations relative to the typical insect mitogenome. A phylogenetic tree has been constructed using the 13 protein-coding genes of 12 related species from Hemiptera. Our results would contribute to further study of phylogeny in Aleyrodidae and Hemiptera.
Qin, Wanhai; Wang, Lei; Zhai, Ruidong; Ma, Qiuyue; Liu, Jianfang; Bao, Chuntong; Zhang, Hu; Sun, Changjiang; Feng, Xin; Gu, Jingmin; Du, Chongtao; Han, Wenyu; Langford, P R; Lei, Liancheng
2016-01-01
Actinobacillus pleuropneumoniae is an important pathogen that causes respiratory disease in pigs. Trimeric autotransporter adhesin (TAA) is a recently discovered bacterial virulence factor that mediates bacterial adhesion and colonization. Two TAA coding genes have been found in the genome of A. pleuropneumoniae strain 5b L20, but whether they contribute to bacterial pathogenicity is unclear. In this study, we used homologous recombination to construct a double-gene deletion mutant, ΔTAA, in which both TAA coding genes were deleted and used it in in vivo and in vitro studies to confirm that TAAs participate in bacterial auto-aggregation, biofilm formation, cell adhesion and virulence in mice. A microarray analysis was used to determine whether TAAs can regulate other A. pleuropneumoniae genes during interactions with porcine primary alveolar macrophages. The results showed that deletion of both TAA coding genes up-regulated 36 genes, including ene1514, hofB and tbpB2, and simultaneously down-regulated 36 genes, including lgt, murF and ftsY. These data illustrate that TAAs help to maintain full bacterial virulence both directly, through their bioactivity, and indirectly by regulating the bacterial type II and IV secretion systems and regulating the synthesis or secretion of virulence factors. This study not only enhances our understanding of the role of TAAs but also has significance for those studying A. pleuropneumoniae pathogenesis.
Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.
2014-01-01
Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168
Luan, Jun-Bo; Chen, Wenbo; Hasegawa, Daniel K; Simmons, Alvin M; Wintermantel, William M; Ling, Kai-Shu; Fei, Zhangjun; Liu, Shu-Sheng; Douglas, Angela E
2015-09-15
Genomic decay is a common feature of intracellular bacteria that have entered into symbiosis with plant sap-feeding insects. This study of the whitefly Bemisia tabaci and two bacteria (Portiera aleyrodidarum and Hamiltonella defensa) cohoused in each host cell investigated whether the decay of Portiera metabolism genes is complemented by host and Hamiltonella genes, and compared the metabolic traits of the whitefly symbiosis with other sap-feeding insects (aphids, psyllids, and mealybugs). Parallel genomic and transcriptomic analysis revealed that the host genome contributes multiple metabolic reactions that complement or duplicate Portiera function, and that Hamiltonella may contribute multiple cofactors and one essential amino acid, lysine. Homologs of the Bemisia metabolism genes of insect origin have also been implicated in essential amino acid synthesis in other sap-feeding insect hosts, indicative of parallel coevolution of shared metabolic pathways across multiple symbioses. Further metabolism genes coded in the Bemisia genome are of bacterial origin, but phylogenetically distinct from Portiera, Hamiltonella and horizontally transferred genes identified in other sap-feeding insects. Overall, 75% of the metabolism genes of bacterial origin are functionally unique to one symbiosis, indicating that the evolutionary history of metabolic integration in these symbioses is strongly contingent on the pattern of horizontally acquired genes. Our analysis, further, shows that bacteria with genomic decay enable host acquisition of complex metabolic pathways by multiple independent horizontal gene transfers from exogenous bacteria. Specifically, each horizontally acquired gene can function with other genes in the pathway coded by the symbiont, while facilitating the decay of the symbiont gene coding the same reaction. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Identification of phasiRNAs in wild rice (Oryza rufipogon).
Liu, Yang; Wang, Yu; Zhu, Qian-Hao; Fan, Longjiang
2013-08-01
Plant miRNAs can trigger the production of phased, secondary siRNAs from either non-coding or protein-coding genes. In this study, at least 864 and 3,961 loci generating 21-nt and 24-nt phased siRNAs (phasiRNAs), respectively, were identified in three tissues from wild rice. Of these phasiRNA-producing loci, or PHAS genes, biogenesis of phasiRNAs in at least 160 of the 21-nt and 254 of the 24-nt loci could be triggered by interaction with miRNA(s). Developing seeds had more PHAS genes than leaves and roots. Genetic constraint on miRNA-triggered PHAS genes suggests that phasiRNAs might be one of the driving forces that contributed to rice domestication.
McLysaght, Aoife; Guerzoni, Daniele
2015-09-26
The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an 'RNA-first' or 'ORF-first' pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations. © 2015 The Authors.
Ricaño-Ponce, Isis; Zhernakova, Daria V; Deelen, Patrick; Luo, Oscar; Li, Xingwang; Isaacs, Aaron; Karjalainen, Juha; Di Tommaso, Jennifer; Borek, Zuzanna Agnieszka; Zorro, Maria M; Gutierrez-Achury, Javier; Uitterlinden, Andre G; Hofman, Albert; van Meurs, Joyce; Netea, Mihai G; Jonkers, Iris H; Withoff, Sebo; van Duijn, Cornelia M; Li, Yang; Ruan, Yijun; Franke, Lude; Wijmenga, Cisca; Kumar, Vinod
2016-04-01
Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that for part of the AID loci multiple causal genes exist. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways like autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, which suggest a model where AID genes are regulated by a network of chromatin looping/non-coding RNA interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
The complete mitochondrial genome of the Bagarius yarrelli from Honghe River
NASA Astrophysics Data System (ADS)
Du, M.; Zhou, C. J.; Niu, B. Z.; Liu, Y. H.; Li, N.; Ai, J. L.; Xu, G. L.
2016-08-01
The complete mitochondrial DNA sequence of Bagarius yarrelli from the Honghe River of China is determined in this paper. The circular molecule is 16,524 base pairs long and shows a gene order similar to that of other bony fishes, comprising a non-coding control region, an origin of replication, two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes and 13 protein-coding genes. Its overall base composition is 31.4% A, 26.9% C, 15.7% G and 26.0% T, with an A+T bias of 57.4%. These mitochondrial data will contribute to further study of the molecular evolution and population genetics of this species.
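The A+T figure in this abstract follows directly from the reported base percentages (31.4% A plus 26.0% T). As a minimal sketch of that arithmetic, and of how base composition could be computed from a sequence, the Python below may help; the function names and the toy sequence are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(composition: dict[str, float]) -> float:
    """A+T content in percent, given per-base percentages."""
    return composition["A"] + composition["T"]


# Percentages as reported in the abstract.
reported = {"A": 31.4, "C": 26.9, "G": 15.7, "T": 26.0}
reported_at = round(at_content(reported), 1)        # matches the stated 57.4%
reported_total = round(sum(reported.values()), 1)   # the four values sum to 100.0

# Hypothetical toy sequence, only to show the function on raw bases.
toy = base_composition("ATGCATTA")
```

The same check applies to any of the mitogenome abstracts in this listing that report per-base percentages.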
Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-Man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H-H; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B; Adair, Linda S; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; Chen, Yii-Der Ida; Shu, Xiao-Ou; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars G; Nielsen, Jonas Bille; Tse, Hung-Fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Kathiresan, Sekar; Mohlke, Karen L; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J
2017-12-01
Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used an exome array to examine protein-coding genetic variants in 47,532 East Asian individuals. We identified 255 variants at 41 loci that reached chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After a meta-analysis including >300,000 European samples, we identified an additional nine novel loci. Sixteen genes were identified by protein-altering variants in both East Asians and Europeans, and thus are likely to be functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.
Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J.; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N.; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H.-H.; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B.; Adair, Linda S.; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; da Chen, Yii-Der I; Shu, XiaoOu; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K.; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars; Nielsen, Jonas Bille; Tse, Hung-fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y. Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Consortium, GLGC; Kathiresan, Sekar; Mohlke, Karen L.; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J
2017-01-01
Most genome-wide association studies have been conducted in European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we examined protein-coding genetic variants in 47,532 East Asian individuals using an exome array. We identified 255 variants at 41 loci reaching chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After meta-analysis with > 300,000 European samples, we identified an additional 9 novel loci. The same 16 genes were identified by the protein-altering variants in both East Asians and Europeans, likely pointing to the functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population-specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci. PMID:29083407
Independent evolution of genomic characters during major metazoan transitions.
Simakov, Oleg; Kawashima, Takeshi
2017-07-15
Metazoan evolution encompasses a vast evolutionary time scale spanning over 600 million years. Our ability to infer ancestral metazoan characters, both morphological and functional, is limited by our understanding of the nature and evolutionary dynamics of the underlying regulatory networks. Increasing coverage of metazoan genomes enables us to identify the evolutionary changes of the relevant genomic characters such as the loss or gain of coding sequences, gene duplications, micro- and macro-synteny, and non-coding element evolution in different lineages. In this review we describe recent advances in our understanding of ancestral metazoan coding and non-coding features, as deduced from genomic comparisons. Some genomic changes such as innovations in gene and linkage content occur at different rates across metazoan clades, suggesting some level of independence among genomic characters. While their contribution to biological innovation remains largely unclear, we review recent literature about certain genomic changes that do correlate with changes to specific developmental pathways and metazoan innovations. In particular, we discuss the origins of the recently described pharyngeal cluster which is conserved across deuterostome genomes, and highlight different genomic features that have contributed to the evolution of this group. We also assess our current capacity to infer ancestral metazoan states from gene models and comparative genomics tools and elaborate on the future directions of metazoan comparative genomics relevant to evo-devo studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Prescott, Natalie J.; Lehne, Benjamin; Stone, Kristina; Lee, James C.; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M.; Simpson, Michael A.; Spain, Sarah L.; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J.; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu’Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C.; Mansfield, John C.; Sanderson, Jeremy; Lewis, Cathryn M.; Weale, Michael E.; Schlitt, Thomas; Mathew, Christopher G.
2015-01-01
The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn’s disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65 × 10^-10, OR = 2.3 [95% CI = 1.75–3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1–5%) variants in 3 additional genes showed suggestive association (p<0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis. PMID:25671699
Wang, Yupeng; Wang, Xiyin; Tang, Haibao; Tan, Xu; Ficklin, Stephen P.; Feltus, F. Alex; Paterson, Andrew H.
2011-01-01
Background Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. Results In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. Conclusion Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution. PMID:22164235
Li, Weijun; Wang, Zongqing; Che, Yanli
2017-11-12
In this study, the complete mitochondrial genome of Cryptocercus meridianus was sequenced. The circular mitochondrial genome is 15,322 bp in size and contains 13 protein-coding genes, two ribosomal RNA genes (12S rRNA and 16S rRNA), 22 transfer RNA genes, and one D-loop region. We compare the mitogenome of C. meridianus with that of C. relictus and C. kyebangensis. The base composition of the whole genome was 45.20%, 9.74%, 16.06%, and 29.00% for A, G, C, and T, respectively; it shows a high AT content (74.2%), similar to the mitogenomes of C. relictus and C. kyebangensis. The protein-coding genes are initiated with typical mitochondrial start codons except for cox1 with TTG. The gene order of the C. meridianus mitogenome differs from the typical insect pattern by the translocation of tRNA-Ser(AGN), while the mitogenomes of the other two Cryptocercus species, C. relictus and C. kyebangensis, are consistent with the typical insect pattern. There are two very long non-coding intergenic regions lying on both sides of the rearranged gene tRNA-Ser(AGN). The phylogenetic relationships were constructed based on the nucleotide sequence of 13 protein-coding genes and two ribosomal RNA genes. The mitogenome of C. meridianus is the first representative of the order Blattodea that demonstrates rearrangement, and it will contribute to the further study of the phylogeny and evolution of the genus Cryptocercus and related taxa.
Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.
2015-01-01
Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10^-7), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts: rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10^-7); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10^-7 and OR = 1.09, P = 7.4 × 10^-8); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10^-9), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10^-6). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10^-4) and DNA mismatch repair genes (P = 6.1 × 10^-4) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438
Severgnini, Marco; Bicciato, Silvio; Mangano, Eleonora; Scarlatti, Francesca; Mezzelani, Alessandra; Mattioli, Michela; Ghidoni, Riccardo; Peano, Clelia; Bonnal, Raoul; Viti, Federica; Milanesi, Luciano; De Bellis, Gianluca; Battaglia, Cristina
2006-06-01
Meta-analysis of microarray data is increasingly important, considering both the availability of multiple platforms using disparate technologies and the accumulation in public repositories of data sets from different laboratories. We addressed the issue of comparing gene expression profiles from two microarray platforms by devising a standardized investigative strategy. We tested this procedure by studying MDA-MB-231 cells, which undergo apoptosis on treatment with resveratrol. Gene expression profiles were obtained using high-density, short-oligonucleotide, single-color microarray platforms: GeneChip (Affymetrix) and CodeLink (Amersham). Interplatform analyses were carried out on 8414 common transcripts represented on both platforms, as identified by LocusLink ID, representing 70.8% and 88.6% of annotated GeneChip and CodeLink features, respectively. We identified 105 differentially expressed genes (DEGs) on CodeLink and 42 DEGs on GeneChip. Among them, only 9 DEGs were commonly identified by both platforms. Multiple analyses (BLAST alignment of probes with target sequences, gene ontology, literature mining, and quantitative real-time PCR) permitted us to investigate the factors contributing to the generation of platform-dependent results in single-color microarray experiments. An effective approach to cross-platform comparison involves microarrays of similar technologies, samples prepared by identical methods, and a standardized battery of bioinformatic and statistical analyses.
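The cross-platform agreement reported in this abstract (105 DEGs on CodeLink, 42 on GeneChip, 9 shared) can be summarized with simple overlap statistics. The sketch below is only an illustration of that arithmetic using the counts from the abstract; the Jaccard index is a summary chosen here for convenience, not a measure the authors report, and the function name is hypothetical.

```python
def jaccard_from_counts(n_a: int, n_b: int, n_shared: int) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| computed from set sizes alone."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either set size")
    union = n_a + n_b - n_shared
    if union <= 0:
        raise ValueError("both sets are empty")
    return n_shared / union


# Counts taken from the abstract.
codelink_degs = 105
genechip_degs = 42
shared_degs = 9

overlap = jaccard_from_counts(codelink_degs, genechip_degs, shared_degs)  # 9 / 138
share_of_codelink = shared_degs / codelink_degs   # fraction of CodeLink DEGs confirmed
share_of_genechip = shared_degs / genechip_degs   # fraction of GeneChip DEGs confirmed

print(f"Jaccard index: {overlap:.3f}")
print(f"Shared / CodeLink DEGs: {share_of_codelink:.1%}")
print(f"Shared / GeneChip DEGs: {share_of_genechip:.1%}")
```

Working from counts rather than gene lists keeps the sketch faithful to what the abstract actually provides; with the full DEG lists, Python's built-in `set` intersection and union would give the same figures.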
Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.
Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P
2015-04-23
With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.
Łukasiewicz, Kinga; Węgrzyn, Grzegorz
2016-01-01
An isolated population of the apollo butterfly (Parnassius apollo, Lepidoptera: Papilionidae) occurs in Pieniny National Park (Poland). Deformations and reductions of wings are found in a relatively large number of individuals from this population, yet the reasons for these defects are unknown. During studies devoted to identifying the cause(s) of this phenomenon, we found that specific regions of the genes coding for the enzymes laccase 1 and laccase 2 could not be amplified from DNA samples isolated from large fractions of malformed insects, while the expected PCR products were detected in almost all (with one exception) normal butterflies. Laccases (p-diphenol:dioxygen oxidoreductases) are oxidases containing several copper atoms. They catalyse single-electron oxidations of phenolic or other compounds with concomitant reduction of oxygen to water. In insects, their enzymatic activities were found previously in epidermis, midgut, Malpighian tubules, salivary glands, and reproductive tissues. Therefore, we suggest that defects in genes coding for laccases might contribute to deformation and reduction of wings in apollo butterflies, though it seems obvious that deficiency in these enzymes could not be the sole cause of these developmental abnormalities in P. apollo from Pieniny National Park.
The complete mitochondrial genome of Gobiobotia filifer (Teleostei, Cypriniformes: Cyprinidae).
Li, Qiang; Liu, Ya; Zhou, Jian; Gong, Quan; Li, Hua; Lai, Jiansheng; Li, Lianman
2016-09-01
Gobiobotia filifer is a small, economically important fish distributed in the upper reaches of the Yangtze River and its tributaries. Because of environmental pollution and overfishing, its population has declined drastically in recent decades, so it is essential to protect this resource. In this study, the complete mitochondrial genome sequence of G. filifer was determined by PCR; it has a total length of 16,613 bp and contains 13 protein-coding genes, 22 tRNA genes, two rRNA genes, and a non-coding control region. The order and composition of genes were similar to those of most other teleost fish. Most of the genes were encoded on the heavy strand, except for the ND6 gene and eight tRNA genes. As in most other vertebrates, a bias in G and C content was found in different genes/regions. The complete mitochondrial genome sequence of G. filifer will contribute to a better understanding of the evolution and population genetics of this lineage, and will help administrative departments to make rules and laws to protect it.
The complete mitochondrial genome of Liobagrus marginatus (Teleostei, Siluriformes: Amblycipitidae).
Li, Qiang; Du, Jun; Liu, Ya; Zhou, Jian; Ke, Hongyu; Liu, Chao; Liu, Guangxun
2014-04-01
Liobagrus marginatus is an economically important fish distributed in the upper reaches of the Yangtze River and its tributaries. Because of demand for its fresh taste, environmental pollution and overfishing, its population has declined drastically and its body size has decreased in recent decades, so it is essential to protect this resource. In this study, the complete mitochondrial genome of Liobagrus marginatus was sequenced; it has a total length of 16,497 bp and contains 22 tRNA genes, 13 protein-coding genes, 2 rRNA genes, and a non-coding control region. The gene arrangement and composition are similar to those of most other fish. Most of the genes are encoded on the heavy strand, except for eight tRNA genes and the ND6 gene. As in most other vertebrates, a bias in G and C content was found in different genes/regions. The complete mitochondrial genome sequence of Liobagrus marginatus will contribute to a better understanding of the population genetics and evolution of this lineage, and will help administrative departments to make rules and laws to protect it.
Gaddelapati, Sharath Chandra; Kalsi, Megha; Roy, Amit; Palli, Subba Reddy
2018-08-01
The Colorado potato beetle (CPB), Leptinotarsa decemlineata, developed resistance to imidacloprid after exposure to this insecticide for multiple generations. Our previous studies showed that the xenobiotic transcription factor cap 'n' collar isoform C (CncC) regulates the expression of multiple cytochrome P450 genes, which play essential roles in resistance to plant allelochemicals and insecticides. In this study, we sought to obtain a comprehensive picture of the genes regulated by CncC in imidacloprid-resistant CPB. We performed sequencing of RNA isolated from imidacloprid-resistant CPB treated with dsRNA targeting CncC or the gene coding for green fluorescent protein (control). Comparative transcriptome analysis showed that CncC regulated the expression of 1798 genes, out of which 1499 genes were downregulated in CncC knockdown beetles. Interestingly, expression of 79% of imidacloprid-induced P450 genes requires CncC. We performed quantitative real-time PCR to verify the reduction in the expression of 20 genes including those coding for detoxification enzymes (P450s, glutathione S-transferases, and esterases) and ABC transporters. The genes coding for ABC transporters are induced in insecticide-resistant CPB and require CncC for their expression. Knockdown of genes coding for ABC transporters simultaneously or individually caused an increase in imidacloprid-induced mortality in resistant beetles, confirming their contribution to insecticide resistance. These studies identified CncC as a transcription factor involved in regulation of genes responsible for imidacloprid resistance. Small molecule inhibitors of CncC or suppression of CncC by RNAi could provide effective synergists for pest control or management of insecticide resistance. Copyright © 2018 Elsevier Ltd. All rights reserved.
A Molecular Portrait of De Novo Genes in Yeasts.
Vakirlis, Nikolaos; Hebert, Alex S; Opulente, Dana A; Achaz, Guillaume; Hittinger, Chris Todd; Fischer, Gilles; Coon, Joshua J; Lafontaine, Ingrid
2018-03-01
New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
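The abstract reports that fortuitous ORFs are most likely in GC-rich intergenic regions. One commonly cited contributing reason, which the abstract itself does not spell out, is that the three standard stop codons (TAA, TAG, TGA) are AT-rich, so they occur less often by chance as GC content rises. The sketch below illustrates this under a deliberately simple assumption: bases are drawn independently, with P(G) = P(C) and P(A) = P(T). The function names are hypothetical and the model is not the authors' method.

```python
import math


def stop_codon_probability(gc: float) -> float:
    """Chance that a random codon is TAA, TAG or TGA under an i.i.d. base model."""
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    p_at = (1.0 - gc) / 2.0  # P(A) = P(T)
    p_gc = gc / 2.0          # P(G) = P(C)
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def expected_codons_to_stop(gc: float) -> float:
    """Mean number of codons read up to and including the first stop (geometric mean 1/p)."""
    p_stop = stop_codon_probability(gc)
    return math.inf if p_stop == 0.0 else 1.0 / p_stop


for gc in (0.3, 0.5, 0.7):
    print(f"GC = {gc:.1f}: P(stop) = {stop_codon_probability(gc):.4f}, "
          f"expected codons to first stop = {expected_codons_to_stop(gc):.1f}")
```

Under this toy model, random open reading frames run longer on average as GC content increases, which is consistent with the direction of the effect described in the abstract; real intergenic sequence is not i.i.d., so the numbers are illustrative only.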
Shi, Lihua; Zhang, Zhe; Yu, Angela M; Wang, Wei; Wei, Zhi; Akhter, Ehtisham; Maurer, Kelly; Costa Reis, Patrícia; Song, Li; Petri, Michelle; Sullivan, Kathleen E
2014-01-01
Gene expression studies of peripheral blood mononuclear cells from patients with systemic lupus erythematosus (SLE) have demonstrated a type I interferon signature and increased expression of inflammatory cytokine genes. Studies of patients with Aicardi Goutières syndrome, commonly cited as a single gene model for SLE, have suggested that accumulation of non-coding RNAs may drive some of the pathologic gene expression; however, no RNA sequencing studies of SLE patients have been performed. This study was designed to define altered expression of coding and non-coding RNAs and to detect globally altered RNA processing in SLE. Purified monocytes from eight healthy age/gender matched controls and nine SLE patients (with low-moderate disease activity and lack of biologic drug use or immune suppressive treatment) were studied using RNA-seq. Quantitative RT-PCR was used to validate findings. Serum levels of endotoxin were measured by ELISA. We found that SLE patients had diminished expression of most endogenous retroviruses and small nucleolar RNAs, but exhibited increased expression of pri-miRNAs. Splicing patterns and polyadenylation were significantly altered. In addition, SLE monocytes expressed novel transcripts, an effect that was replicated by LPS treatment of control monocytes. We further identified increased circulating endotoxin in SLE patients. Monocytes from SLE patients exhibit globally dysregulated gene expression. The transcriptome is not simply altered by the transcriptional activation of a set of genes, but is qualitatively different in SLE. The identification of novel loci, inducible by LPS, suggests that chronic microbial translocation could contribute to the immunologic dysregulation in SLE, a new potential disease mechanism.
Meta-analysis and genome-wide interpretation of genetic susceptibility to drug addiction
2011-01-01
Background Classical genetic studies provide strong evidence for heritable contributions to susceptibility to developing dependence on addictive substances. Candidate gene and genome-wide association studies (GWAS) have sought genes, chromosomal regions and allelic variants likely to contribute to susceptibility to drug addiction. Results Here, we performed a meta-analysis of addiction candidate gene association studies and GWAS to investigate possible functional mechanisms associated with addiction susceptibility. From meta-data retrieved from 212 publications on candidate gene association studies and 5 GWAS reports, we linked a total of 843 haplotypes to addiction susceptibility. We mapped the SNPs in these haplotypes to functional and regulatory elements in the genome and estimated the magnitude of the contributions of different molecular mechanisms to their effects on addiction susceptibility. In addition to SNPs in coding regions, these data suggest that haplotypes in gene regulatory regions may also contribute to addiction susceptibility. When we compared the lists of genes identified by association studies and those identified by molecular biological studies of drug-regulated genes, we observed significantly higher participation in the same gene interaction networks than expected by chance, despite little overlap between the two gene lists. Conclusions These results appear to offer new insights into the genetic factors underlying drug addiction. PMID:21999673
Domestic animals as models for biomedical research
Andersson, Leif
2016-01-01
Domestic animals are unique models for biomedical research due to their long history (thousands of years) of strong phenotypic selection. This process has enriched for novel mutations that have contributed to phenotype evolution in domestic animals. The characterization of such mutations provides insights into gene function and biological mechanisms. This review summarizes genetic dissection of about 50 genetic variants affecting pigmentation, behaviour, metabolic regulation, and the pattern of locomotion. The variants are controlled by mutations in about 30 different genes, and for 10 of these our group was the first to report an association between the gene and a phenotype. Almost half of the reported mutations occur in non-coding sequences, suggesting that this is the most common type of polymorphism underlying phenotypic variation, since this is a biased list where the proportion of coding mutations is inflated as they are easier to find. The review documents that structural changes (duplications, deletions, and inversions) have contributed significantly to the evolution of phenotypic diversity in domestic animals. Finally, we describe five examples of evolution of alleles, which means that alleles have evolved by the accumulation of several consecutive mutations affecting the function of the same gene. PMID:26479863
Pleiotropic Effects of Variants in Dementia Genes in Parkinson Disease.
Ibanez, Laura; Dube, Umber; Davis, Albert A; Fernandez, Maria V; Budde, John; Cooper, Breanna; Diez-Fairen, Monica; Ortega-Cubero, Sara; Pastor, Pau; Perlmutter, Joel S; Cruchaga, Carlos; Benitez, Bruno A
2018-01-01
Background: The prevalence of dementia in Parkinson disease (PD) increases dramatically with advancing age, approaching 80% in patients who survive 20 years with the disease. Increasing evidence suggests clinical, pathological and genetic overlap between Alzheimer disease, dementia with Lewy bodies and frontotemporal dementia with PD. However, the contribution of the dementia-causing genes to PD risk, cognitive impairment and dementia in PD is not fully established. Objective: To assess the contribution of coding variants in Mendelian dementia-causing genes on the risk of developing PD and the effect on cognitive performance of PD patients. Methods: We analyzed the coding regions of the amyloid-beta precursor protein (APP), Presenilin 1 and 2 (PSEN1, PSEN2), and Granulin (GRN) genes from 1,374 PD cases and 973 controls using pooled-DNA targeted sequencing, human exome-chip and whole-exome sequencing (WES) data by single-variant and gene-based (SKAT-O and burden tests) analyses. Global cognitive function was assessed using the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA). The effect of coding variants in dementia-causing genes on cognitive performance was tested by multiple regression analysis adjusting for gender, disease duration, age at dementia assessment, study site and APOE carrier status. Results: Known AD pathogenic mutations in the PSEN1 (p.A79V) and PSEN2 (p.V148I) genes were found in 0.3% of all PD patients. There was a significant burden of rare, likely damaging variants in the GRN and PSEN1 genes in PD patients when compared with frequencies in the European population from the ExAC database. Multiple regression analysis revealed that PD patients carrying rare variants in the APP, PSEN1, PSEN2, and GRN genes exhibit lower cognitive test scores than non-carrier PD patients (p = 2.0 × 10⁻⁴), independent of age at PD diagnosis, age at evaluation, APOE status or recruitment site. Conclusions: Pathogenic mutations in the Alzheimer disease-causing genes (PSEN1 and PSEN2) are found in sporadic PD patients. PD patients with cognitive decline carry rare variants in dementia-causing genes. Variants in genes causing Mendelian neurodegenerative diseases exhibit pleiotropic effects.
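The gene-based burden analysis mentioned in the abstract above can be illustrated with a simple collapsing test. This is only a sketch under stated assumptions, not the authors' pipeline: it collapses all rare variants in a gene into a carrier/non-carrier status and applies a one-sided Fisher's exact test, whereas the study used SKAT-O and burden tests whose exact settings are not given here. The function name `collapsing_burden_p` and the counts in the example are hypothetical.

```python
from math import comb


def collapsing_burden_p(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact p-value for an excess of rare-variant carriers in cases.

    Each individual is collapsed to carrier / non-carrier for one gene, and the
    p-value is the hypergeometric probability of seeing at least `case_carriers`
    carriers among the cases, given the table margins.
    """
    n_total = case_total + control_total
    carriers = case_carriers + control_carriers
    denom = comb(n_total, case_total)
    upper = min(carriers, case_total)
    tail = sum(
        comb(carriers, k) * comb(n_total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


# Hypothetical counts, chosen only to show the call; they are not from the study.
p_value = collapsing_burden_p(case_carriers=12, case_total=1000,
                              control_carriers=2, control_total=1000)
print(f"one-sided burden p-value: {p_value:.4g}")
```

A collapsing test of this kind has the most power when most rare variants in the gene act in the same direction; methods such as SKAT-O are designed to stay powerful when effects are mixed.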
Genomic and Epigenomic Insights into Nutrition and Brain Disorders
Dauncey, Margaret Joy
2013-01-01
Considerable evidence links many neuropsychiatric, neurodevelopmental and neurodegenerative disorders with multiple complex interactions between genetics and environmental factors such as nutrition. Mental health problems, autism, eating disorders, Alzheimer’s disease, schizophrenia, Parkinson’s disease and brain tumours are related to individual variability in numerous protein-coding and non-coding regions of the genome. However, genotype does not necessarily determine neurological phenotype because the epigenome modulates gene expression in response to endogenous and exogenous regulators, throughout the life-cycle. Studies using both genome-wide analysis of multiple genes and comprehensive analysis of specific genes are providing new insights into genetic and epigenetic mechanisms underlying nutrition and neuroscience. This review provides a critical evaluation of the following related areas: (1) recent advances in genomic and epigenomic technologies, and their relevance to brain disorders; (2) the emerging role of non-coding RNAs as key regulators of transcription, epigenetic processes and gene silencing; (3) novel approaches to nutrition, epigenetics and neuroscience; (4) gene-environment interactions, especially in the serotonergic system, as a paradigm of the multiple signalling pathways affected in neuropsychiatric and neurological disorders. Current and future advances in these four areas should contribute significantly to the prevention, amelioration and treatment of multiple devastating brain disorders. PMID:23503168
Decoding critical long non-coding RNA in ovarian cancer epithelial-to-mesenchymal transition.
Mitra, Ramkrishna; Chen, Xi; Greenawalt, Evan J; Maulik, Ujjwal; Jiang, Wei; Zhao, Zhongming; Eischen, Christine M
2017-11-17
Long non-coding RNA (lncRNA) are emerging as contributors to malignancies. Little is understood about the contribution of lncRNA to epithelial-to-mesenchymal transition (EMT), which correlates with metastasis. Ovarian cancer is usually diagnosed after metastasis. Here we report an integrated analysis of >700 ovarian cancer molecular profiles, including genomic data sets, from four patient cohorts identifying lncRNA DNM3OS, MEG3, and MIAT overexpression and their reproducible gene regulation in ovarian cancer EMT. Genome-wide mapping shows 73% of MEG3-regulated EMT-linked pathway genes contain MEG3 binding sites. DNM3OS overexpression, but not MEG3 or MIAT, significantly correlates with worse overall patient survival. DNM3OS knockdown results in altered EMT-linked genes/pathways, mesenchymal-to-epithelial transition, and reduced cell migration and invasion. Proteotranscriptomic characterization further supports the DNM3OS and ovarian cancer EMT connection. TWIST1 overexpression and DNM3OS amplification provide an explanation for increased DNM3OS levels. Therefore, our results elucidate lncRNA that regulate EMT and demonstrate DNM3OS specifically contributes to EMT in ovarian cancer.
Non-coding functions of alternative pre-mRNA splicing in development
Mockenhaupt, Stefan; Makeyev, Eugene V.
2015-01-01
A majority of messenger RNA precursors (pre-mRNAs) in the higher eukaryotes undergo alternative splicing to generate more than one mature product. By targeting the open reading frame region this process increases diversity of protein isoforms beyond the nominal coding capacity of the genome. However, alternative splicing also frequently controls output levels and spatiotemporal features of cellular and organismal gene expression programs. Here we discuss how these non-coding functions of alternative splicing contribute to development through regulation of mRNA stability, translational efficiency and cellular localization. PMID:26493705
Yang, Xiaofei; Gao, Lin; Guo, Xingli; Shi, Xinghua; Wu, Hao; Song, Fei; Wang, Bingbo
2014-01-01
Increasing evidence has indicated that long non-coding RNAs (lncRNAs) are implicated in and associated with many complex human diseases. Despite the accumulation of lncRNA-disease associations, only a few studies have examined the roles of these associations in pathogenesis. In this paper, we investigated lncRNA-disease associations from a network view to understand the contribution of these lncRNAs to complex diseases. Specifically, we studied both the properties of the diseases in which the lncRNAs were implicated, and those of the lncRNAs associated with complex diseases. Given that both protein-coding genes and lncRNAs are involved in human diseases, we constructed a coding-non-coding gene-disease bipartite network based on known associations between diseases and disease-causing genes. We then applied a propagation algorithm to uncover the hidden lncRNA-disease associations in this network. The algorithm was evaluated by leave-one-out cross validation on 103 diseases in which at least two genes were known to be involved, and achieved an AUC of 0.7881. Our algorithm successfully predicted 768 potential lncRNA-disease associations between 66 lncRNAs and 193 diseases. Furthermore, our results for Alzheimer's disease, pancreatic cancer, and gastric cancer were verified by other independent studies. PMID:24498199
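The propagation step described in the abstract above can be sketched on a toy bipartite network. The abstract does not say which propagation algorithm was used, so this sketch assumes a random walk with restart; the restart probability, the node names (`GENE_A`, `LNC_X`, `disease_1`, and so on) and the helper functions are all hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected bipartite graph; each edge links a gene (coding or lncRNA) to a disease."""
    graph = defaultdict(set)
    for gene, disease in edges:
        graph[gene].add(disease)
        graph[disease].add(gene)
    return dict(graph)


def propagate(graph, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart from `seed`; returns node -> steady-state probability."""
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            if mass == 0.0:
                continue
            # Spread the non-restarting share of this node's mass evenly over its neighbours.
            share = (1.0 - restart) * mass / len(graph[node])
            for neighbour in graph[node]:
                nxt[neighbour] += share
        nxt[seed] += restart  # the restarting share always returns to the seed
        change = sum(abs(nxt[n] - scores[n]) for n in graph)
        scores = nxt
        if change < tol:
            break
    return scores


def rank_candidates(graph, disease, candidates):
    """Rank candidates not yet linked to `disease` by their propagated score."""
    scores = propagate(graph, disease)
    unlinked = [c for c in candidates if c not in graph[disease]]
    return sorted(unlinked, key=lambda c: scores[c], reverse=True)


# Toy network with made-up associations.
edges = [
    ("GENE_A", "disease_1"), ("GENE_B", "disease_1"), ("LNC_X", "disease_1"),
    ("GENE_A", "disease_2"), ("LNC_X", "disease_2"), ("LNC_Y", "disease_2"),
    ("GENE_C", "disease_3"), ("LNC_Z", "disease_3"),
]
graph = build_graph(edges)
scores = propagate(graph, "disease_1")
ranking = rank_candidates(graph, "disease_1", ["LNC_X", "LNC_Y", "LNC_Z"])
print(ranking)
```

In this toy network `LNC_Y` outranks `LNC_Z` because it shares `disease_2` with genes already linked to `disease_1`, while `LNC_Z` sits in a disconnected component. A leave-one-out evaluation like the one in the abstract would hide one known association at a time, rerun the propagation, and record how highly the hidden gene is ranked, which is why each evaluated disease needs at least two known genes.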
Regulation of mammalian cell differentiation by long non-coding RNAs
Hu, Wenqian; Alvarez-Dominguez, Juan R; Lodish, Harvey F
2012-01-01
Differentiation of specialized cell types from stem and progenitor cells is tightly regulated at several levels, both during development and during somatic tissue homeostasis. Many long non-coding RNAs have been recognized as an additional layer of regulation in the specification of cellular identities; these non-coding species can modulate gene-expression programmes in various biological contexts through diverse mechanisms at the transcriptional, translational or messenger RNA stability levels. Here, we summarize findings that implicate long non-coding RNAs in the control of mammalian cell differentiation. We focus on several representative differentiation systems and discuss how specific long non-coding RNAs contribute to the regulation of mammalian development. PMID:23070366
Evaluation of 10 genes encoding cardiac proteins in Doberman Pinschers with dilated cardiomyopathy.
O'Sullivan, M Lynne; O'Grady, Michael R; Pyle, W Glen; Dawson, John F
2011-07-01
Objective: To identify a causative mutation for dilated cardiomyopathy (DCM) in Doberman Pinschers by sequencing the coding regions of 10 cardiac genes known to be associated with familial DCM in humans. Animals: 5 Doberman Pinschers with DCM and congestive heart failure and 5 control mixed-breed dogs that were euthanized or died. Procedures: RNA was extracted from frozen ventricular myocardial samples from each dog, and first-strand cDNA was synthesized via reverse transcription, followed by PCR amplification with gene-specific primers. Ten cardiac genes were analyzed: cardiac actin, α-actinin, α-tropomyosin, β-myosin heavy chain, metavinculin, muscle LIM protein, myosin-binding protein C, tafazzin, titin-cap (telethonin), and troponin T. Sequences for DCM-affected and control dogs and the published canine genome were compared. Results: None of the coding sequences yielded a common causative mutation among all Doberman Pinscher samples. However, 3 variants were identified in the α-actinin gene in the DCM-affected Doberman Pinschers. One of these variants, identified in 2 of the 5 Doberman Pinschers, resulted in an amino acid change in the rod-forming triple coiled-coil domain. Conclusions: Mutations in the coding regions of several genes associated with DCM in humans did not appear to consistently account for DCM in Doberman Pinschers. However, an α-actinin variant was detected in some Doberman Pinschers that may contribute to the development of DCM given its potential effect on the structure of this protein. Investigation of additional candidate gene coding and noncoding regions and further evaluation of the role of α-actinin in development of DCM in Doberman Pinschers are warranted.
Neuhaus, Klaus; Landstorfer, Richard; Fellner, Lea; Simon, Svenja; Schafferhans, Andrea; Goldberg, Tatyana; Marx, Harald; Ozoline, Olga N; Rost, Burkhard; Kuster, Bernhard; Keim, Daniel A; Scherer, Siegfried
2016-02-24
Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.
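The comparison against 100-times randomly shuffled proteins in the abstract above can be sketched as a shuffle-based null distribution. The abstract does not list the structural and functional predictors that were used, so this sketch substitutes a deliberately simple order-dependent feature (the longest run of hydrophobic residues). The feature, the residue set, the candidate sequence and the function names are all hypothetical.

```python
import random

# Assumed hydrophobic residue set, used only for this illustration.
HYDROPHOBIC = set("AVILMFWC")


def longest_hydrophobic_run(seq):
    """Length of the longest stretch of consecutive hydrophobic residues."""
    best = run = 0
    for aa in seq:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def shuffled_null(seq, feature, n_shuffles=100, seed=0):
    """Feature values for `n_shuffles` random permutations of `seq`.

    Shuffling keeps the amino acid composition and destroys the residue order,
    so only order-dependent features can differ from the original.
    """
    rng = random.Random(seed)
    residues = list(seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        null.append(feature("".join(residues)))
    return null


def empirical_p(observed, null):
    """Add-one empirical p-value: share of shuffles scoring at least `observed`."""
    hits = sum(1 for value in null if value >= observed)
    return (hits + 1) / (len(null) + 1)


candidate = "MKLLVVAILFWAGSDEKRNQTSPGHDE"  # made-up short protein sequence
observed = longest_hydrophobic_run(candidate)
null = shuffled_null(candidate, longest_hydrophobic_run)
p_value = empirical_p(observed, null)
print(observed, p_value)
```

With 100 shuffles the smallest attainable p-value under the add-one estimate is 1/101, so a larger number of permutations would be needed to resolve stronger signals.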
Diversity and evolution of the emerging Pandoraviridae family.
Legendre, Matthieu; Fabre, Elisabeth; Poirot, Olivier; Jeudy, Sandra; Lartigue, Audrey; Alempic, Jean-Marie; Beucher, Laure; Philippe, Nadège; Bertaux, Lionel; Christo-Foroux, Eugène; Labadie, Karine; Couté, Yohann; Abergel, Chantal; Claverie, Jean-Michel
2018-06-11
With DNA genomes reaching 2.5 Mb packed in particles of bacterium-like shape and dimension, the first two Acanthamoeba-infecting pandoraviruses remained up to now the most complex viruses since their discovery in 2013. Our isolation of three new strains from distant locations and environments is now used to perform the first comparative genomics analysis of the emerging worldwide-distributed Pandoraviridae family. Thorough annotation of the genomes combining transcriptomic, proteomic, and bioinformatic analyses reveals many non-coding transcripts and significantly reduces the former set of predicted protein-coding genes. Here we show that the pandoraviruses exhibit an open pan-genome, the enormous size of which is not adequately explained by gene duplications or horizontal transfers. As most of the strain-specific genes have no extant homolog and exhibit statistical features comparable to intergenic regions, we suggest that de novo gene creation could contribute to the evolution of the giant pandoravirus genomes.
Cipriano, Andrea; Ballarino, Monica
2018-01-01
The completion of the human genome sequence together with advances in sequencing technologies have shifted the paradigm of the genome, as composed of discrete and hereditable coding entities, and have shown the abundance of functional noncoding DNA. This part of the genome, previously dismissed as “junk” DNA, increases proportionally with organismal complexity and contributes to gene regulation beyond the boundaries of known protein-coding genes. Different classes of functionally relevant nonprotein-coding RNAs are transcribed from noncoding DNA sequences. Among them are the long noncoding RNAs (lncRNAs), which are thought to participate in the basal regulation of protein-coding genes at both transcriptional and post-transcriptional levels. Although knowledge of this field is still limited, the ability of lncRNAs to localize in different cellular compartments, to fold into specific secondary structures and to interact with different molecules (RNA or proteins) endows them with multiple regulatory mechanisms. It is becoming evident that lncRNAs may play a crucial role in most biological processes such as the control of development, differentiation and cell growth. This review places the evolution of the concept of the gene in its historical context, from Darwin's hypothetical mechanism of heredity to the post-genomic era. We discuss how the original idea of protein-coding genes as unique determinants of phenotypic traits has been reconsidered in light of the existence of noncoding RNAs. We summarize the technological developments which have been made in the genome-wide identification and study of lncRNAs and emphasize the methodologies that have aided our understanding of the complexity of lncRNA-protein interactions in recent years. PMID:29560353
Eisenberger, Tobias; Neuhaus, Christine; Khan, Arif O.; Decker, Christian; Preising, Markus N.; Friedburg, Christoph; Bieg, Anika; Gliem, Martin; Issa, Peter Charbel; Holz, Frank G.; Baig, Shahid M.; Hellenbroich, Yorck; Galvez, Alberto; Platzer, Konrad; Wollnik, Bernd; Laddach, Nadja; Ghaffari, Saeed Reza; Rafati, Maryam; Botzenhart, Elke; Tinschert, Sigrid; Börger, Doris; Bohring, Axel; Schreml, Julia; Körtge-Jung, Stefani; Schell-Apacik, Chayim; Bakur, Khadijah; Al-Aama, Jumana Y.; Neuhann, Teresa; Herkenrath, Peter; Nürnberg, Gudrun; Nürnberg, Peter; Davis, John S.; Gal, Andreas; Bergmann, Carsten; Lorenz, Birgit; Bolz, Hanno J.
2013-01-01
Retinitis pigmentosa (RP) and Leber congenital amaurosis (LCA) are major causes of blindness. They result from mutations in many genes which has long hampered comprehensive genetic analysis. Recently, targeted next-generation sequencing (NGS) has proven useful to overcome this limitation. To uncover “hidden mutations” such as copy number variations (CNVs) and mutations in non-coding regions, we extended the use of NGS data by quantitative readout for the exons of 55 RP and LCA genes in 126 patients, and by including non-coding 5′ exons. We detected several causative CNVs which were key to the diagnosis in hitherto unsolved constellations, e.g. hemizygous point mutations in consanguineous families, and CNVs complemented apparently monoallelic recessive alleles. Mutations of non-coding exon 1 of EYS revealed its contribution to disease. In view of the high carrier frequency for retinal disease gene mutations in the general population, we considered the overall variant load in each patient to assess if a mutation was causative or reflected accidental carriership in patients with mutations in several genes or with single recessive alleles. For example, truncating mutations in RP1, a gene implicated in both recessive and dominant RP, were causative in biallelic constellations, unrelated to disease when heterozygous on a biallelic mutation background of another gene, or even non-pathogenic if close to the C-terminus. Patients with mutations in several loci were common, but without evidence for di- or oligogenic inheritance. Although the number of targeted genes was low compared to previous studies, the mutation detection rate was highest (70%) which likely results from completeness and depth of coverage, and quantitative data analysis. CNV analysis should routinely be applied in targeted NGS, and mutations in non-coding exons give reason to systematically include 5′-UTRs in disease gene or exome panels. Consideration of all variants is indispensable because even truncating mutations may be misleading. PMID:24265693
Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.
2015-01-01
Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation-sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13 methylation-sensitive microRNAs was identified in temporal lobe epilepsy, including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA was documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. PMID:25552301
Contribution of transposable elements in the plant's genome.
Sahebi, Mahbod; Hanafi, Mohamed M; van Wijnen, Andre J; Rice, David; Rafii, M Y; Azizi, Parisa; Osman, Mohamad; Taheri, Sima; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat; Noor, Yusuf Muhammad
2018-07-30
Plants maintain extensive growth flexibility under different environmental conditions, allowing them to continuously and rapidly adapt to alterations in their environment. A large portion of many plant genomes consists of transposable elements (TEs) that create new genetic variations within plant species. Different types of mutations may be created by TEs in plants. Many TEs can avoid the host's defense mechanisms and survive alterations in transposition activity, internal sequence and target site. Thus, plant genomes are expected to utilize a variety of mechanisms to tolerate TEs that are near or within genes. TEs affect the expression of not only nearby genes but also unlinked inserted genes. TEs can create new promoters, leading to novel expression patterns or alternative coding regions to generate alternate transcripts in plant species. TEs can also provide novel cis-acting regulatory elements that act as enhancers or inserts within original enhancers that are required for transcription. Thus, the regulation of plant gene expression is strongly managed by the insertion of TEs into nearby genes. TEs can also lead to chromatin modifications and thereby affect gene expression in plants. TEs are able to generate new genes and modify existing gene structures by duplicating, mobilizing and recombining gene fragments. They can also facilitate cellular functions by sharing their transposase-coding regions. Hence, TE insertions can not only act as simple mutagens but can also alter the elementary functions of the plant genome. Here, we review recent discoveries concerning the contribution of TEs to gene expression in plant genomes and discuss the different mechanisms by which TEs can affect plant gene expression and reduce host defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
A global view of the nonprotein-coding transcriptome in Plasmodium falciparum
Raabe, Carsten A.; Sanchez, Cecilia P.; Randau, Gerrit; Robeck, Thomas; Skryabin, Boris V.; Chinni, Suresh V.; Kube, Michael; Reinhardt, Richard; Ng, Guey Hooi; Manickam, Ravichandran; Kuryshev, Vladimir Y.; Lanzer, Michael; Brosius, Juergen; Tang, Thean Hock; Rozhdestvensky, Timofey S.
2010-01-01
Nonprotein-coding RNAs (npcRNAs) represent an important class of regulatory molecules that act in many cellular pathways. Here, we describe the experimental identification and validation of the small npcRNA transcriptome of the human malaria parasite Plasmodium falciparum. We identified 630 novel npcRNA candidates. Based on sequence and structural motifs, 43 of them belong to the C/D and H/ACA-box subclasses of small nucleolar RNAs (snoRNAs) and small Cajal body-specific RNAs (scaRNAs). We further observed the exonization of a functional H/ACA snoRNA gene, which might contribute to the regulation of ribosomal protein L7a gene expression. Some of the small npcRNA candidates are from telomeric and subtelomeric repetitive regions, suggesting their potential involvement in maintaining telomeric integrity and subtelomeric gene silencing. We also detected 328 cis-encoded antisense npcRNAs (asRNAs) complementary to P. falciparum protein-coding genes of a wide range of biochemical pathways, including determinants of virulence and pathology. All cis-encoded asRNA genes tested exhibit lifecycle-specific expression profiles. For all but one of the respective sense–antisense pairs, we deduced concordant patterns of expression. Our findings have important implications for a better understanding of gene regulatory mechanisms in P. falciparum, revealing an extended and sophisticated npcRNA network that may control the expression of housekeeping genes and virulence factors. PMID:19864253
Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction.
Do, Ron; Stitziel, Nathan O; Won, Hong-Hee; Jørgensen, Anders Berg; Duga, Stefano; Angelica Merlini, Pier; Kiezun, Adam; Farrall, Martin; Goel, Anuj; Zuk, Or; Guella, Illaria; Asselta, Rosanna; Lange, Leslie A; Peloso, Gina M; Auer, Paul L; Girelli, Domenico; Martinelli, Nicola; Farlow, Deborah N; DePristo, Mark A; Roberts, Robert; Stewart, Alexander F R; Saleheen, Danish; Danesh, John; Epstein, Stephen E; Sivapalaratnam, Suthesh; Hovingh, G Kees; Kastelein, John J; Samani, Nilesh J; Schunkert, Heribert; Erdmann, Jeanette; Shah, Svati H; Kraus, William E; Davies, Robert; Nikpay, Majid; Johansen, Christopher T; Wang, Jian; Hegele, Robert A; Hechter, Eliana; Marz, Winfried; Kleber, Marcus E; Huang, Jie; Johnson, Andrew D; Li, Mingyao; Burke, Greg L; Gross, Myron; Liu, Yongmei; Assimes, Themistocles L; Heiss, Gerardo; Lange, Ethan M; Folsom, Aaron R; Taylor, Herman A; Olivieri, Oliviero; Hamsten, Anders; Clarke, Robert; Reilly, Dermot F; Yin, Wu; Rivas, Manuel A; Donnelly, Peter; Rossouw, Jacques E; Psaty, Bruce M; Herrington, David M; Wilson, James G; Rich, Stephen S; Bamshad, Michael J; Tracy, Russell P; Cupples, L Adrienne; Rader, Daniel J; Reilly, Muredach P; Spertus, John A; Cresci, Sharon; Hartiala, Jaana; Tang, W H Wilson; Hazen, Stanley L; Allayee, Hooman; Reiner, Alex P; Carlson, Christopher S; Kooperberg, Charles; Jackson, Rebecca D; Boerwinkle, Eric; Lander, Eric S; Schwartz, Stephen M; Siscovick, David S; McPherson, Ruth; Tybjaerg-Hansen, Anne; Abecasis, Goncalo R; Watkins, Hugh; Nickerson, Deborah A; Ardissino, Diego; Sunyaev, Shamil R; O'Donnell, Christopher J; Altshuler, David; Gabriel, Stacey; Kathiresan, Sekar
2015-02-05
Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance. When MI occurs early in life, genetic inheritance is a major component to risk. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI risk in individual families, whereas common variants at more than 45 loci have been associated with MI risk in the population. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI at an early age (≤50 years in males and ≤60 years in females) along with MI-free controls. We identified two genes in which rare coding-sequence mutations were more frequent in MI cases versus controls at exome-wide significance. At low-density lipoprotein receptor (LDLR), carriers of rare non-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol. Among controls, about 1 in 217 carried an LDLR coding-sequence mutation and had plasma LDL cholesterol > 190 mg dl⁻¹. At apolipoprotein A-V (APOA5), carriers of rare non-synonymous mutations were at 2.2-fold increased risk for MI. When compared with non-carriers, LDLR mutation carriers had higher plasma LDL cholesterol, whereas APOA5 mutation carriers had higher plasma triglycerides. Recent evidence has connected MI risk with coding-sequence mutations at two genes functionally related to APOA5, namely lipoprotein lipase and apolipoprotein C-III (refs 18, 19). Combined, these observations suggest that, as well as LDL cholesterol, disordered metabolism of triglyceride-rich lipoproteins contributes to MI risk.
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones
Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio
2004-01-01
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
RARE VARIANTS IN THE NEUROTROPHIN SIGNALING PATHWAY IMPLICATED IN SCHIZOPHRENIA RISK
Kranz, Thorsten M.; Goetz, Ray R.; Walsh-Messinger, Julie; Goetz, Deborah; Antonius, Daniel; Dolgalev, Igor; Heguy, Adriana; Seandel, Marco; Malaspina, Dolores; Chao, Moses V.
2015-01-01
Multiple lines of evidence corroborate impaired signaling pathways as relevant to the underpinnings of schizophrenia. There has been an interest in neurotrophins, since they are crucial mediators of neurodevelopment and in synaptic connectivity in the adult brain. Neurotrophins and their receptors demonstrate aberrant expression patterns in cortical areas for schizophrenia cases in comparison to control subjects. There is little known about the contribution of neurotrophin genes in psychiatric disorders. To begin to address this issue, we conducted high-coverage targeted exome capture in a subset of neurotrophin genes in 48 comprehensively characterized cases with schizophrenia-related psychosis. We herein report rare missense polymorphisms and novel missense mutations in neurotrophin receptor signaling pathway genes. Furthermore, we observed that several genes have a higher propensity to harbor missense coding variants than others. Based on this initial analysis we suggest that rare variants and missense mutations in neurotrophin genes might represent genetic contributions involved across psychiatric disorders. PMID:26215504
A Non-Degenerate Code of Deleterious Variants in Mendelian Loci Contributes to Complex Disease Risk
Blair, David R.; Lyttle, Christopher S.; Mortensen, Jonathan M.; Bearden, Charles F.; Jensen, Anders Boeck; Khiabanian, Hossein; Melamed, Rachel; Rabadan, Raul; Bernstam, Elmer V.; Brunak, Søren; Jensen, Lars Juhl; Nicolae, Dan; Shah, Nigam H.; Grossman, Robert L.; Cox, Nancy J.; White, Kevin P.; Rzhetsky, Andrey
2013-01-01
Whereas countless highly penetrant variants have been associated with Mendelian disorders, the genetic etiologies underlying complex diseases remain largely unresolved. Here, we examine the extent to which Mendelian variation contributes to complex disease risk by mining the medical records of over 110 million patients. We detect thousands of associations between Mendelian and complex diseases, revealing a non-degenerate, phenotypic code that links each complex disorder to a unique collection of Mendelian loci. Using genome-wide association results, we demonstrate that common variants associated with complex diseases are enriched in the genes indicated by this “Mendelian code.” Finally, we detect hundreds of comorbidity associations among Mendelian disorders, and we use probabilistic genetic modeling to demonstrate that Mendelian variants likely contribute non-additively to the risk for a subset of complex diseases. Overall, this study illustrates a complementary approach for mapping complex disease loci and provides unique predictions concerning the etiologies of specific diseases. PMID:24074861
Zhou, Yingbiao; Zhu, Yueming; Dai, Longhai; Men, Yan; Wu, Jinhai; Zhang, Juankun; Sun, Yuanxia
2017-01-01
Melibiose is widely used as a functional carbohydrate. Whole-cell biocatalytic production of melibiose from raffinose could reduce its cost. However, the characteristics of strains suited to whole-cell biocatalysis and the mechanism of the process are unclear. We compared three different Saccharomyces cerevisiae strains (liquor, wine, and baker's yeasts) in terms of the concentration changes of the substrate (raffinose), the target product (melibiose), and the by-products (fructose and galactose) during whole-cell biocatalysis. Distinct differences in whole-cell catalytic efficiency were observed among the three strains. Furthermore, the activities of the key enzymes involved in the process (invertase, α-galactosidase, and fructose transporter) and the expression levels of their coding genes (suc2, mel1, and fsy1) were investigated. Conservation of the key genes in S. cerevisiae strains was also evaluated. The results show that the whole-cell catalytic efficiency of S. cerevisiae on the raffinose substrate was closely related to the activity of the key enzymes and the expression of their coding genes. Finally, we summarized the characteristics of advantageous producing strains, as well as the contributions of the key genes to high-performing strains. We also present a dynamic mechanism model to provide mechanistic insight into this whole-cell biocatalytic process. This pioneering study should contribute to improving the whole-cell biocatalytic production of melibiose from raffinose.
Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu
2017-01-04
The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents an approximately 20-fold expansion compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed the 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Non-coding functions of alternative pre-mRNA splicing in development.
Mockenhaupt, Stefan; Makeyev, Eugene V
2015-12-01
A majority of messenger RNA precursors (pre-mRNAs) in the higher eukaryotes undergo alternative splicing to generate more than one mature product. By targeting the open reading frame region, this process increases the diversity of protein isoforms beyond the nominal coding capacity of the genome. However, alternative splicing also frequently controls output levels and spatiotemporal features of cellular and organismal gene expression programs. Here we discuss how these non-coding functions of alternative splicing contribute to development through regulation of mRNA stability, translational efficiency and cellular localization. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Systematic screening for mutations in the promoter and the coding region of the 5-HT1A gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erdmann, J.; Shimron-Abarbanell, D.; Cichon, S.
1995-10-09
In the present study we sought to identify genetic variation in the 5-HT1A receptor gene which through alteration of protein function or level of expression might contribute to the genetic predisposition to neuropsychiatric diseases. Genomic DNA samples from 159 unrelated subjects (including 45 schizophrenic, 46 bipolar affective, and 43 patients with Tourette's syndrome, as well as 25 healthy controls) were investigated by single-strand conformation analysis. Overlapping PCR (polymerase chain reaction) fragments covered the whole coding sequence as well as the 5′ untranslated region of the 5-HT1A gene. The region upstream to the coding sequence we investigated contains a functional promoter. We found two rare nucleotide sequence variants. Both mutations are located in the coding region of the gene: a coding mutation (A→G) in nucleotide position 82 which leads to an amino acid exchange (Ile→Val) in position 28 of the receptor protein and a silent mutation (C→T) in nucleotide position 549. The occurrence of the Ile-28-Val substitution was studied in an extended sample of patients (n = 352) and controls (n = 210) but was found in similar frequencies in all groups. Thus, this mutation is unlikely to play a significant role in the genetic predisposition to the diseases investigated. In conclusion, our study does not provide evidence that the 5-HT1A gene plays either a major or a minor role in the genetic predisposition to schizophrenia, bipolar affective disorder, or Tourette's syndrome. 29 refs., 4 figs., 1 tab.
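The positional arithmetic in this abstract can be checked with a short sketch. It assumes that nucleotide positions are numbered from the first base of the start codon and that the standard genetic code applies; neither is stated in the abstract. The helper names (`codon_position`, `substitute`) and the reduced codon table are hypothetical, introduced only for illustration, and the abstract does not say which isoleucine codon is present, so all three are tried.

```python
# Reduced codon table: only the isoleucine and valine codons needed here
# (standard genetic code; the full table is omitted for brevity).
CODON_TABLE = {
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
}

def codon_position(nt_pos):
    """Map a 1-based coding-sequence position to (codon number, offset in codon).

    Both returned values are 1-based. Assumes numbering starts at the
    first base of the start codon (an assumption, not from the abstract).
    """
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1

def substitute(codon, offset, new_base):
    """Return the codon with the base at the 1-based offset replaced."""
    return codon[:offset - 1] + new_base + codon[offset:]

# Position 82 falls on the first base of codon 28, so the A-to-G change is
# consistent with an isoleucine-to-valine exchange at residue 28.
print(codon_position(82))
for ile_codon in ("ATT", "ATC", "ATA"):
    mutated = substitute(ile_codon, 1, "G")
    print(ile_codon, "->", mutated, CODON_TABLE[mutated])

# Position 549 falls on the third base of codon 183, a position at which
# many substitutions are synonymous, consistent with a silent change.
print(codon_position(549))
```

Under these assumptions, any of the three isoleucine codons becomes a valine codon when its first base changes from A to G, which agrees with the reported Ile-28-Val substitution.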
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
2017-01-01
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. PMID:28981708
Zhang, Yanjie; Sun, Jin; Li, Xinzheng; Qiu, Jian-Wen
2016-01-01
We reported a nearly complete mitochondrial genome (mitogenome) from the glass sponge Lophophysema eversa, the second mitogenome in the order Amphidiscosida and the ninth in the class Hexactinellida. It is 20,651 base pairs in length and contains 39 genes including 13 protein-coding genes, 2 ribosomal RNA subunit genes and 24 tRNA genes. The gene content and order of L. eversa are identical to those of Tabachnickia sp., the other species with a sequenced mitogenome in Amphidiscosida, except with two additional tRNAs and three tRNA translocations. The cob gene has a +1 translational frameshift. These results will contribute to a better understanding of the phylogeny of glass sponges.
Babbar, Anshu; Itzek, Andreas; Pieper, Dietmar H; Nitsche-Schmitz, D Patric
2018-03-12
Streptococcus dysgalactiae subsp. equisimilis (SDSE), belonging to the group C and G streptococci, are human pathogens reported to cause clinical manifestations similar to infections caused by Streptococcus pyogenes. To scrutinize the distribution of genes coding for S. pyogenes virulence factors in SDSE, 255 isolates were collected from humans infected with SDSE in Vellore, a region in southern India with a high incidence of SDSE infections. Initial evaluation indicated that the SDSE isolates comprised 82.35% group G and 17.64% group C. A multiplex PCR system was used to detect 21 genes encoding virulence-associated factors of S. pyogenes, such as superantigens, DNases, proteinases, and other immune modulatory toxins. As validated by DNA sequencing of the PCR products, sequences homologous to speC, speG, speH, speI, speL, ssa and smeZ of the family of superantigen coding genes and for DNases like sdaD and sdc were detected in the SDSE collection. Furthermore, there was high abundance (48.12% in group G and 86.6% in group C SDSE) of scpA, the gene coding for C5a peptidase, in these isolates. Higher abundance of S. pyogenes virulence factor genes was observed in SDSE of Lancefield group C as compared to group G, even though the incidence rates in the former were lower. This study not only substantiates the detection of S. pyogenes virulence factor genes in whole genome sequenced SDSE but also makes a significant contribution towards the understanding of SDSE and its increasing virulence potential.
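The group percentages can be converted back to isolate counts as a consistency check. The sketch below assumes that both percentages refer to the full collection of 255 isolates; the function name `implied_count` is hypothetical, and the counts it yields are implied by the reported figures rather than stated in the abstract.

```python
TOTAL_ISOLATES = 255  # collection size reported in the abstract

def implied_count(percent, total=TOTAL_ISOLATES):
    """Nearest whole number of isolates implied by a reported percentage."""
    return round(total * percent / 100)

group_g = implied_count(82.35)  # reported share of Lancefield group G
group_c = implied_count(17.64)  # reported share of Lancefield group C

print("implied group G isolates:", group_g)
print("implied group C isolates:", group_c)
print("implied total:", group_g + group_c)

# Recompute the percentages from the implied counts for comparison
# with the reported values.
print("group G share (%):", 100 * group_g / TOTAL_ISOLATES)
print("group C share (%):", 100 * group_c / TOTAL_ISOLATES)
```

The implied counts add up to the full collection, and the recomputed group C share suggests that the reported 17.64% was truncated rather than rounded.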
Higashi, Koichi; Tobe, Toru; Kanai, Akinori; Uyar, Ebru; Ishikawa, Shu; Suzuki, Yutaka; Ogasawara, Naotake; Kurokawa, Ken; Oshima, Taku
2016-01-01
Bacteria can acquire new traits through horizontal gene transfer. Inappropriate expression of transferred genes, however, can disrupt the physiology of the host bacteria. To reduce this risk, Escherichia coli expresses the nucleoid-associated protein, H-NS, which preferentially binds to horizontally transferred genes to control their expression. Once expression is optimized, the horizontally transferred genes may actually contribute to E. coli survival in new habitats. Therefore, we investigated whether and how H-NS contributes to this optimization process. A comparison of H-NS binding profiles on common chromosomal segments of three E. coli strains belonging to different phylogenetic groups indicated that the positions of H-NS-bound regions have been conserved in E. coli strains. The sequences of the H-NS-bound regions appear to have diverged more so than H-NS-unbound regions only when H-NS-bound regions are located upstream or in coding regions of genes. Because these regions generally contain regulatory elements for gene expression, sequence divergence in these regions may be associated with alteration of gene expression. Indeed, nucleotide substitutions in H-NS-bound regions of the ybdO promoter and coding regions have diversified the potential for H-NS-independent negative regulation among E. coli strains. The ybdO expression in these strains was still negatively regulated by H-NS, which reduced the effect of H-NS-independent regulation under normal growth conditions. Hence, we propose that, during E. coli evolution, the conservation of H-NS binding sites resulted in the diversification of the regulation of horizontally transferred genes, which may have facilitated E. coli adaptation to new ecological niches. PMID:26789284
Epigenetic regulation in human melanoma: past and future.
Sarkar, Debina; Leung, Euphemia Y; Baguley, Bruce C; Finlay, Graeme J; Askarian-Amiri, Marjan E
2015-01-01
The development and progression of melanoma have been attributed to independent or combined genetic and epigenetic events. There has been remarkable progress in understanding melanoma pathogenesis in terms of genetic alterations. However, recent studies have revealed a complex involvement of epigenetic mechanisms in the regulation of gene expression, including methylation, chromatin modification and remodeling, and the diverse activities of non-coding RNAs. The roles of gene methylation and miRNAs have been relatively well studied in melanoma, but other studies have shown that changes in chromatin status and in the differential expression of long non-coding RNAs can lead to altered regulation of key genes. Taken together, they affect the functioning of signaling pathways that influence each other, intersect, and form networks in which local perturbations disturb the activity of the whole system. Here, we focus on how epigenetic events intertwine with these pathways and contribute to the molecular pathogenesis of melanoma.
Molecular codes for neuronal individuality and cell assembly in the brain
Yagi, Takeshi
2012-01-01
The brain contains an enormous, but finite, number of neurons. The ability of this limited number of neurons to produce nearly limitless neural information over a lifetime is typically explained by combinatorial explosion; that is, by the exponential amplification of each neuron's contribution through its incorporation into “cell assemblies” and neural networks. In development, each neuron expresses diverse cellular recognition molecules that permit the formation of the appropriate neural cell assemblies to elicit various brain functions. The mechanism for generating neuronal assemblies and networks must involve molecular codes that give neurons individuality and allow them to recognize one another and join appropriate networks. The extensive molecular diversity of cell-surface proteins on neurons is likely to contribute to their individual identities. The clustered protocadherins (Pcdh) are a large subfamily within the diverse cadherin superfamily. The clustered Pcdh genes are encoded in tandem by three gene clusters, and are present in all known vertebrate genomes. The set of clustered Pcdh genes is expressed in a random and combinatorial manner in each neuron. In addition, cis-tetramers composed of heteromultimeric clustered Pcdh isoforms represent selective binding units for cell-cell interactions. Here I present the mathematical probabilities for neuronal individuality based on the random and combinatorial expression of clustered Pcdh isoforms and their formation of cis-tetramers in each neuron. Notably, clustered Pcdh gene products are known to play crucial roles in correct axonal projections, synaptic formation, and neuronal survival. Their molecular and biological features suggest the hypothesis that the diverse clustered Pcdh molecules provide the molecular code by which neuronal individuality and cell assembly permit the combinatorial explosion of networks that supports the enormous processing capability and plasticity of the brain. PMID:22518100
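The kind of counting behind "random and combinatorial expression" can be illustrated with a minimal sketch. This is not the author's calculation: the isoform numbers below are placeholder assumptions (the abstract gives none), the function names are hypothetical, and the model simply assumes that each neuron expresses a fixed-size random subset of isoforms and that a cis-tetramer is an unordered set of four subunits in which an isoform may appear more than once.

```python
from math import comb

def expression_combinations(n_isoforms, n_expressed):
    """Number of distinct isoform subsets a single neuron could express."""
    return comb(n_isoforms, n_expressed)

def tetramer_compositions(n_expressed):
    """Number of unordered four-subunit compositions, repeats allowed.

    Counts multisets of size 4 drawn from the expressed isoforms:
    C(n + 4 - 1, 4).
    """
    return comb(n_expressed + 3, 4)

# Placeholder values chosen for illustration only (assumptions).
N_ISOFORMS = 50   # hypothetical number of clustered Pcdh isoforms
N_EXPRESSED = 15  # hypothetical number expressed per neuron

print("isoform subsets per neuron:",
      expression_combinations(N_ISOFORMS, N_EXPRESSED))
print("tetramer compositions per neuron:",
      tetramer_compositions(N_EXPRESSED))
```

Even with modest placeholder values the number of possible expression subsets is very large, which is the intuition behind the proposed molecular code for neuronal individuality; the real figures depend on the isoform counts and assembly rules given in the paper itself.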
Genes encoding cuticular proteins are components of the Nimrod gene cluster in Drosophila.
Cinege, Gyöngyi; Zsámboki, János; Vidal-Quadras, Maite; Uv, Anne; Csordás, Gábor; Honti, Viktor; Gábor, Erika; Hegedűs, Zoltán; Varga, Gergely I B; Kovács, Attila L; Juhász, Gábor; Williams, Michael J; Andó, István; Kurucz, Éva
2017-08-01
The Nimrod gene cluster, located on the second chromosome of Drosophila melanogaster, is the largest syntenic unit of the Drosophila genome. Nimrod genes show blood cell specific expression and code for phagocytosis receptors that play a major role in fruit fly innate immune functions. We previously identified three homologous genes (vajk-1, vajk-2 and vajk-3) located within the Nimrod cluster, which are unrelated to the Nimrod genes, but are homologous to a fourth gene (vajk-4) located outside the cluster. Here we show that, unlike the Nimrod candidates, the Vajk proteins are expressed in cuticular structures of the late embryo and the late pupa, indicating that they contribute to cuticular barrier functions. Copyright © 2017 Elsevier Ltd. All rights reserved.
New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.
Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja
2017-02-01
Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. Selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50% in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60% of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel motifs in the PAH gene, a promoter KLF1 motif and a 3'-region C/EBPalpha motif, which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to a better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.
Sexual selection drives evolution and rapid turnover of male gene expression.
Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Dean, Rebecca; Montgomery, Stephen H; Pointer, Marie A; Mank, Judith E
2015-04-07
The profound and pervasive differences in gene expression observed between males and females, and the unique evolutionary properties of these genes in many species, have led to the widespread assumption that they are the product of sexual selection and sexual conflict. However, we still lack a clear understanding of the connection between sexual selection and transcriptional dimorphism, often termed sex-biased gene expression. Moreover, the relative contribution of sexual selection vs. drift in shaping broad patterns of expression, divergence, and polymorphism remains unknown. To assess the role of sexual selection in shaping these patterns, we assembled transcriptomes from an avian clade representing the full range of sexual dimorphism and sexual selection. We use these species to test the links between sexual selection and sex-biased gene expression evolution in a comparative framework. Through ancestral reconstruction of sex bias, we demonstrate a rapid turnover of sex bias across this clade driven by sexual selection and show it to be primarily the result of expression changes in males. We use phylogenetically controlled comparative methods to demonstrate that phenotypic measures of sexual selection predict the proportion of male-biased but not female-biased gene expression. Although male-biased genes show elevated rates of coding sequence evolution, consistent with previous reports in a range of taxa, there is no association between sexual selection and rates of coding sequence evolution, suggesting that expression changes may be more important than coding sequence in sexual selection. Taken together, our results highlight the power of sexual selection to act on gene expression differences and shape genome evolution.
Examination of AVPR1a as an autism susceptibility gene.
Wassink, T H; Piven, J; Vieland, V J; Pietila, J; Goedken, R J; Folstein, S E; Sheffield, V C
2004-10-01
Impaired reciprocal social interaction is one of the core features of autism. While its determinants are complex, one biomolecular pathway that clearly influences social behavior is the arginine-vasopressin (AVP) system. The behavioral effects of AVP are mediated through the AVP receptor 1a (AVPR1a), making the AVPR1a gene a reasonable candidate for autism susceptibility. We tested the gene's contribution to autism by screening its exons in 125 independent autistic probands and genotyping two promoter polymorphisms in 65 autism affected sibling pair (ASP) families. While we found no nonconservative coding sequence changes, we did identify evidence of linkage and of linkage disequilibrium. These results were most pronounced in a subset of the ASP families with relatively less severe impairment of language. Thus, though we did not demonstrate a disease-causing variant in the coding sequence, numerous nontraditional disease-causing genetic abnormalities are known to exist that would escape detection by traditional gene screening methods. Given the emerging biological, animal model, and now genetic data, AVPR1a and genes in the AVP system remain strong candidates for involvement in autism susceptibility and deserve continued scrutiny.
Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer.
Goremykin, Vadim V; Salamini, Francesco; Velasco, Riccardo; Viola, Roberto
2009-01-01
The mitochondrial genome of grape (Vitis vinifera), the largest organelle genome sequenced so far, is presented. The genome is 773,279 nt long and has the highest coding capacity among known angiosperm mitochondrial DNAs (mtDNAs). The proportion of promiscuous DNA of plastid origin in the genome is also the largest ever reported for an angiosperm mtDNA, both in absolute and relative terms. In all, 42.4% of chloroplast genome of Vitis has been incorporated into its mitochondrial genome. In order to test if horizontal gene transfer (HGT) has also contributed to the gene content of the grape mtDNA, we built phylogenetic trees with the coding sequences of mitochondrial genes of grape and their homologs from plant mitochondrial genomes. Many incongruent gene tree topologies were obtained. However, the extent of incongruence between these gene trees is not significantly greater than that observed among optimal trees for chloroplast genes, the common ancestry of which has never been in doubt. In both cases, we attribute this incongruence to artifacts of tree reconstruction, insufficient numbers of characters, and gene paralogy. This finding leads us to question the recent phylogenetic interpretation of Bergthorsson et al. (2003, 2004) and Richardson and Palmer (2007) that rampant HGT into the mtDNA of Amborella best explains phylogenetic incongruence between mitochondrial gene trees for angiosperms. The only evidence for HGT into the Vitis mtDNA found involves fragments of two coding sequences stemming from two closteroviruses that cause the leaf roll disease of this plant. We also report that analysis of sequences shared by both chloroplast and mitochondrial genomes provides evidence for a previously unknown gene transfer route from the mitochondrion to the chloroplast.
Guo, Changjiang; Sun, Xiaoguang; Chen, Xiao; Yang, Sihai; Li, Jing; Wang, Long; Zhang, Xiaohui
2016-01-01
Most rice blast resistance genes (R-genes) encode proteins with nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains. Our previous study has shown that more rice blast R-genes can be cloned in rapidly evolving NBS-LRR gene families. In the present study, two rapidly evolving R-gene families in rice were selected for cloning a subset of genes from their paralogs in three resistant rice lines. A total of eight functional blast R-genes were identified among nine NBS-LRR genes, and some of these showed resistance to three or more blast strains. Evolutionary analysis indicated that high nucleotide diversity of coding regions served as important parameters in the determination of gene resistance. We also observed that amino-acid variants (nonsynonymous mutations, insertions, or deletions) in essential motifs of the NBS domain contribute to the blast resistance capacity of NBS-LRR genes. These results suggested that the NBS regions might also play an important role in resistance specificity determination. On the other hand, different splicing patterns of introns were commonly observed in R-genes. The results of the present study contribute to improving the effectiveness of R-gene identification by using evolutionary analysis method and acquisition of novel blast resistance genes.
Thuan, Nguyen Huy; Dhakal, Dipesh; Pokhrel, Anaya Raj; Chu, Luan Luong; Van Pham, Thi Thuy; Shrestha, Anil; Sohng, Jae Kyung
2018-05-01
Streptomyces peucetius ATCC 27952 produces two major anthracyclines, doxorubicin (DXR) and daunorubicin (DNR), which are potent chemotherapeutic agents for the treatment of several cancers. In order to gain detailed insight into the genetics and biochemistry of the strain, the complete genome was determined and analyzed. The complete sequence contains 7187 protein coding genes in a total of 8,023,114 bp, and 87% of the genome is protein coding. The genomic sequence included 18 rRNAs, 66 tRNAs, and 3 non-coding RNAs. In silico studies predicted ~ 68 biosynthetic gene clusters (BCGs) encoding diverse classes of secondary metabolites, including non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS I, II, and III), terpenes, and others. Detailed analysis of the genome sequence revealed versatile biocatalytic enzymes such as cytochrome P450 (CYP), electron transfer system (ETS) genes, methyltransferase (MT), and glycosyltransferase (GT). In addition, numerous functional genes (transporter gene, SOD, etc.) and regulatory genes (afsR-sp, metK-sp, etc.) involved in the regulation of secondary metabolites were found. This minireview summarizes the genome mining (GM) of diverse BCGs and the genome exploration (GE) of versatile biocatalytic enzymes and other enzymes involved in the maintenance and regulation of metabolism of S. peucetius. The detailed analysis of the genome sequence provides critically important knowledge that is useful for bioengineering the strain or for harnessing its catalytically efficient enzymes in biotechnological applications.
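The genome figures quoted in the abstract above can be turned into two derived quantities, gene density and mean coding length per gene. The following is a minimal sketch only: the function and variable names are hypothetical, and it assumes the 87% figure is the fraction of base pairs that are protein coding, which the abstract does not spell out.

```python
# Figures quoted in the S. peucetius abstract.
GENOME_BP = 8_023_114
PROTEIN_CODING_GENES = 7187
# Assumption: "87% of the genome" means 87% of base pairs are protein coding.
CODING_FRACTION = 0.87


def genome_summary(genome_bp, n_genes, coding_fraction):
    """Return gene density (genes per Mb) and mean coding bp per gene."""
    coding_bp = genome_bp * coding_fraction
    return {
        "genes_per_mb": n_genes / (genome_bp / 1_000_000),
        "mean_coding_bp_per_gene": coding_bp / n_genes,
    }


summary = genome_summary(GENOME_BP, PROTEIN_CODING_GENES, CODING_FRACTION)
print(f"gene density: {summary['genes_per_mb']:.1f} genes/Mb")
print(f"mean coding length: {summary['mean_coding_bp_per_gene']:.0f} bp/gene")
```

Under that assumption the genome carries a little under 900 genes per Mb with a mean coding length a little under 1 kb per gene; these are derived illustrations, not values reported by the authors.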
Rare variants in the neurotrophin signaling pathway implicated in schizophrenia risk.
Kranz, Thorsten M; Goetz, Ray R; Walsh-Messinger, Julie; Goetz, Deborah; Antonius, Daniel; Dolgalev, Igor; Heguy, Adriana; Seandel, Marco; Malaspina, Dolores; Chao, Moses V
2015-10-01
Multiple lines of evidence corroborate impaired signaling pathways as relevant to the underpinnings of schizophrenia. There has been an interest in neurotrophins, since they are crucial mediators of neurodevelopment and in synaptic connectivity in the adult brain. Neurotrophins and their receptors demonstrate aberrant expression patterns in cortical areas for schizophrenia cases in comparison to control subjects. There is little known about the contribution of neurotrophin genes in psychiatric disorders. To begin to address this issue, we conducted high-coverage targeted exome capture in a subset of neurotrophin genes in 48 comprehensively characterized cases with schizophrenia-related psychosis. We herein report rare missense polymorphisms and novel missense mutations in neurotrophin receptor signaling pathway genes. Furthermore, we observed that several genes have a higher propensity to harbor missense coding variants than others. Based on this initial analysis we suggest that rare variants and missense mutations in neurotrophin genes might represent genetic contributions involved across psychiatric disorders. Copyright © 2015 Elsevier B.V. All rights reserved.
Kelsen, Judith R.; Dawany, Noor; Moran, Christopher J.; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S.; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F.; Daly, Mark; Sullivan, Kathleen E.; Baldassano, Robert N.; Devoto, Marcella
2016-01-01
Background & Aims Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed ≤5 y of age, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Methods Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (ages 3 weeks to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by post-processing and variant calling. Following functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency <0.1%, and scaled combined annotation dependent depletion scores ≤10. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n=45) or adult-onset Crohn's disease (n=20) and healthy individuals (controls, n=145) were obtained from the University of Kiel, Germany and used as control groups. Results Four-hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling > 1 Mbp of coding sequence, were selected from the whole exome data. Our analysis revealed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. Conclusions In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. 
Our analysis could lead to the identification of previously unidentified IBD-associated variants. PMID:26193622
The Evolution and Expression Pattern of Human Overlapping lncRNA and Protein-coding Gene Pairs.
Ning, Qianqian; Li, Yixue; Wang, Zhen; Zhou, Songwen; Sun, Hong; Yu, Guangjun
2017-03-27
A long non-coding RNA overlapping with a protein-coding gene (lncRNA-coding pair) is a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied, and increasing attention has been paid to lncRNAs. By studying lncRNA-coding pairs in the human genome, we showed that lncRNA-coding pairs were more likely to be generated by overprinting, and that retention of genes in lncRNA-coding pairs was given higher priority than retention of non-overlapping genes. In addition, which overlapping configurations were preferentially preserved during evolution depended on the origin of the lncRNA-coding pairs. Further investigation showed that the promotion of splicing of embedded protein-coding partners by lncRNAs was a unilateral interaction, whereas the enhancement of gene expression by the presence of an overlapping partner was bidirectional, and this effect decreased with increasing evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and the expression correlation was associated with their overlapping configurations, local genomic environment and the evolutionary age of the genes. A comparison of the expression correlation of lncRNA-coding pairs between normal and cancer samples showed that lineage-specific pairs that include old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic and comprehensive view of the evolution and expression patterns of human lncRNA-coding pairs.
Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh
2018-06-03
Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
2017-11-01
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zhao, Qiang; Yue, Shengjie; Bilal, Muhammad; Hu, Hongbo; Wang, Wei; Zhang, Xuehong
2017-12-31
Bacteria belonging to the genera Sphingomonas and Sphingobium are known for their ability to catabolize aromatic compounds. In this study, we analyzed the whole genome sequences of 26 strains in the genera Sphingomonas and Sphingobium to gain insight into dissemination of bioremediation capabilities, biodegradation potential, central pathways and genome plasticity. Phylogenetic analysis revealed that both Sphingomonas sp. strain BHC-A and Sphingomonas paucimobilis EPA505 should be placed in the genus Sphingobium. The bph and xyl gene cluster was found in 6 polycyclic aromatic hydrocarbons-degrading strains. Transposase and IS coding genes were found in the 6 gene clusters, suggesting the mobility of bph and xyl gene clusters. β-ketoadipate and homogentisate pathways were the main central pathways in Sphingomonas and Sphingobium strains. A large number of oxygenase coding genes were predicted in the 26 genomes, indicating a huge biodegradation potential of the Sphingomonas and Sphingobium strains. Horizontal gene transfer related genes and prophages were predicted in the analyzed strains, suggesting the ongoing evolution and shaping of the genomes. Analysis of the 26 genomes in this work contributes to the understanding of dispersion of bioremediation capabilities, bioremediation potential and genome plasticity in strains belonging to the genera Sphingomonas and Sphingobium. Copyright © 2017 Elsevier B.V. All rights reserved.
Rieseberg, Loren H.; Blackman, Benjamin K.
2010-01-01
Background Analyses of speciation genes – genes that contribute to the cessation of gene flow between populations – can offer clues regarding the ecological settings, evolutionary forces and molecular mechanisms that drive the divergence of populations and species. This review discusses the identities and attributes of genes that contribute to reproductive isolation (RI) in plants, compares them with animal speciation genes and investigates what these genes can tell us about speciation. Scope Forty-one candidate speciation genes were identified in the plant literature. Of these, seven contributed to pre-pollination RI, one to post-pollination, prezygotic RI, eight to hybrid inviability, and 25 to hybrid sterility. Genes, gene families and genetic pathways that were frequently found to underlie the evolution of RI in different plant groups include the anthocyanin pathway and its regulators (pollinator isolation), S RNase-SI genes (unilateral incompatibility), disease resistance genes (hybrid necrosis), chimeric mitochondrial genes (cytoplasmic male sterility), and pentatricopeptide repeat family genes (cytoplasmic male sterility). Conclusions The most surprising conclusion from this review is that identities of genes underlying both prezygotic and postzygotic RI are often predictable in a broad sense from the phenotype of the reproductive barrier. Regulatory changes (both cis and trans) dominate the evolution of pre-pollination RI in plants, whereas a mix of regulatory mutations and changes in protein-coding genes underlie intrinsic postzygotic barriers. Also, loss-of-function mutations and copy number variation frequently contribute to RI. Although direct evidence of positive selection on speciation genes is surprisingly scarce in plants, analyses of gene family evolution, along with theoretical considerations, imply an important role for diversifying selection and genetic conflict in the evolution of RI. 
Unlike in animals, however, most candidate speciation genes in plants exhibit intraspecific polymorphism, consistent with an important role for stochastic forces and/or balancing selection in development of RI in plants. PMID:20576737
CHEK2 contribution to hereditary breast cancer in non-BRCA families.
Desrichard, Alexis; Bidet, Yannick; Uhrhammer, Nancy; Bignon, Yves-Jean
2011-01-01
Mutations in the BRCA1 and BRCA2 genes are responsible for only a part of hereditary breast cancer (HBC). The origins of "non-BRCA" HBC in families may be attributed in part to rare mutations in genes conferring moderate risk, such as CHEK2, which encodes for an upstream regulator of BRCA1. Previous studies have demonstrated an association between CHEK2 founder mutations and non-BRCA HBC. However, very few data on the entire coding sequence of this gene are available. We investigated the contribution of CHEK2 mutations to non-BRCA HBC by direct sequencing of its whole coding sequence in 507 non-BRCA HBC cases and 513 controls. We observed 16 mutations in cases and 4 in controls, including 9 missense variants of uncertain consequence. Using both in silico tools and an in vitro kinase activity test, the majority of the variants were found likely to be deleterious for protein function. One variant present in both cases and controls was proposed to be neutral. Removing this variant from the pool of potentially deleterious variants gave a mutation frequency of 1.48% for cases and 0.29% for controls (P = 0.0040). The odds ratio of breast cancer in the presence of a deleterious CHEK2 mutation was 5.18. Our work indicates that a variety of deleterious CHEK2 alleles make an appreciable contribution to breast cancer susceptibility, and their identification could help in the clinical management of patients carrying a CHEK2 mutation.
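The frequencies and odds ratio reported in the CHEK2 abstract above can be reproduced with a short calculation. This is a sketch under stated assumptions, not the authors' documented method: it assumes that removing the one neutral variant leaves 15 heterozygous carriers among the 507 cases and 3 among the 513 controls, that the mutation frequency is per allele (denominator 2N), and that the odds ratio is computed per individual. The function names are hypothetical.

```python
N_CASES, N_CONTROLS = 507, 513
# Assumption: 16 - 1 = 15 case carriers and 4 - 1 = 3 control carriers remain
# after the neutral variant is removed, each carrying one mutant allele.
CARRIERS_CASES, CARRIERS_CONTROLS = 15, 3


def allele_frequency(carriers, n_individuals):
    """Mutant allele frequency, assuming heterozygous carriers (2 alleles each)."""
    return carriers / (2 * n_individuals)


def odds_ratio(exposed_cases, n_cases, exposed_controls, n_controls):
    """Odds ratio from a 2x2 table of carriers vs. non-carriers."""
    unexposed_cases = n_cases - exposed_cases
    unexposed_controls = n_controls - exposed_controls
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


freq_cases = allele_frequency(CARRIERS_CASES, N_CASES)
freq_controls = allele_frequency(CARRIERS_CONTROLS, N_CONTROLS)
or_value = odds_ratio(CARRIERS_CASES, N_CASES, CARRIERS_CONTROLS, N_CONTROLS)

print(f"cases: {freq_cases:.2%}, controls: {freq_controls:.2%}, OR: {or_value:.2f}")
```

With these assumptions the calculation gives 1.48% and 0.29% for the allele frequencies and an odds ratio of 5.18, matching the abstract. The reported P value is not recomputed here because the abstract does not name the test used.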
Negligible impact of rare autoimmune-locus coding-region variants on missing heritability.
Hunt, Karen A; Mistry, Vanisha; Bockett, Nicholas A; Ahmad, Tariq; Ban, Maria; Barker, Jonathan N; Barrett, Jeffrey C; Blackburn, Hannah; Brand, Oliver; Burren, Oliver; Capon, Francesca; Compston, Alastair; Gough, Stephen C L; Jostins, Luke; Kong, Yong; Lee, James C; Lek, Monkol; MacArthur, Daniel G; Mansfield, John C; Mathew, Christopher G; Mein, Charles A; Mirza, Muddassar; Nutland, Sarah; Onengut-Gumuscu, Suna; Papouli, Efterpi; Parkes, Miles; Rich, Stephen S; Sawcer, Steven; Satsangi, Jack; Simmonds, Matthew J; Trembath, Richard C; Walker, Neil M; Wozniak, Eva; Todd, John A; Simpson, Michael A; Plagnol, Vincent; van Heel, David A
2013-06-13
Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.
Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M
2014-06-01
It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. 
The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
Werling, Donna M; Brand, Harrison; An, Joon-Yong; Stone, Matthew R; Zhu, Lingxue; Glessner, Joseph T; Collins, Ryan L; Dong, Shan; Layer, Ryan M; Markenscoff-Papadimitriou, Eirene; Farrell, Andrew; Schwartz, Grace B; Wang, Harold Z; Currall, Benjamin B; Zhao, Xuefang; Dea, Jeanselle; Duhn, Clif; Erdman, Carolyn A; Gilson, Michael C; Yadav, Rachita; Handsaker, Robert E; Kashin, Seva; Klei, Lambertus; Mandell, Jeffrey D; Nowakowski, Tomasz J; Liu, Yuwen; Pochareddy, Sirisha; Smith, Louw; Walker, Michael F; Waterman, Matthew J; He, Xin; Kriegstein, Arnold R; Rubenstein, John L; Sestan, Nenad; McCarroll, Steven A; Neale, Benjamin M; Coon, Hilary; Willsey, A Jeremy; Buxbaum, Joseph D; Daly, Mark J; State, Matthew W; Quinlan, Aaron R; Marth, Gabor T; Roeder, Kathryn; Devlin, Bernie; Talkowski, Michael E; Sanders, Stephan J
2018-05-01
Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here we present a comparable framework to evaluate rare and de novo noncoding single-nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism, the contribution of de novo noncoding variation is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple-testing burden.
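To make the multiple-testing burden in the abstract above concrete, the sketch below compares a per-test significance threshold based on the 4,123 effective tests with a naive one based on all 51,801 annotation categories. It assumes a Bonferroni-style correction and a family-wise error rate of 0.05; neither is stated in the abstract, and the function name is hypothetical.

```python
# Assumption: conventional family-wise error rate; not given in the abstract.
ALPHA = 0.05
N_CATEGORIES = 51_801      # annotation categories defined in the study
N_EFFECTIVE_TESTS = 4_123  # effective number of tests reported in the abstract


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under a Bonferroni-style correction."""
    return alpha / n_tests


effective = bonferroni_threshold(ALPHA, N_EFFECTIVE_TESTS)
naive = bonferroni_threshold(ALPHA, N_CATEGORIES)

print(f"threshold with effective tests: {effective:.2e}")
print(f"threshold with all categories:  {naive:.2e}")
print(f"ratio: {effective / naive:.1f}x")
```

Because many annotation categories are correlated, correcting for the effective number of tests is roughly 12.6 times less stringent than correcting for every category, yet a single category would still need a p-value near 1e-5 to reach significance under these assumptions.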
Genomics of Clostridium taeniosporum, an organism which forms endospores with ribbon-like appendages
Cambridge, Joshua M.; Blinkova, Alexandra L.; Salvador Rocha, Erick I.; Bode Hernández, Addys; Moreno, Maday; Ginés-Candelaria, Edwin; Goetz, Benjamin M.; Hunicke-Smith, Scott; Satterwhite, Ed; Tucker, Haley O.; Walker, James R.
2018-01-01
Clostridium taeniosporum, a non-pathogenic anaerobe closely related to the C. botulinum Group II members, was isolated from Crimean lake silt about 60 years ago. Its endospores are surrounded by an encasement layer which forms a trunk at one spore pole to which about 12–14 large, ribbon-like appendages are attached. The genome consists of one 3,264,813 bp, circular chromosome (with 26.6% GC) and three plasmids. The chromosome contains 2,892 potential protein coding sequences: 2,124 have specific functions, 147 have general functions, 228 are conserved but without known function and 393 are hypothetical based on the fact that no statistically significant orthologs were found. The chromosome also contains 101 genes for stable RNAs, including 7 rRNA clusters. Over 84% of the protein coding sequences and 96% of the stable RNA coding regions are oriented in the same direction as replication. The three known appendage genes are located within a single cluster with five other genes, the protein products of which are closely related, in terms of sequence, to the known appendage proteins. The relatedness of the deduced protein products suggests that all or some of the closely related genes might code for minor appendage proteins or assembly factors. The appendage genes might be unique among the known clostridia; no statistically significant orthologs were found within other clostridial genomes for which sequence data are available. The C. taeniosporum chromosome contains two functional prophages, one Siphoviridae and one Myoviridae, and one defective prophage. Three plasmids of 5.9, 69.7 and 163.1 Kbp are present. These data are expected to contribute to future studies of developmental, structural and evolutionary biology and to potential industrial applications of this organism. PMID:29293521
Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano
2004-01-01
The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464
Behind the curtain of non-coding RNAs; long non-coding RNAs regulating hepatocarcinogenesis
El Khodiry, Aya; Afify, Menna; El Tayebi, Hend M
2018-01-01
Hepatocellular carcinoma (HCC) is one of the most common and aggressive cancers worldwide. HCC is the fifth most common malignancy in the world and the second leading cause of cancer death in Asia. Long non-coding RNAs (lncRNAs) are RNAs with a length greater than 200 nucleotides that do not encode proteins. lncRNAs can regulate gene expression and protein synthesis in several ways by interacting with DNA, RNA and proteins in a sequence-specific manner. They can regulate cellular and developmental processes through either gene inhibition or gene activation. Many studies have shown that dysregulation of lncRNAs is related to many human diseases such as cardiovascular diseases, genetic disorders, neurological diseases, immune-mediated disorders and cancers. However, the study of lncRNAs is challenging because they are poorly conserved between species, and their expression levels are lower than those of mRNAs and show great interpatient variation. The study of lncRNA expression in cancers has been a breakthrough, as it unveils potential biomarkers and drug targets for cancer therapy and helps explain the mechanism of pathogenesis. This review discusses many long non-coding RNAs and their contribution to HCC, their role in the development, metastasis, and prognosis of HCC, and how these lncRNAs might be regulated and targeted as a therapeutic tool in future HCC treatment. PMID:29434445
Rapid strain improvement through optimized evolution in the cytostat.
Gilbert, Alan; Sangurdekar, Dipen P; Srienc, Friedrich
2009-06-15
Acetate is present in lignocellulosic hydrolysates at growth inhibiting concentrations. Industrial processes based on such feedstock require strains that are tolerant of this and other inhibitors present. We investigated the effect of acetate on Saccharomyces cerevisiae and show that elevated acetate concentrations result in a decreased specific growth rate, an accumulation of cells in the G1 phase of the cell cycle, and an increased cell size. With the cytostat cultivation technology under previously derived optimal operating conditions, several acetate resistant mutants were enriched and isolated in the shortest possible time. In each case, the isolation time was less than 5 days. The independently isolated mutant strains have increased specific growth rates under conditions of high acetate concentrations, high ethanol concentrations, and high temperature. In the presence of high acetate concentrations, the isolated mutants produce ethanol at higher rates and titers than the parental strain and a commercial ethanol producing strain that has been analyzed for comparison. Whole genome microarray analysis revealed gene amplifications in each mutant. In one case, the LPP1 gene, coding for lipid phosphate phosphatase, was amplified. Two mutants contained amplified ENA1, ENA2, and ENA5 genes, which code for P-type ATPase sodium pumps. LPP1 was overexpressed on a plasmid, and the growth data at elevated acetate concentrations suggest that LPP1 likely contributes to the phenotype of acetate tolerance. A diploid cross of the two mutants with the amplified ENA genes grew faster than either individual haploid parent strain when 20 g/L acetate was supplemented to the medium, which suggests that these genes contribute to acetate tolerance in a gene dosage dependent manner. 2009 Wiley Periodicals, Inc.
Dickinson, Joanne L; Sale, Michèle M; Passmore, Abraham; FitzGerald, Liesel M; Wheatley, Catherine M; Burdon, Kathryn P; Craig, Jamie E; Tengtrisorn, Supaporn; Carden, Susan M; Maclean, Hector; Mackey, David A
2006-01-01
To examine the contribution of mutations within the Norrie disease (NDP) gene to the clinically similar retinal diseases Norrie disease, X-linked familial exudative vitreoretinopathy (FEVR), Coat's disease and retinopathy of prematurity (ROP). A dataset comprising 13 Norrie-FEVR, one Coat's disease, 31 ROP patients and 90 ex-premature babies of <32 weeks' gestation underwent an ophthalmologic examination and were screened for mutations within the NDP gene by direct DNA sequencing, denaturing high-performance liquid chromatography or gel electrophoresis. Controls were only screened using denaturing high-performance liquid chromatography and gel electrophoresis. Confirmation of mutations identified was obtained by DNA sequencing. Evidence for two novel mutations in the NDP gene was presented: Leu103Val in one FEVR patient and His43Arg in monozygotic twin Norrie disease patients. Furthermore, a previously described 14-bp deletion located in the 5' untranslated region of the NDP gene was detected in three cases of regressed ROP. A second heterozygotic 14-bp deletion was detected in an unaffected ex-premature girl. Only two of the 13 Norrie-FEVR index cases had the full features of Norrie disease with deafness and mental retardation. Two novel mutations within the coding region of the NDP gene were found, one associated with a severe disease phenotype of Norrie disease and the other with FEVR. A deletion within the non-coding region was associated with only mild-regressed ROP, despite the presence of low birthweight, prematurity and exposure to oxygen. In full-term children with retinal detachment only 15% appear to have the full features of Norrie disease and this is important for counselling parents on the possible long-term outcome.
Non-redundant coding of aversive odours in the main olfactory pathway
Dewan, Adam; Pacifico, Rodrigo; Zhan, Ross; Rinberg, Dmitry; Bozza, Thomas
2013-01-01
Many species are critically dependent on olfaction for survival. In the main olfactory system of mammals, odours are detected by sensory neurons which express a large repertoire of canonical odorant receptors (ORs) and a much smaller repertoire of Trace Amine-Associated Receptors (TAARs). Odours are encoded in a combinatorial fashion across glomeruli in the main olfactory bulb, with each glomerulus corresponding to a different receptor. The degree to which individual receptor genes contribute to odour perception is unclear. Here we show that genetic deletion of the olfactory TAAR gene family, or even a single TAAR gene, eliminates aversion that mice display to low concentrations of volatile amines and to the odour of predator urine. Our findings identify a role for the TAARs in olfaction, namely in the high-sensitivity detection of innately aversive odours. In addition, our data reveal that aversive amines are represented in a non-redundant fashion, and that individual main olfactory receptor genes can contribute significantly to odour perception. PMID:23624375
De Novo Coding Variants Are Strongly Associated with Tourette Disorder
Willsey, A. Jeremy; Fernandez, Thomas V.; Yu, Dongmei; King, Robert A.; Dietrich, Andrea; Xing, Jinchuan; Sanders, Stephan J.; Mandell, Jeffrey D.; Huang, Alden Y.; Richer, Petra; Smith, Louw; Dong, Shan; Samocha, Kaitlin E.; Neale, Benjamin M.; Coppola, Giovanni; Mathews, Carol A.; Tischfield, Jay A.; Scharf, Jeremiah M.; State, Matthew W.; Heiman, Gary A.
2017-01-01
SUMMARY Whole-exome sequencing (WES) and de novo variant detection have proven a powerful approach to gene discovery in complex neurodevelopmental disorders. We have completed WES of 325 Tourette disorder trios from the Tourette International Collaborative Genetics cohort and a replication sample of 186 trios from the Tourette Syndrome Association International Consortium on Genetics (511 total). We observe strong and consistent evidence for the contribution of de novo likely gene-disrupting (LGD) variants (rate ratio [RR] 2.32, p = 0.002). Additionally, de novo damaging variants (LGD and probably damaging missense) are overrepresented in probands (RR 1.37, p = 0.003). We identify four likely risk genes with multiple de novo damaging variants in unrelated probands: WWC1 (WW and C2 domain containing 1), CELSR3 (Cadherin EGF LAG seven-pass G-type receptor 3), NIPBL (Nipped-B-like), and FN1 (fibronectin 1). Overall, we estimate that de novo damaging variants in approximately 400 genes contribute risk in 12% of clinical cases. PMID:28472652
Non-redundant coding of aversive odours in the main olfactory pathway.
Dewan, Adam; Pacifico, Rodrigo; Zhan, Ross; Rinberg, Dmitry; Bozza, Thomas
2013-05-23
Many species are critically dependent on olfaction for survival. In the main olfactory system of mammals, odours are detected by sensory neurons that express a large repertoire of canonical odorant receptors and a much smaller repertoire of trace amine-associated receptors (TAARs). Odours are encoded in a combinatorial fashion across glomeruli in the main olfactory bulb, with each glomerulus corresponding to a specific receptor. The degree to which individual receptor genes contribute to odour perception is unclear. Here we show that genetic deletion of the olfactory Taar gene family, or even a single Taar gene (Taar4), eliminates the aversion that mice display to low concentrations of volatile amines and to the odour of predator urine. Our findings identify a role for the TAARs in olfaction, namely, in the high-sensitivity detection of innately aversive odours. In addition, our data reveal that aversive amines are represented in a non-redundant fashion, and that individual main olfactory receptor genes can contribute substantially to odour perception.
An, Shi-Qi; Febrer, Melanie; McCarthy, Yvonne; Tang, Dong-Jie; Clissold, Leah; Kaithakottil, Gemy; Swarbreck, David; Tang, Ji-Liang; Rogers, Jane; Dow, J Maxwell; Ryan, Robert P
2013-01-01
The bacterium Xanthomonas campestris is an economically important pathogen of many crop species and a model for the study of bacterial phytopathogenesis. In X. campestris, a regulatory system mediated by the signal molecule DSF controls virulence to plants. The synthesis and recognition of the DSF signal depends upon different Rpf proteins. DSF signal generation requires RpfF whereas signal perception and transduction depends upon a system comprising the sensor RpfC and regulator RpfG. Here we have addressed the action and role of Rpf/DSF signalling in phytopathogenesis by high-resolution transcriptional analysis coupled to functional genomics. We detected transcripts for many genes that were unidentified by previous computational analysis of the genome sequence. Novel transcribed regions included intergenic transcripts predicted as coding or non-coding as well as those that were antisense to coding sequences. In total, mutation of rpfF, rpfG and rpfC led to alteration in transcript levels (more than fourfold) of approximately 480 genes. The regulatory influence of RpfF and RpfC demonstrated considerable overlap. Contrary to expectation, the regulatory influence of RpfC and RpfG had limited overlap, indicating complexities of the Rpf signalling system. Importantly, functional analysis revealed over 160 new virulence factors within the group of Rpf-regulated genes. PMID:23617851
Investigation of genes coding for inflammatory components in Parkinson's disease.
Håkansson, Anna; Westberg, Lars; Nilsson, Staffan; Buervenich, Silvia; Carmine, Andrea; Holmberg, Björn; Sydow, Olof; Olson, Lars; Johnels, Bo; Eriksson, Elias; Nissbrandt, Hans
2005-05-01
Several findings obtained recently indicate that inflammation may contribute to the pathogenesis in Parkinson's disease (PD). Genetic variants of genes coding for components involved in immune reactions in the brain might therefore influence the risk of developing PD or the age of disease onset. Five single nucleotide polymorphisms (SNPs) in the genes coding for interferon-gamma (IFN-gamma; T874A in intron 1), interferon-gamma receptor 2 (IFN-gamma R2; Gln64Arg), interleukin-10 (IL-10; G1082A in the promoter region), platelet-activating factor acetylhydrolase (PAF-AH; Val379Ala), and intercellular adhesion molecule 1 (ICAM-1; Lys469Glu) were genotyped, using pyrosequencing, in 265 patients with PD and 308 controls. None of the investigated SNPs was found to be associated with PD; however, the G1082A polymorphism in the IL-10 gene promoter was found to be related to the age of disease onset. Linear regression showed a significantly earlier onset with more A-alleles (P = 0.0095; after Bonferroni correction, P = 0.048), resulting in a 5-year delayed age of onset of the disease for individuals having two G-alleles compared with individuals having two A-alleles. The results indicate that the IL-10 G1082A SNP could possibly be related to the age of onset of PD. Copyright 2005 Movement Disorder Society.
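The two P values quoted in the preceding abstract (0.0095 uncorrected, 0.048 after correction) are consistent with a Bonferroni correction over the five genotyped SNPs. The sketch below only illustrates that arithmetic; the function name `bonferroni` and the choice of five as the number of tests are assumptions made for illustration, not details confirmed by the abstract.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Return the Bonferroni-adjusted P value, capped at 1.0."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Assumption: the correction was applied across the five SNPs genotyped.
adjusted = bonferroni(0.0095, 5)
print(f"adjusted P = {adjusted:.4f}")
```

Under this assumption the adjusted value is 0.0095 × 5 = 0.0475, which agrees with the reported 0.048 up to rounding.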
The SPINK gene family and celiac disease susceptibility.
Wapenaar, Martin C; Monsuur, Alienke J; Poell, Jos; van 't Slot, Ruben; Meijer, Jos W R; Meijer, Gerrit A; Mulder, Chris J; Mearin, Maria Luisa; Wijmenga, Cisca
2007-05-01
The gene family of serine protease inhibitors of the Kazal type (SPINK) are functional and positional candidate genes for celiac disease (CD). Our aim was to assess the gut mucosal gene expression and genetic association of SPINK1, -2, -4, and -5 in the Dutch CD population. Gene expression was determined for all four SPINK genes by quantitative reverse-transcription polymerase chain reaction in duodenal biopsy samples from untreated (n=15) and diet-treated patients (n=31) and controls (n=16). Genetic association of the four SPINK genes was tested within a total of 18 haplotype tagging SNPs, one coding SNP, 310 patients, and 180 controls. The SPINK4 study cohort was further expanded to include 479 CD cases and 540 controls. SPINK4 DNA sequence analysis was performed on six members of a multigeneration CD family to detect possible point mutations or deletions. SPINK4 showed differential gene expression, which was at its highest in untreated patients and dropped sharply upon commencement of a gluten-free diet. Genetic association tests for all four SPINK genes were negative, including SPINK4 in the extended case/control cohort. No SPINK4 mutations or deletions were observed in the multigeneration CD family with linkage to chromosome 9p21-13 nor was the coding SNP disease-specific. SPINK4 exhibits CD pathology-related differential gene expression, likely derived from altered goblet cell activity. All of the four SPINK genes tested do not contribute to the genetic risk for CD in the Dutch population.
Haiman, Christopher A; Garcia, Rachel R; Hsu, Chris; Xia, Lucy; Ha, Helen; Sheng, Xin; Le Marchand, Loic; Kolonel, Laurence N; Henderson, Brian E; Stallcup, Michael R; Greene, Geoffrey L; Press, Michael F
2009-01-30
Only a limited number of studies have performed comprehensive investigations of coding variation in relation to breast cancer risk. Given the established role of estrogens in breast cancer, we hypothesized that coding variation in steroid receptor coactivator and corepressor genes may alter inter-individual response to estrogen and serve as markers of breast cancer risk. We sequenced the coding exons of 17 genes (EP300, CCND1, NME1, NCOA1, NCOA2, NCOA3, SMARCA4, SMARCA2, CARM1, FOXA1, MPG, NCOR1, NCOR2, CALCOCO1, PRMT1, PPARBP and CREBBP) suggested to influence transcriptional activation by steroid hormone receptors in a multiethnic panel of women with advanced breast cancer (n = 95): African Americans, Latinos, Japanese, Native Hawaiians and European Americans. Association testing of validated coding variants was conducted in a breast cancer case-control study (1,612 invasive cases and 1,961 controls) nested in the Multiethnic Cohort. We used logistic regression to estimate odds ratios for allelic effects in ethnic-pooled analyses as well as in subgroups defined by disease stage and steroid hormone receptor status. We also investigated effect modification by established breast cancer risk factors that are associated with steroid hormone exposure. We identified 45 coding variants with frequencies ≥ 1% in any one ethnic group (43 non-synonymous variants). We observed nominally significant positive associations with two coding variants in ethnic-pooled analyses (NCOR2: His52Arg, OR = 1.79; 95% CI, 1.05-3.05; CALCOCO1: Arg12His, OR = 2.29; 95% CI, 1.00-5.26). A small number of variants were associated with risk in disease subgroup analyses and we observed no strong evidence of effect modification by breast cancer risk factors. Based on the large number of statistical tests conducted in this study, the nominally significant associations that we observed may be due to chance, and will need to be confirmed in other studies.
Our findings suggest that common coding variation in these candidate genes does not make a substantial contribution to breast cancer risk in the general population. Cataloging and testing of coding variants in coactivator and corepressor genes should continue and may serve as a valuable resource for investigations of other hormone-related phenotypes, such as inter-individual response to hormonal therapies used for cancer treatment and prevention.
Expansion by whole genome duplication and evolution of the sox gene family in teleost fish
Naville, Magali; Volff, Jean-Nicolas
2017-01-01
It is now recognized that several rounds of whole genome duplication (WGD) have occurred during the evolution of vertebrates, but the link between WGDs and phenotypic diversification remains unsolved. In this study we investigated the impact of the teleost-specific WGD on the evolution of the sox gene family in teleostean fishes. The sox gene family, which encodes transcription factors, has an essential role in the morphology, physiology and behavior of vertebrates and of teleosts, currently the largest group of vertebrates. We first redrew the evolution of all sox genes identified in eleven teleost genomes using a comparative genomic approach including phylogenetic and synteny analyses. Compared with tetrapods, we noticed an important expansion of the sox family: 58% (11/19) of sox genes are duplicated in teleost genomes. Furthermore, all duplicated sox genes, except sox17 paralogs, are derived from the teleost-specific WGD. Then, focusing on five sox genes and analyzing the evolution of coding and non-coding sequences, as well as the expression patterns in fish embryos and adult tissues, we demonstrated that these paralogs followed lineage-specific evolutionary trajectories in teleost genomes. This work, based on whole genome data from multiple teleostean species, supports the contribution of WGDs to the expansion of gene families, as well as to the emergence of genomic differences between lineages that might promote genetic and phenotypic diversity in teleosts. PMID:28738066
Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng
2005-09-10
Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P < 0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P < 0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P < 0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.
Evolution of Salmonella-Host Cell Interactions through a Dynamic Bacterial Genome
Ilyas, Bushra; Tsai, Caressa N.; Coombes, Brian K.
2017-01-01
Salmonella Typhimurium has a broad arsenal of genes that are tightly regulated and coordinated to facilitate adaptation to the various host environments it colonizes. The genome of Salmonella Typhimurium has undergone multiple gene acquisition events and has accrued changes in non-coding DNA that have undergone selection by regulatory evolution. Together, at least 17 horizontally acquired pathogenicity islands (SPIs), prophage-associated genes, and changes in core genome regulation contribute to the virulence program of Salmonella. Here, we review the latest understanding of these elements and their contributions to pathogenesis, emphasizing the regulatory circuitry that controls niche-specific gene expression. In addition to an overview of the importance of SPI-1 and SPI-2 to host invasion and colonization, we describe the recently characterized contributions of other SPIs, including the antibacterial activity of SPI-6 and adhesion and invasion mediated by SPI-4. We further discuss how these fitness traits have been integrated into the regulatory circuitry of the bacterial cell through cis-regulatory evolution and by a careful balance of silencing and counter-silencing by regulatory proteins. Detailed understanding of regulatory evolution within Salmonella is uncovering novel aspects of infection biology that relate to host-pathogen interactions and evasion of host immunity. PMID:29034217
Quantifying the mechanisms of domain gain in animal proteins.
Buljan, Marija; Frankish, Adam; Bateman, Alex
2010-01-01
Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechanisms that underlie domain gains in animals are still unknown. By using animal gene phylogenies we were able to identify a set of high confidence domain gain events and by looking at their coding DNA investigate the causative mechanisms. Here we show that the major mechanism for gains of new domains in metazoan proteins is likely to be gene fusion through joining of exons from adjacent genes, possibly mediated by non-allelic homologous recombination. Retroposition and insertion of exons into ancestral introns through intronic recombination are, in contrast to previous expectations, only minor contributors to domain gains and have accounted for less than 1% and 10% of high confidence domain gain events, respectively. Additionally, exonization of previously non-coding regions appears to be an important mechanism for addition of disordered segments to proteins. We observe that gene duplication has preceded domain gain in at least 80% of the gain events. The interplay of gene duplication and domain gain demonstrates an important mechanism for fast neofunctionalization of genes.
Bhavanandan, V P; Gupta, D; Woitach, J; Guo, X; Jiang, W
1999-06-01
Secreted epithelial mucins are large macromolecules which exhibit extreme polydispersity, the molecular basis of which is not fully understood. We have obtained partial sequences of two genes (BSM1 and BSM2) coding for two distinct molecules. This is the first time that such closely-related genes have been identified for any mucin from an animal. We propose that a combination of multiple homologous genes, alternative splicing, differential glycosylation, and additional post-translational processing all contribute to the extreme polydispersity of mucins. The multiple domain structure and non-identical tandem repeats are also very important for the generation of the saccharide diversities of mucins.
Discovering Protein-Coding Genes from the Environment: Time for the Eukaryotes?
Marmeisse, Roland; Kellner, Harald; Fraissinet-Tachet, Laurence; Luis, Patricia
2017-09-01
Eukaryotic microorganisms from diverse environments encompass a large number of taxa, many of them still unknown to science. One strategy to mine these organisms for genes of biotechnological relevance is to use a pool of eukaryotic mRNA directly extracted from environmental samples. Recent reports demonstrate that the resulting metatranscriptomic cDNA libraries can be screened by expression in yeast for a wide range of genes and functions from many of the different eukaryotic taxa. In combination with novel emerging high-throughput technologies, we anticipate that this approach should contribute to exploring the functional diversity of the eukaryotic microbiota. Copyright © 2017 Elsevier Ltd. All rights reserved.
CHEK2 contribution to hereditary breast cancer in non-BRCA families
2011-01-01
Background Mutations in the BRCA1 and BRCA2 genes are responsible for only a part of hereditary breast cancer (HBC). The origins of "non-BRCA" HBC in families may be attributed in part to rare mutations in genes conferring moderate risk, such as CHEK2, which encodes for an upstream regulator of BRCA1. Previous studies have demonstrated an association between CHEK2 founder mutations and non-BRCA HBC. However, very few data on the entire coding sequence of this gene are available. Methods We investigated the contribution of CHEK2 mutations to non-BRCA HBC by direct sequencing of its whole coding sequence in 507 non-BRCA HBC cases and 513 controls. Results We observed 16 mutations in cases and 4 in controls, including 9 missense variants of uncertain consequence. Using both in silico tools and an in vitro kinase activity test, the majority of the variants were found likely to be deleterious for protein function. One variant present in both cases and controls was proposed to be neutral. Removing this variant from the pool of potentially deleterious variants gave a mutation frequency of 1.48% for cases and 0.29% for controls (P = 0.0040). The odds ratio of breast cancer in the presence of a deleterious CHEK2 mutation was 5.18. Conclusions Our work indicates that a variety of deleterious CHEK2 alleles make an appreciable contribution to breast cancer susceptibility, and their identification could help in the clinical management of patients carrying a CHEK2 mutation. PMID:22114986
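The frequencies and odds ratio reported in the preceding abstract can be reproduced with a short calculation. The sketch below is a hedged illustration only: the abstract does not give the carrier counts after the neutral variant is removed, so the values of 15 case carriers and 3 control carriers (each assumed heterozygous, carrying one mutant allele) are inferred assumptions, and the function names are hypothetical.

```python
def odds_ratio(case_carriers: int, n_cases: int,
               control_carriers: int, n_controls: int) -> float:
    """Odds ratio from a 2x2 table of carriers vs non-carriers."""
    case_noncarriers = n_cases - case_carriers
    control_noncarriers = n_controls - control_carriers
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def allele_frequency(n_mutant_alleles: int, n_individuals: int) -> float:
    """Mutant allele frequency in a diploid sample (two alleles per person)."""
    return n_mutant_alleles / (2 * n_individuals)


# Assumed counts (not stated in the abstract): 16 - 1 = 15 carriers among
# 507 cases and 4 - 1 = 3 carriers among 513 controls, all heterozygous.
print(f"case allele frequency:    {100 * allele_frequency(15, 507):.2f}%")
print(f"control allele frequency: {100 * allele_frequency(3, 513):.2f}%")
print(f"odds ratio:               {odds_ratio(15, 507, 3, 513):.2f}")
```

With these assumed counts the calculation matches the reported 1.48%, 0.29% and 5.18. That agreement suggests, but does not prove, that the frequencies were computed per allele and the odds ratio per carrier.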
Sierocka, Izabela; Kozlowski, Lukasz P; Bujnicki, Janusz M; Jarmolowski, Artur; Szweykowska-Kulinska, Zofia
2014-06-17
In flowering plants a number of genes have been identified which control the transition from a vegetative to generative phase of life cycle. In bryophytes representing basal lineage of land plants, there is little data regarding the mechanisms that control this transition. Two bryophyte species, the moss Physcomitrella patens and the liverwort Marchantia polymorpha, are under advanced molecular and genetic research. The goal of our study was to identify genes connected to female gametophyte development and archegonia production in the dioecious liverwort Pellia endiviifolia species B, which is representative of the most basal lineage of the simple thalloid liverworts. The utility of the RDA-cDNA technique allowed us to identify three genes specifically expressed in the female individuals of P. endiviifolia: PenB_CYSP coding for cysteine protease, PenB_MT2 and PenB_MT3 coding for Mysterious Transcripts1 and 2 containing ORFs of 143 and 177 amino acid residues in length, respectively. The exon-intron structure of all three genes has been characterized and pre-mRNA processing was investigated. Interestingly, five mRNA isoforms are produced from the PenB_MT2 gene, which result from alternative splicing within the second and third exon. All observed splicing events take place within the 5'UTR and do not interfere with the coding sequence. All three genes are exclusively expressed in the female individuals, regardless of whether they were cultured in vitro or were collected from a natural habitat. Moreover, we observed a ten-fold increase in transcript levels for all three genes in the archegonial tissue in comparison to the vegetative parts of the same female thalli grown in natural habitat, suggesting their connection to archegonia development. We have identified three genes which are specifically expressed in P. endiviifolia sp B female gametophytes. Moreover, their expression is connected to the female sex-organ differentiation and is developmentally regulated.
The contribution of the identified genes may be crucial for successful liverwort sexual reproduction.
Facts and updates about cardiovascular non-coding RNAs in heart failure.
Thum, Thomas
2015-09-01
About 11% of all deaths include heart failure as a contributing cause. The annual cost of heart failure amounts to US $34,000,000,000 in the United States alone. With the exception of heart transplantation, there is no curative therapy available. Only occasionally there are new areas in science that develop into completely new research fields. The topic on non-coding RNAs, including microRNAs, long non-coding RNAs, and circular RNAs, is such a field. In this short review, we will discuss the latest developments about non-coding RNAs in cardiovascular disease. MicroRNAs are short regulatory non-coding endogenous RNA species that are involved in virtually all cellular processes. Long non-coding RNAs also regulate gene and protein levels; however, by much more complicated and diverse mechanisms. In general, non-coding RNAs have been shown to be of great value as therapeutic targets in adverse cardiac remodelling and also as diagnostic and prognostic biomarkers for heart failure. In the future, non-coding RNA-based therapeutics are likely to enter the clinical reality offering a new treatment approach of heart failure.
Michel, Christian J
2017-04-18
In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes.
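The idea of comparing trinucleotide occurrence in the reading frame against the two shifted frames, computed gene by gene, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the two-element set `TOY_CODE` and the sequence `TOY_GENE` are hypothetical stand-ins, not the 20-trinucleotide set X or real gene data, and the sketch checks only self-complementarity, not the circularity property.

```python
# Complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(code: set[str]) -> bool:
    """True if the reverse complement of every trinucleotide is also in the set."""
    return all(reverse_complement(t) in code for t in code)


def frame_counts(gene: str, code: set[str]) -> list[int]:
    """Count trinucleotides of `code` in frame 0 (reading frame), 1 and 2."""
    counts = []
    for shift in range(3):
        # Step through the gene in non-overlapping triplets starting at `shift`.
        triplets = (gene[i:i + 3] for i in range(shift, len(gene) - 2, 3))
        counts.append(sum(t in code for t in triplets))
    return counts


# Hypothetical toy data, for illustration only.
TOY_CODE = {"GAA", "TTC"}
TOY_GENE = "GAATTCGAA"

print(is_self_complementary(TOY_CODE))
print(frame_counts(TOY_GENE, TOY_CODE))
```

A gene-level statistic in the spirit of the abstract would compute `frame_counts` for each gene separately and then give every gene the same weight when aggregating, rather than pooling all trinucleotides across the gene population.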
Qayyum, Arqam; Zai, Clement C.; Hirata, Yuko; Tiwari, Arun K.; Cheema, Sheraz; Nowrouzi, Behdin; Beitchman, Joseph H.; Kennedy, James L.
2015-01-01
Aggressive behaviors have become a major public health problem, and early-onset aggression can lead to outcomes such as substance abuse and antisocial personality disorder, among other issues. In recent years, there has been an increase in research into the molecular and genetic underpinnings of aggressive behavior, and one of the candidate genes encodes catechol-O-methyltransferase (COMT). COMT is involved in catabolizing catecholamines such as dopamine. These neurotransmitters appear to be involved in regulating mood, which can contribute to aggression. The most commonly studied variant in the COMT gene is the Valine (Val) to Methionine (Met) substitution at codon 158. We review the current literature on this gene variant in aggressive behavior. PMID:26630958
Kelsen, Judith R; Dawany, Noor; Moran, Christopher J; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F; Daly, Mark; Sullivan, Kathleen E; Baldassano, Robert N; Devoto, Marcella
2015-11-01
Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed at 5 years of age or younger, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (age, 3 wk to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by postprocessing and variant calling. After functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency less than 0.1%, and scaled combined annotation-dependent depletion scores of 10 or less. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n = 45) or adult-onset Crohn's disease (n = 20) and healthy individuals (controls, n = 145) were obtained from the University of Kiel, Germany, and used as control groups. Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling more than 1 Mbp of coding sequence, were selected from the whole-exome data. Our analysis showed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. 
Our analysis could lead to the identification of previously unidentified IBD-associated variants. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
Küpper, Clemens; Burke, Terry; Lank, David B.
2015-01-01
Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935
Studying Functions of All Yeast Genes Simultaneously
NASA Technical Reports Server (NTRS)
Stolc, Viktor; Eason, Robert G.; Poumand, Nader; Herman, Zelek S.; Davis, Ronald W.; Anthony, Kevin; Jejelowo, Olufisayo
2006-01-01
A method of studying the functions of all the genes of a given species of microorganism simultaneously has been developed in experiments on Saccharomyces cerevisiae (commonly known as baker's or brewer's yeast). It is already known that many yeast genes perform functions similar to those of corresponding human genes; therefore, by facilitating understanding of yeast genes, the method may ultimately also contribute to the knowledge needed to treat some diseases in humans. Because of the complexity of the method and the highly specialized nature of the underlying knowledge, it is possible to give only a brief and sketchy summary here. The method involves the use of unique synthetic deoxyribonucleic acid (DNA) sequences that are denoted as DNA bar codes because of their utility as molecular labels. The method also involves the disruption of gene functions through deletion of genes. Saccharomyces cerevisiae is a particularly powerful experimental system in that multiple deletion strains easily can be pooled for parallel growth assays. Individual deletion strains recently have been created for 5,918 open reading frames, representing nearly all of the estimated 6,000 genetic loci of Saccharomyces cerevisiae. Tagging of each deletion strain with one or two unique 20-nucleotide sequences enables identification of genes affected by specific growth conditions, without prior knowledge of gene functions. Hybridization of bar-code DNA to oligonucleotide arrays can be used to measure the growth rate of each strain over several cell-division generations. The growth rate thus measured serves as an index of the fitness of the strain.
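As a rough illustration of how a growth rate might be derived from bar-code hybridization signals, the following Python sketch fits a straight line to the log2 ratio of a strain's signal to a pool reference across generations. The abstract does not give the actual calculation, so the log-ratio model, the least-squares fit, the function name `fitness_slope`, and the input values are all assumptions made for illustration.

```python
import math


def fitness_slope(generations, strain_signal, pool_signal):
    """Least-squares slope of log2(strain/pool) against generations.

    A negative slope means the strain's bar-code signal is shrinking
    relative to the pool reference, i.e. it grows more slowly.
    (Assumed model; not taken from the source.)
    """
    ys = [math.log2(s / p) for s, p in zip(strain_signal, pool_signal)]
    n = len(generations)
    mean_x = sum(generations) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in generations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(generations, ys))
    return sxy / sxx


# Hypothetical signals: the strain's signal halves every 5 generations
# while the pool reference stays constant.
gens = [0, 5, 10, 15]
strain = [1000.0, 500.0, 250.0, 125.0]
pool = [1000.0, 1000.0, 1000.0, 1000.0]
print(fitness_slope(gens, strain, pool))
```

Under these assumptions, a slope near zero would indicate a deletion with no measurable effect on growth in the tested condition, and a strongly negative slope would flag a gene needed under that condition.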
Michel, Christian J.
2017-01-01
In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes. PMID:28420220
Low Prevalence of CHEK2 Gene Mutations in Multiethnic Cohorts of Breast Cancer Patients in Malaysia
Mohamad, Suriati; Isa, Nurismah Md; Muhammad, Rohaizak; Emran, Nor Aina; Kitan, Nor Mayah; Kang, Peter; Kang, In Nee; Taib, Nur Aishah Mohd; Teo, Soo Hwang; Akmal, Sharifah Noor
2015-01-01
CHEK2 is a protein kinase that is involved in cell-cycle checkpoint control after DNA damage. Germline mutations in CHEK2 gene have been associated with increase in breast cancer risk. The aim of this study is to identify the CHEK2 gene germline mutations among high-risk breast cancer patients and its contribution to the multiethnic population in Malaysia. We screened the entire coding region of CHEK2 gene on 59 high-risk breast cancer patients who tested negative for BRCA1/2 germline mutations from UKM Medical Centre (UKMMC), Hospital Kuala Lumpur (HKL) and Hospital Putrajaya (HPJ). Sequence variants identified were screened further in case-control cohorts consisting of 878 unselected invasive breast cancer patients (180 Malays, 526 Chinese and 172 Indian) and 270 healthy individuals (90 Malays, 90 Chinese and 90 Indian). By screening the entire coding region of the CHEK2 gene, two missense mutations, c.480A>G (p.I160M) and c.538C>T (p.R180C) were identified in two unrelated patients (3.4%). Further screening of these missense mutations on the case-control cohorts unveiled the variant p.I160M in 2/172 (1.1%) Indian cases and 1/90 (1.1%) Indian control, variant p.R180C in 2/526 (0.38%) Chinese cases and 0/90 Chinese control, and in 2/180 (1.1%) of Malay cases and 1/90 (1.1%) of Malay control. The results of this study suggest that CHEK2 mutations are rare among high-risk breast cancer patients and may play a minor contributing role in breast carcinogenesis among Malaysian population. PMID:25629968
Howard, David M; Adams, Mark J; Clarke, Toni-Kim; Wigmore, Eleanor M; Zeng, Yanni; Hagenaars, Saskia P; Lyall, Donald M; Thomson, Pippa A; Evans, Kathryn L; Porteous, David J; Nagy, Reka; Hayward, Caroline; Haley, Chris S; Smith, Blair H; Murray, Alison D; Batty, G David; Deary, Ian J; McIntosh, Andrew M
2017-01-01
Cognitive ability is a heritable trait with a polygenic architecture, for which several associated variants have been identified using genotype-based and candidate gene approaches. Haplotype-based analyses are a complementary technique that take phased genotype data into account, and potentially provide greater statistical power to detect lower frequency variants. In the present analysis, three cohort studies (total n = 48,002) were utilised: Generation Scotland: Scottish Family Health Study (GS:SFHS), the English Longitudinal Study of Ageing (ELSA), and the UK Biobank. A genome-wide haplotype-based meta-analysis of cognitive ability was performed, as well as a targeted meta-analysis of several gene coding regions. None of the assessed haplotypes provided evidence of a statistically significant association with cognitive ability in either the individual cohorts or the meta-analysis. Within the meta-analysis, the haplotype with the lowest observed P-value overlapped with the D-amino acid oxidase activator (DAOA) gene coding region. This coding region has previously been associated with bipolar disorder, schizophrenia and Alzheimer's disease, which have all been shown to impact upon cognitive ability. Another potentially interesting region highlighted within the current genome-wide association analysis (GS:SFHS: P = 4.09 × 10⁻⁷) was the butyrylcholinesterase (BCHE) gene coding region. The protein encoded by BCHE has been shown to influence the progression of Alzheimer's disease and its role in cognitive ability merits further investigation. Although no evidence was found for any haplotypes with a statistically significant association with cognitive ability, our results did provide further evidence that the genetic variants contributing to the variance of cognitive ability are likely to be of small effect.
Genome-wide identification and expression profiling of the SnRK2 gene family in Malus prunifolia.
Shao, Yun; Qin, Yuan; Zou, Yangjun; Ma, Fengwang
2014-11-15
Sucrose non-fermenting-1-related protein kinase 2 (SnRK2) constitutes a small plant-specific serine/threonine kinase family with essential roles in the abscisic acid (ABA) signal pathway and in responses to osmotic stress. Although a genome-wide analysis of this family has been conducted in some species, little is known about SnRK2 genes in apple (Malus domestica). We identified 14 putative sequences encoding 12 deduced SnRK2 proteins within the apple genome. Gene chromosomal location and synteny analysis of the apple SnRK2 genes indicated that tandem and segmental duplications have likely contributed to the expansion and evolution of these genes. All 12 full-length coding sequences were confirmed by cloning from Malus prunifolia. The gene structure and motif compositions of the apple SnRK2 genes were analyzed. Phylogenetic analysis showed that MpSnRK2s could be classified into four groups. Profiling of these genes presented differential patterns of expression in various tissues. Under stress conditions, transcript levels for some family members were up-regulated in the leaves in response to drought, salinity, or ABA treatments. This suggested their possible roles in plant response to abiotic stress. Our findings provide essential information about SnRK2 genes in apple and will contribute to further functional dissection of this gene family. Copyright © 2014 Elsevier B.V. All rights reserved.
Association analysis identifies 65 new breast cancer risk loci
Lemaçon, Audrey; Soucy, Penny; Glubb, Dylan; Rostamianfar, Asha; Bolla, Manjeet K.; Wang, Qin; Tyrer, Jonathan; Dicks, Ed; Lee, Andrew; Wang, Zhaoming; Allen, Jamie; Keeman, Renske; Eilber, Ursula; French, Juliet D.; Chen, Xiao Qing; Fachal, Laura; McCue, Karen; McCart Reed, Amy E.; Ghoussaini, Maya; Carroll, Jason; Jiang, Xia; Finucane, Hilary; Adams, Marcia; Adank, Muriel A.; Ahsan, Habibul; Aittomäki, Kristiina; Anton-Culver, Hoda; Antonenkova, Natalia N.; Arndt, Volker; Aronson, Kristan J.; Arun, Banu; Auer, Paul L.; Bacot, François; Barrdahl, Myrto; Baynes, Caroline; Beckmann, Matthias W.; Behrens, Sabine; Benitez, Javier; Bermisheva, Marina; Bernstein, Leslie; Blomqvist, Carl; Bogdanova, Natalia V.; Bojesen, Stig E.; Bonanni, Bernardo; Børresen-Dale, Anne-Lise; Brand, Judith S.; Brauch, Hiltrud; Brennan, Paul; Brenner, Hermann; Brinton, Louise; Broberg, Per; Brock, Ian W.; Broeks, Annegien; Brooks-Wilson, Angela; Brucker, Sara Y.; Brüning, Thomas; Burwinkel, Barbara; Butterbach, Katja; Cai, Qiuyin; Cai, Hui; Caldés, Trinidad; Canzian, Federico; Carracedo, Angel; Carter, Brian D.; Castelao, Jose E.; Chan, Tsun L.; Cheng, Ting-Yuan David; Chia, Kee Seng; Choi, Ji-Yeob; Christiansen, Hans; Clarke, Christine L.; Collée, Margriet; Conroy, Don M.; Cordina-Duverger, Emilie; Cornelissen, Sten; Cox, David G; Cox, Angela; Cross, Simon S.; Cunningham, Julie M.; Czene, Kamila; Daly, Mary B.; Devilee, Peter; Doheny, Kimberly F.; Dörk, Thilo; dos-Santos-Silva, Isabel; Dumont, Martine; Durcan, Lorraine; Dwek, Miriam; Eccles, Diana M.; Ekici, Arif B.; Eliassen, A. 
Heather; Ellberg, Carolina; Elvira, Mingajeva; Engel, Christoph; Eriksson, Mikael; Fasching, Peter A.; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Fritschi, Lin; Gaborieau, Valerie; Gabrielson, Marike; Gago-Dominguez, Manuela; Gao, Yu-Tang; Gapstur, Susan M.; García-Sáenz, José A.; Gaudet, Mia M.; Georgoulias, Vassilios; Giles, Graham G.; Glendon, Gord; Goldberg, Mark S.; Goldgar, David E.; González-Neira, Anna; Grenaker Alnæs, Grethe I.; Grip, Mervi; Gronwald, Jacek; Grundy, Anne; Guénel, Pascal; Haeberle, Lothar; Hahnen, Eric; Haiman, Christopher A.; Håkansson, Niclas; Hamann, Ute; Hamel, Nathalie; Hankinson, Susan; Harrington, Patricia; Hart, Steven N.; Hartikainen, Jaana M.; Hartman, Mikael; Hein, Alexander; Heyworth, Jane; Hicks, Belynda; Hillemanns, Peter; Ho, Dona N.; Hollestelle, Antoinette; Hooning, Maartje J.; Hoover, Robert N.; Hopper, John L.; Hou, Ming-Feng; Hsiung, Chia-Ni; Huang, Guanmengqian; Humphreys, Keith; Ishiguro, Junko; Ito, Hidemi; Iwasaki, Motoki; Iwata, Hiroji; Jakubowska, Anna; Janni, Wolfgang; John, Esther M.; Johnson, Nichola; Jones, Kristine; Jones, Michael; Jukkola-Vuorinen, Arja; Kaaks, Rudolf; Kabisch, Maria; Kaczmarek, Katarzyna; Kang, Daehee; Kasuga, Yoshio; Kerin, Michael J.; Khan, Sofia; Khusnutdinova, Elza; Kiiski, Johanna I.; Kim, Sung-Won; Knight, Julia A.; Kosma, Veli-Matti; Kristensen, Vessela N.; Krüger, Ute; Kwong, Ava; Lambrechts, Diether; Marchand, Loic Le; Lee, Eunjung; Lee, Min Hyuk; Lee, Jong Won; Lee, Chuen Neng; Lejbkowicz, Flavio; Li, Jingmei; Lilyquist, Jenna; Lindblom, Annika; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Long, Jirong; Lophatananon, Artitaya; Lubinski, Jan; Luccarini, Craig; Lux, Michael P.; Ma, Edmond S.K.; MacInnis, Robert J.; Maishman, Tom; Makalic, Enes; Malone, Kathleen E; Kostovska, Ivana Maleva; Mannermaa, Arto; Manoukian, Siranoush; Manson, JoAnn E.; Margolin, Sara; Mariapun, Shivaani; Martinez, Maria Elena; Matsuo, Keitaro; Mavroudis, Dimitrios; McKay, 
James; McLean, Catriona; Meijers-Heijboer, Hanne; Meindl, Alfons; Menéndez, Primitiva; Menon, Usha; Meyer, Jeffery; Miao, Hui; Miller, Nicola; Mohd Taib, Nur Aishah; Muir, Kenneth; Mulligan, Anna Marie; Mulot, Claire; Neuhausen, Susan L.; Nevanlinna, Heli; Neven, Patrick; Nielsen, Sune F.; Noh, Dong-Young; Nordestgaard, Børge G.; Norman, Aaron; Olopade, Olufunmilayo I.; Olson, Janet E.; Olsson, Håkan; Olswold, Curtis; Orr, Nick; Pankratz, V. Shane; Park, Sue K.; Park-Simon, Tjoung-Won; Lloyd, Rachel; Perez, Jose I.A.; Peterlongo, Paolo; Peto, Julian; Phillips, Kelly-Anne; Pinchev, Mila; Plaseska-Karanfilska, Dijana; Prentice, Ross; Presneau, Nadege; Prokofieva, Darya; Pugh, Elizabeth; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rennert, Gadi; Rennert, Hedy S.; Rhenius, Valerie; Romero, Atocha; Romm, Jane; Ruddy, Kathryn J; Rüdiger, Thomas; Rudolph, Anja; Ruebner, Matthias; Rutgers, Emiel J. Th.; Saloustros, Emmanouil; Sandler, Dale P.; Sangrajrang, Suleeporn; Sawyer, Elinor J.; Schmidt, Daniel F.; Schmutzler, Rita K.; Schneeweiss, Andreas; Schoemaker, Minouk J.; Schumacher, Fredrick; Schürmann, Peter; Scott, Rodney J.; Scott, Christopher; Seal, Sheila; Seynaeve, Caroline; Shah, Mitul; Sharma, Priyanka; Shen, Chen-Yang; Sheng, Grace; Sherman, Mark E.; Shrubsole, Martha J.; Shu, Xiao-Ou; Smeets, Ann; Sohn, Christof; Southey, Melissa C.; Spinelli, John J.; Stegmaier, Christa; Stewart-Brown, Sarah; Stone, Jennifer; Stram, Daniel O.; Surowy, Harald; Swerdlow, Anthony; Tamimi, Rulla; Taylor, Jack A.; Tengström, Maria; Teo, Soo H.; Terry, Mary Beth; Tessier, Daniel C.; Thanasitthichai, Somchai; Thöne, Kathrin; Tollenaar, Rob A.E.M.; Tomlinson, Ian; Tong, Ling; Torres, Diana; Truong, Thérèse; Tseng, Chiu-chen; Tsugane, Shoichiro; Ulmer, Hans-Ulrich; Ursin, Giske; Untch, Michael; Vachon, Celine; van Asperen, Christi J.; Van Den Berg, David; van den Ouweland, Ans M.W.; van der Kolk, Lizet; van der Luijt, Rob B.; Vincent, Daniel; Vollenweider, Jason; 
Waisfisz, Quinten; Wang-Gohrke, Shan; Weinberg, Clarice R.; Wendt, Camilla; Whittemore, Alice S.; Wildiers, Hans; Willett, Walter; Winqvist, Robert; Wolk, Alicja; Wu, Anna H.; Xia, Lucy; Yamaji, Taiki; Yang, Xiaohong R.; Yip, Cheng Har; Yoo, Keun-Young; Yu, Jyh-Cherng; Zheng, Wei; Zheng, Ying; Zhu, Bin; Ziogas, Argyrios; Ziv, Elad; Lakhani, Sunil R.; Antoniou, Antonis C.; Droit, Arnaud; Andrulis, Irene L.; Amos, Christopher I.; Couch, Fergus J.; Pharoah, Paul D.P.; Chang-Claude, Jenny; Hall, Per; Hunter, David J.; Milne, Roger L.; García-Closas, Montserrat; Schmidt, Marjanka K.; Chanock, Stephen J.; Dunning, Alison M.; Edwards, Stacey L.; Bader, Gary D.; Chenevix-Trench, Georgia; Simard, Jacques; Kraft, Peter; Easton, Douglas F.
2017-01-01
Breast cancer risk is influenced by rare coding variants in susceptibility genes such as BRCA1 and many common, mainly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. We report results from a genome-wide association study (GWAS) of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci associated with overall breast cancer at p < 5 × 10⁻⁸. The majority of credible risk SNPs in the new loci fall in distal regulatory elements, and by integrating in-silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all SNPs in regulatory features was 2-5-fold enriched relative to the genome-wide average, with strong enrichment for particular transcription factor binding sites. These results provide further insight into genetic susceptibility to breast cancer and will improve the utility of genetic risk scores for individualized screening and prevention. PMID:29059683
SoyNet: a database of co-functional networks for soybean Glycine max.
Kim, Eiru; Hwang, Sohyun; Lee, Insuk
2017-01-04
Soybean (Glycine max) is a legume crop with substantial economic value, providing a source of oil and protein for humans and livestock. More than 50% of edible oils consumed globally are derived from this crop. Soybean plants are also important for soil fertility, as they fix atmospheric nitrogen by symbiosis with microorganisms. The latest soybean genome annotation (version 2.0) lists 56 044 coding genes, yet their functional contributions to crop traits remain mostly unknown. Co-functional networks have proven useful for identifying genes that are involved in a particular pathway or phenotype with various network algorithms. Here, we present SoyNet (available at www.inetbio.org/soynet), a database of co-functional networks for G. max and a companion web server for network-based functional predictions. SoyNet maps 1 940 284 co-functional links between 40 812 soybean genes (72.8% of the coding genome), which were inferred from 21 distinct types of genomics data including 734 microarrays and 290 RNA-seq samples from soybean. SoyNet provides a new route to functional investigation of the soybean genome, elucidating genes and pathways of agricultural importance. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Association analysis identifies 65 new breast cancer risk loci.
Michailidou, Kyriaki; Lindström, Sara; Dennis, Joe; Beesley, Jonathan; Hui, Shirley; Kar, Siddhartha; Lemaçon, Audrey; Soucy, Penny; Glubb, Dylan; Rostamianfar, Asha; Bolla, Manjeet K; Wang, Qin; Tyrer, Jonathan; Dicks, Ed; Lee, Andrew; Wang, Zhaoming; Allen, Jamie; Keeman, Renske; Eilber, Ursula; French, Juliet D; Qing Chen, Xiao; Fachal, Laura; McCue, Karen; McCart Reed, Amy E; Ghoussaini, Maya; Carroll, Jason S; Jiang, Xia; Finucane, Hilary; Adams, Marcia; Adank, Muriel A; Ahsan, Habibul; Aittomäki, Kristiina; Anton-Culver, Hoda; Antonenkova, Natalia N; Arndt, Volker; Aronson, Kristan J; Arun, Banu; Auer, Paul L; Bacot, François; Barrdahl, Myrto; Baynes, Caroline; Beckmann, Matthias W; Behrens, Sabine; Benitez, Javier; Bermisheva, Marina; Bernstein, Leslie; Blomqvist, Carl; Bogdanova, Natalia V; Bojesen, Stig E; Bonanni, Bernardo; Børresen-Dale, Anne-Lise; Brand, Judith S; Brauch, Hiltrud; Brennan, Paul; Brenner, Hermann; Brinton, Louise; Broberg, Per; Brock, Ian W; Broeks, Annegien; Brooks-Wilson, Angela; Brucker, Sara Y; Brüning, Thomas; Burwinkel, Barbara; Butterbach, Katja; Cai, Qiuyin; Cai, Hui; Caldés, Trinidad; Canzian, Federico; Carracedo, Angel; Carter, Brian D; Castelao, Jose E; Chan, Tsun L; David Cheng, Ting-Yuan; Seng Chia, Kee; Choi, Ji-Yeob; Christiansen, Hans; Clarke, Christine L; Collée, Margriet; Conroy, Don M; Cordina-Duverger, Emilie; Cornelissen, Sten; Cox, David G; Cox, Angela; Cross, Simon S; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Devilee, Peter; Doheny, Kimberly F; Dörk, Thilo; Dos-Santos-Silva, Isabel; Dumont, Martine; Durcan, Lorraine; Dwek, Miriam; Eccles, Diana M; Ekici, Arif B; Eliassen, A Heather; Ellberg, Carolina; Elvira, Mingajeva; Engel, Christoph; Eriksson, Mikael; Fasching, Peter A; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Fritschi, Lin; Gaborieau, Valerie; Gabrielson, Marike; Gago-Dominguez, Manuela; Gao, Yu-Tang; Gapstur, Susan M; García-Sáenz, José A; Gaudet, Mia M; Georgoulias, 
Vassilios; Giles, Graham G; Glendon, Gord; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Grenaker Alnæs, Grethe I; Grip, Mervi; Gronwald, Jacek; Grundy, Anne; Guénel, Pascal; Haeberle, Lothar; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hamann, Ute; Hamel, Nathalie; Hankinson, Susan; Harrington, Patricia; Hart, Steven N; Hartikainen, Jaana M; Hartman, Mikael; Hein, Alexander; Heyworth, Jane; Hicks, Belynda; Hillemanns, Peter; Ho, Dona N; Hollestelle, Antoinette; Hooning, Maartje J; Hoover, Robert N; Hopper, John L; Hou, Ming-Feng; Hsiung, Chia-Ni; Huang, Guanmengqian; Humphreys, Keith; Ishiguro, Junko; Ito, Hidemi; Iwasaki, Motoki; Iwata, Hiroji; Jakubowska, Anna; Janni, Wolfgang; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael; Jukkola-Vuorinen, Arja; Kaaks, Rudolf; Kabisch, Maria; Kaczmarek, Katarzyna; Kang, Daehee; Kasuga, Yoshio; Kerin, Michael J; Khan, Sofia; Khusnutdinova, Elza; Kiiski, Johanna I; Kim, Sung-Won; Knight, Julia A; Kosma, Veli-Matti; Kristensen, Vessela N; Krüger, Ute; Kwong, Ava; Lambrechts, Diether; Le Marchand, Loic; Lee, Eunjung; Lee, Min Hyuk; Lee, Jong Won; Neng Lee, Chuen; Lejbkowicz, Flavio; Li, Jingmei; Lilyquist, Jenna; Lindblom, Annika; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Long, Jirong; Lophatananon, Artitaya; Lubinski, Jan; Luccarini, Craig; Lux, Michael P; Ma, Edmond S K; MacInnis, Robert J; Maishman, Tom; Makalic, Enes; Malone, Kathleen E; Kostovska, Ivana Maleva; Mannermaa, Arto; Manoukian, Siranoush; Manson, JoAnn E; Margolin, Sara; Mariapun, Shivaani; Martinez, Maria Elena; Matsuo, Keitaro; Mavroudis, Dimitrios; McKay, James; McLean, Catriona; Meijers-Heijboer, Hanne; Meindl, Alfons; Menéndez, Primitiva; Menon, Usha; Meyer, Jeffery; Miao, Hui; Miller, Nicola; Taib, Nur Aishah Mohd; Muir, Kenneth; Mulligan, Anna Marie; Mulot, Claire; Neuhausen, Susan L; Nevanlinna, Heli; Neven, Patrick; Nielsen, Sune F; Noh, Dong-Young; Nordestgaard, Børge G; Norman, Aaron; Olopade, 
Olufunmilayo I; Olson, Janet E; Olsson, Håkan; Olswold, Curtis; Orr, Nick; Pankratz, V Shane; Park, Sue K; Park-Simon, Tjoung-Won; Lloyd, Rachel; Perez, Jose I A; Peterlongo, Paolo; Peto, Julian; Phillips, Kelly-Anne; Pinchev, Mila; Plaseska-Karanfilska, Dijana; Prentice, Ross; Presneau, Nadege; Prokofyeva, Darya; Pugh, Elizabeth; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rennert, Gadi; Rennert, Hedy S; Rhenius, Valerie; Romero, Atocha; Romm, Jane; Ruddy, Kathryn J; Rüdiger, Thomas; Rudolph, Anja; Ruebner, Matthias; Rutgers, Emiel J T; Saloustros, Emmanouil; Sandler, Dale P; Sangrajrang, Suleeporn; Sawyer, Elinor J; Schmidt, Daniel F; Schmutzler, Rita K; Schneeweiss, Andreas; Schoemaker, Minouk J; Schumacher, Fredrick; Schürmann, Peter; Scott, Rodney J; Scott, Christopher; Seal, Sheila; Seynaeve, Caroline; Shah, Mitul; Sharma, Priyanka; Shen, Chen-Yang; Sheng, Grace; Sherman, Mark E; Shrubsole, Martha J; Shu, Xiao-Ou; Smeets, Ann; Sohn, Christof; Southey, Melissa C; Spinelli, John J; Stegmaier, Christa; Stewart-Brown, Sarah; Stone, Jennifer; Stram, Daniel O; Surowy, Harald; Swerdlow, Anthony; Tamimi, Rulla; Taylor, Jack A; Tengström, Maria; Teo, Soo H; Beth Terry, Mary; Tessier, Daniel C; Thanasitthichai, Somchai; Thöne, Kathrin; Tollenaar, Rob A E M; Tomlinson, Ian; Tong, Ling; Torres, Diana; Truong, Thérèse; Tseng, Chiu-Chen; Tsugane, Shoichiro; Ulmer, Hans-Ulrich; Ursin, Giske; Untch, Michael; Vachon, Celine; van Asperen, Christi J; Van Den Berg, David; van den Ouweland, Ans M W; van der Kolk, Lizet; van der Luijt, Rob B; Vincent, Daniel; Vollenweider, Jason; Waisfisz, Quinten; Wang-Gohrke, Shan; Weinberg, Clarice R; Wendt, Camilla; Whittemore, Alice S; Wildiers, Hans; Willett, Walter; Winqvist, Robert; Wolk, Alicja; Wu, Anna H; Xia, Lucy; Yamaji, Taiki; Yang, Xiaohong R; Har Yip, Cheng; Yoo, Keun-Young; Yu, Jyh-Cherng; Zheng, Wei; Zheng, Ying; Zhu, Bin; Ziogas, Argyrios; Ziv, Elad; Lakhani, Sunil R; Antoniou, Antonis C; Droit, Arnaud; 
Andrulis, Irene L; Amos, Christopher I; Couch, Fergus J; Pharoah, Paul D P; Chang-Claude, Jenny; Hall, Per; Hunter, David J; Milne, Roger L; García-Closas, Montserrat; Schmidt, Marjanka K; Chanock, Stephen J; Dunning, Alison M; Edwards, Stacey L; Bader, Gary D; Chenevix-Trench, Georgia; Simard, Jacques; Kraft, Peter; Easton, Douglas F
2017-11-02
Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10⁻⁸. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2-5-fold enriched relative to the genome-wide average, with strong enrichment for particular transcription factor binding sites. These results provide further insight into genetic susceptibility to breast cancer and will improve the use of genetic risk scores for individualized screening and prevention.
Mitchison, A
1997-01-01
In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics.
Transcriptional profiling of murine osteoblast differentiation based on RNA-seq expression analyses.
Khayal, Layal Abo; Grünhagen, Johannes; Provazník, Ivo; Mundlos, Stefan; Kornak, Uwe; Robinson, Peter N; Ott, Claus-Eric
2018-04-11
Osteoblastic differentiation is a multistep process characterized by osteogenic induction of mesenchymal stem cells, which then differentiate into proliferative pre-osteoblasts that produce copious amounts of extracellular matrix, followed by stiffening of the extracellular matrix, and matrix mineralization by hydroxylapatite deposition. Although these processes have been well characterized biologically, a detailed transcriptional analysis of murine primary calvaria osteoblast differentiation based on RNA sequencing (RNA-seq) analyses has not previously been reported. Here, we used RNA-seq to obtain expression values of 29,148 genes at four time points as murine primary calvaria osteoblasts differentiate in vitro until onset of mineralization was clearly detectable by microscopic inspection. Expression of marker genes confirmed osteogenic differentiation. We explored differential expression of 1386 protein-coding genes using unsupervised clustering and GO analyses. 100 differentially expressed lncRNAs were investigated by co-expression with protein-coding genes that are localized within the same topologically associated domain. Additionally, we monitored expression of 237 genes that are silent or active at distinct time points and compared differential exon usage. Our data represent an in-depth profiling of murine primary calvaria osteoblast differentiation by RNA-seq and contribute to our understanding of genetic regulation of this key process in osteoblast biology. Copyright © 2018 Elsevier Inc. All rights reserved.
Dastani, Zari; Ruel, Isabelle L; Engert, James C; Genest, Jacques; Marcil, Michel
2007-01-01
Background Niemann-Pick disease type A and B is caused by a deficiency of acid sphingomyelinase due to mutations in the sphingomyelin phosphodiesterase-1 (SMPD1) gene. In Niemann-Pick patients, SMPD1 gene defects are reported to be associated with a severe reduction in plasma high-density lipoprotein (HDL) cholesterol. Methods Two common coding polymorphisms in the SMPD1 gene, the G1522A (G508R) and a hexanucleotide repeat sequence within the signal peptide region, were investigated in 118 unrelated subjects of French Canadian descent with low plasma levels of HDL-cholesterol (< 5th percentile for age and gender-matched subjects). Control subjects (n = 230) had an HDL-cholesterol level > the 25th percentile. Results For G1522A the frequency of the G and A alleles were 75.2% and 24.8% respectively in controls, compared to 78.6% and 21.4% in subjects with low HDL-cholesterol (p = 0.317). The frequency of 6 and 7 hexanucleotide repeats was 46.2% and 46.6% respectively in controls, compared to 45.6% and 49.1% in subjects with low HDL-cholesterol (p = 0.619). Ten different haplotypes were observed in cases and controls. Overall haplotype frequencies in cases and controls were not significantly different. Conclusion These results suggest that the two common coding variants at the SMPD1 gene locus are not associated with low HDL-cholesterol levels in the French Canadian population. PMID:18088425
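The G1522A allele comparison above can be checked with a short calculation. The sketch below is only an illustration: the abstract does not say which test produced p = 0.317, so a Pearson chi-square test without continuity correction is assumed, and the allele counts are reconstructed from the reported percentages on the assumption that every subject was genotyped at both alleles.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Assumption: two alleles per subject and complete genotyping, so the counts
# below are reconstructed from the percentages rather than taken from the paper.
controls_alleles = 2 * 230
cases_alleles = 2 * 118

controls_g = 0.752 * controls_alleles
controls_a = 0.248 * controls_alleles
cases_g = 0.786 * cases_alleles
cases_a = 0.214 * cases_alleles

chi2, p = chi_square_2x2(controls_g, controls_a, cases_g, cases_a)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under these assumptions the statistic comes out close to 1 and the p-value close to the reported one, which is consistent with a simple allele-count comparison having been used.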
Durso, Lisa M; Miller, Daniel N; Wienhold, Brian J
2012-01-01
There is concern that antibiotic resistance can potentially be transferred from animals to humans through the food chain. The relationship between specific antibiotic resistant bacteria and the genes they carry remains to be described. Few details are known about the ecology of antibiotic resistant genes and bacteria in food production systems, or how antibiotic resistance genes in food animals compare to antibiotic resistance genes in other ecosystems. Here we report the distribution of antibiotic resistant genes in publicly available agricultural and non-agricultural metagenomic samples and identify which bacteria are likely to be carrying those genes. Antibiotic resistance, as coded for in the genes used in this study, is a process that was associated with all natural, agricultural, and human-impacted ecosystems examined, with between 0.7 and 4.4% of all classified genes in each habitat coding for resistance to antibiotic and toxic compounds (RATC). Agricultural, human, and coastal-marine metagenomes have characteristic distributions of antibiotic resistance genes, and different bacteria that carry the genes. There is a larger percentage of the total genome associated with antibiotic resistance in gastrointestinal-associated and agricultural metagenomes compared to marine and Antarctic samples. Since antibiotic resistance genes are a natural part of both human-impacted and pristine habitats, presence of these resistance genes in any specific habitat is therefore not sufficient to indicate or determine impact of anthropogenic antibiotic use. We recommend that baseline studies and control samples be taken in order to determine natural background levels of antibiotic resistant bacteria and/or antibiotic resistance genes when investigating the impacts of veterinary use of antibiotics on human health. We raise questions regarding whether the underlying biology of each type of bacteria contributes to the likelihood of transfer via the food chain.
Cady, Janet; Allred, Peggy; Bali, Taha; Pestronk, Alan; Goate, Alison; Miller, Timothy M; Mitra, Robi D; Ravits, John; Harms, Matthew B; Baloh, Robert H
2015-01-01
To define the genetic landscape of amyotrophic lateral sclerosis (ALS) and assess the contribution of possible oligogenic inheritance, we aimed to comprehensively sequence 17 known ALS genes in 391 ALS patients from the United States. Targeted pooled-sample sequencing was used to identify variants in 17 ALS genes. Fragment size analysis was used to define ATXN2 and C9ORF72 expansion sizes. Genotype-phenotype correlations were made with individual variants and total burden of variants. Rare variant associations for risk of ALS were investigated at both the single variant and gene level. A total of 64.3% of familial and 27.8% of sporadic subjects carried potentially pathogenic novel or rare coding variants identified by sequencing or an expanded repeat in C9ORF72 or ATXN2; 3.8% of subjects had variants in >1 ALS gene, and these individuals had disease onset 10 years earlier (p = 0.0046) than subjects with variants in a single gene. The number of potentially pathogenic coding variants did not influence disease duration or site of onset. Rare and potentially pathogenic variants in known ALS genes are present in >25% of apparently sporadic and 64% of familial patients, significantly higher than previous reports using less comprehensive sequencing approaches. A significant number of subjects carried variants in >1 gene, which influenced the age of symptom onset and supports oligogenic inheritance as relevant to disease pathogenesis. © 2014 American Neurological Association.
Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok
2014-01-01
Background Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome. PMID:24647560
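The quadripartite structure reported above implies a simple consistency check: the large and small single-copy regions plus two copies of the inverted repeat should add up to the full plastome length, and the three gene classes should add up to the total gene count. A minimal sketch using only the figures given in the abstract (the variable names are illustrative):

```python
# Region lengths in bp, as reported in the abstract.
LSC = 79_881   # large single-copy region
SSC = 12_519   # small single-copy region
IR = 21_481    # one inverted repeat; the plastome carries two copies
REPORTED_TOTAL = 135_362

total = LSC + SSC + 2 * IR
ir_fraction = 2 * IR / total

# Unique gene counts, as reported in the abstract.
protein_coding, trna, rrna = 81, 29, 4
gene_total = protein_coding + trna + rrna

print(f"LSC + SSC + 2*IR = {total} bp (matches reported total: {total == REPORTED_TOTAL})")
print(f"share of the plastome in inverted repeats: {ir_fraction:.1%}")
print(f"protein-coding + tRNA + rRNA genes = {gene_total}")
```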
Li, Wan; Zhu, Lina; Huang, Hao; He, Yuehan; Lv, Junjie; Li, Weimin; Chen, Lina; He, Weiming
2017-10-01
Complex chronic diseases are caused by the effects of genetic and environmental factors. Single nucleotide polymorphisms (SNPs), a common type of genetic variation, play vital roles in disease. We hypothesized that disease risk functional SNPs in coding regions and protein interaction network modules were more likely to contribute to the identification of disease susceptible genes for complex chronic diseases. This could help to further reveal the pathogenesis of complex chronic diseases. Disease risk SNPs were first recognized from public SNP data for coronary heart disease (CHD), hypertension (HT) and type 2 diabetes (T2D). SNPs in coding regions that were classified into nonsense and missense by integrating several SNP functional annotation databases were treated as functional SNPs. Then, regions significantly associated with each disease were screened using random permutations for disease risk functional SNPs. Corresponding to these regions, 155, 169 and 173 potential disease susceptible genes were identified for CHD, HT and T2D, respectively. A disease-related gene product interaction network in environmental context was constructed for interacting gene products of both disease genes and potential disease susceptible genes for these diseases. After functional enrichment analysis for disease associated modules, 5 CHD susceptible genes, 7 HT susceptible genes and 3 T2D susceptible genes were finally identified, some of which had pleiotropic effects. Most of these genes were verified to be related to these diseases in literature. This was similar for disease genes identified from another method proposed by Lee et al. from a different aspect. This research could provide novel perspectives for diagnosis and treatment of complex chronic diseases and susceptible genes identification for other diseases. Copyright © 2017 Elsevier Inc. All rights reserved.
Ojala, Teija; Laine, Pia K S; Ahlroos, Terhi; Tanskanen, Jarna; Pitkänen, Saara; Salusjärvi, Tuomas; Kankainen, Matti; Tynkkynen, Soile; Paulin, Lars; Auvinen, Petri
2017-01-16
Propionibacterium freudenreichii is a commercially important bacterium that is essential for the development of the characteristic eyes and flavor of Swiss-type cheeses. These bacteria grow actively and produce large quantities of flavor compounds during cheese ripening at warm temperatures but also appear to contribute to the aroma development during the subsequent cold storage of cheese. Here, we advance our understanding of the role of P. freudenreichii in cheese ripening by presenting the 2.68-Mbp annotated genome sequence of P. freudenreichii ssp. shermanii JS and determining its global transcriptional profiles during industrial cheese-making using transcriptome sequencing. The annotation of the genome identified a total of 2377 protein-coding genes and revealed the presence of enzymes and pathways for formation of several flavor compounds. Based on transcriptome profiling, the expression of 348 protein-coding genes was altered between the warm and cold room ripening of cheese. Several propionate, acetate, and diacetyl/acetoin production related genes had higher expression levels in the warm room, whereas a general slowing down of the metabolism and an activation of mobile genetic elements was seen in the cold room. A few ripening-related and amino acid catabolism involved genes were induced or remained active in cold room, indicating that strain JS contributes to the aroma development also during cold room ripening. In addition, we performed a comparative genomic analysis of strain JS and 29 other Propionibacterium strains of 10 different species, including an isolate of both P. freudenreichii subspecies freudenreichii and shermanii. Ortholog grouping of the predicted protein sequences revealed that close to 86% of the ortholog groups of strain JS, including a variety of ripening-related ortholog groups, were conserved across the P. freudenreichii isolates. Taken together, this study contributes to the understanding of the genomic basis of P. freudenreichii and sheds light on its activities during cheese ripening. Copyright © 2016 Elsevier B.V. All rights reserved.
Colonization of heterochromatic genes by transposable elements in Drosophila.
Dimitri, Patrizio; Junakovic, Nikolaj; Arcà, Bruno
2003-04-01
As a further step toward understanding transposable element-host genome interactions, we investigated the molecular anatomy of introns from five heterochromatic and 22 euchromatic protein-coding genes of Drosophila melanogaster. A total of 79 kb of intronic sequences from heterochromatic genes and 355 kb of intronic sequences from euchromatic genes have been used in Blast searches against Drosophila transposable elements (TEs). The results show that TE-homologous sequences belonging to 19 different families represent about 50% of intronic DNA from heterochromatic genes. In contrast, only 0.1% of the euchromatic intron DNA exhibits homology to known TEs. Intraspecific and interspecific size polymorphisms of introns were found, which are likely to be associated with changes in TE-related sequences. Together, the enrichment in TEs and the apparent dynamic state of heterochromatic introns suggest that TEs contribute significantly to the evolution of genes located in heterochromatin.
MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity
Wang, Yupeng; Tang, Haibao; DeBarry, Jeremy D.; Tan, Xu; Li, Jingping; Wang, Xiyin; Lee, Tae-ho; Jin, Huizhe; Marler, Barry; Guo, Hui; Kissinger, Jessica C.; Paterson, Andrew H.
2012-01-01
MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/. PMID:22217600
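The idea of aligning regions "using genes as anchors" can be illustrated with a toy example. The sketch below is not the MCScan or MCScanX algorithm and does not use its scoring scheme; it is a deliberately simplified dynamic-programming chain over hypothetical anchor pairs, where each pair gives the gene-order index of a homologous gene in two genomes. The function name, the `max_gap` parameter, and the sample data are all assumptions made for illustration, and only same-orientation collinearity is considered.

```python
def longest_collinear_chain(anchors, max_gap=5):
    """Return the longest chain of anchors that increases in both genomes.

    anchors: list of (pos_a, pos_b) gene-order indices of homologous gene pairs.
    max_gap: largest allowed step between consecutive anchors in either genome.
    """
    anchors = sorted(anchors)
    n = len(anchors)
    if n == 0:
        return []
    best_len = [1] * n   # length of the best chain ending at each anchor
    prev = [-1] * n      # back-pointer used to rebuild that chain
    for j in range(n):
        aj, bj = anchors[j]
        for i in range(j):
            ai, bi = anchors[i]
            # Extend only if both coordinates advance and neither gap is too large.
            if 0 < aj - ai <= max_gap and 0 < bj - bi <= max_gap:
                if best_len[i] + 1 > best_len[j]:
                    best_len[j] = best_len[i] + 1
                    prev[j] = i
    end = max(range(n), key=best_len.__getitem__)
    chain = []
    while end != -1:
        chain.append(anchors[end])
        end = prev[end]
    return chain[::-1]


# Hypothetical anchors: five collinear pairs plus two unrelated hits.
example_anchors = [(1, 10), (2, 11), (3, 12), (4, 40), (5, 13), (9, 2), (6, 14)]
block = longest_collinear_chain(example_anchors)
print(block)
```

Real synteny tools additionally score matches and penalize gaps, handle inverted blocks, and report every block above a significance threshold rather than a single best chain.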
De Novo Coding Variants Are Strongly Associated with Tourette Disorder.
Willsey, A Jeremy; Fernandez, Thomas V; Yu, Dongmei; King, Robert A; Dietrich, Andrea; Xing, Jinchuan; Sanders, Stephan J; Mandell, Jeffrey D; Huang, Alden Y; Richer, Petra; Smith, Louw; Dong, Shan; Samocha, Kaitlin E; Neale, Benjamin M; Coppola, Giovanni; Mathews, Carol A; Tischfield, Jay A; Scharf, Jeremiah M; State, Matthew W; Heiman, Gary A
2017-05-03
Whole-exome sequencing (WES) and de novo variant detection have proven a powerful approach to gene discovery in complex neurodevelopmental disorders. We have completed WES of 325 Tourette disorder trios from the Tourette International Collaborative Genetics cohort and a replication sample of 186 trios from the Tourette Syndrome Association International Consortium on Genetics (511 total). We observe strong and consistent evidence for the contribution of de novo likely gene-disrupting (LGD) variants (rate ratio [RR] 2.32, p = 0.002). Additionally, de novo damaging variants (LGD and probably damaging missense) are overrepresented in probands (RR 1.37, p = 0.003). We identify four likely risk genes with multiple de novo damaging variants in unrelated probands: WWC1 (WW and C2 domain containing 1), CELSR3 (Cadherin EGF LAG seven-pass G-type receptor 3), NIPBL (Nipped-B-like), and FN1 (fibronectin 1). Overall, we estimate that de novo damaging variants in approximately 400 genes contribute risk in 12% of clinical cases. Copyright © 2017 Elsevier Inc. All rights reserved.
Beamer, B A; Negri, C; Yen, C J; Gavrilova, O; Rumberger, J M; Durcan, M J; Yarnall, D P; Hawkins, A L; Griffin, C A; Burns, D K; Roth, J; Reitman, M; Shuldiner, A R
1997-04-28
We determined the chromosomal localization and partial genomic structure of the coding region of the human PPAR gamma gene (hPPAR gamma), a nuclear receptor important for adipocyte differentiation and function. Sequence analysis and long PCR of human genomic DNA with primers that span putative introns revealed that intron positions and sizes of hPPAR gamma are similar to those previously determined for the mouse PPAR gamma gene[13]. Fluorescent in situ hybridization localized hPPAR gamma to chromosome 3, band 3p25. Radiation hybrid mapping with two independent primer pairs was consistent with hPPAR gamma being within 1.5 Mb of marker D3S1263 on 3p25-p24.2. These sequences of the intron/exon junctions of the 6 coding exons shared by hPPAR gamma 1 and hPPAR gamma 2 will facilitate screening for possible mutations. Furthermore, D3S1263 is a suitable polymorphic marker for linkage analysis to evaluate PPAR gamma's potential contribution to genetic susceptibility to obesity, lipoatrophy, insulin resistance, and diabetes.
HLA-F polymorphisms in a Euro-Brazilian population from Southern Brazil.
Manvailer, L F S; Wowk, P F; Mattar, S B; da Siva, J S; da Graça Bicalho, M; Roxo, V M M S
2014-12-01
HLA-F is a non-classical major histocompatibility complex (MHC) gene. It encodes class Ib MHC molecules with a restricted distribution and shows less nucleotide variation than MHC class Ia genes. Of the 22 alleles registered in the IMGT database, only four encode proteins that differ in their primary structure. To estimate genotype and allele frequencies, this study targeted known protein-coding regions of the HLA-F gene. Genotyping was performed by sequence-based typing (SBT). The sample comprised 199 unrelated Euro-Brazilian bone marrow donors from Southern Brazil, drawn from the Brazilian Bone Marrow Donor Registry (REDOME). About 1673 bp were analyzed. The most frequent allele was HLA-F*01:01 (87.19%), followed by HLA-F*01:03 (12.31%), HLA-F*01:02 (0.25%) and HLA-F*01:04 (0.25%). Significant linkage disequilibrium (LD) was verified between HLA-F and HLA class I and II alleles. This is the first study of HLA-F polymorphisms in a Euro-Brazilian population, and it contributes to the genetic characterization of Southern Brazil. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
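The reported allele frequencies can be related back to allele counts with a short calculation. The abstract gives percentages only, so the counts below are inferred (an assumption): 199 diploid donors carry 398 copies of HLA-F, and the counts shown are the integers that reproduce the published percentages when rounded to two decimals.

```python
DONORS = 199
CHROMOSOMES = 2 * DONORS  # diploid: two HLA-F alleles per donor

# Allele counts inferred from the reported percentages; the abstract itself
# does not list counts, so treat these as a reconstruction.
allele_counts = {
    "HLA-F*01:01": 347,
    "HLA-F*01:03": 49,
    "HLA-F*01:02": 1,
    "HLA-F*01:04": 1,
}

frequencies = {
    allele: round(100 * count / CHROMOSOMES, 2)
    for allele, count in allele_counts.items()
}

for allele, freq in frequencies.items():
    print(f"{allele}: {freq}%")
```

On this reading, the two rarest alleles would each have been observed on a single chromosome in the sample.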
USDA-ARS's Scientific Manuscript database
It is documented that some experimental and commercial lines of chickens develop spontaneous ALV-like tumors. It is also documented that MDV-2 or ALV-E or both escalate the incidence of such tumors. In one ADOL chicken line, known as rapid feathering susceptible (RFS), the observed tumor incidence w...
Olaitan, Abiola Olumuyiwa; Rolain, Jean-Marc
2016-08-01
Antibiotic resistance is an ancient biological mechanism in bacteria, although its proliferation in our contemporary world has been amplified through antimicrobial therapy. Recent studies conducted on ancient environmental and human samples have uncovered numerous antibiotic-resistant bacteria and resistance genes. The resistance genes that have been reported from the analysis of ancient bacterial DNA include genes coding for resistance to several classes of antibiotics, such as glycopeptides, β-lactams, tetracyclines, and macrolides. The investigation of the resistome of ancient bacteria is a recent and emerging field of research, and technological advancements such as next-generation sequencing will further contribute to its growth. It is hoped that the knowledge gained from this research will help us to better understand the evolution of antibiotic resistance genes and will also be used in drug design as a proactive measure against antibiotic resistance.
Deciphering the Code of the Cancer Genome: Mechanisms of Chromosome Rearrangement
Willis, Nicholas A.; Rass, Emilie; Scully, Ralph
2015-01-01
Chromosome rearrangement plays a causal role in tumorigenesis by contributing to the inactivation of tumor suppressor genes, the dysregulated expression or amplification of oncogenes and the generation of novel gene fusions. Chromosome breaks are important intermediates in this process. How, when and where these breaks arise and the specific mechanisms engaged in their repair strongly influence the resulting patterns of chromosome rearrangement. Here, we review recent progress in understanding how certain distinctive features of the cancer genome, including clustered mutagenesis, tandem segmental duplications, complex breakpoints, chromothripsis, chromoplexy and chromoanasynthesis may arise. PMID:26726318
Pervasive transcription: detecting functional RNAs in bacteria.
Lybecker, Meghan; Bilusic, Ivana; Raghavan, Rahul
2014-01-01
Pervasive, or genome-wide, transcription has been reported in all domains of life. In bacteria, most pervasive transcription occurs antisense to protein-coding transcripts, although recently a new class of pervasive RNAs was identified that originates from within annotated genes. Initially considered to be non-functional transcriptional noise, pervasive transcription is increasingly being recognized as important in regulating gene expression. The function of pervasive transcription is an extensively debated question in the field of transcriptomics and regulatory RNA biology. Here, we highlight the most recent contributions addressing the purpose of pervasive transcription in bacteria and discuss their implications.
2011-01-01
Background Acid stress impacts the persistence of lactobacilli in industrial sourdough fermentations, and in intestinal ecosystems. However, the contribution of glutamate to acid resistance in lactobacilli has not been demonstrated experimentally, and evidence for the contribution of acid resistance to the competitiveness of lactobacilli in sourdough is lacking. It was therefore the aim of this study to investigate the ecological role of glutamate decarboxylase in L. reuteri. Results A gene coding for a putative glutamate decarboxylase, gadB, was identified in the genome of L. reuteri 100-23. Different from the organization of genetic loci coding for glutamate decarboxylase in other lactic acid bacteria, gadB was located adjacent to a putative glutaminase gene, gls3. An isogenic deletion mutant, L. reuteri ∆gadB, was generated by a double crossover method. L. reuteri 100-23 but not L. reuteri ∆gadB converted glutamate to γ-aminobutyrate (GABA) in phosphate buffer (pH 2.5). In sourdough, both strains converted glutamine to glutamate but only L. reuteri 100-23 accumulated GABA. Glutamate addition to phosphate buffer, pH 2.5, improved survival of L. reuteri 100-23 100-fold. However, survival of L. reuteri ∆gadB remained essentially unchanged. The disruption of gadB did not affect growth of L. reuteri in mMRS or in sourdough. However, the wild type strain L. reuteri 100-23 displaced L. reuteri ∆gadB after 5 cycles of fermentation in back-slopped sourdough fermentations. Conclusions The conversion of glutamate to GABA by L. reuteri 100-23 contributes to acid resistance and to competitiveness in industrial sourdough fermentations. The organization of the gene cluster for glutamate conversion, and the availability of amino acids in cereals imply that glutamine rather than glutamate functions as the substrate for GABA formation. The exceptional coupling of glutamine deamidation to glutamate decarboxylation in L. reuteri likely reflects adaptation to cereal substrates. PMID:21995488
SMARCB1/INI1 germline mutations contribute to 10% of sporadic schwannomatosis.
Rousseau, Guillaume; Noguchi, Tetsuro; Bourdon, Violaine; Sobol, Hagay; Olschwang, Sylviane
2011-01-24
Schwannomatosis is a disease characterized by multiple non-vestibular schwannomas. Although biallelic NF2 mutations are found in schwannomas, no germ line event is detected in schwannomatosis patients. In contrast, germline mutations of the SMARCB1 (INI1) tumor suppressor gene were described in familial and sporadic schwannomatosis patients. To delineate the SMARCB1 gene contribution, the nine coding exons were sequenced in a series of 56 patients affected with a variable number of non-vestibular schwannomas. Nine variants scattered along the sequence of SMARCB1 were identified. Five of them were classified as deleterious. All five patients carrying a SMARCB1 mutation had more multiple schwannomas, corresponding to 10.2% of patients with schwannomatosis. They were also diagnosed before 35 years of age. These results suggest that patients with schwannomas have a significant probability of carrying a SMARCB1 mutation. Combined with data available from other studies, they confirm the clinical indications for genetic screening of the SMARCB1 gene. PMID:21255467
Implication of common and disease specific variants in CLU, CR1, and PICALM.
Ferrari, Raffaele; Moreno, Jorge H; Minhajuddin, Abu T; O'Bryant, Sid E; Reisch, Joan S; Barber, Robert C; Momeni, Parastoo
2012-08-01
Two recent genome-wide association studies (GWAS) for late onset Alzheimer's disease (LOAD) revealed 3 new genes: clusterin (CLU), phosphatidylinositol binding clathrin assembly protein (PICALM), and complement receptor 1 (CR1). In order to evaluate association with these genome-wide association study-identified genes and to isolate the variants contributing to the pathogenesis of LOAD, we genotyped the top single nucleotide polymorphisms (SNPs), rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), and sequenced the entire coding regions of these genes in our cohort of 342 LOAD patients and 277 control subjects. We confirmed the association of rs3851179 (PICALM) (p = 7.4 × 10^-3) with the disease status. Through sequencing we identified 18 variants in CLU, 3 of which were found exclusively in patients; 8 variants (out of 65) in CR1 gene were only found in patients and the 16 variants identified in PICALM gene were present in both patients and controls. In silico analysis of the variants in PICALM did not predict any damaging effect on the protein. The haplotype analysis of the variants in each gene predicted a common haplotype when the 3 single nucleotide polymorphisms rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), respectively, were included. For each gene the haplotype structure and size differed between patients and controls. In conclusion, we confirmed association of CLU, CR1, and PICALM genes with the disease status in our cohort through identification of a number of disease-specific variants among patients through the sequencing of the coding region of these genes. Published by Elsevier Inc.
2010-01-01
Background Two types of horns are evident in cattle - fixed horns attached to the skull and a variation called scurs, which refers to small loosely attached horns. Cattle lacking horns are referred to as polled. Although both the Poll and Scurs loci have been mapped to BTA1 and 19 respectively, the underlying genetic basis of these phenotypes is unknown, and so far, no candidate genes regulating these developmental processes have been described. This study is the first reported attempt at transcript profiling to identify genes and pathways contributing to horn and scurs development in Brahman cattle, relative to polled counterparts. Results Expression patterns in polled, horned and scurs tissues were obtained using the Agilent 44 k bovine array. The most notable feature when comparing transcriptional profiles of developing horn tissues against polled was the down regulation of genes coding for elements of the cadherin junction as well as those involved in epidermal development. We hypothesize this as a key event involved in keratinocyte migration and subsequent horn development. In the polled-scurs comparison, the most prevalent differentially expressed transcripts code for genes involved in extracellular matrix remodelling, which were up regulated in scurs tissues relative to polled. Conclusion For the first time we describe networks of genes involved in horn and scurs development. Interestingly, we did not observe differential expression in any of the genes present on the fine mapped region of BTA1 known to contain the Poll locus. PMID:20537189
Oliveira, Marisa; Lert-itthiporn, Worachart; Cavadas, Bruno; Fernandes, Verónica; Chuansumrit, Ampaiwan; Anunciação, Orlando; Casademont, Isabelle; Koeth, Fanny; Penova, Marina; Tangnararatchakit, Kanchana; Khor, Chiea Chuen; Paul, Richard; Malasit, Prida; Matsuda, Fumihiko; Simon-Lorière, Etienne; Suriyaphol, Prapat; Sakuntabhai, Anavaj
2018-01-01
Ethnic diversity has long been considered one of the factors explaining why the severe forms of dengue are more prevalent in Southeast Asia than anywhere else. Here we take advantage of the admixed profile of Southeast Asians to perform coupled association-admixture analyses in Thai cohorts. For dengue shock syndrome (DSS), the significant haplotypes are located in genes coding for phospholipase C members (PLCB4 added to previously reported PLCE1), related to inflammation of blood vessels. For dengue fever (DF), we found evidence of significant association with CHST10, AHRR, PPP2R5E and GRIP1 genes, which participate in the xenobiotic metabolism signaling pathway. We conducted functional analyses for PPP2R5E, revealing by immunofluorescence imaging that the coded protein co-localizes with both DENV1 and DENV2 NS5 proteins. Interestingly, only DENV2-NS5 migrated to the nucleus, and a deletion of the predicted top-linking motif in NS5 abolished the nuclear transfer. These observations support the existence of differences between serotypes in their cellular dynamics, which may contribute to differential infection outcome risk. The contribution of the identified genes to the genetic risk renders Southeast and Northeast Asian populations more susceptible to both phenotypes, while African populations are best protected against DSS and intermediately protected against DF, and Europeans are the best protected against DF but the most susceptible to DSS. PMID:29447178
Oliveira, Marisa; Lert-Itthiporn, Worachart; Cavadas, Bruno; Fernandes, Verónica; Chuansumrit, Ampaiwan; Anunciação, Orlando; Casademont, Isabelle; Koeth, Fanny; Penova, Marina; Tangnararatchakit, Kanchana; Khor, Chiea Chuen; Paul, Richard; Malasit, Prida; Matsuda, Fumihiko; Simon-Lorière, Etienne; Suriyaphol, Prapat; Pereira, Luisa; Sakuntabhai, Anavaj
2018-02-01
Ethnic diversity has long been considered one of the factors explaining why the severe forms of dengue are more prevalent in Southeast Asia than anywhere else. Here we take advantage of the admixed profile of Southeast Asians to perform coupled association-admixture analyses in Thai cohorts. For dengue shock syndrome (DSS), the significant haplotypes are located in genes coding for phospholipase C members (PLCB4 added to previously reported PLCE1), related to inflammation of blood vessels. For dengue fever (DF), we found evidence of significant association with CHST10, AHRR, PPP2R5E and GRIP1 genes, which participate in the xenobiotic metabolism signaling pathway. We conducted functional analyses for PPP2R5E, revealing by immunofluorescence imaging that the coded protein co-localizes with both DENV1 and DENV2 NS5 proteins. Interestingly, only DENV2-NS5 migrated to the nucleus, and a deletion of the predicted top-linking motif in NS5 abolished the nuclear transfer. These observations support the existence of differences between serotypes in their cellular dynamics, which may contribute to differential infection outcome risk. The contribution of the identified genes to the genetic risk renders Southeast and Northeast Asian populations more susceptible to both phenotypes, while African populations are best protected against DSS and intermediately protected against DF, and Europeans are the best protected against DF but the most susceptible to DSS.
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-10-03
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-01-01
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274
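The abstract above does not state the exact formula behind the 45-element entropy sub-vector. As a rough sketch of one plausible reading, the snippet below turns per-substructure disease-gene counts into per-element Shannon entropy terms. The function name `entropy_terms` and the toy counts are illustrative assumptions, not the authors' implementation.

```python
import math

def entropy_terms(counts):
    """Per-substructure Shannon entropy terms, -p * log2(p).

    `counts` holds the number of disease genes falling in each chromosome
    substructure (hypothetical input; the paper's exact encoding is not
    given in the abstract).
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    terms = []
    for c in counts:
        p = c / total
        # Empty substructures contribute nothing to the entropy.
        terms.append(-p * math.log2(p) if p > 0 else 0.0)
    return terms

# Toy disease: 8 genes spread over 3 of 45 substructures (made-up numbers).
counts = [4, 2, 2] + [0] * 42
vector = entropy_terms(counts)
print(len(vector))   # 45 elements, one per substructure
print(sum(vector))   # 1.5 bits of total entropy for this toy distribution
```

A disease whose genes cluster in a few substructures gives a sparse vector with low total entropy, which is the kind of signal the abstract describes using to tell diseases apart.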
The Long Noncoding RNA Landscape of the Mouse Eye.
Chen, Weiwei; Yang, Shuai; Zhou, Zhonglou; Zhao, Xiaoting; Zhong, Jiayun; Reinach, Peter S; Yan, Dongsheng
2017-12-01
Long noncoding RNAs (lncRNAs) are important regulators of diverse biological functions. However, an extensive in-depth analysis of their expression profile and function in mammalian eyes is still lacking. Here we describe comprehensive landscapes of stage-dependent and tissue-specific lncRNA expression in the mouse eye. Affymetrix transcriptome array profiled lncRNA signatures from six different ocular tissue subsets (i.e., cornea, lens, retina, RPE, choroid, and sclera) in newborn and 8-week-old mice. Quantitative RT-PCR analysis validated array findings. Cis analyses and Gene Ontology (GO) annotation of protein-coding genes adjacent to signature lncRNA loci clarified potential lncRNA roles in maintaining tissue identity and regulating eye maturation during the aforementioned phase. In newborn and 8-week-old mice, we identified 47,332 protein-coding and noncoding gene transcripts. LncRNAs comprise 19,313 of these transcripts annotated in public data banks. During this maturation phase of these six different tissue subsets, the expression levels of more than 1000 lncRNAs underwent ≥2-fold changes. qRT-PCR analysis confirmed part of the gene microarray analysis results. K-means clustering identified 910 lncRNAs in the P0 groups and 686 lncRNAs in the postnatal 8-week-old groups, suggesting distinct tissue-specific lncRNA clusters. GO analysis of protein-coding genes proximal to lncRNA signatures resolved close correlations with their tissue-specific functional maturation between P0 and 8 weeks of age in the 6 tissue subsets. Characterizing maturational changes in lncRNA expression patterns as well as tissue-specific lncRNA signatures in six ocular tissues suggests important contributions made by lncRNA to the control of developmental processes in the mouse eye.
The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.
Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin
2013-01-01
Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
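The figures in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes the usual quadripartite layout (one LSC, one SSC, two IR copies); the helper name `quadripartite_length` is illustrative.

```python
# Region lengths (bp) as reported in the abstract.
LSC = 82_695  # large single-copy region
SSC = 17_555  # small single-copy region
IR = 25_539   # one inverted repeat; the genome carries a pair

def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular genome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir

total = quadripartite_length(LSC, SSC, IR)
print(total)  # 151328, matching the reported genome length

# Unique genes: protein-coding + tRNA + rRNA counts from the abstract.
unique_genes = 80 + 30 + 4
print(unique_genes)  # 114
```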
Knight, Helen M.; Pickard, Benjamin S.; Maclean, Alan; Malloy, Mary P.; Soares, Dinesh C.; McRae, Allan F.; Condie, Alison; White, Angela; Hawkins, William; McGhee, Kevin; van Beck, Margaret; MacIntyre, Donald J.; Starr, John M.; Deary, Ian J.; Visscher, Peter M.; Porteous, David J.; Cannon, Ronald E.; St Clair, David; Muir, Walter J.; Blackwood, Douglas H.R.
2009-01-01
Schizophrenia and bipolar disorder are leading causes of morbidity across all populations, with heritability estimates of ∼80% indicating a substantial genetic component. Population genetics and genome-wide association studies suggest an overlap of genetic risk factors between these illnesses but it is unclear how this genetic component is divided between common gene polymorphisms, rare genomic copy number variants, and rare gene sequence mutations. We report evidence that the lipid transporter gene ABCA13 is a susceptibility factor for both schizophrenia and bipolar disorder. After the initial discovery of its disruption by a chromosome abnormality in a person with schizophrenia, we resequenced ABCA13 exons in 100 cases with schizophrenia and 100 controls. Multiple rare coding variants were identified including one nonsense and nine missense mutations and compound heterozygosity/homozygosity in six cases. Variants were genotyped in additional schizophrenia, bipolar, depression (n > 1600), and control (n > 950) cohorts and the frequency of all rare variants combined was greater than controls in schizophrenia (OR = 1.93, p = 0.0057) and bipolar disorder (OR = 2.71, p = 0.00007). The population attributable risk of these mutations was 2.2% for schizophrenia and 4.0% for bipolar disorder. In a study of 21 families of mutation carriers, we genotyped affected and unaffected relatives and found significant linkage (LOD = 4.3) of rare variants with a phenotype including schizophrenia, bipolar disorder, and major depression. These data identify a candidate gene, highlight the genetic overlap between schizophrenia, bipolar disorder, and depression, and suggest that rare coding variants may contribute significantly to risk of these disorders. PMID:19944402
Raghavan, Avanthi; Neeli, Hemanth; Jin, Weijun; Badellino, Karen O.; Demissie, Serkalem; Manning, Alisa K.; DerOhannessian, Stephanie L.; Wolfe, Megan L.; Cupples, L. Adrienne; Li, Mingyao; Kathiresan, Sekar; Rader, Daniel J.
2011-01-01
Genome-wide association studies (GWAS) have successfully identified loci associated with quantitative traits, such as blood lipids. Deep resequencing studies are being utilized to catalogue the allelic spectrum at GWAS loci. The goal of these studies is to identify causative variants and missing heritability, including heritability due to low frequency and rare alleles with large phenotypic impact. Whereas rare variant efforts have primarily focused on nonsynonymous coding variants, we hypothesized that noncoding variants in these loci are also functionally important. Using the HDL-C gene LIPG as an example, we explored the effect of regulatory variants identified through resequencing of subjects at HDL-C extremes on gene expression, protein levels, and phenotype. Resequencing a portion of the LIPG promoter and 5′ UTR in human subjects with extreme HDL-C, we identified several rare variants in individuals from both extremes. Luciferase reporter assays were used to measure the effect of these rare variants on LIPG expression. Variants conferring opposing effects on gene expression were enriched in opposite extremes of the phenotypic distribution. Minor alleles of a common regulatory haplotype and noncoding GWAS SNPs were associated with reduced plasma levels of the LIPG gene product endothelial lipase (EL), consistent with its role in HDL-C catabolism. Additionally, we found that a common nonfunctional coding variant associated with HDL-C (rs2000813) is in linkage disequilibrium with a 5′ UTR variant (rs34474737) that decreases LIPG promoter activity. We attribute the gene regulatory role of rs34474737 to the observed association of the coding variant with plasma EL levels and HDL-C. Taken together, the findings show that both rare and common noncoding regulatory variants are important contributors to the allelic spectrum in complex trait loci. PMID:22174694
2014-01-01
Background In flowering plants a number of genes have been identified which control the transition from a vegetative to generative phase of life cycle. In bryophytes representing basal lineage of land plants, there is little data regarding the mechanisms that control this transition. Two species from bryophytes - moss Physcomitrella patens and liverwort Marchantia polymorpha are under advanced molecular and genetic research. The goal of our study was to identify genes connected to female gametophyte development and archegonia production in the dioecious liverwort Pellia endiviifolia species B, which is representative of the most basal lineage of the simple thalloid liverworts. Results The utility of the RDA-cDNA technique allowed us to identify three genes specifically expressed in the female individuals of P. endiviifolia: PenB_CYSP coding for cysteine protease, PenB_MT2 and PenB_MT3 coding for Mysterious Transcripts1 and 2 containing ORFs of 143 and 177 amino acid residues in length, respectively. The exon-intron structure of all three genes has been characterized and pre-mRNA processing was investigated. Interestingly, five mRNA isoforms are produced from the PenB_MT2 gene, which result from alternative splicing within the second and third exon. All observed splicing events take place within the 5′UTR and do not interfere with the coding sequence. All three genes are exclusively expressed in the female individuals, regardless of whether they were cultured in vitro or were collected from a natural habitat. Moreover, we observed a ten-fold increase in transcript levels for all three genes in the archegonial tissue in comparison to the vegetative parts of the same female thalli grown in natural habitat, suggesting their connection to archegonia development. Conclusions We have identified three genes which are specifically expressed in P. endiviifolia sp B female gametophytes. Moreover, their expression is connected to the female sex-organ differentiation and is developmentally regulated. The contribution of the identified genes may be crucial for successful liverwort sexual reproduction. PMID:24939387
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
Satellite DNA Modulates Gene Expression in the Beetle Tribolium castaneum after Heat Stress
Feliciello, Isidoro; Akrap, Ivana; Ugarković, Đurđica
2015-01-01
Non-coding repetitive DNAs have been proposed to perform a gene regulatory role, however for tandemly repeated satellite DNA no such role was defined until now. Here we provide the first evidence for a role of satellite DNA in the modulation of gene expression under specific environmental conditions. The major satellite DNA TCAST1 in the beetle Tribolium castaneum is preferentially located within pericentromeric heterochromatin but is also dispersed as single repeats or short arrays in the vicinity of protein-coding genes within euchromatin. Our results show enhanced suppression of activity of TCAST1-associated genes and slower recovery of their activity after long-term heat stress relative to the same genes without associated TCAST1 satellite DNA elements. The level of gene suppression is not influenced by the distance of TCAST1 elements from the associated genes up to 40 kb from the genes’ transcription start sites, but it does depend on the copy number of TCAST1 repeats within an element, being stronger for the higher number of copies. The enhanced gene suppression correlates with the enrichment of the repressive histone marks H3K9me2/3 at dispersed TCAST1 elements and their flanking regions as well as with increased expression of TCAST1 satellite DNA. The results reveal transient, RNAi based heterochromatin formation at dispersed TCAST1 repeats and their proximal regions as a mechanism responsible for enhanced silencing of TCAST1-associated genes. Differences in the pattern of distribution of TCAST1 elements contribute to gene expression diversity among T. castaneum strains after long-term heat stress and might have an impact on adaptation to different environmental conditions. PMID:26275223
A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.
Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor
2017-08-30
Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.
Decoding Mechanisms by which Silent Codon Changes Influence Protein Biogenesis and Function
Bali, Vedrana; Bebok, Zsuzsanna
2015-01-01
Scope Synonymous codon usage has been a focus of investigation since the discovery of the genetic code and its redundancy. The occurrences of synonymous codons vary between species and within genes of the same genome, known as codon usage bias. Today, bioinformatics and experimental data allow us to compose a global view of the mechanisms by which the redundancy of the genetic code contributes to the complexity of biological systems from affecting survival in prokaryotes, to fine tuning the structure and function of proteins in higher eukaryotes. Studies analyzing the consequences of synonymous codon changes in different organisms have revealed that they impact nucleic acid stability, protein levels, structure and function without altering amino acid sequence. As such, synonymous mutations inevitably contribute to the pathogenesis of complex human diseases. Yet, fundamental questions remain unresolved regarding the impact of silent mutations in human disorders. In the present review we describe developments in this area concentrating on mechanisms by which synonymous mutations may affect protein function and human health. Purpose This synopsis illustrates the significance of synonymous mutations in disease pathogenesis. We review the different steps of gene expression affected by silent mutations, and assess the benefits and possible harmful effects of codon optimization applied in the development of therapeutic biologics. Physiological and medical relevance Understanding mechanisms by which synonymous mutations contribute to complex diseases such as cancer, neurodegeneration and genetic disorders, including the limitations of codon-optimized biologics, provides insight concerning interpretation of silent variants and future molecular therapies. PMID:25817479
EGASP: the human ENCODE Genome Annotation Assessment Project
Guigó, Roderic; Flicek, Paul; Abril, Josep F; Reymond, Alexandre; Lagarde, Julien; Denoeud, France; Antonarakis, Stylianos; Ashburner, Michael; Bajic, Vladimir B; Birney, Ewan; Castelo, Robert; Eyras, Eduardo; Ucla, Catherine; Gingeras, Thomas R; Harrow, Jennifer; Hubbard, Tim; Lewis, Suzanna E; Reese, Martin G
2006-01-01
Background We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusion This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence. PMID:16925836
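The abstract above reports nucleotide-level sensitivity and specificity without defining them. The sketch below assumes the convention common in gene-prediction benchmarks, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP), i.e. the fraction of predicted coding nucleotides that are annotated as coding. The exon coordinates are made up for illustration and are not EGASP data.

```python
def positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    return {p for start, end in exons for p in range(start, end + 1)}

def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) over coding nucleotides.

    Assumed convention: specificity = TP / (TP + FP).
    """
    tp = len(annotated & predicted)  # coding in both
    fn = len(annotated - predicted)  # annotated coding, missed by prediction
    fp = len(predicted - annotated)  # predicted coding, not annotated
    sensitivity = tp / (tp + fn) if annotated else 0.0
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity

# Hypothetical reference annotation and prediction for one gene.
annotated = positions([(100, 199), (300, 399)])   # 200 coding nt
predicted = positions([(100, 199), (310, 419)])   # 210 predicted nt

sn, sp = nucleotide_accuracy(annotated, predicted)
print(round(sn, 3), round(sp, 3))  # 0.95 0.905
```

Exon-level and transcript-level accuracy are stricter because a prediction only counts when whole features match, which is consistent with the lower transcript-level figures the abstract reports.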
The complete mitochondrial genome of Ambastaia sidthimunki (Cypriniformes: Cobitidae).
Yu, Peng; Wei, Min; Yang, Qichao; Yang, Yingming; Wan, Quan
2016-09-01
Ambastaia sidthimunki is a beautiful small-sized fish and it was categorized as Endangered B2ab (iii,v) in the IUCN Red List. In this study, we report the complete mitochondrial genome of A. sidthimunki. The mitochondrial genome sequence was a circular molecule 16,574 bp in length, and it contained 2 ribosomal RNA genes, 22 transfer RNA genes, 13 protein-coding genes, an L-strand replication origin (OL) and a control region (D-loop). The nucleotide composition of the entire mitogenome was 26.94% for C, 15.55% for G, 31.84% for A and 25.67% for T, with an AT content of 57.51%. This research contributes new molecular data for the conservation of this Endangered species.
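The reported AT content follows directly from the base percentages in the abstract above. A minimal check, with a small illustrative helper (`at_content_percent` is a hypothetical name) for computing the same quantity from a raw sequence:

```python
# Base composition (%) as reported in the abstract.
composition = {"A": 31.84, "T": 25.67, "C": 26.94, "G": 15.55}

at = composition["A"] + composition["T"]
gc = composition["C"] + composition["G"]
print(round(at, 2))  # 57.51, matching the reported AT content
print(round(gc, 2))  # 42.49

def at_content_percent(seq):
    """AT content of a DNA sequence as a percentage of A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(b) for b in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / counted

print(at_content_percent("ATGC"))  # 50.0 for this toy sequence
```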
Ma, Xiaoyin; Ma, Zhiwei; Jiao, Xiaodong; Hejtmancik, J Fielding
2017-08-30
To identify possible genetic variants influencing expression of EPHA2 (Ephrin-receptor Type-A2), a tyrosine kinase receptor that has been shown to be important for lens development and to contribute to both congenital and age related cataract when mutated, the extended promoter region of EPHA2 was screened for variants. SNP rs6603883 lies in a PAX2 binding site in the EPHA2 promoter region. The C (minor) allele decreased EPHA2 transcriptional activity relative to the T allele by reducing the binding affinity of PAX2. Knockdown of PAX2 in human lens epithelial (HLE) cells decreased endogenous expression of EPHA2. Whole RNA sequencing showed that extracellular matrix (ECM), MAPK-AKT signaling pathways and cytoskeleton related genes were dysregulated in EPHA2 knockdown HLE cells. Taken together, these results indicate a functional non-coding SNP in EPHA2 promoter affects PAX2 binding and reduces EPHA2 expression. They further suggest that decreasing EPHA2 levels alters MAPK, AKT signaling pathways and ECM and cytoskeletal genes in lens cells that could contribute to cataract. These results demonstrate a direct role for PAX2 in EPHA2 expression and help delineate the role of EPHA2 in development and homeostasis required for lens transparency.
APADB: a database for alternative polyadenylation and microRNA regulation events
Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn
2014-01-01
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703
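The mechanism described in the abstract above, where a shortened 3′UTR escapes miRNA regulation, can be sketched as a simple coordinate comparison: any predicted miRNA binding site lying downstream of the polyadenylation site in use is absent from the transcript. The function name, site names and coordinates below are hypothetical and do not reflect APADB's actual schema or algorithms.

```python
def lost_mirna_sites(site_positions, pa_site):
    """Return predicted miRNA sites that lie beyond the used PA site.

    `site_positions` maps a site name to its offset (nt) from the start of
    the 3'UTR; `pa_site` is the offset of the polyadenylation site in use.
    """
    return {name: pos for name, pos in site_positions.items() if pos > pa_site}

# Made-up 3'UTR with three predicted miRNA binding sites.
sites = {"miR-A": 120, "miR-B": 480, "miR-C": 950}

distal_pa = 1000    # full-length 3'UTR
proximal_pa = 400   # shortened variant

print(sorted(lost_mirna_sites(sites, distal_pa)))    # []
print(sorted(lost_mirna_sites(sites, proximal_pa)))  # ['miR-B', 'miR-C']
```

With the proximal site, two of the three predicted binding sites are lost, which illustrates how use of an upstream PA site can remove a transcript from miRNA-mediated control.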
Assembly of the Auditory Circuitry by a Hox Genetic Network in the Mouse Brainstem
Di Bonito, Maria; Narita, Yuichi; Avallone, Bice; Sequino, Luigi; Mancuso, Marta; Andolfi, Gennaro; Franzè, Anna Maria; Puelles, Luis; Rijli, Filippo M.; Studer, Michèle
2013-01-01
Rhombomeres (r) contribute to brainstem auditory nuclei during development. Hox genes are determinants of rhombomere-derived fate and neuronal connectivity. Little is known about the contribution of individual rhombomeres and their associated Hox codes to auditory sensorimotor circuitry. Here, we show that r4 contributes to functionally linked sensory and motor components, including the ventral nucleus of lateral lemniscus, posterior ventral cochlear nuclei (VCN), and motor olivocochlear neurons. Assembly of the r4-derived auditory components is involved in sound perception and depends on regulatory interactions between Hoxb1 and Hoxb2. Indeed, in Hoxb1 and Hoxb2 mutant mice the transmission of low-level auditory stimuli is lost, resulting in hearing impairments. On the other hand, Hoxa2 regulates the Rig1 axon guidance receptor and controls contralateral projections from the anterior VCN to the medial nucleus of the trapezoid body, a circuit involved in sound localization. Thus, individual rhombomeres and their associated Hox codes control the assembly of distinct functionally segregated sub-circuits in the developing auditory brainstem. PMID:23408898
Differentially-Expressed Pseudogenes in HIV-1 Infection.
Gupta, Aditi; Brown, C Titus; Zheng, Yong-Hui; Adami, Christoph
2015-09-29
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit.
Jin, Qijiang; Hu, Xin; Li, Xin; Wang, Bei; Wang, Yanjie; Jiang, Hongwei; Mattson, Neil; Xu, Yingchun
2016-01-01
Trehalose-6-phosphate synthase (TPS) plays a key role in plant carbohydrate metabolism and the perception of carbohydrate availability. In the present work, the publicly available Nelumbo nucifera (lotus) genome sequence database was analyzed, which led to the identification of nine lotus TPS genes (NnTPS). At least two introns were found in the coding sequence of each NnTPS gene. Analysis of motif composition showed that the NnTPS genes generally share similar motifs, implying that they have similar functions. The dN/dS ratios were always less than 1 for the different domains and for regions outside the domains, suggesting purifying selection on the lotus TPS gene family. The regions outside the TPS domain evolved relatively faster than the NnTPS domains. A phylogenetic tree was constructed using all predicted coding sequences of lotus TPS genes, together with those from Arabidopsis, poplar, soybean, and rice. The result indicated that those TPS genes could be clearly divided into two main subfamilies (I-II), where each subfamily could be further divided into 2 (I) and 5 (II) subgroups. Analyses of divergence and adaptive evolution show that purifying selection may have been the main force driving evolution of plant TPS genes. Some of the critical sites that contributed to divergence may have been under positive selection. Transcriptome data analysis revealed that most NnTPS genes were predominantly expressed in sink tissues. The expression patterns of NnTPS genes under copper and submergence stress indicated that NNU_014679 and NNU_022788 might play important roles in lotus energy metabolism and participate in stress response. Our results can facilitate further functional studies of TPS genes in lotus. PMID:27746792
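The dN/dS reasoning in the abstract above can be illustrated with a minimal sketch. It assumes that dN and dS have already been estimated elsewhere; the function name `selection_regime`, the `tolerance` band around 1, and the example values are hypothetical and are not taken from the study.

```python
def selection_regime(dn: float, ds: float, tolerance: float = 0.05) -> str:
    """Interpret a dN/dS ratio (omega) under the usual convention.

    dn: nonsynonymous substitutions per nonsynonymous site
    ds: synonymous substitutions per synonymous site
    tolerance: hypothetical band around 1 that is treated as neutral
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    omega = dn / ds
    if omega < 1 - tolerance:
        return "purifying"   # amino-acid changes are removed by selection
    if omega > 1 + tolerance:
        return "positive"    # amino-acid changes are favoured
    return "neutral"


# Hypothetical values, chosen only to show a ratio below 1.
print(selection_regime(dn=0.02, ds=0.20))
```

A fixed tolerance is a simplification; a real analysis would test whether the ratio differs significantly from 1 rather than compare it with a cut-off.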
Co-expression analysis and identification of fecundity-related long non-coding RNAs in sheep ovaries
Miao, Xiangyang; Luo, Qingmiao; Zhao, Huijing; Qin, Xiaoyu
2016-01-01
Small Tail Han sheep, including the FecBBFecBB (Han BB) and FecB+ FecB+ (Han++) genotypes, and Dorset sheep exhibit different fecundities. To identify novel long non-coding RNAs (lncRNAs) associated with sheep fecundity to better understand their molecular mechanisms, a genome-wide analysis of mRNAs and lncRNAs from Han BB, Han++ and Dorset sheep was performed. After the identification of differentially expressed mRNAs and lncRNAs, 16 significant modules were explored by using weighted gene coexpression network analysis (WGCNA) followed by functional enrichment analysis of the genes and lncRNAs in significant modules. Among these selected modules, the yellow and brown modules were significantly related to sheep fecundity. lncRNAs (e.g., NR0B1, XLOC_041882, and MYH15) in the yellow module were mainly involved in the TGF-β signalling pathway, and NYAP1 and BCORL1 were significantly associated with the oxytocin signalling pathway, which regulates several genes in the coexpression network of the brown module. Overall, we identified several gene modules associated with sheep fecundity, as well as networks consisting of hub genes and lncRNAs that may contribute to sheep prolificacy by regulating the target mRNAs related to the TGF-β and oxytocin signalling pathways. This study provides an alternative strategy for the identification of potential candidate regulatory lncRNAs. PMID:27982099
Network perturbation by recurrent regulatory variants in cancer
Cho, Ara; Lee, Insuk; Choi, Jung Kyoon
2017-01-01
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements
Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.
2008-01-01
X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625
Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui; ...
2014-10-02
Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.
Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred
2014-11-20
Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With the help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes, with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.
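The split of the 122 ncRNA genes into antisense and intergenic classes can be pictured with a small sketch. This is not the authors' pipeline: the `Gene` record, the `classify_tss` function and the coordinates are hypothetical, and leader sequences are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    start: int   # 1-based inclusive coordinates (hypothetical)
    end: int
    strand: str  # "+" or "-"


def classify_tss(pos: int, strand: str, genes: list[Gene]) -> str:
    """Label a transcription start site relative to annotated coding genes."""
    for gene in genes:
        if gene.start <= pos <= gene.end:
            # Inside a gene: the opposite strand marks an antisense candidate.
            return "sense-internal" if gene.strand == strand else "antisense"
    return "intergenic"


# Hypothetical annotation with a single forward-strand gene.
annotation = [Gene(start=100, end=500, strand="+")]
print(classify_tss(300, "-", annotation))  # TSS on the opposite strand
print(classify_tss(700, "-", annotation))  # TSS outside any gene

# Counts reported in the abstract: 96 asRNA genes plus 26 others.
assert 96 + 26 == 122
```

A linear scan over genes is enough for illustration; a genome-scale run would more likely use an interval index.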
Gene-Auto: Automatic Software Code Generation for Real-Time Embedded Systems
NASA Astrophysics Data System (ADS)
Rugina, A.-E.; Thomas, D.; Olive, X.; Veran, G.
2008-08-01
This paper gives an overview of the Gene-Auto ITEA European project, which aims at building a qualified C code generator from mathematical models under Matlab-Simulink and Scilab-Scicos. The project is driven by major European industry partners, active in the real-time embedded systems domains. The Gene-Auto code generator will significantly improve the current development processes in such domains by shortening the time to market and by guaranteeing the quality of the generated code through the use of formal methods. The first version of the Gene-Auto code generator has already been released and has gone through a validation phase on real-life case studies defined by each project partner. The validation results are taken into account in the implementation of the second version of the code generator. The partners aim at introducing the Gene-Auto results into industrial development by 2010.
Target gene analyses of 39 amelogenesis imperfecta kindreds
Chan, Hui-Chen; Estrella, Ninna M. R. P.; Milkovich, Rachel N.; Kim, Jung-Wook; Simmer, James P.; Hu, Jan C-C.
2012-01-01
Previously, mutational analyses identified six disease-causing mutations in 24 amelogenesis imperfecta (AI) kindreds. We have since expanded the number of AI kindreds to 39, and performed mutation analyses covering the coding exons and adjoining intron sequences for the six proven AI candidate genes [amelogenin (AMELX), enamelin (ENAM), family with sequence similarity 83, member H (FAM83H), WD repeat containing domain 72 (WDR72), enamelysin (MMP20), and kallikrein-related peptidase 4 (KLK4)] and for ameloblastin (AMBN) (a suspected candidate gene). All four of the X-linked AI families (100%) had disease-causing mutations in AMELX, suggesting that AMELX is the only gene involved in the aetiology of X-linked AI. Eighteen families showed an autosomal-dominant pattern of inheritance. Disease-causing mutations were identified in 12 (67%): eight in FAM83H, and four in ENAM. No FAM83H coding-region or splice-junction mutations were identified in three probands with autosomal-dominant hypocalcification AI (ADHCAI), suggesting that a second gene may contribute to the aetiology of ADHCAI. Six families showed an autosomal-recessive pattern of inheritance, and disease-causing mutations were identified in three (50%): two in MMP20, and one in WDR72. No disease-causing mutations were found in 11 families with only one affected member. We conclude that mutation analyses of the current candidate genes for AI have about a 50% chance of identifying the disease-causing mutation in a given kindred. PMID:22243262
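The percentages quoted in the abstract above, and the closing estimate of "about a 50% chance", follow from the kindred counts. The short sketch below reproduces that arithmetic; the counts come from the abstract, while the dictionary layout and variable names are illustrative only.

```python
# (kindreds with an identified mutation, kindreds screened), per the abstract
kindreds = {
    "X-linked": (4, 4),
    "autosomal-dominant": (12, 18),
    "autosomal-recessive": (3, 6),
    "single affected member": (0, 11),
}

for pattern, (solved, screened) in kindreds.items():
    print(f"{pattern}: {solved}/{screened} = {100 * solved / screened:.0f}%")

total_solved = sum(solved for solved, _ in kindreds.values())
total_screened = sum(screened for _, screened in kindreds.values())
overall = total_solved / total_screened
print(f"overall: {total_solved}/{total_screened} = {100 * overall:.0f}%")
```

The overall yield is 19 of 39 kindreds, which is the basis for the rounded figure of about 50%.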
Origin and evolution of spliceosomal introns
2012-01-01
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’, held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section. PMID:22507701
The microRNAs involved in human myeloid differentiation and myelogenous/myeloblastic leukemia
Wang, Xiao-Shuang; Zhang, Jun-Wu
2008-01-01
MicroRNAs (miRNAs) are endogenously expressed, functional RNAs that interact with native coding mRNAs to cleave mRNA or repress translation. Several miRNAs contribute to normal haematopoietic processes and some miRNAs act both as tumour suppressors and oncogenes in the pathology of haematological malignancies. While most effort is engaged in identifying and investigating the target genes of miRNAs, miRNA gene promoter methylation or transcriptional regulation is another important field of investigation, since these two main mechanisms can form a regulatory circuit. This review focuses on recent research on miRNAs with important roles in myeloid cells. PMID:18554315
Yang, Hai-Ling; Liu, Yan-Jing; Wang, Cai-Ling; Zeng, Qing-Yin
2012-01-01
Trehalose-6-phosphate synthase (TPS) plays important roles in trehalose metabolism and signaling. Plant TPS proteins contain both a TPS and a trehalose-6-phosphate phosphatase (TPP) domain, which are coded by a multi-gene family. The plant TPS gene family has been divided into class I and class II. A previous study showed that the Populus, Arabidopsis, and rice genomes have seven class I and 27 class II TPS genes. In this study, we found that all class I TPS genes had 16 introns within the protein-coding region, whereas class II TPS genes had two introns. A significant sequence difference between the two classes of TPS proteins was observed by pairwise sequence comparisons of the 34 TPS proteins. A phylogenetic analysis revealed that at least seven TPS genes were present in the monocot–dicot common ancestor. Segmental duplications contributed significantly to the expansion of this gene family. At least five and three TPS genes were created by segmental duplication events in the Populus and rice genomes, respectively. Both the TPS and TPP domains of 34 TPS genes have evolved under purifying selection, but the selective constraint on the TPP domain was more relaxed than that on the TPS domain. Among 34 TPS genes from Populus, Arabidopsis, and rice, four class I TPS genes (AtTPS1, OsTPS1, PtTPS1, and PtTPS2) were under stronger purifying selection, whereas three Arabidopsis class I TPS genes (AtTPS2, 3, and 4) apparently evolved under relaxed selective constraint. Additionally, a reverse transcription polymerase chain reaction analysis showed the expression divergence of the TPS gene family in Populus, Arabidopsis, and rice under normal growth conditions and in response to stressors. Our findings provide new insights into the mechanisms of gene family expansion and functional evolution. PMID:22905132
Fritsche, Lars G.; Igl, Wilmar; Cooke Bailey, Jessica N.; Grassmann, Felix; Sengupta, Sebanti; Bragg-Gresham, Jennifer L.; Burdon, Kathryn P.; Hebbring, Scott J.; Wen, Cindy; Gorski, Mathias; Kim, Ivana K.; Cho, David; Zack, Donald; Souied, Eric; Scholl, Hendrik P. N.; Bala, Elisa; Lee, Kristine E.; Hunter, David J.; Sardell, Rebecca J.; Mitchell, Paul; Merriam, Joanna E.; Cipriani, Valentina; Hoffman, Joshua D.; Schick, Tina; Lechanteur, Yara T. E.; Guymer, Robyn H.; Johnson, Matthew P.; Jiang, Yingda; Stanton, Chloe M.; Buitendijk, Gabriëlle H. S.; Zhan, Xiaowei; Kwong, Alan M.; Boleda, Alexis; Brooks, Matthew; Gieser, Linn; Ratnapriya, Rinki; Branham, Kari E.; Foerster, Johanna R.; Heckenlively, John R.; Othman, Mohammad I.; Vote, Brendan J.; Liang, Helena Hai; Souzeau, Emmanuelle; McAllister, Ian L.; Isaacs, Timothy; Hall, Janette; Lake, Stewart; Mackey, David A.; Constable, Ian J.; Craig, Jamie E.; Kitchner, Terrie E.; Yang, Zhenglin; Su, Zhiguang; Luo, Hongrong; Chen, Daniel; Ouyang, Hong; Flagg, Ken; Lin, Danni; Mao, Guanping; Ferreyra, Henry; Stark, Klaus; von Strachwitz, Claudia N.; Wolf, Armin; Brandl, Caroline; Rudolph, Guenther; Olden, Matthias; Morrison, Margaux A.; Morgan, Denise J.; Schu, Matthew; Ahn, Jeeyun; Silvestri, Giuliana; Tsironi, Evangelia E.; Park, Kyu Hyung; Farrer, Lindsay A.; Orlin, Anton; Brucker, Alexander; Li, Mingyao; Curcio, Christine; Mohand-Saïd, Saddek; Sahel, José-Alain; Audo, Isabelle; Benchaboune, Mustapha; Cree, Angela J.; Rennie, Christina A.; Goverdhan, Srinivas V.; Grunin, Michelle; Hagbi-Levi, Shira; Campochiaro, Peter; Katsanis, Nicholas; Holz, Frank G.; Blond, Frédéric; Blanché, Hélène; Deleuze, Jean-François; Igo, Robert P.; Truitt, Barbara; Peachey, Neal S.; Meuer, Stacy M.; Myers, Chelsea E.; Moore, Emily L.; Klein, Ronald; Hauser, Michael A.; Postel, Eric A.; Courtenay, Monique D.; Schwartz, Stephen G.; Kovach, Jaclyn L.; Scott, William K.; Liew, Gerald; Tan, Ava G.; Gopinath, Bamini; Merriam, John C.; Smith, R. Theodore; Khan, Jane C.; Shahid, Humma; Moore, Anthony T.; McGrath, J. Allie; Laux, Reneé; Brantley, Milam A.; Agarwal, Anita; Ersoy, Lebriz; Caramoy, Albert; Langmann, Thomas; Saksens, Nicole T. M.; de Jong, Eiko K.; Hoyng, Carel B.; Cain, Melinda S.; Richardson, Andrea J.; Martin, Tammy M.; Blangero, John; Weeks, Daniel E.; Dhillon, Bal; van Duijn, Cornelia M.; Doheny, Kimberly F.; Romm, Jane; Klaver, Caroline C. W.; Hayward, Caroline; Gorin, Michael B.; Klein, Michael L.; Baird, Paul N.; den Hollander, Anneke I.; Fauser, Sascha; Yates, John R. W.; Allikmets, Rando; Wang, Jie Jin; Schaumberg, Debra A.; Klein, Barbara E. K.; Hagstrom, Stephanie A.; Chowers, Itay; Lotery, Andrew J.; Léveillard, Thierry; Zhang, Kang; Brilliant, Murray H.; Hewitt, Alex W.; Swaroop, Anand; Chew, Emily Y.; Pericak-Vance, Margaret A.; DeAngelis, Margaret; Stambolian, Dwight; Haines, Jonathan L.; Iyengar, Sudha K.; Weber, Bernhard H. F.; Abecasis, Gonçalo R.; Heid, Iris M.
2016-01-01
Advanced age-related macular degeneration (AMD) is the leading cause of blindness in the elderly, with limited therapeutic options. Here, we report on a study of >12 million variants, including 163,714 directly genotyped, mostly rare, protein-altering variants. Analyzing 16,144 patients and 17,832 controls, we identify 52 independently associated common and rare variants (P < 5 × 10⁻⁸) distributed across 34 loci. While wet and dry AMD subtypes exhibit predominantly shared genetics, we identify the first signal specific to wet AMD, near MMP9 (difference-P = 4.1 × 10⁻¹⁰). Very rare coding variants (frequency < 0.1%) in CFH, CFI, and TIMP3 suggest causal roles for these genes, as does a splice variant in SLC16A8. Our results support the hypothesis that rare coding variants can pinpoint causal genes within known genetic loci and illustrate that applying the approach systematically to detect new loci requires extremely large sample sizes. PMID:26691988
Liu, Yangyang; Han, Xiao; Yuan, Junting; Geng, Tuoyu; Chen, Shihao; Hu, Xuming; Cui, Isabelle H; Cui, Hengmi
2017-04-07
The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Heterogeneous conservation of Dlx paralog co-expression in jawed vertebrates.
Debiais-Thibaud, Mélanie; Metcalfe, Cushla J; Pollack, Jacob; Germon, Isabelle; Ekker, Marc; Depew, Michael; Laurenti, Patrick; Borday-Birraux, Véronique; Casane, Didier
2013-01-01
The Dlx gene family encodes transcription factors involved in the development of a wide variety of morphological innovations that first evolved at the origins of vertebrates or of the jawed vertebrates. This gene family expanded with the two rounds of genome duplications that occurred before jawed vertebrates diversified. It includes at least three bigene pairs sharing conserved regulatory sequences in tetrapods and teleost fish, but has been only partially characterized in chondrichthyans, the third major group of jawed vertebrates. Here we take advantage of developmental and molecular tools applied to the shark Scyliorhinus canicula to fill in the gap and provide an overview of the evolution of the Dlx family in the jawed vertebrates. These results are analyzed in the theoretical framework of the DDC (Duplication-Degeneration-Complementation) model. The genomic organisation of the catshark Dlx genes is similar to that previously described for tetrapods. Conserved non-coding elements identified in bony fish were also identified in catshark Dlx clusters and showed regulatory activity in transgenic zebrafish. Gene expression patterns in the catshark showed that there are some expression sites with high conservation of the expressed paralog(s) and other expression sites with events of paralog sub-functionalization during jawed vertebrate diversification, resulting in a wide variety of evolutionary scenarios within this gene family. Dlx gene expression patterns in the catshark show that there has been little neo-functionalization in Dlx genes over gnathostome evolution. In most cases, one tandem duplication and two rounds of vertebrate genome duplication have led to at least six Dlx coding sequences with redundant expression patterns followed by some instances of paralog sub-functionalization. 
Regulatory constraints such as shared enhancers, and functional constraints including gene pleiotropy, may have contributed to the evolutionary inertia leading to high redundancy between gene expression patterns.
Banerji, Julian
2015-01-01
The present treatment of childhood T-cell leukemias involves the systemic administration of prokaryotic L-asparaginase (ASNase), which depletes plasma asparagine (Asn) and inhibits protein synthesis. The mechanism of therapeutic action of ASNase is poorly understood, as are the etiologies of the side-effects incurred by treatment. Protein expression from genes bearing Asn homopolymeric coding regions (N-hCR) may be particularly susceptible to Asn level fluctuation. In mammals, N-hCR are rare, short and conserved. In humans, misfunctions of genes encoding N-hCR are associated with a cluster of disorders that mimic ASNase therapy side-effects, which include impaired glycemic control, dyslipidemia, pancreatitis, compromised vascular integrity, and neurological dysfunction. This paper proposes that dysregulation of Asn homeostasis, potentially even by ASNase produced by the microbiome, may contribute to several clinically important syndromes by altering expression of N-hCR bearing genes. By altering amino acid abundance and modulating ribosome translocation rates at codon repeats, the microbiomic environment may contribute to genome decoding and to shaping the proteome. We suggest that impaired translation at poly Asn codons elevates diabetes risk and severity. PMID:26178806
Epigenetics of sex determination and gonadogenesis.
Piferrer, Francesc
2013-04-01
Epigenetics is commonly defined as the study of heritable changes in gene function that cannot be explained by changes in DNA sequence. The three major epigenetic mechanisms for gene expression regulation are DNA methylation, histone modifications, and non-coding RNAs. Epigenetic mechanisms provide organisms with the ability to integrate genomic and environmental information to modify the activity of their genes for generating a particular phenotype. During development, cells differentiate, acquire, and maintain identity through changes in gene expression. This is crucial for sex determination and differentiation, which are among the most important developmental processes for the proper functioning and perpetuation of species. This review summarizes studies showing how epigenetic regulatory mechanisms contribute to sex determination and reproductive organ formation in plants, invertebrates, and vertebrates. Further progress will be made by integrating several approaches, including genomics and Next Generation Sequencing, to create epigenetic maps related to different aspects of sex determination and gonadogenesis. Epigenetics will also contribute to understanding the etiology of several disorders of sexual development. It also might play a significant role in the control of reproduction in animal farm production and will aid in recognizing the environmental versus genetic influences on sex determination of sensitive species in a global change scenario. Copyright © 2013 Wiley Periodicals, Inc.
Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora
Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio
2017-01-01
Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster more closely related to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, contributing to resolving the complex taxonomy of this species and filling the gap in its genetic characterization. PMID:29111566
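The region lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual quadripartite plastome layout, in which the 26,414 bp figure is the length of each of two inverted-repeat copies; the abstract does not state this explicitly, and the function name is illustrative only.

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 88_097   # large single-copy region
SSC = 18_587   # small single-copy region
IR = 26_414    # inverted repeat (assumed: length of EACH of two copies)
REPORTED_TOTAL = 159_512


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome with two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)

# Gene tally quoted in the abstract: protein-coding + tRNA + rRNA.
gene_total = 77 + 30 + 4

print("computed total:", total, "matches reported:", total == REPORTED_TOTAL)
print("gene total:", gene_total)
```

Under that assumption the regions add up to the reported genome size, and the three gene classes add up to the 111 single-copy genes.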
Functional Genomic Analysis of the let-7 Regulatory Network in Caenorhabditis elegans
Zisoulis, Dimitrios G.; Lovci, Michael T.; Melnik-Martinez, Katya V.; Yeo, Gene W.; Pasquinelli, Amy E.
2013-01-01
The let-7 microRNA (miRNA) regulates cellular differentiation across many animal species. Loss of let-7 activity causes abnormal development in Caenorhabditis elegans and unchecked cellular proliferation in human cells, which contributes to tumorigenesis. These defects are due to improper expression of protein-coding genes normally under let-7 regulation. While some direct targets of let-7 have been identified, the genome-wide effect of let-7 insufficiency in a developing animal has not been fully investigated. Here we report the results of molecular and genetic assays aimed at determining the global network of genes regulated by let-7 in C. elegans. By screening for mis-regulated genes that also contribute to let-7 mutant phenotypes, we derived a list of physiologically relevant potential targets of let-7 regulation. Twenty new suppressors of the rupturing vulva or extra seam cell division phenotypes characteristic of let-7 mutants emerged. Three of these genes, opt-2, prmt-1, and T27D12.1, were found to associate with Argonaute in a let-7–dependent manner and are likely novel direct targets of this miRNA. Overall, a complex network of genes with various activities is subject to let-7 regulation to coordinate developmental timing across tissues during worm development. PMID:23516374
Cancer prevention, the need to preserve the integrity of the genome at all cost.
Okafor, M T; Nwagha, T U; Anusiem, C; Okoli, U A; Nubila, N I; Al-Alloosh, F; Udenyia, I J
2018-05-01
The entire genetic information carried by an organism makes up its genome. Genes have diverse functions. They code for different proteins required for the normal proliferation of cells. However, changes in the base sequence of genes affect their protein products, which act as messengers for normal cellular functions such as proliferation and repair. Salient processes for maintaining the integrity of the genome hinge on intricate mechanisms that have evolved to tackle genomic stresses. Objective: To discuss how cells sense and repair damage to their deoxyribonucleic acid (DNA) and to highlight how defects in the genes involved in DNA repair contribute to cancer development. Methodology: Online searches of databases including Google Scholar, PubMed, BioMed Central, and SciELO were performed. Articles with keywords such as cancer, cell cycle, tumor suppressor genes, and DNA repair were reviewed. The cell cycle, tumor suppressor genes, and DNA repair mechanisms, as well as their contribution to cancer development, were discussed and reviewed. Knowledge of how cells detect and repair DNA damage through an array of mechanisms should allay our anxiety as regards cancer development. More studies on DNA damage detection and repair processes are important for a holistic approach to cancer treatment.
Yassin, Atteyet F; Langenberg, Stefan; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Mukherjee, Supratim; Reddy, T B K; Daum, Chris; Shapiro, Nicole; Ivanova, Natalia; Woyke, Tanja; Kyrpides, Nikos C
2017-01-01
The permanent draft genome sequence of Actinotignum schaalii DSM 15541T is presented. The annotated genome includes 2,130,987 bp, with 1777 protein-coding and 58 rRNA-coding genes. Genome sequence analysis revealed the absence of genes encoding components of the PTS systems and enzymes of the TCA cycle, glyoxylate shunt and gluconeogenesis. Genomic data revealed that A. schaalii is able to oxidize carbohydrates via glycolysis, the nonoxidative pentose phosphate and the Entner-Doudoroff pathways. In addition, the genome harbors genes encoding enzymes involved in the conversion of pyruvate to lactate, acetate and ethanol, which are found to be the end products of carbohydrate fermentation. The genome contained the gene encoding Type I fatty acid synthase required for de novo FAS biosynthesis. The plsY and plsX genes encoding the acyltransferases necessary for phosphatidic acid biosynthesis were absent from the genome. The genome harbors genes encoding enzymes responsible for isoprene biosynthesis via the mevalonate (MVA) pathway. Genes encoding enzymes that confer resistance to reactive oxygen species (ROS) were identified. In addition, A. schaalii harbors genes that protect the genome against viral infections. These include restriction-modification (RM) systems, type II toxin-antitoxin (TA), CRISPR-Cas and abortive infection systems. The A. schaalii genome also encodes several virulence factors that contribute to adhesion and internalization of this pathogen, such as the tad genes encoding proteins required for pili assembly, the nanI gene encoding exo-alpha-sialidase, genes encoding heat shock proteins and genes encoding the type VII secretion system. These features are consistent with anaerobic and pathogenic lifestyles. Finally, resistance to ciprofloxacin occurs by mutation in chromosomal genes that encode the subunits of DNA gyrase (GyrA) and topoisomerase IV (ParC), while resistance to metronidazole was due to the frxA gene, which encodes NADPH-flavin oxidoreductase.
Genome-scale analysis of positionally relocated genes
Bhutkar, Arjun; Russo, Susan M.; Smith, Temple F.; Gelbart, William M.
2007-01-01
During evolution, genome reorganization includes large-scale events such as inversions, translocations, and segmental or even whole-genome duplications, as well as fine-scale events such as the relocation of individual genes. This latter category, which we will refer to as positionally relocated genes (PRGs), is the subject of this report. Assessment of the magnitude of such PRGs and of possible contributing mechanisms is aided by a comparative analysis of related genomes, where conserved chromosomal organization can aid in identifying genes that have acquired a new location in a lineage of these genomes. Here we utilize two methods to comprehensively identify relocated protein-coding genes in the recently sequenced genomes of 12 species of genus Drosophila. We use exceptions to the general rule of maintenance of chromosome arm (Muller element) association for most Drosophila genes to identify one major class of PRGs. We also identify a partially overlapping set of PRGs among “embedded genes,” located within the extents of other surrounding genes. We provide evidence that PRG movements have at least two different origins: Some events occur via retrotransposition of processed RNAs and others via a DNA-based transposition mechanism. Overall, we identify several hundred PRGs that arose within a lineage of the genus Drosophila phylogeny and provide suggestive evidence that a few thousand such events have occurred within the radiation of the insect order Diptera, thereby illustrating the magnitude of the contribution of PRG movement to chromosomal reorganization during evolution. PMID:17989252
Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.
2013-01-01
The cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite their wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle from non-cell cycle genes. Our results show that both the trans-TF features and the cis-motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs and protein-coding genes are cell cycle regulated, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175
Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard
2014-09-01
To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.
Sims, Rebecca; van der Lee, Sven J.; Naj, Adam C.; Bellenguez, Céline; Badarinarayan, Nandini; Jakobsdottir, Johanna; Kunkle, Brian W.; Boland, Anne; Raybould, Rachel; Bis, Joshua C.; Martin, Eden R.; Grenier-Boley, Benjamin; Heilmann-Heimbach, Stefanie; Chouraki, Vincent; Kuzma, Amanda B.; Sleegers, Kristel; Vronskaya, Maria; Ruiz, Agustin; Graham, Robert R.; Olaso, Robert; Hoffmann, Per; Grove, Megan L.; Vardarajan, Badri N.; Hiltunen, Mikko; Nöthen, Markus M.; White, Charles C.; Hamilton-Nelson, Kara L.; Epelbaum, Jacques; Maier, Wolfgang; Choi, Seung-Hoan; Beecham, Gary W.; Dulary, Cécile; Herms, Stefan; Smith, Albert V.; Funk, Cory C.; Derbois, Céline; Forstner, Andreas J.; Ahmad, Shahzad; Li, Hongdong; Bacq, Delphine; Harold, Denise; Satizabal, Claudia L.; Valladares, Otto; Squassina, Alessio; Thomas, Rhodri; Brody, Jennifer A.; Qu, Liming; Sanchez-Juan, Pascual; Morgan, Taniesha; Wolters, Frank J.; Zhao, Yi; Garcia, Florentino Sanchez; Denning, Nicola; Fornage, Myriam; Malamon, John; Naranjo, Maria Candida Deniz; Majounie, Elisa; Mosley, Thomas H.; Dombroski, Beth; Wallon, David; Lupton, Michelle K; Dupuis, Josée; Whitehead, Patrice; Fratiglioni, Laura; Medway, Christopher; Jian, Xueqiu; Mukherjee, Shubhabrata; Keller, Lina; Brown, Kristelle; Lin, Honghuang; Cantwell, Laura B.; Panza, Francesco; McGuinness, Bernadette; Moreno-Grau, Sonia; Burgess, Jeremy D.; Solfrizzi, Vincenzo; Proitsi, Petra; Adams, Hieab H.; Allen, Mariet; Seripa, Davide; Pastor, Pau; Cupples, L. Adrienne; Price, Nathan D; Hannequin, Didier; Frank-García, Ana; Levy, Daniel; Chakrabarty, Paramita; Caffarra, Paolo; Giegling, Ina; Beiser, Alexa S.; Giedraitis, Vimantas; Hampel, Harald; Garcia, Melissa E.; Wang, Xue; Lannfelt, Lars; Mecocci, Patrizia; Eiriksdottir, Gudny; Crane, Paul K.; Pasquier, Florence; Boccardi, Virginia; Hernández, Isabel; Barber, Robert C.; Scherer, Martin; Tarraga, Lluis; Adams, Perrie M.; Leber, Markus; Chen, Yuning; Albert, Marilyn S.; Riedel-Heller, Steffi; Emilsson, Valur; Beekly, Duane; Braae, Anne; Schmidt, Reinhold; Blacker, Deborah; Masullo, Carlo; Schmidt, Helena; Doody, Rachelle S.; Spalletta, Gianfranco; Longstreth, WT; Fairchild, Thomas J.; Bossù, Paola; Lopez, Oscar L.; Frosch, Matthew P.; Sacchinelli, Eleonora; Ghetti, Bernardino; Sánchez-Juan, Pascual; Yang, Qiong; Huebinger, Ryan M.; Jessen, Frank; Li, Shuo; Kamboh, M. Ilyas; Morris, John; Sotolongo-Grau, Oscar; Katz, Mindy J.; Corcoran, Chris; Himali, Jayanadra J.; Keene, C. Dirk; Tschanz, JoAnn; Fitzpatrick, Annette L.; Kukull, Walter A.; Norton, Maria; Aspelund, Thor; Larson, Eric B.; Munger, Ron; Rotter, Jerome I.; Lipton, Richard B.; Bullido, María J; Hofman, Albert; Montine, Thomas J.; Coto, Eliecer; Boerwinkle, Eric; Petersen, Ronald C.; Alvarez, Victoria; Rivadeneira, Fernando; Reiman, Eric M.; Gallo, Maura; O’Donnell, Christopher J.; Reisch, Joan S.; Bruni, Amalia Cecilia; Royall, Donald R.; Dichgans, Martin; Sano, Mary; Galimberti, Daniela; St George-Hyslop, Peter; Scarpini, Elio; Tsuang, Debby W.; Mancuso, Michelangelo; Bonuccelli, Ubaldo; Winslow, Ashley R.; Daniele, Antonio; Wu, Chuang-Kuo; Peters, Oliver; Nacmias, Benedetta; Riemenschneider, Matthias; Heun, Reinhard; Brayne, Carol; Rubinsztein, David C; Bras, Jose; Guerreiro, Rita; Hardy, John; Al-Chalabi, Ammar; Shaw, Christopher E; Collinge, John; Mann, David; Tsolaki, Magda; Clarimón, Jordi; Sussams, Rebecca; Lovestone, Simon; O’Donovan, Michael C; Owen, Michael J; Behrens, Timothy W.; Mead, Simon; Goate, Alison M.; Uitterlinden, Andre G.; Holmes, Clive; Cruchaga, Carlos; Ingelsson, Martin; Bennett, David A.; Powell, John; Golde, Todd E.; Graff, Caroline; De Jager, Philip L.; Morgan, Kevin; Ertekin-Taner, Nilufer; Combarros, Onofre; Psaty, Bruce M.; Passmore, Peter; Younkin, Steven G; Berr, Claudine; Gudnason, Vilmundur; Rujescu, Dan; Dickson, Dennis W.; Dartigues, Jean-Francois; DeStefano, Anita L.; Ortega-Cubero, Sara; Hakonarson, Hakon; Campion, Dominique; Boada, Merce; Kauwe, John “Keoni”; Farrer, Lindsay A.; Van Broeckhoven, Christine; Ikram, M. Arfan; Jones, Lesley; Haines, Johnathan; Tzourio, Christophe; Launer, Lenore J.; Escott-Price, Valentina; Mayeux, Richard; Deleuze, Jean-François; Amin, Najaf; Holmans, Peter A; Pericak-Vance, Margaret A.; Amouyel, Philippe; van Duijn, Cornelia M.; Ramirez, Alfredo; Wang, Li-San; Lambert, Jean-Charles; Seshadri, Sudha; Williams, Julie; Schellenberg, Gerard D.
2017-01-01
We identified rare coding variants associated with Alzheimer’s disease (AD) in a 3-stage case-control study of 85,133 subjects. In stage 1, 34,174 samples were genotyped using a whole-exome microarray. In stage 2, we tested associated variants (P < 1×10^-4) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, an additional 14,997 samples were used to test the most significant stage 2 associations (P < 5×10^-8) using imputed genotypes. We observed 3 novel genome-wide significant (GWS) AD-associated non-synonymous variants: a protective variant in PLCG2 (rs72824905/p.P522R, P = 5.38×10^-10, OR = 0.68, MAFcases = 0.0059, MAFcontrols = 0.0093), a risk variant in ABI3 (rs616338/p.S209F, P = 4.56×10^-10, OR = 1.43, MAFcases = 0.011, MAFcontrols = 0.008), and a novel GWS variant in TREM2 (rs143332484/p.R62H, P = 1.55×10^-14, OR = 1.67, MAFcases = 0.0143, MAFcontrols = 0.0089), a known AD susceptibility gene. These protein-coding changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified AD risk genes. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to AD development. PMID:28714976
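The sample counts and effect directions in this abstract lend themselves to a quick sanity check. The following is a minimal sketch, not the authors' analysis: it sums the three stage sizes and computes a crude allelic odds ratio from the quoted minor allele frequencies. The crude value is only expected to approximate the reported OR, which presumably comes from covariate-adjusted, meta-analysed models; the function and variable names are illustrative assumptions.

```python
# Stage sample sizes quoted in the abstract.
stages = {"stage 1": 34_174, "stage 2": 35_962, "stage 3": 14_997}
total_subjects = sum(stages.values())


def crude_allelic_or(maf_cases: float, maf_controls: float) -> float:
    """Unadjusted allelic odds ratio computed from minor allele frequencies."""
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# (MAF in cases, MAF in controls, OR reported in the abstract)
variants = {
    "PLCG2 p.P522R": (0.0059, 0.0093, 0.68),
    "ABI3 p.S209F": (0.011, 0.008, 1.43),
    "TREM2 p.R62H": (0.0143, 0.0089, 1.67),
}

crude = {name: crude_allelic_or(mc, mk) for name, (mc, mk, _) in variants.items()}

print("total subjects:", total_subjects)
for name, (_, _, reported) in variants.items():
    print(f"{name}: crude OR = {crude[name]:.2f}, reported OR = {reported}")
```

The crude ratios fall on the same side of 1 as the reported values (below 1 for the protective PLCG2 variant, above 1 for ABI3 and TREM2), which is all this check is meant to confirm.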
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript
Rose, Dominic; Stadler, Peter F.
2011-01-01
Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny and analyze patterns of sequence conservation and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
Lipovich, Leonard; Hou, Zhuo-Cheng; Jia, Hui; Sinkler, Christopher; McGowen, Michael; Sterner, Kirstin N; Weckle, Amy; Sugalski, Amara B; Pipes, Lenore; Gatti, Domenico L; Mason, Christopher E; Sherwood, Chet C; Hof, Patrick R; Kuzawa, Christopher W; Grossman, Lawrence I; Goodman, Morris; Wildman, Derek E
2016-02-01
The human brain and human cognitive abilities are strikingly different from those of other great apes despite relatively modest genome sequence divergence. However, little is presently known about the interspecies divergence in gene structure and transcription that might contribute to these phenotypic differences. To date, most comparative studies of gene structure in the brain have examined humans, chimpanzees, and macaque monkeys. To add to this body of knowledge, we analyze here the brain transcriptome of the western lowland gorilla (Gorilla gorilla gorilla), an African great ape species that is phylogenetically closely related to humans, but with a brain that is approximately one-third the size. Manual transcriptome curation from a sample of the planum temporale region of the neocortex revealed 12 protein-coding genes and one noncoding-RNA gene with exons in the gorilla unmatched by public transcriptome data from the orthologous human loci. These interspecies gene structure differences accounted for a total of 134 amino acids in proteins found in the gorilla that were absent from protein products of the orthologous human genes. Proteins varying in structure between human and gorilla were involved in immunity and energy metabolism, suggesting their relevance to phenotypic differences. This gorilla neocortical transcriptome comprises an empirical, not homology- or prediction-driven, resource for orthologous gene comparisons between human and gorilla. These findings provide a unique repository of the sequences and structures of thousands of genes transcribed in the gorilla brain, pointing to candidate genes that may contribute to the traits distinguishing humans from other closely related great apes. © 2015 Wiley Periodicals, Inc.
Liu, Shiguo; Wang, Xueqin; Xu, Longqiang; Zheng, Lanlan; Ge, Yinlin; Ma, Xu
2015-02-01
To clarify the association of the monoamine oxidase A promoter variable number of tandem repeats polymorphism (MAOA-pVNTR) with susceptibility to Tourette's syndrome (TS) in the Chinese Han population, we examined the genetic contribution of MAOA-pVNTR in 141 TS patients and their parents using a transmission disequilibrium test (TDT) design. No significant association was found between the MAOA gene promoter VNTR polymorphism and TS in the Chinese Han population (TDT = 1.515, df = 1, p > 0.05). The negative result may be mainly due to the small sample size, and we do not rule out a role for genes encoding serotonergic or monoaminergic structures in the etiology of TS.
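The reported test statistic can be turned into an approximate p-value to see how far it sits from the 0.05 threshold. The sketch below assumes the reported TDT value is a chi-square statistic with 1 degree of freedom, as the abstract's "df = 1" suggests, and uses the standard biallelic (McNemar-type) TDT form; the transmission counts in the example call are hypothetical, since the abstract does not give them.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Biallelic TDT statistic from heterozygous-parent transmission counts."""
    b, c = transmitted, untransmitted
    return (b - c) ** 2 / (b + c)


# Statistic quoted in the abstract.
p_value = chi2_sf_df1(1.515)
print(f"approximate p-value for TDT = 1.515, df = 1: {p_value:.3f}")

# Hypothetical counts, for illustration only (not from the study).
example_stat = tdt_statistic(30, 20)
print(f"example statistic for 30 vs 20 transmissions: {example_stat:.1f}")
```

The resulting p-value is well above 0.05, consistent with the abstract's conclusion of no significant association.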
Identification of new mutations in primary hyperoxaluria type 1 (PH1).
von Schnakenburg, C; Rumsby, G
1998-01-01
Primary hyperoxaluria type 1 (PH1) is caused by deficiency of the hepatic peroxisomal enzyme alanine:glyoxylate aminotransferase (AGT). The AGXT gene, which codes for the 392 amino acid protein, has been mapped to chromosome 2q37.3. In order to identify new mutations in the AGXT gene we studied 79 PH1 patients using single strand conformation polymorphism analysis. In addition to a cluster of new mutations in exon 7 we report five novel mutations in exons 2, 4, 5, 9 and 10. These are T444C, G640A, G690A, 1008-1010delGCG and G1171A. These five new mutations contribute to our knowledge of the AGXT gene. Their possible consequences for PH1 phenotype and enzyme activity are discussed.
The complete mitochondrial genome sequence of Neovison vison (Carnivora: Mustelidae).
Sun, Wei-Li; Wang, Shao-Jing; Wang, Zhuo; Liu, Han-Lu; Zhong, Wei; Yang, Ya-Han; Li, Guang-Yu
2016-05-01
The phylogenetic and taxonomic position of the American mink Neovison vison has long been unclear. In this paper, the complete mitogenome of N. vison was sequenced and characterized. The total length was 16,594 bp, and the mitogenome consists of the typical 37 genes (13 protein-coding genes, 2 rRNAs and 22 tRNAs), plus a large control region (CR) and a light-strand replication origin (OL). Gene contents, locations, and arrangements were identical to those of typical vertebrates. The overall base composition is 33.6%, 25.4%, 27.8% and 13.3% for A, C, T and G, respectively, with a moderate bias toward AT content (61.4%). This result is expected to provide useful molecular data and contribute to further taxonomic and phylogenetic studies of Mustelidae and Carnivora.
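The composition figures above can be checked directly. This is a small sketch using only the percentages and gene counts quoted in the abstract; the variable names are illustrative, and the four percentages are rounded values, so their sum need not be exactly 100.

```python
# Base composition (percent) quoted in the abstract.
composition = {"A": 33.6, "C": 25.4, "T": 27.8, "G": 13.3}

at_content = round(composition["A"] + composition["T"], 1)
gc_content = round(composition["G"] + composition["C"], 1)
total_percent = round(sum(composition.values()), 1)

# Gene tally quoted in the abstract: protein-coding + rRNA + tRNA.
gene_total = 13 + 2 + 22

print("AT content:", at_content)
print("GC content:", gc_content)
print("sum of rounded percentages:", total_percent)
print("gene total:", gene_total)
```

The A and T percentages reproduce the stated 61.4% AT content, and the gene classes add up to the 37 genes typical of vertebrate mitogenomes; the slight excess over 100% in the sum reflects rounding in the published figures.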
Mahajan, Anubha; Sim, Xueling; Ng, Hui Jin; Manning, Alisa; Rivas, Manuel A.; Highland, Heather M.; Locke, Adam E.; Grarup, Niels; Im, Hae Kyung; Cingolani, Pablo; Flannick, Jason; Fontanillas, Pierre; Fuchsberger, Christian; Gaulton, Kyle J.; Teslovich, Tanya M.; Rayner, N. William; Robertson, Neil R.; Beer, Nicola L.; Rundle, Jana K.; Bork-Jensen, Jette; Ladenvall, Claes; Blancher, Christine; Buck, David; Buck, Gemma; Burtt, Noël P.; Gabriel, Stacey; Gjesing, Anette P.; Groves, Christopher J.; Hollensted, Mette; Huyghe, Jeroen R.; Jackson, Anne U.; Jun, Goo; Justesen, Johanne Marie; Mangino, Massimo; Murphy, Jacquelyn; Neville, Matt; Onofrio, Robert; Small, Kerrin S.; Stringham, Heather M.; Syvänen, Ann-Christine; Trakalo, Joseph; Abecasis, Goncalo; Bell, Graeme I.; Blangero, John; Cox, Nancy J.; Duggirala, Ravindranath; Hanis, Craig L.; Seielstad, Mark; Wilson, James G.; Christensen, Cramer; Brandslund, Ivan; Rauramaa, Rainer; Surdulescu, Gabriela L.; Doney, Alex S. F.; Lannfelt, Lars; Linneberg, Allan; Isomaa, Bo; Tuomi, Tiinamaija; Jørgensen, Marit E.; Jørgensen, Torben; Kuusisto, Johanna; Uusitupa, Matti; Salomaa, Veikko; Spector, Timothy D.; Morris, Andrew D.; Palmer, Colin N. A.; Collins, Francis S.; Mohlke, Karen L.; Bergman, Richard N.; Ingelsson, Erik; Lind, Lars; Tuomilehto, Jaakko; Hansen, Torben; Watanabe, Richard M.; Prokopenko, Inga; Dupuis, Josee; Karpe, Fredrik; Groop, Leif; Laakso, Markku; Pedersen, Oluf; Florez, Jose C.; Morris, Andrew P.; Altshuler, David; Meigs, James B.; Boehnke, Michael; McCarthy, Mark I.; Lindgren, Cecilia M.; Gloyn, Anna L.
2015-01-01
Genome wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P < 5×10^-7) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF) = 1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights. PMID:25625282
Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.
2014-01-01
Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal and breast tumor tissue; half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Yersinia pestis and Yersinia pseudotuberculosis infection: a regulatory RNA perspective
Martínez-Chavarría, Luary C.; Vadyvaloo, Viveka
2015-01-01
Yersinia pestis, responsible for causing fulminant plague, has evolved clonally from the enteric pathogen, Y. pseudotuberculosis, which in contrast, causes a relatively benign enteric illness. An ~97% nucleotide identity over 75% of their shared protein coding genes is maintained between these two pathogens, leaving much conjecture regarding the molecular determinants responsible for producing these vastly different disease etiologies, host preferences and transmission routes. One idea is that coordinated production of distinct factors required for host adaptation and virulence in response to specific environmental cues could contribute to the distinct pathogenicity distinguishing these two species. Small non-coding RNAs that direct posttranscriptional regulation have recently been identified as key molecules that may provide such timeous expression of appropriate disease enabling factors. Here the burgeoning field of small non-coding regulatory RNAs in Yersinia pathogenesis is reviewed from the viewpoint of adaptive colonization, virulence and divergent evolution of these pathogens. PMID:26441890
Fan, SiGang; Hu, ChaoQun; Wen, Jing; Zhang, LvPing
2011-05-01
The complete mitochondrial DNA sequence contains useful information for phylogenetic analyses of metazoa. In this study, the complete mitochondrial DNA sequence of sea cucumber Stichopus horrens (Holothuroidea: Stichopodidae: Stichopus) is presented. The complete sequence was determined using normal and long PCRs. The mitochondrial genome of Stichopus horrens is a circular molecule 16,257 bp long, composed of 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes. Most of these genes are coded on the heavy strand except for one protein-coding gene (nad6) and five tRNA genes (tRNA-Ser(UCN), tRNA-Gln, tRNA-Ala, tRNA-Val, tRNA-Asp) which are coded on the light strand. The composition of the heavy strand is 30.8% A, 23.7% C, 16.2% G, and 29.3% T bases (AT skew = 0.025; GC skew = -0.188). A non-coding region of 675 bp was identified as a putative control region because of its location and AT richness. The intergenic spacers range from 1 to 50 bp in size, totaling 227 bp. A total of 25 overlapping nucleotides, ranging from 1 to 10 bp in size, exist among 11 genes. All 13 protein-coding genes are initiated with an ATG. The TAA codon is used as the stop codon in all the protein-coding genes except nad3 and nad4, which use TAG as their termination codon. The most frequently used amino acids are Leu (16.29%), Ser (10.34%) and Phe (8.37%). All of the tRNA genes have the potential to fold into typical cloverleaf secondary structures. We also compared the order of the genes in the mitochondrial DNA from the five holothurians that are now available and found a novel gene arrangement in the mitochondrial DNA of Stichopus horrens.
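The skew values quoted in the abstract above can be reproduced from the reported heavy-strand base composition. The following is a minimal sketch, assuming the conventional definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the abstract does not state which formula the authors used, and the names `skew` and `composition` are illustrative only.

```python
def skew(x: float, y: float) -> float:
    """Return the strand skew (x - y) / (x + y) for two base frequencies."""
    return (x - y) / (x + y)


# Heavy-strand base composition (percent) as reported in the abstract.
composition = {"A": 30.8, "C": 23.7, "G": 16.2, "T": 29.3}

# Assumed conventional definitions; not confirmed by the source.
at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"AT skew = {at_skew:.3f}")
print(f"GC skew = {gc_skew:.3f}")
```

Under these assumed definitions the rounded results agree with the reported values (0.025 and -0.188), indicating a slight excess of A over T and a marked excess of C over G on the heavy strand.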
Decoding the role of regulatory element polymorphisms in complex disease.
Vockley, Christopher M; Barrera, Alejandro; Reddy, Timothy E
2017-04-01
Genetic variation in gene regulatory elements contributes to diverse human diseases, ranging from rare and severe developmental defects to common and complex diseases such as obesity and diabetes. Early examples of regulatory mechanisms of human diseases involve large chromosomal rearrangements that change the regulatory connections within the genome. Single nucleotide variants in regulatory elements can also contribute to disease, potentially via demonstrated associations with changes in transcription factor binding, enhancer activity, post-translational histone modifications, long-range enhancer-promoter interactions, or RNA polymerase recruitment. Establishing causality between non-coding genetic variants, gene regulation, and disease has recently become more feasible with advances in genome-editing and epigenome-editing technologies. As establishing causal regulatory mechanisms of diseases becomes routine, functional annotation of target genes is likely to emerge as a major bottleneck for translation into patient benefits. In this review, we discuss the history and recent advances in understanding the regulatory mechanisms of human disease, and new challenges likely to be encountered once establishing those mechanisms becomes rote. Copyright © 2016 Elsevier Ltd. All rights reserved.
Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard
2014-01-01
Objective: The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods: Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein-coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results: Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions: We provide for the first time evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098
Chocu, Sophie; Evrard, Bertrand; Lavigne, Régis; Rolland, Antoine D; Aubry, Florence; Jégou, Bernard; Chalmel, Frédéric; Pineau, Charles
2014-11-01
Spermatogenesis is a complex process, dependent upon the successive activation and/or repression of thousands of gene products, and ends with the production of haploid male gametes. RNA sequencing of male germ cells in the rat identified thousands of novel testicular unannotated transcripts (TUTs). Although such RNAs are usually annotated as long noncoding RNAs (lncRNAs), it is possible that some of these TUTs code for protein. To test this possibility, we used a "proteomics informed by transcriptomics" (PIT) strategy combining RNA sequencing data with shotgun proteomics analyses of spermatocytes and spermatids in the rat. Among 3559 TUTs and 506 lncRNAs found in meiotic and postmeiotic germ cells, 44 encoded at least one peptide. We showed that these novel high-confidence protein-coding loci exhibit several genomic features intermediate between those of lncRNAs and mRNAs. We experimentally validated the testicular expression pattern of two of these novel protein-coding gene candidates, both highly conserved in mammals: one for a vesicle-associated membrane protein we named VAMP-9, and the other for an enolase domain-containing protein. This study confirms the potential of PIT approaches for the discovery of protein-coding transcripts initially thought to be untranslated or unknown transcripts. Our results contribute to the understanding of spermatogenesis by characterizing two novel proteins, implicated by their strong expression in germ cells. The mass spectrometry proteomics data have been deposited with the ProteomeXchange Consortium under the data set identifier PXD000872. © 2014 by the Society for the Study of Reproduction, Inc.
MitoNuc: a database of nuclear genes coding for mitochondrial proteins. Update 2002.
Attimonelli, Marcella; Catalano, Domenico; Gissi, Carmela; Grillo, Giorgio; Licciulli, Flavio; Liuni, Sabino; Santamaria, Monica; Pesole, Graziano; Saccone, Cecilia
2002-01-01
Mitochondria, besides their central role in energy metabolism, have recently been found to be involved in a number of basic processes of cell life and to contribute to the pathogenesis of many degenerative diseases. All functions of mitochondria depend on the interaction of nuclear and organelle genomes. Mitochondrial genomes have been extensively sequenced and analysed and data have been collected in several specialised databases. In order to collect information on nuclear coded mitochondrial proteins we developed MitoNuc, a database containing detailed information on sequenced nuclear genes coding for mitochondrial proteins in Metazoa. The MitoNuc database can be retrieved through SRS and is available via the web site http://bighost.area.ba.cnr.it/mitochondriome where other mitochondrial databases developed by our group, the complete list of the sequenced mitochondrial genomes, links to other mitochondrial sites and related information, are available. The MitoAln database, related to MitoNuc in the previous release, reporting the multiple alignments of the relevant homologous protein coding regions, is no longer supported in the present release. In order to keep the links among entries in MitoNuc from homologous proteins, a new field in the database has been defined: the cluster identifier, an alpha numeric code used to identify each cluster of homologous proteins. A comment field derived from the corresponding SWISS-PROT entry has been introduced; this reports clinical data related to dysfunction of the protein. The logic scheme of MitoNuc database has been implemented in the ORACLE DBMS. This will allow the end-users to retrieve data through a friendly interface that will be soon implemented.
Intragenome Diversity of Gene Families Encoding Toxin-like Proteins in Venomous Animals.
Rodríguez de la Vega, Ricardo C; Giraud, Tatiana
2016-11-01
The evolution of venoms is the story of how toxins arise and of the processes that generate and maintain their diversity. For animal venoms these processes include recruitment for expression in the venom gland, neofunctionalization, paralogous expansions, and functional divergence. The systematic study of these processes requires the reliable identification of the venom components involved in antagonistic interactions. High-throughput sequencing has the potential of uncovering the entire set of toxins in a given organism, yet the existence of non-venom toxin paralogs and the misleading effects of a partial census of the molecular diversity of toxins make it necessary to collect complementary evidence to distinguish true toxins from their non-venom paralogs. Here, we analyzed the whole genomes of two scorpions, one spider and one snake, aiming at the identification of the full repertoires of genes encoding toxin-like proteins. We classified the entire set of protein-coding genes into paralogous groups and monotypic genes, identified genes encoding toxin-like proteins based on known toxin families, and quantified their expression in both venom glands and pooled tissues. Our results confirm that genes encoding toxin-like proteins are part of multigene families, and that these families arise by recruitment events from non-toxin genes followed by limited expansions of the toxin-like protein-coding genes. We also show that failing to account for sequence similarity with non-toxin proteins has a considerable misleading effect that can be greatly reduced by comparative transcriptomics. Our study overall contributes to the understanding of the evolutionary dynamics of proteins involved in antagonistic interactions. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.
Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen
2016-02-02
The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.
Mediator phosphorylation prevents stress response transcription during non-stress conditions.
Miller, Christian; Matic, Ivan; Maier, Kerstin C; Schwalb, Björn; Roether, Susanne; Strässer, Katja; Tresch, Achim; Mann, Matthias; Cramer, Patrick
2012-12-28
The multiprotein complex Mediator is a coactivator of RNA polymerase (Pol) II transcription that is required for the regulated expression of protein-coding genes. Mediator serves as an end point of signaling pathways and regulates Pol II transcription, but the mechanisms it uses are not well understood. Here, we used mass spectrometry and dynamic transcriptome analysis to investigate a functional role of Mediator phosphorylation in gene expression. Affinity purification and mass spectrometry revealed that Mediator from the yeast Saccharomyces cerevisiae is phosphorylated at multiple sites of 17 of its 25 subunits. Mediator phosphorylation levels change upon an external stimulus set by exposure of cells to high salt concentrations. Phosphorylated sites in the Mediator tail subunit Med15 are required for suppression of stress-induced changes in gene expression under non-stress conditions. Thus dynamic and differential Mediator phosphorylation contributes to gene regulation in eukaryotic cells.
Medical Sequencing at the extremes of Human Body Mass
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahituv, Nadav; Kavaslar, Nihan; Schackwitz, Wendy
2006-09-01
Body weight is a quantitative trait with significant heritability in humans. To identify potential genetic contributors to this phenotype, we resequenced the coding exons and splice junctions of 58 genes in 379 obese and 378 lean individuals. Our 96 Mb survey included 21 genes associated with monogenic forms of obesity in humans or mice, as well as 37 genes that function in body weight-related pathways. We found that the monogenic obesity-associated gene group was enriched for rare nonsynonymous variants unique to the obese (n=46) versus lean (n=26) populations. Computational analysis further predicted a significantly greater fraction of deleterious variants within the obese cohort. Consistent with the complex inheritance of body weight, we did not observe obvious familial segregation in the majority of the 28 available kindreds. Taken together, these data suggest that multiple rare alleles with variable penetrance contribute to obesity in the population and provide a deep medical sequencing based approach to detect them.
Unique features of a global human ectoparasite identified through sequencing of the bed bug genome
Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen
2016-01-01
The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814
Disentangling the many layers of eukaryotic transcriptional regulation.
Lelli, Katherine M; Slattery, Matthew; Mann, Richard S
2012-01-01
Regulation of gene expression in eukaryotes is an extremely complex process. In this review, we break down several critical steps, emphasizing new data and techniques that have expanded current gene regulatory models. We begin at the level of DNA sequence where cis-regulatory modules (CRMs) provide important regulatory information in the form of transcription factor (TF) binding sites. In this respect, CRMs function as instructional platforms for the assembly of gene regulatory complexes. We discuss multiple mechanisms controlling complex assembly, including cooperative DNA binding, combinatorial codes, and CRM architecture. The second section of this review places CRM assembly in the context of nucleosomes and condensed chromatin. We discuss how DNA accessibility and histone modifications contribute to TF function. Lastly, new advances in chromosomal mapping techniques have provided increased understanding of intra- and interchromosomal interactions. We discuss how these topological maps influence gene regulatory models.
Omeire, Destiny; Abdin, Shaunte; Brooks, Daniel M; Miranda, Hector C
2015-04-01
The Germain's Peacock-Pheasant Polyplectron germaini (Aves, Galliformes, Phasianidae) is classified as Near Threatened on the IUCN Red List. The complete mitochondrial genome of P. germaini is 16,699 bp, consisting of 13 protein-coding genes, 2 rRNA, 22 tRNA genes and 1 control region. All of the 13 protein-coding genes have ATG as start codon. Eight of the 13 protein-coding genes have TAA as stop codon.
Kröber, Magdalena; Verwaaijen, Bart; Wibberg, Daniel; Winkler, Anika; Pühler, Alfred; Schlüter, Andreas
2016-08-10
The strain Bacillus amyloliquefaciens FZB42 is a plant growth promoting rhizobacterium (PGPR) and biocontrol agent known to keep infections of lettuce (Lactuca sativa) by the phytopathogen Rhizoctonia solani down. Several mechanisms, including the production of secondary metabolites possessing antimicrobial properties and induction of the host plant's systemic resistance (ISR), were proposed to explain the biocontrol effect of the strain. B. amyloliquefaciens FZB42 is able to form plaques (biofilm-like structures) on plant roots and this feature was discussed to be associated with its biocontrol properties. For this reason, formation of B. amyloliquefaciens biofilms was studied at the transcriptional level using high-throughput sequencing of whole transcriptome cDNA libraries from cells grown under biofilm-forming conditions vs. planktonic growth. Comparison of the transcriptional profiles of B. amyloliquefaciens FZB42 under these growth conditions revealed a common set of highly transcribed genes mostly associated with basic cellular functions. The lci gene, encoding an antimicrobial peptide (AMP), was among the most highly transcribed genes of cells under both growth conditions suggesting that AMP production may contribute to biocontrol. In contrast, gene clusters coding for synthesis of secondary metabolites with antimicrobial properties were only moderately transcribed and not induced in biofilm-forming cells. Differential gene expression revealed that 331 genes were significantly up-regulated and 230 genes were down-regulated in the transcriptome of B. amyloliquefaciens FZB42 under biofilm-forming conditions in comparison to planktonic cells. Among the most highly up-regulated genes, the yvqHI operon, coding for products involved in nisin (class I bacteriocin) resistance, was identified. In addition, an operon whose products play a role in fructosamine metabolism was enhanced in its transcription. 
Moreover, genes involved in the production of the extracellular biofilm matrix including exopolysaccharide genes (eps) and the yqxM-tasA-sipW operon encoding amyloid fiber synthesis were up-regulated in the B. amyloliquefaciens FZB42 biofilm. On the other hand, highly down-regulated genes in biofilms are associated with synthesis, assembly and regulation of the flagellar apparatus, the degradation of aromatic compounds and the export of copper. The obtained transcriptional profile for B. amyloliquefaciens biofilm cells uncovered genes involved in its development and enabled the assessment that synthesis of secondary metabolites among other factors may contribute to the biocontrol properties of the strain. Copyright © 2016 Elsevier B.V. All rights reserved.
Discovery of stimulation-responsive immune enhancers with CRISPR activation
Simeonov, Dimitre R.; Gowen, Benjamin G.; Boontanrart, Mandy; Roth, Theodore L.; Gagnon, John D.; Mumbach, Maxwell R.; Satpathy, Ansuman T.; Lee, Youjin; Bray, Nicolas L.; Chan, Alice Y.; Lituiev, Dmytro S.; Nguyen, Michelle L.; Gate, Rachel E.; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M.; Mitros, Therese; Ray, Graham J.; Curie, Gemma L.; Naddaf, Nicki; Chu, Julia S.; Ma, Hong; Boyer, Eric; Van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R.; Schumann, Kathrin; Daly, Mark J.; Farh, Kyle K; Ansel, K. Mark; Ye, Chun J.; Greenleaf, William J.; Anderson, Mark S.; Bluestone, Jeffrey A.; Chang, Howard Y.; Corn, Jacob E.; Marson, Alexander
2017-01-01
The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs. PMID:28854172
Greenwald, Scott H.; Kuchenbecker, James A.; Rowlan, Jessica S.; Neitz, Jay; Neitz, Maureen
2017-01-01
Purpose: Human long (L) and middle (M) wavelength cone opsin genes are highly variable due to intermixing. Two L/M cone opsin interchange mutants, designated LIAVA and LVAVA, are associated with clinical diagnoses, including red-green color vision deficiency, blue cone monochromacy, cone degeneration, myopia, and Bornholm Eye Disease. Because the protein and splicing codes are carried by the same nucleotides, intermixing L and M genes can cause disease by affecting protein structure and splicing. Methods: Genetically engineered mice were created to allow investigation of the consequences of altered protein structure alone, and the effects on cone morphology were examined using immunohistochemistry. In humans and mice, cone function was evaluated using the electroretinogram (ERG) under L/M- or short (S) wavelength cone isolating conditions. Effects of LIAVA and LVAVA genes on splicing were evaluated using a minigene assay. Results: ERGs and histology in mice revealed protein toxicity for the LVAVA but not for the LIAVA opsin. Minigene assays showed that the dominant messenger RNA (mRNA) was aberrantly spliced for both variants; however, the LVAVA gene produced a small but significant amount of full-length mRNA and LVAVA subjects had correspondingly reduced ERG amplitudes. In contrast, the LIAVA subject had no L/M cone ERG. Conclusions: Dramatic differences in phenotype can result from seemingly minor differences in genotype through divergent effects on the dual amino acid and splicing codes. Translational Relevance: The mechanism by which individual mutations contribute to clinical phenotypes provides valuable information for diagnosis and prognosis of vision disorders associated with L/M interchange mutations, and it informs strategies for developing therapies. PMID:28516000
Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv
2010-01-01
RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
Chakraborty, Supriyo; Uddin, Arif; Mazumder, Tarikul Huda; Choudhury, Monisha Nath; Malakar, Arup Kumar; Paul, Prosenjit; Halder, Binata; Deka, Himangshu; Mazumder, Gulshana Akthar; Barbhuiya, Riazul Ahmed; Barbhuiya, Masuk Ahmed; Devi, Warepam Jesmi
2017-12-02
The study of codon usage coupled with phylogenetic analysis is an important tool to understand the genetic and evolutionary relationship of a gene. The 13 protein coding genes of human mitochondria are involved in electron transport chain for the generation of energy currency (ATP). However, no work has yet been reported on the codon usage of the mitochondrial protein coding genes across six continents. To understand the patterns of codon usage in mitochondrial genes across six different continents, we used bioinformatic analyses to analyze the protein coding genes. The codon usage bias was low as revealed from high ENC value. Correlation between codon usage and GC3 suggested that all the codons ending with G/C were positively correlated with GC3 but vice versa for A/T ending codons with the exception of ND4L and ND5 genes. Neutrality plot revealed that for the genes ATP6, COI, COIII, CYB, ND4 and ND4L, natural selection might have played a major role while mutation pressure might have played a dominant role in the codon usage bias of ATP8, COII, ND1, ND2, ND3, ND5 and ND6 genes. Phylogenetic analysis indicated that evolutionary relationships in each of 13 protein coding genes of human mitochondria were different across six continents and further suggested that geographical distance was an important factor for the origin and evolution of 13 protein coding genes of human mitochondria. Copyright © 2017 Elsevier B.V. and Mitochondria Research Society. All rights reserved.
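The GC3 statistic used in the abstract above is the G+C fraction at third codon positions. A minimal Python sketch of that calculation follows, assuming an in-frame coding sequence supplied as a plain string. The function name `gc3` and the example sequence are illustrative and do not come from the study; this simple version counts every codon, whereas some definitions exclude Met, Trp, and stop codons.

```python
def gc3(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Assumes `cds` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # drop an incomplete final codon
    third_bases = [seq[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Illustrative sequence only: codons ATG GCC CTA TTA
example = "ATGGCCCTATTA"
print(gc3(example))
```

Correlating per-gene GC3 with the usage of each codon is then a separate step that the abstract describes but does not detail.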
Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells
Borel, Christelle; Mudge, Jonathan M.; Howald, Cédric; Foissac, Sylvain; Ucla, Catherine; Chrast, Jacqueline; Ribeca, Paolo; Martin, David; Murray, Ryan R.; Yang, Xinping; Ghamsari, Lila; Lin, Chenwei; Bell, Ian; Dumais, Erica; Drenkow, Jorg; Tress, Michael L.; Gelpí, Josep Lluís; Orozco, Modesto; Valencia, Alfonso; van Berkum, Nynke L.; Lajoie, Bryan R.; Vidal, Marc; Stamatoyannopoulos, John; Batut, Philippe; Dobin, Alex; Harrow, Jennifer; Hubbard, Tim; Dekker, Job; Frankish, Adam; Salehi-Ashtiani, Kourosh; Reymond, Alexandre; Antonarakis, Stylianos E.; Guigó, Roderic; Gingeras, Thomas R.
2012-01-01
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network. PMID:22238572
2010-01-01
Background: Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results: Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions: A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
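The divergence-time estimate in the abstract above rests on synonymous substitutions. A common way to turn a synonymous distance into a time is T = Ks / (2r), where the factor 2 accounts for substitutions accumulating along both lineages. The Python sketch below illustrates that relation only. The abstract does not state the Ks value or the rate the authors used, so both numbers here are hypothetical placeholders, as is the function name.

```python
def divergence_time_years(ks: float, rate_per_site_per_year: float) -> float:
    """Estimate divergence time from synonymous distance: T = Ks / (2 * r).

    ks   -- synonymous substitutions per synonymous site between two sequences
    rate -- assumed synonymous substitution rate per site per year
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)


# Hypothetical inputs, chosen only to illustrate the arithmetic;
# they are NOT values reported in the abstract.
hypothetical_ks = 0.009
hypothetical_rate = 4.5e-9

estimate = divergence_time_years(hypothetical_ks, hypothetical_rate)
print(f"{estimate:.3g} years")
```

The result scales inversely with the assumed rate, so halving the rate doubles the estimated divergence time.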
Boyd, David A.; Thevenot, Tracy; Gumbmann, Markus; Honeyman, Allen L.; Hamilton, Ian R.
2000-01-01
Transposon mutagenesis and marker rescue were used to isolate and identify an 8.5-kb contiguous region containing six open reading frames constituting the operon for the sorbitol P-enolpyruvate phosphotransferase transport system (PTS) of Streptococcus mutans LT11. The first gene, srlD, codes for sorbitol-6-phosphate dehydrogenase, followed downstream by srlR, coding for a transcriptional regulator; srlM, coding for a putative activator; and the srlA, srlE, and srlB genes, coding for the EIIC, EIIBC, and EIIA components of the sorbitol PTS, respectively. Among all sorbitol PTS operons characterized to date, the srlD gene is found after the genes coding for the EII components; thus, the location of the gene in S. mutans is unique. The SrlR protein is similar to several transcriptional regulators found in Bacillus spp. that contain PTS regulator domains (J. Stülke, M. Arnaud, G. Rapoport, and I. Martin-Verstraete, Mol. Microbiol. 28:865–874, 1998), and its gene overlaps the srlM gene by 1 bp. The arrangement of these two regulatory genes is unique, having not been reported for other bacteria. PMID:10639465
Wu, Shengru; Liu, Yanli; Guo, Wei; Cheng, Xi; Ren, Xiaochun; Chen, Si; Li, Xueyuan; Duan, Yongle; Sun, Qingzhu; Yang, Xiaojun
2018-06-27
The liver is mainly hematopoietic in the embryo, and converts into a major metabolic organ in the adult. Therefore, it is intensively remodeled after birth to adapt and perform adult functions. Long non-coding RNAs (lncRNAs) are involved in organ development and cell differentiation, so they likely have roles in regulating postnatal liver development. Herein, in order to understand the roles of lncRNAs in postnatal liver maturation, we analyzed the lncRNA and mRNA expression profiles in immature and mature livers from one-day-old and adult (40 weeks of age) breeder roosters by Ribo-Zero RNA-Sequencing. Around 21,939 protein-coding genes and 2220 predicted lncRNAs were expressed in livers of breeder roosters. Compared to protein-coding genes, the identified chicken lncRNAs had fewer exons, shorter transcript lengths, and significantly lower expression levels. Notably, in comparison between the livers of newborn and adult breeder roosters, a total of 1570 mRNAs and 214 lncRNAs were differentially expressed with the criteria of log2 fold change > 1 or < -1 and P values < 0.05, which were validated by qPCR using five randomly selected mRNAs and five randomly selected lncRNAs. Further GO and KEGG analyses revealed that the differentially expressed mRNAs were involved in hepatic metabolic and immune functional changes, as well as in biological processes and pathways implicated in liver development, including cell proliferation, apoptosis, and the cell cycle. We also investigated the cis- and trans-regulatory effects of differentially expressed lncRNAs on their target genes. GO and KEGG analyses indicated that the neighboring protein-coding genes and trans-regulated genes of these lncRNAs were associated with adaptation to adult hepatic functions, as well as with pathways involved in liver development, such as the cell cycle pathway, Notch signaling pathway, Hedgehog signaling pathway, and Wnt signaling pathway. This study provides a catalog of mRNAs and lncRNAs related to postnatal liver maturation in chicken, and will contribute to a fuller understanding of the biological processes and signaling pathways in which differentially expressed genes and lncRNAs could take part during the functional transition of postnatal liver development.
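The differential-expression criteria stated in the abstract above (log2 fold change > 1 or < -1, and P < 0.05) amount to a simple filter over per-gene results. A minimal Python sketch, assuming results are already available as (gene, log2 fold change, P value) records; the record type, function name, and gene entries are hypothetical and are not data from the study.

```python
from typing import NamedTuple


class Result(NamedTuple):
    gene: str
    log2_fold_change: float
    p_value: float


def is_differentially_expressed(
    r: Result, lfc_cutoff: float = 1.0, p_cutoff: float = 0.05
) -> bool:
    """Apply the thresholds: |log2 fold change| > cutoff and P < cutoff."""
    return abs(r.log2_fold_change) > lfc_cutoff and r.p_value < p_cutoff


# Hypothetical records for illustration only.
results = [
    Result("geneA", 2.3, 0.001),   # up-regulated, significant
    Result("geneB", 0.4, 0.010),   # significant but fold change too small
    Result("geneC", -1.8, 0.030),  # down-regulated, significant
    Result("geneD", 3.0, 0.200),   # large change but not significant
]

hits = [r.gene for r in results if is_differentially_expressed(r)]
print(hits)
```

Both conditions must hold, so a large fold change alone (geneD) or a small P value alone (geneB) is not enough.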
The mitochondrial genome of Moniliophthora roreri, the frosty pod rot pathogen of cacao.
Costa, Gustavo G L; Cabrera, Odalys G; Tiburcio, Ricardo A; Medrano, Francisco J; Carazzolle, Marcelo F; Thomazella, Daniela P T; Schuster, Stephen C; Carlson, John E; Guiltinan, Mark J; Bailey, Bryan A; Mieczkowski, Piotr; Pereira, Gonçalo A G; Meinhardt, Lyndel W
2012-05-01
In this study, we report the sequence of the mitochondrial (mt) genome of the Basidiomycete fungus Moniliophthora roreri, which is the etiologic agent of frosty pod rot of cacao (Theobroma cacao L.). We also compare it to the mtDNA from the closely-related species Moniliophthora perniciosa, which causes witches' broom disease of cacao. The 94 Kb mtDNA genome of M. roreri has a circular topology and codes for the typical 14 mt genes involved in oxidative phosphorylation. It also codes for both rRNA genes, a ribosomal protein subunit, 13 intronic open reading frames (ORFs), and a full complement of 27 tRNA genes. The conserved genes of M. roreri mtDNA are completely syntenic with homologous genes of the 109 Kb mtDNA of M. perniciosa. As in M. perniciosa, M. roreri mtDNA contains a high number of hypothetical ORFs (28), a remarkable feature that makes the Moniliophthora species the largest reservoir of hypothetical ORFs among sequenced fungal mtDNA. Additionally, the mt genome of M. roreri has three free invertron-like linear mt plasmids, one of which is very similar to that previously described as integrated into the main M. perniciosa mtDNA molecule. Moniliophthora roreri mtDNA also has a region of suspected plasmid origin containing 15 hypothetical ORFs distributed in both strands. One of these ORFs is similar to an ORF in the mtDNA gene encoding DNA polymerase in Pleurotus ostreatus. The comparison to M. perniciosa showed that the 15 Kb difference in mtDNA sizes is mainly attributed to a lower abundance of repetitive regions in M. roreri (5.8 Kb vs 20.7 Kb). The most notable differences between M. roreri and M. perniciosa mtDNA are attributed to repeats and regions of plasmid origin. These elements might have contributed to the rapid evolution of mtDNA. Since M. roreri is the second species of the genus Moniliophthora whose mtDNA genome has been sequenced, the data presented here contribute valuable information for understanding the evolution of fungal mt genomes among closely-related species. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.
Grijalvo, Santiago; Alagia, Adele
2018-01-01
Oligonucleotide-based therapy has become an alternative to classical approaches in the search for novel therapeutics for gene-related diseases. Several mechanisms have been described that demonstrate the pivotal role of oligonucleotides in modulating gene expression. Antisense oligonucleotides (ASOs) and, more recently, siRNAs and miRNAs have made important contributions either by reducing aberrant protein levels through sequence-specific targeting of messenger RNAs (mRNAs) or by restoring the anomalous levels of non-coding RNAs (ncRNAs) that are involved in a good number of diseases, including cancer. In addition to formulation approaches, which have helped accelerate the entry of ASOs, siRNAs and miRNAs into clinical trials, the covalent linkage between non-viral vectors and nucleic acids has also added value and opened new perspectives for the development of promising nucleic acid-based therapeutics. This review article is mainly focused on the strategies carried out for covalently modifying siRNA and miRNA molecules. Examples involving cell-penetrating peptides (CPPs), carbohydrates, polymers, lipids and aptamers are discussed for the synthesis of siRNA conjugates, whereas in the case of miRNA-based drugs, this review article places special emphasis on the use of antagomiRs, locked nucleic acids (LNAs), peptide nucleic acids (PNAs) and nanoparticles. The biomedical applications of siRNA and miRNA conjugates are also discussed. PMID:29415514
Nutrigenetics and modulation of oxidative stress.
Da Costa, Laura A; Badawi, Alaa; El-Sohemy, Ahmed
2012-01-01
Oxidative stress develops as a result of an imbalance between the production and accumulation of reactive species and the body's ability to manage them using exogenous and endogenous antioxidants. Exogenous antioxidants obtained from the diet, including vitamin C, vitamin E, and carotenoids, have important roles in preventing and reducing oxidative stress. Individual genetic variation affecting proteins involved in the uptake, utilization and metabolism of these antioxidants may alter their serum levels, exposure to target cells and subsequent contribution to the extent of oxidative stress. Endogenous antioxidants include the antioxidant enzymes superoxide dismutase, catalase, glutathione peroxidase, paraoxonase, and glutathione S-transferase. These enzymes metabolize reactive species and their by-products, reducing oxidative stress. Variation in the genes coding these enzymes may impact their enzymatic antioxidant activity and, thus, the levels of reactive species, oxidative stress, and risk of disease development. Oxidative stress may contribute to the development of chronic disease, including osteoporosis, type 2 diabetes, neurodegenerative diseases, cardiovascular disease, and cancer. Indeed, polymorphisms in most of the genes that code for antioxidant enzymes have been associated with several types of cancer, although inconsistent findings between studies have been reported. These inconsistencies may, in part, be explained by interactions with the environment, such as modification by diet. In this review, we highlight some of the recent studies in the field of nutrigenetics, which have examined interactions between diet, genetic variation in antioxidant enzymes, and oxidative stress. Copyright © 2012 S. Karger AG, Basel.
Kelleher, Raymond J; Geigenmüller, Ute; Hovhannisyan, Hayk; Trautman, Edwin; Pinard, Robert; Rathmell, Barbara; Carpenter, Randall; Margulies, David
2012-01-01
Identification of common molecular pathways affected by genetic variation in autism is important for understanding disease pathogenesis and devising effective therapies. Here, we test the hypothesis that rare genetic variation in the metabotropic glutamate-receptor (mGluR) signaling pathway contributes to autism susceptibility. Single-nucleotide variants in genes encoding components of the mGluR signaling pathway were identified by high-throughput multiplex sequencing of pooled samples from 290 non-syndromic autism cases and 300 ethnically matched controls on two independent next-generation platforms. This analysis revealed significant enrichment of rare functional variants in the mGluR pathway in autism cases. Higher burdens of rare, potentially deleterious variants were identified in autism cases for three pathway genes previously implicated in syndromic autism spectrum disorder, TSC1, TSC2, and SHANK3, suggesting that genetic variation in these genes also contributes to risk for non-syndromic autism. In addition, our analysis identified HOMER1, which encodes a postsynaptic density-localized scaffolding protein that interacts with Shank3 to regulate mGluR activity, as a novel autism-risk gene. Rare, potentially deleterious HOMER1 variants identified uniquely in the autism population affected functionally important protein regions or regulatory sequences and co-segregated closely with autism among children of affected families. We also identified rare ASD-associated coding variants predicted to have damaging effects on components of the Ras/MAPK cascade. Collectively, these findings suggest that altered signaling downstream of mGluRs contributes to the pathogenesis of non-syndromic autism. PMID:22558107
Choquet, Remy; Maaroufi, Meriem; Fonjallaz, Yannick; de Carrara, Albane; Vandenbussche, Pierre-Yves; Dhombres, Ferdinand; Landais, Paul
2015-01-01
Characterizing a rare disease diagnosis for a given patient is often done through expert networks. It is a complex task that can evolve over time depending on the natural history of the disease and the evolution of scientific knowledge. Most rare diseases have genetic causes, and recent improvements in sequencing techniques contribute to the discovery of many new diseases every year. Diagnosis coding in the rare disease field requires data from multiple knowledge bases to be aggregated in order to offer the clinician a global information space spanning possible diagnoses, clinical signs (phenotypes) and known genetic mutations (genotype). Currently, the major barrier to the coding activity is the lack of consolidation of such information, which is scattered across different thesauri such as Orphanet, OMIM or HPO. The Linking Open data for Rare Diseases (LORD) web portal we developed stands as the first attempt to fill this gap by offering an integrated view of 8,400 rare diseases linked to more than 14,500 signs and 3,270 genes. The application provides a browsing feature to navigate through the relationships between diseases, signs and genes, and some Application Programming Interfaces to help its integration into routine health information systems. PMID:26958175
Hyatt, Sam; Cheung, Kat; Skelton, Andrew J.; Xu, Yaobo; Clark, Ian M.
2017-01-01
Long non-coding RNAs (lncRNAs) are expressed in a highly tissue-specific manner and function in various aspects of cell biology, often as key regulators of gene expression. In this study, we established a role for lncRNAs in chondrocyte differentiation. Using RNA sequencing we identified a human articular chondrocyte repertoire of lncRNAs from normal hip cartilage donated by neck of femur fracture patients. Of particular interest are lncRNAs upstream of the master chondrocyte transcription factor SOX9 locus. SOX9 is an HMG-box transcription factor that plays an essential role in chondrocyte development by directing the expression of chondrocyte-specific genes. Two of these lncRNAs are upregulated during chondrogenic differentiation of mesenchymal stem cells (MSCs). Depletion of one of these lncRNAs, LOC102723505, which we termed ROCR (regulator of chondrogenesis RNA), by RNA interference disrupted MSC chondrogenesis, concomitant with reduced cartilage-specific gene expression and incomplete matrix component production, indicating an important role in chondrocyte biology. Specifically, SOX9 induction was significantly ablated in the absence of ROCR, and overexpression of SOX9 rescued the differentiation of MSCs into chondrocytes. Our work sheds further light on chondrocyte-specific SOX9 expression and highlights a novel method of chondrocyte gene regulation involving a lncRNA. PMID:29084806
Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping
2015-08-01
Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Juge, Pierre-Antoine; Borie, Raphaël; Kannengiesser, Caroline; Gazal, Steven; Revy, Patrick; Wemeau-Stervinou, Lidwine; Debray, Marie-Pierre; Ottaviani, Sébastien; Marchand-Adam, Sylvain; Nathan, Nadia; Thabut, Gabriel; Richez, Christophe; Nunes, Hilario; Callebaut, Isabelle; Justet, Aurélien; Leulliot, Nicolas; Bonnefond, Amélie; Salgado, David; Richette, Pascal; Desvignes, Jean-Pierre; Lioté, Huguette; Froguel, Philippe; Allanore, Yannick; Sand, Olivier; Dromer, Claire; Flipo, René-Marc; Clément, Annick; Béroud, Christophe; Sibilia, Jean; Coustet, Baptiste; Cottin, Vincent; Boissier, Marie-Christophe; Wallaert, Benoit; Schaeverbeke, Thierry; Dastot le Moal, Florence; Frazier, Aline; Ménard, Christelle; Soubrier, Martin; Saidenberg, Nathalie; Valeyre, Dominique; Amselem, Serge; Boileau, Catherine; Crestani, Bruno; Dieudé, Philippe
2017-05-01
Despite its high prevalence and mortality, little is known about the pathogenesis of rheumatoid arthritis-associated interstitial lung disease (RA-ILD). Given that familial pulmonary fibrosis (FPF) and RA-ILD frequently share the usual pattern of interstitial pneumonia and common environmental risk factors, we hypothesised that the two diseases might share additional risk factors, including FPF-linked genes. Our aim was to identify coding mutations of FPF-risk genes associated with RA-ILD. We used whole exome sequencing (WES), followed by restricted analysis of a discrete number of FPF-linked genes and performed a burden test to assess the excess number of mutations in RA-ILD patients compared to controls. Among the 101 RA-ILD patients included, 12 (11.9%) had 13 WES-identified heterozygous mutations in the TERT, RTEL1, PARN or SFTPC coding regions. The burden test, based on 81 RA-ILD patients and 1010 controls of European ancestry, revealed an excess of TERT, RTEL1, PARN or SFTPC mutations in RA-ILD patients (OR 3.17, 95% CI 1.53-6.12; p=9.45×10⁻⁴). Telomeres were shorter in RA-ILD patients with a TERT, RTEL1 or PARN mutation than in controls (p=2.87×10⁻²). Our results support the contribution of FPF-linked genes to RA-ILD susceptibility. Copyright ©ERS 2017.
Differentially-Expressed Pseudogenes in HIV-1 Infection
Gupta, Aditi; Brown, C. Titus; Zheng, Yong-Hui; Adami, Christoph
2015-01-01
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these “functional” pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit. PMID:26426037
Divergent transcription is associated with promoters of transcriptional regulators
2013-01-01
Background: Divergent transcription is a widespread phenomenon in mammals. For instance, short bidirectional transcripts are a hallmark of active promoters, while longer transcripts can be detected antisense from active genes in conditions where the RNA degradation machinery is inhibited. Moreover, many described long non-coding RNAs (lncRNAs) are transcribed antisense from coding gene promoters. However, the general significance of divergent lncRNA/mRNA gene pair transcription is still poorly understood. Here, we used strand-specific RNA-seq with high sequencing depth to thoroughly identify antisense transcripts from coding gene promoters in primary mouse tissues. Results: We found that a substantial fraction of coding-gene promoters sustain divergent transcription of long non-coding RNA (lncRNA)/mRNA gene pairs. Strikingly, upstream antisense transcription is significantly associated with genes related to transcriptional regulation and development. Their promoters share several characteristics with those of transcriptional developmental genes, including very large CpG islands, high degree of conservation and epigenetic regulation in ES cells. In-depth analysis revealed a unique GC skew profile at these promoter regions, while the associated coding genes were found to have large first exons, two genomic features that might enforce bidirectional transcription. Finally, genes associated with antisense transcription harbor specific H3K79me2 epigenetic marking and RNA polymerase II enrichment profiles linked to an intensified rate of early transcriptional elongation. Conclusions: We concluded that promoters of a class of transcription regulators are characterized by a specialized transcriptional control mechanism, which is directly coupled to relaxed bidirectional transcription. PMID:24365181
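The GC skew profile mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the conventional definition, skew = (G - C) / (G + C), computed in sliding windows; the abstract does not state the window size, step, or exact definition the authors used, so the function name `gc_skew` and its `window` and `step` parameters are illustrative assumptions only.

```python
def gc_skew(seq, window=100, step=50):
    """Return (start, skew) pairs for sliding windows over a DNA sequence.

    Skew is (G - C) / (G + C) on the given strand; windows with no G or C
    get 0.0. Window and step sizes are arbitrary illustrative defaults,
    not values taken from the study.
    """
    seq = seq.upper()
    profile = []
    # A sequence shorter than one window is treated as a single window.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        profile.append((start, skew))
    return profile


if __name__ == "__main__":
    # Toy sequence: G-rich first half, C-rich second half.
    print(gc_skew("GGGGCCCC", window=4, step=4))
```

A positive value indicates an excess of G over C on the strand being read and a negative value the reverse, so a promoter-centred profile would be obtained by running the function over sequence extracted around each transcription start site.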
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; ...
2016-06-24
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. Lastly, these results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; Ngam, Peter; Devitt, Nicholas; Schilkey, Faye; Ben-Hur, Asa; Reddy, Anireddy S. N.
2016-01-01
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism. PMID:27339290
Overview of research on Bombyx mori microRNA
Wang, Xin; Tang, Shun-ming; Shen, Xing-jia
2014-01-01
MicroRNAs (miRNAs) are among the most significant regulatory factors acting at the post-transcriptional level of gene expression, contributing to the modulation of a large number of physiological processes such as development, metabolism, and disease occurrence. This review comprehensively and retrospectively explores the literature published to date on miRNAs of the silkworm, Bombyx mori L. (Lepidoptera: Bombycidae), including discovery, identification, expression profiling analysis, target gene prediction, and the functional analysis of both miRNAs and their targets. It may provide experimental considerations and approaches for future study of miRNAs and benefit elucidation of the mechanisms of miRNAs involved in silkworm developmental processes and intracellular activities of other unknown non-coding RNAs. PMID:25368077
Fourie, Gerda; van der Merwe, Nicolaas A; Wingfield, Brenda D; Bogale, Mesfin; Tudzynski, Bettina; Wingfield, Michael J; Steenkamp, Emma T
2013-09-08
The availability of mitochondrial genomes has allowed for the resolution of numerous questions regarding the evolutionary history of fungi and other eukaryotes. In the Gibberella fujikuroi species complex, the exact relationships among the so-called "African", "Asian" and "American" Clades remain largely unresolved, irrespective of the markers employed. In this study, we considered the feasibility of using mitochondrial genes to infer the phylogenetic relationships among Fusarium species in this complex. The mitochondrial genomes of representatives of the three Clades (Fusarium circinatum, F. verticillioides and F. fujikuroi) were characterized and we determined whether or not the mitochondrial genomes of these fungi have value in resolving the higher level evolutionary relationships in the complex. Overall, the mitochondrial genomes of the three species displayed a high degree of synteny, with all the genes (protein coding genes, unique ORFs, ribosomal RNA and tRNA genes) in identical order and orientation, as well as introns that share similar positions within genes. The intergenic regions and introns generally contributed significantly to the size differences and diversity observed among these genomes. Phylogenetic analysis of the concatenated protein-coding dataset separated members of the Gibberella fujikuroi complex from other Fusarium species and suggested that F. fujikuroi ("Asian" Clade) is basal in the complex. However, individual mitochondrial gene trees were largely incongruent with one another and with the concatenated gene tree, because six distinct phylogenetic trees were recovered from the various single gene datasets. The mitochondrial genomes of Fusarium species in the Gibberella fujikuroi complex are remarkably similar to those of the previously characterized Fusarium species and Sordariomycetes. 
Despite apparently representing a single replicative unit, not all of the genes encoded on the mitochondrial genomes of these fungi share the same evolutionary history. This incongruence could be due to biased selection on some genes or recombination among mitochondrial genomes. The results thus suggest that the use of individual mitochondrial genes for phylogenetic inference could mask the true relationships between species in this complex.
Origin and evolution of the long non-coding genes in the X-inactivation center.
Romito, Antonio; Rougeulle, Claire
2011-11-01
Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.
Bayesian variable selection for post-analytic interrogation of susceptibility loci.
Chen, Siying; Nunez, Sara; Reilly, Muredach P; Foulkes, Andrea S
2017-06-01
Understanding the complex interplay among protein coding genes and regulatory elements requires rigorous interrogation with analytic tools designed for discerning the relative contributions of overlapping genomic regions. To this aim, we offer a novel application of Bayesian variable selection (BVS) for classifying genomic class level associations using existing large meta-analysis summary level resources. This approach is applied using the expectation maximization variable selection (EMVS) algorithm to typed and imputed SNPs across 502 protein coding genes (PCGs) and 220 long intergenic non-coding RNAs (lncRNAs) that overlap 45 known loci for coronary artery disease (CAD) using publicly available Global Lipids Genetics Consortium (GLGC) (Teslovich et al., 2010; Willer et al., 2013) meta-analysis summary statistics for low-density lipoprotein cholesterol (LDL-C). The analysis reveals 33 PCGs and three lncRNAs across 11 loci with >50% posterior probabilities for inclusion in an additive model of association. The findings are consistent with previous reports, while providing some new insight into the architecture of LDL-cholesterol to be investigated further. As genomic taxonomies continue to evolve, additional classes, such as enhancer elements and splicing regions, can easily be layered into the proposed analysis framework. Moreover, application of this approach to alternative publicly available meta-analysis resources, or more generally as a post-analytic strategy to further interrogate regions that are identified through single point analysis, is straightforward. All coding examples are implemented in R version 3.2.1 and provided as supplemental material. © 2016, The International Biometric Society.
Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.
Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D
2017-12-03
A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence.
These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
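The notions of a self-complementary circular code and of an X motif used in the abstract above can be sketched in Python. This is only an illustrative sketch under stated assumptions: the abstract does not list the 20 trinucleotides, so the set `X` below is the one commonly quoted in the circular code literature and should be checked against the original publications. The helper names are hypothetical, and the motif scan is simplified to maximal runs of consecutive X trinucleotides in one frame, filtered by run length rather than by the number of distinct trinucleotides (the "cardinality" used in the study).

```python
# Assumption: the 20 trinucleotides of X as commonly listed in the
# circular code literature; the abstract itself does not enumerate them.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Reverse complement of a DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the reverse complement of every word is also in the code."""
    return all(reverse_complement(w) in code for w in code)


def shifted(code, k):
    """Circular permutation of every trinucleotide by k positions."""
    return frozenset(w[k:] + w[:k] for w in code)


def x_motifs(seq, frame=0, min_trinucleotides=4, code=X):
    """Return (start, motif) for maximal runs of code words in one frame.

    Simplification: runs are filtered by length in trinucleotides, not by
    the number of distinct trinucleotides they contain.
    """
    seq = seq.upper()
    motifs = []
    run_start = None
    end = frame  # end of the last complete trinucleotide read
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in code:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and (i - run_start) // 3 >= min_trinucleotides:
                motifs.append((run_start, seq[run_start:i]))
            run_start = None
        end = i + 3
    if run_start is not None and (end - run_start) // 3 >= min_trinucleotides:
        motifs.append((run_start, seq[run_start:end]))
    return motifs


if __name__ == "__main__":
    print(is_self_complementary(X))
    print(x_motifs("GAAGTCGGTTACCCC"))
```

Scanning the same sequence with `frame=1` or `frame=2` gives the shifted-frame counts, which is the kind of comparison the abstract describes between the reading frame and the two shifted frames.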
Raman, Gurusamy; Park, SeonJoo
2015-01-01
Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163
Jiang, Shu-Ye; Sevugan, Mayalagu; Ramachandran, Srinivasan
2018-05-09
Valine-glutamine (VQ) motif containing proteins play important roles in abiotic and biotic stress responses in plants. However, little is known about the origin and evolution as well as comprehensive expression regulation of the VQ gene family. In this study, we systematically surveyed this gene family in 50 plant genomes from algae, moss, gymnosperm and angiosperm and explored their presence in other species from animals, bacteria, fungi and viruses. No VQs were detected in any of the tested algal genomes, whereas all genomes from moss, gymnosperm and angiosperm encode varying numbers of VQs. Interestingly, some fungi, lower animals and bacteria also encode one to a few VQs. Thus, they are not plant-specific and should be regarded as an ancient family. Their family expansion was mainly due to segmental duplication, followed by tandem duplication and mobile elements. Gene conversion was found to make only a limited contribution to the evolution of the family. Generally, VQs were highly conserved in their motif coding region and were under purifying selection. However, positive selection was also observed during species divergence. Many VQs were up- or down-regulated by various abiotic/biotic stresses and phytohormones in rice and Arabidopsis. They were also co-expressed with some other stress-related genes. All of the expression data suggest a comprehensive expression regulation of the VQ gene family. We provide new insights into gene expansion, divergence, evolution and their expression regulation of this VQ family. VQs were detectable not only in plants but also in some fungi, lower animals and bacteria, suggesting evolutionary conservation and an ancient origin. Overall, VQs are non-plant-specific and play roles in abiotic/biotic responses or other biological processes through comprehensive expression regulation.
de Freitas, Michele C R; Resende, Juliana A; Ferreira-Machado, Alessandra B; Saji, Guadalupe D R Q; de Vasconcelos, Ana T R; da Silva, Vânia L; Nicolás, Marisa F; Diniz, Cláudio G
2016-01-01
Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic analysis with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From wild-type B. fragilis ATCC 43859, four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology term (GO), indicating that most known cellular functions were represented. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes that act in several metabolic pathways involved in metronidazole response, such as drug activation, defense mechanisms against superoxide ions, high expression level of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure leads to drastic persistent changes in the B. fragilis gene expression patterns. These results may help to elucidate the B. fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment.
Diallinas, G; Gorfinkiel, L; Arst, H N; Cecchetto, G; Scazzocchio, C
1995-04-14
In Aspergillus nidulans, loss-of-function mutations in the uapA and azgA genes, encoding the major uric acid-xanthine and hypoxanthine-adenine-guanine permeases, respectively, result in impaired utilization of these purines as sole nitrogen sources. The residual growth of the mutant strains is due to the activity of a broad specificity purine permease. We have identified uapC, the gene coding for this third permease through the isolation of both gain-of-function and loss-of-function mutations. Uptake studies with wild-type and mutant strains confirmed the genetic analysis and showed that the UapC protein contributes 30% and 8-10% to uric acid and hypoxanthine transport rates, respectively. The uapC gene was cloned, its expression studied, its sequence and transcript map established, and the sequence of its putative product analyzed. uapC message accumulation is: (i) weakly induced by 2-thiouric acid; (ii) repressed by ammonium; (iii) dependent on functional uaY and areA regulatory gene products (mediating uric acid induction and nitrogen metabolite repression, respectively); (iv) increased by uapC gain-of-function mutations which specifically, but partially, suppress a leucine to valine mutation in the zinc finger of the protein coded by the areA gene. The putative uapC gene product is a highly hydrophobic protein of 580 amino acids (M(r) = 61,251) including 12-14 putative transmembrane segments. The UapC protein is highly similar (58% identity) to the UapA permease and significantly similar (23-34% identity) to a number of bacterial transporters. Comparisons of the sequences and hydropathy profiles of members of this novel family of transporters yield insights into their structure, functionally important residues, and possible evolutionary relationships.
Genetics of Cerebellar and Neocortical Expansion in Anthropoid Primates: A Comparative Approach
Harrison, Peter W.; Montgomery, Stephen H.
2017-01-01
What adaptive changes in brain structure and function underpin the evolution of increased cognitive performance in humans and our close relatives? Identifying the genetic basis of brain evolution has become a major tool in answering this question. Numerous cases of positive selection, altered gene expression or gene duplication have been identified that may contribute to the evolution of the neocortex, which is widely assumed to play a predominant role in cognitive evolution. However, the components of the neocortex co-evolve with other functionally interdependent regions of the brain, most notably in the cerebellum. The cerebellum is linked to a range of cognitive tasks and expanded rapidly during hominoid evolution. Here we present data that suggest that, across anthropoid primates, protein-coding genes with known roles in cerebellum development were just as likely to be targeted by selection as genes linked to cortical development. Indeed, based on currently available gene ontology data, protein-coding genes with known roles in cerebellum development are more likely to have evolved adaptively during hominoid evolution. This is consistent with phenotypic data suggesting an accelerated rate of cerebellar expansion in apes that is beyond that predicted from scaling with the neocortex in other primates. Finally, we present evidence that the strength of selection on specific genes is associated with variation in the volume of either the neocortex or the cerebellum, but not both. This result provides preliminary evidence that co-variation between these brain components during anthropoid evolution may be at least partly regulated by selection on independent loci, a conclusion that is consistent with recent intraspecific genetic analyses and a mosaic model of brain evolution that predicts adaptive evolution of brain structure. PMID:28683440
RpfF-dependent regulon of Xylella fastidiosa.
Wang, Nian; Li, Jian-Liang; Lindow, Steven E
2012-11-01
Xylella fastidiosa regulates traits important to both virulence of grape as well as colonization of sharpshooter vectors via its production of a fatty acid signal molecule known as DSF whose production is dependent on rpfF. Although X. fastidiosa rpfF mutants exhibit increased virulence to plants, they are unable to be spread from plant to plant by insect vectors. To gain more insight into the traits that contribute to these processes, a whole-genome Agilent DNA microarray for this species was developed and used to determine the RpfF-dependent regulon by transcriptional profiling. In total, 446 protein coding genes whose expression was significantly different between the wild type and an rpfF mutant (false discovery rate < 0.05) were identified when cells were grown in PW liquid medium. Among them, 165 genes were downregulated in the rpfF mutant compared with the wild-type strain whereas 281 genes were over-expressed. RpfF function was required for regulation of 11 regulatory and σ factors, including rpfE, yybA, PD1177, glnB, rpfG, PD0954, PD0199, PD2050, colR, rpoH, and rpoD. In general, RpfF is required for regulation of genes involved in attachment and biofilm formation, enhancing expression of hemagglutinin genes hxfA and hxfB, and suppressing most type IV pili and gum genes. A large number of other RpfF-dependent genes that might contribute to virulence or insect colonization were also identified such as those encoding hemolysin and colicin V, as well as genes with unknown functions.
Decoding the genome beyond sequencing: the new phase of genomic research.
Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J
2011-10-01
While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems: genes code parts like protein and RNA, but the genome codes the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory, which calls for a departure from gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing. Copyright © 2011 Elsevier Inc. All rights reserved.
MiR-34a regulates the invasive capacity of canine osteosarcoma cell lines
Lopez, Cecilia M.; Yu, Peter Y.; Zhang, Xiaoli; Yilmaz, Ayse Selen; London, Cheryl A.
2018-01-01
Background: Osteosarcoma (OSA) is the most common bone tumor in children and dogs; however, no substantial improvement in clinical outcome has occurred in either species over the past 30 years. MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression and play a fundamental role in cancer. The purpose of this study was to investigate the potential contribution of miR-34a loss to the biology of canine OSA, a well-established spontaneous model of the human disease. Methodology and principal findings: RT-qPCR demonstrated that miR-34a expression levels were significantly reduced in primary canine OSA tumors and canine OSA cell lines as compared to normal canine osteoblasts. In canine OSA cell lines stably transduced with empty vector or pre-miR-34a lentiviral constructs, overexpression of miR-34a inhibited cellular invasion and migration but had no effect on cell proliferation or cell cycle distribution. Transcriptional profiling of canine OSA8 cells possessing enforced miR-34a expression demonstrated dysregulation of numerous genes, including significant down-regulation of multiple putative targets of miR-34a. Moreover, gene ontology analysis of down-regulated miR-34a target genes showed enrichment of several biological processes related to cell invasion and motility. Lastly, we validated changes in miR-34a putative target gene expression, including decreased expression of KLF4, SEM3A, and VEGFA transcripts in canine OSA cells overexpressing miR-34a and identified KLF4 and VEGFA as direct target genes of miR-34a. Concordant with these data, primary canine OSA tumor tissues demonstrated increased expression levels of putative miR-34a target genes. Conclusions: These data demonstrate that miR-34a contributes to invasion and migration in canine OSA cells and suggest that loss of miR-34a may promote a pattern of gene expression contributing to the metastatic phenotype in canine OSA. PMID:29293555
MiR-34a regulates the invasive capacity of canine osteosarcoma cell lines.
Lopez, Cecilia M; Yu, Peter Y; Zhang, Xiaoli; Yilmaz, Ayse Selen; London, Cheryl A; Fenger, Joelle M
2018-01-01
Osteosarcoma (OSA) is the most common bone tumor in children and dogs; however, no substantial improvement in clinical outcome has occurred in either species over the past 30 years. MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression and play a fundamental role in cancer. The purpose of this study was to investigate the potential contribution of miR-34a loss to the biology of canine OSA, a well-established spontaneous model of the human disease. RT-qPCR demonstrated that miR-34a expression levels were significantly reduced in primary canine OSA tumors and canine OSA cell lines as compared to normal canine osteoblasts. In canine OSA cell lines stably transduced with empty vector or pre-miR-34a lentiviral constructs, overexpression of miR-34a inhibited cellular invasion and migration but had no effect on cell proliferation or cell cycle distribution. Transcriptional profiling of canine OSA8 cells possessing enforced miR-34a expression demonstrated dysregulation of numerous genes, including significant down-regulation of multiple putative targets of miR-34a. Moreover, gene ontology analysis of down-regulated miR-34a target genes showed enrichment of several biological processes related to cell invasion and motility. Lastly, we validated changes in miR-34a putative target gene expression, including decreased expression of KLF4, SEM3A, and VEGFA transcripts in canine OSA cells overexpressing miR-34a and identified KLF4 and VEGFA as direct target genes of miR-34a. Concordant with these data, primary canine OSA tumor tissues demonstrated increased expression levels of putative miR-34a target genes. These data demonstrate that miR-34a contributes to invasion and migration in canine OSA cells and suggest that loss of miR-34a may promote a pattern of gene expression contributing to the metastatic phenotype in canine OSA.
Primer development to obtain complete coding sequence of HA and NA genes of influenza A/H3N2 virus.
Agustiningsih, Agustiningsih; Trimarsanto, Hidayat; Setiawaty, Vivi; Artika, I Made; Muljono, David Handojo
2016-08-30
Influenza is an acute respiratory illness and has become a serious public health problem worldwide. Studying the HA and NA genes of influenza A virus is essential because these genes frequently undergo mutations. This study describes the development of primer sets for RT-PCR to obtain the complete coding sequences of the Hemagglutinin (HA) and Neuraminidase (NA) genes of influenza A/H3N2 virus from Indonesia. The primers were developed based on worldwide influenza A/H3N2 sequences from the Global Initiative on Sharing All Influenza Data (GISAID) and further tested using archived Indonesian influenza A/H3N2 samples from influenza-like illness (ILI) surveillance from 2008 to 2009. An optimum RT-PCR condition was acquired for all HA and NA fragments designed to cover the complete coding sequences of the HA and NA genes. Of the 145 influenza A/H3N2 samples tested, 71 were successfully sequenced for the complete coding sequences of both the HA and NA genes. The developed primer sets were suitable for obtaining complete coding sequences of HA and NA genes of Indonesian samples from 2008 to 2009.
NASA Technical Reports Server (NTRS)
Weitzeal, A. J.; Wyatt, S. E.; Parsons-Wingerter, P.
2016-01-01
Venation patterning in leaves is a major determinant of photosynthesis efficiency because of its dependency on vascular transport of photoassimilates, water, and minerals. Arabidopsis thaliana grown in microgravity show delayed growth and leaf maturation. Gene expression data from the roots, hypocotyl, and leaves of A. thaliana grown during spaceflight vs. ground control analyzed by Affymetrix microarray are available through NASA's GeneLab (GLDS-7). We analyzed the data for differential expression of genes in leaves resulting from the effects of spaceflight on vascular patterning. Two genes were found by preliminary analysis to be upregulated during spaceflight that may be related to vascular formation. The genes are responsible for coding an ARGOS-like protein (potentially affecting cell elongation in the leaves), and an F-box/kelch-repeat protein (possibly contributing to protoxylem specification). Further analysis that will focus on raw data quality assessment and a moderated t-test may further confirm upregulation of the two genes and/or identify other gene candidates. Plants defective in these genes will then be assessed for phenotype by the mapping and quantification of leaf vascular patterning by NASA's VESsel GENeration (VESGEN) software to model specific vascular differences of plants grown in spaceflight.
The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101
NASA Astrophysics Data System (ADS)
Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.
2014-08-01
Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.
Pietan, Lucas L.; Spradling, Theresa A.
2016-01-01
In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589
Cellular miR-2909 RNomics governs the genes that ensure immune checkpoint regulation.
Kaul, Deepak; Malik, Deepti; Wani, Sameena
2018-06-20
Cross-talk between coding RNAs and regulatory non-coding microRNAs, within human genome, has provided compelling evidence for the existence of flexible checkpoint control of T-Cell activation. The present study attempts to demonstrate that the interplay between miR-2909 and its effector KLF4 gene has the inherent capacity to regulate genes coding for CTLA4, CD28, CD40, CD134, PDL1, CD80, CD86, IL-6 and IL-10 within normal human peripheral blood mononuclear cells (PBMCs). Based upon these findings, we propose a pathway that links miR-2909 RNomics with the genes coding for immune checkpoint regulators required for the maintenance of immune homeostasis.
Villada, Juan C.; Brustolini, Otávio José Bernardes
2017-01-01
Gene codon optimization may be impaired by the misinterpretation of frequency and optimality of codons. Although recent studies have revealed the effects of codon usage bias (CUB) on protein biosynthesis, an integrated perspective of the biological role of individual codons remains unknown. Unlike previous studies, we show, through an integrated framework, that attributes of codons such as frequency, optimality and positional dependency should be combined to unveil individual codon contribution for protein biosynthesis. We designed a codon quantification method for assessing CUB as a function of position within genes with a novel constraint: the relativity of position-dependent codon usage shaped by coding sequence length. Thus, we propose a new way of identifying the enrichment, depletion and non-uniform positional distribution of codons in different regions of yeast genes. We clustered codons that shared attributes of frequency and optimality. The cluster of non-optimal codons with rare occurrence displayed two remarkable characteristics: higher codon decoding time than the frequent–non-optimal cluster and enrichment at the 5′-end region, where optimal codons with the highest frequency are depleted. Interestingly, frequent codons with non-optimal adaptation to tRNAs are uniformly distributed in the Saccharomyces cerevisiae genes, suggesting their determinant role as a speed regulator in protein elongation. PMID:28449100
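Illustrative note: the idea of measuring codon usage as a function of position relative to coding sequence length can be sketched as below. This is a minimal sketch and not the authors' method; the function name, the fixed number of bins, and the equal-width binning are all assumptions made for illustration.

```python
from collections import Counter


def codon_counts_by_relative_position(cds, n_bins=10):
    """Count codons in equal-width bins of relative position along a CDS.

    Hypothetical helper: positions are scaled by the number of codons,
    so genes of different lengths map onto the same set of bins.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    bins = [Counter() for _ in range(n_bins)]
    n = len(codons)
    for idx, codon in enumerate(codons):
        # idx < n, so the bin index always falls in 0 .. n_bins - 1
        bins[idx * n_bins // n][codon] += 1
    return bins
```

Summing such per-bin counters across many genes would give a position-dependent usage profile per codon; how enrichment or depletion is then tested is not specified in the abstract and is left out here.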
The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome
Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.
2001-01-01
Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022
The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome.
Camargo, A A; Samaia, H P; Dias-Neto, E; Simão, D F; Migotto, I A; Briones, M R; Costa, F F; Nagai, M A; Verjovski-Almeida, S; Zago, M A; Andrade, L E; Carrer, H; El-Dorry, H F; Espreafico, E M; Habr-Gama, A; Giannella-Neto, D; Goldman, G H; Gruber, A; Hackel, C; Kimura, E T; Maciel, R M; Marie, S K; Martins, E A; Nobrega, M P; Paco-Larson, M L; Pardini, M I; Pereira, G G; Pesquero, J B; Rodrigues, V; Rogatto, S R; da Silva, I D; Sogayar, M C; Sonati, M F; Tajara, E H; Valentini, S R; Alberto, F L; Amaral, M E; Aneas, I; Arnaldi, L A; de Assis, A M; Bengtson, M H; Bergamo, N A; Bombonato, V; de Camargo, M E; Canevari, R A; Carraro, D M; Cerutti, J M; Correa, M L; Correa, R F; Costa, M C; Curcio, C; Hokama, P O; Ferreira, A J; Furuzawa, G K; Gushiken, T; Ho, P L; Kimura, E; Krieger, J E; Leite, L C; Majumder, P; Marins, M; Marques, E R; Melo, A S; Melo, M B; Mestriner, C A; Miracca, E C; Miranda, D C; Nascimento, A L; Nobrega, F G; Ojopi, E P; Pandolfi, J R; Pessoa, L G; Prevedel, A C; Rahal, P; Rainho, C A; Reis, E M; Ribeiro, M L; da Ros, N; de Sa, R G; Sales, M M; Sant'anna, S C; dos Santos, M L; da Silva, A M; da Silva, N P; Silva, W A; da Silveira, R A; Sousa, J F; Stecconi, D; Tsukumo, F; Valente, V; Soares, F; Moreira, E S; Nunes, D N; Correa, R G; Zalcberg, H; Carvalho, A F; Reis, L F; Brentani, R R; Simpson, A J; de Souza, S J; Melo, M
2001-10-09
Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning.
Villada, Juan C; Brustolini, Otávio José Bernardes; Batista da Silveira, Wendel
2017-08-01
Gene codon optimization may be impaired by the misinterpretation of frequency and optimality of codons. Although recent studies have revealed the effects of codon usage bias (CUB) on protein biosynthesis, an integrated perspective of the biological role of individual codons remains unknown. Unlike previous studies, we show, through an integrated framework, that attributes of codons such as frequency, optimality and positional dependency should be combined to unveil individual codon contribution for protein biosynthesis. We designed a codon quantification method for assessing CUB as a function of position within genes with a novel constraint: the relativity of position-dependent codon usage shaped by coding sequence length. Thus, we propose a new way of identifying the enrichment, depletion and non-uniform positional distribution of codons in different regions of yeast genes. We clustered codons that shared attributes of frequency and optimality. The cluster of non-optimal codons with rare occurrence displayed two remarkable characteristics: higher codon decoding time than the frequent-non-optimal cluster and enrichment at the 5'-end region, where optimal codons with the highest frequency are depleted. Interestingly, frequent codons with non-optimal adaptation to tRNAs are uniformly distributed in the Saccharomyces cerevisiae genes, suggesting their determinant role as a speed regulator in protein elongation. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Studying the genetic basis of speciation in high gene flow marine invertebrates
2016-01-01
A growing number of genes responsible for reproductive incompatibilities between species (barrier loci) exhibit the signals of positive selection. However, the possibility that genes experiencing positive selection diverge early in speciation and commonly cause reproductive incompatibilities has not been systematically investigated on a genome-wide scale. Here, I outline a research program for studying the genetic basis of speciation in broadcast spawning marine invertebrates that uses a priori genome-wide information on a large, unbiased sample of genes tested for positive selection. A targeted sequence capture approach is proposed that scores single-nucleotide polymorphisms (SNPs) in widely separated species populations at an early stage of allopatric divergence. The targeted capture of both coding and non-coding sequences enables SNPs to be characterized at known locations across the genome and at genes with known selective or neutral histories. The neutral coding and non-coding SNPs provide robust background distributions for identifying FST-outliers within genes that can, in principle, identify specific mutations experiencing diversifying selection. If natural hybridization occurs between species, the neutral coding and non-coding SNPs can provide a neutral admixture model for genomic clines analyses aimed at finding genes exhibiting strong blocks to introgression. Strongylocentrotid sea urchins are used as a model system to outline the approach but it can be used for any group that has a complete reference genome available. PMID:29491951
Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population.
Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P
2004-10-01
The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 × 10^-4 and polymorphism parameter of 11.5 × 10^-4 are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
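Illustrative note: the abstract does not name the estimator behind its "polymorphism parameter". If it is assumed to be Watterson's θ per site, the order of magnitude can be checked from the figures quoted above. The sketch below treats all 388 polymorphisms as segregating sites, counts 93 diploid individuals as 186 chromosomes, and takes the sequenced length as exactly 60,000 sites; all three are simplifying assumptions, and the function name is hypothetical.

```python
def watterson_theta_per_site(segregating_sites, n_chromosomes, n_sites):
    """Watterson's estimator: theta = S / (a_n * L),
    with a_n = sum_{i=1}^{n-1} 1/i for n sampled chromosomes."""
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return segregating_sites / (a_n * n_sites)


# Figures taken from the abstract; their mapping onto S, n and L is assumed.
theta = watterson_theta_per_site(segregating_sites=388,
                                 n_chromosomes=2 * 93,
                                 n_sites=60_000)
print(f"{theta:.2e}")
```

Under these assumptions the estimate comes out on the order of 10^-3 per site, the same order as the reported value; an exact match should not be expected, since the true aligned length and sample composition per gene are not given.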
Yuan, Yao-Wu; Sagawa, Janelle M; Young, Riane C; Christensen, Brian J; Bradshaw, Harvey D
2013-05-01
Prezygotic barriers play a major role in the evolution of reproductive isolation, which is a prerequisite for speciation. However, despite considerable progress in identifying genes and mutations responsible for postzygotic isolation, little is known about the genetic and molecular basis underlying prezygotic barriers. The bumblebee-pollinated Mimulus lewisii and the hummingbird-pollinated M. cardinalis represent a classic example of pollinator-mediated prezygotic isolation between two sister species in sympatry. Flower color differences resulting from both carotenoid and anthocyanin pigments contribute to pollinator discrimination between the two species in nature. Through fine-scale genetic mapping, site-directed mutagenesis, and transgenic experiments, we demonstrate that a single-repeat R3 MYB repressor, ROSE INTENSITY1 (ROI1), is the causal gene underlying a major quantitative trait locus (QTL) with the largest effect on anthocyanin concentration and that cis-regulatory change rather than coding DNA mutations causes the allelic difference between M. lewisii and M. cardinalis. Together with the genomic resources and stable transgenic tools developed here, these results suggest that Mimulus is an excellent platform for studying the genetics of pollinator-mediated reproductive isolation and the molecular basis of morphological evolution at the most fundamental level: gene by gene, mutation by mutation.
Next Generation Sequencing and ALS: known genes, different phenotypes.
Campopiano, Rosa; Ryskalin, Larisa; Giardina, Emiliano; Zampatti, Stefania; Busceti, Carla L; Biagioni, Francesca; Ferese, Rosangela; Storto, Marianna; Gambardella, Stefano; Fornai, Francesco
2017-12-01
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease clinically characterized by upper and lower motor neuron dysfunction resulting in rapidly progressive paralysis and death from respiratory failure. Most cases appear to be sporadic, but 5-10% of cases have a family history of the disease, and over the last decade, identification of mutations in about 20 genes predisposing to these disorders has provided the means to better understand their pathogenesis. Next generation sequencing (NGS) is an advanced high-throughput DNA sequencing technology which has rapidly accelerated the discovery of genetic risk factors for both familial and sporadic neurological and neurodegenerative diseases. These strategies have allowed the rapid identification of disease-associated variants and genetic risk factors for both familial (fALS) and sporadic ALS (sALS), strongly contributing to the knowledge of the genetic architecture of ALS. Moreover, as the number of ALS genes grows, many of the proteins they encode are found to act in intracellular processes shared with other known diseases, suggesting an overlap of clinical and pathological features between different diseases. To emphasize this concept, the review focuses on genes coding for valosin-containing protein (VCP) and two heterogeneous nuclear RNA-binding proteins (HNRNPA1 and hnRNPA2B1), recently identified through NGS, where different mutations have been associated with both ALS and other neurological and neurodegenerative diseases.
Lathe, R
1985-05-05
Synthetic probes deduced from amino acid sequence data are widely used to detect cognate coding sequences in libraries of cloned DNA segments. The redundancy of the genetic code dictates that a choice must be made between (1) a mixture of probes reflecting all codon combinations, and (2) a single longer "optimal" probe. The second strategy is examined in detail. The frequency of sequences matching a given probe by chance alone can be determined and also the frequency of sequences closely resembling the probe and contributing to the hybridization background. Gene banks cannot be treated as random associations of the four nucleotides, and probe sequences deduced from amino acid sequence data occur more often than predicted by chance alone. Probe lengths must be increased to confer the necessary specificity. Examination of hybrids formed between unique homologous probes and their cognate targets reveals that short stretches of perfect homology occurring by chance make a significant contribution to the hybridization background. Statistical methods for improving homology are examined, taking human coding sequences as an example, and considerations of codon utilization and dinucleotide frequencies yield an overall homology of greater than 82%. Recommendations for probe design and hybridization are presented, and the choice between using multiple probes reflecting all codon possibilities and a unique optimal probe is discussed.
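The chance-match frequency discussed in the abstract above can be illustrated with a simple baseline calculation. The sketch below assumes a uniform random-sequence model (each base equally likely and independent), which the abstract itself notes does not hold for real gene banks, so the numbers should be read as a lower-bound intuition only. The function names and parameters are hypothetical and are not taken from the paper.

```python
from math import comb


def p_match(probe_length, max_mismatches=0):
    """Probability that a random site matches the probe with at most k mismatches.

    Assumes independent, equiprobable bases (a simplifying assumption).
    """
    return sum(
        comb(probe_length, j) * 0.75 ** j * 0.25 ** (probe_length - j)
        for j in range(max_mismatches + 1)
    )


def expected_chance_hits(probe_length, target_bases, max_mismatches=0,
                         both_strands=True):
    """Expected number of chance matches in a target of the given size.

    Edge effects (the last probe_length - 1 positions) are ignored.
    """
    strands = 2 if both_strands else 1
    return strands * target_bases * p_match(probe_length, max_mismatches)


if __name__ == "__main__":
    # Illustrative comparison: a 17-mer versus a 40-mer against 3e9 bases,
    # with exact matching for the short probe and up to 7 mismatches for the long one.
    print(expected_chance_hits(17, 3_000_000_000))
    print(expected_chance_hits(40, 3_000_000_000, max_mismatches=7))
```

The sketch makes the trade-off in the abstract concrete: tolerating mismatches raises the chance-match probability, so a longer probe is needed to keep the expected background low.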
The locus of evolution: evo devo and the genetics of adaptation.
Hoekstra, Hopi E; Coyne, Jerry A
2007-05-01
An important tenet of evolutionary developmental biology ("evo devo") is that adaptive mutations affecting morphology are more likely to occur in the cis-regulatory regions than in the protein-coding regions of genes. This argument rests on two claims: (1) the modular nature of cis-regulatory elements largely frees them from deleterious pleiotropic effects, and (2) a growing body of empirical evidence appears to support the predominant role of gene regulatory change in adaptation, especially morphological adaptation. Here we discuss and critique these assertions. We first show that there is no theoretical or empirical basis for the evo devo contention that adaptations involving morphology evolve by genetic mechanisms different from those involving physiology and other traits. In addition, some forms of protein evolution can avoid the negative consequences of pleiotropy, most notably via gene duplication. In light of evo devo claims, we then examine the substantial data on the genetic basis of adaptation from both genome-wide surveys and single-locus studies. Genomic studies lend little support to the cis-regulatory theory: many of these have detected adaptation in protein-coding regions, including transcription factors, whereas few have examined regulatory regions. Turning to single-locus studies, we note that the most widely cited examples of adaptive cis-regulatory mutations focus on trait loss rather than gain, and none have yet pinpointed an evolved regulatory site. In contrast, there are many studies that have both identified structural mutations and functionally verified their contribution to adaptation and speciation. Neither the theoretical arguments nor the data from nature, then, support the claim for a predominance of cis-regulatory mutations in evolution. Although this claim may be true, it is at best premature. 
Adaptation and speciation probably proceed through a combination of cis-regulatory and structural mutations, with a substantial contribution of the latter.
Vouille, V; Amiche, M; Nicolas, P
1997-09-01
We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, wherein exon 1 encodes a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contains the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibians.
Haughey, Heather M; Kaiser, Alan L; Johnson, Thomas E; Bennett, Beth; Sikela, James M; Zahniser, Nancy R
2005-10-01
Altered noradrenergic neurotransmission is associated with depression and may contribute to drug abuse and alcoholism. Differential initial sensitivity to ethanol is an important predictor of risk for future alcoholism, making the inbred long-sleep (ILS) and inbred short-sleep (ISS) mice a useful model for identifying genes that may contribute to alcoholism. In this study, molecular biological, neurochemical, and behavioral approaches were used to test the hypothesis that the norepinephrine transporter (NET) contributes to the differences in ethanol-induced loss of righting reflex (LORR) in ILS and ISS mice. We used these mice to investigate the NET as a candidate gene contributing to this phenotype. The ILS and ISS mice carry different DNA haplotypes for NET, showing eight silent differences between allelic coding regions. Only the ILS haplotype is found in other mouse strains thus far sequenced. Brain regional analyses revealed that ILS mice have 30 to 50% lower [3H]NE uptake, NET binding, and NET mRNA levels than ISS mice. Maximal [3H]NE uptake and NET number were reduced, with no change in affinity, in the ILS mice. These neurobiological changes were associated with significant influences on the behavioral phenotype of these mice, as demonstrated by (1) a differential response in the duration of ethanol-induced LORR in ILS and ISS mice pretreated with a NET inhibitor and (2) increased ethanol-induced LORR in LXS recombinant inbred (RI) strains, homozygous for ILS in the NET chromosomal region (44-47 cM), compared with ISS homozygous strains. This is the first report to suggest that the NET gene is one of many possible genetic factors influencing ethanol sensitivity in ILS, ISS, and LXS RI mouse strains.
Jakobsen, Tim Holm; Hansen, Martin Asser; Jensen, Peter Østrup; Hansen, Lars; Riber, Leise; Cockburn, April; Kolpen, Mette; Rønne Hansen, Christine; Ridderberg, Winnie; Eickhardt, Steffen; Hansen, Marlene; Kerpedjiev, Peter; Alhede, Morten; Qvortrup, Klaus; Burmølle, Mette; Moser, Claus; Kühl, Michael; Ciofu, Oana; Givskov, Michael; Sørensen, Søren J.; Høiby, Niels; Bjarnsholt, Thomas
2013-01-01
Achromobacter xylosoxidans is an environmental opportunistic pathogen, which infects an increasing number of immunocompromised patients. In this study we combined genomic analysis of a clinically isolated A. xylosoxidans strain with phenotypic investigations of its important pathogenic features. We present a complete assembly of the genome of A. xylosoxidans NH44784-1996, an isolate from a cystic fibrosis patient obtained in 1996. The genome of A. xylosoxidans NH44784-1996 contains approximately 7 million base pairs with 6390 potential protein-coding sequences. We identified several features that render it an opportunistic human pathogen. We found genes involved in anaerobic growth and the pgaABCD operon encoding the biofilm adhesin poly-β-1,6-N-acetyl-D-glucosamine. Furthermore, the genome contains a range of antibiotic resistance genes coding for efflux pump systems and antibiotic-modifying enzymes. In vitro studies of A. xylosoxidans NH44784-1996 confirmed the genomic evidence for its ability to form biofilms, grow anaerobically via denitrification, and resist a broad range of antibiotics. Our investigation enables further studies of the functionality of important identified genes contributing to the pathogenicity of A. xylosoxidans and thereby improves our understanding of, and ability to treat, this emerging pathogen. PMID:23894309
De novo mutations in regulatory elements in neurodevelopmental disorders
Short, Patrick J.; McRae, Jeremy F.; Gallone, Giuseppe; Sifrim, Alejandro; Won, Hyejung; Geschwind, Daniel H.; Wright, Caroline F.; Firth, Helen V; FitzPatrick, David R.; Barrett, Jeffrey C.; Hurles, Matthew E.
2018-01-01
We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1-3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders. PMID:29562236
Han, Zhenyun; Hu, Yanan; Lv, Yuanda; Sun, Yaqiang; Shen, Fei; Wang, Yi; Zhang, Xinzhong; Xu, Xuefeng
2018-01-01
Through natural or human selection, many fleshy fruits have evolved vivid external or internal coloration, which often develops during ripening. Such developmental changes in color are associated with the biosynthesis of pigments as well as with degreening through chlorophyll degradation. Here, we demonstrated that natural variation in the coding region of the gene ETHYLENE RESPONSE FACTOR17 (ERF17) contributes to apple (Malus domestica) fruit peel degreening. Specifically, ERF17 mutant alleles with different serine (Ser) repeat insertions in the coding region exhibited enhanced transcriptional regulation activity in a dual-luciferase reporter assay when more Ser repeats were present. Notably, surface plasmon resonance analysis showed that the number of Ser repeats affected the binding activity of ERF17 to the promoter sequences of chlorophyll degradation-related genes. In addition, overexpression of ERF17 in evergreen apples altered the accumulation of chlorophyll. Furthermore, we demonstrated that ERF17 has been under selection since the origin of apple tree cultivation. Taken together, these results reveal allelic variation underlying an important fruit quality trait and a molecular genetic mechanism associated with apple domestication. PMID:29431631
Divergence and Mosaicism among Virulent Soil Phages of the Burkholderia cepacia Complex‡
Summer, Elizabeth J.; Gonzalez, Carlos F.; Bomer, Morgan; Carlile, Thomas; Embry, Addie; Kucherka, Amalie M.; Lee, Jonte; Mebane, Leslie; Morrison, William C.; Mark, Louise; King, Maria D.; LiPuma, John J.; Vidaver, Anne K.; Young, Ry
2006-01-01
We have determined the genomic sequences of four virulent myophages, Bcep1, Bcep43, BcepB1A, and Bcep781, whose hosts are soil isolates of the Burkholderia cepacia complex. Despite temporal and spatial separations between initial isolations, three of the phages (Bcep1, Bcep43, and Bcep781, designated the Bcep781 group) exhibit 87% to 99% sequence identity to one another and most coding region differences are due to synonymous nucleotide substitutions, a hallmark of neutral genetic drift. Phage BcepB1A has a very different genome organization but is clearly a mosaic with respect to many of the genes of the Bcep781 group, as is a defective prophage element in Photorhabdus luminescens. Functions were assigned to 27 out of 71 predicted genes of Bcep1 despite extreme sequence divergence. Using a lambda repressor fusion technique, 10 Bcep781-encoded proteins were identified for their ability to support homotypic interactions. While head and tail morphogenesis genes have retained canonical gene order despite extreme sequence divergence, genes involved in DNA metabolism and host lysis are not organized as in other phages. This unusual genome arrangement may contribute to the ability of the Bcep781-like phages to maintain a unified genomic type. However, the Bcep781 group phages can also engage in lateral gene transfer events with otherwise unrelated phages, a process that contributes to the broader-scale genomic mosaicism prevalent among the tailed phages. PMID:16352842
XGC developments for a more efficient XGC-GENE code coupling
NASA Astrophysics Data System (ADS)
Dominski, Julien; Hager, Robert; Ku, Seung-Hoe; Chang, Cs
2017-10-01
In the Exascale Computing Program, the High-Fidelity Whole Device Modeling project initially aims at delivering a tightly-coupled simulation of plasma neoclassical and turbulence dynamics from the core to the edge of the tokamak. To permit such simulations, the gyrokinetic codes GENE and XGC will be coupled together. Efforts are being made to improve the agreement between the numerical schemes in the coupling region. One of the difficulties of coupling those codes together is the incompatibility of their grids. GENE is a continuum grid-based code and XGC is a Particle-In-Cell code using an unstructured triangular mesh. A field-aligned filter is thus implemented in XGC. Even though XGC originally had an approximately field-following mesh, this field-aligned filter brings the perturbation discretization closer to the one solved in the field-aligned code GENE. Additionally, new XGC gyro-averaging matrices are implemented on a velocity grid adapted to the plasma properties, thus ensuring the same accuracy from the core to the edge regions.
Prediction and Validation of Disease Genes Using HeteSim Scores.
Zeng, Xiangxiang; Liao, Yuanlu; Liu, Yuansheng; Zou, Quan
2017-01-01
Deciphering gene-disease associations is an important goal in biomedical research. In this paper, we use a novel relevance measure, called HeteSim, to prioritize candidate disease genes. Two methods based on heterogeneous networks constructed using protein-protein interaction, gene-phenotype associations, and phenotype-phenotype similarity are presented. In HeteSim_MultiPath (HSMP), HeteSim scores of different paths are combined with a constant that dampens the contributions of longer paths. In HeteSim_SVM (HSSVM), HeteSim scores are combined with a machine learning method. The 3-fold experiments show that our non-machine learning method HSMP performs better than the existing non-machine learning methods, while our machine learning method HSSVM obtains accuracy similar to that of the best existing machine learning method, CATAPULT. From the analysis of the top 10 predicted genes for different diseases, we found that HSSVM avoids the disadvantage of the existing machine learning based methods, which always predict similar genes for different diseases. The data sets and Matlab code for the two methods are freely available for download at http://lab.malab.cn/data/HeteSim/index.jsp.
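The idea of combining per-path relevance scores with a constant that down-weights longer paths can be sketched as follows. This is only an illustration under stated assumptions: the geometric weighting (damping raised to the path length), the default damping value, the function names, and the toy scores are all hypothetical, and the abstract does not specify the exact formula used in HSMP. Per-path HeteSim scores are taken as given inputs rather than computed.

```python
def combine_path_scores(path_scores, damping=0.5):
    """Combine per-path relevance scores, down-weighting longer paths.

    path_scores: iterable of (path_length, score) pairs.
    damping: assumed constant in (0, 1); weight = damping ** path_length.
    """
    if not 0 < damping < 1:
        raise ValueError("damping must lie strictly between 0 and 1")
    return sum(damping ** length * score for length, score in path_scores)


def rank_genes(candidates, damping=0.5):
    """Rank candidate genes by combined score, highest first.

    candidates: dict mapping gene name -> list of (path_length, score).
    """
    combined = {
        gene: combine_path_scores(scores, damping)
        for gene, scores in candidates.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical genes and scores, for illustration only.
    toy = {
        "geneA": [(2, 0.8), (4, 0.4)],
        "geneB": [(2, 0.1), (4, 0.9)],
    }
    print(rank_genes(toy))
```

With this weighting, a gene supported mainly by a short path outranks one supported mainly by a long path of similar raw score, which is the behaviour the damping constant is meant to produce.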
McClelland, Shawn; Brennan, Gary P; Dubé, Celine; Rajpara, Seeta; Iyer, Shruti; Richichi, Cristina; Bernard, Christophe; Baram, Tallie Z
2014-01-01
The mechanisms generating epileptic neuronal networks following insults such as severe seizures are unknown. We have previously shown that interfering with the function of the neuron-restrictive silencer factor (NRSF/REST), an important transcription factor that influences neuronal phenotype, attenuated development of this disorder. In this study, we found that epilepsy-provoking seizures increased the low NRSF levels in mature hippocampus several fold yet surprisingly, provoked repression of only a subset (∼10%) of potential NRSF target genes. Accordingly, the repressed gene-set was rescued when NRSF binding to chromatin was blocked. Unexpectedly, genes selectively repressed by NRSF had mid-range binding frequencies to the repressor, a property that rendered them sensitive to moderate fluctuations of NRSF levels. Genes selectively regulated by NRSF during epileptogenesis coded for ion channels, receptors, and other crucial contributors to neuronal function. Thus, dynamic, selective regulation of NRSF target genes may play a role in influencing neuronal properties in pathological and physiological contexts. DOI: http://dx.doi.org/10.7554/eLife.01267.001 PMID:25117540
Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D
2018-04-30
The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contributes to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its close tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones: in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights into the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.
Optimization of algorithm of coding of genetic information of Chlamydia
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Ulyanov, Sergey S.; Zaytsev, Sergey S.; Saltykov, Yury V.; Ulianova, Onega V.
2018-04-01
A new method of coding genetic information using coherent optical fields is developed. A universal technique for transforming the nucleotide sequences of a bacterial gene into a laser speckle pattern is suggested. Reference speckle patterns of the nucleotide sequences of the omp1 gene of typical wild strains of Chlamydia trachomatis of genovars D, E, F, G, J and K, as well as Chlamydia psittaci serovar I, are generated. The algorithm for coding gene information into a speckle pattern is optimized. Fully developed speckles with Gaussian statistics for gene-based speckles have been used as the criterion of optimization.
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).
Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai
2014-12-01
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. The COI gene begins with GTG as its start codon, whereas the other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as the stop codon.
Genetic background of extreme violent behavior
Tiihonen, J; Rautiainen, M-R; Ollila, HM; Repo-Tiihonen, E; Virkkunen, M; Palotie, A; Pietiläinen, O; Kristiansson, K; Joukamaa, M; Lauerma, H; Saarela, J; Tyni, S; Vartiainen, H; Paananen, J; Goldman, D; Paunio, T
2015-01-01
In developed countries, the majority of all violent crime is committed by a small group of antisocial recidivistic offenders, but no genes have been shown to contribute to recidivistic violent offending or severe violent behavior, such as homicide. Our results, from two independent cohorts of Finnish prisoners, revealed that a monoamine oxidase A (MAOA) low-activity genotype (contributing to low dopamine turnover rate) as well as the CDH13 gene (coding for neuronal membrane adhesion protein) are associated with extremely violent behavior (at least 10 committed homicides, attempted homicides or batteries). No substantial signal was observed for either MAOA or CDH13 among non-violent offenders, indicating that findings were specific for violent offending, and not largely attributable to substance abuse or antisocial personality disorder. These results indicate both low monoamine metabolism and neuronal membrane dysfunction as plausible factors in the etiology of extreme criminal violent behavior, and imply that at least about 5–10% of all severe violent crime in Finland is attributable to the aforementioned MAOA and CDH13 genotypes. PMID:25349169
The invasive MED/Q Bemisia tabaci genome: a tale of gene loss and gene gain.
Xie, Wen; Yang, Xin; Chen, Chunhai; Yang, Zezhong; Guo, Litao; Wang, Dan; Huang, Jinqun; Zhang, Hailin; Wen, Yanan; Zhao, Jinyang; Wu, Qingjun; Wang, Shaoli; Coates, Brad S; Zhou, Xuguo; Zhang, Youjun
2018-01-22
Sweetpotato whitefly, Bemisia tabaci MED/Q and MEAM1/B, are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q's genomic signatures that may contribute to the highly invasive nature of this emerging insect pest. The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in the MED/Q and MEAM1/B genomes is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in the MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to gene loss is a shared ancestral trait among hemipteran insects. The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate species' invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions.
Mikhailov, Alexander T; Torrado, Mario
2018-05-12
There is growing evidence that putative gene regulatory networks including cardio-enriched transcription factors, such as PITX2, TBX5, ZFHX3, and SHOX2, and their effector/target genes along with downstream non-coding RNAs can play a potentially important role in the process of adaptive and maladaptive atrial rhythm remodeling. In turn, expression of atrial fibrillation-associated transcription factors is under the control of upstream regulatory non-coding RNAs. This review broadly explores gene regulatory mechanisms associated with susceptibility to atrial fibrillation, with key examples from both animal models and patients, within the context of both cardiac transcription factors and non-coding RNAs. These two systems appear to have multiple levels of cross-regulation and act coordinately to achieve effective control of atrial rhythm effector gene expression. Perturbations of a dynamic expression balance between transcription factors and corresponding non-coding RNAs can provoke the development or promote the progression of atrial fibrillation. We also outline deficiencies in current models and discuss ongoing studies to clarify remaining mechanistic questions. An understanding of the function of transcription factors and non-coding RNAs in gene regulatory networks associated with atrial fibrillation risk will enable the development of innovative therapeutic strategies.
Methylation of miRNA genes and oncogenesis.
Loginov, V I; Rykov, S V; Fridman, M V; Braga, E A
2015-02-01
Interaction between microRNA (miRNA) and messenger RNA of target genes at the posttranscriptional level provides fine-tuned dynamic regulation of cell signaling pathways. Each miRNA can be involved in regulating hundreds of protein-coding genes, and, conversely, a number of different miRNAs usually target a structural gene. Epigenetic gene inactivation associated with methylation of promoter CpG-islands is common to both protein-coding genes and miRNA genes. Here, data on functions of miRNAs in development of tumor-cell phenotype are reviewed. Genomic organization of promoter CpG-islands of the miRNA genes located in inter- and intragenic areas is discussed. The literature and our own results on frequency of CpG-island methylation in miRNA genes from tumors are summarized, and data regarding a link between such modification and changed activity of miRNA genes and, consequently, protein-coding target genes are presented. Moreover, the impact of miRNA gene methylation on key oncogenetic processes as well as affected signaling pathways is discussed.
Bergkemper, Fabian; Kublik, Susanne; Lang, Friederike; Krüger, Jaane; Vestergaard, Gisle; Schloter, Michael; Schulz, Stefanie
2016-06-01
Phosphorus (P) is of central importance for cellular life but likewise a limiting macronutrient in numerous environments. Certainly microorganisms have proven their ability to increase the phosphorus bioavailability by mineralization of organic-P and solubilization of inorganic-P. On the other hand they efficiently take up P and compete with other biota for phosphorus. However the actual microbial community that is associated to the turnover of this crucial macronutrient in different ecosystems remains largely anonymous especially taking effects of seasonality and spatial heterogeneity into account. In this study seven oligonucleotide primers are presented which target genes coding for microbial acid and alkaline phosphatases (phoN, phoD), phytases (appA), phosphonatases (phnX) as well as the quinoprotein glucose dehydrogenase (gcd) and different P transporters (pitA, pstS). Illumina amplicon sequencing of soil genomic DNA underlined the high rate of primer specificity towards the respective target gene which usually ranged between 98% and 100% (phoN: 87%). As expected the primers amplified genes from a broad diversity of distinct microorganisms. Using DNA from a beech dominated forest soil, the highest microbial diversity was detected for the alkaline phosphatase (phoD) gene which was amplified from 15 distinct phyla respectively 81 families. Noteworthy the primers also allowed amplification of phoD from 6 fungal orders. The genes coding for acid phosphatase (phoN) and the quinoprotein glucose dehydrogenase (gcd) were amplified from 20 respectively 17 different microbial orders. In comparison the phytase and phosphonatase (appA, phnX) primers covered 13 bacterial orders from 2 different phyla respectively. Although the amplified microbial diversity was apparently limited both primers reliably detected all orders that contributed to the P turnover in the investigated soil as revealed by a previous metagenomic approach. 
Genes that code for microbial P transporter (pitA, pstS) were amplified from 13 respectively 9 distinct microbial orders. Accordingly the introduced primers represent a valuable tool for further analysis of the microbial community involved in the turnover of phosphorus in soils but most likely also in other environments. Copyright © 2016 Elsevier B.V. All rights reserved.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Brečević, Lukrecija; Rinčić, Martina; Krsnik, Željka; Sedmak, Goran; Hamid, Ahmed B.; Kosyakova, Nadezda; Galić, Ivan; Liehr, Thomas; Borovečki, Fran
2015-01-01
We describe an as yet unreported neocentric small supernumerary marker chromosome (sSMC) derived from chromosome 1p21.3p21.2. It was present in 80% of the lymphocytes in a male patient with intellectual disability, severe speech deficit, mild dysmorphic features, and hyperactivity with elements of autism spectrum disorder (ASD). Several important neurodevelopmental genes are affected by the 3.56 Mb copy number gain of 1p21.3p21.2, which may be considered reciprocal in gene content to the recently recognized 1p21.3 microdeletion syndrome. Both 1p21.3 deletions and the presented duplication display overlapping symptoms, fitting the same disorder category. Contribution of coding and non-coding genes to the phenotype is discussed in the light of cellular and intercellular homeostasis disequilibrium. In line with this the presented 1p21.3p21.2 copy number gain correlated to 1p21.3 microdeletion syndrome verifies the hypothesis of a cumulative effect of the number of deregulated genes - homeostasis disequilibrium leading to overlapping phenotypes between microdeletion and microduplication syndromes. Although miR-137 appears to be the major player in the 1p21.3p21.2 region, deregulation of the DPYD (dihydropyrimidine dehydrogenase) gene may potentially affect neighboring genes underlying the overlapping symptoms present in both the copy number loss and copy number gain of 1p21. Namely, the all-in approach revealed that DPYD is a complex gene whose expression is epigenetically regulated by long non-coding RNAs (lncRNAs) within the locus. Furthermore, the long interspersed nuclear element-1 (LINE-1) L1MC1 transposon inserted in DPYD intronic transcript 1 (DPYD-IT1) lncRNA with its parasites, TcMAR-Tigger5b and pair of Alu repeats appears to be the “weakest link” within the DPYD gene liable to break. 
Identification of the precise mechanism through which DPYD is epigenetically regulated, and underlying reasons why exactly the break (FRA1E) happens, will consequently pave the way toward preventing severe toxicity to the antineoplastic drug 5-fluorouracil (5-FU) and development of the causative therapy for the dihydropyrimidine dehydrogenase deficiency. PMID:28123791
Dubey, Bhawna; Meganathan, P R; Haque, Ikramul
2012-07-01
This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17,258 bp, comprising 37 genes (13 protein-coding genes, 22 tRNA genes, and 2 ribosomal RNA genes) along with duplicate control regions, is described herein. The P. molurus molurus mt genome is relatively similar to other snake mt genomes with respect to gene arrangement, composition, tRNA structures and skews of AT/GC bases. The nucleotide composition of the genome shows that there is more A and C than T and G on the positive strand, as revealed by positive AT and CG skews. Comparison of individual protein-coding genes with other snake genomes suggests that the ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference for NNC codons over NNG codons in the mt genome of P. molurus. Also, the synonymous and non-synonymous substitution rates (ka/ks) suggest that most of the protein-coding genes are under purifying selection pressure. The phylogenetic analyses involving the concatenated 13 protein-coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
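The strand skews mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definitions AT skew = (A − T)/(A + T) and CG skew = (C − G)/(C + G); the abstract does not state which formula the authors used. The function name `base_skews` and the toy sequence are hypothetical and are not data from the study.

```python
from collections import Counter


def base_skews(seq: str) -> dict:
    """Return AT and CG skew for one strand of a nucleotide sequence.

    Assumed definitions (not confirmed by the abstract):
      AT skew = (A - T) / (A + T)
      CG skew = (C - G) / (C + G)
    A positive value means more A than T (or more C than G) on the strand.
    """
    counts = Counter(seq.upper())
    a, t, c, g = (counts[base] for base in "ATCG")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    cg_skew = (c - g) / (c + g) if (c + g) else 0.0
    return {"AT": at_skew, "CG": cg_skew}


# Made-up toy sequence: 3 A, 3 C, 2 T, 1 G.
toy_sequence = "AAACCCTTG"
skews = base_skews(toy_sequence)
print(skews)
```

For a real genome, the sequence string would be read from a FASTA record of the positive strand; both skews being positive would correspond to the pattern the abstract describes.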
Clinical and experimental advances in congenital and paediatric cataracts
Churchill, Amanda; Graw, Jochen
2011-01-01
Cataracts (opacities of the lens) are frequent in the elderly, but rare in paediatric practice. Congenital cataracts (in industrialized countries) are mainly caused by mutations affecting lens development. Much of our knowledge about the underlying mechanisms of cataractogenesis has come from the genetic analysis of affected families: there are contributions from genes coding for transcription factors (such as FoxE3, Maf, Pitx3) and structural proteins such as crystallins or connexins. In addition, there are contributions from enzymes affecting sugar pathways (particularly the galactose pathway) and from a quite unexpected area: axon guidance molecules like ephrins and their receptors. Cataractous mouse lenses can be identified easily by visual inspection, and a remarkable number of mutant lines have now been characterized. Generally, most of the mouse mutants show a similar phenotype to their human counterparts; however, there are some remarkable differences. It should be noted that many mutations affect genes that are expressed not only in the lens, but also in tissues and organs outside the eye. There is increasing evidence for pleiotropic effects of these genes, and increasing consideration that cataracts may act as early and readily detectable biomarkers for a number of systemic syndromes. PMID:21402583
Energy metabolism in Desulfovibrio vulgaris Hildenborough: insights from transcriptome analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pereira, Patricia M.; He, Qiang; Valente, Filipa M.A.
2007-11-01
Sulphate-reducing bacteria are important players in the global sulphur and carbon cycles, with considerable economical and ecological impact. However, the process of sulphate respiration is still incompletely understood. Several mechanisms of energy conservation have been proposed, but it is unclear how the different strategies contribute to the overall process. In order to obtain a deeper insight into the energy metabolism of sulphate-reducers, whole-genome microarrays were used to compare the transcriptional response of Desulfovibrio vulgaris Hildenborough grown with hydrogen/sulphate, pyruvate/sulphate, pyruvate with limiting sulphate, and lactate/thiosulphate, relative to growth in lactate/sulphate. Growth with hydrogen/sulphate showed the largest number of differentially expressed genes and the largest changes in transcript levels. In this condition the most up-regulated energy metabolism genes were those coding for the periplasmic [NiFeSe]hydrogenase, followed by the Ech hydrogenase. The results also provide evidence for the involvement of formate cycling and the recently proposed ethanol pathway during growth in hydrogen. The pathway involving CO cycling is relevant during growth on lactate and pyruvate, but not during growth in hydrogen as the most down-regulated genes were those coding for the CO-induced hydrogenase. Growth on lactate/thiosulphate reveals a down-regulation of several energy metabolism genes similar to what was observed in the presence of nitrite. This study identifies the role of several proteins involved in the energy metabolism of D. vulgaris and highlights several novel genes related to this process, revealing a more complex bioenergetic metabolism than previously considered.
Sudden infant death syndrome (SIDS) and polymorphisms in Monoamine oxidase A gene (MAOA): a revisit.
Groß, Maximilian; Bajanowski, Thomas; Vennemann, Mechtild; Poetsch, Micaela
2014-01-01
Literature describes multiple possible links between genetic variations in the neuroadrenergic system and the occurrence of sudden infant death syndrome. The X-chromosomal Monoamine oxidase A (MAOA) is one of the genes with regulatory activity in the noradrenergic and serotonergic neuronal systems and a polymorphism of the promoter which affects the activity of this gene has been proclaimed to contribute significantly to the prevalence of sudden infant death syndrome (SIDS) in three studies from 2009, 2012 and 2013. However, these studies described different significant correlations regarding gender or age of children. Since several studies, suggesting associations between genetic variations and SIDS, were disproved by follow-up analysis, this study was conducted to take a closer look at the MAOA gene and its polymorphisms. The functional MAOA promoter length polymorphism was investigated in 261 SIDS cases and 93 control subjects. Moreover, the allele distribution of 12 coding and non-coding single nucleotide polymorphisms (SNPs) of the MAOA gene was examined in 285 SIDS cases and 93 controls by a minisequencing technique. In contrast to prior studies with fewer individuals, no significant correlations between the occurrence of SIDS and the frequency of allele variants of the promoter polymorphism could be demonstrated, even including the results from the abovementioned previous studies. Regarding the SNPs, three statistically significant associations were observed which had not been described before. This study clearly disproves interactions between MAOA promoter polymorphisms and SIDS, even if variations in single nucleotide polymorphisms of MAOA should be subjected to further analysis to clarify their impact on SIDS.
Mutations in PIGY: expanding the phenotype of inherited glycosylphosphatidylinositol deficiencies
Ilkovski, Biljana; Pagnamenta, Alistair T.; O'Grady, Gina L.; Kinoshita, Taroh; Howard, Malcolm F.; Lek, Monkol; Thomas, Brett; Turner, Anne; Christodoulou, John; Sillence, David; Knight, Samantha J.L.; Popitsch, Niko; Keays, David A.; Anzilotti, Consuelo; Goriely, Anne; Waddell, Leigh B.; Brilot, Fabienne; North, Kathryn N.; Kanzawa, Noriyuki; Macarthur, Daniel G.; Taylor, Jenny C.; Kini, Usha; Murakami, Yoshiko; Clarke, Nigel F.
2015-01-01
Glycosylphosphatidylinositol (GPI)-anchored proteins are ubiquitously expressed in the human body and are important for various functions at the cell surface. Mutations in many GPI biosynthesis genes have been described to date in patients with multi-system disease and together these constitute a subtype of congenital disorders of glycosylation. We used whole exome sequencing in two families to investigate the genetic basis of disease and used RNA and cellular studies to investigate the functional consequences of sequence variants in the PIGY gene. Two families with different phenotypes had homozygous recessive sequence variants in the GPI biosynthesis gene PIGY. Two sisters with c.137T>C (p.Leu46Pro) PIGY variants had multi-system disease including dysmorphism, seizures, severe developmental delay, cataracts and early death. There were significantly reduced levels of GPI-anchored proteins (CD55 and CD59) on the surface of patient-derived skin fibroblasts (∼20–50% compared with controls). In a second, consanguineous family, two siblings had moderate development delay and microcephaly. A homozygous PIGY promoter variant (c.-540G>A) was detected within a 7.7 Mb region of autozygosity. This variant was predicted to disrupt a SP1 consensus binding site and was shown to be associated with reduced gene expression. Mutations in PIGY can occur in coding and non-coding regions of the gene and cause variable phenotypes. This article contributes to understanding of the range of disease phenotypes and disease genes associated with deficiencies of the GPI-anchor biosynthesis pathway and also serves to highlight the potential importance of analysing variants detected in 5′-UTR regions despite their typically low coverage in exome data. PMID:26293662
Mutations in PIGY: expanding the phenotype of inherited glycosylphosphatidylinositol deficiencies.
Ilkovski, Biljana; Pagnamenta, Alistair T; O'Grady, Gina L; Kinoshita, Taroh; Howard, Malcolm F; Lek, Monkol; Thomas, Brett; Turner, Anne; Christodoulou, John; Sillence, David; Knight, Samantha J L; Popitsch, Niko; Keays, David A; Anzilotti, Consuelo; Goriely, Anne; Waddell, Leigh B; Brilot, Fabienne; North, Kathryn N; Kanzawa, Noriyuki; Macarthur, Daniel G; Taylor, Jenny C; Kini, Usha; Murakami, Yoshiko; Clarke, Nigel F
2015-11-01
Glycosylphosphatidylinositol (GPI)-anchored proteins are ubiquitously expressed in the human body and are important for various functions at the cell surface. Mutations in many GPI biosynthesis genes have been described to date in patients with multi-system disease and together these constitute a subtype of congenital disorders of glycosylation. We used whole exome sequencing in two families to investigate the genetic basis of disease and used RNA and cellular studies to investigate the functional consequences of sequence variants in the PIGY gene. Two families with different phenotypes had homozygous recessive sequence variants in the GPI biosynthesis gene PIGY. Two sisters with c.137T>C (p.Leu46Pro) PIGY variants had multi-system disease including dysmorphism, seizures, severe developmental delay, cataracts and early death. There were significantly reduced levels of GPI-anchored proteins (CD55 and CD59) on the surface of patient-derived skin fibroblasts (∼20-50% compared with controls). In a second, consanguineous family, two siblings had moderate development delay and microcephaly. A homozygous PIGY promoter variant (c.-540G>A) was detected within a 7.7 Mb region of autozygosity. This variant was predicted to disrupt a SP1 consensus binding site and was shown to be associated with reduced gene expression. Mutations in PIGY can occur in coding and non-coding regions of the gene and cause variable phenotypes. This article contributes to understanding of the range of disease phenotypes and disease genes associated with deficiencies of the GPI-anchor biosynthesis pathway and also serves to highlight the potential importance of analysing variants detected in 5'-UTR regions despite their typically low coverage in exome data. © The Author 2015. Published by Oxford University Press.
Lack of pathogenic mutations in SOS1 gene in phenytoin-induced gingival overgrowth patients.
Margiotti, Katia; Pascolini, Giulia; Consoli, Federica; Guida, Valentina; Di Bonaventura, Carlo; Giallonardo, Anna Teresa; Pizzuti, Antonio; De Luca, Alessandro
2017-08-01
Gingival overgrowth is a side effect associated with some distinct classes of drugs, such as anticonvulsants, immunosuppressants, and calcium channel blockers. One of the main drugs associated with gingival overgrowth is the antiepileptic phenytoin, which affects gingival tissues by altering extracellular matrix metabolism. It has been shown that mutation of human SOS1 gene is responsible for a rare hereditary gingival fibromatosis type 1, a benign gingival overgrowth. The aim of the present study is to evaluate the possible contribution of SOS1 mutation to gingival overgrowth-related phenotype. We selected and screened for mutations a group of 24 epileptic patients who experienced significant gingival overgrowth following phenytoin therapy. Mutation scanning was carried out by denaturing high-performance liquid chromatography analysis of the entire coding region of the SOS1 gene. Novel identified variants were analyzed in-silico by using Alamut Visual mutation interpretation software, and comparison with normal control group was done. Mutation scanning of the entire coding sequence of SOS1 gene identified seven intronic variants and one new exonic substitution (c.138G>A). The seven common intronic variants were not considered to be of pathogenic importance. The exonic substitution c.138G>A was found to be absent in 100 ethnically matched normal control chromosomes, but was not expected to have functional significance based on prediction bioinformatics tools. This study represents the first mutation analysis of the SOS1 gene in phenytoin-induced gingival overgrowth epileptic patients. Present results suggest that obvious pathogenic mutations in the SOS1 gene do not represent a common mechanism underlying phenytoin-induced gingival overgrowth in epileptic patients; other mechanisms are likely to be involved in the pathogenesis of this drug-induced phenotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
Song, Xuhao; Shen, Fujun; Huang, Jie; Huang, Yan; Du, Lianming; Wang, Chengdong; Fan, Zhenxin; Hou, Rong; Yue, Bisong; Zhang, Xiuyue
2016-09-01
Recently, an increasing number of microsatellites or simple sequence repeats (SSRs) have been found and characterized from transcriptomes. Such SSRs can be employed as putative functional markers to easily tag corresponding genes, which play an important role in biomedical studies and genetic analysis. However, transcriptome-derived SSRs are not yet available for the giant panda (Ailuropoda melanoleuca). In this work, we identified and characterized 20 tetranucleotide microsatellite loci from a transcript database generated from the blood of giant panda. Furthermore, we assigned their predicted transcriptome locations: 16 loci were assigned to untranslated regions (UTRs) and 4 loci were assigned to coding regions (CDSs). Gene identities were determined for 14 transcripts containing the corresponding microsatellites, which provides useful information for studying the potential contribution of SSRs to gene regulation in giant panda. The polymorphic information content (PIC) values ranged from 0.293 to 0.789 with an average of 0.603 for the 16 UTR-derived SSRs. Interestingly, the 4 CDS-derived microsatellites developed in our study were also polymorphic, and the instability of these 4 CDS-derived SSRs was further validated by re-genotyping and sequencing. The genes containing these 4 CDS-derived SSRs were embedded with various types of repeat motifs. The interaction of all the length-changing SSRs might provide a way to counteract coding region frameshifts caused by microsatellite instability. We hope these new gene-associated biomarkers will pave the way for genetic and biomedical studies of giant panda in the future. In sum, this set of transcriptome-derived markers complements the genetic resources available for giant panda. © The American Genetic Association. 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
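The polymorphic information content (PIC) reported above is usually computed from allele frequencies at a locus. The sketch below assumes the widely used Botstein et al. (1980) definition, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ²; the abstract does not say which formula was applied. The function name `pic` and the allele frequencies are hypothetical illustrations, not values from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content for one locus.

    Assumed formula (Botstein et al. 1980):
      PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    `allele_freqs` must be allele frequencies summing to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - cross_term


# Made-up example loci: two and four equally frequent alleles.
pic_two_alleles = pic([0.5, 0.5])
pic_four_alleles = pic([0.25, 0.25, 0.25, 0.25])
print(pic_two_alleles, pic_four_alleles)
```

Under this definition, more alleles at more even frequencies give a higher PIC, which is why multi-allelic microsatellites can reach values like the upper end of the range quoted in the abstract.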
Heimann, Louisa; Horst, Ina; Perduns, Renke; Dreesen, Björn; Offermann, Sascha; Peterhansel, Christoph
2013-05-01
C4 photosynthesis evolved more than 60 times independently in different plant lineages. Each time, multiple genes were recruited into C4 metabolism. The corresponding promoters acquired new regulatory features such as high expression, light induction, or cell type-specific expression in mesophyll or bundle sheath cells. We have previously shown that histone modifications contribute to the regulation of the model C4 phosphoenolpyruvate carboxylase (C4-Pepc) promoter in maize (Zea mays). We here tested the light- and cell type-specific responses of three selected histone acetylations and two histone methylations on five additional C4 genes (C4-Ca, C4-Ppdk, C4-Me, C4-Pepck, and C4-RbcS2) in maize. Histone acetylation and nucleosome occupancy assays indicated extended promoter regions with regulatory upstream regions more than 1,000 bp from the transcription initiation site for most of these genes. Despite the absence of any detectable homology of the promoters at the primary sequence level, histone modification patterns were highly coregulated. Specifically, H3K9ac was regulated by illumination, whereas H3K4me3 was regulated in a cell type-specific manner. We further compared histone modifications on the C4-Pepc and C4-Me genes from maize and the homologous genes from sorghum (Sorghum bicolor) and Setaria italica. Whereas sorghum and maize share a common C4 origin, C4 metabolism evolved independently in S. italica. The distribution of histone modifications over the promoters differed between the species, but differential regulation of light-induced histone acetylation and cell type-specific histone methylation was evident in all three species. We propose that a preexisting histone code was recruited into C4 promoter control during the evolution of C4 metabolism.
2011-01-01
Background: Corynebacterium variabile is part of the complex microflora on the surface of smear-ripened cheeses and contributes to the development of flavor and textural properties during cheese ripening. Still little is known about the metabolic processes and microbial interactions during the production of smear-ripened cheeses. Therefore, the gene repertoire contributing to the lifestyle of the cheese isolate C. variabile DSM 44702 was deduced from the complete genome sequence to get a better understanding of this industrial process. Results: The chromosome of C. variabile DSM 44702 is composed of 3,433,007 bp and contains 3,071 protein-coding regions. A comparative analysis of this gene repertoire with that of other corynebacteria detected 1,534 predicted genes to be specific for the cheese isolate. These genes might contribute to distinct metabolic capabilities of C. variabile, as several of them are associated with metabolic functions in cheese habitats by playing roles in the utilization of alternative carbon and sulphur sources, in amino acid metabolism, and fatty acid degradation. Relevant C. variabile genes confer the capability to catabolize gluconate, lactate, propionate, taurine, and gamma-aminobutyric acid and to utilize external caseins. In addition, C. variabile is equipped with several siderophore biosynthesis gene clusters for iron acquisition and an exceptional repertoire of AraC-regulated iron uptake systems. Moreover, C. variabile can produce acetoin, butanediol, and methanethiol, which are important flavor compounds in smear-ripened cheeses. Conclusions: The genome sequence of C. variabile provides detailed insights into the distinct metabolic features of this bacterium, implying a strong adaption to the iron-depleted cheese surface habitat. By combining in silico data obtained from the genome annotation with previous experimental knowledge, occasional observations on genes that are involved in the complex metabolic capacity of C. variabile were integrated into a global view on the lifestyle of this species. PMID:22053731
Seligmann, Hervé
2013-03-01
Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Baytak, Esra; Gong, Qiang; Akman, Burcu; Yuan, Hongling; Chan, Wing C; Küçük, Can
2017-05-01
Natural killer/T-cell lymphoma is a rare but aggressive neoplasm with poor prognosis. Despite previous reports that showed potential tumor suppressors, such as PRDM1 or oncogenes associated with the etiology of this malignancy, the role of long non-coding RNAs in natural killer/T-cell lymphoma pathobiology has not been addressed to date. Here, we aim to identify cancer-associated dysregulated long non-coding RNAs and signaling pathways or biological processes associated with these long non-coding RNAs in natural killer/T-cell lymphoma cases and to identify the long non-coding RNAs transcriptionally regulated by PRDM1. RNA-Seq analysis revealed 166 and 66 long non-coding RNAs to be significantly overexpressed or underexpressed, respectively, in natural killer/T-cell lymphoma cases compared with resting or activated normal natural killer cells. Novel long non-coding RNAs as well as the cancer-associated ones such as SNHG5, ZFAS1, or MIR155HG were dysregulated. Interestingly, antisense transcripts of many growth-regulating genes appeared to be transcriptionally deregulated. Expression of ZFAS1, which is upregulated in natural killer/T-cell lymphoma cases, showed association with growth-regulating pathways such as stabilization of P53, regulation of apoptosis, cell cycle, or nuclear factor-kappa B signaling in normal and neoplastic natural killer cell samples. Consistent with the tumor suppressive role of PRDM1, we identified MIR155HG and TERC to be transcriptionally downregulated by PRDM1 in two PRDM1-null NK-cell lines when it is ectopically expressed. In conclusion, this is the first study that identified long non-coding RNAs whose expression is dysregulated in natural killer/T-cell lymphoma cases. 
These findings suggest that ZFAS1 and other dysregulated long non-coding RNAs may be involved in natural killer/T-cell lymphoma pathobiology through regulation of cancer-related genes, and loss-of-PRDM1 expression in natural killer/T-cell lymphomas may contribute to overexpression of MIR155HG; thereby promoting tumorigenesis.
Chen, Geng; Yin, Kangping; Shi, Leming; Fang, Yuanzhang; Qi, Ya; Li, Peng; Luo, Jian; He, Bing; Liu, Mingyao; Shi, Tieliu
2011-01-01
In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and the expression levels of the isoforms of known human genes, as well as the conservation and disease association of long noncoding RNAs (ncRNAs), with two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines. Comparative analysis revealed that about two-thirds of the genes expressed between brain and cell lines are the same, but less than one-third of their isoforms are identical. Besides those genes specifically expressed in brain and cell lines, about 66% of genes expressed in common encoded different isoforms. Moreover, most genes dominantly expressed one isoform and some genes only generated protein-coding (or noncoding) RNAs in one sample but not in another. We found that 282 human genes could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, and most of those long ncRNAs contain conserved elements across either 46 vertebrates or 33 placental mammals or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer; several of those differentially expressed long ncRNAs were validated by RT-PCR. In addition, those validated differentially expressed long ncRNAs were found to be significantly correlated with certain breast cancer or lung cancer related genes, indicating an important biological relationship between long ncRNAs and human cancers. Our findings reveal that the differences in gene expression profile between samples mainly result from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level for completely illustrating the intricate transcriptome.
The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.
Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P
2017-07-31
The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.
Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe
2016-02-15
Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. The genetic and molecular data about it is deficient. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991bp in length, including a large single copy (LSC) region of 81,240bp, a small single copy (SSC) region of 17,085bp and a pair of inverted repeats (IRs) of 25,333bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
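The region lengths reported above can be checked against the stated total. A quadripartite chloroplast genome has one LSC, one SSC and two IR copies, so the total should equal LSC + SSC + 2 × IR. A minimal Python sketch of that arithmetic, using only figures from the abstract (the function and variable names are illustrative and do not come from the paper):

```python
# Region lengths in bp, as reported in the abstract above.
LSC = 81_240   # large single copy region
SSC = 17_085   # small single copy region
IR = 25_333    # one inverted repeat; the genome carries two copies


def cp_genome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite chloroplast genome (two IR copies)."""
    return lsc + ssc + 2 * ir


def unique_gene_count(protein_coding: int, trna: int, rrna: int) -> int:
    """Sum of the gene classes listed in the abstract."""
    return protein_coding + trna + rrna


print(cp_genome_length(LSC, SSC, IR))   # 148991
print(unique_gene_count(78, 30, 4))     # 112
```

Both values agree with the 148,991 bp genome length and the 112 unique genes given in the abstract.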
McTavish, H; LaQuier, F; Arciero, D; Logan, M; Mundfrom, G; Fuchs, J A; Hooper, A B
1993-04-01
The genome of Nitrosomonas europaea contains at least three copies each of the genes coding for hydroxylamine oxidoreductase (HAO) and cytochrome c554. A copy of an HAO gene is always located within 2.7 kb of a copy of a cytochrome c554 gene. Cytochrome P-460, a protein that shares very unusual spectral features with HAO, was found to be encoded by a gene separate from the HAO genes.
McGuire, Austen B; Rafi, Syed K; Manzardo, Ann M; Butler, Merlin G
2016-05-05
Mammalian chromosomes are comprised of complex chromatin architecture with the specific assembly and configuration of each chromosome influencing gene expression and function in yet undefined ways by varying degrees of heterochromatinization that result in Giemsa (G) negative euchromatic (light) bands and G-positive heterochromatic (dark) bands. We carried out morphometric measurements of high-resolution chromosome ideograms for the first time to characterize the total euchromatic and heterochromatic chromosome band length, distribution and localization of 20,145 known protein-coding genes, 790 recognized autism spectrum disorder (ASD) genes and 365 obesity genes. The individual lengths of G-negative euchromatin and G-positive heterochromatin chromosome bands were measured in millimeters and recorded from scaled and stacked digital images of 850-band high-resolution ideograms supplied by the International Society of Chromosome Nomenclature (ISCN) 2013. Our overall measurements followed established banding patterns based on chromosome size. G-negative euchromatic band regions contained 60% of protein-coding genes while the remaining 40% were distributed across the four heterochromatic dark band sub-types. ASD genes were disproportionately overrepresented in the darker heterochromatic sub-bands, while the obesity gene distribution pattern did not significantly differ from protein-coding genes. Our study supports recent trends implicating genes located in heterochromatin regions playing a role in biological processes including neurodevelopment and function, specifically genes associated with ASD.
Fraschka, Sabine Anne-Kristin; Henderson, Rob Wilhelmus Maria; Bártfai, Richárd
2016-01-01
Histones, by packaging and organizing the DNA into chromatin, serve as essential building blocks for eukaryotic life. The basic structure of the chromatin is established by four canonical histones (H2A, H2B, H3 and H4), while histone variants are more commonly utilized to alter the properties of specific chromatin domains. H3.3, a variant of histone H3, was found to have diverse localization patterns and functions across species but has been rather poorly studied in protists. Here we present the first genome-wide analysis of H3.3 in the malaria-causing, apicomplexan parasite, P. falciparum, which revealed a complex occupancy profile consisting of conserved and parasite-specific features. In contrast to other histone variants, PfH3.3 primarily demarcates euchromatic coding and subtelomeric repetitive sequences. Stable occupancy of PfH3.3 in these regions is largely uncoupled from the transcriptional activity and appears to be primarily dependent on the GC-content of the underlying DNA. Importantly, PfH3.3 specifically marks the promoter region of an active and poised, but not inactive antigenic variation (var) gene, thereby potentially contributing to immune evasion. Collectively, our data suggest that PfH3.3, together with other histone variants, indexes the P. falciparum genome to functionally distinct domains and contributes to a key survival strategy of this deadly pathogen. PMID:27555062
Gene and genon concept: coding versus regulation
2007-01-01
We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. 
Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon. PMID:18087760
The complete mitochondrial genome sequence of Aesopia cornuta (Pleuronectiformes: Soleidae).
Wang, Shu-Ying; Shi, Wei; Wang, Zhong-Ming; Gong, Li; Kong, Xiao-Yu
2015-02-01
Aesopia cornuta belongs to the family Soleidae of Pleuronectiformes, and its morphological characters are very similar to those of Zebrias. In this article, we sequenced, characterized, and compared the complete mitogenome of A. cornuta for the first time. The genome is 16,737 base pairs in length and contains the typical 37 genes, including 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes, as well as a putative L-strand replication origin and a putative control region. The gene organization is identical to that of typical bony fishes. The overall base composition is 29.1, 28.3, 26.8 and 15.8% for C, A, T and G, respectively, with a slight AT bias of 55.1%. This result is expected to contribute to understanding the systematic evolution of the genus Aesopia and to further taxonomic and phylogenetic studies of Soleidae and Pleuronectiformes.
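The AT bias quoted above follows directly from the per-base percentages. The sketch below, in Python, recomputes it from the reported values and also shows how such a composition would be derived from a sequence string. The helper names and the short test sequence are illustrative assumptions and are not data from the paper.

```python
# Percentages as reported in the abstract above.
reported = {"C": 29.1, "A": 28.3, "T": 26.8, "G": 15.8}


def at_content(composition: dict[str, float]) -> float:
    """A + T percentage, rounded to one decimal place."""
    return round(composition["A"] + composition["T"], 1)


def base_composition(seq: str) -> dict[str, float]:
    """Percentage of each base in a DNA sequence (toy helper)."""
    seq = seq.upper()
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ACGT"}


print(at_content(reported))                      # 55.1
print(at_content(base_composition("ACGTAATT")))  # 75.0 (made-up sequence)
```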
Chen, Chaoyang; Sun, Chongran; Wu, Yi-Rui
2018-03-21
A wild-type solventogenic strain, Clostridium diolis WST, isolated from mangrove sediments, was found to produce high amounts of butanol and acetone with negligible levels of ethanol and acids from glucose via a unique acetone-butanol (AB) fermentation pathway. Through genomic sequencing, the assembled draft genome of strain WST is calculated to be 5.85 Mb with a GC content of 29.69% and contains 5263 genes, including 5049 annotated protein-coding sequences. Among these annotated genes, the butanol dehydrogenase gene (bdh) was present in a higher amount in strain WST than in other clostridial strains, which is positively related to its highly efficient production of butanol. We therefore present a draft genome sequence analysis of strain WST in this article, which should facilitate further understanding of the solventogenic mechanism of this microorganism.
Sensitive Periods in Epigenetics: bringing us closer to complex behavioral phenotypes
Nagy, Corina; Turecki, Gustavo
2017-01-01
Genetic studies have attempted to elucidate causal mechanisms for the development of complex disease but genome-wide associations have been largely unsuccessful in establishing these links. As an alternative link between genes and disease, recent efforts have focused on mechanisms that alter the function of genes without altering the underlying DNA sequence. Known as epigenetic mechanisms, these include: DNA methylation, chromatin conformational changes through histone modifications, non-coding RNAs, and most recently, 5-hydroxymethylcytosine. Though DNA methylation is involved in normal development, aging and gene regulation, altered methylation patterns have been associated with disease. It is generally believed that early life constitutes a period during which there is increased sensitivity to the regulatory effects of epigenetic mechanisms. The purpose of this review is to outline the contribution of epigenetic mechanisms to genomic function, particularly in the development of complex behavioral phenotypes, focusing on the sensitive periods. PMID:22920183
Naval-Sanchez, Marina; Nguyen, Quan; McWilliam, Sean; Porto-Neto, Laercio R; Tellam, Ross; Vuocolo, Tony; Reverter, Antonio; Perez-Enciso, Miguel; Brauning, Rudiger; Clarke, Shannon; McCulloch, Alan; Zamani, Wahid; Naderi, Saeid; Rezaei, Hamid Reza; Pompanon, Francois; Taberlet, Pierre; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Jhangiani, Shalini N; Cockett, Noelle; Daetwyler, Hans; Kijas, James
2018-02-28
Domestication fundamentally reshaped animal morphology, physiology and behaviour, offering the opportunity to investigate the molecular processes driving evolutionary change. Here we assess sheep domestication and artificial selection by comparing genome sequence from 43 modern breeds (Ovis aries) and their Asian mouflon ancestor (O. orientalis) to identify selection sweeps. Next, we provide a comparative functional annotation of the sheep genome, validated using experimental ChIP-Seq of sheep tissue. Using these annotations, we evaluate the impact of selection and domestication on regulatory sequences and find that sweeps are significantly enriched for protein coding genes, proximal regulatory elements of genes and genome features associated with active transcription. Finally, we find individual sites displaying strong allele frequency divergence are enriched for the same regulatory features. Our data demonstrate that remodelling of gene expression is likely to have been one of the evolutionary forces that drove phenotypic diversification of this common livestock species.
Keeping abreast with long non-coding RNAs in mammary gland development and breast cancer
Hansji, Herah; Leung, Euphemia Y.; Baguley, Bruce C.; Finlay, Graeme J.; Askarian-Amiri, Marjan E.
2014-01-01
The majority of the human genome is transcribed, even though only 2% of transcripts encode proteins. Non-coding transcripts were originally dismissed as evolutionary junk or transcriptional noise, but with the development of whole genome technologies, these non-coding RNAs (ncRNAs) are emerging as molecules with vital roles in regulating gene expression. While shorter ncRNAs have been extensively studied, the functional roles of long ncRNAs (lncRNAs) are still being elucidated. Studies over the last decade show that lncRNAs are emerging as new players in a number of diseases including cancer. Potential roles in both oncogenic and tumor suppressive pathways in cancer have been elucidated, but the biological functions of the majority of lncRNAs remain to be identified. Accumulated data are identifying the molecular mechanisms by which lncRNA mediates both structural and functional roles. LncRNA can regulate gene expression at both transcriptional and post-transcriptional levels, including splicing and regulating mRNA processing, transport, and translation. Much current research is aimed at elucidating the function of lncRNAs in breast cancer and mammary gland development, and at identifying the cellular processes influenced by lncRNAs. In this paper we review current knowledge of lncRNAs contributing to these processes and present lncRNA as a new paradigm in breast cancer development. PMID:25400658
Maize GO annotation—methods, evaluation, and review (maize-GAMER)
USDA-ARS's Scientific Manuscript database
We created a new high-coverage, robust, and reproducible functional annotation of maize protein-coding genes based on Gene Ontology (GO) term assignments. Whereas the existing Phytozome and Gramene maize GO annotation sets only cover 41% and 56% of maize protein-coding genes, respectively, this stu...
Rare Genome-Wide Copy Number Variation and Expression of Schizophrenia in 22q11.2 Deletion Syndrome.
Bassett, Anne S; Lowther, Chelsea; Merico, Daniele; Costain, Gregory; Chow, Eva W C; van Amelsvoort, Therese; McDonald-McGinn, Donna; Gur, Raquel E; Swillen, Ann; Van den Bree, Marianne; Murphy, Kieran; Gothelf, Doron; Bearden, Carrie E; Eliez, Stephan; Kates, Wendy; Philip, Nicole; Sashi, Vandana; Campbell, Linda; Vorstman, Jacob; Cubells, Joseph; Repetto, Gabriela M; Simon, Tony; Boot, Erik; Heung, Tracy; Evers, Rens; Vingerhoets, Claudia; van Duin, Esther; Zackai, Elaine; Vergaelen, Elfi; Devriendt, Koen; Vermeesch, Joris R; Owen, Michael; Murphy, Clodagh; Michaelovosky, Elena; Kushan, Leila; Schneider, Maude; Fremont, Wanda; Busa, Tiffany; Hooper, Stephen; McCabe, Kathryn; Duijff, Sasja; Isaev, Karin; Pellecchia, Giovanna; Wei, John; Gazzellone, Matthew J; Scherer, Stephen W; Emanuel, Beverly S; Guo, Tingwei; Morrow, Bernice E; Marshall, Christian R
2017-11-01
Chromosome 22q11.2 deletion syndrome (22q11.2DS) is associated with a more than 20-fold increased risk for developing schizophrenia. The aim of this study was to identify additional genetic factors (i.e., "second hits") that may contribute to schizophrenia expression. Through an international consortium, the authors obtained DNA samples from 329 psychiatrically phenotyped subjects with 22q11.2DS. Using a high-resolution microarray platform and established methods to assess copy number variation (CNV), the authors compared the genome-wide burden of rare autosomal CNV, outside of the 22q11.2 deletion region, between two groups: a schizophrenia group and those with no psychotic disorder at age ≥25 years. The authors assessed whether genes overlapped by rare CNVs were overrepresented in functional pathways relevant to schizophrenia. Rare CNVs overlapping one or more protein-coding genes revealed significant between-group differences. For rare exonic duplications, six of 19 gene sets tested were enriched in the schizophrenia group; genes associated with abnormal nervous system phenotypes remained significant in a stepwise logistic regression model and showed significant interactions with 22q11.2 deletion region genes in a connectivity analysis. For rare exonic deletions, the schizophrenia group had, on average, more genes overlapped. The additional rare CNVs implicated known (e.g., GRM7, 15q13.3, 16p12.2) and novel schizophrenia risk genes and loci. The results suggest that additional rare CNVs overlapping genes outside of the 22q11.2 deletion region contribute to schizophrenia risk in 22q11.2DS, supporting a multigenic hypothesis for schizophrenia. The findings have implications for understanding expression of psychotic illness and herald the importance of whole-genome sequencing to appreciate the overall genomic architecture of schizophrenia.
GONUTS: the Gene Ontology Normal Usage Tracking System
Renfro, Daniel P.; McIntosh, Brenley K.; Venkatraman, Anand; Siegele, Deborah A.; Hu, James C.
2012-01-01
The Gene Ontology Normal Usage Tracking System (GONUTS) is a community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins. GONUTS uses wiki technology to allow registered users to share and edit notes on the use of each term in GO, and to contribute annotations for specific genes of interest. By providing a site for generation of third-party documentation at the granularity of individual terms, GONUTS complements the official documentation of the Gene Ontology Consortium. To provide examples for community users, GONUTS displays the complete GO annotations from seven model organisms: Saccharomyces cerevisiae, Dictyostelium discoideum, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus and Arabidopsis thaliana. To support community annotation, GONUTS allows automated creation of gene pages for gene products in UniProt. GONUTS will improve the consistency of annotation efforts across genome projects, and should be useful in training new annotators and consumers in the production of GO annotations and the use of GO terms. GONUTS can be accessed at http://gowiki.tamu.edu. The source code for generating the content of GONUTS is available upon request. PMID:22110029
Essential RNA-Based Technologies and Their Applications in Plant Functional Genomics.
Teotia, Sachin; Singh, Deepali; Tang, Xiaoqing; Tang, Guiliang
2016-02-01
Genome sequencing has not only extended our understanding of the blueprints of many plant species but has also revealed the secrets of coding and non-coding genes. We present here a brief introduction to and personal account of key RNA-based technologies, as well as their development and applications for functional genomics of plant coding and non-coding genes, with a focus on short tandem target mimics (STTMs), artificial microRNAs (amiRNAs), and CRISPR/Cas9. In addition, their use in multiplex technologies for the functional dissection of gene networks is discussed. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Tee, Ling Fei; Neoh, Hui-min; Then, Sue Mian; Murad, Nor Azian; Asillam, Mohd Fairos; Hashim, Mohd Helmy; Nathan, Sheila; Jamal, Rahman
2017-11-01
Studies of multigenerational Caenorhabditis elegans exposed to long-term spaceflight have revealed expression changes of genes involved in longevity, DNA repair, and locomotion. However, results from spaceflight experiments are difficult to reproduce as space missions are costly and opportunities are rather limited for researchers. In addition, multigenerational cultures of C. elegans used in previous studies contribute to mixture of gene expression profiles from both larvae and adult worms, which were recently reported to be different. Usage of different culture media during microgravity simulation experiments might also give rise to differences in the gene expression and biological phenotypes of the worms. In this study, we investigated the effects of simulated microgravity on the gene expression and biological phenotype profiles of a single generation of C. elegans worms cultured on 2 different culture media. A desktop Random Positioning Machine (RPM) was used to simulate microgravity on the worms for approximately 52 to 54 h. Gene expression profile was analysed using the Affymetrix GeneChip® C. elegans 1.0 ST Array. Only one gene (R01H2.2) was found to be downregulated in nematode growth medium (NGM)-cultured worms exposed to simulated microgravity. On the other hand, eight genes were differentially expressed for C. elegans Maintenance Medium (CeMM)-cultured worms in microgravity; six were upregulated, while two were downregulated. Five of the upregulated genes (C07E3.15, C34H3.21, C32D5.16, F35H8.9 and C34F11.17) encode non-coding RNAs. In terms of biological phenotype, we observed that microgravity-simulated worms experienced minimal changes in terms of lifespan, locomotion and reproductive capabilities in comparison with the ground controls. Taking it all together, simulated microgravity on a single generation of C. elegans did not confer major changes to their gene expression and biological phenotype. 
Nevertheless, exposure of the worms to microgravity led to higher expression of non-coding RNA genes, which may play an epigenetic role in the worms during longer terms of microgravity exposure.
Premzl, Marko
2015-01-01
Using a eutherian comparative genomic analysis protocol and public genomic sequence data sets, the present work attempted to update and revise two gene data sets. The most comprehensive third party annotation gene data sets of eutherian adenohypophysis cystine-knot genes (128 complete coding sequences), and D-dopachrome tautomerases and macrophage migration inhibitory factor genes (30 complete coding sequences) were annotated. For example, the present study first described primate-specific cystine-knot Prometheus genes, as well as differential gene expansions of D-dopachrome tautomerase genes. Furthermore, new frameworks for future experiments on the two eutherian gene data sets were proposed. PMID:25941635
RCDB: Renal Cancer Gene Database.
Ramana, Jayashree
2012-05-18
Renal cell carcinoma (RCC) is one of the common and most lethal urological cancers, with 40% of the patients succumbing to death because of metastatic progression of the disease. Treatment of metastatic RCC remains highly challenging because of its resistance to chemotherapy as well as radiotherapy, besides surgical resection. Whereas RCC comprises tumors with differing histological types, clear cell RCC (ccRCC) remains the most common. A major problem in the clinical management of patients presenting with localized ccRCC is the inability to determine tumor aggressiveness and accurately predict the risk of metastasis following surgery. As a measure to improve the diagnosis and prognosis of RCC, researchers have identified several molecular markers through a number of techniques. However, the wealth of information available is scattered in the literature and not easily amenable to data-mining. To reduce this gap, this work describes a comprehensive repository called Renal Cancer Gene Database, as an integrated gateway to study renal cancer related data. Renal Cancer Gene Database is a manually curated compendium of 240 protein-coding and 269 miRNA genes contributing to the etiology and pathogenesis of various forms of renal cell carcinomas. The protein coding genes have been classified according to the kind of gene alteration observed in RCC. RCDB also includes the miRNAs dysregulated in RCC, along with the corresponding information regarding the type of RCC and/or metastatic or prognostic significance. While some of the miRNA genes showed an association with other types of cancers, few were unique to RCC. Users can query the database using keywords, category and chromosomal location of the genes. The knowledgebase can be freely accessed via a user-friendly web interface at http://www.juit.ac.in/attachments/jsr/rcdb/homenew.html.
It is hoped that this database would serve as a useful complement to the existing public resources and as a good starting point for researchers and physicians interested in RCC genetics.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-02-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Nedelcu, Aurora M.; Lee, Robert W.; Lemieux, Claude; Gray, Michael W.; Burger, Gertraud
2000-01-01
Two distinct mitochondrial genome types have been described among the green algal lineages investigated to date: a reduced–derived, Chlamydomonas-like type and an ancestral, Prototheca-like type. To determine if this unexpected dichotomy is real or is due to insufficient or biased sampling and to define trends in the evolution of the green algal mitochondrial genome, we sequenced and analyzed the mitochondrial DNA (mtDNA) of Scenedesmus obliquus. This genome is 42,919 bp in size and encodes 42 conserved genes (i.e., large and small subunit rRNA genes, 27 tRNA and 13 respiratory protein-coding genes), four additional free-standing open reading frames with no known homologs, and an intronic reading frame with endonuclease/maturase similarity. No 5S rRNA or ribosomal protein-coding genes have been identified in Scenedesmus mtDNA. The standard protein-coding genes feature a deviant genetic code characterized by the use of UAG (normally a stop codon) to specify leucine, and the unprecedented use of UCA (normally a serine codon) as a signal for termination of translation. The mitochondrial genome of Scenedesmus combines features of both green algal mitochondrial genome types: the presence of a more complex set of protein-coding and tRNA genes is shared with the ancestral type, whereas the lack of 5S rRNA and ribosomal protein-coding genes as well as the presence of fragmented and scrambled rRNA genes are shared with the reduced–derived type of mitochondrial genome organization. Furthermore, the gene content and the fragmentation pattern of the rRNA genes suggest that this genome represents an intermediate stage in the evolutionary process of mitochondrial genome streamlining in green algae. [The sequence data described in this paper have been submitted to the GenBank data library under accession no. AF204057.] PMID:10854413
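The deviant code described above can be illustrated by translating the same sequence under the standard code and under a table with the two reassignments named in the abstract (UAG read as leucine, UCA read as a stop). This is a minimal Python sketch. The codon table is the standard genetic code written in TCAG order, the function names are illustrative, and the test sequence is made up rather than taken from AF204057.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Reassignments stated in the abstract, written as DNA codons:
# UAG (normally stop) -> leucine; UCA (normally serine) -> stop.
SCENEDESMUS_MT = {**STANDARD, "TAG": "L", "TCA": "*"}


def translate(seq: str, table: dict[str, str]) -> str:
    """Translate from the first codon until a stop codon or the sequence end."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = table[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


toy = "ATGTAGGCTTCATTT"  # made-up sequence: ATG TAG GCT TCA TTT
print(translate(toy, STANDARD))        # M    (TAG terminates)
print(translate(toy, SCENEDESMUS_MT))  # MLA  (TAG = Leu, TCA terminates)
```

Applying the standard table to such a genome would truncate proteins at every UAG and read through the real UCA terminators, which is why the choice of table matters for annotation.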
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.
Zhang, Chun-Ting; Wang, Ju; Zhang, Ren
2002-02-01
The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is at most 5579, about 3.8-7.0% fewer than the currently widely accepted 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
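The idea in the abstract above can be sketched as a nearest-centroid classifier over codon-position base frequencies. The Python example below is one plausible reading and not the authors' implementation: the abstract does not give the exact decision rule, so comparing Euclidean distances to the mean vectors of a coding and a non-coding training set is an assumption. All function names and toy sequences are invented for illustration.

```python
import math

BASES = "ACGT"


def position_frequencies(orf: str) -> list[float]:
    """12-dim vector: frequency of A, C, G, T at codon positions 1, 2 and 3."""
    orf = orf.upper()
    n_codons = len(orf) // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    vec: list[float] = []
    for pos in range(3):
        column = orf[pos:n_codons * 3:3]  # every base at this codon position
        vec.extend(column.count(b) / n_codons for b in BASES)
    return vec


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a set of frequency vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def is_coding(orf: str, coding_c: list[float], noncoding_c: list[float]) -> bool:
    """Assumed rule: coding if nearer (Euclidean) to the coding centroid."""
    v = position_frequencies(orf)
    return math.dist(v, coding_c) < math.dist(v, noncoding_c)


# Toy training sets (hypothetical; a real run would use annotated yeast ORFs).
coding_train = ["ATGGCTAAAGATGAAGCA", "ATGACTGAAAATGCAGTT"]      # RGW-like codons
noncoding_train = ["CCGTGCGGGCTGCGCTGG", "TGGCGCCCGGGCTCGCGG"]

coding_c = centroid([position_frequencies(s) for s in coding_train])
noncoding_c = centroid([position_frequencies(s) for s in noncoding_train])

print(is_coding("ATGGAAGCTAAAGTA", coding_c, noncoding_c))  # True
print(is_coding("CGGTGCCCG", coding_c, noncoding_c))        # False
```

With real data the centroids would be estimated from known genes and intergenic ORFs, and accuracy would be assessed by cross-validation, as the abstract reports.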
Zhao, Yi; Tang, Liang; Li, Zhe; Jin, Jinpu; Luo, Jingchu; Gao, Ge
2015-04-18
Long-established protein-coding genes may lose their coding potential during evolution ("unitary gene loss"). Members of the Poaceae family are a major food source and represent an ideal model clade for plant evolution research. However, the global pattern of unitary gene loss in Poaceae genomes and the evolutionary fate of lost genes have been little investigated and remain largely elusive. Using a locally developed pipeline, we identified 129 unitary gene loss events for long-established protein-coding genes from four representative species of Poaceae, i.e. brachypodium, rice, sorghum and maize. Functional annotation suggested that the genes lost in all or most Poaceae species are enriched for genes involved in development and response to endogenous stimulus. We also found that 44 mutated genomic loci of lost genes, which we refer to as relics, were still actively transcribed, of which 84% (37 of 44) showed significantly differential expression across different tissues. More interestingly, we found a total of five expressed relics that may function as competitive endogenous RNAs in the brachypodium, rice and sorghum genomes. Based on comparative genomics and transcriptome data, we compiled, for the first time, a comprehensive catalogue of unitary gene loss events in Poaceae species, characterized a statistically significant functional preference for these lost genes, and showed the potential of relics to function as competitive endogenous RNAs in Poaceae genomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Damiani, R.D. Jr.; Wessler, S.R.
1993-09-01
The R/B genes of maize encode a family of basic helix-loop-helix proteins that determine where and when the anthocyanin-pigment pathway will be expressed in the plant. Previous studies showed that allelic diversity among family members reflects differences in gene expression, specifically in transcription initiation. The authors present evidence that the R gene Lc is under translational control. They demonstrate that the 235-nt transcript leader of Lc represses expression 25- to 30-fold in an in vivo assay. Repression is mediated by the presence in cis of a 38-codon upstream open reading frame. Furthermore, the coding capacity of the upstream open reading frame influences the magnitude of repression. It is proposed that translational control does not contribute to tissue specificity but prevents overexpression of the Lc protein. The diversity of promoter and 5' untranslated leader sequences among the R/B genes provides an opportunity to study the coevolution of transcriptional and translational mechanisms of gene regulation. 36 refs., 5 figs.
Exon Shuffling and Origin of Scorpion Venom Biodiversity
Wang, Xueli; Gao, Bin; Zhu, Shunyi
2016-01-01
Scorpion venom is a complex combinatorial library of peptides and proteins with multiple biological functions. A combination of transcriptomic and proteomic techniques has revealed its enormous molecular diversity, as identified by the presence of a large number of ion channel-targeted neurotoxins with different folds, membrane-active antimicrobial peptides, proteases, and protease inhibitors. Although the biodiversity of scorpion venom has long been known, how it arises remains unsolved. In this work, we analyzed the exon-intron structures of an array of scorpion venom protein-encoding genes and unexpectedly found that nearly all of these genes possess a phase-1 intron (an intron located between the first and second nucleotides of a codon) near the cleavage site of a signal sequence, even though their mature peptides differ remarkably. This observation matches a theory of exon shuffling in the origin of new genes and suggests that recruitment of different folds into scorpion venom might be achieved via shuffling between body protein-coding genes and ancestral venom gland-specific genes that presumably contributed tissue-specific regulatory elements and secretory signal sequences. PMID:28035955
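The intron phase mentioned in this abstract can be illustrated with a short sketch. It is not the authors' analysis pipeline; it only assumes that the number of coding nucleotides in each exon is known, and the function name and exon lengths are hypothetical. The phase of an intron is the count of coding nucleotides upstream of it, modulo 3, so a phase-1 intron sits between the first and second nucleotides of a codon.

```python
def intron_phases(exon_cds_lengths):
    """Return the phase of each intron, given the coding length of each exon in order.

    Phase 0: the intron lies between two codons.
    Phase 1: the intron lies between the first and second nucleotides of a codon.
    Phase 2: the intron lies between the second and third nucleotides of a codon.
    """
    phases = []
    coding_so_far = 0
    # There is no intron after the last exon, so it is skipped.
    for length in exon_cds_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical two-exon gene: 64 coding nucleotides in exon 1, 131 in exon 2.
example_phases = intron_phases([64, 131])
print(example_phases)
```

Because only the remainder modulo 3 matters, exons flanked by introns of the same phase can be joined without shifting the reading frame, which is why a shared phase is compatible with exon shuffling.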
A large GLC1C Greek family with a myocilin T377M mutation: inheritance and phenotypic variability.
Petersen, Michael B; Kitsos, George; Samples, John R; Gaudette, N Donna; Economou-Petersen, Effrosini; Sykes, Renée; Rust, Kristal; Grigoriadou, Maria; Aperis, George; Choi, Dongseok; Psilas, Konstantinos; Craig, Jamie E; Kramer, Patricia L; Mackey, David A; Wirtz, Mary K
2006-02-01
POAG is a complex disease; therefore, families in which a glaucoma gene has been mapped may carry additional POAG genes. The goal of this study was to determine whether mutations in the myocilin (MYOC) gene on chromosome 1 are present in two POAG families, which have previously been mapped to the GLC1C locus on chromosome 3. The three exons of MYOC were screened by denaturing (d)HPLC. Samples with heteroduplex peaks were sequenced. Clinical findings were compared with genotype status in all available family members over the age of 20 years. A T377M coding sequence change in MYOC was identified in family members of the Greek GLC1C family but not in the Oregon GLC1C family. Individuals carrying both the MYOC T377M variant and the GLC1C haplotype were more severely affected at an earlier age than individuals with just one of the POAG genes, suggesting that these two genes interact or that both contribute to the POAG phenotype in a cumulative way.
Molecular Evolution of Phosphoprotein Phosphatases in Drosophila
Miskei, Márton; Ádám, Csaba; Kovács, László; Karányi, Zsolt; Dombrádi, Viktor
2011-01-01
Phosphoprotein phosphatases (PPP) are ancient and important regulatory enzymes that are present in all eukaryotic organisms. Based on the genome sequences of 12 Drosophila species we traced the evolution of the PPP catalytic subunits and noted a substantial expansion of the gene family. We concluded that the 18–22 PPP genes of Drosophilidae were generated from a core set of 8 indispensable phosphatases that are present in most insects. Retropositions followed by tandem gene duplications extended the phosphatase repertoire, and sporadic gene losses contributed to the species-specific variations in the PPP complement. During the course of these studies we identified 5 hitherto uncharacterized phosphatase retrogenes: PpY+, PpD5+, PpD6+, Pp4+, and Pp6+, which are found only in some ancient Drosophila. We demonstrated that all of these new PPP genes exhibit a distinct male-specific expression. In addition to the changes in gene numbers, the intron-exon structure and the chromosomal localization of several PPP genes were also altered during evolution. The G−C content of the coding regions decreased when a gene moved into the heterochromatic region of chromosome Y. Thus the PPP enzymes exemplify the various types of dynamic rearrangements that accompany the molecular evolution of a gene family in Drosophilidae. PMID:21789237
Lee, Chien-Yueh; Hsieh, Ping-Han; Chiang, Li-Mei; Chattopadhyay, Amrita; Li, Kuan-Yi; Lee, Yi-Fang; Lu, Tzu-Pin; Lai, Liang-Chuan; Lin, En-Chung; Lee, Hsinyu; Ding, Shih-Torng; Tsai, Mong-Hsun; Chen, Chien-Yu; Chuang, Eric Y
2018-05-01
The Mikado pheasant (Syrmaticus mikado) is a nearly endangered species indigenous to high-altitude regions of Taiwan. This pheasant provides an opportunity to investigate evolutionary processes following geographic isolation. Currently, the genetic background and adaptive evolution of the Mikado pheasant remain unclear. We present the draft genome of the Mikado pheasant, which consists of 1.04 Gb of DNA and 15,972 annotated protein-coding genes. The Mikado pheasant displays expansion and positive selection of genes related to features that contribute to its adaptive evolution, such as energy metabolism, oxygen transport, hemoglobin binding, radiation response, immune response, and DNA repair. To investigate the molecular evolution of the major histocompatibility complex (MHC) across several avian species, 39 putative genes spanning 227 kb on a contiguous region were annotated and manually curated. The MHC loci of the pheasant revealed a high level of synteny, several rapidly evolving genes, and inverse regions compared to the same loci in the chicken. The complete mitochondrial genome was also sequenced, assembled, and compared against four long-tailed pheasants. The results from molecular clock analysis suggest that ancestors of the Mikado pheasant migrated from the north to Taiwan about 3.47 million years ago. This study provides a valuable genomic resource for the Mikado pheasant, insights into its adaptation to high altitude, and the evolutionary history of the genus Syrmaticus, which could potentially be useful for future studies that investigate molecular evolution, genomics, ecology, and immunogenetics.
Zhan, Siyuan; Dong, Yao; Zhao, Wei; Guo, Jiazhong; Zhong, Tao; Wang, Linjie; Li, Li; Zhang, Hongping
2016-08-22
Long non-coding RNAs (lncRNAs) have been studied extensively over the past few years. Large numbers of lncRNAs have been identified in mouse, rat, and human, and some of them have been shown to play important roles in muscle development and myogenesis. However, there are few reports on the characterization of lncRNAs covering all the developmental stages of skeletal muscle in livestock. RNA libraries constructed from developing longissimus dorsi muscle of fetal (45, 60, and 105 days of gestation) and postnatal (3 days after birth) goat (Capra hircus) were sequenced. A total of 1,034,049,894 clean reads were generated. Among them, 3981 lncRNA transcripts corresponding to 2739 lncRNA genes were identified, including 3515 intergenic lncRNAs and 466 anti-sense lncRNAs. Notably, in pairwise comparisons between the libraries of skeletal muscle at the different developmental stages, a total of 577 transcripts were differentially expressed (P < 0.05), and this was validated by qPCR on six randomly selected lncRNA genes. The identified goat lncRNAs shared some characteristics, such as fewer exons and shorter length, with the lncRNAs in other mammals. We also found 1153 lncRNA genes neighboring 1455 protein-coding genes (<10 kb upstream and downstream), which were functionally enriched in transcriptional regulation and development-related processes, indicating they may be in cis-regulatory relationships. Additionally, Pearson's correlation coefficients of co-expression levels suggested 1737 lncRNAs and 19,422 mRNAs were possibly in trans-regulatory relationships (r > 0.95 or r < -0.95). These co-expressed mRNAs were enriched in development-related biological processes such as muscle system processes, regulation of cell growth, muscle cell development, regulation of transcription, and embryonic morphogenesis.
This study provides a catalog of goat muscle-related lncRNAs, and will contribute to a fuller understanding of the molecular mechanism underpinning muscle development in mammals.
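The co-expression criterion in this abstract (Pearson's r > 0.95 or r < -0.95) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names, transcript identifiers, and expression values are hypothetical, with one value per developmental stage.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles.

    Assumes neither profile is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpressed_pairs(lnc_expr, mrna_expr, threshold=0.95):
    """Return (lncRNA, mRNA, r) for every pair with |r| above the threshold."""
    pairs = []
    for lnc, lnc_profile in lnc_expr.items():
        for gene, gene_profile in mrna_expr.items():
            r = pearson_r(lnc_profile, gene_profile)
            if abs(r) > threshold:
                pairs.append((lnc, gene, r))
    return pairs


# Hypothetical expression values across the four sampled stages.
lnc_expr = {"lnc_A": [1.0, 2.0, 3.0, 4.0]}
mrna_expr = {
    "gene_up": [2.0, 4.1, 5.9, 8.0],    # rises with lnc_A
    "gene_down": [8.0, 6.0, 4.0, 2.0],  # falls as lnc_A rises
    "gene_flat": [3.0, 1.0, 4.0, 2.0],  # no linear relationship
}
candidate_pairs = coexpressed_pairs(lnc_expr, mrna_expr)
for lnc, gene, r in candidate_pairs:
    print(lnc, gene, round(r, 3))
```

Using the absolute value of r keeps both strongly positive and strongly negative pairs, which mirrors the two-sided threshold quoted in the abstract. With only a handful of stages, a high r is easy to obtain by chance, so such pairs are candidates rather than confirmed regulatory relationships.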
Decoding the non-coding genome: elucidating genetic risk outside the coding genome.
Barr, C L; Misener, V L
2016-01-01
Current evidence emerging from genome-wide association studies indicates that the genetic underpinnings of complex traits are likely attributable to genetic variation that changes gene expression, rather than (or in combination with) variation that changes protein-coding sequences. This is particularly compelling with respect to psychiatric disorders, as genetic changes in regulatory regions may result in differential transcriptional responses to developmental cues and environmental/psychosocial stressors. Until recently, however, the link between transcriptional regulation and psychiatric genetic risk has been understudied. Multiple obstacles have contributed to the paucity of research in this area, including challenges in identifying the positions of remote (distal from the promoter) regulatory elements (e.g. enhancers) and their target genes and the underrepresentation of neural cell types and brain tissues in epigenome projects - the availability of high-quality brain tissues for epigenetic and transcriptome profiling, particularly for the adolescent and developing brain, has been limited. Further challenges have arisen in the prediction and testing of the functional impact of DNA variation with respect to multiple aspects of transcriptional control, including regulatory-element interaction (e.g. between enhancers and promoters), transcription factor binding and DNA methylation. Further, the brain has uncommon DNA-methylation marks with unique genomic distributions not found in other tissues - current evidence suggests the involvement of non-CG methylation and 5-hydroxymethylation in neurodevelopmental processes but much remains unknown. We review here knowledge gaps as well as both technological and resource obstacles that will need to be overcome in order to elucidate the involvement of brain-relevant gene-regulatory variants in genetic risk for psychiatric disorders. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Prediction of plant lncRNA by ensemble machine learning classifiers.
Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian
2018-05-02
In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.
Peng, Rui; Zeng, Bo; Meng, Xiuxiang; Yue, Bisong; Zhang, Zhihe; Zou, Fangdong
2007-08-01
The complete mitochondrial genome sequence of the giant panda, Ailuropoda melanoleuca, was determined by long and accurate polymerase chain reaction (LA-PCR) with conserved primers and primer-walking sequencing methods. The complete mitochondrial DNA is 16,805 nucleotides in length and contains two ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and one control region. The total length of the 13 protein-coding genes is longer than that of the American black bear, brown bear and polar bear by 3 amino acids at the end of the ND5 gene. The codon usage also followed the typical vertebrate pattern except for an unusual ATT start codon, which initiates the NADH dehydrogenase subunit 5 (ND5) gene. The molecular phylogenetic analysis was performed on the sequences of 12 concatenated heavy-strand encoded protein-coding genes, and suggested that the giant panda is most closely related to bears.
Yang, Q L; Huang, X Y; Kong, J J; Zhao, S G; Liu, L X; Gun, S B
2016-08-19
Piglet diarrhea is one of the primary factors that affects the benefits of the swine industry. Recent studies have shown that exon 2 of the swine leukocyte antigen-DQA gene is associated with piglet resistance to diarrhea; however, the contributions of additional exon coding regions of this gene remain unclear. Here, we detected and sequenced variants in the exon 3 region and examined their associations with diarrhea infection in 425 suckling piglets using the polymerase chain reaction-single-strand conformational polymorphism and sequencing analysis. The results revealed that exon 3 of the swine leukocyte antigen-DQA gene is highly polymorphic and pivotal to both diarrhea susceptibility and resistance in piglets. We identified 14 genotypes (AA, AB, BB, BC, CC, EE, EF, BE, BF, CF, DD, DH, GG, and GF) and eight alleles (A-H) that were generated by 14 nucleotide variants, eight of which were novel, and three nucleotide deletions. Statistical analyses revealed that the genotypes AB and EF were associated with resistance to diarrheal disease (P < 0.05), and the genotype DD may contribute to diarrhea susceptibility but was unique to Large White pigs (P > 0.05). These results elucidate the genetic and immunological background to piglet diarrhea, and provide useful information for resistance breeding programs.
Fu, Mao; Cheng, Hua; Chen, Lihong; Wu, Bin; Cai, Mengyin; Xie, Ding; Fu, Zuzhi
2002-12-01
The aim was to investigate whether genetic variation in the cocaine and amphetamine-regulated transcript (CART) gene might contribute to the genesis of type 2 diabetes. Screening for mutations in the entire coding region of the CART gene was performed with polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) in 180 normoglycemic control subjects and 221 patients with type 2 diabetes. (1) An adenine deletion was identified at nucleotide position 1,457, located in the untranslated region of exon 3. In normal-weight controls, the frequencies of the CART-A+ and CART-A- alleles were 83.6% and 16.4% respectively. The frequencies of the CART-A+A+, A+A- and A-A- genotypes were 68.9%, 29.4% and 1.7% respectively. (2) In diabetic patients, the frequencies of the CART-A+ and A- alleles were 84.6% and 15.4% respectively; the frequencies of the CART-A+A+, A+A- and A-A- genotypes were 71.9%, 25.3% and 2.7% respectively. The frequency of the A deletion of the CART gene in diabetic patients did not differ significantly from that in normoglycemic control subjects. (3) Diabetic patients with the A deletion had increased total cholesterol, low-density lipoprotein cholesterol and high-density lipoprotein cholesterol. A polymorphism was found in the 3'-untranslated region (Delta A1457) of CART in Chinese subjects. The A deletion in CART is not associated with type 2 diabetes, but may contribute to dyslipidemia.
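The allele frequencies quoted in this abstract follow from the genotype frequencies by simple counting: each heterozygote contributes half of its frequency to each allele. The sketch below works through that arithmetic; the function name is hypothetical and the inputs are the percentages reported in the abstract.

```python
def allele_frequencies(freq_hom_plus, freq_het, freq_hom_minus):
    """Derive allele frequencies (in %) from genotype frequencies (in %).

    Homozygotes carry two copies of one allele; heterozygotes carry one
    copy of each, so half of their frequency goes to each allele.
    """
    freq_plus = freq_hom_plus + freq_het / 2
    freq_minus = freq_hom_minus + freq_het / 2
    return freq_plus, freq_minus


# Genotype frequencies (A+A+, A+A-, A-A-) as reported in the abstract.
controls = allele_frequencies(68.9, 29.4, 1.7)
patients = allele_frequencies(71.9, 25.3, 2.7)
print("controls A+/A-:", controls)
print("patients A+/A-:", patients)
```

For controls this gives 83.6% and 16.4%, matching the reported allele frequencies. For patients it gives 84.55% and 15.35%, which agrees with the reported 84.6% and 15.4% to within rounding (the patient genotype percentages as printed sum to 99.9%).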
NASA Technical Reports Server (NTRS)
Weitzel, A. J.; Wyatt, S. E.; Parsons-Wingerter, P.
2016-01-01
Venation patterning in leaves is a major determinant of photosynthesis efficiency because of its dependency on vascular transport of photo-assimilates, water, and minerals. Arabidopsis thaliana grown in microgravity show delayed growth and leaf maturation. Gene expression data from the roots, hypocotyl, and leaves of A. thaliana grown during spaceflight vs. ground control analyzed by Affymetrix microarray are available through NASA's GeneLab (GLDS-7). We analyzed the data for differential expression of genes in leaves resulting from the effects of spaceflight on vascular patterning. Two genes were found by preliminary analysis to be up-regulated during spaceflight that may be related to vascular formation. The genes are responsible for coding an ARGOS (Auxin-Regulated Gene Involved in Organ Size)-like protein (potentially affecting cell elongation in the leaves), and an F-box/kelch-repeat protein (possibly contributing to protoxylem specification). Further analysis that will focus on raw data quality assessment and a moderated t-test may further confirm up-regulation of the two genes and/or identify other gene candidates. Plants defective in these genes will then be assessed for phenotype by the mapping and quantification of leaf vascular patterning by NASA's VESsel GENeration (VESGEN) software to model specific vascular differences of plants grown in spaceflight.
Intact coding region of the serotonin transporter gene in obsessive-compulsive disorder
DOE Office of Scientific and Technical Information (OSTI.GOV)
Altemus, M.; Murphy, D.L.; Greenberg, B.
1996-07-26
Epidemiologic studies indicate that obsessive-compulsive disorder is genetically transmitted in some families, although no genetic abnormalities have been identified in individuals with this disorder. The selective response of obsessive-compulsive disorder to treatment with agents which block serotonin reuptake suggests the gene coding for the serotonin transporter as a candidate gene. The primary structure of the serotonin-transporter coding region was sequenced in 22 patients with obsessive-compulsive disorder, using direct PCR sequencing of cDNA synthesized from platelet serotonin-transporter mRNA. No variations in amino acid sequence were found among the obsessive-compulsive disorder patients or healthy controls. These results do not support a role for alteration in the primary structure of the coding region of the serotonin-transporter gene in the pathogenesis of obsessive-compulsive disorder. 27 refs.
Solov'ev, V V; Kel', A E; Kolchanov, N A
1989-01-01
The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. An interesting property of the genetic code was revealed in the analysis of symmetrical repeats: pairs of symmetrical codons correspond to pairs of amino acids with mostly similar physical-chemical parameters. This property may explain why symmetrical repeats and palindromes are present only in genes coding for beta-structural proteins-polypeptides, where amino acids with similar physical-chemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences was used to analyse inverted repeats. The modelling demonstrated that a constraint on sequences (uneven frequencies of the codons used) is alone enough for nonrandom inverted repeats to arise in genes.
Orsini, A; Bonuccelli, A; Striano, P; Azzara, A; Costagliola, G; Consolini, R; Peroni, D G; Valetto, A; Bertini, V
2018-04-26
Terminal deletions of the long arm of chromosome 13 are rare and poorly characterized by cytogenetic studies, making genotype-phenotype correlations difficult. We report two siblings presenting with generalized epilepsy, intellectual disability, and genitourinary tract defects. Array CGH detected a 1.3 Mb deletion at 13q34; it contains two protein-coding genes, SOX1 and ARHGEF7, whose haploinsufficiency can contribute to the epileptic phenotype. Copyright © 2018 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.
Stotz, Henrik U; Harvey, Pascoe J; Haddadi, Parham; Mashanova, Alla; Kukol, Andreas; Larkan, Nicholas J; Borhan, M Hossein; Fitt, Bruce D L
2018-01-01
Genes coding for nucleotide-binding leucine-rich repeat (LRR) receptors (NLRs) control resistance against intracellular (cell-penetrating) pathogens. However, evidence for a role of genes coding for proteins with LRR domains in resistance against extracellular (apoplastic) fungal pathogens is limited. Here, the distribution of genes coding for proteins with eLRR domains but lacking kinase domains was determined for the Brassica napus genome. Predictions of signal peptide and transmembrane regions divided these genes into 184 coding for receptor-like proteins (RLPs) and 121 coding for secreted proteins (SPs). Together with previously annotated NLRs, a total of 720 LRR genes were found. Leptosphaeria maculans-induced expression during a compatible interaction with cultivar Topas differed between RLP, SP and NLR gene families; NLR genes were induced relatively late, during the necrotrophic phase of pathogen colonization. Seven RLP, one SP and two NLR genes were found in Rlm1 and Rlm3/Rlm4/Rlm7/Rlm9 loci for resistance against L. maculans on chromosome A07 of B. napus. One NLR gene at the Rlm9 locus was positively selected, as was the RLP gene on chromosome A10 with LepR3 and Rlm2 alleles conferring resistance against L. maculans races with corresponding effectors AvrLm1 and AvrLm2, respectively. Known loci for resistance against L. maculans (extracellular hemi-biotrophic fungus), Sclerotinia sclerotiorum (necrotrophic fungus) and Plasmodiophora brassicae (intracellular, obligate biotrophic protist) were examined for presence of RLPs, SPs and NLRs in these regions. Whereas loci for resistance against P. brassicae were enriched for NLRs, no such signature was observed for the other pathogens. These findings demonstrate involvement of (i) NLR genes in resistance against the intracellular pathogen P. brassicae and a putative NLR gene in Rlm9-mediated resistance against the extracellular pathogen L. maculans.
A genome-wide survey of maternal and embryonic transcripts during Xenopus tropicalis development.
Paranjpe, Sarita S; Jacobi, Ulrike G; van Heeringen, Simon J; Veenstra, Gert Jan C
2013-11-06
Dynamics of polyadenylation vs. deadenylation determine the fate of several developmentally regulated genes. Decay of a subset of maternal mRNAs and new transcription define the maternal-to-zygotic transition, but the full complement of polyadenylated and deadenylated coding and non-coding transcripts has not yet been assessed in Xenopus embryos. To analyze the dynamics and diversity of coding and non-coding transcripts during development, both polyadenylated mRNA and ribosomal RNA-depleted total RNA were harvested across six developmental stages and subjected to high-throughput sequencing. The maternally loaded transcriptome is highly diverse and consists of both polyadenylated and deadenylated transcripts. Many maternal genes show peak expression in the oocyte and include genes which are known to be the key regulators of events like oocyte maturation and fertilization. Of all the transcripts that increase in abundance between early blastula and larval stages, about 30% of the embryonic genes are induced fourfold or more by the late blastula stage and another 35% by late gastrulation. Using a gene model validation and discovery pipeline, we identified novel transcripts and putative long non-coding RNAs (lncRNA). These lncRNA transcripts were stringently selected as spliced transcripts generated from independent promoters, with limited coding potential and a codon bias characteristic of noncoding sequences. Many lncRNAs are conserved and expressed in a developmental stage-specific fashion. These data reveal dynamics of transcriptome polyadenylation and abundance and provide a high-confidence catalogue of novel and long non-coding RNAs.
Epigenome Aberrations: Emerging Driving Factors of the Clear Cell Renal Cell Carcinoma
Mehdi, Ali; Riazalhosseini, Yasser
2017-01-01
Clear cell renal cell carcinoma (ccRCC), the most common form of kidney cancer, is characterized by frequent mutations of the von Hippel-Lindau (VHL) tumor suppressor gene in ~85% of sporadic cases. Loss of pVHL function affects multiple cellular processes, among which the activation of the hypoxia inducible factor (HIF) pathway is the best known. Constitutive activation of HIF signaling in turn activates hundreds of genes involved in numerous oncogenic pathways, which contribute to the development or progression of ccRCC. Although VHL mutations are considered drivers of ccRCC, they are not sufficient to cause the disease. Recent genome-wide sequencing studies of ccRCC have revealed that mutations of genes coding for epigenome modifiers and chromatin remodelers, including PBRM1, SETD2 and BAP1, are the most common somatic genetic abnormalities after VHL mutations in these tumors. Moreover, recent research has shed light on the extent of abnormal epigenome alterations in ccRCC tumors, including aberrant DNA methylation patterns, abnormal histone modifications and deregulated expression of non-coding RNAs. In this review, we discuss the epigenetic modifiers that are commonly mutated in ccRCC, and our growing knowledge of the cellular processes that are impacted by them. Furthermore, we explore new avenues for developing therapeutic approaches based on our knowledge of epigenome aberrations of ccRCC. PMID:28812986
Su, Lining; Wang, Chunjie; Zheng, Chenqing; Wei, Huiping; Song, Xiaoqing
2018-04-13
Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation. We examined common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. We then screened the PD-specific genes from the common DEGs using GCBI. Next, we used a series of bioinformatics tools to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identified the mTF-miRNA-gene-gTF network. Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs in our study. Analysis of the mTF-miRNA-gene-gTF network of PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs. Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets, and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNAs and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD.
The SNPs identified in our study can determine whether a genetic variant is associated with PD. Overall, these findings will help guide our study of the complex molecular mechanism of PD.
The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications.
Shen, Jinhui; Cong, Qian; Grishin, Nick V
2015-09-01
Due to the intriguing morphology, lifecycle, and diversity of butterflies and moths, Lepidoptera are emerging as model organisms for the study of genetics, evolution and speciation. The progress of these studies relies on decoding Lepidoptera genomes, both nuclear and mitochondrial. Here we describe a protocol to obtain mitogenomes from Next Generation Sequencing reads performed for whole-genome sequencing and report the complete mitogenome of Papilio (Pterourus) glaucus. The circular mitogenome is 15,306 bp in length and rich in A and T. It contains 13 protein-coding genes (PCGs), 22 transfer-RNA-coding genes (tRNA), and 2 ribosomal-RNA-coding genes (rRNA), with a gene order typical for mitogenomes of Lepidoptera. We performed phylogenetic analyses based on PCG and RNA-coding genes or protein sequences using Bayesian Inference and Maximum Likelihood methods. The phylogenetic trees consistently show that among species with available mitogenomes Papilio glaucus is the closest to Papilio (Agehana) maraho from Asia.
Natural variation in non-coding regions underlying phenotypic diversity in budding yeast
Salinas, Francisco; de Boer, Carl G.; Abarca, Valentina; García, Verónica; Cuevas, Mara; Araos, Sebastian; Larrondo, Luis F.; Martínez, Claudio; Cubillos, Francisco A.
2016-01-01
Linkage mapping studies in model organisms have typically focused their efforts on polymorphisms within coding regions, ignoring those within regulatory regions that may contribute to gene expression variation. In this context, differences in transcript abundance are frequently proposed as a source of phenotypic diversity between individuals; however, until now, little molecular evidence has been provided. Here, we examined Allele Specific Expression (ASE) in six F1 hybrids from Saccharomyces cerevisiae derived from crosses between representative strains of the four main lineages described in yeast. ASE varied between crosses with levels ranging between 28% and 60%. Part of the variation in expression levels could be explained by differences in transcription factors binding to polymorphic cis-regulations and by differences in trans-activation depending on the allelic form of the TF. Analysis of highly expressed alleles in each background suggested ASN1 as a candidate transcript underlying nitrogen consumption differences between two strains. Further promoter allele swap analysis under fermentation conditions confirmed that coding and non-coding regions explained aspartic and glutamic acid consumption differences, likely due to a polymorphism affecting Uga3 binding. Together, we provide a new catalogue of variants to bridge the gap between genotype and phenotype. PMID:26898953
Activity-Dependent Human Brain Coding/Noncoding Gene Regulatory Networks
Lipovich, Leonard; Dachet, Fabien; Cai, Juan; Bagla, Shruti; Balan, Karina; Jia, Hui; Loeb, Jeffrey A.
2012-01-01
While most gene transcription yields RNA transcripts that code for proteins, a sizable proportion of the genome generates RNA transcripts that do not code for proteins, but may have important regulatory functions. The brain-derived neurotrophic factor (BDNF) gene, a key regulator of neuronal activity, is overlapped by a primate-specific, antisense long noncoding RNA (lncRNA) called BDNFOS. We demonstrate reciprocal patterns of BDNF and BDNFOS transcription in highly active regions of human neocortex removed as a treatment for intractable seizures. A genome-wide analysis of activity-dependent coding and noncoding human transcription using a custom lncRNA microarray identified 1288 differentially expressed lncRNAs, of which 26 had expression profiles that matched activity-dependent coding genes and an additional 8 were adjacent to or overlapping with differentially expressed protein-coding genes. The functions of most of these protein-coding partner genes, such as ARC, include long-term potentiation, synaptic activity, and memory. The nuclear lncRNAs NEAT1, MALAT1, and RPPH1, composing an RNAse P-dependent lncRNA-maturation pathway, were also upregulated. As a means to replicate human neuronal activity, repeated depolarization of SY5Y cells resulted in sustained CREB activation and produced an inverse pattern of BDNF-BDNFOS co-expression that was not achieved with a single depolarization. RNAi-mediated knockdown of BDNFOS in human SY5Y cells increased BDNF expression, suggesting that BDNFOS directly downregulates BDNF. Temporal expression patterns of other lncRNA-messenger RNA pairs validated the effect of chronic neuronal activity on the transcriptome and implied various lncRNA regulatory mechanisms. lncRNAs, some of which are unique to primates, thus appear to have potentially important regulatory roles in activity-dependent human brain plasticity. PMID:22960213
Valenzuela-Miranda, Diego; Gallardo-Escárate, Cristian
2016-12-01
Despite the high prevalence and impact on Chilean salmon aquaculture of the intracellular bacterium Piscirickettsia salmonis, the molecular underpinnings of host-pathogen interactions remain unclear. Herein, the interplay of coding and non-coding transcripts has been proposed as a key mechanism involved in immune response. Therefore, the aim of this study was to show how coding and non-coding transcripts are modulated during the infection process of Atlantic salmon with P. salmonis. For this, RNA-seq was conducted in brain, spleen, and head kidney samples, revealing different transcriptional profiles according to bacterial load. Additionally, while most of the regulated genes annotated for diverse biological processes during infection, a common response associated with clathrin-mediated endocytosis and iron homeostasis was present in all tissues. Interestingly, while endocytosis-promoting factors and clathrin inductions were upregulated, endocytic receptors were mainly downregulated. Furthermore, the regulation of genes related to iron homeostasis suggested an intracellular accumulation of iron, a process in which heme biosynthesis/degradation pathways might play an important role. Regarding the non-coding response, 918 putative long non-coding RNAs were identified, of which 425 were newly characterized for S. salar. Finally, co-localization and co-expression analyses revealed a strong correlation between the modulations of long non-coding RNAs and genes associated with endocytosis and iron homeostasis. These results represent the first comprehensive study of putative interplaying mechanisms of coding and non-coding RNAs during bacterial infection in salmonids. Copyright © 2016 Elsevier Ltd. All rights reserved.
Denitrification gene expression in clay-soil bacterial community
NASA Astrophysics Data System (ADS)
Pastorelli, R.; Landi, S.
2009-04-01
Our contribution to the Italian research project SOILSINK focused on microbial denitrification gene expression in Mediterranean agricultural soils. In ecosystems with high inputs of nitrogen, such as agricultural soils, denitrification causes a net loss of nitrogen since nitrate is reduced to gaseous forms, which are released into the atmosphere. Moreover, incomplete denitrification can lead to emission of nitrous oxide, a potent greenhouse gas which contributes to global warming and destruction of the ozone layer. A critical role in denitrification is played by microorganisms, and the ability to denitrify is widespread among a variety of phylogenetically unrelated organisms. The data reported here refer to wheat cultivation in a clay-rich soil under different environmental impact management (Agugliano, AN, Italy). We analysed the RNA directly extracted from soil to provide information on in situ activities of specific populations. The expression of genes coding for two nitrate reductases (narG and napA), two nitrite reductases (nirS and nirK), two nitric oxide reductases (cnorB and qnorB) and nitrous oxide reductase (nosZ) was analyzed by reverse transcription (RT)-nested PCR. Only napA, nirS, nirK, qnorB and nosZ were detected, and the sequenced fragments showed high similarity with the corresponding gene sequences deposited in the GenBank database. These results suggest the suitability of the method for the qualitative detection of denitrifying bacteria in environmental samples, and they offered us the possibility to perform denaturing gradient gel electrophoresis (DGGE) analyses for denitrification genes. Earlier conclusions showed that the nirK gene is more widely distributed in the soil environment than the nirS gene. The results concerning nosZ expression indicated that microbial activity was clearly present only in no-tilled and no-fertilized soils.
Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren
2015-01-01
There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
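The quantile regression idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it is a toy exhaustive search that relies on the standard fact that an optimal quantile-regression line can always be found passing through two data points. The function names and the length/expression values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def quantile_line(xs, ys, tau):
    """Fit y = a + b*x at quantile tau by trying every pair of points.

    Only practical for tiny data sets (cubic cost); real analyses would
    use a linear-programming based solver instead.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(pinball_loss(y - (a + b * x), tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


# Hypothetical toy data: x = sequence length (kb), y = expression level.
lengths = [0, 1, 2, 3, 4]
expression = [0, 1, 2, 3, 100]  # the last gene is an outlier

intercept, slope = quantile_line(lengths, expression, tau=0.5)
print(intercept, slope)
```

Fitting the median (tau = 0.5) makes the line insensitive to the single outlier, which is one reason quantile regression is attractive for skewed expression data; other values of tau describe how the upper or lower part of the expression distribution changes with length.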
Hahn, Julia; Tsoy, Olga V.; Thalmann, Sebastian; Čuklina, Jelena; Gelfand, Mikhail S.
2016-01-01
Small open reading frames (sORFs) and genes for non-coding RNAs are poorly investigated components of most genomes. Our analysis of 1391 ORFs recently annotated in the soybean symbiont Bradyrhizobium japonicum USDA 110 revealed that 78% of them contain less than 80 codons. Twenty-one of these sORFs are conserved in or outside Alphaproteobacteria and most of them are similar to genes found in transposable elements, in line with their broad distribution. Stabilizing selection was demonstrated for sORFs with proteomic evidence and bll1319_ISGA which is conserved at the nucleotide level in 16 alphaproteobacterial species, 79 species from other taxa and 49 other Proteobacteria. Further we used Northern blot hybridization to validate ten small RNAs (BjsR1 to BjsR10) belonging to new RNA families. We found that BjsR1 and BjsR3 have homologs outside the genus Bradyrhizobium, and BjsR5, BjsR6, BjsR7, and BjsR10 have up to four imperfect copies in Bradyrhizobium genomes. BjsR8, BjsR9, and BjsR10 are present exclusively in nodules, while the other sRNAs are also expressed in liquid cultures. We also found that the level of BjsR4 decreases after exposure to tellurite and iron, and this down-regulation contributes to survival under high iron conditions. Analysis of additional small RNAs overlapping with 3’-UTRs revealed two new repetitive elements named Br-REP1 and Br-REP2. These REP elements may play roles in the genomic plasticity and gene regulation and could be useful for strain identification by PCR-fingerprinting. Furthermore, we studied two potential toxin genes in the symbiotic island and confirmed toxicity of the yhaV homolog bll1687 but not of the newly annotated higB homolog blr0229_ISGA in E. coli. Finally, we revealed transcription interference resulting in an antisense RNA complementary to blr1853, a gene induced in symbiosis. The presented results expand our knowledge on sORFs, non-coding RNAs and repetitive elements in B. japonicum and related bacteria. 
PMID:27788207
Explaining the disease phenotype of intergenic SNP through predicted long range regulation
Chen, Jingqi; Tian, Weidong
2016-01-01
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it is likely that IGR daSNPs contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978
Effects of DNA Methylation and Chromatin State on Rates of Molecular Evolution in Insects.
Glastad, Karl M; Goodisman, Michael A D; Yi, Soojin V; Hunt, Brendan G
2015-12-04
Epigenetic information is widely appreciated for its role in gene regulation in eukaryotic organisms. However, epigenetic information can also influence genome evolution. Here, we investigate the effects of epigenetic information on gene sequence evolution in two disparate insects: the fly Drosophila melanogaster, which lacks substantial DNA methylation, and the ant Camponotus floridanus, which possesses a functional DNA methylation system. We found that DNA methylation was positively correlated with the synonymous substitution rate in C. floridanus, suggesting a key effect of DNA methylation on patterns of gene evolution. However, our data suggest the link between DNA methylation and elevated rates of synonymous substitution was explained, in large part, by the targeting of DNA methylation to genes with signatures of transcriptionally active chromatin, rather than the mutational effect of DNA methylation itself. This phenomenon may be explained by an elevated mutation rate for genes residing in transcriptionally active chromatin, or by increased structural constraints on genes in inactive chromatin. This result highlights the importance of chromatin structure as the primary epigenetic driver of genome evolution in insects. Overall, our study demonstrates how different epigenetic systems contribute to variation in the rates of coding sequence evolution. Copyright © 2016 Glastad et al.
Zorc, Minja; Kunej, Tanja
2016-05-01
MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5 %) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in the chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region.
Identified miRNA-related genomic hotspots in chicken can serve researchers as a starting point for further functional studies and association studies with poultry production and health traits and the basis for systematic screening of exonic miRNAs and missense/miRNA seed polymorphisms in other genomes.
New target genes of MITF-induced microRNA-211 contribute to melanoma cell invasion.
Margue, Christiane; Philippidou, Demetra; Reinsbach, Susanne E; Schmitt, Martina; Behrmann, Iris; Kreis, Stephanie
2013-01-01
The non-coding microRNAs (miRNA) have tissue- and disease-specific expression patterns. They down-regulate target mRNAs, which likely impacts on most fundamental cellular processes. Differential expression patterns of miRNAs are currently being exploited for identification of biomarkers for early disease diagnosis, prediction of progression for melanoma and other cancers and as promising drug targets, since they can easily be inhibited or replaced in a given cellular context. Before successfully manipulating miRNAs in clinical settings, their precise expression levels, endogenous functions and thus their target genes have to be determined. MiR-211, a melanocyte lineage-specific small non-coding miRNA, is located in an intron of TRPM1, a target gene of the microphthalmia-associated transcription factor (MITF). By transcriptionally up-regulating TRPM1, MITF, which is critical for both melanocyte differentiation and survival and for melanoma progression, indirectly drives the expression of miR-211. Expression of this miRNA is often reduced in melanoma samples. Here, we investigated functional roles of miR-211 by identifying and studying new target genes. We show that MITF-correlated miR-211 expression levels are mostly but not always reduced in a panel of 11 melanoma cell lines and in primary and metastatic melanoma compared to normal melanocytes and nevi, respectively. MiR-211 itself only marginally impacted on cell invasion and migration, while perturbation of some new miR-211 target genes, such as AP1S2, SOX11, IGFBP5, and SERINC3, significantly increased invasion. These results and the variable expression levels of miR-211 raise serious doubts on the value of miR-211 as a melanoma tumor-suppressing miRNA and/or as a biomarker for melanoma.
Dealtry, Simone; Holmsgaard, Peter N.; Dunon, Vincent; Jechalke, Sven; Ding, Guo-Chun; Krögerrecklenfort, Ellen; Heuer, Holger; Hansen, Lars H.; Springael, Dirk; Zühlke, Sebastian; Sørensen, Søren J.
2014-01-01
Biopurification systems (BPS) are used on farms to control pollution by treating pesticide-contaminated water. It is assumed that mobile genetic elements (MGEs) carrying genes coding for enzymes involved in degradation might contribute to the degradation of pesticides. Therefore, the composition and shifts of MGEs, in particular, of IncP-1 plasmids carried by BPS bacterial communities exposed to various pesticides, were monitored over the course of an agricultural season. PCR amplification of total community DNA using primers targeting genes specific to different plasmid groups combined with Southern blot hybridization indicated a high abundance of plasmids belonging to IncP-1, IncP-7, IncP-9, IncQ, and IncW, while IncU and IncN plasmids were less abundant or not detected. Furthermore, the integrase genes of class 1 and 2 integrons (intI1, intI2) and genes encoding resistance to sulfonamides (sul1, sul2) and streptomycin (aadA) were detected and seasonality was revealed. Amplicon pyrosequencing of the IncP-1 trfA gene coding for the replication initiation protein revealed high IncP-1 plasmid diversity and an increase in the abundance of IncP-1β and a decrease in the abundance of IncP-1ε over time. The data of the chemical analysis showed increasing concentrations of various pesticides over the course of the agricultural season. As an increase in the relative abundances of bacteria carrying IncP-1β plasmids also occurred, this might point to a role of these plasmids in the degradation of many different pesticides. PMID:24771027
Common Variants in Cardiac Ion Channel Genes are Associated with Sudden Cardiac Death
Albert, Christine M.; MacRae, Calum A.; Chasman, Daniel I.; VanDenburgh, Martin; Buring, Julie E; Manson, JoAnn E; Cook, Nancy R; Newton-Cheh, Christopher
2010-01-01
Background: Rare variants in cardiac ion channel genes are associated with sudden cardiac death (SCD) in rare primary arrhythmic syndromes; however, it is unknown whether common variation in these same genes may contribute to SCD risk at the population level. Methods and Results: We examined the association between 147 single nucleotide polymorphisms (SNPs) (137 tag, 5 non-coding SNPs associated with QT interval duration and 5 nonsynonymous SNPs) in 5 cardiac ion channel genes, KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2, and sudden and/or arrhythmic death in a combined nested case-control analysis among 516 cases and 1522 matched controls of European ancestry enrolled in six prospective cohort studies. After accounting for multiple testing, two SNPs (rs2283222 located in intron 11 in KCNQ1 and rs11720524 located in intron 1 in SCN5A) remained significantly associated with sudden/arrhythmic death (FDR = 0.01 and 0.03, respectively). Each increasing copy of the major T allele of rs2283222 or the major C allele of rs11720524 was associated with an OR of 1.36 (95% CI 1.16-1.60, P=0.0002) and 1.30 (95% CI 1.12-1.51, P=0.0005), respectively. Control for cardiovascular risk factors and/or limiting the analysis to definite SCDs did not significantly alter these relationships. Conclusion: In this combined analysis of 6 prospective cohort studies, two common intronic variants in KCNQ1 and SCN5A were associated with SCD in individuals of European ancestry. Further study in other populations and investigation into the functional abnormalities associated with non-coding variation in these genes may lead to important insights into predisposition to lethal arrhythmias. PMID:20400777
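As a reading aid for the odds ratios quoted above, the following minimal sketch shows how an approximate two-sided p-value can be recovered from a published odds ratio and its 95% confidence interval. It assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state; the function name is hypothetical.

```python
import math


def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald p-value from an OR and its 95% CI.

    Assumes the interval was built as log(OR) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_est) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Values quoted in the abstract for the two intronic SNPs.
p_kcnq1 = p_from_or_ci(1.36, 1.16, 1.60)
p_scn5a = p_from_or_ci(1.30, 1.12, 1.51)
print(p_kcnq1, p_scn5a)
```

Under that assumption the recovered p-values land close to the reported ones; small differences are expected because the published odds ratios and interval bounds are rounded to two decimals.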
Aging in the Brain: New Roles of Epigenetics in Cognitive Decline.
Barter, Jolie D; Foster, Thomas C
2018-06-01
Gene expression in the aging brain depends on transcription signals generated by senescent physiology, interacting with genetic and epigenetic programs. In turn, environmental factors influence epigenetic mechanisms, such that an epigenetic-environmental link may contribute to the accumulation of cellular damage, susceptibility or resilience to stressors, and variability in the trajectory of age-related cognitive decline. Epigenetic mechanisms, DNA methylation and histone modifications, alter chromatin structure and the accessibility of DNA. Furthermore, small non-coding RNA, termed microRNA (miRNA) bind to messenger RNA (mRNA) to regulate translation. In this review, we examine key questions concerning epigenetic mechanisms in regulating the expression of genes associated with brain aging and age-related cognitive decline. In addition, we highlight the interaction of epigenetics with senescent physiology and environmental factors in regulating transcription.
Rao, Shu-Quan; Hu, Hui-Ling; Ye, Ning; Shen, Yan; Xu, Qi
2015-08-01
The heritability of schizophrenia has been reported to be as high as ~80%, but the contribution of genetic variants identified to this heritability remains to be estimated. Long non-coding RNAs (lncRNAs) are involved in multiple processes critical to normal cellular function, and dysfunction of lncRNA MIAT may contribute to the pathophysiology of schizophrenia. However, the genetic evidence of lncRNAs involved in schizophrenia has not been documented. Here, we conducted a two-stage association analysis on 8 tag SNPs that cover the whole MIAT locus in two independent Han Chinese schizophrenia case-control cohorts (discovery sample from Shanxi Province: 1093 patients with paranoid schizophrenia and 1180 control subjects; replication cohort from Jilin Province: 1255 cases and 1209 healthy controls). In the discovery stage, significant genetic association with paranoid schizophrenia was observed for rs1894720 (χ(2)=74.20, P=7.1E-18), of which the minor allele (T) had an OR of 1.70 (95% CI=1.50-1.91). This association was confirmed in the replication cohort (χ(2)=22.66, P=1.9E-06, OR=1.32, 95%CI 1.18-1.49). Besides, a weak genotypic association was detected for rs4274 (χ(2)=4.96, df=2, P=0.03); the AA carriers showed increased disease risk (OR=1.30, 95%CI=1.03-1.64). No significant association was found between any haplotype and paranoid schizophrenia. The present study showed that lncRNA MIAT was a novel susceptibility gene for paranoid schizophrenia in the Chinese Han population. Considering that most lncRNAs are located in non-coding regions, our result may explain why most susceptibility loci for schizophrenia identified by genome-wide association studies were outside coding regions. Copyright © 2015 Elsevier B.V. All rights reserved.
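The allelic association statistics for rs1894720 quoted above can be related to their p-values with a short sketch. It assumes the allelic test has one degree of freedom, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math


def chi2_sf_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is the square of a standard normal variable,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


p_discovery = chi2_sf_df1(74.20)    # discovery cohort statistic
p_replication = chi2_sf_df1(22.66)  # replication cohort statistic
print(p_discovery, p_replication)
```

With one degree of freedom, both values come out at the same order of magnitude as the P values given in the abstract, which supports reading those statistics as 1-df allelic tests.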
Olfson, Emily; Saccone, Nancy L.; Johnson, Eric O.; Chen, Li-Shiun; Culverhouse, Robert; Doheny, Kimberly; Foltz, Steven M.; Fox, Louis; Gogarten, Stephanie M.; Hartz, Sarah; Hetrick, Kurt; Laurie, Cathy C.; Marosy, Beth; Amin, Najaf; Arnett, Donna; Barr, R. Graham; Bartz, Traci M.; Bertelsen, Sarah; Borecki, Ingrid B.; Brown, Michael R.; Chasman, Daniel I.; van Duijn, Cornelia M.; Feitosa, Mary F.; Fox, Ervin R.; Franceschini, Nora; Franco, Oscar H.; Grove, Megan L.; Guo, Xiuqing; Hofman, Albert; Kardia, Sharon L.R.; Morrison, Alanna C.; Musani, Solomon K.; Psaty, Bruce M.; Rao, D.C.; Reiner, Alex P.; Rice, Kenneth; Ridker, Paul M.; Rose, Lynda M.; Schick, Ursula M.; Schwander, Karen; Uitterlinden, Andre G.; Vojinovic, Dina; Wang, Jen-Chyong; Ware, Erin B.; Wilson, Gregory; Yao, Jie; Zhao, Wei; Breslau, Naomi; Hatsukami, Dorothy; Stitzel, Jerry A.; Rice, John; Goate, Alison; Bierut, Laura J.
2015-01-01
The common nonsynonymous variant rs16969968 in the α5 nicotinic receptor subunit gene (CHRNA5) is the strongest genetic risk factor for nicotine dependence in European Americans and contributes to risk in African Americans. To comprehensively examine whether other CHRNA5 coding variation influences nicotine dependence risk, we performed targeted sequencing on 1582 nicotine dependent cases (Fagerström Test for Nicotine Dependence score≥4) and 1238 non-dependent controls, with independent replication of common and low frequency variants using 12 studies with exome chip data. Nicotine dependence was examined using logistic regression with individual common variants (MAF≥0.05), aggregate low frequency variants (0.05>MAF≥0.005), and aggregate rare variants (MAF<0.005). Meta-analysis of primary results was performed with replication studies containing 12 174 heavy and 11 290 light smokers. Next-generation sequencing with 180X coverage identified 24 nonsynonymous variants and 2 frameshift deletions in CHRNA5, including 9 novel variants in the 2820 subjects. Meta-analysis confirmed the risk effect of the only common variant (rs16969968, European ancestry: OR=1.3, p=3.5×10−11; African ancestry: OR=1.3, p=0.01) and demonstrated that 3 low frequency variants contributed an independent risk (aggregate term, European ancestry: OR=1.3, p=0.005; African ancestry: OR=1.4, p=0.0006). The remaining 22 rare coding variants were associated with increased risk of nicotine dependence in the European American primary sample (OR=12.9, p=0.01) and in the same risk direction in African Americans (OR=1.5, p=0.37). Our results indicate that common, low frequency and rare CHRNA5 coding variants are independently associated with nicotine dependence risk. These newly identified variants likely influence risk for smoking-related diseases such as lung cancer. PMID:26239294
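The three frequency classes used in the abstract above (common, low frequency, rare) can be written down as a small helper. This is only a sketch of the thresholds as stated; the function name and the example frequencies are hypothetical and do not come from the study.

```python
def maf_class(maf):
    """Assign a variant to the frequency bin described in the abstract."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if maf >= 0.05:
        return "common"          # tested individually
    if maf >= 0.005:
        return "low_frequency"   # tested as an aggregate
    return "rare"                # tested as an aggregate


# Hypothetical minor allele frequencies, for illustration only.
for maf in (0.35, 0.02, 0.001):
    print(maf, maf_class(maf))
```

Binning by frequency matters because single-variant tests have little power for infrequent alleles, which is why the low frequency and rare classes are analysed as aggregate terms.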
Olfson, E; Saccone, N L; Johnson, E O; Chen, L-S; Culverhouse, R; Doheny, K; Foltz, S M; Fox, L; Gogarten, S M; Hartz, S; Hetrick, K; Laurie, C C; Marosy, B; Amin, N; Arnett, D; Barr, R G; Bartz, T M; Bertelsen, S; Borecki, I B; Brown, M R; Chasman, D I; van Duijn, C M; Feitosa, M F; Fox, E R; Franceschini, N; Franco, O H; Grove, M L; Guo, X; Hofman, A; Kardia, S L R; Morrison, A C; Musani, S K; Psaty, B M; Rao, D C; Reiner, A P; Rice, K; Ridker, P M; Rose, L M; Schick, U M; Schwander, K; Uitterlinden, A G; Vojinovic, D; Wang, J-C; Ware, E B; Wilson, G; Yao, J; Zhao, W; Breslau, N; Hatsukami, D; Stitzel, J A; Rice, J; Goate, A; Bierut, L J
2016-05-01
The common nonsynonymous variant rs16969968 in the α5 nicotinic receptor subunit gene (CHRNA5) is the strongest genetic risk factor for nicotine dependence in European Americans and contributes to risk in African Americans. To comprehensively examine whether other CHRNA5 coding variation influences nicotine dependence risk, we performed targeted sequencing on 1582 nicotine-dependent cases (Fagerström Test for Nicotine Dependence score⩾4) and 1238 non-dependent controls, with independent replication of common and low frequency variants using 12 studies with exome chip data. Nicotine dependence was examined using logistic regression with individual common variants (minor allele frequency (MAF)⩾0.05), aggregate low frequency variants (0.05>MAF⩾0.005) and aggregate rare variants (MAF<0.005). Meta-analysis of primary results was performed with replication studies containing 12 174 heavy and 11 290 light smokers. Next-generation sequencing with 180 × coverage identified 24 nonsynonymous variants and 2 frameshift deletions in CHRNA5, including 9 novel variants in the 2820 subjects. Meta-analysis confirmed the risk effect of the only common variant (rs16969968, European ancestry: odds ratio (OR)=1.3, P=3.5 × 10(-11); African ancestry: OR=1.3, P=0.01) and demonstrated that three low frequency variants contributed an independent risk (aggregate term, European ancestry: OR=1.3, P=0.005; African ancestry: OR=1.4, P=0.0006). The remaining 22 rare coding variants were associated with increased risk of nicotine dependence in the European American primary sample (OR=12.9, P=0.01) and in the same risk direction in African Americans (OR=1.5, P=0.37). Our results indicate that common, low frequency and rare CHRNA5 coding variants are independently associated with nicotine dependence risk. These newly identified variants likely influence the risk for smoking-related diseases such as lung cancer.
Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin
ERIC Educational Resources Information Center
Offner, Susan
2010-01-01
The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.
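The comparison described in this tutorial, counting differences separately in exons and introns, can be sketched as follows. The sequences and region boundaries are a hypothetical toy alignment, not real beta hemoglobin sequence, and the function name is an assumption made for illustration.

```python
def mismatch_rate(seq_a, seq_b, regions):
    """Fraction of aligned positions that differ, reported per region.

    regions maps a name to (start, end) with 0-based, end-exclusive indices.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    rates = {}
    for name, (start, end) in regions.items():
        pairs = list(zip(seq_a[start:end], seq_b[start:end]))
        diffs = sum(1 for a, b in pairs if a != b)
        rates[name] = diffs / len(pairs)
    return rates


# Hypothetical toy alignment (exon, intron, exon); NOT real HBB sequence.
human = "ATGGTGCACCTG" + "GTAAGTCTTAGG" + "CTGCTGGTG"
chimp = "ATGGTGCACCTG" + "GTAGGTCATAGG" + "CTGCTGGTG"
regions = {"exon1": (0, 12), "intron1": (12, 24), "exon2": (24, 33)}

rates = mismatch_rate(human, chimp, regions)
print(rates)
```

In this toy case all differences fall in the intron, mirroring the pattern the tutorial asks students to observe in the real gene records.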
Lazzarato, F; Franceschinis, G; Botta, M; Cordero, F; Calogero, R A
2004-11-01
RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html
Genome dynamics and its impact on evolution of Escherichia coli.
Dobrindt, Ulrich; Chowdary, M Geddam; Krumbholz, G; Hacker, J
2010-08-01
The Escherichia coli genome consists of a conserved part, the so-called core genome, which encodes essential cellular functions, and of a flexible, strain-specific part. Genes that belong to the flexible genome code for factors involved in bacterial fitness and adaptation to different environments. Adaptation includes increase in fitness and colonization capacity. Pathogenic as well as non-pathogenic bacteria carry mobile and accessory genetic elements such as plasmids, bacteriophages, genomic islands and others, which code for functions required for proper adaptation. Escherichia coli is a very good example to study the interdependency of genome architecture and lifestyle of bacteria, as this species includes pathogenic variants as well as commensal bacteria adapted to different host organisms. In Escherichia coli, various genetic elements encode pathogenicity factors as well as factors that increase the fitness of non-pathogenic bacteria. The processes of genome dynamics, such as gene transfer, genome reduction, rearrangements as well as point mutations, contribute to the adaptation of the bacteria to particular environments. Using Escherichia coli model organisms, such as uropathogenic strain 536 or commensal strain Nissle 1917, we studied mechanisms of genome dynamics and discuss these processes in the light of the evolution of microbes.
de Freitas, Michele C. R.; Resende, Juliana A.; Ferreira-Machado, Alessandra B.; Saji, Guadalupe D. R. Q.; de Vasconcelos, Ana T. R.; da Silva, Vânia L.; Nicolás, Marisa F.; Diniz, Cláudio G.
2016-01-01
Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic analysis with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From wild-type B. fragilis ATCC 43859, four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology (GO) term, indicating that most known cellular functions were represented. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes that act in several metabolic pathways involved in the metronidazole response, such as drug activation, defense mechanisms against superoxide ions, high expression of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure leads to drastic, persistent changes in B. fragilis gene expression patterns. These results may help to elucidate the B. fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment. PMID:27703449
Bostrom, Meredith A.; Kao, W.H. Linda; Li, Man; Abboud, Hanna E.; Adler, Sharon G.; Iyengar, Sudha K.; Kimmel, Paul L.; Hanson, Robert L.; Nicholas, Susanne B.; Rasooly, Rebekah S.; Sedor, John R.; Coresh, Josef; Kohn, Orly F.; Leehey, David J.; Thornley-Brown, Denyse; Bottinger, Erwin P.; Lipkowitz, Michael S.; Meoni, Lucy A.; Klag, Michael J.; Lu, Lingyi; Hicks, Pamela J.; Langefeld, Carl D.; Parekh, Rulan S.; Bowden, Donald W.; Freedman, Barry I.
2011-01-01
Background: African Americans (AAs) have increased susceptibility to non-diabetic nephropathy relative to European Americans. Study Design: Follow-up of a pooled genome-wide association study (GWAS) in AA dialysis patients with nondiabetic nephropathy; novel gene-gene interaction analyses. Setting & Participants: Wake Forest sample: 962 AA nondiabetic nephropathy cases; 931 non-nephropathy controls. Replication sample: 668 Family Investigation of Nephropathy and Diabetes (FIND) AA nondiabetic nephropathy cases; 804 non-nephropathy controls. Predictors: Individual genotyping of top 1420 pooled GWAS-associated single nucleotide polymorphisms (SNPs) and 54 SNPs in six nephropathy susceptibility genes. Outcomes: APOL1 genetic association and additional candidate susceptibility loci interacting with, or independently from, APOL1. Results: The strongest GWAS associations included two non-coding APOL1 SNPs, rs2239785 (odds ratio [OR], 0.33; dominant; p = 5.9 × 10^−24) and rs136148 (OR, 0.54; additive; p = 1.1 × 10^−7) with replication in FIND (p = 5.0 × 10^−21 and 1.9 × 10^−5, respectively). Rs2239785 remained significantly associated after controlling for the APOL1 G1 and G2 coding variants. Additional top hits included a CFH SNP (OR from meta-analysis in above 3367 AA cases and controls, 0.81; additive; p = 6.8 × 10^−4). The 1420 SNPs were tested for interaction with APOL1 G1 and G2 variants. Several interactive SNPs were detected; the most significant was rs16854341 in the podocin gene (NPHS2) (p = 0.0001). Limitations: Non-pooled GWAS have not been performed in AA nondiabetic nephropathy. Conclusions: This follow-up of a pooled GWAS provides additional and independent evidence that APOL1 variants contribute to nondiabetic nephropathy in AAs and identified additional associated and interactive non-diabetic nephropathy susceptibility genes. PMID:22119407
Evolution of coding and non-coding genes in HOX clusters of a marsupial.
Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B
2012-06-18
The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and of non-protein-coding genes, including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS, which play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified, and one new candidate microRNA with a typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672
Two Perspectives on the Origin of the Standard Genetic Code
NASA Astrophysics Data System (ADS)
Sengupta, Supratim; Aggarwal, Neha; Bandhu, Ashutosh Vishwa
2014-12-01
The origin of a genetic code made it possible to create ordered sequences of amino acids. In this article we provide two perspectives on code origin by carrying out simulations of code-sequence coevolution in finite populations with the aim of examining how the standard genetic code may have evolved from more primitive code(s) encoding a small number of amino acids. We determine the efficacy of the physico-chemical hypothesis of code origin in the absence and presence of horizontal gene transfer (HGT) by allowing a diverse collection of code-sequence sets to compete with each other. We find that in the absence of horizontal gene transfer, natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. However, for certain probabilities of the horizontal transfer events, a universal code emerges having a structure that is consistent with the standard genetic code.
Chen, Wei-Hua; Lu, Guanting; Chen, Xiao; Zhao, Xing-Ming; Bork, Peer
2017-01-04
OGEE is an Online GEne Essentiality database. To enhance our understanding of the essentiality of genes, in OGEE we collected experimentally tested essential and non-essential genes, as well as associated gene properties known to contribute to gene essentiality. We focus on large-scale experiments, and complement our data with text-mining results. We organized tested genes into data sets according to their sources, and tagged those with variable essentiality statuses across data sets as conditionally essential genes, intending to highlight the complex interplay between gene functions and environments/experimental perturbations. Developments since the last public release include increased numbers of species and gene essentiality data sets, inclusion of non-coding essential sequences and genes with intermediate essentiality statuses. In addition, we included 16 essentiality data sets from cancer cell lines, corresponding to 9 human cancers; with OGEE, users can easily explore the shared and differentially essential genes within and between cancer types. These genes, especially those derived from cell lines that are similar to tumor samples, could reveal the oncogenic drivers, paralogous gene expression pattern and chromosomal structure of the corresponding cancer types, and can be further screened to identify targets for cancer therapy and/or new drug development. OGEE is freely available at http://ogee.medgenius.info. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
System Design for FEC in Aeronautical Telemetry
2012-03-12
rate punctured convolutional codes for soft decision Viterbi...below follows that given in [8]. The final coding rate of exactly 2/3 is achieved by puncturing the rate -1/2 code as follows. We begin with the buffer c1...concatenated convolutional code (SCCC). The contributions of this paper are on the system-design level. One major contribution is to design a SCCC code
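The excerpt above elides the actual puncturing procedure, so the following is only a generic sketch of how puncturing raises a rate-1/2 code to rate 2/3. The pattern `[1, 1, 1, 0]` (keep three of every four coded bits) and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
from fractions import Fraction


def puncture(coded_bits, pattern):
    """Drop every coded bit whose position in the repeating pattern is 0."""
    if not pattern or not any(pattern):
        raise ValueError("pattern must keep at least one bit")
    return [b for i, b in enumerate(coded_bits) if pattern[i % len(pattern)]]


def punctured_rate(pattern, mother_rate=Fraction(1, 2)):
    """Code rate after puncturing a mother code with the given pattern."""
    # One pattern period of coded bits carries mother_rate * len(pattern) info bits.
    info_bits = mother_rate * len(pattern)
    return info_bits / sum(pattern)


# Hypothetical pattern: of every 4 coded bits (2 information bits), send 3.
pattern = [1, 1, 1, 0]
coded = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up output of a rate-1/2 encoder

print(puncture(coded, pattern))
print(punctured_rate(pattern))
```

With this assumed pattern, two information bits are carried by three transmitted bits, which gives the 2/3 rate mentioned in the excerpt.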
Liu, Chang; Ran, Xueqin; Wang, Jiafu; Li, Sheng; Liu, Jianfeng
2018-01-01
Genomic structural variation (SV) is known to contribute to genetic diversity and phenotypic change. The Guizhou indigenous pig (GZP) has been raised for hundreds of years and has many special characteristics. This paper aimed to uncover the influence of SV on gene polymorphism and the genetic mechanisms of phenotypic traits in GZP. Eighteen GZPs were resequenced on the Illumina sequencing platform. Confident SVs in GZP were those called by both pindel and softSV, and they were compared with the SVs deduced from the genomic data of European pigs (EUP) and native pigs outside of Guizhou, China (NPOG). A total of 39,166 SVs were detected, covering 27.37 Mb of the pig genome. All 76 SVs were confirmed in the GZP population by PCR. The numbers of SVs in NPOG and GZP were about 1.8 to 1.9 times higher than in EUP. An SV hotspot was found in the 20 Mb of chromosome X of GZP, which harbored 29 genes and was focused on histone modification. More than half of the SVs were located in intergenic regions and about one third in introns. SVs tended to be located in genes that produce multiple transcripts, with a positive correlation between the number of SVs and the number of gene transcripts. This suggests that SVs might act primarily on the regulation of gene expression or on transcript splicing. A total of 1,628 protein-coding genes were disrupted by 1,956 GZP-specific SVs; of these, 93 GZP-specific SV-related genes would lose their function because of SV interference and were clustered in functions related to reproductive ability. Interestingly, the 1,628 protein-coding genes were mainly enriched in estrogen receptor binding, steroid hormone receptor binding, retinoic acid receptor binding, oxytocin signaling pathway, mTOR signaling pathway, axon guidance and cholinergic synapse pathways. These results suggest that SV might be a reason for the strong adaptability and low fecundity of GZP, and that 51 candidate genes would be useful for the configuration phenotype in the Xiang pig breed. PMID:29558483
Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.
Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M
2010-12-15
Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.
Dushyanth, K; Bhattacharya, T K; Shukla, R; Chatterjee, R N; Sitaramamma, T; Paswan, C; Guru Vishnu, P
2016-10-01
Myostatin is a member of the TGF-β superfamily and is directly involved in the regulation of body growth by limiting muscular growth. A study was carried out in three chicken lines to identify polymorphism in the coding region of the myostatin gene through SSCP and DNA sequencing. A total of 12 haplotypes were observed in the myostatin coding region of chicken. Significant associations of haplogroups with body weight at 1, 14, 28, and 42 days, and with carcass traits at 42 days, were observed across the lines. It is concluded that the coding region of the myostatin gene was polymorphic, with varied levels of expression among lines, and had significant effects on growth traits. The expression of the MSTN gene varied during the embryonic and post-hatch developmental stages.
Zhu, Yan; Chen, Longxian; Zhang, Chengjun; Hao, Pei; Jing, Xinyun; Li, Xuan
2017-01-25
Selaginella moellendorffii, a lycophyte, is a model plant to study the early evolution and development of vascular plants. As the first and only sequenced lycophyte to date, the genome of S. moellendorffii revealed many conserved genes and pathways, as well as specialized genes different from flowering plants. Despite the progress made, little is known about long noncoding RNAs (lncRNA) and the alternative splicing (AS) of coding genes in S. moellendorffii. Its coding gene models have not been fully validated with transcriptome data. Furthermore, it remains important to understand whether the regulatory mechanisms similar to flowering plants are used, and how they operate in a non-seed primitive vascular plant. RNA-sequencing (RNA-seq) was performed for three S. moellendorffii tissues, root, stem, and leaf, by constructing strand-specific RNA-seq libraries from RNA purified using RiboMinus isolation protocol. A total of 176 million reads (44 Gbp) were obtained from three tissue types, and were mapped to S. moellendorffii genome. By comparing with 22,285 existing gene models of S. moellendorffii, we identified 7930 high-confidence novel coding genes (a 35.6% increase), and for the first time reported 4422 lncRNAs in a lycophyte. Further, we refined 2461 (11.0%) of existing gene models, and identified 11,030 AS events (for 5957 coding genes) revealed for the first time for lycophytes. Tissue-specific gene expression with functional implication was analyzed, and 1031, 554, and 269 coding genes, and 174, 39, and 17 lncRNAs were identified in root, stem, and leaf tissues, respectively. The expression of critical genes for vascular development stages, i.e. formation of provascular cells, xylem specification and differentiation, and phloem specification and differentiation, was compared in S. moellendorffii tissues, indicating a less complex regulatory mechanism in lycophytes than in flowering plants. 
The results were further strengthened by the evolutionary trend of seven transcription factor families related to vascular development, which was observed among four representative species of seed and non-seed vascular plants, and nonvascular land and aquatic plants. The deep RNA-seq study of S. moellendorffii discovered extensive new gene content, including novel coding genes, lncRNAs, AS events, and refined gene models. Compared to flowering vascular plants, S. moellendorffii displayed less complexity in gene structure, alternative splicing, and the regulatory elements of vascular development. The study offers important insight into the evolution of vascular plants and the regulatory mechanisms of vascular development in a non-seed plant.
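The percentages quoted in this abstract follow directly from the reported counts, as the short check below illustrates. The helper name `percent_of` is an assumption for the example; the counts are those stated in the abstract.

```python
def percent_of(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


existing_models = 22285   # existing gene models
novel_coding = 7930       # high-confidence novel coding genes
refined_models = 2461     # refined existing gene models

print(percent_of(novel_coding, existing_models))    # reported as a 35.6% increase
print(percent_of(refined_models, existing_models))  # reported as 11.0%
```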
Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.
Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis
2014-12-01
Proteogenomics involves the use of MS to refine the annotation of protein-coding genes and to discover new genes in a genome. We carried out a comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data, with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides and lacks a homolog in other methylobacteria, but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy-related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroquinoline quinone synthesis, which is essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the genome of the model methylotroph ME-AM1. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Complete mitochondrial genome of the agarophyte red alga Gelidium vagum (Gelidiales).
Yang, Eun Chan; Kim, Kyeong Mi; Boo, Ga Hun; Lee, Jung-Hyun; Boo, Sung Min; Yoon, Hwan Su
2014-08-01
We describe the first complete mitochondrial genome of Gelidium vagum (Gelidiales) (24,901 bp, 30.4% GC content), an agar-producing red alga. The circular mitochondrial genome contains 43 genes, including 23 protein-coding, 18 tRNA and 2 rRNA genes. All the protein-coding genes have a typical ATG start codon. No introns were found. Two genes, secY and rps12, were overlapped by 41 bp.
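Summary statistics of the kind reported here (GC content, gene counts by category) are straightforward to compute. The sketch below uses a made-up toy sequence, not the G. vagum genome, and the function name `gc_content` is an assumption for illustration; only the gene counts are taken from the abstract.

```python
def gc_content(sequence):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100 * gc / len(seq)


# Toy sequence for demonstration only.
toy = "ATATGCATTA"
print(gc_content(toy))

# Gene counts by category as given in the abstract.
gene_counts = {"protein-coding": 23, "tRNA": 18, "rRNA": 2}
print(sum(gene_counts.values()))  # should match the reported total of 43 genes
```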
Moreno-Sánchez, Natalia; Rueda, Julia; Reverter, Antonio; Carabaño, María Jesús; Díaz, Clara
2012-03-01
Variations in the transcriptome from one skeletal muscle type to another still remain unknown. The reliable identification of stable gene coexpression networks is essential to unravel gene functions and define biological processes. The differential expression of two distinct muscles, M. flexor digitorum (FD) and M. psoas major (PM), was studied using microarrays in cattle to illustrate muscle-specific transcription patterns and to quantify changes in connectivity relative to the expected gene coexpression pattern. A total of 206 genes were differentially expressed (DE), 94 upregulated in PM and 112 in FD. The distribution of DE genes in pathways and biological functions was explored in the context of systems biology. Global interactomes for genes of interest were predicted. Fast/slow twitch genes, genes coding for extracellular matrix, ribosomal and heat shock proteins, and fatty acid uptake were central to the specific gene expression patterns of each muscle. Genes involved in repair mechanisms, such as ribosomal and heat shock proteins, suggested a differential ability of muscles to react to similar stressing factors, acting preferentially in slow twitch muscles. Muscle attributes do not seem to be completely explained by muscle fibre composition. Changes in connectivity accounted for 24% of significant correlations between DE genes. Genes changing their connectivity mostly seem to contribute to the main differential attributes that characterize each specific muscle type. These results underscore the unique flexibility of skeletal muscle, where a substantial set of genes is able to change behavior depending on the circumstances.
Olvera-García, Myrna; Sanchez-Flores, Alejandro; Quirasco Baruch, Maricarmen
2018-03-01
Enterococcus spp. are present in the native microbiota of many traditional fermented foods. Their ability to produce antibacterial compounds, mainly against Listeria monocytogenes, has raised interest recently. However, there is scarce information about their proteolytic and lipolytic potential, and their biotechnological application is currently limited because enterococcal strains have been related to nosocomial infections. In this work, next-generation sequencing and optimised bioinformatic pipelines were used to annotate the genomes of two Enterococcus strains-one E. faecium and one E. faecalis-isolated from the Mexican artisanal ripened Cotija cheese. A battery of genes involved in their proteolytic system was annotated. Genes coding for lipases, esterases and other enzymes whose final products contribute to cheese aroma and flavour were identified as well. As for the production of antibacterial compounds, several peptidoglycan hydrolase- and bacteriocin-coding genes were identified in both genomes experimentally and by bioinformatic analyses. E. faecalis showed resistance to aminoglycosides and E. faecium to aminoglycosides and macrolides, as predicted by the genome functional annotation. No pathogenicity islands were found in any of the strains, although traits such as the ability of biofilm formation and cell aggregation were observed. Finally, a comparative genomic analysis was able to discriminate between the food strains isolated and nosocomial strains. In summary, pathogenic strains are resistant to a wide range of antibiotics and contain virulence factors that cause host damage; in contrast, food strains display less antibiotic resistance, include genes that encode class II bacteriocins and express virulence factors associated with host colonisation rather than invasion.
Quantifying the Effect of DNA Packaging on Gene Expression Level
NASA Astrophysics Data System (ADS)
Kim, Harold
2010-10-01
Gene expression, the process by which the genetic code comes alive in the form of proteins, is one of the most important biological processes in living cells, and begins when transcription factors bind to specific DNA sequences in the promoter region upstream of a gene. The relationship between gene expression output and transcription factor input which is termed the gene regulation function is specific to each promoter, and predicting this gene regulation function from the locations of transcription factor binding sites is one of the challenges in biology. In eukaryotic organisms (for example, animals, plants, fungi etc), DNA is highly compacted into nucleosomes, 147-bp segments of DNA tightly wrapped around histone protein core, and therefore, the accessibility of transcription factor binding sites depends on their locations with respect to nucleosomes - sites inside nucleosomes are less accessible than those outside nucleosomes. To understand how transcription factor binding sites contribute to gene expression in a quantitative manner, we obtain gene regulation functions of promoters with various configurations of transcription factor binding sites by using fluorescent protein reporters to measure transcription factor input and gene expression output in single yeast cells. In this talk, I will show that the affinity of a transcription factor binding site inside and outside the nucleosome controls different aspects of the gene regulation function, and explain this finding based on a mass-action kinetic model that includes competition between nucleosomes and transcription factors.
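One common way to illustrate competition between a nucleosome and a transcription factor (TF) for the same site is a three-state equilibrium picture: the site is free, covered by a nucleosome, or bound by the TF. The sketch below is a generic simplification written for this listing, not the model presented in the talk; the function name, the parameters `kd` and `nucleosome_weight`, and all numerical values are assumptions.

```python
def tf_occupancy(tf_conc, kd, nucleosome_weight=0.0):
    """Equilibrium probability that a binding site is occupied by the TF.

    States and statistical weights (assumed mutually exclusive):
      free site        -> 1
      nucleosome-bound -> nucleosome_weight
      TF-bound         -> tf_conc / kd
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    x = tf_conc / kd
    return x / (1.0 + nucleosome_weight + x)


# Same site affinity, outside vs inside a nucleosome (illustrative values).
for tf in (0.1, 1.0, 10.0):
    outside = tf_occupancy(tf, kd=1.0)
    inside = tf_occupancy(tf, kd=1.0, nucleosome_weight=5.0)
    print(tf, round(outside, 3), round(inside, 3))
```

Under these assumptions, a site inside a nucleosome needs a higher TF concentration to reach the same occupancy, which is the qualitative accessibility effect the abstract describes.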
Evolution of functional specialization and division of labor.
Rueffler, Claus; Hermisson, Joachim; Wagner, Günter P
2012-02-07
Division of labor among functionally specialized modules occurs at all levels of biological organization in both animals and plants. Well-known examples include the evolution of specialized enzymes after gene duplication, the evolution of specialized cell types, limb diversification in arthropods, and the evolution of specialized colony members in many taxa of marine invertebrates and social insects. Here, we identify conditions favoring the evolution of division of labor by means of a general mathematical model. Our starting point is the assumption that modules contribute to two different biological tasks and that the potential of modules to contribute to these tasks is traded off. Our results are phrased in terms of properties of performance functions that map the phenotype of modules to measures of performance. We show that division of labor is favored by three factors: positional effects that predispose modules for one of the tasks, accelerating performance functions, and synergistic interactions between modules. If modules can be lost or damaged, selection for robustness can counteract selection for functional specialization. To illustrate our theory we apply it to the evolution of specialized enzymes coded by duplicated genes.
No evidence for the effect of MHC on male mating success in the brown bear.
Kuduk, Katarzyna; Babik, Wieslaw; Bellemain, Eva; Valentini, Alice; Zedrosser, Andreas; Taberlet, Pierre; Kindberg, Jonas; Swenson, Jon E; Radwan, Jacek
2014-01-01
Mate choice is thought to contribute to the maintenance of the spectacularly high polymorphism of the Major Histocompatibility Complex (MHC) genes, along with balancing selection from parasites, but the relative contribution of the former mechanism is debated. Here, we investigated the association between male MHC genotype and mating success in the brown bear. We analysed fragments of sequences coding for the peptide-binding region of the highly polymorphic MHC class I and class II DRB genes, while controlling for genome-wide effects using a panel of 18 microsatellite markers. Male mating success did not depend on the number of alleles shared with the female or amino-acid distance between potential mates at either locus. Furthermore, we found no indication of female mating preferences for MHC similarity being contingent on the number of alleles the females carried. Finally, we found no significant association between the number of MHC alleles a male carried and his mating success. Thus, our results provided no support for the role of mate choice in shaping MHC polymorphism in the brown bear.
HLA-E regulatory and coding region variability and haplotypes in a Brazilian population sample.
Ramalho, Jaqueline; Veiga-Castelli, Luciana C; Donadi, Eduardo A; Mendes-Junior, Celso T; Castelli, Erick C
2017-11-01
The HLA-E gene is characterized by low but wide expression in different tissues. HLA-E is considered a conserved gene, being one of the least polymorphic class I HLA genes. The HLA-E molecule interacts with Natural Killer cell receptors and T lymphocyte receptors, and might activate or inhibit immune responses depending on the peptide associated with HLA-E and on the receptors with which HLA-E interacts. Variable sites within the HLA-E regulatory and coding segments may influence gene function by modifying its expression pattern or the encoded molecule, thus influencing its interaction with receptors and the peptide. Here we propose an approach to evaluate the gene structure, haplotype pattern and the complete HLA-E variability, including regulatory (promoter and 3'UTR) and coding segments (with introns), by using massively parallel sequencing. Using this approach, we investigated the variability of 420 samples from Brazilians, a highly admixed population. Considering a segment of about 7 kb, 63 variable sites were detected, arranged into 75 extended haplotypes. We detected 37 different promoter sequences (but few frequent ones), 27 different coding sequences (15 representing new HLA-E alleles) and 12 haplotypes at the 3'UTR segment, two of them presenting a summed frequency of 90%. Despite the number of coding alleles, they encode mainly two different full-length molecules, known as E*01:01 and E*01:03, which together correspond to about 90% of all. In addition, unlike what has been previously observed for other non-classical HLA genes, the relationship among the HLA-E promoter, coding and 3'UTR haplotypes is not straightforward, because the same promoter and 3'UTR haplotypes were often associated with different HLA-E coding haplotypes. These data reinforce the presence of only two main full-length HLA-E molecules encoded by the many HLA-E alleles detected in our population sample.
In addition, this data does indicate that the distal HLA-E promoter is by far the most variable segment. Further analyses involving the binding of transcription factors and non-coding RNAs, as well as the HLA-E expression in different tissues, are necessary to evaluate whether these variable sites at regulatory segments (or even at the coding sequence) may influence the gene expression profile. Copyright © 2017 Elsevier Ltd. All rights reserved.
Juul, Malene; Bertl, Johanna; Guo, Qianyun; Nielsen, Morten Muhlig; Świtnicki, Michał; Hornshøj, Henrik; Madsen, Tobias; Hobolth, Asger; Pedersen, Jakob Skou
2017-01-01
Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5’UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance. DOI: http://dx.doi.org/10.7554/eLife.21778.001 PMID:28362259
Elchuri, Sailaja V; Rajasekaran, Swetha; Miles, Wayne O
2018-01-01
Retinoblastoma is a rare tumor of the retina caused by the homozygous loss of the Retinoblastoma 1 tumor suppressor gene (RB1). Loss of the RB1 protein, pRB, results in de-regulated activity of the E2F transcription factors, chromatin changes and developmental defects leading to tumor development. Extensive microarray profiles of these tumors have enabled the identification of genes sensitive to pRB disruption; however, this technology has a number of limitations in the RNA profiles that it generates. The advent of RNA-sequencing has enabled the global profiling of all of the RNA within the cell, including both coding and non-coding features, and the detection of aberrant RNA processing events. In this perspective, we focus on discussing how RNA-sequencing of rare Retinoblastoma tumors will build on existing data and open up new areas to improve our understanding of the biology of these tumors. In particular, we discuss how the RB-research field may be able to use these data to determine how RB1 loss results in the expression of non-coding RNAs and causes aberrant RNA processing events, and how a deeper analysis of metabolic RNA changes can be utilized to model tumor-specific shifts in metabolism. Each section discusses new opportunities and challenges associated with these types of analyses and aims to provide an honest assessment of how understanding these different processes may contribute to the treatment of Retinoblastoma.
Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong
2012-01-01
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.
Dal'Maso, Vinícius Buaes; Mallmann, Lucas; Siebert, Marina; Simon, Laura; Saraiva-Pereira, Maria Luiza; Dalcin, Paulo de Tarso Roth
2013-01-01
OBJECTIVE: To evaluate the diagnostic contribution of molecular analysis of the cystic fibrosis transmembrane conductance regulator (CFTR) gene in patients suspected of having mild or atypical cystic fibrosis (CF). METHODS: This was a cross-sectional study involving adolescents and adults aged ≥ 14 years. Volunteers underwent clinical, laboratory, and radiological evaluation, as well as spirometry, sputum microbiology, liver ultrasound, sweat tests, and molecular analysis of the CFTR gene. We then divided the patients into three groups by the number of mutations identified (none, one, and two or more) and compared those groups in terms of their characteristics. RESULTS: We evaluated 37 patients with phenotypic findings of CF, with or without sweat test confirmation. The mean age of the patients was 32.5 ± 13.6 years, and females predominated (75.7%). The molecular analysis contributed to the definitive diagnosis of CF in 3 patients (8.1%), all of whom had at least two mutations. There were 7 patients (18.9%) with only one mutation and 26 patients (70.3%) with no mutations. None of the clinical characteristics evaluated was found to be associated with the genetic diagnosis. The most common mutation was p.F508del, which was found in 5 patients. The combination of p.V232D and p.F508del was found in 2 patients. Other mutations identified were p.A559T, p.D1152H, p.T1057A, p.I148T, p.V754M, p.P1290P, p.R1066H, and p.T351S. CONCLUSIONS: The molecular analysis of the CFTR gene coding region showed a limited contribution to the diagnostic investigation of patients suspected of having mild or atypical CF. In addition, there were no associations between the clinical characteristics and the genetic diagnosis. PMID:23670503
Yu, Hong; Kong, Lingfeng; Li, Qi
2016-01-01
In this study, we evaluated the efficacy of 12 mitochondrial protein-coding genes from 238 mitochondrial genomes of 140 molluscan species as potential DNA barcodes for mollusks. Three barcoding methods (distance, monophyly and character-based methods) were used in species identification. The species recovery rates based on genetic distances for the 12 genes ranged from 70.83 to 83.33%. There were no significant differences in intra- or interspecific variability among the 12 genes. The monophyly and character-based methods provided higher resolution than the distance-based method in species delimitation. The character-based method showed some advantages, especially in closely related taxa. The results suggested that, besides the standard COI barcode, the other 11 mitochondrial protein-coding genes could also potentially be used as molecular diagnostics for molluscan species discrimination. Our results also showed that combining mitochondrial genes did not enhance the efficacy of species identification and that a single mitochondrial gene would be fully competent.
Hu, Bo; Liu, Dong-Xing; Zhang, Yu-Qing; Song, Jian-Tao; Ji, Xian-Fei; Hou, Zhi-Qiang; Zhang, Zhen-Hai
2016-05-01
In this study we sequenced, for the first time, the complete mitochondrial genome of a cardiomyopathic Syrian hamster (Mesocricetus auratus), a model of heart failure. The total length of the mitogenome was 16,267 bp. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region.
Yong, Hoi-Sen; Song, Sze-Looi; Lim, Phaik-Eem; Chan, Kok-Gan; Chow, Wan-Loo; Eamsobhana, Praphathip
2015-01-01
The whole mitochondrial genome of the pest fruit fly Bactrocera arecae was obtained from next-generation sequencing of genomic DNA. It had a total length of 15,900 bp, consisting of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The control region (952 bp) was flanked by rrnS and trnI genes. The start codons included 6 ATG, 3 ATT and 1 each of ATA, ATC, GTG and TCG. Eight TAA, two TAG, one incomplete TA and two incomplete T stop codons were represented in the protein-coding genes. The cloverleaf structure for trnS1 lacked the D-loop, and that of trnN and trnF lacked the TΨC-loop. Molecular phylogeny based on 13 protein-coding genes was concordant with 37 mitochondrial genes, with B. arecae having closest genetic affinity to B. tryoni. The subgenus Bactrocera of Dacini tribe and the Dacinae subfamily (Dacini and Ceratitidini tribes) were monophyletic. The whole mitogenome of B. arecae will serve as a useful dataset for studying the genetics, systematics and phylogenetic relationships of the many species of Bactrocera genus in particular, and tephritid fruit flies in general. PMID:26472633
Li, Yali; Tan, Yanxiao; Shao, Yun; Li, Mingjun; Ma, Fengwang
2015-05-01
Diacylglycerol kinase (DGK) is a pivotal enzyme that phosphorylates diacylglycerol (DAG) to form phosphatidic acid (PA). The production of PA from phospholipase D (PLD) and the coupled phospholipase C (PLC)/DGK route is a critical signaling process in animal and plant cells. Next to PLD, DGK is the second most important generator of PA in biotic and abiotic stress responses. We identified 8 DGK members within the apple genome and all of their putative proteins contain one DGK catalytic domain and one DGK accessory domain. Four coding sequences were confirmed by cloning from Malus prunifolia. Phylogenetic and gene structure analyses showed that the apple DGK genes could be assigned to Clusters I, II, or III. Expression analysis of 6 of them revealed that their transcript levels were highest in stems. Some apple DGK genes were also significantly up-regulated in response to salt and drought stresses. This suggested their possible roles in plant defenses against environmental challenges. As a first step toward genome-wide analyses of the DGK genes in woody plants, our results imply that apple DGK genes are involved in the signaling of stress responses. These findings will contribute to further functional dissection of this gene family. Copyright © 2015 Elsevier B.V. All rights reserved.
Shimizu, Takeo; Kanematsu, Satoko; Yaegashi, Hajime
2018-04-24
Understanding the molecular mechanisms of pathogenesis is useful in developing effective control methods for fungal diseases. The white root rot fungus Rosellinia necatrix is a soil-borne pathogen that causes serious economic losses in various crops, including fruit trees, worldwide. Here, using next-generation sequencing techniques, we first produced a 44-Mb draft genome sequence of R. necatrix strain W97, an isolate from Japan, in which 12,444 protein-coding genes were predicted. To survey differentially expressed genes (DEGs) associated with the pathogenesis of the fungus, the hypovirulent W97 strain infected with Rosellinia necatrix megabirnavirus 1 (RnMBV1) was used for a comprehensive transcriptome analysis. In total, 545 and 615 genes were up- and down-regulated, respectively, in R. necatrix infected with RnMBV1. Gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses of the DEGs suggested that primary and secondary metabolism would be greatly disturbed in R. necatrix infected with RnMBV1. Genes encoding transcriptional regulators and plant cell wall-degrading enzymes, and genes involved in the production of toxins such as cytochalasin E, were also found among the DEGs. The genetic resources provided in this study will accelerate the discovery of genes associated with pathogenesis and other biological characteristics of R. necatrix, thus contributing to disease control.
Antisense transcriptional interference mediates condition-specific gene repression in budding yeast.
Nevers, Alicia; Doyen, Antonia; Malabat, Christophe; Néron, Bertrand; Kergrohen, Thomas; Jacquier, Alain; Badis, Gwenael
2018-05-18
Pervasive transcription generates many unstable non-coding transcripts in budding yeast. The transcription of such non-coding RNAs, in particular antisense RNAs (asRNAs), has been shown in a few examples to repress the expression of the associated mRNAs. Yet such a mechanism is not known to commonly contribute to the regulation of a given class of genes. Using a mutant context that stabilized pervasive transcripts, we observed that the least expressed mRNAs during the exponential phase were associated with high levels of asRNAs. These asRNAs also overlapped their corresponding gene promoters with a much higher frequency than average. Interrupting antisense transcription of a subset of genes corresponding to quiescence-enriched mRNAs restored their expression. The underlying mechanism acts in cis and involves several chromatin modifiers. Our results indicate that transcription interference represses up to 30% of the 590 least expressed genes, which include 163 genes with quiescence-enriched mRNAs. We also found that pervasive transcripts constitute a higher fraction of the transcriptome in quiescence relative to the exponential phase, consistent with gene expression itself playing an important role in suppressing pervasive transcription. Accordingly, the HIS1 asRNA, normally only present in quiescence, is expressed in exponential phase upon HIS1 mRNA transcription interruption.
Steinberg, Karyn Meltz; Ramachandran, Dhanya; Patel, Viren C; Shetty, Amol C; Cutler, David J; Zwick, Michael E
2012-09-28
Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions from these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects. PMID:23020841
Modeling Host Genetic Regulation of Influenza Pathogenesis in the Collaborative Cross
Ferris, Martin T.; Aylor, David L.; Bottomly, Daniel; Whitmore, Alan C.; Aicher, Lauri D.; Bell, Timothy A.; Bradel-Tretheway, Birgit; Bryan, Janine T.; Buus, Ryan J.; Gralinski, Lisa E.; Haagmans, Bart L.; McMillan, Leonard; Miller, Darla R.; Rosenzweig, Elizabeth; Valdar, William; Wang, Jeremy; Churchill, Gary A.; Threadgill, David W.; McWeeney, Shannon K.; Katze, Michael G.; Pardo-Manuel de Villena, Fernando; Baric, Ralph S.; Heise, Mark T.
2013-01-01
Genetic variation contributes to host responses and outcomes following infection by influenza A virus or other viral infections. Yet narrow windows of disease symptoms and confounding environmental factors have made it difficult to identify polymorphic genes that contribute to differential disease outcomes in human populations. Therefore, to control for these confounding environmental variables in a system that models the levels of genetic diversity found in outbred populations such as humans, we used incipient lines of the highly genetically diverse Collaborative Cross (CC) recombinant inbred (RI) panel (the pre-CC population) to study how genetic variation impacts influenza associated disease across a genetically diverse population. A wide range of variation in influenza disease related phenotypes including virus replication, virus-induced inflammation, and weight loss was observed. Many of the disease associated phenotypes were correlated, with viral replication and virus-induced inflammation being predictors of virus-induced weight loss. Despite these correlations, pre-CC mice with unique and novel disease phenotype combinations were observed. We also identified sets of transcripts (modules) that were correlated with aspects of disease. In order to identify how host genetic polymorphisms contribute to the observed variation in disease, we conducted quantitative trait loci (QTL) mapping. We identified several QTL contributing to specific aspects of the host response including virus-induced weight loss, titer, pulmonary edema, neutrophil recruitment to the airways, and transcriptional expression. Existing whole-genome sequence data was applied to identify high priority candidate genes within QTL regions. A key host response QTL was located at the site of the known anti-influenza Mx1 gene. 
We sequenced the coding regions of Mx1 in the eight CC founder strains, and identified a novel Mx1 allele that showed reduced ability to inhibit viral replication, while maintaining protection from weight loss. PMID:23468633
The complete mitochondrial genome of the Giant Manta ray, Manta birostris.
Hinojosa-Alvarez, Silvia; Díaz-Jaimes, Pindaro; Marcet-Houben, Marina; Gabaldón, Toni
2015-01-01
The complete mitochondrial genome of the giant manta ray (Manta birostris) consists of 18,075 bp with rich A + T and low G content. Gene organization and length are similar to those of other ray species. It comprises 13 protein-coding genes, 2 rRNA genes, 23 tRNA genes, 1 non-coding sequence and the control region. We identified an AT tandem repeat region, similar to that reported in Mobula japanica.
Origins of Genes: "Big Bang" or Continuous Creation?
NASA Astrophysics Data System (ADS)
Keese, Paul K.; Gibbs, Adrian
1992-10-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
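The overprinting idea in the abstract above, reading one nucleotide sequence in a different frame to obtain a different polypeptide, can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the function name `translate` and the nine-base example sequence are hypothetical illustrations and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA sequence starting at offset `frame` (0, 1 or 2).

    Stop codons are written as '*'; trailing bases that do not fill a
    complete codon are ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical toy sequence: the same bases give unrelated peptides per frame.
toy_sequence = "ATGGCCTGA"
for frame in range(3):
    print(frame, translate(toy_sequence, frame))
```

Each frame yields a different amino-acid string from identical bases, which is why an out-of-frame translation can be informationally novel relative to the original gene.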
Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni
2005-01-01
Background Protein kinases are a well-defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well-known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16, respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent from other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicate that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices and the propensity of cysteines to participate in disulfide bridges. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helix and signal peptide prediction, provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747
Diversity, expansion, and evolutionary novelty of plant DNA-binding transcription factor families.
Lehti-Shiu, Melissa D; Panchy, Nicholas; Wang, Peipei; Uygun, Sahra; Shiu, Shin-Han
2017-01-01
Plant transcription factors (TFs) that interact with specific sequences via DNA-binding domains are crucial for regulating transcriptional initiation and are fundamental to plant development and environmental response. In addition, expansion of TF families has allowed functional divergence of duplicate copies, which has contributed to novel, and in some cases adaptive, traits in plants. Thus, TFs are central to the generation of the diverse plant species that we see today. Major plant agronomic traits, including those relevant to domestication, have also frequently arisen through changes in TF coding sequence or expression patterns. Here our goal is to provide an overview of plant TF evolution by first comparing the diversity of DNA-binding domains and the sizes of these domain families in plants and other eukaryotes. Because TFs are among the most highly expanded gene families in plants, the birth and death process of TFs as well as the mechanisms contributing to their retention are discussed. We also provide recent examples of how TFs have contributed to novel traits that are important in plant evolution and in agriculture. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer. Copyright © 2016 Elsevier B.V. All rights reserved.
Davis, Matthew P; Carrieri, Claudia; Saini, Harpreet K; van Dongen, Stijn; Leonardi, Tommaso; Bussotti, Giovanni; Monahan, Jack M; Auchynnikava, Tania; Bitetti, Angelo; Rappsilber, Juri; Allshire, Robin C; Shkumatava, Alena; O'Carroll, Dónal; Enright, Anton J
2017-07-01
Spermatogenesis is associated with major and unique changes to chromosomes and chromatin. Here, we sought to understand the impact of these changes on spermatogenic transcriptomes. We show that long terminal repeats (LTRs) of specific mouse endogenous retroviruses (ERVs) drive the expression of many long non-coding transcripts (lncRNA). This process occurs post-mitotically predominantly in spermatocytes and round spermatids. We demonstrate that this transposon-driven lncRNA expression is a conserved feature of vertebrate spermatogenesis. We propose that transposon promoters are a mechanism by which the genome can explore novel transcriptional substrates, increasing evolutionary plasticity and allowing for the genesis of novel coding and non-coding genes. Accordingly, we show that a small fraction of these novel ERV-driven transcripts encode short open reading frames that produce detectable peptides. Finally, we find that distinct ERV elements from the same subfamilies act as differentially activated promoters in a tissue-specific context. In summary, we demonstrate that LTRs can act as tissue-specific promoters and contribute to post-mitotic spermatogenic transcriptome diversity. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Zhang, Fan; Zhang, Liang; Zhang, Caiguo
2016-01-01
The human genome contains a large number of nonprotein-coding sequences. Recently, new discoveries in the functions of nonprotein-coding sequences have demonstrated that the "Dark Genome" significantly contributes to human diseases, especially with regard to cancer. Of particular interest in this review are long noncoding RNAs (lncRNAs), which comprise a class of nonprotein-coding transcripts that are longer than 200 nucleotides. Accumulating evidence indicates that a large number of lncRNAs exhibit genetic associations with tumorigenesis, tumor progression, and metastasis. Our current understanding of the molecular bases of these cancer-associated lncRNAs indicates that they play critical roles in gene transcription, translation, and chromatin modification. Therapeutic strategies based on the targeting of lncRNAs to disrupt their expression or their functions are being developed. In this review, we briefly summarize and discuss the genetic associations and the aberrant expression of lncRNAs in cancer, with a particular focus on studies that have revealed the molecular mechanisms of lncRNAs in tumorigenesis. In addition, we discuss different therapeutic strategies that involve the targeting of lncRNAs.
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.
Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H
2017-12-20
Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
Effect of genomic distance on coexpression of coregulated genes in E. coli
Merino, Enrique; Marchal, Kathleen; Collado-Vides, Julio
2017-01-01
In prokaryotes, genomic distance is a feature that in addition to coregulation affects coexpression. Several observations, such as genomic clustering of highly coexpressed small regulons, support the idea that coexpression behavior of coregulated genes is affected by the distance between the coregulated genes. However, the specific contribution of distance in addition to coregulation in determining the degree of coexpression has not yet been studied systematically. In this work, we exploit the rich information in RegulonDB to study how the genomic distance between coregulated genes affects their degree of coexpression, measured by pairwise similarity of expression profiles obtained under a large number of conditions. We observed that, in general, coregulated genes display higher degrees of coexpression as they are more closely located on the genome. This contribution of genomic distance in determining the degree of coexpression was relatively small compared to the degree of coexpression that was determined by the tightness of the coregulation (degree of overlap of regulatory programs) but was shown to be evolutionarily constrained. In addition, the distance effect was sufficient to guarantee coexpression of coregulated genes that are located at very short distances, irrespective of their tightness of coregulation. This is partly but definitely not always because the close distance is also the cause of the coregulation. In cases where it is not, we hypothesize that the effect of the distance on coexpression could be caused by the fact that coregulated genes closely located to each other are also relatively more equidistantly located from their common TF and therefore subject to more similar levels of TF molecules. The absolute genomic distance of the coregulated genes to their common TF-coding gene tends to be less important in determining the degree of coexpression. Our results pinpoint the importance of taking into account the combined effect of distance and coregulation when studying prokaryotic coexpression and transcriptional regulation. PMID:28419102
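As a rough illustration of the kind of analysis the abstract above describes, the following Python sketch pairs up the genes of one regulon and reports their genomic distance alongside the similarity of their expression profiles. It is a minimal sketch under stated assumptions, not the authors' pipeline: Pearson correlation stands in for the unspecified "pairwise similarity" measure, distance is taken the short way around a circular chromosome, and the gene names, coordinates, genome length and expression values are all made-up toy data.

```python
import math
from itertools import combinations


def circular_distance(pos_a, pos_b, genome_length):
    """Shortest distance (bp) between two positions on a circular chromosome."""
    direct = abs(pos_a - pos_b) % genome_length
    return min(direct, genome_length - direct)


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles.

    Assumption: the abstract only says "pairwise similarity"; Pearson is
    one common choice, used here purely for illustration.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def distance_vs_coexpression(positions, profiles, genes, genome_length):
    """Return (gene_a, gene_b, distance, correlation) per pair, nearest first."""
    rows = []
    for gene_a, gene_b in combinations(genes, 2):
        dist = circular_distance(positions[gene_a], positions[gene_b], genome_length)
        corr = pearson(profiles[gene_a], profiles[gene_b])
        rows.append((gene_a, gene_b, dist, corr))
    return sorted(rows, key=lambda row: row[2])


# Hypothetical toy regulon: names, coordinates and expression values are invented.
TOY_GENOME_LENGTH = 1_000
toy_positions = {"geneA": 10, "geneB": 40, "geneC": 990}
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}

toy_pairs = distance_vs_coexpression(
    toy_positions, toy_profiles, sorted(toy_positions), TOY_GENOME_LENGTH
)
for gene_a, gene_b, dist, corr in toy_pairs:
    print(f"{gene_a}-{gene_b}: distance={dist} bp, r={corr:.2f}")
```

With real data, the same table of (distance, correlation) pairs could be binned by distance to look for the trend the abstract reports; controlling for tightness of coregulation, as the authors do, would need the regulatory annotations and is outside this sketch.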
Henström, Maria; Diekmann, Lena; Bonfiglio, Ferdinando; Hadizadeh, Fatemeh; Kuech, Eva-Maria; von Köckritz-Blickwede, Maren; Thingholm, Louise B; Zheng, Tenghao; Assadi, Ghazaleh; Dierks, Claudia; Heine, Martin; Philipp, Ute; Distl, Ottmar; Money, Mary E; Belheouane, Meriem; Heinsen, Femke-Anouska; Rafter, Joseph; Nardone, Gerardo; Cuomo, Rosario; Usai-Satta, Paolo; Galeazzi, Francesca; Neri, Matteo; Walter, Susanna; Simrén, Magnus; Karling, Pontus; Ohlsson, Bodil; Schmidt, Peter T; Lindberg, Greger; Dlugosz, Aldona; Agreus, Lars; Andreasson, Anna; Mayer, Emeran; Baines, John F; Engstrand, Lars; Portincasa, Piero; Bellini, Massimo; Stanghellini, Vincenzo; Barbara, Giovanni; Chang, Lin; Camilleri, Michael; Franke, Andre; Naim, Hassan Y
2018-01-01
Objective: IBS is a common gut disorder of uncertain pathogenesis. Among other factors, genetics and certain foods are proposed to contribute. Congenital sucrase–isomaltase deficiency (CSID) is a rare genetic form of disaccharide malabsorption characterised by diarrhoea, abdominal pain and bloating, which are features common to IBS. We tested sucrase–isomaltase (SI) gene variants for their potential relevance in IBS. Design: We sequenced SI exons in seven familial cases, and screened four CSID mutations (p.Val557Gly, p.Gly1073Asp, p.Arg1124Ter and p.Phe1745Cys) and a common SI coding polymorphism (p.Val15Phe) in a multicentre cohort of 1887 cases and controls. We studied the effect of the 15Val to 15Phe substitution on SI function in vitro. We analysed p.Val15Phe genotype in relation to IBS status, stool frequency and faecal microbiota composition in 250 individuals from the general population. Results: CSID mutations were more common in patients than asymptomatic controls (p=0.074; OR=1.84) and Exome Aggregation Consortium reference sequenced individuals (p=0.020; OR=1.57). 15Phe was detected in 6/7 sequenced familial cases, and increased IBS risk in case–control and population-based cohorts, with best evidence for diarrhoea phenotypes (combined p=0.00012; OR=1.36). In the population-based sample, 15Phe allele dosage correlated with stool frequency (p=0.026) and Parabacteroides faecal microbiota abundance (p=0.0024). The SI protein with 15Phe exhibited 35% reduced enzymatic activity in vitro compared with 15Val (p<0.05). Conclusions: SI gene variants coding for disaccharidases with defective or reduced enzymatic activity predispose to IBS. This may help the identification of individuals at risk, and contribute to personalising treatment options in a subset of patients. PMID:27872184
Henström, Maria; Diekmann, Lena; Bonfiglio, Ferdinando; Hadizadeh, Fatemeh; Kuech, Eva-Maria; von Köckritz-Blickwede, Maren; Thingholm, Louise B; Zheng, Tenghao; Assadi, Ghazaleh; Dierks, Claudia; Heine, Martin; Philipp, Ute; Distl, Ottmar; Money, Mary E; Belheouane, Meriem; Heinsen, Femke-Anouska; Rafter, Joseph; Nardone, Gerardo; Cuomo, Rosario; Usai-Satta, Paolo; Galeazzi, Francesca; Neri, Matteo; Walter, Susanna; Simrén, Magnus; Karling, Pontus; Ohlsson, Bodil; Schmidt, Peter T; Lindberg, Greger; Dlugosz, Aldona; Agreus, Lars; Andreasson, Anna; Mayer, Emeran; Baines, John F; Engstrand, Lars; Portincasa, Piero; Bellini, Massimo; Stanghellini, Vincenzo; Barbara, Giovanni; Chang, Lin; Camilleri, Michael; Franke, Andre; Naim, Hassan Y; D'Amato, Mauro
2018-02-01
IBS is a common gut disorder of uncertain pathogenesis. Among other factors, genetics and certain foods are proposed to contribute. Congenital sucrase-isomaltase deficiency (CSID) is a rare genetic form of disaccharide malabsorption characterised by diarrhoea, abdominal pain and bloating, which are features common to IBS. We tested sucrase-isomaltase (SI) gene variants for their potential relevance in IBS. We sequenced SI exons in seven familial cases, and screened four CSID mutations (p.Val557Gly, p.Gly1073Asp, p.Arg1124Ter and p.Phe1745Cys) and a common SI coding polymorphism (p.Val15Phe) in a multicentre cohort of 1887 cases and controls. We studied the effect of the 15Val to 15Phe substitution on SI function in vitro. We analysed p.Val15Phe genotype in relation to IBS status, stool frequency and faecal microbiota composition in 250 individuals from the general population. CSID mutations were more common in patients than asymptomatic controls (p=0.074; OR=1.84) and Exome Aggregation Consortium reference sequenced individuals (p=0.020; OR=1.57). 15Phe was detected in 6/7 sequenced familial cases, and increased IBS risk in case-control and population-based cohorts, with best evidence for diarrhoea phenotypes (combined p=0.00012; OR=1.36). In the population-based sample, 15Phe allele dosage correlated with stool frequency (p=0.026) and Parabacteroides faecal microbiota abundance (p=0.0024). The SI protein with 15Phe exhibited 35% reduced enzymatic activity in vitro compared with 15Val (p<0.05). SI gene variants coding for disaccharidases with defective or reduced enzymatic activity predispose to IBS. This may help the identification of individuals at risk, and contribute to personalising treatment options in a subset of patients. Published by the BMJ Publishing Group Limited.
Evidence for a Complex Class of Nonadenylated mRNA in Drosophila
Zimmerman, J. Lynn; Fouts, David L.; Manning, Jerry E.
1980-01-01
The amount, by mass, of poly(A+) mRNA present in the polyribosomes of third-instar larvae of Drosophila melanogaster, and the relative contribution of the poly(A+) mRNA to the sequence complexity of total polysomal RNA, has been determined. Selective removal of poly(A+) mRNA from total polysomal RNA by use of either oligo-dT-cellulose, or poly(U)-sepharose affinity chromatography, revealed that only 0.15% of the mass of the polysomal RNA was present as poly(A+) mRNA. The present study shows that this RNA hybridized at saturation with 3.3% of the single-copy DNA in the Drosophila genome. After correction for asymmetric transcription and reactability of the DNA, 7.4% of the single-copy DNA in the Drosophila genome is represented in larval poly(A+) mRNA. This corresponds to 6.73 x 10^6 nucleotides of mRNA coding sequences, or approximately 5,384 diverse RNA sequences of average size 1,250 nucleotides. However, total polysomal RNA hybridizes at saturation to 10.9% of the single-copy DNA sequences. After correcting this value for asymmetric transcription and tracer DNA reactability, 24% of the single-copy DNA in Drosophila is represented in total polysomal RNA. This corresponds to 2.18 x 10^7 nucleotides of RNA coding sequences or 17,440 diverse RNA molecules of size 1,250 nucleotides. This value is 3.2 times greater than that observed for poly(A+) mRNA, and indicates that ≃69% of the polysomal RNA sequence complexity is contributed by nonadenylated RNA. Furthermore, if the number of different structural genes represented in total polysomal RNA is ≃1.7 x 10^4, then the number of genes expressed in third-instar larvae exceeds the number of chromomeres in Drosophila by about a factor of three. This numerology indicates that the number of chromomeres observed in polytene chromosomes does not reflect the number of structural gene sequences in the Drosophila genome. PMID:6777246
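The arithmetic in the abstract above can be retraced in a few lines of Python. This is only a consistency check on the quoted figures (sequence complexities of 6.73 million and 21.8 million nucleotides, and an average message length of 1,250 nucleotides); the variable and function names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract (nucleotides).
AVERAGE_MRNA_LENGTH_NT = 1_250
POLY_A_COMPLEXITY_NT = 6.73e6          # poly(A+) mRNA
TOTAL_POLYSOMAL_COMPLEXITY_NT = 2.18e7  # total polysomal RNA


def sequence_count(complexity_nt, average_length_nt=AVERAGE_MRNA_LENGTH_NT):
    """Number of distinct average-sized sequences that fit in a given complexity."""
    return round(complexity_nt / average_length_nt)


poly_a_sequences = sequence_count(POLY_A_COMPLEXITY_NT)
total_sequences = sequence_count(TOTAL_POLYSOMAL_COMPLEXITY_NT)

# How many times more complex total polysomal RNA is than poly(A+) mRNA.
fold_excess = TOTAL_POLYSOMAL_COMPLEXITY_NT / POLY_A_COMPLEXITY_NT

# Share of the total complexity that is not accounted for by poly(A+) mRNA.
nonadenylated_fraction = 1 - POLY_A_COMPLEXITY_NT / TOTAL_POLYSOMAL_COMPLEXITY_NT

print(f"poly(A+) sequences:       {poly_a_sequences:,}")
print(f"total polysomal sequences: {total_sequences:,}")
print(f"fold excess:               {fold_excess:.1f}")
print(f"nonadenylated share:       {nonadenylated_fraction:.0%}")
```

The results match the abstract's 5,384 and 17,440 sequences, the 3.2-fold difference and the roughly 69% nonadenylated share.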
Li, Chaoqun; Cao, Feifei; Li, Shengli; Huang, Shenglin; Li, Wei; Abumaria, Nashat
2018-01-01
Although studies provide insights into the neurobiology of stress and depression, the exact molecular mechanisms underlying their pathologies remain largely unknown. Long non-coding RNA (lncRNA) has been implicated in brain functions and behavior. A potential link between lncRNA and psychiatric disorders has been proposed. However, it remains undetermined whether lncRNA regulation in the brain contributes to stress or depression pathologies. In this study, we used a valid animal model of depression-like symptoms, namely learned helplessness, together with RNA-seq, Gene Ontology and co-expression network analyses to profile the expression pattern of lncRNA and mRNA in the hippocampus of mice. We identified 6346 differentially expressed transcripts. Among them, 340 lncRNAs and 3559 protein coding mRNAs were differentially expressed in helpless mice in comparison with control and/or non-helpless mice (inescapable stress resilient mice). Gene Ontology and pathway enrichment analyses indicated that induction of helplessness altered expression of mRNAs enriched in fundamental biological functions implicated in stress/depression neurobiology such as synaptic, metabolic, cell survival and proliferation, developmental and chromatin modification functions. To explore the possible regulatory roles of the altered lncRNAs, we constructed co-expression networks composed of the lncRNAs and mRNAs. Among our differentially expressed lncRNAs, 17% showed significant correlation with genes. Functional co-expression analysis linked the identified lncRNAs to several cellular mechanisms implicated in stress/depression neurobiology. Importantly, 57% of the identified regulatory lncRNAs significantly correlated with 18 different synapse-related functions. Thus, the current study identifies for the first time distinct groups of lncRNAs regulated by induction of learned helplessness in the mouse brain. Our results suggest that lncRNA-directed regulatory mechanisms might contribute to stress-induced pathologies; in particular, to inescapable stress-induced synaptic modifications. PMID:29375311
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).
Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren
2016-04-01
Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke is not yet known. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum clustered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575
Roadway contributing factors in traffic crashes.
DOT National Transportation Integrated Search
2014-09-01
This project involved an evaluation of the codes which relate to roadway contributing factors. This included a review of relevant codes used in other states. Crashes with related codes were summarized and analyzed. A sample of crash sites was ins...
The opportunities and challenges of large-scale molecular approaches to songbird neurobiology
Mello, C.V.; Clayton, D.F.
2014-01-01
High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curation, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907
Manning, Viola A.; Pandelova, Iovanna; Dhillon, Braham; Wilhelm, Larry J.; Goodwin, Stephen B.; Berlin, Aaron M.; Figueroa, Melania; Freitag, Michael; Hane, James K.; Henrissat, Bernard; Holman, Wade H.; Kodira, Chinnappa D.; Martin, Joel; Oliver, Richard P.; Robbertse, Barbara; Schackwitz, Wendy; Schwartz, David C.; Spatafora, Joseph W.; Turgeon, B. Gillian; Yandava, Chandri; Young, Sarah; Zhou, Shiguo; Zeng, Qiandong; Grigoriev, Igor V.; Ma, Li-Jun; Ciuffetti, Lynda M.
2013-01-01
Pyrenophora tritici-repentis is a necrotrophic fungus causal to the disease tan spot of wheat, whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes. PMID:23316438
DOE Office of Scientific and Technical Information (OSTI.GOV)
Manning, Viola A.; Pandelova, Iovanna; Dhillon, Braham
2012-08-16
Pyrenophora tritici-repentis is a necrotrophic fungus causal to the disease tan spot of wheat, whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.
Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas
2017-01-01
B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.
A study of the role of the FOXP2 and CNTNAP2 genes in persistent developmental stuttering.
Han, Tae-Un; Park, John; Domingues, Carlos F; Moretti-Ferreira, Danilo; Paris, Emily; Sainz, Eduardo; Gutierrez, Joanne; Drayna, Dennis
2014-09-01
A number of speech disorders including stuttering have been shown to have important genetic contributions, as indicated by high heritability estimates from twin and other studies. We studied the potential contribution to stuttering from variants in the FOXP2 gene, which have previously been associated with developmental verbal dyspraxia, and from variants in the CNTNAP2 gene, which have been associated with specific language impairment (SLI). DNA sequence analysis of these two genes in a group of 602 unrelated cases, all with familial persistent developmental stuttering, revealed no excess of potentially deleterious coding sequence variants in the cases compared to a matched group of 487 well characterized neurologically normal controls. This was compared to the distribution of variants in the GNPTAB, GNPTG, and NAGPA genes which have previously been associated with persistent stuttering. Using an expanded subject data set, we again found that NAGPA showed significantly different mutation frequencies in North Americans of European descent (p=0.0091) and a significant difference existed in the mutation frequency of GNPTAB in Brazilians (p=0.00050). No significant differences in mutation frequency in the FOXP2 and CNTNAP2 genes were observed between cases and controls. To examine the pattern of expression of these five genes in the human brain, real time quantitative reverse transcription PCR was performed on RNA purified from 27 different human brain regions. The expression patterns of FOXP2 and CNTNAP2 were generally different from those of GNPTAB, GNPTG and NAGPA in terms of relatively lower expression in the cerebellum. This study provides an improved estimate of the contribution of mutations in GNPTAB, GNPTG and NAGPA to persistent stuttering, and suggests that variants in FOXP2 and CNTNAP2 are not involved in the genesis of familial persistent stuttering. This, together with the different brain expression patterns of GNPTAB, GNPTG, and NAGPA compared to that of FOXP2 and CNTNAP2, suggests that the genetic neuropathological origins of stuttering differ from those of verbal dyspraxia and SLI. Published by Elsevier Inc.
Setoh, Yin Xiang; Prow, Natalie A; Rawle, Daniel J; Tan, Cindy Si En; Edmonds, Judith H; Hall, Roy A; Khromykh, Alexander A
2015-06-01
A variant Australian West Nile virus (WNV) strain, WNVNSW2011, emerged in 2011 causing an unprecedented outbreak of encephalitis in horses in south-eastern Australia. However, no human cases associated with this strain have yet been reported. Studies using mouse models for WNV pathogenesis showed that WNVNSW2011 was less virulent than the human-pathogenic American strain of WNV, New York 99 (WNVNY99). To identify viral genes and mutations responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, we constructed chimeric viruses with substitution of large genomic regions coding for the structural genes, non-structural genes and untranslated regions, as well as seven individual non-structural gene chimeras, using a modified circular polymerase extension cloning method. Our results showed that the complete non-structural region of WNVNSW2011, when substituted with that of WNVNY99, significantly enhanced viral replication and the ability to suppress type I IFN response in cells, resulting in higher virulence in mice. Analysis of the individual non-structural gene chimeras showed a predominant contribution of WNVNY99 NS3 to increased virus replication and evasion of IFN response in cells, and to virulence in mice. Other WNVNY99 non-structural proteins (NS2A, NS4B and NS5) were shown to contribute to the modulation of IFN response. Thus a combination of non-structural proteins, likely NS2A, NS3, NS4B and NS5, is primarily responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, and accumulative mutations within these proteins would likely be required for the Australian WNVNSW2011 strain to become significantly more virulent. © 2015 The Authors.
LncRNA-DANCR: A valuable cancer related long non-coding RNA for human cancers.
Thin, Khaing Zar; Liu, Xuefang; Feng, Xiaobo; Raveendran, Sudheesh; Tu, Jian Cheng
2018-06-01
Long noncoding RNAs (lncRNAs) are a type of noncoding RNA comprising sequences longer than 200 nucleotides. They can regulate chromosome structure and gene expression, and they play an essential role in the pathophysiology of human diseases, especially in tumorigenesis and tumour progression. They are now being targeted as potential biomarkers for various cancer types, and many research studies suggest that lncRNAs might bring a new era to cancer diagnosis and support treatment management. The purpose of this review was to examine the molecular mechanism and clinical significance of the long non-coding RNA differentiation antagonizing nonprotein coding RNA (DANCR) in various types of human cancers. In this review, we summarize recent research studies concerning the expression and biological mechanisms of lncRNA-DANCR in tumour development. The related studies were obtained through a systematic search of PubMed, Embase and Cochrane Library. LncRNA-DANCR is a valuable cancer-related lncRNA whose dysregulated expression has been found in a variety of malignancies, including hepatocellular carcinoma, breast cancer, glioma, colorectal cancer, gastric cancer, and lung cancer. The aberrant expression of DANCR has been shown to contribute to proliferation, migration and invasion of cancer cells. LncRNA-DANCR likely serves as a useful disease biomarker or therapeutic cancer target. Copyright © 2018 Elsevier GmbH. All rights reserved.
Lesmana, Harry; Dyer, Lisa; Li, Xia; Denton, James; Griffiths, Jenna; Chonat, Satheesh; Seu, Katie G; Heeney, Matthew M; Zhang, Kejian; Hopkin, Robert J; Kalfa, Theodosia A
2018-03-01
Pyruvate kinase deficiency (PKD) is the most frequent red blood cell enzyme abnormality of the glycolytic pathway and the most common cause of hereditary nonspherocytic hemolytic anemia. Over 250 PKLR-gene mutations have been described, including missense/nonsense, splicing and regulatory mutations, small insertions, and small and gross deletions, causing PKD and hemolytic anemia of variable severity. Alu retrotransposons are the most abundant mobile DNA sequences in the human genome, contributing to almost 11% of its mass. Alu insertions have been associated with a number of human diseases, either by disrupting a coding region or a splice signal. Here, we report on two unrelated Middle Eastern patients, both born to consanguineous parents, with transfusion-dependent hemolytic anemia, in whom sequence analysis revealed a homozygous insertion of AluYb9 within exon 6 of the PKLR gene, causing a precipitous decrease of PKLR RNA levels. This Alu element insertion constitutes a previously unrecognized mechanism underlying the pathogenesis of PKD. © 2017 Wiley Periodicals, Inc.
Establishing the role of rare coding variants in known Parkinson's disease risk loci.
Jansen, Iris E; Gibbs, J Raphael; Nalls, Mike A; Price, T Ryan; Lubbe, Steven; van Rooij, Jeroen; Uitterlinden, André G; Kraaij, Robert; Williams, Nigel M; Brice, Alexis; Hardy, John; Wood, Nicholas W; Morris, Huw R; Gasser, Thomas; Singleton, Andrew B; Heutink, Peter; Sharma, Manu
2017-11-01
Many common genetic factors have been identified to contribute to Parkinson's disease (PD) susceptibility, improving our understanding of the related underlying biological mechanisms. The involvement of rarer variants in these loci has been poorly studied. Using International Parkinson's Disease Genomics Consortium data sets, we performed a comprehensive study to determine the impact of rare variants in 23 previously published genome-wide association studies (GWAS) loci in PD. We applied Prix fixe to select the putative causal genes underneath the GWAS peaks, which was based on underlying functional similarities. The Sequence Kernel Association Test was used to analyze the joint effect of rare, common, or both types of variants on PD susceptibility. All genes were tested simultaneously as a gene set and each gene individually. We observed a moderate association of common variants, confirming the involvement of the known PD risk loci within our genetic data sets. Focusing on rare variants, we identified additional association signals for LRRK2, STBD1, and SPATA19. Our study suggests an involvement of rare variants within several putatively causal genes underneath previously identified PD GWAS peaks. Copyright © 2017 Elsevier Inc. All rights reserved.
Cao, Heping; Graves, Donald J; Anderson, Richard A
2010-11-01
Cinnamon extracts (CE) are reported to have beneficial effects on people with normal and impaired glucose tolerance, the metabolic syndrome, type 2 diabetes, and insulin resistance. However, clinical results are controversial. Molecular characterization of CE effects is limited. This study investigated the effects of CE on gene expression in cultured mouse adipocytes. Water-soluble CE was prepared from ground cinnamon (Cinnamomum burmannii). Quantitative real-time PCR was used to investigate CE effects on the expression of genes coding for adipokines, glucose transporter (GLUT) family, and insulin-signaling components in mouse 3T3-L1 adipocytes. CE (100 μg/ml) increased GLUT1 mRNA levels 1.91±0.15, 4.39±0.78, and 6.98±2.18-fold of the control after 2-, 4-, and 16-h treatments, respectively. CE decreased the expression of further genes encoding insulin-signaling pathway proteins including GSK3B, IGF1R, IGF2R, and PIK3R1. This study indicates that CE regulates the expression of multiple genes in adipocytes and this regulation could contribute to the potential health benefits of CE. Published by Elsevier GmbH.
Novel variants of the 5S rRNA genes in Eruca sativa.
Singh, K; Bhatia, S; Lakshmikumaran, M
1994-02-01
The 5S ribosomal RNA (rRNA) genes of Eruca sativa were cloned and characterized. They are organized into clusters of tandemly repeated units. Each repeat unit consists of a 119-bp coding region followed by a noncoding spacer region that separates it from the coding region of the next repeat unit. Our study reports novel gene variants of the 5S rRNA genes in plants. Two families of the 5S rDNA, the 0.5-kb size family and the 1-kb size family, coexist in the E. sativa genome. The 0.5-kb size family consists of the 5S rRNA genes (S4) that have coding regions similar to those of other reported plant 5S rDNA sequences, whereas the 1-kb size family consists of the 5S rRNA gene variants (S1) that exist as 1-kb BamHI tandem repeats. S1 is made up of two variant units (V1 and V2) of 5S rDNA where the BamHI site between the two units is mutated. Sequence heterogeneity among S4, V1, and V2 units exists throughout the sequence and is not limited to the noncoding spacer region only. The coding regions of V1 and V2 show approximately 20% dissimilarity to the coding regions of S4 and other reported plant 5S rDNA sequences. Such a large variation in the coding regions of the 5S rDNA units within the same plant species has been observed for the first time. Restriction site variation is observed between the two size classes of 5S rDNA in E. sativa.(ABSTRACT TRUNCATED AT 250 WORDS)
Hernández, M Luisa; Sicardo, M Dolores; Martínez-Rivas, José M
2016-01-01
Linolenic acid is a polyunsaturated fatty acid present in plant lipids, which plays key roles in plant metabolism as a structural component of storage and membrane lipids, and as a precursor of signaling molecules. The synthesis of linolenic acid is catalyzed by two different ω-3 fatty acid desaturases, which correspond to microsomal- (FAD3) and chloroplast- (FAD7 and FAD8) localized enzymes. We have investigated the specific contribution of each enzyme to the linolenic acid content in olive fruit. With that aim, we isolated two different cDNA clones encoding two ω-3 fatty acid desaturases from olive (Olea europaea cv. Picual). Sequence analysis indicates that they code for microsomal (OepFAD3B) and chloroplast (OepFAD7-2) ω-3 fatty acid desaturase enzymes, different from the previously characterized OekFAD3A and OekFAD7-1 genes. Functional expression in yeast of the corresponding OepFAD3A and OepFAD3B cDNAs confirmed that they encode microsomal ω-3 fatty acid desaturases. The linolenic acid content and transcript levels of olive FAD3 and FAD7 genes were measured in different tissues of Picual and Arbequina cultivars, including mesocarp and seed during development and ripening of olive fruit. Gene expression and lipid analysis indicate that FAD3A is the gene mainly responsible for the linolenic acid present in the seed, while FAD7-1 and FAD7-2 contribute mostly to the linolenic acid present in the mesocarp and, therefore, in the olive oil. These results also indicate the relevance of lipid trafficking between the endoplasmic reticulum and chloroplast in determining the linolenic acid content of membrane and storage lipids in oil-accumulating photosynthetic tissues. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
The HUSH complex cooperates with TRIM28 to repress young retrotransposons and new genes.
Robbez-Masson, Luisa; Tie, Christopher H C; Conde, Lucia; Tunbak, Hale; Husovsky, Connor; Tchasovnikarova, Iva A; Timms, Richard T; Herrero, Javier; Lehner, Paul J; Rowe, Helen M
2018-05-04
Retrotransposons encompass half of the human genome and contribute to the formation of heterochromatin, which provides nuclear structure and regulates gene expression. Here, we asked if the human silencing hub (HUSH) complex is necessary to silence retrotransposons and whether it collaborates with TRIM28 and the chromatin remodeler ATRX at specific genomic loci. We show that the HUSH complex contributes to de novo repression and DNA methylation of a SVA retrotransposon reporter. By using naïve vs. primed mouse pluripotent stem cells, we reveal a critical role for the HUSH complex in naïve cells, implicating it in programming epigenetic marks in development. While the HUSH component FAM208A binds to endogenous retroviruses (ERVs) and long interspersed element-1s (LINE-1s or L1s), it is mainly required to repress evolutionarily young L1s (mouse-specific lineages less than 5 million years old). TRIM28, in contrast, is necessary to repress both ERVs and young L1s. Genes co-repressed by TRIM28 and FAM208A are evolutionarily young, or exhibit tissue-specific expression, are enriched in young L1s and display evidence for regulation through LTR promoters. Finally, we demonstrate that the HUSH complex is also required to repress L1 elements in human cells. Overall, these data indicate that the HUSH complex and TRIM28 co-repress young retrotransposons and new genes rewired by retrotransposon non-coding DNA. Published by Cold Spring Harbor Laboratory Press.
Mueller, Kathryn L; Murray, Jeffrey C; Michaelson, Jacob J; Christiansen, Morten H; Reilly, Sheena; Tomblin, J Bruce
2016-01-01
Much of our current knowledge regarding the association of FOXP2 with speech and language development comes from singleton and small family studies where a small number of rare variants have been identified. However, neither genome-wide nor gene-specific studies have provided evidence that common polymorphisms in the gene contribute to individual differences in language development in the general population. One explanation for this inconsistency is that previous studies have been limited to relatively small samples of individuals with low language abilities, using low density gene coverage. The current study examined the association between common variants in FOXP2 and a quantitative measure of language ability in a population-based cohort of European descent (n = 812). No significant associations were found for a panel of 13 SNPs that covered the coding region of FOXP2 and extended into the promoter region. Power analyses indicated we should have been able to detect a QTL variance of 0.02 for an associated allele with MAF of 0.2 or greater with 80% power. This suggests that, if a common variant associated with language ability in this gene does exist, it is likely of small effect. Our findings lead us to conclude that while genetic variants in FOXP2 may be significant for rare forms of language impairment, they do not contribute appreciably to individual variation in the normal range as found in the general population.
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
Hansen, Karina K; Hauser, Frank; Williamson, Michael; Weber, Stine B; Grimmelikhuijzen, Cornelis J P
2011-01-07
Recently, a novel neuropeptide, CCHamide, was discovered in the silkworm Bombyx mori (L. Roller et al., Insect Biochem. Mol. Biol. 38 (2008) 1147-1157). We have now found that all insects with a sequenced genome have two genes, each coding for a different CCHamide, CCHamide-1 and -2. We have also cloned and deorphanized two Drosophila G-protein-coupled receptors (GPCRs) coded for by genes CG14593 and CG30106 that are selectively activated by Drosophila CCH-amide-1 (EC(50), 2×10(-9) M) and CCH-amide-2 (EC(50), 5×10(-9) M), respectively. Gene CG30106 (symbol synonym CG14484) has in a previous publication (E.C. Johnson et al., J. Biol. Chem. 278 (2003) 52172-52178) been wrongly assigned to code for an allatostatin-B receptor. This conclusion is based on our findings that the allatostatins-B do not activate the CG30106 receptor and on the recent findings from other research groups that the allatostatins-B activate an unrelated GPCR coded for by gene CG16752. Comparative genomics suggests that a duplication of the CCHamide neuropeptide signalling system occurred after the split of crustaceans and insects, about 410 million years ago, because only one CCHamide neuropeptide gene is found in the water flea Daphnia pulex (Crustacea) and the tick Ixodes scapularis (Chelicerata). Copyright © 2010 Elsevier Inc. All rights reserved.
Heimann, Louisa; Horst, Ina; Perduns, Renke; Dreesen, Björn; Offermann, Sascha; Peterhansel, Christoph
2013-01-01
C4 photosynthesis evolved more than 60 times independently in different plant lineages. Each time, multiple genes were recruited into C4 metabolism. The corresponding promoters acquired new regulatory features such as high expression, light induction, or cell type-specific expression in mesophyll or bundle sheath cells. We have previously shown that histone modifications contribute to the regulation of the model C4 phosphoenolpyruvate carboxylase (C4-Pepc) promoter in maize (Zea mays). We here tested the light- and cell type-specific responses of three selected histone acetylations and two histone methylations on five additional C4 genes (C4-Ca, C4-Ppdk, C4-Me, C4-Pepck, and C4-RbcS2) in maize. Histone acetylation and nucleosome occupancy assays indicated extended promoter regions with regulatory upstream regions more than 1,000 bp from the transcription initiation site for most of these genes. Despite the lack of any detectable homology of the promoters at the primary sequence level, histone modification patterns were highly coregulated. Specifically, H3K9ac was regulated by illumination, whereas H3K4me3 was regulated in a cell type-specific manner. We further compared histone modifications on the C4-Pepc and C4-Me genes from maize and the homologous genes from sorghum (Sorghum bicolor) and Setaria italica. Whereas sorghum and maize share a common C4 origin, C4 metabolism evolved independently in S. italica. The distribution of histone modifications over the promoters differed between the species, but differential regulation of light-induced histone acetylation and cell type-specific histone methylation was evident in all three species. We propose that a preexisting histone code was recruited into C4 promoter control during the evolution of C4 metabolism. PMID:23564230
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2015-01-01
Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications, such as designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a relevant problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. We observed a tendency for exons that code functional sites to cluster, and such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of the existence of evolutionary limitations to exon shuffling.
These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
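The projection of functional-site residues onto exon boundaries described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the exon lengths, the residue positions, and every function name below are hypothetical, and the only convention assumed is the usual one in which a phase-0 junction falls between two codons.

```python
from itertools import accumulate


def exon_ends(cds_exon_lengths):
    """Cumulative CDS coordinate (exclusive, 0-based) at which each coding exon ends."""
    return list(accumulate(cds_exon_lengths))


def junction_phases(cds_exon_lengths):
    """Phase of every internal exon-exon junction; 0 means the junction lies between codons."""
    return [end % 3 for end in exon_ends(cds_exon_lengths)[:-1]]


def exons_for_residue(residue, cds_exon_lengths):
    """Indices (0-based) of the exons that contribute nucleotides to a 1-based residue."""
    ends = exon_ends(cds_exon_lengths)
    if residue < 1 or 3 * residue > ends[-1]:
        raise ValueError("residue lies outside the coding sequence")
    hits = set()
    for pos in range(3 * (residue - 1), 3 * residue):
        # The codon nucleotide belongs to the first exon whose end lies beyond it.
        hits.add(next(i for i, end in enumerate(ends) if pos < end))
    return sorted(hits)


def project_site(site_residues, cds_exon_lengths):
    """Map each functional-site residue to the exon(s) encoding it."""
    return {r: exons_for_residue(r, cds_exon_lengths) for r in site_residues}


# Hypothetical gene with three coding exons (100, 50 and 60 nt) and a made-up site.
EXONS = [100, 50, 60]
SITE = [10, 34, 40]

if __name__ == "__main__":
    print(junction_phases(EXONS))
    print(project_site(SITE, EXONS))
```

In this toy gene, residue 34 has a codon split across the first junction, so it maps to two exons. A real analysis would take exon coordinates from genome annotation rather than from a hand-written list.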
A Fault-Oblivious Extreme-Scale Execution Environment (FOX)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Hensbergen, Eric; Speight, William; Xenidis, Jimi
IBM Research’s contribution to the Fault Oblivious Extreme-scale Execution Environment (FOX) revolved around three core research deliverables: • collaboration with Boston University around the Kittyhawk cloud infrastructure which both enabled a development and deployment platform for the project team and provided a fault-injection testbed to evaluate prototypes • operating systems research focused on exploring role-based operating system technologies through collaboration with Sandia National Labs on the NIX research operating system and collaboration with the broader IBM Research community around a hybrid operating system model which became known as FusedOS • IBM Research also participated in an advisory capacity with the Boston University SESA project, the core of which was derived from the K42 operating system research project funded in part by DARPA’s HPCS program. Both of these contributions were built on a foundation of previous operating systems research funding by the Department of Energy’s FastOS Program. Through the course of the X-stack funding we were able to develop prototypes, deploy them on production clusters at scale, and make them available to other researchers. As newer hardware, in the form of BlueGene/Q, came online, we were able to port the prototypes to the new hardware and release the source code for the resulting prototypes as open source to the community. In addition to the open source code for the Kittyhawk and NIX prototypes, we were able to bring the BlueGene/Q Linux patches up to a more recent kernel and contribute them for inclusion by the broader Linux community. The lasting impact of the IBM Research work on FOX can be seen in its effect on the shift of IBM’s approach to HPC operating systems from Linux and Compute Node Kernels to role-based approaches as prototyped by the NIX and FusedOS work.
This impact can be seen beyond IBM in follow-on ideas being incorporated into the proposals for the Exascale Operating Systems/Runtime program.
NASA Astrophysics Data System (ADS)
Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming
2018-01-01
Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were attained from three PacBio sequencing runs and reads >500 bp with a quality value of >0.75 were merged together into a single dataset. This resultant highly-contiguous de novo assembly has no genome gaps, and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNA, transposon and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNA and transposon, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.
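The read-selection step mentioned in the abstract above (reads >500 bp with a quality value >0.75, pooled across three runs) can be sketched as follows. This is an illustration only: the `Read` record and `merge_filtered` function are hypothetical names, and real PacBio data would be read from the instrument's output files rather than built in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    name: str
    sequence: str
    quality: float  # per-read quality value between 0 and 1 (assumed representation)


def merge_filtered(runs, min_length=500, min_quality=0.75):
    """Pool reads from several runs, keeping those strictly above both thresholds."""
    merged = []
    for run in runs:
        merged.extend(
            read
            for read in run
            if len(read.sequence) > min_length and read.quality > min_quality
        )
    return merged


if __name__ == "__main__":
    # Made-up reads from two runs; only r1 and r4 pass both thresholds.
    run_a = [Read("r1", "A" * 600, 0.80), Read("r2", "A" * 400, 0.90)]
    run_b = [Read("r3", "A" * 501, 0.75), Read("r4", "A" * 1000, 0.76)]
    print([read.name for read in merge_filtered([run_a, run_b])])
```

The comparisons are strict because the abstract states ">" for both thresholds; whether the original pipeline treated boundary values that way is not stated.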
Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers
Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Jerusalem, Guy
2018-01-01
Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers. PMID:29301303
Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers.
Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Josse, Claire; Jerusalem, Guy
2018-01-02
Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers.
Soybean kinome: functional classification and gene expression patterns
Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek
2015-01-01
The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
Diversity and Phylogenetic Distribution of Extracellular Microbial Peptidases
NASA Astrophysics Data System (ADS)
Nguyen, Trang; Mueller, Ryan; Myrold, David
2017-04-01
Depolymerization of proteinaceous compounds by extracellular proteolytic enzymes is a bottleneck in the nitrogen cycle, limiting the rate of nitrogen turnover in soils. Protein degradation is accomplished by a diverse range of extracellular (secreted) peptidases. Our objective was to better understand the evolution of these enzymes and how their functional diversity corresponds to known phylogenetic diversity. Peptidase subfamilies from 110 archaeal, 1,860 bacterial, and 97 fungal genomes were extracted from the MEROPS database along with corresponding SSU sequences for each genome from the SILVA database, resulting in 43,177 secreted peptidases belonging to 34 microbial phyla and 149 peptidase subfamilies. We compared the distribution of each peptidase subfamily across all taxa to the phylogenetic relationships of these organisms based on their SSU gene sequences. The occurrence and abundance of genes coding for secreted peptidases varied across microbial taxa, distinguishing the peptidase complement of the three microbial kingdoms. Bacteria had the highest frequency of secreted peptidase coding genes per 1,000 genes, and these genes made up 1% to 6% of the gene content. Fungi had only a slightly higher secreted peptidase gene content than archaea when standardized by the total number of genes. The relative abundance profiles of secreted peptidases also varied among the microbial kingdoms: the aspartic family was most abundant in fungi (25%), whereas it accounted for only 12% in archaea and 4% in bacteria. Serine, metallo, and cysteine families consistently contributed up to 75% of the secreted peptidase abundance across the three kingdoms. Overall, bacteria had a much wider collection of secreted peptidases, whereas fungi and archaea shared most of their secreted peptidase families. Principal coordinate analysis of the peptidase subfamily-based dissimilarities showed distinguishable clusters for different groups of microorganisms.
The distribution of secreted peptidases was found to be significantly correlated with phylogenetic relationships within kingdoms (archaea rMantel=0.364, p=0.001; bacteria rMantel=0.257, p=0.001, and fungi rMantel=0.281, p=0.005), implying an evolutionary relationship in which subsets of phylogenetically related organisms share similar types of secreted peptidases. We also tested the phylogenetic signal strength of each peptidase subfamily for each microbial kingdom based on the binary traits of the distribution (presence or absence of secreted peptidase subfamilies in individual species). About one-third of the peptidase subfamilies displayed a strong evolutionary signal; the rest were phylogenetically over-dispersed, suggesting that these subfamilies are randomly distributed across the tree of life or are the result of events such as horizontal gene transfer. Study of the diversity and phylogenetic distribution of secreted peptidases offered a mechanistic basis to anticipate the proteolytic potential function of microbial communities.
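The rMantel values reported above come from Mantel tests, which correlate two distance matrices and assess significance by permuting the labels of one of them. The sketch below shows the general idea in plain Python. It assumes a Pearson correlation, a one-sided test, and 999 permutations; the abstract does not state which settings the authors used, and the function names and the toy matrix are illustrative.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle of a square matrix, optionally after relabelling rows/columns."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): the matrix correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    hits = 0
    for _ in range(permutations):
        order = list(range(len(d2)))
        rng.shuffle(order)  # permute the taxon labels of the second matrix
        if pearson(x, upper_triangle(d2, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy example: six "taxa" placed on a line; both matrices hold the same distances.
    points = [0, 1, 3, 7, 12, 20]
    dist = [[abs(a - b) for b in points] for a in points]
    print(mantel(dist, dist))
```

In the study's setting, one matrix would hold peptidase-subfamily dissimilarities between genomes and the other SSU-based phylogenetic distances between the same genomes.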
Self-complementary circular codes in coding theory.
Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz
2018-04-01
Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
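The two properties discussed in the abstract above can be checked computationally for a small trinucleotide code. The sketch below rests on assumptions that are not stated in the abstract itself: the graph construction (edges N1 → N2N3 and N1N2 → N3 for each trinucleotide N1N2N3) and the criterion that a code is circular exactly when this graph has no directed cycle are recalled from the graph-theory approach the abstract cites (Fimmel et al. 2016), and the 20-trinucleotide set `X_CODE` is recalled from the literature on the code identified by Arquès and Michel. Both should be verified against those papers before being relied on.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(trinucleotide):
    """Reverse complement of a DNA word, e.g. AAC -> GTT."""
    return trinucleotide.translate(COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the code contains the reverse complement of each of its words."""
    return all(reverse_complement(t) in code for t in code)


def code_graph(code):
    """Directed graph with edges N1 -> N2N3 and N1N2 -> N3 for every word N1N2N3 (assumed construction)."""
    edges = {}
    for t in code:
        edges.setdefault(t[0], set()).add(t[1:])
        edges.setdefault(t[:2], set()).add(t[2])
    return edges


def is_acyclic(edges):
    """Depth-first search for a directed cycle."""
    state = {}

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "active":
            return False  # back edge: a cycle exists
        state[node] = "active"
        ok = all(visit(nxt) for nxt in edges.get(node, ()))
        state[node] = "done"
        return ok

    return all(visit(node) for node in list(edges))


def is_circular(code):
    """Circularity test under the assumed criterion: the associated graph is acyclic."""
    return is_acyclic(code_graph(code))


# Trinucleotide set recalled from the literature (assumption; not given in the abstract).
X_CODE = {
    "AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
    "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC",
}

if __name__ == "__main__":
    print(is_self_complementary(X_CODE), is_circular(X_CODE))
    print(is_circular({"ACG", "CGA"}))  # ACG and CGA are cyclic shifts of each other
```

The second call illustrates the negative case: a set containing two cyclic shifts of the same trinucleotide cannot retrieve a reading frame, and the graph contains the cycle A → CG → A.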
[Research advances of genomic GYP coding MNS blood group antigens].
Liu, Chang-Li; Zhao, Wei-Jun
2012-02-01
The MNS blood group system includes more than 40 antigens, and the M, N, S and s antigens are the most significant ones in the system. The antigenic determinants of the M and N antigens lie at the top of GPA on the surface of red blood cells, while the antigenic determinants of the S and s antigens lie at the top of GPB on the surface of red blood cells. The GYPA gene coding for GPA and the GYPB gene coding for GPB are located on the long arm of chromosome 4 and display 95% sequence homology; both genes lie close to the GYPE gene, which does not express a product. These three genes form the "GYPA-GYPB-GYPE" structure called the GYP genome. This review focuses on the molecular basis of genomic GYP and the variety of the GYP genome in the expression of diverse MNS blood group antigens. The molecular basis of Miltenberger hybrid glycophorin polymorphism is specifically expounded.
Raju, Hemalatha B.; Tsinoremas, Nicholas F.; Capobianco, Enrico
2016-01-01
Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. 
While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches. PMID:27803687
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smialowska, Agata, E-mail: smialowskaa@gmail.com; School of Life Sciences, Södertörn Högskola, Huddinge 141-89; Djupedal, Ingela
Highlights: • Protein coding genes accumulate anti-sense sRNAs in fission yeast S. pombe. • RNAi represses protein-coding genes in S. pombe. • RNAi-mediated gene repression is post-transcriptional. - Abstract: RNA interference (RNAi) is a gene silencing mechanism conserved from fungi to mammals. Small interfering RNAs are products and mediators of the RNAi pathway and act as specificity factors in recruiting effector complexes. The Schizosaccharomyces pombe genome encodes one of each of the core RNAi proteins, Dicer, Argonaute and RNA-dependent RNA polymerase (dcr1, ago1, rdp1). Even though the function of RNAi in heterochromatin assembly in S. pombe is established, its role in controlling gene expression is elusive. Here, we report the identification of small RNAs mapped anti-sense to protein coding genes in fission yeast. We demonstrate that these genes are up-regulated at the protein level in RNAi mutants, while their mRNA levels are not significantly changed. We show that the repression by RNAi is not a result of heterochromatin formation. Thus, we conclude that RNAi is involved in post-transcriptional gene silencing in S. pombe.
2012-01-01
Background Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. 
Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
Prié, Dominique; Huart, Virginie; Bakouh, Naziha; Planelles, Gabrielle; Dellis, Olivier; Gérard, Bénédicte; Hulin, Philippe; Benqué-Blanchet, François; Silve, Caroline; Grandchamp, Bernard; Friedlander, Gérard
2002-09-26
Epidemiologic studies suggest that genetic factors confer a predisposition to the formation of renal calcium stones or bone demineralization. Low serum phosphate concentrations due to a decrease in renal phosphate reabsorption have been reported in some patients with these conditions, suggesting that genetic factors leading to a decrease in renal phosphate reabsorption may contribute to them. We hypothesized that mutations in the gene coding for the main renal sodium-phosphate cotransporter (NPT2a) may be present in patients with these disorders. We studied 20 patients with urolithiasis or bone demineralization and persistent idiopathic hypophosphatemia associated with a decrease in maximal renal phosphate reabsorption. The coding region of the gene for NPT2a was sequenced in all patients. The functional consequences of the mutations identified were analyzed by expressing the mutated RNA in Xenopus laevis oocytes. Two patients, one with recurrent urolithiasis and one with bone demineralization, were heterozygous for two distinct mutations. One mutation resulted in the substitution of phenylalanine for alanine at position 48, and the other in a substitution of methionine for valine at position 147. Phosphate-induced current and sodium-dependent phosphate uptake were impaired in oocytes expressing the mutant NPT2a. Coinjection of oocytes with wild-type and mutant RNA indicated that the mutant protein had altered function. Heterozygous mutations in the NPT2a gene may be responsible for hypophosphatemia and urinary phosphate loss in persons with urolithiasis or bone demineralization. Copyright 2002 Massachusetts Medical Society
Nantón, Ana; Ruiz-Ruano, Francisco J.; Camacho, Juan Pedro M.; Méndez, Josefina
2017-01-01
Background Four species of the genus Donax (D. semistriatus, D. trunculus, D. variegatus and D. vittatus) are common on Iberian Peninsula coasts. Nevertheless, despite their economic importance and overexploitation, scarce genetic resources are available. In this work, we newly determined the complete mitochondrial genomes of these four representatives of the family Donacidae, with the aim of helping to unveil phylogenetic relationships within the Veneroida order, and of developing genetic markers useful in wedge clam identification and authentication, and in aquaculture stock management. Principal findings The complete female mitochondrial genomes of the four species vary in size from 17,044 to 17,365 bp, and encode 13 protein-coding genes (including the atp8 gene), 2 rRNAs and 22 tRNAs, all located on the same strand. A long non-coding region was identified in each of the four Donax species between cob and cox2 genes, presumably corresponding to the Control Region. The Bayesian and Maximum Likelihood phylogenetic analyses of the Veneroida order indicate that all four species of Donax form a single clade as a sister group of other bivalves within the Tellinoidea superfamily. However, although Tellinoidea is actually monophyletic, none of its families are monophyletic. Conclusions Sequencing of complete mitochondrial genomes provides highly valuable information to establish the phylogenetic relationships within the Veneroida order. Furthermore, we provide here significant genetic resources for further research and conservation of this commercially important fishing resource. PMID:28886105
Tzagoloff, A; Shtanko, A
1995-06-01
Three complementation groups of a pet mutant collection have been found to be composed of respiratory-deficient mutants with lesions in mitochondrial protein synthesis. Recombinant plasmids capable of restoring respiration were cloned by transformation of representatives of each complementation group with a yeast genomic library. The plasmids were used to characterize the complementing genes and to institute disruption of the chromosomal copies of each gene in respiratory-proficient yeast. The sequences of the cloned genes indicate that they code for isoleucyl-, arginyl- and glutamyl-tRNA synthetases. The properties of the mutants used to obtain the genes and of strains with the disrupted genes indicate that all three aminoacyl-tRNA synthetases function exclusively in mitochondrial protein synthesis. The ISM1 gene for mitochondrial isoleucyl-tRNA synthetase has been localized to chromosome XVI next to UME5. The MSR1 gene for the arginyl-tRNA synthetase was previously located on yeast chromosome VIII. The third gene MSE1 for the mitochondrial glutamyl-tRNA synthetase has not been localized. The identification of three new genes coding for mitochondrial-specific aminoacyl-tRNA synthetases indicates that in Saccharomyces cerevisiae at least 11 members of this protein family are encoded by genes distinct from those coding for the homologous cytoplasmic enzymes.
Recognition of Protein-coding Genes Based on Z-curve Algorithms
Guo, Feng-Biao; Lin, Yan; Chen, Ling-Ling
2014-01-01
Recognition of protein-coding genes, a classical bioinformatics issue, is an absolutely needed step for annotating newly sequenced genomes. The Z-curve algorithm, as one of the most effective methods on this issue, has been successfully applied in annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for re-annotation of bacterial and archaeal genomes. The above four tools can be used for genome annotation or re-annotation, either independently or combined with the other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation. PMID:24822027
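To make the Z-curve idea concrete, here is a minimal sketch of the basic cumulative Z-curve transform of a DNA sequence: x tracks purines versus pyrimidines, y tracks amino versus keto bases, and z tracks weak versus strong hydrogen-bonding bases. The function name is hypothetical, and the sketch shows only the coordinate transform; it is not the feature set or classifier used by ZCURVE, ZCURVE_V or the other tools named above.

```python
# Sketch only: the basic Z-curve transform, not the ZCURVE gene finder.
# Per-base increments for (x, y, z):
#   x = (A + G) - (C + T)   purine vs. pyrimidine
#   y = (A + C) - (G + T)   amino vs. keto
#   z = (A + T) - (G + C)   weak vs. strong hydrogen bonding
STEP = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}

def z_curve(seq):
    """Return the cumulative (x, y, z) coordinates after each base.

    Characters other than A, C, G, T leave the coordinates unchanged
    (an assumption made here for simplicity).
    """
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEP.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points
```

For a sequence containing each base equally often, such as `"ACGT"`, the curve returns to the origin at its final point.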
Ethanol production by recombinant hosts
Fowler, David E.; Horton, Philip G.; Ben-Bassat, Arie
1996-01-01
Novel plasmids comprising genes which code for the alcohol dehydrogenase and pyruvate decarboxylase are described. Also described are recombinant hosts which have been transformed with genes coding for alcohol dehydrogenase and pyruvate decarboxylase. By virtue of their transformation with these genes, the recombinant hosts are capable of producing significant amounts of ethanol as a fermentation product. Also disclosed are methods for increasing the growth of recombinant hosts and methods for reducing the accumulation of undesirable metabolic products in the growth medium of these hosts. Also disclosed are recombinant hosts capable of producing significant amounts of ethanol as a fermentation product of oligosaccharides and plasmids comprising genes encoding polysaccharases, in addition to the genes described above which code for the alcohol dehydrogenase and pyruvate decarboxylase. Further, methods are described for producing ethanol from oligomeric feedstock using the recombinant hosts described above. Also provided is a method for enhancing the production of functional proteins in a recombinant host comprising overexpressing an adhB gene in the host. Further provided are process designs for fermenting oligosaccharide-containing biomass to ethanol.
Ethanol production by recombinant hosts
Ingram, Lonnie O.; Beall, David S.; Burchhardt, Gerhard F. H.; Guimaraes, Walter V.; Ohta, Kazuyoshi; Wood, Brent E.; Shanmugam, Keelnatham T.
1995-01-01
Novel plasmids comprising genes which code for the alcohol dehydrogenase and pyruvate decarboxylase are described. Also described are recombinant hosts which have been transformed with genes coding for alcohol dehydrogenase and pyruvate decarboxylase. By virtue of their transformation with these genes, the recombinant hosts are capable of producing significant amounts of ethanol as a fermentation product. Also disclosed are methods for increasing the growth of recombinant hosts and methods for reducing the accumulation of undesirable metabolic products in the growth medium of these hosts. Also disclosed are recombinant hosts capable of producing significant amounts of ethanol as a fermentation product of oligosaccharides and plasmids comprising genes encoding polysaccharases, in addition to the genes described above which code for the alcohol dehydrogenase and pyruvate decarboxylase. Further, methods are described for producing ethanol from oligomeric feedstock using the recombinant hosts described above. Also provided is a method for enhancing the production of functional proteins in a recombinant host comprising overexpressing an adhB gene in the host. Further provided are process designs for fermenting oligosaccharide-containing biomass to ethanol.
A human haploid gene trap collection to study lncRNAs with unusual RNA biology.
Kornienko, Aleksandra E; Vlatkovic, Irena; Neesen, Jürgen; Barlow, Denise P; Pauler, Florian M
2016-01-01
Many thousand long non-coding (lnc) RNAs are mapped in the human genome. Time consuming studies using reverse genetic approaches by post-transcriptional knock-down or genetic modification of the locus demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear if this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is nuclear localized. This shows that LOC100288798 RNA biology differs markedly from typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data shows that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identifies the previously uncharacterized LOC100288798 as a potential gene regulator.
Kanda, Kojun; Pflug, James M; Sproul, John S; Dasenko, Mark A; Maddison, David R
2015-01-01
In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced.
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Dasenko, Mark A.
2015-01-01
In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced. PMID:26716693
The complete mitochondrial genome of Rapana venosa (Gastropoda, Muricidae).
Sun, Xiujun; Yang, Aiguo
2016-01-01
The complete mitochondrial (mt) genome of the veined rapa whelk, Rapana venosa, was determined using genome walking techniques in this study. The total length of the mt genome sequence of R. venosa was 15,271 bp, which is comparable to the reported Muricidae mitogenomes to date. It contained 13 protein-coding genes, 21 transfer RNA genes, and two ribosomal RNA genes. A bias towards a higher representation of nucleotides A and T (69%) was detected in the mt genome of R. venosa. A small number of non-coding nucleotides (302 bp) was detected, and the largest non-coding region was 74 bp in length.
Towards a complete map of the human long non-coding RNA transcriptome.
Uszczynska-Ratajczak, Barbara; Lagarde, Julien; Frankish, Adam; Guigó, Roderic; Johnson, Rory
2018-05-23
Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.
Evidence of translation efficiency adaptation of the coding regions of the bacteriophage lambda.
Goz, Eli; Mioduser, Oriah; Diament, Alon; Tuller, Tamir
2017-08-01
Deciphering the way gene expression regulatory aspects are encoded in viral genomes is a challenging mission with ramifications related to all biomedical disciplines. Here, we aimed to understand how evolution shapes the bacteriophage lambda genes by performing a high resolution analysis of ribosomal profiling data and gene expression related synonymous/silent information encoded in bacteriophage coding regions. We demonstrated evidence of selection for distinct compositions of synonymous codons in early and late viral genes related to the adaptation of translation efficiency to different bacteriophage developmental stages. Specifically, we showed that evolution of viral coding regions is driven, among others, by selection for codons with higher decoding rates; during the initial/progressive stages of infection the decoding rates in early/late genes were found to be superior to those in late/early genes, respectively. Moreover, we argued that selection for translation efficiency could be partially explained by adaptation to Escherichia coli tRNA pool and the fact that it can change during the bacteriophage life cycle. An analysis of additional aspects related to the expression of viral genes, such as mRNA folding and more complex/longer regulatory signals in the coding regions, is also reported. The reported conclusions are likely to be relevant also to additional viruses. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Glez-Peña, Daniel; Díaz, Fernando; Hernández, Jesús M; Corchado, Juan M; Fdez-Riverola, Florentino
2009-06-18
Bioinformatics and medical informatics are two research fields that serve the needs of different but related communities. Both domains share the common goal of providing new algorithms, methods and technological solutions to biomedical research, and contributing to the treatment and cure of diseases. Although different microarray techniques have been successfully used to investigate useful information for cancer diagnosis at the gene expression level, the true integration of existing methods into day-to-day clinical practice is still a long way off. Within this context, case-based reasoning emerges as a suitable paradigm specially intended for the development of biomedical informatics applications and decision support systems, given the support and collaboration involved in such a translational development. With the goals of removing barriers against multi-disciplinary collaboration and facilitating the dissemination and transfer of knowledge to real practice, case-based reasoning systems have the potential to be applied to translational research mainly because their computational reasoning paradigm is similar to the way clinicians gather, analyze and process information in their own practice of clinical medicine. In addressing the issue of bridging the existing gap between biomedical researchers and clinicians who work in the domain of cancer diagnosis, prognosis and treatment, we have developed and made accessible a common interactive framework. Our geneCBR system implements a freely available software tool that allows the use of combined techniques that can be applied to gene selection, clustering, knowledge extraction and prediction for aiding diagnosis in cancer research. For biomedical researchers, geneCBR expert mode offers a core workbench for designing and testing new techniques and experiments. 
For pathologists or oncologists, geneCBR diagnostic mode implements an effective and reliable system that can diagnose cancer subtypes based on the analysis of microarray data using a CBR architecture. For programmers, geneCBR programming mode includes an advanced edition module for run-time modification of previously coded techniques. geneCBR is a new translational tool that can effectively support the integrative work of programmers, biomedical researchers and clinicians working together in a common framework. The code is freely available under the GPL license and can be obtained at http://www.genecbr.org.
Mu-Like Prophage in Serogroup B Neisseria meningitidis Coding for Surface-Exposed Antigens
Masignani, Vega; Giuliani, Marzia Monica; Tettelin, Hervé; Comanducci, Maurizio; Rappuoli, Rino; Scarlato, Vincenzo
2001-01-01
Sequence analysis of the genome of Neisseria meningitidis serogroup B revealed the presence of an ∼35-kb region inserted within a putative gene coding for an ABC-type transporter. The region contains 46 open reading frames, 29 of which are colinear and homologous to the genes of Escherichia coli Mu phage. Two prophages with similar organizations were also found in serogroup A meningococcus, and one was found in Haemophilus influenzae. Early and late phage functions are well preserved in this family of Mu-like prophages. Several regions of atypical nucleotide content were identified. These likely represent genes acquired by horizontal transfer. Three of the acquired genes are shown to code for surface-associated antigens, and the encoded proteins are able to induce bactericidal antibodies. PMID:11254622
The complete mitochondrial genome of Chrysopa pallens (Insecta, Neuroptera, Chrysopidae).
He, Kun; Chen, Zhe; Yu, Dan-Na; Zhang, Jia-Yong
2012-10-01
The complete mitochondrial genome of Chrysopa pallens (Neuroptera, Chrysopidae) was sequenced. It consists of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA (rRNA) genes, and a control region (AT-rich region). The total length of the C. pallens mitogenome is 16,723 bp with 79.5% AT content, and the length of the control region is 1905 bp with 89.1% AT content. The non-coding regions of C. pallens include the control region between the 12S rRNA and trnI genes, and a 75-bp spacer region between the trnI and trnQ genes.
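The reported lengths and AT percentages allow a rough consistency check: the AT content of the mitogenome outside the control region can be derived from the totals. The sketch below is illustrative only. The function names are hypothetical, and the only inputs are the figures stated in the abstract above.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def remaining_at_fraction(total_len: int, total_at: float,
                          region_len: int, region_at: float) -> float:
    """AT fraction of a genome after excluding a region of known AT content."""
    at_bases_total = total_len * total_at      # estimated A+T bases overall
    at_bases_region = region_len * region_at   # estimated A+T bases in the region
    return (at_bases_total - at_bases_region) / (total_len - region_len)


# Values reported in the abstract for C. pallens (hypothetical variable name).
outside_control = remaining_at_fraction(16723, 0.795, 1905, 0.891)
print(f"AT content outside the control region: {outside_control:.1%}")
```

Under these figures the sequence outside the control region comes out at roughly 78% AT, slightly below the genome-wide value, as expected when the control region is the most AT-rich part.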
Structure and expression of canary myc family genes.
Collum, R G; Clayton, D F; Alt, F W
1991-01-01
We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons. PMID:1996121
Functional annotation of the vlinc class of non-coding RNAs using systems biology approach
Laurent, Georges St.; Vyatkin, Yuri; Antonets, Denis; Ri, Maxim; Qi, Yao; Saik, Olga; Shtokalo, Dmitry; de Hoon, Michiel J.L.; Kawaji, Hideya; Itoh, Masayoshi; Lassmann, Timo; Arner, Erik; Forrest, Alistair R.R.; Nicolas, Estelle; McCaffrey, Timothy A.; Carninci, Piero; Hayashizaki, Yoshihide; Wahlestedt, Claes; Kapranov, Philipp
2016-01-01
Determining the functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. While this work commonly relies on the classical methods of forward genetics, integrating different genomics datasets in a global Systems Biology fashion presents a more productive avenue toward this very complex aim. Here we report the application of a Systems Biology-based approach to dissect the functionality of a newly identified, vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlincRNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlincRNA–gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlincRNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions that also occur at early stages of development, such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs. PMID:27001520
Robertson, Helen E; Lapraz, François; Egger, Bernhard; Telford, Maximilian J; Schiffer, Philipp H
2017-05-12
Acoels are small, ubiquitous - but understudied - marine worms with a very simple body plan. Their internal phylogeny is still not fully resolved, and the position of their proposed phylum Xenacoelomorpha remains debated. Here we describe mitochondrial genome sequences from the acoels Paratomella rubra and Isodiametra pulchra, and the complete mitochondrial genome of the acoel Archaphanostoma ylvae. The P. rubra and A. ylvae sequences are typical for metazoans in size and gene content. The larger I. pulchra mitochondrial genome contains both ribosomal genes, 21 tRNAs, but only 11 protein-coding genes. We find evidence suggesting a duplicated sequence in the I. pulchra mitochondrial genome. The P. rubra, I. pulchra and A. ylvae mitochondria have a unique genome organisation in comparison to other metazoan mitochondrial genomes. We found a large degree of protein-coding gene and tRNA overlap with little non-coding sequence in the compact P. rubra genome. Conversely, the A. ylvae and I. pulchra genomes have many long non-coding sequences between genes, likely driving genome size expansion in the latter. Phylogenetic trees inferred from mitochondrial genes retrieve Xenacoelomorpha as an early branching taxon in the deuterostomes. Sequence divergence analysis between P. rubra sampled in England and Spain indicates cryptic diversity.
Origins of genes: "big bang" or continuous creation?
Keese, P K; Gibbs, A
1992-01-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes. PMID:1329098
Yuan, Daojun; Tang, Zhonghui; Wang, Maojun; Gao, Wenhui; Tu, Lili; Jin, Xin; Chen, Lingling; He, Yonghui; Zhang, Lin; Zhu, Longfu; Li, Yang; Liang, Qiqi; Lin, Zhongxu; Yang, Xiyan; Liu, Nian; Jin, Shuangxia; Lei, Yang; Ding, Yuanhao; Li, Guoliang; Ruan, Xiaoan; Ruan, Yijun; Zhang, Xianlong
2015-01-01
Gossypium hirsutum accounts for most cotton fibre production, but G. barbadense is valued for its better comprehensive resistance and superior fibre properties. However, the allotetraploid genome of G. barbadense has not been comprehensively analysed. Here we present a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The roughly two-fold larger size of the A (or At) subgenome (1.50 Gb) relative to the D (or Dt) subgenome (853 Mb) primarily resulted from the expansion of Gypsy elements, including Peabody and Retrosat2 subclades in the Del clade, and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed and many homoeologous gene pairs with biased expression patterns were identified, suggesting that abundant gene sub-functionalization occurred through allopolyploidization. More specifically, the CesA gene family has adopted differential temporal expression patterns, suggesting an integrated regulatory mechanism of CesA genes from the At and Dt subgenomes for the primary and secondary cellulose biosynthesis of cotton fibre in a “relay race”-like fashion. We anticipate that the G. barbadense genome sequence will advance our understanding of the mechanism of genome polyploidization and underpin genome-wide comparison research in this genus. PMID:26634818
Explaining the disease phenotype of intergenic SNP through predicted long range regulation.
Chen, Jingqi; Tian, Weidong
2016-10-14
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or near regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. Of all IGR daSNP-disease phenotype associations investigated, 36.8% were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, IGR daSNPs likely contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Lehmann, Jason S.; Corey, Victoria C.; Ricaldi, Jessica N.; Vinetz, Joseph M.; Winzeler, Elizabeth A.; Matthias, Michael A.
2016-01-01
Leptospirosis is the most common zoonotic disease worldwide with an estimated 500,000 severe cases reported annually, and case fatality rates of 12–25%, due primarily to acute kidney and lung injuries. Despite its prevalence, the molecular mechanisms underlying leptospirosis pathogenesis remain poorly understood. To identify virulence-related genes in Leptospira interrogans, we delineated cumulative genome changes that occurred during serial in vitro passage of a highly virulent strain of L. interrogans serovar Lai into a nearly avirulent isogenic derivative. Comparison of protein coding and computationally predicted noncoding RNA (ncRNA) genes between these two polyclonal strains identified 15 nonsynonymous single nucleotide variant (nsSNV) alleles that increased in frequency and 19 that decreased, whereas no changes in allelic frequency were observed among the ncRNA genes. Some of the nsSNV alleles were in six genes shown previously to be transcriptionally upregulated during exposure to in vivo-like conditions. Five of these nsSNVs were in evolutionarily conserved positions in genes related to signal transduction and metabolism. Frequency changes of minor nsSNV alleles identified in this study likely contributed to the loss of virulence during serial in vitro culture. The identification of new virulence-associated genes should spur additional experimental inquiry into their potential role in Leptospira pathogenesis. PMID:26711524
Park, Young-Jun; Nishikawa, Tomotaro; Minami, Mineo; Nemoto, Kazuhiro; Iwasaki, Tomohiro; Matsushima, Kenichi
2015-12-01
The purpose of this study was to identify the genetic mechanism underlying capsinoid biosynthesis in S3212, a low-pungency genotype of Capsicum frutescens. Screening of C. frutescens accessions for capsaicinoid and capsiate contents by high-performance liquid chromatography revealed that low-pungency S3212 contained high levels of capsiate but no capsaicin. Comparison of DNA coding sequences of pungent (T1 and Bird Eye) and low-pungency (S3212) genotypes uncovered a significant 12-bp deletion mutation in exon 7 of the p-AMT gene of S3212. In addition, p-AMT gene transcript levels in placental tissue were positively correlated with the degree of pungency. S3212, the low-pungency genotype, exhibited no significant p-AMT transcript levels, whereas T1, one of the pungent genotypes, displayed high transcript levels of this gene. We therefore conclude that the deletion mutation in the p-AMT gene is related to the loss of pungency in placental tissue and has given rise to the low-pungency S3212 C. frutescens genotype. C. frutescens S3212 represents a good natural source of capsinoids. Finally, our basic characterization of the uncovered p-AMT gene mutation should contribute to future studies of capsinoid biosynthesis in Capsicum.
Schneeberger, Stefan; Amberger, Albert; Mandl, Julia; Hautz, Theresa; Renz, Oliver; Obrist, Peter; Meusburger, Hugo; Brandacher, Gerald; Mark, Walter; Strobl, Daniela; Troppmair, Jakob; Pratschke, Johann; Margreiter, Raimund; Kuznetsov, Andrey V
2010-12-01
Chronic rejection (CR) remains an unsolved hurdle for long-term heart transplant survival. The effect of cold ischemia (CI) on progression of CR and the mechanisms resulting in functional deficit were investigated by studying gene expression, mitochondrial function, and enzymatic activity. Allogeneic (Lew→F344) and syngeneic (Lew→Lew) heart transplantations were performed with or without 10 h of CI. After evaluation of myocardial contraction, hearts were excised at 2, 10, 40, and 60 days for investigation of vasculopathy, gene expression, enzymatic activities, and mitochondrial respiration. Gene expression studies identified a gene cluster coding for subunits of the mitochondrial electron transport chain regulated in response to CI and CR. Myocardial performance, mitochondrial function, and mitochondrial marker enzyme activities declined in all allografts with time after transplantation. These declines were more rapid and severe in CI allografts (CR-CI) and correlated well with progression of vasculopathy and fibrosis. Mitochondria-related gene expression and mitochondrial function are substantially compromised with the progression of CR, and CI affects the progression, gene profile, and mitochondrial function of CR. Monitoring mitochondrial function and enzyme activity might allow for earlier detection of CR and cardiac allograft dysfunction. © 2010 The Authors. Journal compilation © 2010 European Society for Organ Transplantation.
Pakzad, Iraj; Zayyen Karin, Maasoume; Taherikalani, Morovat; Boustanshenas, Mina; Lari, Abdolaziz Rastegar
2013-01-01
Resistance to fluoroquinolones has recently increased among bacterial strains isolated from outpatients. Multidrug-resistant K. pneumoniae is one of the major organisms isolated from burn patients, and the AcrAB efflux pump is the principal pump contributing to the intrinsic resistance of K. pneumoniae to multiple antimicrobial agents, including ciprofloxacin and other fluoroquinolones. Fifty-two K. pneumoniae isolates were obtained from burn patients in Shahid Motahari hospital and confirmed by conventional biochemical tests. Antimicrobial susceptibility testing was done according to CLSI 2011 guidelines to determine the antimicrobial resistance pattern of the isolates. The acrA gene was detected among ciprofloxacin-resistant isolates by PCR assay. MICs of ciprofloxacin were measured with and without carbonyl cyanide 3-chlorophenylhydrazone (CCCP). Forty of the 52 isolates were resistant to ciprofloxacin according to the CLSI breakpoint. PCR assay for the acrA gene demonstrated that all ciprofloxacin-resistant isolates harbored acrA, which codes for the membrane fusion protein AcrA, a part of the AcrAB efflux system. Among these isolates, 19 strains (47.5%) showed a 2- to 32-fold reduction in MICs after using CCCP as an efflux pump inhibitor. The other 21 strains (52.5%) showed no difference in MICs before and after using CCCP. In conclusion, the AcrAB efflux system is one of the principal mechanisms contributing to ciprofloxacin resistance among K. pneumoniae isolates, but other mechanisms also contribute to ciprofloxacin resistance, such as mutations in the target proteins of the DNA gyrase or topoisomerase IV enzymes.
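The proportions quoted in this abstract follow directly from the isolate counts, and the fold reduction is simply the ratio of the MIC without the inhibitor to the MIC with it. The sketch below recomputes the percentages from the reported counts. The helper names are hypothetical, and the MIC pair in the last line is an invented illustration, not a value from the study.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


def fold_reduction(mic_without_cccp: float, mic_with_cccp: float) -> float:
    """Fold reduction in MIC when the efflux pump inhibitor is added."""
    return mic_without_cccp / mic_with_cccp


# Counts reported in the abstract.
total_isolates = 52
resistant = 40
mic_reduced = 19
mic_unchanged = resistant - mic_reduced  # the remaining resistant isolates

print(percent(resistant, total_isolates))   # share of isolates resistant
print(percent(mic_reduced, resistant))      # share with reduced MIC under CCCP
print(percent(mic_unchanged, resistant))    # share with unchanged MIC

# Illustrative MIC pair (made-up numbers, in the same units for both values).
print(fold_reduction(64, 8))
```

On these counts, about 77% of the 52 isolates were ciprofloxacin-resistant, and the 47.5% and 52.5% figures are fractions of the 40 resistant isolates rather than of all 52.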
Foox, Jonathan; Brugler, Mercer; Siddall, Mark Edward; Rodríguez, Estefanía
2016-07-01
Six complete and three partial actiniarian mitochondrial genomes were amplified in two semi-circles using long-range PCR and pyrosequenced in a single run on a 454 GS Junior, doubling the number of complete mitogenomes available within the order. Typical metazoan mtDNA features included circularity, 13 protein-coding genes, 2 ribosomal RNA genes, and length ranging from 17,498 to 19,727 bp. Several typical anthozoan mitochondrial genome features were also observed including the presence of only two transfer RNA genes, elevated A + T richness ranging from 54.9 to 62.4%, large intergenic regions, and group 1 introns interrupting NADH dehydrogenase subunit 5 and cytochrome c oxidase subunit I, the latter of which possesses a homing endonuclease gene. Within the sea anemone Alicia sansibarensis, we report the first mitochondrial gene order rearrangement within the Actiniaria, as well as putative novel non-canonical protein-coding genes. Phylogenetic analyses of all 13 protein-coding and 2 ribosomal genes largely corroborated current hypotheses of sea anemone interrelatedness, with a few lower-level differences.
Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki
2010-01-01
A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. The nearly complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057
Naville, M; Warren, I A; Haftek-Terreau, Z; Chalopin, D; Brunet, F; Levin, P; Galiana, D; Volff, J-N
2016-04-01
Viruses and transposable elements, once considered as purely junk and selfish sequences, have repeatedly been used as a source of novel protein-coding genes during the evolution of most eukaryotic lineages, a phenomenon called 'molecular domestication'. This is exemplified perfectly in mammals and other vertebrates, where many genes derived from long terminal repeat (LTR) retroelements (retroviruses and LTR retrotransposons) have been identified through comparative genomics and functional analyses. In particular, genes derived from gag structural protein and envelope (env) genes, as well as from the integrase-coding and protease-coding sequences, have been identified in humans and other vertebrates. Retroelement-derived genes are involved in many important biological processes including placenta formation, cognitive functions in the brain and immunity against retroelements, as well as in cell proliferation, apoptosis and cancer. These observations support an important role of retroelement-derived genes in the evolution and diversification of the vertebrate lineage. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Liu, Zhongliang; Hui, Yi; Shi, Lei; Chen, Zhenyu; Xu, Xiangjie; Chi, Liankai; Fan, Beibei; Fang, Yujiang; Liu, Yang; Ma, Lin; Wang, Yiran; Xiao, Lei; Zhang, Quanbin; Jin, Guohua; Liu, Ling; Zhang, Xiaoqing
2016-09-13
Loss-of-function studies in human pluripotent stem cells (hPSCs) require efficient methodologies for lesion of genes of interest. Here, we introduce a donor-free paired gRNA-guided CRISPR/Cas9 knockout strategy (paired-KO) for efficient and rapid gene ablation in hPSCs. Through paired-KO, we succeeded in targeting all genes of interest with high biallelic targeting efficiencies. More importantly, during paired-KO, the cleaved DNA was repaired mostly through direct end joining without insertions/deletions (precise ligation), and thus makes the lesion product predictable. The paired-KO remained highly efficient for one-step targeting of multiple genes and was also efficient for targeting of microRNA, while for long non-coding RNA over 8 kb, cleavage of a short fragment of the core promoter region was sufficient to eradicate downstream gene transcription. This work suggests that the paired-KO strategy is a simple and robust system for loss-of-function studies for both coding and non-coding genes in hPSCs. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
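The predictability claimed for precise ligation can be pictured with a toy model: if both cuts are blunt and the ends are joined without insertions or deletions, the product is the left flank followed directly by the right flank. The sketch below assumes exactly that. The function name, the toy sequence and the cut positions are hypothetical illustrations and do not come from the study.

```python
def precise_ligation_product(seq: str, cut_a: int, cut_b: int) -> str:
    """Return the sequence left after excising the fragment between two cuts.

    Cut positions are 0-based offsets between bases, so a cut at position k
    falls between seq[k-1] and seq[k]. The fragment seq[left:right] is removed
    and the two flanks are joined with no inserted or deleted bases.
    """
    left, right = sorted((cut_a, cut_b))
    if not 0 <= left < right <= len(seq):
        raise ValueError("cut positions must be distinct and within the sequence")
    return seq[:left] + seq[right:]


# Toy locus with two hypothetical cut sites flanking the C/G block.
locus = "AAAACCCCGGGGTTTT"
product = precise_ligation_product(locus, 4, 12)
print(product)                    # flanks joined directly
print(len(locus) - len(product))  # size of the excised fragment
```

Because the product depends only on the two cut positions, the expected deletion can be written down before the experiment, which is the sense in which the abstract calls the lesion product predictable.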
RPS8—a New Informative DNA Marker for Phylogeny of Babesia and Theileria Parasites in China
Tian, Zhan-Cheng; Liu, Guang-Yuan; Yin, Hong; Luo, Jian-Xun; Guan, Gui-Quan; Luo, Jin; Xie, Jun-Ren; Shen, Hui; Tian, Mei-Yuan; Zheng, Jin-feng; Yuan, Xiao-song; Wang, Fang-fang
2013-01-01
Piroplasmosis is a serious debilitating and sometimes fatal disease. Phylogenetic relationships within piroplasmida are complex and remain unclear. We compared the intron–exon structure and DNA sequences of the RPS8 gene from Babesia and Theileria spp. isolates in China. Similar to 18S rDNA, the 40S ribosomal protein S8 gene, RPS8, including both coding and non-coding regions is a useful and novel genetic marker for defining species boundaries and for inferring phylogenies because it tends to have little intra-specific variation but considerable inter-specific difference. However, more samples are needed to verify the usefulness of the RPS8 (coding and non-coding regions) gene as a marker for the phylogenetic position and detection of most Babesia and Theileria species, particularly for some closely related species. PMID:24244571
Galián, José A; Rosato, Marcela; Rosselló, Josep A
2014-03-01
Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclear 5S rDNA gene family rarely used contiguous 5S coding sequences due to the routine use of head-to-tail polymerase chain reaction primers that are anchored to the coding region. Moreover, the 5S coding sequences have been concatenated with independent, adjacent gene units in many studies, creating simulated chimeric genes as the raw data for evolutionary analysis. This practice is based on the tacitly assumed, but rarely tested, hypothesis that strict intra-locus concerted evolution processes are operating in 5S rDNA genes, without any empirical evidence as to whether it holds for the recovered data. The potential pitfalls of analysing the patterns of molecular evolution and reconstructing phylogenies based on these chimeric genes have not been assessed to date. Here, we compared the sequence integrity and phylogenetic behavior of entire versus concatenated 5S coding regions from a real data set obtained from closely related plant species (Medicago, Fabaceae). Our results suggest that within arrays sequence homogenization is partially operating in the 5S coding region, which is traditionally assumed to be highly conserved. Consequently, concatenating 5S genes increases haplotype diversity, generating novel chimeric genotypes that most likely do not exist within the genome. In addition, the patterns of gene evolution are distorted, leading to incorrect haplotype relationships in some evolutionary reconstructions.
Goremykin, Vadim V; Lockhart, Peter J; Viola, Roberto; Velasco, Riccardo
2012-08-01
Mitochondrial genomes of spermatophytes are the largest of all organellar genomes. Their large size has been attributed to various factors; however, the relative contribution of these factors to mitochondrial DNA (mtDNA) expansion remains undetermined. We estimated their relative contribution in Malus domestica (apple). The mitochondrial genome of apple has a size of 396 947 bp and a one to nine ratio of coding to non-coding DNA, close to the corresponding average values for angiosperms. We determined that 71.5% of the apple mtDNA sequence was highly similar to sequences of its nuclear DNA. Using nuclear gene exons, nuclear transposable elements and chloroplast DNA as markers of promiscuous DNA content in mtDNA, we estimated that approximately 20% of the apple mtDNA consisted of DNA sequences imported from other cell compartments, mostly from the nucleus. Similar marker-based estimates of promiscuous DNA content in the mitochondrial genomes of other species ranged between 21.2 and 25.3% of the total mtDNA length for grape, between 23.1 and 38.6% for rice, and between 47.1 and 78.4% for maize. All these estimates are conservative, because they underestimate the import of non-functional DNA. We propose that the import of promiscuous DNA is a core mechanism for mtDNA size expansion in seed plants. In apple, maize and grape this mechanism contributed far more to genome expansion than did homologous recombination. In rice the estimated contribution of both mechanisms was found to be similar. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Zhang, Haiyun; Sun, Dejun; Li, Defu; Zheng, Zeguang; Xu, Jingyi; Liang, Xue; Zhang, Chenting; Wang, Sheng; Wang, Jian; Lu, Wenju
2018-05-15
Long non-coding RNAs (lncRNAs) have critical regulatory roles in protein-coding gene expression. Aberrant expression profiles of lncRNAs have been observed in various human diseases. In this study, we investigated transcriptome profiles in lung tissues of a chronic cigarette smoke (CS)-induced COPD mouse model. We found that 109 lncRNAs and 260 mRNAs were significantly differentially expressed in lungs of the chronic CS-induced COPD mouse model compared with control animals. GO and KEGG analyses indicated that the protein-coding genes associated with differentially expressed lncRNAs were mainly involved in the protein processing in endoplasmic reticulum pathway and the taurine and hypotaurine metabolism pathway. The combination of high-throughput data analysis and the results of qRT-PCR validation in lungs of the chronic CS-induced COPD mouse model, 16HBE cells with CSE treatment and PBMC from patients with COPD revealed that NR_102714 and its associated protein-coding gene UCHL1 might be involved in the development of COPD in both mouse and human. In conclusion, our study demonstrated that aberrant expression profiles of lncRNAs and mRNAs existed in lungs of the chronic CS-induced COPD mouse model. From an animal model perspective, these results might provide further clues for investigating the biological functions of lncRNAs and their potential target protein-coding genes in the pathogenesis of COPD.
Bäumlein, H; Wobus, U; Pustell, J; Kafatos, F C
1986-01-01
The field bean, Vicia faba L. var. minor, possesses two sub-families of 11 S legumin genes named A and B. We isolated from a genomic library a B-type gene (LeB4) and determined its primary DNA sequence. Gene LeB4 codes for a 484 amino acid residue prepropolypeptide, encompassing a signal peptide of 22 amino acid residues, an acidic, very hydrophilic alpha-chain of 281 residues and a basic, somewhat hydrophobic beta-chain of 181 residues. The latter two coding regions are immediately contiguous, but each is interrupted by a short intron. Type A legumin genes from soybean and pea are known to have introns in the same two positions, in addition to an extra intron (within the alpha-coding sequence). Sequence comparisons of legumin genes from these three plants revealed a highly conserved sequence element of at least 28 bp, centered at approximately 100 bp upstream of each cap site. The element is absent from the equivalent position of all non-legumin and other plant and fungal genes examined. We tentatively name this element "legumin box" and suggest that it may have a function in the regulation of legumin gene expression. PMID:3960730
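The three segment lengths given for the LeB4 product should add up to the stated 484 residues, and residue coordinates for each segment follow if the segments are laid end to end. The sketch below checks this. It assumes the order signal peptide, alpha-chain, beta-chain with no residues between segments (the abstract states only that the alpha and beta coding regions are contiguous), and the function name is hypothetical.

```python
def segment_coordinates(segments):
    """Assign 1-based inclusive residue coordinates to consecutive segments."""
    coords = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1  # next segment begins right after this one
    return coords


# Segment lengths (amino acid residues) as reported in the abstract.
leb4_segments = [
    ("signal peptide", 22),
    ("alpha-chain", 281),
    ("beta-chain", 181),
]

total_length = sum(length for _, length in leb4_segments)
print(total_length)
for name, start, end in segment_coordinates(leb4_segments):
    print(f"{name}: residues {start}-{end}")
```

The lengths sum to 484, matching the reported size of the prepropolypeptide; the derived coordinates are only as reliable as the contiguity assumption stated above.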
Genomic Correlates of Relationship QTL Involved in Fore- versus Hind Limb Divergence in Mice
Pavlicev, Mihaela; Wagner, Günter P.; Noonan, James P.; Hallgrímsson, Benedikt; Cheverud, James M.
2013-01-01
Divergence of serially homologous elements of organisms is a common evolutionary pattern contributing to increased phenotypic complexity. Here, we study the genomic intervals affecting the variational independence of fore- and hind limb traits within an experimental mouse population. We use an advanced intercross of inbred mouse strains to map the loci associated with the degree of autonomy between fore- and hind limb long bone lengths (loci affecting the relationship between traits, relationship quantitative trait loci [rQTL]). These loci have been proposed to interact locally with the products of pleiotropic genes, thereby freeing the local trait from the variational constraint due to pleiotropic mutations. Using the known polymorphisms (single nucleotide polymorphisms [SNPs]) between the parental strains, we characterized and compared the genomic regions in which the rQTL, as well as their interaction partners (intQTL), reside. We find that these two classes of QTL intervals harbor different kinds of molecular variation. SNPs in rQTL intervals more frequently reside in limb-specific cis-regulatory regions than SNPs in intQTL intervals. The intQTL loci modified by the rQTL, in contrast, show the signature of protein-coding variation. This result is consistent with the widely accepted view that protein-coding mutations have broader pleiotropic effects than cis-regulatory polymorphisms. For both types of QTL intervals, the underlying candidate genes are enriched for genes involved in protein binding. This finding suggests that rQTL effects are caused by local interactions among the products of the causal genes harbored in rQTL and intQTL intervals. This is the first study to systematically document the population-level molecular variation underlying the evolution of character individuation. PMID:24065733
Cardoso, Alexander M.; Cavalcante, Janaína J. V.; Cantão, Maurício E.; Thompson, Claudia E.; Flatschart, Roberto B.; Glogauer, Arnaldo; Scapin, Sandra M. N.; Sade, Youssef B.; Beltrão, Paulo J. M. S. I.; Gerber, Alexandra L.; Martins, Orlando B.; Garcia, Eloi S.; de Souza, Wanderley; Vasconcelos, Ana Tereza R.
2012-01-01
The shortage of petroleum reserves and the increase in CO2 emissions have raised global concerns and highlighted the importance of adopting sustainable energy sources. Second-generation ethanol made from lignocellulosic materials is considered to be one of the most promising fuels for vehicles. The giant snail Achatina fulica is an agricultural pest whose biotechnological potential has been largely untested. Here, the composition of the microbial population within the crop of this invasive land snail, as well as key genes involved in various biochemical pathways, have been explored for the first time. In a high-throughput approach, 318 Mbp of 454-Titanium shotgun metagenomic sequencing data were obtained. The predominant bacterial phylum found was Proteobacteria, followed by Bacteroidetes and Firmicutes. Viruses, Fungi, and Archaea were present to lesser extents. The functional analysis reveals a variety of microbial genes that could assist the host in the degradation of recalcitrant lignocellulose, detoxification of xenobiotics, and synthesis of essential amino acids and vitamins, contributing to the adaptability and wide-ranging diet of this snail. More than 2,700 genes encoding glycoside hydrolase (GH) domains and carbohydrate-binding modules were detected. When we compared GH profiles, we found an abundance of sequences coding for oligosaccharide-degrading enzymes (36%), very similar to those from wallabies and giant pandas, as well as many novel cellulase and hemicellulase coding sequences, which points to this model as a remarkable potential source of enzymes for the biofuel industry. Furthermore, this work is a major step toward the understanding of the unique genetic profile of the land snail holobiont. PMID:23133637
Saha, Jayita; Giri, Kalyan
2017-04-20
Compelling evidence supports the involvement of exogenous and endogenous polyamines (PAs) in conferring salt tolerance in plants. Intracellular PA anabolism and catabolism are thought to maintain endogenous PA homeostasis and thereby induce stress signal networks. In this report, an evolutionary study was conducted to reveal the phylogenetic relationships of genes encoding enzymes of the anabolic and catabolic PA pathways among five plant lineages (green algae, moss, lycophyte, dicot and monocot), along with their respective exon-intron structural patterns. Our results indicated that natural selection pressure had considerable influence on the ancestral PA metabolic pathway genes of land plants. PA metabolic genes have undergone gradual evolution by duplication and diversification, leading to subsequent structural modification through exon-intron gain and loss events to acquire specific functions under environmental stress conditions. We examined the potential regulation of both pathways by real-time expression analysis, at the transcriptional level, of genes encoding PA metabolic pathway enzymes in root and shoot tissues of two indica rice varieties, IR 36 (salt-sensitive) and Nonabokra (salt-tolerant), in response to salinity in the presence or absence of exogenous spermidine (Spd) treatment. Additionally, we performed tissue-specific quantification of intracellular PAs and attempted to draw a probable connection between PA metabolic pathway activation and endogenous PA accumulation. Our results shed light on how exogenous Spd, in the presence or absence of salt stress, adjusts the intracellular PA pathways to equilibrate cellular PAs, which may contribute to plant salt tolerance. Copyright © 2017 Elsevier B.V. All rights reserved.
Borba, Ana Rita; Serra, Tânia S; Górska, Alicja; Gouveia, Paulo; Cordeiro, André M; Reyna-Llorens, Ivan; Knerová, Jana; Barros, Pedro M; Abreu, Isabel A; Oliveira, M Margarida; Hibberd, Julian M; Saibo, Nelson J M
2018-04-05
C4 photosynthesis has evolved repeatedly from the ancestral C3 state to generate a carbon concentrating mechanism that increases photosynthetic efficiency. This specialised form of photosynthesis is particularly common in the PACMAD clade of grasses, and is used by many of the world's most productive crops. The C4 cycle is accomplished through cell-type specific accumulation of enzymes but cis-elements and transcription factors controlling C4 photosynthesis remain largely unknown. Using the NADP-Malic Enzyme (NADP-ME) gene as a model we tested whether mechanisms impacting on transcription in C4 plants evolved from ancestral components found in C3 species. Two basic Helix-Loop-Helix (bHLH) transcription factors, ZmbHLH128 and ZmbHLH129, were shown to bind the C4NADP-ME promoter from maize. These proteins form heterodimers and ZmbHLH129 impairs trans-activation by ZmbHLH128. Electrophoretic mobility shift assays indicate that a pair of cis-elements separated by a seven base pair spacer synergistically bind either ZmbHLH128 or ZmbHLH129. This pair of cis-elements is found in both C3 and C4 Panicoid grass species of the PACMAD clade. Our analysis is consistent with this cis-element pair originating from a single motif present in the ancestral C3 state. We conclude that C4 photosynthesis has co-opted an ancient C3 regulatory code built on G-box recognition by bHLH to regulate the NADP-ME gene. More broadly, our findings also contribute to the understanding of gene regulatory networks controlling C4 photosynthesis.
Pathophysiological understanding of HFpEF: microRNAs as part of the puzzle.
Rech, Monika; Barandiarán Aizpurua, Arantxa; van Empel, Vanessa; van Bilsen, Marc; Schroen, Blanche
2018-05-01
Half of all heart failure patients have preserved ejection fraction (HFpEF). Comorbidities associated with and contributing to HFpEF include obesity, diabetes and hypertension. Still, the underlying pathophysiological mechanisms of HFpEF are unknown. A preliminary consensus proposes that the multi-morbidity triggers a state of systemic, chronic low-grade inflammation, and microvascular dysfunction, causing reduced nitric oxide bioavailability to adjacent cardiomyocytes. As a result, the cardiomyocyte remodels its contractile elements and fails to relax properly, causing diastolic dysfunction, and eventually HFpEF. HFpEF is a complex syndrome for which currently no efficient therapies exist. This is notably due to the current one-size-fits-all therapy approach that ignores individual patient differences. MicroRNAs have been studied in relation to pathophysiological mechanisms and comorbidities underlying and contributing to HFpEF. As regulators of gene expression, microRNAs may contribute to the pathophysiology of HFpEF. In addition, secreted circulating microRNAs are potential biomarkers and as such, they could help stratify the HFpEF population and open new ways for individualized therapies. In this review, we provide an overview of the ever-expanding world of non-coding RNAs and their contribution to the molecular mechanisms underlying HFpEF. We propose prospects for microRNAs in stratifying the HFpEF population. MicroRNAs add a new level of complexity to the regulatory network controlling cardiac function and hence the understanding of gene regulation becomes a fundamental piece in solving the HFpEF puzzle.
Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru
2007-01-01
The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
GeneMachine: gene prediction and sequence annotation.
Makalowska, I; Ryan, J F; Baxevanis, A D
2001-09-01
A number of free-standing programs have been developed in order to help researchers find potential coding regions and deduce gene structure for long stretches of what is essentially 'anonymous DNA'. As these programs apply inherently different criteria to the question of what is and is not a coding region, multiple algorithms should be used in the course of positional cloning and positional candidate projects to assure that all potential coding regions within a previously-identified critical region are identified. We have developed a gene identification tool called GeneMachine which allows users to query multiple exon and gene prediction programs in an automated fashion. BLAST searches are also performed in order to see whether a previously-characterized coding region corresponds to a region in the query sequence. A suite of Perl programs and modules are used to run MZEF, GENSCAN, GRAIL 2, FGENES, RepeatMasker, Sputnik, and BLAST. The results of these runs are then parsed and written into ASN.1 format. Output files can be opened using NCBI Sequin, in essence using Sequin as both a workbench and as a graphical viewer. The main feature of GeneMachine is that the process is fully automated; the user is only required to launch GeneMachine and then open the resulting file with Sequin. Annotations can then be made to these results prior to submission to GenBank, thereby increasing the intrinsic value of these data. GeneMachine is freely-available for download at http://genome.nhgri.nih.gov/genemachine. A public Web interface to the GeneMachine server for academic and not-for-profit users is available at http://genemachine.nhgri.nih.gov. The Web supplement to this paper may be found at http://genome.nhgri.nih.gov/genemachine/supplement/.
Evidence for an ergot alkaloid gene cluster in Claviceps purpurea.
Tudzynski, P; Hölter, K; Correia, T; Arntz, C; Grammel, N; Keller, U
1999-02-01
A gene (cpd1) coding for the dimethylallyltryptophan synthase (DMATS) that catalyzes the first specific step in the biosynthesis of ergot alkaloids, was cloned from a strain of Claviceps purpurea that produces alkaloids in axenic culture. The derived gene product (CPD1) shows only 70% similarity to the corresponding gene previously isolated from Claviceps strain ATCC 26245, which is likely to be an isolate of C. fusiformis. Therefore, the related cpd1 most probably represents the first C. purpurea gene coding for an enzymatic step of the alkaloid biosynthetic pathway to be cloned. Analysis of the 3'-flanking region of cpd1 revealed a second, closely linked ergot alkaloid biosynthetic gene named cpps1, which codes for a 356-kDa polypeptide showing significant similarity to fungal modular peptide synthetases. The protein contains three amino acid-activating modules, and in the second module a sequence is found which matches that of an internal peptide (17 amino acids in length) obtained from a tryptic digest of lysergyl peptide synthetase 1 (LPS1) of C. purpurea, thus confirming that cpps1 encodes LPS1. LPS1 activates the three amino acids of the peptide portion of ergot peptide alkaloids during D-lysergyl peptide assembly. Chromosome walking revealed the presence of additional genes upstream of cpd1 which are probably also involved in ergot alkaloid biosynthesis: cpox1 probably codes for an FAD-dependent oxidoreductase (which could represent the chanoclavine cyclase), and a second putative oxidoreductase gene, cpox2, is closely linked to it in inverse orientation. RT-PCR experiments confirm that all four genes are expressed under conditions of peptide alkaloid biosynthesis. These results strongly suggest that at least some genes of ergot alkaloid biosynthesis in C. purpurea are clustered, opening the way for a detailed molecular genetic analysis of the pathway.
A universal genomic coordinate translator for comparative genomics
2014-01-01
Background: Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, whose number increases as N² with the number of available genomes, N. Results: Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes.
Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across species. Conclusions: Kraken is a computational genome coordinate translator that facilitates cross-species comparisons, distinguishes orthologs from paralogs, and does not require costly all-to-all whole genome mappings. Kraken is freely available under LGPL from http://github.com/nedaz/kraken. PMID:24976580
A universal genomic coordinate translator for comparative genomics.
Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G
2014-06-30
Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, whose number increases as N² with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes.
Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across species. Kraken is a computational genome coordinate translator that facilitates cross-species comparisons, distinguishes orthologs from paralogs, and does not require costly all-to-all whole genome mappings. Kraken is freely available under LGPL from http://github.com/nedaz/kraken.
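The idea of translating coordinates indirectly through a graph of pairwise alignments, rather than aligning every genome to every other, can be illustrated with a minimal sketch. This is not Kraken's implementation: the toy `ALIGNMENTS` table, the ungapped same-strand block format, and the function names `find_path`, `map_position` and `translate` are all assumptions made for illustration, and the sketch ignores strand, gaps, and the orthology re-computation described in the abstract.

```python
from collections import deque

# Hypothetical toy data (not from the paper): each directed edge
# (source genome, target genome) holds ungapped, same-strand blocks
# given as (source_start, source_end, target_start), end-exclusive.
ALIGNMENTS = {
    ("human", "mouse"): [(100, 200, 1100)],
    ("mouse", "rat"): [(1000, 1300, 5000)],
}


def find_path(source, target):
    """Breadth-first search for a chain of pairwise alignments."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for (src, dst) in ALIGNMENTS:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None  # no chain of alignments connects the two genomes


def map_position(pos, src, dst):
    """Map one position across a single pairwise alignment."""
    for s_start, s_end, d_start in ALIGNMENTS[(src, dst)]:
        if s_start <= pos < s_end:
            return d_start + (pos - s_start)
    return None  # position falls outside every aligned block


def translate(pos, source, target):
    """Translate a position by walking the alignment graph edge by edge."""
    path = find_path(source, target)
    if path is None:
        return None
    for src, dst in zip(path, path[1:]):
        pos = map_position(pos, src, dst)
        if pos is None:
            return None
    return pos


print(translate(150, "human", "rat"))
```

In this toy setup, human position 150 maps to rat via mouse without any direct human-rat alignment. The scaling argument follows from simple counting: N genomes need N(N-1)/2 pairwise alignments for all-to-all comparison but only N-1 to form a connected chain, so 12 genomes would need 66 versus 11.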
Recombinant Rp1 genes confer necrotic or nonspecific resistance phenotypes.
Smith, Shavannor M; Steinau, Martin; Trick, Harold N; Hulbert, Scot H
2010-06-01
Genes at the Rp1 rust resistance locus of maize confer race-specific resistance to the common rust fungus Puccinia sorghi. Three variant genes with nonspecific effects (HRp1 -Kr1N, -D*21 and -MD*19) were found to be generated by intragenic crossing over within the LRR region. The LRR region of most NBS-LRR encoding genes is quite variable and codes for one of the regions in resistance gene proteins that controls specificity. Sequence comparisons demonstrated that the Rp1-Kr1N recombinant gene was identical to the N-terminus of the rp1-kp2 gene and C-terminus of another gene from its HRp1-K grandparent. The Rp1-D*21 recombinant gene consists of the N-terminus of the rp1-dp2 gene and C-terminus of the Rp1-D gene from the parental haplotype. Similarly, a recombinant gene from the Rp1-MD*19 haplotype has the N-terminus of an rp1 gene from the HRp1-M parent and C-terminus of the rp1-D19 gene from the HRp1-D parent. The recombinant Rp1 -Kr1N, -D*21 and -MD*19 genes activated defense responses in the absence of their AVR proteins triggering HR (hypersensitive response) in the absence of the pathogen. The results indicate that the frequent intragenic recombination events that occur in the Rp1 gene cluster not only recombine the genes into novel haplotypes, but also create genes with nonspecific effects. Some of these may contribute to nonspecific quantitative resistance but others have severe consequences for the fitness of the plant.
Nowacka-Woszuk, Joanna; Switonski, Marek
2009-01-01
The sex determination process is under the control of several genes of which two (SRY and SOX9), encoding transcription factors, play a crucial role. It is well-known that mutations at these genes may cause the development of an intersexual phenotype. The aim of this study was to conduct a comparative analysis of the coding sequence and 5'-flanking regions of both genes in four species of the family Canidae (the dog, red fox, arctic fox and Chinese raccoon dog). Similarity of the coding sequence of the SOX9 gene among the studied species was higher (99.7-99.9%) than in the case of the SRY gene (96.7-97.3%). Only single nucleotide changes were found in the compared coding sequences, whereas in the 5'-flanking region of both genes nucleotide substitutions, as well as insertions and deletions were observed. None of the changes detected in the 5'-flanking region occurred within the potential consensus sequences for transcription factors. No polymorphism was found for either of these genes in any of the analyzed species.
Bernick, David L.; Dennis, Patrick P.; Lui, Lauren M.; Lowe, Todd M.
2012-01-01
A great diversity of small, non-coding RNA (ncRNA) molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs (sRNA) in archaea is limited. We employed RNA-seq to identify novel sRNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense sRNAs encoded opposite to key regulatory (ferric uptake regulator), metabolic (triose-phosphate isomerase), and core transcriptional apparatus genes (transcription factor B). We also found a large increase in the number of conserved C/D box sRNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these sRNAs indicates they are relatively recent, stable adaptations. PMID:22783241
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luethi, E.; Jasmat, N.B.; Grayling, R.A.
1991-03-01
A λ recombinant phage expressing β-mannanase activity in Escherichia coli has been isolated from a genomic library of the extremely thermophilic anaerobe Caldocellum saccharolyticum. The gene was cloned into pBR322 on a 5-kb BamHI fragment, and its location was obtained by deletion analysis. The sequence of a 2.1-kb fragment containing the mannanase gene has been determined. One open reading frame was found which could code for a protein of M_r 38,904. The mannanase gene (manA) was overexpressed in E. coli by cloning the gene downstream from the lacZ promoter of pUC18. The enzyme was most active at pH 6 and 80 °C and degraded locust bean gum, guar gum, Pinus radiata glucomannan, and konjak glucomannan. The noncoding region downstream from the mannanase gene showed strong homology to celB, a gene coding for a cellulase from the same organism, suggesting that the manA gene might have been inserted into its present position on the C. saccharolyticum genome by homologous recombination.
Quach, Tommy; Brooks, Daniel M; Miranda, Hector C
2016-01-01
The complete mitochondrial genome of the Palawan peacock-pheasant Polyplectron napoleonis is 16,710 bp long and contains 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes and a control region. All protein-coding genes use the standard ATG start codon, except for cox1, which has a GTG start codon. Seven out of 13 PCGs have TAA stop codons, two have AGG (cox1 and nd6), and three PCGs (nd2, cox2 and nd4) have an incomplete stop codon consisting of a single T nucleotide (T--).