large structural gene: Topics by Science.gov

Sample records for large structural gene

Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering.

PubMed

Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan

2017-03-01

Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2016, The International Biometric Society.
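As a rough illustration of what testing the equality of two covariance matrices involves, the sketch below computes the largest entrywise difference between two sample covariance matrices and calibrates it by permutation. This is not the authors' procedure and not the HDtest API: the unstandardized statistic, the permutation calibration (which assumes the two groups are exchangeable under the null), and all function names are assumptions made for illustration only.

```python
import random


def sample_cov(rows):
    """Return the p x p sample covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            s = sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            cov[j][k] = cov[k][j] = s
    return cov


def max_abs_diff(x, y):
    """Largest absolute entrywise difference between the two sample covariances."""
    cx, cy = sample_cov(x), sample_cov(y)
    p = len(cx)
    return max(abs(cx[j][k] - cy[j][k]) for j in range(p) for k in range(j, p))


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value for the max-difference statistic (illustrative only)."""
    rng = random.Random(seed)
    observed = max_abs_diff(x, y)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to the two groups at random
        if max_abs_diff(pooled[:len(x)], pooled[len(x):]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Here each row is one sample (for example, one subject) and each column one gene. A small p-value would suggest that at least one variance or covariance differs between the groups.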
Macro optical projection tomography for large scale 3D imaging of plant structures and gene activity

PubMed Central

Lee, Karen J. I.; Calder, Grant M.; Hindle, Christopher R.; Newman, Jacob L.; Robinson, Simon N.; Avondo, Jerome J. H. Y.

2017-01-01

Optical projection tomography (OPT) is a well-established method for visualising gene activity in plants and animals. However, a limitation of conventional OPT is that the specimen upper size limit precludes its application to larger structures. To address this problem we constructed a macro version called Macro OPT (M-OPT). We apply M-OPT to 3D live imaging of gene activity in growing whole plants and to visualise structural morphology in large optically cleared plant and insect specimens up to 60 mm tall and 45 mm deep. We also show how M-OPT can be used to image gene expression domains in 3D within fixed tissue and to visualise gene activity in 3D in clones of growing young whole Arabidopsis plants. A further application of M-OPT is to visualise plant-insect interactions. Thus M-OPT provides an effective 3D imaging platform that allows the study of gene activity, internal plant structures and plant-insect interactions at a macroscopic scale. PMID:28025317
The structure and evolution of angiosperm nuclear genomes.

PubMed

Bennetzen, J L

1998-04-01

Despite several decades of investigation, the organization of angiosperm genomes remained largely unknown until very recently. Data describing the sequence composition of large segments of genomes, covering hundreds of kilobases of contiguous sequence, have only become available in the past two years. Recent results indicate commonalities in the characteristics of many plant genomes, including in the structure of chromosomal components like telomeres and centromeres, and in the order and content of genes. Major differences between angiosperms have been associated mainly with repetitive DNAs, both gene families and mobile elements. Intriguing new studies have begun to characterize the dynamic three-dimensional structures of chromosomes and chromatin, and the relationship between genome structure and co-ordinated gene function.
Structural and functional studies of a family of Dictyostelium discoideum developmentally regulated, prestalk genes coding for small proteins.

PubMed

Vicente, Juan J; Galardi-Castilla, María; Escalante, Ricardo; Sastre, Leandro

2008-01-03

The social amoeba Dictyostelium discoideum executes a multicellular development program upon starvation. This morphogenetic process requires the differential regulation of a large number of genes and is coordinated by extracellular signals. The MADS-box transcription factor SrfA is required for several stages of development, including slug migration and spore terminal differentiation. Subtractive hybridization allowed the isolation of a gene, sigN (SrfA-induced gene N), that was dependent on the transcription factor SrfA for expression at the slug stage of development. Homology searches detected the existence of a large family of sigN-related genes in the Dictyostelium discoideum genome. The 13 most similar genes are grouped in two regions of chromosome 2 and have been named Group1 and Group2 sigN genes. The putative encoded proteins are 87-89 amino acids long. All these genes have a similar structure, composed of a first exon containing a 13-nucleotide-long open reading frame and a second exon comprising the remainder of the putative coding region. The expression of these genes is induced at 10 hours of development. Analyses of their promoter regions indicate that these genes are expressed in the prestalk region of developing structures. The addition of antibodies raised against SigN Group 2 proteins induced disintegration of multi-cellular structures at the mound stage of development. A large family of genes coding for small proteins has been identified in D. discoideum. Two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development. Functional studies using antibodies raised against Group 2 SigN proteins indicate that these genes could play a role during multicellular development.
SUC1 gene of Saccharomyces: a structural gene for the large (glycoprotein) and small (carbohydrate-free) forms of invertase.

PubMed Central

Rodriguez, L; Lampen, J O; MacKay, V L

1981-01-01

Saccharomyces cerevisiae revertant strain D10-ER1 has been shown to contain thermosensitive forms of the large (glycoprotein) and small (carbohydrate-free) invertases and a very low level of the small enzyme, along with a wild-type level of the large form (T. Mizunaga et al., Mol. Cell. Biol. 1:460-468, 1981). These characteristics cosegregated in crosses of the revertant strain with wild-type sucrose-fermenting (SUC1) or nonfermenting (suc0) strains. In addition, there is tight linkage between sucrose and maltose fermentation in revertant D10-ER1 (characteristic of the SUC1 and MAL1 genes). From this we infer that a single reversion event is responsible for the several changes observed in D10-ER1, and that this mutation maps within or very close to the SUC1 gene present in the ancestor strain 4059-358D. The revertant SUC1 allele in D10-ER1 (termed SUC1-R1) was expressed independently of the wild-type SUC1 gene when both were present in diploid cells. Diploids carrying only the wild-type or the mutant genes synthesized invertases with the characteristics of the parental Suc+ haploids. The possibility that a modifier gene was responsible for the alterations in the invertases of revertant D10-ER1 was ruled out by appropriate crosses. We conclude that SUC1 is a structural gene that codes for both the large and the small forms of invertase and suggest that SUC2 through SUC5 are structural genes as well. PMID:6765604
Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats.

PubMed

Spielmann, A; Stutz, E

1983-10-25

The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.
Plastid and mitochondrial genomes of Coccophora langsdorfii (Fucales, Phaeophyceae) and the utility of molecular markers

PubMed Central

Graf, Louis; Kim, Yae Jin; Cho, Ga Youn; Miller, Kathy Ann

2017-01-01

Coccophora langsdorfii (Turner) Greville (Fucales) is an intertidal brown alga that is endemic to Northeast Asia and increasingly endangered by habitat loss and climate change. We sequenced the complete circular plastid and mitochondrial genomes of C. langsdorfii. The circular plastid genome is 124,450 bp and contains 139 protein-coding, 28 tRNA and 6 rRNA genes. The circular mitochondrial genome is 35,660 bp and contains 38 protein-coding, 25 tRNA and 3 rRNA genes. The structure and gene content of the C. langsdorfii plastid genome is similar to those of other species in the Fucales. The plastid genomes of brown algae in other orders share similar gene content but exhibit large structural recombination. The large in-frame insert in the cox2 gene in the mitochondrial genome of C. langsdorfii is typical of other brown algae. We explored the effect of this insertion on the structure and function of the cox2 protein. We estimated the usefulness of 135 plastid genes and 35 mitochondrial genes for developing molecular markers. This study shows that 29 organellar genes will prove efficient for resolving brown algal phylogeny. In addition, we propose a new molecular marker suitable for the study of intraspecific genetic diversity that should be tested in a large survey of populations of C. langsdorfii. PMID:29095864
Fragmentation of the large subunit ribosomal RNA gene in oyster mitochondrial genomes.

PubMed

Milbury, Coren A; Lee, Jung C; Cannone, Jamie J; Gaffney, Patrick M; Gutell, Robin R

2010-09-02

Discontinuous genes have been observed in bacteria, archaea, and eukaryotic nuclei, mitochondria and chloroplasts. Gene discontinuity occurs in multiple forms: the two most frequent forms result from introns that are spliced out of the RNA and the resulting exons are spliced together to form a single transcript, and fragmented gene transcripts that are not covalently attached post-transcriptionally. Within the past few years, fragmented ribosomal RNA (rRNA) genes have been discovered in bilateral metazoan mitochondria, all within a group of related oysters. In this study, we have characterized this fragmentation with comparative analysis and experimentation. We present secondary structures, modeled using comparative sequence analysis of the discontinuous mitochondrial large subunit rRNA genes of the cupped oysters C. virginica, C. gigas, and C. hongkongensis. Comparative structure models for the large subunit rRNA in each of the three oyster species are generally similar to those for other bilateral metazoans. We also used RT-PCR and analyzed ESTs to determine if the two fragmented LSU rRNAs are spliced together. The two segments are transcribed separately, and not spliced together although they still form functional rRNAs and ribosomes. Although many examples of discontinuous ribosomal genes have been documented in bacteria and archaea, as well as the nuclei, chloroplasts, and mitochondria of eukaryotes, oysters are some of the first characterized examples of fragmented bilateral animal mitochondrial rRNA genes. The secondary structures of the oyster LSU rRNA fragments have been predicted on the basis of previous comparative metazoan mitochondrial LSU rRNA structure models.
Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats.

PubMed Central

Spielmann, A; Stutz, E

1983-01-01

The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2. PMID:6314279
Search for 5'-leader regulatory RNA structures based on gene annotation aided by the RiboGap database.

PubMed

Naghdi, Mohammad Reza; Smail, Katia; Wang, Joy X; Wade, Fallou; Breaker, Ronald R; Perreault, Jonathan

2017-03-15

The discovery of noncoding RNAs (ncRNAs) and their importance for gene regulation led us to develop bioinformatics tools to pursue the discovery of novel ncRNAs. Finding ncRNAs de novo is challenging, first due to the difficulty of retrieving large numbers of sequences for given gene activities, and second due to exponential demands on calculation needed for comparative genomics on a large scale. Recently, several tools for the prediction of conserved RNA secondary structure were developed, but many of them are not designed to uncover new ncRNAs, or are too slow for conducting analyses on a large scale. Here we present various approaches using the database RiboGap as a primary tool for finding known ncRNAs and for uncovering simple sequence motifs with regulatory roles. This database also can be used to easily extract intergenic sequences of eubacteria and archaea to find conserved RNA structures upstream of given genes. We also show how to extend analysis further to choose the best candidate ncRNAs for experimental validation. Copyright © 2017 Elsevier Inc. All rights reserved.
Evidence for a large expansion and subfunctionalisation of globin genes in sea anemones.

PubMed

Smith, Hayden L; Pavasovic, Ana; Surm, Joachim M; Phillips, Matthew J; Prentis, Peter J

2018-06-27

The globin gene superfamily has been well-characterised in vertebrates; however, there has been limited research in early-diverging lineages, such as phylum Cnidaria. This study aimed to identify globin genes in multiple cnidarian lineages, and use bioinformatic approaches to characterise the evolution, structure and expression of these genes. Phylogenetic analyses and in silico protein predictions showed that all cnidarians have undergone an expansion of globin genes, which likely have a hexacoordinate protein structure. Our protein modelling has also revealed the possibility of a single pentacoordinate globin lineage in anthozoan species. Some cnidarian globin genes displayed tissue and development specific expression with very few orthologous genes similarly expressed across species. Our phylogenetic analyses also revealed that eumetazoan globin genes form a polyphyletic relationship with vertebrate globin genes. Overall, our analyses suggest that a Ngb-like and GbX-like gene were most likely present in the globin gene repertoire for the last common ancestor of eumetazoans. The identification of a large-scale expansion and subfunctionalisation of globin genes in actiniarians provides an excellent starting point to further our understanding of the evolution and function of the globin gene superfamily in early-diverging lineages.
Long-Range Chromosome Interactions Mediated by Cohesin Shape Circadian Gene Expression

PubMed Central

Xu, Yichi; Guo, Weimin; Li, Ping; Zhang, Yan; Zhao, Meng; Fan, Zenghua; Zhao, Zhihu; Yan, Jun

2016-01-01

Mammalian circadian rhythm is established by the negative feedback loops consisting of a set of clock genes, which lead to the circadian expression of thousands of downstream genes in vivo. As genome-wide transcription is organized under the high-order chromosome structure, it is largely uncharted how circadian gene expression is influenced by chromosome architecture. We focus on the function of chromatin structure proteins cohesin as well as CTCF (CCCTC-binding factor) in circadian rhythm. Using circular chromosome conformation capture sequencing, we systematically examined the interacting loci of a Bmal1-bound super-enhancer upstream of a clock gene Nr1d1 in mouse liver. These interactions are largely stable in the circadian cycle and cohesin binding sites are enriched in the interactome. Global analysis showed that cohesin-CTCF co-binding sites tend to insulate the phases of circadian oscillating genes while cohesin-non-CTCF sites are associated with high circadian rhythmicity of transcription. A model integrating the effects of cohesin and CTCF markedly improved the mechanistic understanding of circadian gene expression. Further experiments in cohesin knockout cells demonstrated that cohesin is required at least in part for driving the circadian gene expression by facilitating the enhancer-promoter looping. This study provided a novel insight into the relationship between circadian transcriptome and the high-order chromosome structure. PMID:27135601
Euglena Transcript Processing.

PubMed

McWatters, David C; Russell, Anthony G

2017-01-01

RNA transcript processing is an important stage in the gene expression pathway of all organisms and is subject to various mechanisms of control that influence the final levels of gene products. RNA processing involves events such as nuclease-mediated cleavage, removal of intervening sequences referred to as introns and modifications to RNA structure (nucleoside modification and editing). In Euglena, RNA transcript processing was initially examined in chloroplasts because of historical interest in the secondary endosymbiotic origin of this organelle in this organism. More recent efforts to examine mitochondrial genome structure and RNA maturation have been stimulated by the discovery of unusual processing pathways in other Euglenozoans such as kinetoplastids and diplonemids. Eukaryotes containing large genomes are now known to typically contain large collections of introns and regulatory RNAs involved in RNA processing events, and Euglena gracilis in particular has a relatively large genome for a protist. Studies examining the structure of nuclear genes and the mechanisms involved in nuclear RNA processing have revealed that indeed Euglena contains large numbers of introns in the limited set of genes so far examined and also possesses large numbers of specific classes of regulatory and processing RNAs, such as small nucleolar RNAs (snoRNAs). Most interestingly, these studies have also revealed that Euglena possesses novel processing pathways generating highly fragmented cytosolic ribosomal RNAs and subunits and non-conventional intron classes removed by unknown splicing mechanisms. This unexpected diversity in RNA processing pathways emphasizes the importance of identifying the components involved in these processing mechanisms and their evolutionary emergence in Euglena species.
PHYLOGEOGRAPHIC PATTERNS IN LARGE RIVER ECOSYSTEMS: GENETIC STRUCTURE OF SMALLMOUTH BUFFALO (ICTIOBUS BUBALUS) IN THE OHIO RIVER

EPA Science Inventory

Genetic studies on populations of large river fishes provide a potentially useful but underutilized research and assessment tool. Population genetic research on freshwater systems has provided meaningful insight into stock structure, hybridization issues, and gene flow/migration...
GeneBuilder: interactive in silico prediction of gene structure.

PubMed

Milanesi, L; D'Angelo, D; Rogozin, I B

1999-01-01

Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated due to the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) together with the statistical properties of coding sequences (coding potential), information about homologous proteins, ESTs and repeated elements. We have developed the GeneBuilder system which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in proteins and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and the performances are: 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. The GeneBuilder system is implemented as a part of the WebGene at the URL: http://www.itba.mi.cnr.it/webgene and TRADAT (TRAnscription Database and Analysis Tools) launcher URL: http://www.itba.mi.cnr.it/tradat.
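The nucleotide-level figures quoted above (sensitivity, specificity, correlation coefficient) follow the accuracy measures customarily used with the Burset and Guigo test set. A minimal sketch of how they are computed from per-nucleotide counts is given below. The function name and the example counts are hypothetical and do not come from the GeneBuilder evaluation. Note that in this gene-prediction convention "specificity" is the fraction of predicted coding nucleotides that are truly coding.

```python
import math


def nucleotide_accuracy(tp, tn, fp, fn):
    """Return (sensitivity, specificity, correlation coefficient).

    tp: coding nucleotides predicted as coding
    tn: non-coding nucleotides predicted as non-coding
    fp: non-coding nucleotides predicted as coding
    fn: coding nucleotides predicted as non-coding
    """
    sensitivity = tp / (tp + fn)   # share of true coding bases that were found
    specificity = tp / (tp + fp)   # share of predicted coding bases that are real
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    correlation = (tp * tn - fn * fp) / denom
    return sensitivity, specificity, correlation


# Hypothetical counts for a 1,000-nucleotide test sequence.
sn, sp, cc = nucleotide_accuracy(tp=90, tn=890, fp=10, fn=10)
```

The sketch does not guard against empty classes (for example, a sequence with no coding nucleotides), where the ratios are undefined.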
Structure and variation of the mitochondrial genome of fishes.

PubMed

Satoh, Takashi P; Miya, Masaki; Mabuchi, Kohji; Nishida, Mutsumi

2016-09-07

The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets. An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) illustrated that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, with larger numbers than proposed previously for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as two tRNA gene cluster regions (IQM and WANCY regions). Although many of such gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species. Through a large-scale comparative analysis of 250 fish species mt genomes, we elucidated various structural aspects of the fish mt genome and the encoded genes. The present results will be important for understanding functions of the mt genome and developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.
Statistical Analysis of Big Data on Pharmacogenomics

PubMed Central

Fan, Jianqing; Liu, Han

2013-01-01

This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
Functional understanding of the diverse exon-intron structures of human GPCR genes.

PubMed

Hammond, Dorothy A; Olman, Victor; Xu, Ying

2014-02-01

The GPCR genes have a variety of exon-intron structures even though their proteins are all structurally homologous. We have examined all human GPCR genes with at least two functional protein isoforms, totaling 199, aiming to gain an understanding of what may have contributed to the large diversity of the exon-intron structures of the GPCR genes. The 199 genes have a total of 808 known protein splicing isoforms with experimentally verified functions. Our analysis reveals that 1,301 (80.6%) adjacent exon-exon pairs out of the total of 1,613 in the 199 genes have either exactly one exon skipped or the intron in-between retained in at least one of the 808 protein splicing isoforms. This observation has a statistical significance p-value of 2.051762e-09, assuming that the observed splicing isoforms are independent of the exon-intron structures. Our interpretation of this observation is that the exon boundaries of the GPCR genes are not randomly determined; instead they may be selected to facilitate specific alternative splicing for functional purposes.
Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

Cancer.gov

Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.
Cooperativeness of the higher chromatin structure of the beta-globin locus revealed by the deletion mutations of DNase I hypersensitive site 3 of the LCR.

PubMed

Fang, Xiangdong; Xiang, Ping; Yin, Wenxuan; Stamatoyannopoulos, George; Li, Qiliang

2007-01-05

High-level transcription of the globin genes requires the enhancement by a distant element, the locus control region (LCR). Such long-range regulation in vivo involves spatial interaction between transcriptional elements, with intervening chromatin looping out. It has been proposed that the clustering of the HS sites of the LCR, the active globin genes, as well as the remote 5' hypersensitive sites (HSs) (HS-60/-62 in mouse, HS-110 in human) and 3'HS1 forms a specific spatial chromatin structure, termed active chromatin hub (ACH). Here we report the effects of the HS3 deletions of the LCR on the spatial chromatin structure of the beta-globin locus as revealed by the chromatin conformation capture (3C) technology. The small HS3 core deletion (0.23 kb), but not the large HS3 deletion (2.3 kb), disrupted the spatial interactions among all the HS sites of the LCR, the beta-globin gene and 3'HS1. We have previously demonstrated that the large HS3 deletion barely impairs the structure of the LCR holocomplex, while the structure is significantly disrupted by the HS3 core deletion. Taken together, these results suggest that the formation of the ACH is dependent on a largely intact LCR structure. We propose that the ACH indeed is an extension of the LCR holocomplex.

Simultaneous Drug Targeting of the Promoter MYC G-Quadruplex and BCL2 i-Motif in Diffuse Large B-Cell Lymphoma Delays Tumor Growth.

PubMed

Kendrick, Samantha; Muranyi, Andrea; Gokhale, Vijay; Hurley, Laurence H; Rimsza, Lisa M

2017-08-10

Secondary DNA structures are uniquely poised as therapeutic targets due to their molecular switch function in turning gene expression on or off and scaffold-like properties for protein and small molecule interaction. Strategies to alter gene transcription through these structures thus far involve targeting single DNA conformations. Here we investigate the feasibility of simultaneously targeting different secondary DNA structures to modulate two key oncogenes, cellular-myelocytomatosis (MYC) and B-cell lymphoma gene-2 (BCL2), in diffuse large B-cell lymphoma (DLBCL). Cotreatment with previously identified ellipticine and pregnanol derivatives that recognize the MYC G-quadruplex and BCL2 i-motif promoter DNA structures lowered mRNA levels and subsequently enhanced sensitivity to a standard chemotherapy drug, cyclophosphamide, in DLBCL cell lines. In vivo repression of MYC and BCL2 in combination with cyclophosphamide also significantly slowed tumor growth in DLBCL xenograft mice. Our findings demonstrate concurrent targeting of different DNA secondary structures offers an effective, precise, medicine-based approach to directly impede transcription and overcome aberrant pathways in aggressive malignancies.
A gene expression resource generated by genome-wide lacZ profiling in the mouse

PubMed Central

Tuck, Elizabeth; Estabel, Jeanne; Oellrich, Anika; Maguire, Anna Karin; Adissu, Hibret A.; Souter, Luke; Siragher, Emma; Lillistone, Charlotte; Green, Angela L.; Wardle-Jones, Hannah; Carragher, Damian M.; Karp, Natasha A.; Smedley, Damian; Adams, Niels C.; Bussell, James N.; Adams, David J.; Ramírez-Solis, Ramiro; Steel, Karen P.; Galli, Antonella; White, Jacqueline K.

2015-01-01

Knowledge of the expression profile of a gene is a critical piece of information required to build an understanding of the normal and essential functions of that gene and any role it may play in the development or progression of disease. High-throughput, large-scale efforts are on-going internationally to characterise reporter-tagged knockout mouse lines. As part of that effort, we report an open access adult mouse expression resource, in which the expression profile of 424 genes has been assessed in up to 47 different organs, tissues and sub-structures using a lacZ reporter gene. Many specific and informative expression patterns were noted. Expression was most commonly observed in the testis and brain and was most restricted in white adipose tissue and mammary gland. Over half of the assessed genes presented with an absent or localised expression pattern (categorised as 0-10 positive structures). A link between complexity of expression profile and viability of homozygous null animals was observed; inactivation of genes expressed in ≥21 structures was more likely to result in reduced viability by postnatal day 14 compared with more restricted expression profiles. For validation purposes, this mouse expression resource was compared with Bgee, a federated composite of RNA-based expression data sets. Strong agreement was observed, indicating a high degree of specificity in our data. Furthermore, there were 1207 observations of expression of a particular gene in an anatomical structure where Bgee had no data, indicating a large amount of novelty in our data set. Examples of expression data corroborating and extending genotype-phenotype associations and supporting disease gene candidacy are presented to demonstrate the potential of this powerful resource. PMID:26398943
Interfacing cellular networks of S. cerevisiae and E. coli: Connecting dynamic and genetic information

PubMed Central

2013-01-01

Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484
Gene-for-gene disease resistance: bridging insect pest and pathogen defense.

PubMed

Kaloshian, Isgouhi

2004-12-01

Active plant defense, also known as gene-for-gene resistance, is triggered when a plant resistance (R) gene recognizes the intrusion of a specific insect pest or pathogen. Activation of plant defense includes an array of physiological and transcriptional reprogramming. During the past decade, a large number of plant R genes that confer resistance to a diverse group of pathogens have been cloned from a number of plant species. Based on predicted protein structures, these genes are classified into a small number of groups, indicating that structurally related R genes recognize phylogenetically distinct pathogens. An extreme example is the tomato Mi-1 gene, which confers resistance to potato aphid (Macrosiphum euphorbiae), whitefly (Bemisia tabaci), and root-knot nematodes (Meloidogyne spp.). While Mi-1 remains the only cloned insect R gene, there is evidence that a gene-for-gene type of plant defense against piercing-sucking insects exists in a number of plant species.
Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

PubMed

Liu, Ruifeng; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

2017-09-25

The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very large number of models. To address this issue, we examined the performance of a variable nearest neighbor (v-NN) method that uses information on near neighbors conforming to the principle that similar structures have similar activities. Using a data set of gene expression signatures of 13,150 compounds derived from cell-based measurements in the NIH Library of Integrated Network-based Cellular Signatures program, we were able to make predictions for 62% of the compounds in a 10-fold cross validation test, with a correlation coefficient of 0.61 between the predicted and experimentally derived signatures, a reproducibility rivaling that of high-throughput gene expression measurements. To evaluate the utility of the predicted gene expression signatures, we compared the predicted and experimentally derived signatures in their ability to identify drugs known to cause specific liver, kidney, and heart injuries. Overall, the predicted and experimentally derived signatures had similar receiver operating characteristics, whose areas under the curve ranged from 0.71 to 0.77 and 0.70 to 0.73, respectively, across the three organ injury models. However, detailed analyses of enrichment curves indicate that signatures predicted from multiple near neighbors outperformed those derived from experiments, suggesting that averaging information from near neighbors may help improve the signal from gene expression measurements. Our results demonstrate that the v-NN method can serve as a practical approach for modeling large-scale, genomewide, chemical-induced, gene expression changes.
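The nearest-neighbor idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not state the distance metric, the weighting function, or the thresholds, so everything specific here is an assumption made for illustration: fingerprints are modeled as sets of bit indices, distance is Tanimoto distance, weights follow a Gaussian-like kernel, and `d0` and `h` are hypothetical parameters. The sketch also shows why such a method may return predictions for only a fraction of compounds: a query with no neighbor inside the distance threshold gets no prediction.

```python
import math


def tanimoto_distance(a, b):
    """Tanimoto (Jaccard) distance between two fingerprints given as sets of bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def vnn_predict(query, training, d0=0.6, h=0.3):
    """Distance-weighted mean signature over training compounds within d0 of the query.

    training: list of (fingerprint_set, signature_list) pairs.
    d0 and h are hypothetical values, not taken from the source.
    Returns None when no training compound qualifies as a near neighbor.
    """
    weighted_sum = None
    total_weight = 0.0
    for fingerprint, signature in training:
        d = tanimoto_distance(query, fingerprint)
        if d > d0:
            continue  # too dissimilar to count as a near neighbor
        w = math.exp(-((d / h) ** 2))  # closer neighbors get larger weights
        if weighted_sum is None:
            weighted_sum = [0.0] * len(signature)
        for i, value in enumerate(signature):
            weighted_sum[i] += w * value
        total_weight += w
    if weighted_sum is None:
        return None
    return [s / total_weight for s in weighted_sum]


# Toy data: three "compounds" with two-gene expression signatures.
training = [
    ({1, 2, 3}, [1.0, 0.0]),
    ({1, 2, 4}, [0.0, 1.0]),
    ({7, 8, 9}, [5.0, 5.0]),
]
prediction = vnn_predict({1, 2, 3}, training)
no_prediction = vnn_predict({20, 21}, training)
```

In this toy run the identical neighbor dominates the weighted average, the dissimilar third compound is ignored, and the second query yields no prediction because nothing lies within the threshold.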
First Mitochondrial Genome from Nemouridae (Plecoptera) Reveals Novel Features of the Elongated Control Region and Phylogenetic Implications

PubMed Central

Chen, Zhi-Teng; Du, Yu-Zhou

2017-01-01

The complete mitochondrial genome (mitogenome) of Nemoura nankinensis (Plecoptera: Nemouridae) was sequenced as the first reported mitogenome from the family Nemouridae. The N. nankinensis mitogenome was the longest (16,602 bp) among reported plecopteran mitogenomes, and it contains 37 genes including 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes. Most PCGs used standard ATN as start codons, and TAN as termination codons. All tRNA genes of N. nankinensis could fold into the cloverleaf secondary structures except for trnSer (AGN), whose dihydrouridine (DHU) arm was reduced to a small loop. There was also a large non-coding region (control region, CR) in the N. nankinensis mitogenome. The 1751 bp CR was the longest and had the highest A+T content (81.8%) among stoneflies. A large tandem repeat region, five potential stem-loop (SL) structures, four tRNA-like structures and four conserved sequence blocks (CSBs) were detected in the elongated CR. The presence of these tRNA-like structures in the CR has never been reported in other plecopteran mitogenomes. These novel features of the elongated CR in N. nankinensis may have functions associated with the process of replication and transcription. Finally, phylogenetic reconstruction suggested that Nemouridae was the sister-group of Capniidae. PMID:28475163
First Mitochondrial Genome from Nemouridae (Plecoptera) Reveals Novel Features of the Elongated Control Region and Phylogenetic Implications.

PubMed

Chen, Zhi-Teng; Du, Yu-Zhou

2017-05-05

The complete mitochondrial genome (mitogenome) of Nemoura nankinensis (Plecoptera: Nemouridae) was sequenced as the first reported mitogenome from the family Nemouridae. The N. nankinensis mitogenome was the longest (16,602 bp) among reported plecopteran mitogenomes, and it contains 37 genes including 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes. Most PCGs used standard ATN as start codons, and TAN as termination codons. All tRNA genes of N. nankinensis could fold into the cloverleaf secondary structures except for trnSer (AGN), whose dihydrouridine (DHU) arm was reduced to a small loop. There was also a large non-coding region (control region, CR) in the N. nankinensis mitogenome. The 1751 bp CR was the longest and had the highest A+T content (81.8%) among stoneflies. A large tandem repeat region, five potential stem-loop (SL) structures, four tRNA-like structures and four conserved sequence blocks (CSBs) were detected in the elongated CR. The presence of these tRNA-like structures in the CR has never been reported in other plecopteran mitogenomes. These novel features of the elongated CR in N. nankinensis may have functions associated with the process of replication and transcription. Finally, phylogenetic reconstruction suggested that Nemouridae was the sister-group of Capniidae.
Microgravity

NASA Image and Video Library

1997-01-12

This is a large 2 mm crystal of histone octamer, grown on STS-81. The histone octamer is a very dynamic structure that functions in many aspects of gene regulation, from control of gene activity to the more subtle mechanisms of genetic imprinting. The Principal Investigator is Dan Carter of New Century Pharmaceuticals.
Mobile genes in the human microbiome are structured from global to individual scales

PubMed Central

Brito, IL; Jupiter, SD; Jenkins, AP; Naisilisili, W; Tamminen, M; Smillie, CS; Wortman, JR; Birren, BW; Xavier, RJ; Blainey, PC; Singh, AK; Gevers, D; Alm, EJ

2016-01-01

Recent work has underscored the importance of the microbiome in human health, largely attributing differences in phenotype to differences in the species present across individuals [1-5]. But mobile genes can confer profoundly different phenotypes on different strains of the same species. Little is known about the function and distribution of mobile genes in the human microbiome, and in particular whether the gene pool is globally homogenous or constrained by human population structure. Here, we investigate this question by comparing the mobile genes found in the microbiomes of 81 metropolitan North Americans with that of 172 agrarian Fiji islanders using a combination of single-cell genomics and metagenomics. We find large differences in mobile gene content between the Fijian and North American microbiomes, with functional variation that mirrors known dietary differences such as the excess of plant-based starch degradation genes. Remarkably, differences are also observed between the mobile gene pools of proximal Fijian villages, even though microbiome composition across villages is similar. Finally, we observe high rates of recombination leading to individual-specific mobile elements, suggesting that the abundance of some genes may reflect environmental selection rather than dispersal limitation. Together, these data support the hypothesis that human activities and behaviors provide selective pressures that shape mobile gene pools, and that acquisition of mobile genes is important to colonizing specific human populations. PMID:27409808
Structural Basis of Cooperative Ligand Binding by the Glycine Riboswitch

DOE Office of Scientific and Technical Information (OSTI.GOV)

E Butler; J Wang; Y Xiong

2011-12-31

The glycine riboswitch regulates gene expression through the cooperative recognition of its amino acid ligand by a tandem pair of aptamers. A 3.6 Å crystal structure of the tandem riboswitch from the glycine permease operon of Fusobacterium nucleatum reveals the glycine binding sites and an extensive network of interactions, largely mediated by asymmetric A-minor contacts, that serve to communicate ligand binding status between the aptamers. These interactions provide a structural basis for how the glycine riboswitch cooperatively regulates gene expression.
De Novo Protein Structure Prediction

NASA Astrophysics Data System (ADS)

Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram

An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
Comparative analysis of syntenic genes in grass genomes reveals accelerated rates of gene structure and coding sequence evolution in polyploid wheat

USDA-ARS's Scientific Manuscript database

Cycles of whole genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied...
Crystal Structures of SlyA Protein, a Master Virulence Regulator of Salmonella, in Free and DNA-bound States

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dolan, Kyle T.; Duguid, Erica M.; He, Chuan

2011-11-17

SlyA is a master virulence regulator that controls the transcription of numerous genes in Salmonella enterica. We present here crystal structures of SlyA by itself and bound to a high-affinity DNA operator sequence in the slyA gene. SlyA interacts with DNA through direct recognition of a guanine base by Arg-65, as well as interactions between conserved Arg-86 and the minor groove and a large network of non-base-specific contacts with the sugar phosphate backbone. Our structures, together with an unpublished structure of SlyA bound to the small molecule effector salicylate (Protein Data Bank code 3DEU), reveal that, unlike many other MarR family proteins, SlyA dissociates from DNA without large conformational changes when bound to this effector. We propose that SlyA and other MarR global regulators rely more on indirect readout of DNA sequence to exert control over many genes, in contrast to proteins (such as OhrR) that recognize a single operator.
Rice Ribosomal Protein Large Subunit Genes and Their Spatio-temporal and Stress Regulation

PubMed Central

Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Dutta, Mouboni; Madhav, Sheshu M.; Kirti, P. B.

2016-01-01

Ribosomal proteins (RPs) are well-known for their role in mediating protein synthesis and maintaining the stability of the ribosomal complex, which includes small and large subunits. In the present investigation, in a genome-wide survey, we predicted that the large subunit of rice ribosomes is encoded by at least 123 genes including individual gene copies, distributed throughout the 12 chromosomes. We selected 34 candidate genes, each having 2–3 identical copies, for a detailed characterization of their gene structures, protein properties, cis-regulatory elements and comprehensive expression analysis. RPL proteins appear to be involved in interactions with other RP and non-RP proteins and their encoded RNAs have a higher content of alpha-helices in their predicted secondary structures. The majority of RPs have binding sites for metal and non-metal ligands. Native expression profiling of 34 ribosomal protein large (RPL) subunit genes in tissues covering the major stages of rice growth shows that they are predominantly expressed in vegetative tissues and seedlings followed by meiotically active tissues like flowers. The putative promoter regions of these genes also carry cis-elements that respond specifically to stress and signaling molecules. All the 34 genes responded differentially to the abiotic stress treatments. Phytohormone and cold treatments induced significant up-regulation of several RPL genes, while heat and H2O2 treatments down-regulated a majority of them. Furthermore, infection with a bacterial pathogen, Xanthomonas oryzae, which causes leaf blight also induced the expression of 80% of the RPL genes in leaves. Although the expression of RPL genes was detected in all the tissues studied, they are highly responsive to stress and signaling molecules indicating that their encoded proteins appear to have roles in stress amelioration besides house-keeping. This shows that the RPL gene family is a valuable resource for manipulation of stress tolerance in rice and other crops, which may be achieved by overexpressing and raising independent transgenic plants carrying the genes that became up-regulated significantly and instantaneously. PMID:27605933
Mitochondrial genome of the African lion Panthera leo leo.

PubMed

Ma, Yue-ping; Wang, Shuo

2015-01-01

In this study, the complete mitochondrial genome sequence of the African lion P. leo leo was reported. The total length of the mitogenome was 17,054 bp. It contained the typical mitochondrial structure, including 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 1 control region; 21 of the tRNA genes folded into typical cloverleaf secondary structure except for tRNASe. The overall composition of the mitogenome was A (32.0%), G (14.5%), C (26.5%) and T (27.0%). The new sequence will provide molecular genetic information for conservation genetics study of this important large carnivore.
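The base composition figures reported above follow from a simple per-base count. The Python sketch below is illustrative only: the function name and the toy sequence are assumptions, and the actual lion mitogenome sequence is not reproduced here. It also checks that the four reported percentages sum to 100% and derives the A+T content they imply.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a DNA sequence.

    Ambiguous symbols such as N are ignored in both counts and total.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Toy sequence for illustration; not taken from the mitogenome.
toy = base_composition("AACGTTNN")

# Percentages as reported in the abstract.
reported = {"A": 32.0, "G": 14.5, "C": 26.5, "T": 27.0}
reported_total = sum(reported.values())      # consistency check on the reported values
at_content = reported["A"] + reported["T"]   # A+T content implied by the reported values
```

With the reported values, the total comes to 100.0% and the implied A+T content is 59.0%.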
Reverse engineering highlights potential principles of large gene regulatory network design and learning.

PubMed

Carré, Clément; Mas, André; Krouk, Gabriel

2017-01-01

Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factor-to-target relationships, defining important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale free properties (built ex nihilo). In combination with supervised (accepting prior knowledge) support vector machine algorithm we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
Structure-Function Analysis of a Broad Specificity Populus trichocarpa Endo-β-glucanase Reveals an Evolutionary Link between Bacterial Licheninases and Plant XTH Gene Products

PubMed Central

Eklöf, Jens M.; Shojania, Shaheen; Okon, Mark; McIntosh, Lawrence P.; Brumer, Harry

2013-01-01

The large xyloglucan endotransglycosylase/hydrolase (XTH) gene family continues to be the focus of much attention in studies of plant cell wall morphogenesis due to the unique catalytic functions of the enzymes it encodes. The XTH gene products compose a subfamily of glycoside hydrolase family 16 (GH16), which also comprises a broad range of microbial endoglucanases and endogalactanases, as well as yeast cell wall chitin/β-glucan transglycosylases. Previous whole-family phylogenetic analyses have suggested that the closest relatives to the XTH gene products are the bacterial licheninases (EC 3.2.1.73), which specifically hydrolyze linear mixed linkage β(1→3)/β(1→4)-glucans. In addition to their specificity for the highly branched xyloglucan polysaccharide, XTH gene products are distinguished from the licheninases and other GH16 enzyme subfamilies by significant active site loop alterations and a large C-terminal extension. Given these differences, the molecular evolution of the XTH gene products in GH16 has remained enigmatic. Here, we present the biochemical and structural analysis of a unique, mixed function endoglucanase from black cottonwood (Populus trichocarpa), which reveals a small, newly recognized subfamily of GH16 members intermediate between the bacterial licheninases and plant XTH gene products. We postulate that this clade comprises an important link in the evolution of the large plant XTH gene families from a putative microbial ancestor. As such, this analysis provides new insights into the diversification of GH16 and further unites the apparently disparate members of this important family of proteins. PMID:23572521
Ol-Prx 3, a member of an additional class of homeobox genes, is unimodally expressed in several domains of the developing and adult central nervous system of the medaka (Oryzias latipes)

PubMed Central

Joly, Jean-Stephane; Bourrat, Franck; Nguyen, Van; Chourrout, Daniel

1997-01-01

Large-scale genetic screens for mutations affecting early neurogenesis of vertebrates have recently been performed with an aquarium fish, the zebrafish. Later stages of neural morphogenesis have attracted less attention in small fish species, partly because of the lack of molecular markers of developing structures that may facilitate the detection of discrete structural alterations. In this context, we report the characterization of Ol-Prx 3 (Oryzias latipes-Prx 3). This gene was isolated in the course of a large-scale screen for brain cDNAs containing a highly conserved DNA binding region, the homeobox helix-three. Sequence analysis revealed that this gene belongs to another class of homeobox genes, together with a previously isolated mouse ortholog, called OG-12 [Rovescalli, A. C., Asoh, S. & Nirenberg, M. (1996) Proc. Natl. Acad. Sci. USA 93, 10691–10696] and with the human SHOX gene [Rao, E., Weiss, B., Fukami, M., Rump, A., Niesler, B., et al. (1997) Nat. Genet. 16, 54–62], thought to be involved in the short-stature phenotype of Turner syndrome patients. These three genes exhibit a moderate level of identity in the homeobox with the other genes of the paired-related (PRX) gene family. Ol-Prx 3, as well as the PRX genes, are expressed in various cartilaginous structures of head and limbs. These genes might thus be involved in common regulatory pathways during the morphogenesis of these structures. Moreover, this paper reports a complex and monophasic pattern of Ol-Prx 3 expression in the central nervous system, which differs markedly from the patterns reported for the PRX genes, Prx 3 excluded: this gene begins to be expressed in a variety of central nervous system territories at late neurula stage. Strikingly, it remains turned on in some of the derivatives of each territory during the entire life of the fish. We hope this work will thus help identify common features for the PRX 3 family of homeobox genes. PMID:9371787
First Report of a Deletion Encompassing an Entire Exon in the Homogentisate 1,2-Dioxygenase Gene Causing Alkaptonuria

PubMed Central

Habbal, Mohammad Zouheir; Bou-Assi, Tarek; Zhu, Jun; Owen, Renius; Chehab, Farid F.

2014-01-01

Alkaptonuria is often diagnosed clinically with episodes of dark urine, biochemically by the accumulation of peripheral homogentisic acid and molecularly by the presence of mutations in the homogentisate 1,2-dioxygenase gene (HGD). Alkaptonuria is invariably associated with HGD mutations, which consist of single nucleotide variants and small insertions/deletions. Surprisingly, the presence of deletions beyond a few nucleotides among over 150 reported deleterious mutations has not been described, raising the suspicion that this gene might be protected against the detrimental mechanisms of gene rearrangements. The search for an HGD mutation in a proband with AKU revealed, by SNP array, five large regions of homozygosity (5–16 Mb), one of which includes the HGD gene. A homozygous 649 bp deletion that encompasses the 72 nucleotides of exon 2 and surrounding DNA sequences in the flanking introns of the HGD gene was identified in this proband. The nature of this deletion suggests that this in-frame deletion could generate a protein without exon 2. Thus, we modeled the tertiary structure of the mutant protein to determine the effect of the exon 2 deletion. While the two β-pleated sheets encoded by exon 2 were missing in the mutant structure, other β-pleated sheets were largely unaffected by the deletion. However, nine novel α-helical coils replaced the eight coils present in the native HGD crystal structure. Thus, this deletion results in a deleterious enzyme, which is consistent with the proband’s phenotype. Screening for mutations in the HGD gene, particularly in the Middle East, ought to include this exon 2 deletion in order to determine its frequency and uncover its origin. PMID:25233259
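The "in-frame" reasoning in the abstract above comes down to simple arithmetic: what matters for the reading frame is the number of coding nucleotides lost (the 72 nucleotides of exon 2), not the 649 bp genomic length, most of which is intronic. The Python sketch below works through that arithmetic. It assumes, for illustration, that the transcript is spliced directly from exon 1 to exon 3; the function names are hypothetical.

```python
def is_in_frame(coding_nt_removed):
    """A loss of coding sequence preserves the reading frame if it is a multiple of 3."""
    return coding_nt_removed % 3 == 0


def codons_removed(coding_nt_removed):
    """Number of whole codons lost; only defined for in-frame losses."""
    if not is_in_frame(coding_nt_removed):
        raise ValueError("loss is not a multiple of 3; downstream frame would shift")
    return coding_nt_removed // 3


EXON2_NT = 72              # exon 2 length reported in the abstract
GENOMIC_DELETION_BP = 649  # total deleted genomic span, including intronic sequence

exon2_in_frame = is_in_frame(EXON2_NT)                    # True: 72 = 3 * 24
exon2_codons = codons_removed(EXON2_NT)                   # 24 codons, i.e. 24 amino acids
genomic_length_mod_3 = GENOMIC_DELETION_BP % 3            # 1, but irrelevant to the frame
```

Under the stated assumption, the mutant protein would lack 24 amino acids while keeping the downstream reading frame intact. The genomic deletion length is not itself a multiple of 3, which is why the exon length is the relevant quantity.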

Dizeez: An Online Game for Human Gene-Disease Annotation

PubMed Central

Loguercio, Salvatore; Good, Benjamin M.; Su, Andrew I.

2013-01-01

Structured gene annotations are a foundation upon which many bioinformatics and statistical analyses are built. However, the structured annotations available in public databases are a sparse representation of biological knowledge as a whole. The rate of biomedical data generation is such that centralized biocuration efforts struggle to keep up. New models for gene annotation need to be explored that expand the pace at which we are able to structure biomedical knowledge. Recently, online games have emerged as an effective way to recruit, engage and organize large numbers of volunteers to help address difficult biological challenges. For example, games have been successfully developed for protein folding (Foldit), multiple sequence alignment (Phylo) and RNA structure design (EteRNA). Here we present Dizeez, a simple online game built with the purpose of structuring knowledge of gene-disease associations. Preliminary results from game play online and at scientific conferences suggest that Dizeez is producing valid gene-disease annotations not yet present in any public database. These early results provide a basic proof of principle that online games can be successfully applied to the challenge of gene annotation. Dizeez is available at http://genegames.org. PMID:23951102
The History of Bordetella pertussis Genome Evolution Includes Structural Rearrangement

PubMed Central

Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Bowden, Katherine E.; Burroughs, Mark; Cassiday, Pamela K.; Davis, Jamie K.; Johnson, Taccara; Juieng, Phalasy; Knipe, Kristen; Mathis, Marsenia H.; Pruitt, Andrea M.; Rowe, Lori; Sheth, Mili; Tondella, M. Lucia; Williams, Margaret M.

2017-01-01

ABSTRACT Despite high pertussis vaccine coverage, reported cases of whooping cough (pertussis) have increased over the last decade in the United States and other developed countries. Although Bordetella pertussis is well known for its limited gene sequence variation, recent advances in long-read sequencing technology have begun to reveal genomic structural heterogeneity among otherwise indistinguishable isolates, even within geographically or temporally defined epidemics. We have compared rearrangements among complete genome assemblies from 257 B. pertussis isolates to examine the potential evolution of the chromosomal structure in a pathogen with minimal gene nucleotide sequence diversity. Discrete changes in gene order were identified that differentiated genomes from vaccine reference strains and clinical isolates of various genotypes, frequently along phylogenetic boundaries defined by single nucleotide polymorphisms. The observed rearrangements were primarily large inversions centered on the replication origin or terminus and flanked by IS481, a mobile genetic element with >240 copies per genome and previously suspected to mediate rearrangements and deletions by homologous recombination. These data illustrate that structural genome evolution in B. pertussis is not limited to reduction but also includes rearrangement. Therefore, although genomes of clinical isolates are structurally diverse, specific changes in gene order are conserved, perhaps due to positive selection, providing novel information for investigating disease resurgence and molecular epidemiology. IMPORTANCE Whooping cough, primarily caused by Bordetella pertussis, has resurged in the United States even though the coverage with pertussis-containing vaccines remains high. The rise in reported cases has included increased disease rates among all vaccinated age groups, provoking questions about the pathogen's evolution. The chromosome of B. pertussis includes a large number of repetitive mobile genetic elements that obstruct genome analysis. However, these mobile elements facilitate large rearrangements that alter the order and orientation of essential protein-encoding genes, which otherwise exhibit little nucleotide sequence diversity. By comparing the complete genome assemblies from 257 isolates, we show that specific rearrangements have been conserved throughout recent evolutionary history, perhaps by eliciting changes in gene expression, which may also provide useful information for molecular epidemiology. PMID:28167525
MacroBac: New Technologies for Robust and Efficient Large-Scale Production of Recombinant Multiprotein Complexes.

PubMed

Gradia, Scott D; Ishida, Justin P; Tsai, Miaw-Sheue; Jeans, Chris; Tainer, John A; Fuss, Jill O

2017-01-01

Recombinant expression of large, multiprotein complexes is essential and often rate limiting for determining structural, biophysical, and biochemical properties of DNA repair, replication, transcription, and other key cellular processes. Baculovirus-infected insect cell expression systems are especially well suited for producing large, human proteins recombinantly, and multigene baculovirus systems have facilitated studies of multiprotein complexes. In this chapter, we describe a multigene baculovirus system called MacroBac that uses a Biobricks-type assembly method based on restriction and ligation (Series 11) or ligation-independent cloning (Series 438). MacroBac cloning and assembly is efficient and equally well suited for either single subcloning reactions or high-throughput cloning using 96-well plates and liquid handling robotics. MacroBac vectors are polypromoter with each gene flanked by a strong polyhedrin promoter and an SV40 poly(A) termination signal that minimize gene order expression level effects seen in many polycistronic assemblies. Large assemblies are robustly achievable, and we have successfully assembled as many as 10 genes into a single MacroBac vector. Importantly, we have observed significant increases in expression levels and quality of large, multiprotein complexes using a single, multigene, polypromoter virus rather than coinfection with multiple, single-gene viruses. Given the importance of characterizing functional complexes, we believe that MacroBac provides a critical enabling technology that may change the way that structural, biophysical, and biochemical research is done. © 2017 Elsevier Inc. All rights reserved.
Coenzyme Recognition and Gene Regulation by a Flavin Mononucleotide Riboswitch

DOE Office of Scientific and Technical Information (OSTI.GOV)

Serganov, A.; Huang, L; Patel, D

2009-01-01

The biosynthesis of several protein cofactors is subject to feedback regulation by riboswitches. Flavin mononucleotide (FMN)-specific riboswitches, also known as RFN elements, direct expression of bacterial genes involved in the biosynthesis and transport of riboflavin (vitamin B2) and related compounds. Here we present the crystal structures of the Fusobacterium nucleatum riboswitch bound to FMN, riboflavin and the antibiotic roseoflavin. The FMN riboswitch structure, centred on an FMN-bound six-stem junction, does not fold by collinear stacking of adjacent helices, typical for folding of large RNAs. Rather, it adopts a butterfly-like scaffold, stapled together by opposingly directed but nearly identically folded peripheral domains. FMN is positioned asymmetrically within the junctional site and is specifically bound to RNA through interactions with the isoalloxazine ring chromophore and direct and Mg²⁺-mediated contacts with the phosphate moiety. Our structural data, complemented by binding and footprinting experiments, imply a largely pre-folded tertiary RNA architecture and FMN recognition mediated by conformational transitions within the junctional binding pocket. The inherent plasticity of the FMN-binding pocket and the availability of large openings make the riboswitch an attractive target for structure-based design of FMN-like antimicrobial compounds. Our studies also explain the effects of spontaneous and antibiotic-induced deregulatory mutations and provide molecular insights into FMN-based control of gene expression in normal and riboflavin-overproducing bacterial strains.
PHENOstruct: Prediction of human phenotype ontology terms using heterogeneous data sources.

PubMed

Kahanda, Indika; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa

2015-01-01

The human phenotype ontology (HPO) was recently developed as a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. At present, only a small fraction of human protein-coding genes have HPO annotations. However, researchers believe that a large portion of currently unannotated genes are related to disease phenotypes. Therefore, it is important to predict gene-HPO term associations using accurate computational methods. In this work we demonstrate the performance advantage of the structured SVM approach, which was previously shown to be highly effective for Gene Ontology term prediction, over several baseline methods. Furthermore, we highlight a collection of informative data sources suitable for the problem of predicting gene-HPO associations, including large-scale literature mining data.
Origin and Loss of Nested LRRTM/α-Catenin Genes during Vertebrate Evolution

PubMed Central

Uvarov, Pavel; Kajander, Tommi; Airaksinen, Matti S.

2014-01-01

Leucine-rich repeat transmembrane neuronal proteins (LRRTMs) form in mammals a family of four postsynaptic adhesion proteins, which have been shown to bind neurexins and heparan sulphate proteoglycan (HSPG) glypican on the presynaptic side. Mutations in the genes encoding LRRTMs and neurexins are implicated in human cognitive disorders such as schizophrenia and autism. Our analysis shows that in most jawed vertebrates, lrrtm1, lrrtm2, and lrrtm3 genes are nested on opposite strands of a large conserved intron of α-catenin genes ctnna2, ctnna1, and ctnna3, respectively. No lrrtm genes could be found in tunicates or lancelets, while two lrrtm genes are found in the lamprey genome, one of which is adjacent to a single ctnna homolog. Based on the similar highly positive net charge of lamprey LRRTMs and the HSPG-binding LRRTM3 and LRRTM4 proteins, we speculate that the ancestral LRRTM might have bound HSPG before acquiring neurexins as binding partners. Our model suggests that an lrrtm gene translocated into the large ctnna intron in early vertebrates, and that subsequent duplications resulted in three lrrtm/ctnna gene pairs present in most jawed vertebrates. However, we detected three prominent exceptions: (1) the lrrtm3/ctnna3 gene structure is absent in the ray-finned fish genomes, (2) the genomes of clawed frogs contain ctnna1 but lack the corresponding nested (lrrtm2) gene, and (3) contain the lrrtm3 gene in the syntenic position but lack the corresponding host (ctnna3) gene. We identified several other protein-coding nested gene structures of which either the host or the nested gene has presumably been lost in the frog or chicken lineages. Interestingly, the majority of these nested genes comprise LRR domains. PMID:24587117
Camelid Ig V genes reveal significant human homology not seen in therapeutic target genes, providing for a powerful therapeutic antibody platform

PubMed Central

Klarenbeek, Alex; Mazouari, Khalil El; Desmyter, Aline; Blanchetot, Christophe; Hultberg, Anna; de Jonge, Natalie; Roovers, Rob C; Cambillau, Christian; Spinelli, Sylvia; Del-Favero, Jurgen; Verrips, Theo; de Haard, Hans J; Achour, Ikbel

2015-01-01

Camelid immunoglobulin variable (IGV) regions were found homologous to their human counterparts; however, the germline V repertoires of camelid heavy and light chains are still incomplete and their therapeutic potential is only beginning to be appreciated. We therefore leveraged the publicly available HTG and WGS databases of Lama pacos and Camelus ferus to retrieve the germline repertoire of V genes using human IGV genes as reference. In addition, we amplified IGKV and IGLV genes to uncover the V germline repertoire of Lama glama and sequenced BAC clones covering part of the Lama pacos IGK and IGL loci. Our in silico analysis showed that camelid counterparts of all human IGKV and IGLV families and most IGHV families could be identified, based on canonical structure and sequence homology. Interestingly, this sequence homology seemed largely restricted to the Ig V genes and was far less apparent in other genes: 6 therapeutically relevant target genes differed significantly from their human orthologs. This contributed to efficient immunization of llamas with the human proteins CD70, MET, interleukin (IL)-1β and IL-6, resulting in large panels of functional antibodies. The in silico predicted human-homologous canonical folds of camelid-derived antibodies were confirmed by X-ray crystallography solving the structure of 2 selected camelid anti-CD70 and anti-MET antibodies. These antibodies showed identical fold combinations as found in the corresponding human germline V families, yielding binding site structures closely similar to those occurring in human antibodies. In conclusion, our results indicate that active immunization of camelids can be a powerful therapeutic antibody platform. PMID:26018625
Next generation haplotyping to decipher nuclear genomic interspecific admixture in Citrus species: analysis of chromosome 2.

PubMed

Curk, Franck; Ancillo, Gema; Garcia-Lor, Andres; Luro, François; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Navarro, Luis; Ollitrault, Patrick

2014-12-29

The most economically important Citrus species originated by natural interspecific hybridization between four ancestral taxa (Citrus reticulata, Citrus maxima, Citrus medica, and Citrus micrantha) and from limited subsequent interspecific recombination as a result of apomixis and vegetative propagation. Such reticulate evolution coupled with vegetative propagation results in mosaic genomes with large chromosome fragments from the basic taxa in frequent interspecific heterozygosity. Modern breeding of these species is hampered by their complex heterozygous genomic structures that determine species phenotype and are broken by sexual hybridisation. Nevertheless, a large amount of diversity is present in the citrus gene pool, and breeding to allow inclusion of desirable traits is of paramount importance. However, the efficient mobilization of citrus biodiversity in innovative breeding schemes requires previous understanding of Citrus origins and genomic structures. Haplotyping of multiple gene fragments along the whole genome is a powerful approach to reveal the admixture genomic structure of current species and to resolve the evolutionary history of the gene pools. In this study, the efficiency of parallel sequencing with 454 methodology to decipher the hybrid structure of modern citrus species was assessed by analysis of 16 gene fragments on chromosome 2. 454 amplicon libraries were established using the Fluidigm array system for 48 genotypes and 16 gene fragments from chromosome 2. Haplotypes were established from the reads of each accession and phylogenetic analyses were performed using the haplotypic data for each gene fragment. The length of 454 reads and the level of differentiation between the ancestral taxa of modern citrus allowed efficient haplotype phylogenetic assignations for 12 of the 16 gene fragments. The analysis of the mixed genomic structure of modern species and cultivars (i) revealed C. maxima introgressions in modern mandarins, (ii) was consistent with previous hypotheses regarding the origin of secondary species, and (iii) provided a new picture of the evolution of chromosome 2. 454 sequencing was an efficient strategy to establish haplotypes with significant phylogenetic assignations in Citrus, providing a new picture of the mixed structure on chromosome 2 in 48 citrus genotypes.
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

PubMed Central

2009-01-01

Background Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. 
We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

PubMed

Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

2009-08-06

Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 
The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.
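The fold-coverage figures quoted in this record (15.6- and 16.0-fold) can be related to expected sequence completeness with a back-of-the-envelope calculation. The Python sketch below is not the authors' in silico simulation; it assumes an idealized Lander-Waterman-style model in which reads land uniformly at random, so the expected fraction of bases covered at least once is 1 - exp(-c) for fold coverage c. The function names and the read counts in the usage lines are hypothetical.

```python
from math import exp


def fold_coverage(n_reads, mean_read_length, target_length):
    """Average number of times each base of the target is sequenced."""
    return n_reads * mean_read_length / target_length


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read.

    Assumes reads start uniformly at random (Poisson approximation),
    which ignores repeats and cloning bias.
    """
    return 1.0 - exp(-coverage)


# Illustrative values only: how completeness saturates as depth grows.
for c in (1.0, 2.0, 4.0, 8.0, 15.6):
    print(f"{c:5.1f}x coverage -> {expected_fraction_covered(c):.6f} of bases covered")
```

Under this idealized model, completeness saturates well below 15-fold depth, which suggests that at the depths reported the remaining assembly difficulty is more likely to come from repetitive sequence than from raw coverage; the model itself cannot confirm that.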
Computational genetic neuroanatomy of the developing mouse brain: dimensionality reduction, visualization, and clustering.

PubMed

Ji, Shuiwang

2013-07-11

The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship.
An open-source framework for large-scale, flexible evaluation of biomedical text mining systems.

PubMed

Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

2008-01-29

Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.
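The precision/recall trade-off discussed in this record can be made concrete with a small scoring routine. The Python sketch below is not taken from the BioNLP UIMA framework; it assumes exact-match scoring of gene mentions represented as hypothetical (document_id, start, end) tuples, and the example data are invented for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted gene mentions against gold-standard mentions.

    Both arguments are iterables of hashable mention keys; a prediction
    counts as correct only if it matches a gold mention exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example: a system that finds every gold mention but also
# emits as many spurious ones (high recall, low precision).
gold = {("doc1", 0, 5), ("doc1", 20, 24), ("doc2", 7, 12), ("doc2", 30, 35)}
spurious = {("doc1", 40, 44), ("doc2", 50, 53), ("doc2", 60, 66), ("doc2", 70, 75)}
p, r, f = precision_recall_f1(gold | spurious, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A downstream normalization step can discard spurious mentions but cannot recover missed ones, which is one way to read the abstract's finding that recall matters more than precision.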
An open-source framework for large-scale, flexible evaluation of biomedical text mining systems

PubMed Central

Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

2008-01-01

Background Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Results Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. Conclusion The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net. PMID:18230184
Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes

USDA-ARS's Scientific Manuscript database

The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-...
Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

PubMed Central

Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

2005-01-01

Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases from NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. 
Structural analysis of kinase protein domains as defined in multiple sources together with transmembrane alpha helices and signal peptide prediction provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747
Activation of the alpha-globin gene expression correlates with dramatic upregulation of nearby non-globin genes and changes in local and large-scale chromatin spatial structure.

PubMed

Ulianov, Sergey V; Galitsyna, Aleksandra A; Flyamer, Ilya M; Golov, Arkadiy K; Khrameeva, Ekaterina E; Imakaev, Maxim V; Abdennur, Nezar A; Gelfand, Mikhail S; Gavrilov, Alexey A; Razin, Sergey V

2017-07-11

In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation. We found that, similarly to mammalian genomes, the chicken genome is organized in TADs and compartments. Full activation of the alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells, and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring the alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation. 
Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.
Chromatin organization and global regulation of Hox gene clusters

PubMed Central

Montavon, Thomas; Duboule, Denis

2013-01-01

During development, a properly coordinated expression of Hox genes within their different genomic clusters is critical for patterning the body plans of many animals with a bilateral symmetry. The fascinating correspondence between the topological organization of Hox clusters and their transcriptional activation in space and time has served as a paradigm for understanding the relationships between genome structure and function. Here, we review some recent observations, which revealed highly dynamic changes in the structure of chromatin at Hox clusters, in parallel with their activation during embryonic development. We discuss the relevance of these findings for our understanding of large-scale gene regulation. PMID:23650639
GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies

PubMed Central

Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay

2004-01-01

Background Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175
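The abstract does not say which statistic GOTM uses to rank Gene Ontology categories. As an illustration only, the Python sketch below assumes a one-sided hypergeometric enrichment test, a common choice for this kind of gene-set analysis; the function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population: number of genes in the reference set
    annotated:  reference genes carrying the GO category
    sample:     size of the input gene set
    hits:       input genes carrying the GO category
    """
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, min(annotated, sample) + 1)
    )
    return tail / total


# Invented counts: 20 reference genes, 5 in the category,
# 4 genes in the input set, 2 of which are in the category.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
print(f"enrichment p-value = {p_value:.4f}")
```

In practice such a test would be repeated for every category, so a multiple-testing correction would be needed before calling any category significant.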
3' rapid amplification of cDNA ends (RACE) walking for rapid structural analysis of large transcripts.

PubMed

Ozawa, Tatsuhiko; Kondo, Masato; Isobe, Masaharu

2004-01-01

The 3' rapid amplification of cDNA ends (3' RACE) is widely used to isolate the cDNA of unknown 3' flanking sequences. However, the conventional 3' RACE often fails to amplify cDNA from a large transcript if there is a long distance between the 5' gene-specific primer and poly(A) stretch, since the conventional 3' RACE utilizes 3' oligo-dT-containing primer complementary to the poly(A) tail of mRNA at the first strand cDNA synthesis. To overcome this problem, we have developed an improved 3' RACE method suitable for the isolation of cDNA derived from very large transcripts. By using the oligonucleotide-containing random 9mer together with the GC-rich sequence for the suppression PCR technology at the first strand of cDNA synthesis, we have been able to amplify the cDNA from a very large transcript, such as the microtubule-actin crosslinking factor 1 (MACF1) gene, which codes a transcript of 20 kb in size. When there is no splicing variant, our highly specific amplification allows us to perform the direct sequencing of 3' RACE products without requiring cloning in bacterial hosts. Thus, this stepwise 3' RACE walking will help rapid characterization of the 3' structure of a gene, even when it encodes a very large transcript.
Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species.

PubMed

Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

2008-06-23

The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperm and the evolution of the cp genome.

Identification of causal genes for complex traits

PubMed Central

Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

2015-01-01

Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider ‘causal variants’ as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. Results: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Availability and implementation: Software is freely available for download at genetics.cs.ucla.edu/caviar. Contact: eeskin@cs.ucla.edu PMID:26072484
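The idea of a minimally sized gene set that captures the causal genes with probability ρ can be illustrated with a much simpler toy calculation. The sketch below is not the CAVIAR-Gene algorithm: it assumes that a posterior probability is already available for each gene and that at most one gene per locus is causal, so that probabilities can simply be added. The function name, the gene labels other than Apoa2, and all numbers are hypothetical.

```python
def credible_gene_set(posteriors, rho):
    """Return the smallest gene set whose summed posterior reaches rho.

    posteriors: dict mapping gene name -> posterior probability that the
    gene harbors the causal variant (hypothetical inputs, assumed to be
    mutually exclusive so that they can be summed).
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must be in (0, 1]")
    # Taking genes in order of decreasing posterior gives the smallest set
    # under the single-causal-gene assumption.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for gene, prob in ranked:
        if total >= rho:
            break
        chosen.append(gene)
        total += prob
    if total < rho:
        raise ValueError("posteriors do not reach rho")
    return chosen, total


# Made-up posteriors for four genes at one locus.
toy = {"Apoa2": 0.60, "geneB": 0.25, "geneC": 0.10, "geneD": 0.05}
genes, mass = credible_gene_set(toy, rho=0.8)
```

With these toy numbers, two of the four genes are enough to reach ρ = 0.8, which mirrors the kind of reduction in follow-up candidates the abstract describes.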
Identification of causal genes for complex traits.

PubMed

Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

2015-06-15

Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Software is freely available for download at genetics.cs.ucla.edu/caviar. © The Author 2015. Published by Oxford University Press.
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Shifts in microbial community structure and function in surface waters impacted by unconventional oil and gas wastewater revealed by metagenomics

USGS Publications Warehouse

Fahrenfeld, N.L.; Reyes, Hannah Delos; Eramo, Alessia; Akob, Denise M.; Mumford, Adam; Cozzarelli, Isabelle M.

2017-01-01

Unconventional oil and gas (UOG) production produces large quantities of wastewater with complex geochemistry and largely uncharacterized impacts on surface waters. In this study, we assessed shifts in microbial community structure and function in sediments and waters upstream and downstream from a UOG wastewater disposal facility. To do this, quantitative PCR for 16S rRNA and antibiotic resistance genes along with metagenomic sequencing were performed. Elevated conductivity and markers of UOG wastewater characterized sites sampled downstream from the disposal facility compared to background sites. Shifts in overall high level functions and microbial community structure were observed between background sites and downstream sediments. Increases in Deltaproteobacteria and Methanomicrobia and decreases in Thaumarchaeota were observed at downstream sites. Genes related to dormancy and sporulation and methanogenic respiration were 18–86 times higher at downstream, impacted sites. The potential for these sediments to serve as reservoirs of antimicrobial resistance was investigated given frequent reports of the use of biocides to control the growth of nuisance bacteria in UOG operations. A shift in resistance profiles downstream of the UOG facility was observed including increases in acrB and mexB genes encoding for multidrug efflux pumps, but not overall abundance of resistance genes. The observed shifts in microbial community structure and potential function indicate changes in respiration, nutrient cycling, and markers of stress in a stream impacted by UOG waste disposal operations.
Geographical Genomics of Human Leukocyte Gene Expression Variation in Southern Morocco

PubMed Central

Idaghdour, Youssef; Czika, Wendy; Shianna, Kevin V.; Lee, S. Hong; Visscher, Peter M.; Martin, Hilary C.; Miclaus, Kelci; Jadallah, Sami J.; Goldstein, David B.; Wolfinger, Russell D.; Gibson, Greg

2009-01-01

Studies of the genetics of gene expression reveal expression SNPs that explain variation in transcript abundance. Here we address the robustness of eSNP associations to environmental geography and population structure in a comparison of 194 Arab and Amazigh individuals from a city and two villages in southern Morocco. Gene expression differed between pairs of locations for up to a third of all transcripts, with notable enrichment for ribosomal biosynthesis and oxidative phosphorylation. Robust associations were observed in the leukocyte samples with cis-eSNPs (P < 10^−8) for 346 genes, and trans-eSNPs (P < 10^−11) with 10 genes. All of these were consistent across the three sample locations and after controlling for ethnicity and relatedness. No evidence for large-effect trans-acting mediators of the pervasive environmental influence was found and instead genetic and environmental factors acted in a largely additive manner. PMID:19966804
Aluminum tolerance is associated with higher MATE1 gene copy-number in maize

USDA-ARS's Scientific Manuscript database

Genome structure variation, including copy-number (CNV) and presence/absence variation (PAV), comprises a large part of maize genetic diversity, but its effect on phenotypes remains largely unexplored. Here we describe how copy-number variation in a major aluminum (Al) tolerance locus contributes ...
Genetic connectivity among swarming sites in the wide ranging and recently declining little brown bat (Myotis lucifugus)

PubMed Central

Burns, Lynne E; Frasier, Timothy R; Broders, Hugh G

2014-01-01

Characterizing movement dynamics and spatial aspects of gene flow within a species permits inference on population structuring. As patterns of structuring are products of historical and current demographics and gene flow, assessment of structure through time can yield an understanding of evolutionary dynamics acting on populations that are necessary to inform management. Recent dramatic population declines in hibernating bats in eastern North America from white-nose syndrome have prompted the need for information on movement dynamics for multiple bat species. We characterized population genetic structure of the little brown bat, Myotis lucifugus, at swarming sites in southeastern Canada using 9 nuclear microsatellites and a 292-bp region of the mitochondrial genome. Analyses of FST, ΦST, and Bayesian clustering (STRUCTURE) found weak levels of genetic structure among swarming sites for the nuclear and mitochondrial genome (Global FST = 0.001, P < 0.05, Global ΦST = 0.045, P < 0.01, STRUCTURE K = 1) suggesting high contemporary gene flow. Hierarchical AMOVA also suggests little structuring at a regional (provincial) level. Metrics of nuclear genetic structure were not found to differ between males and females suggesting weak asymmetries in gene flow between the sexes. However, a greater degree of mitochondrial structuring does support male-biased dispersal long term. Demographic analyses were consistent with past population growth and suggest a population expansion occurred from approximately 1250 to 12,500 BP, following Pleistocene deglaciation in the region. Our study suggests high gene flow and thus a high degree of connectivity among bats that visit swarming sites whereby mainland areas of the region may be best considered as one large gene pool for management and conservation. PMID:25505539
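To make the reported FST values easier to interpret, the following sketch computes the textbook Wright's FST for a single biallelic locus from subpopulation allele frequencies, FST = (HT − HS) / HT. It is only an illustration: the study used microsatellite and mitochondrial data with its own estimators, and the equal-sized-subpopulation assumption, the function name, and the example frequencies are all assumptions made here.

```python
def fst_biallelic(freqs):
    """Wright's FST for one biallelic locus.

    freqs: allele frequency of one allele in each subpopulation
    (subpopulations are assumed to be of equal size).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two subpopulations")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere, no variation to partition
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


no_structure = fst_biallelic([0.5, 0.5])      # identical frequencies
weak_structure = fst_biallelic([0.4, 0.6])    # small frequency difference
full_structure = fst_biallelic([0.0, 1.0])    # alternate alleles fixed
```

Values near 0, such as the global FST of 0.001 reported above, correspond to nearly identical allele frequencies across sites, while a value of 1 would mean complete differentiation.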
Knowledge and Attitude of Nigerian Adolescents to Premarital Genotyping.

ERIC Educational Resources Information Center

Egbochuku, E. O.; Imogie, A. O.

Sickle cell disease (SCD) refers to a group of hereditary disorders of the structure of hemoglobin of red blood cells. This disorder involves the inheritance of two abnormal genes related to hemoglobin production, at least one of which is the sickle cell gene. Nigeria, by virtue of her large population, has the greatest number of…
Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

DOE Office of Scientific and Technical Information (OSTI.GOV)

von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

2001-01-01

Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology

PubMed Central

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-01-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells. PMID:28731469
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology.

PubMed

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-11-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells.
[Hsp70 Genes of the Megaphragma amalphitanum (Hymenoptera: Trichogrammatidae) Parasitic Wasp].

PubMed

Chuvakova, L N; Sharko, F S; Nedoluzhko, A V; Polilov, A A; Prokhorchuk, E B; Skryabin, K G; Evgen'ev, M B

2017-01-01

Miniaturization is an evolutionary process that is widely represented in both invertebrates and vertebrates. Miniaturization frequently affects not only the size of the organism and its constituent cells, but also changes the genome structure and functioning. The structure of the main heat shock genes (hsp70 and hsp83) was studied in one of the smallest insects, the Megaphragma amalphitanum (Hymenoptera: Trichogrammatidae) parasitic wasp, which is comparable in size with unicellular organisms. An analysis of the sequenced genome has detected six genes that relate to the hsp70 family, some of which are apparently induced upon heat shock. Both induced and constitutively expressed hsp70 genes contain a large number of introns, which is not typical for the genes of this family. Moreover, none of the found genes form clusters, and they are all very heterogeneous (individual copies are only 75-85% identical), which indicates the absence of gene conversion, which provides the identity of genes of this family in Drosophila and other organisms. Two hsp83 genes, one of which contains an intron, have also been found in the M. amalphitanum genome.
Soybean kinome: functional classification and gene expression patterns

PubMed Central

Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek

2015-01-01

The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
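The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract (2166 PK genes, 4.67% of protein-coding genes, 1418 RLK genes); the implied genome-wide total is approximate because the percentage is rounded, and the variable names are illustrative.

```python
kinome_size = 2166        # putative PK genes reported in the abstract
kinome_fraction = 0.0467  # 4.67% of all soybean protein-coding genes
rlk_genes = 1418          # genes in the receptor-like kinase (RLK) group

# Total number of protein-coding genes implied by the two figures above.
implied_total_genes = kinome_size / kinome_fraction

# Share of the kinome that belongs to the RLK group.
rlk_share = rlk_genes / kinome_size

print(f"implied protein-coding genes: about {implied_total_genes:.0f}")
print(f"RLK share of the kinome: {rlk_share:.1%}")
```

Under these figures the RLK group accounts for roughly two thirds of the kinome, which is what the abstract means by "remarkably large".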
Computational genetic neuroanatomy of the developing mouse brain: dimensionality reduction, visualization, and clustering

PubMed Central

2013-01-01

Background: The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. Results: In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Conclusions: Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship. PMID:23845024
Construction of ontology augmented networks for protein complex prediction.

PubMed

Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

2013-01-01

Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
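One way to picture a distance that combines network topology with gene ontology annotation similarity is sketched below. This is not the measure defined by the authors: it assumes, purely for illustration, that both components are Jaccard similarities (shared interaction partners and shared annotation terms) mixed with a hypothetical weight alpha. All protein names and annotation labels are placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unified_distance(u, v, neighbors, go_terms, alpha=0.5):
    """Toy distance mixing topological and annotation similarity.

    neighbors: dict protein -> set of interaction partners
    go_terms:  dict protein -> set of annotation terms
    alpha:     weight of the topological component (assumed, not from the paper)
    """
    # Each protein is added to its own neighborhood so that two directly
    # interacting proteins count each other as shared context.
    topo = jaccard(neighbors[u] | {u}, neighbors[v] | {v})
    onto = jaccard(go_terms[u], go_terms[v])
    return 1.0 - (alpha * topo + (1.0 - alpha) * onto)


# Placeholder network: A, B and C form a triangle, D is isolated.
neighbors = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": set()}
go_terms = {"A": {"t1", "t2"}, "B": {"t1", "t2"}, "C": {"t1"}, "D": {"t9"}}
d_ab = unified_distance("A", "B", neighbors, go_terms)
d_ac = unified_distance("A", "C", neighbors, go_terms)
d_ad = unified_distance("A", "D", neighbors, go_terms)
```

Proteins that share both partners and annotations end up close together, so a clustering step run on such distances would tend to group them into the same candidate complex.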
Beta-globin locus activation regions: conservation of organization, structure, and function.

PubMed Central

Li, Q L; Zhou, B; Powers, P; Enver, T; Stamatoyannopoulos, G

1990-01-01

The human beta-globin locus activation region (LAR) comprises four erythroid-specific DNase I hypersensitive sites (I-IV) thought to be largely responsible for activating the beta-globin domain and facilitating high-level erythroid-specific globin gene expression. We identified the goat beta-globin LAR, determined 10.2 kilobases of its sequence, and demonstrated its function in transgenic mice. The human and goat LARs share 6.5 kilobases of homologous sequences that are as highly conserved as the epsilon-globin gene promoters. Furthermore, the overall spatial organization of the two LARs has been conserved. These results suggest that the functionally relevant regions of the LAR are large and that in addition to their primary structure, the spatial relationship of the conserved elements is important for LAR function. PMID:2236034
Natural Allelic Diversity, Genetic Structure and Linkage Disequilibrium Pattern in Wild Chickpea

PubMed Central

Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

2014-01-01

Characterization of natural allelic diversity and understanding the genetic structure and linkage disequilibrium (LD) pattern in wild germplasm accessions by large-scale genotyping of informative microsatellite and single nucleotide polymorphism (SNP) markers is requisite to facilitate chickpea genetic improvement. Large-scale validation and high-throughput genotyping of genome-wide physically mapped 478 genic and genomic microsatellite markers and 380 transcription factor gene-derived SNP markers using gel-based assay, fluorescent dye-labelled automated fragment analyser and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass array have been performed. Outcome revealed their high genotyping success rate (97.5%) and existence of a high level of natural allelic diversity among 94 wild and cultivated Cicer accessions. High intra- and inter-specific polymorphic potential and wider molecular diversity (11–94%) along with a broader genetic base (13–78%) specifically in the functional genic regions of wild accessions was assayed by mapped markers. It suggested their utility in monitoring introgression and transferring target trait-specific genomic (gene) regions from wild to cultivated gene pool for the genetic enhancement. Distinct species/gene pool-wise differentiation, admixed domestication pattern, and differential genome-wide recombination and LD estimates/decay observed in a six structured population of wild and cultivated accessions using mapped markers further signifies their usefulness in chickpea genetics, genomics and breeding. PMID:25222488
Mitochondrial genes in the colourless alga Prototheca wickerhamii resemble plant genes in their exons but fungal genes in their introns.

PubMed Central

Wolff, G; Burger, G; Lang, B F; Kück, U

1993-01-01

The mitochondrial DNA from the colourless alga Prototheca wickerhamii contains two mosaic genes, as was revealed by complete sequencing of the circular extranuclear genome. The genes for the large subunit of the ribosomal RNA (LSUrRNA) as well as for subunit I of the cytochrome oxidase (coxI) carry two and three intronic sequences respectively. On the basis of their canonical nucleotide sequences they can be classified as group I introns. Phylogenetic comparisons of the coxI protein sequences allow us to conclude that the P. wickerhamii mtDNA is much more closely related to higher plant mtDNAs than to those of the chlorophyte alga C. reinhardtii. The comparison of the intron sequences revealed several unusual features: (1) The P. wickerhamii introns are structurally related to mitochondrial introns from various ascomycetous fungi. (2) Phylogenetic analyses indicate a close relationship between fungal and algal intronic sequences. (3) The P. wickerhamii introns are located at positions within the structural genes which can be considered as preferred intron insertion sites in homologous mitochondrial genes from fungi or liverwort. In all cases, the sequences adjacent to the insertion sites are very well conserved over large evolutionary distances. Our finding of highly similar introns in fungi and algae is consistent with the idea that introns were already present in the bacterial ancestors of present day mitochondria and evolved concomitantly with the organelles. PMID:7680126
Genetics of bacteria that oxidize one-carbon compounds. Progress report, March 1, 1991--June 30, 1993

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hanson, R.S.

In the past several years researchers have identified at least 20 genes whose products were required for the oxidation of methanol to formaldehyde in three different facultative methylotrophic bacteria. These genes include the structural gene for cytochrome cL (mox G), which is a specific electron acceptor for methanol dehydrogenase (MDH), and the two structural genes that encode the large subunit (mox F) and smaller subunit (mox I) of MDH. Other genes are required for the synthesis of the prosthetic group of MDH, pyrroloquinoline quinone (PQQ), and proteins required for assembly of the active MDH in the periplasm. Three genes are believed to be required for incorporation of calcium into the MDH tetramer. The principal investigator's group has studied the regulation of methanol oxidation in the pink-pigmented facultative methylotroph Methylobacterium organophilum XX. The authors have mapped several genes and have sequenced the mox F gene and sequences upstream of mox F. The authors had tentatively identified several genes required for the transcription of the MDH structural genes in three methylotrophs. In the previous proposal, the P.I. proposed to establish an in-vitro transcription/translation system to study the function of the regulatory gene products. Further studies demonstrated that the regulation of transcription of these genes was far more complex than imagined at that time and the research plan was modified to determine the number and function of the regulatory genes using genetic approaches.
Genome-wide identification and characterization of WRKY gene family in Salix suchowensis.

PubMed

Bi, Changwei; Xu, Yiqing; Ye, Qiaolin; Yin, Tongming; Ye, Ning

2016-01-01

WRKY proteins are the zinc finger transcription factors that were first identified in plants. They can specifically interact with the W-box, which can be found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. With the whole genome sequencing of Salix suchowensis, we have the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them from SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Due to their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I-III), with five subgroups (IIa-IIe) in group II. With the multiple sequence alignment and the manual search, we found three variations of the WRKYGQK heptapeptide: WRKYGRK, WKKYGQK and WRKYGKK, and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon-intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes based on RNA sequencing data revealed diverse expression patterns among five tissues, including tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help complete the functional and annotation information for WRKY gene families in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants.
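The heptapeptide variants named in this abstract lend themselves to a simple pattern search. The sketch below scans a protein sequence for the canonical WRKYGQK motif and the three reported variants using a regular expression. It is a minimal illustration only: the function name and the toy sequence are made up, and the authors describe using multiple sequence alignment and manual search rather than this approach.

```python
import re

# Canonical heptapeptide followed by the three variants listed in the abstract.
WRKY_VARIANTS = ("WRKYGQK", "WRKYGRK", "WKKYGQK", "WRKYGKK")
WRKY_PATTERN = re.compile("|".join(WRKY_VARIANTS))


def find_wrky_motifs(protein_seq):
    """Return (0-based start position, matched heptapeptide) pairs."""
    return [(m.start(), m.group())
            for m in WRKY_PATTERN.finditer(protein_seq.upper())]


# Toy sequence containing one canonical motif and one variant.
hits = find_wrky_motifs("MAAWRKYGQKPLLWKKYGQKA")
print(hits)  # [(3, 'WRKYGQK'), (13, 'WKKYGQK')]
```

A search like this only flags candidate motifs; assigning a gene to group I, II or III would still depend on the zinc finger motif and the overall domain structure.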

Genome-wide identification and characterization of WRKY gene family in Salix suchowensis

PubMed Central

Ye, Qiaolin; Yin, Tongming

2016-01-01

WRKY proteins are zinc finger transcription factors that were first identified in plants. They specifically interact with the W-box, which is found in the promoter regions of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. The whole-genome sequencing of Salix suchowensis gave us the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Based on their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (groups I–III), with five subgroups (IIa–IIe) in group II. Through multiple sequence alignment and manual search, we found three variants of the WRKYGQK heptapeptide (WRKYGRK, WKKYGQK and WRKYGKK) and four variants of the normal zinc finger motif, which might execute some new biological functions. In addition, SsWRKY genes from the same subgroup share similar exon–intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks.
The analyses of the WRKY gene family in willow not only help to complete the functional and annotation information of the WRKY gene family in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants. PMID:27651997
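The heptapeptide search described in the abstract above can be illustrated with a minimal Python sketch. It only scans protein sequences for the canonical WRKYGQK motif and the three variants named in the abstract; it is not the authors' pipeline, which relied on multiple sequence alignment and manual search. The function name and the toy sequences are hypothetical and are not real SsWRKY proteins.

```python
import re

# Canonical heptapeptide plus the three variants named in the abstract.
WRKY_MOTIFS = ("WRKYGQK", "WRKYGRK", "WKKYGQK", "WRKYGKK")
MOTIF_RE = re.compile("|".join(WRKY_MOTIFS))


def find_wrky_motifs(protein_seq):
    """Return (0-based start, motif) for every heptapeptide hit in a protein."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in MOTIF_RE.finditer(seq)]


# Hypothetical toy sequences used only to exercise the function.
toy_proteins = {
    "toy_canonical": "MSDDWRKYGQKAAA",
    "toy_variants": "MSWKKYGQKTTTWRKYGKKL",
    "toy_none": "MSTAAAGGG",
}

hits = {name: find_wrky_motifs(seq) for name, seq in toy_proteins.items()}
for name, found in hits.items():
    print(name, found)
```

A plain string scan like this will miss divergent domains, so in practice it would at most serve as a first filter before alignment-based classification.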
Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.

PubMed

Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L

2005-01-01

The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.
Cyclen-based double-tailed lipids for DNA delivery: Synthesis and the effect of linking group structures.

PubMed

Zhang, Yi-Mei; Chang, De-Chun; Zhang, Ji; Liu, Yan-Hong; Yu, Xiao-Qi

2015-09-01

The gene transfection efficiency (TE) of cationic lipids is largely influenced by the lipid structure. Six novel 1,4,7,10-tetraazacyclododecane (cyclen)-based cationic lipids, L1-L6, which contain two oleyl groups as hydrophobic tails, were designed and synthesized. These lipids differ in their backbones. Liposomes prepared from the lipids and DOPE showed good DNA affinity, and full DNA condensation could be achieved at an N/P ratio of 4 to form lipoplexes with suitable sizes and zeta-potentials for gene transfection. The structure-activity relationship of these lipids as non-viral gene delivery vectors was investigated. Minor structural variations in the backbone, including the linking group and the structural symmetry, were found to affect the TE. The diethylenetriamine-derived lipid L4, which contains amide linking bonds, gave the best TE, which was several times higher than that of the commercially available transfection reagent Lipofectamine 2000. In addition, these lipids exhibited low cytotoxicity, suggesting good biocompatibility. The results suggest that this type of cationic lipid might provide promising non-viral gene vectors, and also offer clues for the design of novel vectors with higher TE and biocompatibility. Copyright © 2015 Elsevier Ltd. All rights reserved.
Muscle Research and Gene Ontology: New standards for improved data integration

PubMed Central

Feltrin, Erika; Campanaro, Stefano; Diehl, Alexander D; Ehler, Elisabeth; Faulkner, Georgine; Fordham, Jennifer; Gardin, Chiara; Harris, Midori; Hill, David; Knoell, Ralph; Laveder, Paolo; Mittempergher, Lorenza; Nori, Alessandra; Reggiani, Carlo; Sorrentino, Vincenzo; Volpe, Pompeo; Zara, Ivano; Valle, Giorgio; Deegan née Clark, Jennifer

2009-01-01

Background The Gene Ontology Project provides structured controlled vocabularies for molecular biology that can be used for the functional annotation of genes and gene products. In a collaboration between the Gene Ontology (GO) Consortium and the muscle biology community, we have made large-scale additions to the GO biological process and cellular component ontologies. The main focus of this ontology development work concerns skeletal muscle, with specific consideration given to the processes of muscle contraction, plasticity, development, and regeneration, and to the sarcomere and membrane-delimited compartments. Our aims were to update the existing structure to reflect current knowledge, and to resolve, in an accommodating manner, the ambiguity in the language used by the community. Results The updated muscle terminologies have been incorporated into the GO. There are now 159 new terms covering critical research areas, and 57 existing terms have been improved and reorganized to follow their usage in muscle literature. Conclusion The revised GO structure should improve the interpretation of data from high-throughput (e.g. microarray and proteomic) experiments in the area of muscle science and muscle disease. We actively encourage community feedback on, and gene product annotation with, these new terms. Please visit the Muscle Community Annotation Wiki. PMID:19178689
Large-scale gene expression profiling data for the model moss Physcomitrella patens aid understanding of developmental progression, culture and stress conditions.

PubMed

Hiss, Manuel; Laule, Oliver; Meskauskiene, Rasa M; Arif, Muhammad A; Decker, Eva L; Erxleben, Anika; Frank, Wolfgang; Hanke, Sebastian T; Lang, Daniel; Martin, Anja; Neu, Christina; Reski, Ralf; Richardt, Sandra; Schallenberg-Rüdinger, Mareike; Szövényi, Peter; Tiko, Theodhor; Wiedemann, Gertrud; Wolf, Luise; Zimmermann, Philip; Rensing, Stefan A

2014-08-01

The moss Physcomitrella patens is an important model organism for studying plant evolution, development, physiology and biotechnology. Here we have generated microarray gene expression data covering the principal developmental stages, culture forms and some environmental/stress conditions. Example analyses of developmental stages and growth conditions as well as abiotic stress treatments demonstrate that (i) growth stage is dominant over culture conditions, (ii) liquid culture is not stressful for the plant, (iii) low pH might aid protoplastation by reduced expression of cell wall structure genes, (iv) largely the same gene pool mediates response to dehydration and rehydration, and (v) AP2/EREBP transcription factors play important roles in stress response reactions. With regard to the AP2 gene family, phylogenetic analysis and comparison with Arabidopsis thaliana shows commonalities as well as uniquely expressed family members under drought, light perturbations and protoplastation. Gene expression profiles for P. patens are available for the scientific community via the easy-to-use tool at https://www.genevestigator.com. By providing large-scale expression profiles, the usability of this model organism is further enhanced, for example by enabling selection of control genes for quantitative real-time PCR. Now, gene expression levels across a broad range of conditions can be accessed online for P. patens. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Using Signature Genes as Tools To Assess Environmental Viral Ecology and Diversity

PubMed Central

Adriaenssens, Evelien M.

2014-01-01

Viruses (including bacteriophages) are the most abundant biological entities on the planet. As such, they are thought to have a major impact on all aspects of microbial community structure and function. Despite this critical role in ecosystem processes, the study of virus/phage diversity has lagged far behind parallel studies of the bacterial and eukaryotic kingdoms, largely due to the absence of any universal phylogenetic marker. Here we review the development and use of signature genes to investigate viral diversity, as a viable strategy for data sets of specific virus groups. Genes that have been used include those encoding structural proteins, such as portal protein, major capsid protein, and tail sheath protein, auxiliary metabolism genes, such as psbA, psbB, and phoH, and several polymerase genes. These marker genes have been used in combination with PCR-based fingerprinting and/or sequencing strategies to investigate spatial, temporal, and seasonal variations and diversity in a wide range of habitats. PMID:24837394
Assessing the effects of common variation in the FOXP2 gene on human brain structure.

PubMed

Hoogman, Martine; Guadalupe, Tulio; Zwiers, Marcel P; Klarenbeek, Patricia; Francks, Clyde; Fisher, Simon E

2014-01-01

The FOXP2 transcription factor is one of the most well-known genes to have been implicated in developmental speech and language disorders. Rare mutations disrupting the function of this gene have been described in different families and cases. In a large three-generation family carrying a missense mutation, neuroimaging studies revealed significant effects on brain structure and function, most notably in the inferior frontal gyrus, caudate nucleus, and cerebellum. After the identification of rare disruptive FOXP2 variants impacting on brain structure, several reports proposed that common variants at this locus may also have detectable effects on the brain, extending beyond disorder into normal phenotypic variation. These neuroimaging genetics studies used groups of between 14 and 96 participants. The current study assessed effects of common FOXP2 variants on neuroanatomy using voxel-based morphometry (VBM) and volumetric techniques in a sample of >1300 people from the general population. In a first targeted stage we analyzed single nucleotide polymorphisms (SNPs) claimed to have effects in prior smaller studies (rs2253478, rs12533005, rs2396753, rs6980093, rs7784315, rs17137124, rs10230558, rs7782412, rs1456031), beginning with regions proposed in the relevant papers, then assessing impact across the entire brain. In the second gene-wide stage, we tested all common FOXP2 variation, focusing on volumetry of those regions most strongly implicated from analyses of rare disruptive mutations. Despite using a sample that is more than 10 times that used for prior studies of common FOXP2 variation, we found no evidence for effects of SNPs on variability in neuroanatomy in the general population. Thus, the impact of this gene on brain structure may be largely limited to extreme cases of rare disruptive alleles. Alternatively, effects of common variants at this gene exist but are too subtle to be detected with standard volumetric techniques.
Transport genes and chemotaxis in Laribacter hongkongensis: a genome-wide analysis

PubMed Central

2011-01-01

Background Laribacter hongkongensis is a Gram-negative, seagull-shaped rod associated with community-acquired gastroenteritis. The bacterium has been found in diverse freshwater environments including fish, frogs and drinking water reservoirs. Using the complete genome sequence data of L. hongkongensis, we performed a comprehensive analysis of putative transport-related genes and genes related to chemotaxis, motility and quorum sensing, which may help the bacterium adapt to changing environments and combat harmful substances. Results A genome-wide analysis using the Transport Classification Database (TCDB), similarity and keyword searches revealed the presence of a large diversity of transporters (n = 457) and genes related to chemotaxis (n = 52) and flagellar biosynthesis (n = 40) in the L. hongkongensis genome. The transporters included those from all seven major transporter categories, which may allow the uptake of essential nutrients or ions, and extrusion of metabolic end products and hazardous substances. L. hongkongensis is unique among closely related members of the Neisseriaceae family in possessing a higher number of proteins related to transport of ammonium, urea and dicarboxylate, which may reflect the importance of nitrogen and dicarboxylate metabolism in this asaccharolytic bacterium. Structural modeling of two C4-dicarboxylate transporters showed that they possessed similar structures to the determined structures of other DctP-TRAP transporters, with one having an unusual disulfide bond. Diverse mechanisms for iron transport, including hemin transporters for iron acquisition from host proteins, were also identified. In addition to the chemotaxis and flagella-related genes, the L. hongkongensis genome also contained two copies of qseB/qseC homologues of the AI-3 quorum sensing system.
Conclusions The large number of diverse transporters and genes involved in chemotaxis, motility and quorum sensing suggested that the bacterium may utilize a complex system to adapt to different environments. Structural modeling will provide useful insights on the transporters in L. hongkongensis. PMID:21849034
Re-annotation, improved large-scale assembly and establishment of a catalogue of noncoding loci for the genome of the model brown alga Ectocarpus.

PubMed

Cormier, Alexandre; Avia, Komlan; Sterck, Lieven; Derrien, Thomas; Wucher, Valentin; Andres, Gwendoline; Monsoor, Misharl; Godfroy, Olivier; Lipinska, Agnieszka; Perrineau, Marie-Mathilde; Van De Peer, Yves; Hitte, Christophe; Corre, Erwan; Coelho, Susana M; Cock, J Mark

2017-04-01

The genome of the filamentous brown alga Ectocarpus was the first to be completely sequenced from within the brown algal group and has served as a key reference genome both for this lineage and for the stramenopiles. We present a complete structural and functional reannotation of the Ectocarpus genome. The large-scale assembly of the Ectocarpus genome was significantly improved and genome-wide gene re-annotation using extensive RNA-seq data improved the structure of 11 108 existing protein-coding genes and added 2030 new loci. A genome-wide analysis of splicing isoforms identified an average of 1.6 transcripts per locus. A large number of previously undescribed noncoding genes were identified and annotated, including 717 loci that produce long noncoding RNAs. Conservation of lncRNAs between Ectocarpus and another brown alga, the kelp Saccharina japonica, suggests that at least a proportion of these loci serve a function. Finally, a large collection of single nucleotide polymorphism-based markers was developed for genetic analyses. These resources are available through an updated and improved genome database. This study significantly improves the utility of the Ectocarpus genome as a high-quality reference for the study of many important aspects of brown algal biology and as a reference for genomic analyses across the stramenopiles. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Predicting Gene Structure Changes Resulting from Genetic Variants via Exon Definition Features.

PubMed

Majoros, William H; Holt, Carson; Campbell, Michael S; Ware, Doreen; Yandell, Mark; Reddy, Timothy E

2018-04-25

Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed, and produce functional proteins. We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and noncoding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or noncoding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products, and we propose that they may commonly act as cryptic factors in disease. The software is available from geneprediction.org/SGRF. Contact: bmajoros@duke.edu. Supplementary information is available at Bioinformatics online.
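To make the idea of a variant that creates a de novo splice site concrete, the following toy Python sketch checks whether a single-nucleotide substitution introduces a canonical GT (donor) or AG (acceptor) dinucleotide that is absent from the reference. This is a deliberate simplification and an assumption of this example: it is not the probabilistic model described in the abstract, and the function name, arguments and toy sequences are hypothetical.

```python
def created_splice_dinucleotides(ref_seq, pos, alt_base, motifs=("GT", "AG")):
    """Return (start, motif) for canonical splice dinucleotides created by an SNV.

    ref_seq  : reference DNA sequence (string)
    pos      : 0-based position of the substituted base
    alt_base : the alternate base at that position
    """
    ref = ref_seq.upper()
    alt = ref[:pos] + alt_base.upper() + ref[pos + 1:]
    created = []
    # Only the two dinucleotide windows overlapping the variant can change.
    for start in (pos - 1, pos):
        if start < 0 or start + 2 > len(ref):
            continue
        alt_window = alt[start:start + 2]
        if alt_window in motifs and ref[start:start + 2] != alt_window:
            created.append((start, alt_window))
    return created


# Hypothetical toy sequences.
donor_like = created_splice_dinucleotides("CCGACC", 3, "T")     # GA -> GT
acceptor_like = created_splice_dinucleotides("CCAACC", 3, "G")  # AA -> AG
no_change = created_splice_dinucleotides("CCGTCC", 0, "A")      # nothing new
print(donor_like, acceptor_like, no_change)
```

A dinucleotide match alone says little about whether the site would actually be used; scoring the surrounding sequence context is what a trained model adds.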
Life history and past demography maintain genetic structure, outcrossing rate, contemporary pollen gene flow of an understory herb in a highly fragmented rainforest

PubMed Central

Suárez-Montes, Pilar; Chávez-Pesqueira, Mariana

2016-01-01

Introduction Theory predicts that habitat fragmentation, by reducing population size and increasing isolation among remnant populations, can alter their genetic diversity and structure. A cascade of effects is expected: genetic drift and inbreeding after a population bottleneck, changes in biotic interactions that may affect, as in the case of plants, pollen dynamics, mating system and reproductive success. The detection of the effects of contemporary habitat fragmentation on the genetic structure of populations is conditioned by the magnitude of change, given the small number of generations since the onset of fragmentation, especially for long-lived organisms. However, the present-day genetic structure of populations may bear the signature of past demographic events. Here, we examine the effects of rainforest fragmentation on the genetic diversity, population structure, mating system (outcrossing rate), indirect gene flow and contemporary pollen dynamics in the understory herb Aphelandra aurantiaca. Also, we assessed its present-day genetic structure under different past demographic scenarios. Methods Twelve populations of A. aurantiaca were sampled in large (4), medium (3), and small (5) forest fragments in the lowland tropical rainforest in the Los Tuxtlas region. Variation at 11 microsatellite loci was assessed in 28–30 reproductive plants per population. In two medium- and two large-size fragments we estimated the density of reproductive plants, and the mating system by analyzing the progeny of different mother plants per population. Results Despite prevailing habitat fragmentation, populations of A. aurantiaca possess high genetic variation (He = 0.61), weak genetic structure (Rst = 0.037), and slight inbreeding in small fragments. Effective population sizes (Ne) were large, but slightly lower in small fragments. Migrants derive mostly from large and medium size fragments. Gene dispersal is highly restricted but long-distance gene dispersal events were detected.
Aphelandra aurantiaca shows a mixed mating system (tm = 0.81) and the outcrossing rate has not been affected by habitat fragmentation. A strong pollen pool structure was detected due to few effective pollen donors (Nep) and short-distance pollen movement, indicating that most plants received pollen from close neighbors. Past demographic fluctuations may have affected the present population genetic structure as Bayesian coalescent analysis revealed the signature of past population expansion, possibly during warmer conditions after the last glacial maximum. Discussion Habitat fragmentation has not increased genetic differentiation or reduced genetic diversity of A. aurantiaca despite dozens of generations since the onset of fragmentation in the region of Los Tuxtlas. Instead, past population expansion is compatible with the lack of observed genetic structure. The predicted negative effects of rainforest fragmentation on genetic diversity and population structure of A. aurantiaca seem to have been buffered owing to its large effective populations and long-distance dispersal events. In particular, its mixed-mating system, mostly of outcrossing, suggests high efficiency of pollinators promoting connectivity and reducing inbreeding. However, some results indicate that the effects of fragmentation are underway, as two small fragments showed higher membership probabilities to their population of origin, suggesting genetic isolation. Our findings underscore the importance of fragment size to maintain genetic connectivity across the landscape. PMID:28028460
Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

PubMed

Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

2010-07-16

Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
A big data pipeline: Identifying dynamic gene regulatory networks from time-course Gene Expression Omnibus data with applications to influenza infection.

PubMed

Carey, Michelle; Ramírez, Juan Camilo; Wu, Shuang; Wu, Hulin

2018-07-01

A biological host response to an external stimulus or intervention such as a disease or infection is a dynamic process, which is regulated by an intricate network of many genes and their products. Understanding the dynamics of this gene regulatory network allows us to infer the mechanisms involved in a host response to an external stimulus, and hence aids the discovery of biomarkers of phenotype and biological function. In this article, we propose a modeling/analysis pipeline for dynamic gene expression data, called Pipeline4DGEData, which consists of a series of statistical modeling techniques to construct dynamic gene regulatory networks from the large volumes of high-dimensional time-course gene expression data that are freely available in the Gene Expression Omnibus repository. This pipeline has a consistent and scalable structure that allows it to simultaneously analyze a large number of time-course gene expression data sets, and then integrate the results across different studies. We apply the proposed pipeline to influenza infection data from nine studies and demonstrate that interesting biological findings can be discovered with its implementation.
Comparative Analysis of Syntenic Genes in Grass Genomes Reveals Accelerated Rates of Gene Structure and Coding Sequence Evolution in Polyploid Wheat

PubMed Central

Akhunov, Eduard D.; Sehgal, Sunish; Liang, Hanquan; Wang, Shichen; Akhunova, Alina R.; Kaur, Gaganpreet; Li, Wanlong; Forrest, Kerrie L.; See, Deven; Šimková, Hana; Ma, Yaqin; Hayden, Matthew J.; Luo, Mingcheng; Faris, Justin D.; Doležel, Jaroslav; Gill, Bikram S.

2013-01-01

Cycles of whole-genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied by comparing the patterns of gene structure changes, alternative splicing (AS), and codon substitution rates among wheat and model grass genomes. In orthologous gene sets, significantly more acquired and lost exonic sequences were detected in wheat than in model grasses. In wheat, 35% of these gene structure rearrangements resulted in frame-shift mutations and premature termination codons. An increased codon mutation rate in the wheat lineage compared with Brachypodium distachyon was found for 17% of orthologs. The discovery of premature termination codons in 38% of expressed genes was consistent with ongoing pseudogenization of the wheat genome. The rates of AS within the individual wheat subgenomes (21%–25%) were similar to diploid plants. However, we uncovered a high level of AS pattern divergence between the duplicated homeologous copies of genes. Our results are consistent with the accelerated accumulation of AS isoforms, nonsynonymous mutations, and gene structure rearrangements in the wheat lineage, likely due to genetic redundancy created by WGDs. Whereas these processes mostly contribute to the degeneration of a duplicated genome and its diploidization, they have the potential to facilitate the origin of new functional variations, which, upon selection in the evolutionary lineage, may play an important role in the origin of novel traits. PMID:23124323
Microsatellite and mtDNA analysis of lake trout, Salvelinus namaycush, from Great Bear Lake, Northwest Territories: impacts of historical and contemporary evolutionary forces on Arctic ecosystems

PubMed Central

Harris, Les N; Howland, Kimberly L; Kowalchuk, Matthew W; Bajno, Robert; Lindsay, Melissa M; Taylor, Eric B

2013-01-01

Resolving the genetic population structure of species inhabiting pristine, high latitude ecosystems can provide novel insights into the post-glacial, evolutionary processes shaping the distribution of contemporary genetic variation. In this study, we assayed genetic variation in lake trout (Salvelinus namaycush) from Great Bear Lake (GBL), NT and one population outside of this lake (Sandy Lake, NT) at 11 microsatellite loci and the mtDNA control region (d-loop). Overall, population subdivision was low, but significant (global FST θ = 0.025), and pairwise comparisons indicated that significance was heavily influenced by comparisons between GBL localities and Sandy Lake. Our data indicate that there is no obvious genetic structure among the various basins within GBL (global FST = 0.002) despite the large geographic distances between sampling areas. We found evidence of low levels of contemporary gene flow among arms within GBL, but not between Sandy Lake and GBL. Coalescent analyses suggested that some historical gene flow occurred among arms within GBL and between GBL and Sandy Lake. It appears, therefore, that contemporary (ongoing dispersal and gene flow) and historical (historical gene flow and large founding and present-day effective population sizes) factors contribute to the lack of neutral genetic structure in GBL. Overall, our results illustrate the importance of history (e.g., post-glacial colonization) and contemporary dispersal ecology in shaping genetic population structure of Arctic faunas and provide a better understanding of the evolutionary ecology of long-lived salmonids in pristine, interconnected habitats. PMID:23404390
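As a rough illustration of what a population-subdivision statistic such as FST measures, the Python sketch below computes the classical ratio FST = (HT - HS) / HT from per-population allele frequencies at a single locus. This is an assumption-laden simplification: the abstract reports θ, an estimator that also corrects for sample size, which this sketch does not implement. The function names and the toy frequencies are hypothetical and are not data from the study.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in allele_freqs)


def fst_single_locus(subpop_freqs):
    """FST = (HT - HS) / HT with equally weighted subpopulations.

    subpop_freqs: list of allele-frequency lists, one per subpopulation,
    all listing the same alleles in the same order.
    """
    n_pops = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # Mean allele frequencies across subpopulations give the "total" pool.
    mean_freqs = [
        sum(pop[i] for pop in subpop_freqs) / n_pops for i in range(n_alleles)
    ]
    h_total = expected_heterozygosity(mean_freqs)
    h_within = sum(expected_heterozygosity(pop) for pop in subpop_freqs) / n_pops
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall; no differentiation to measure
    return (h_total - h_within) / h_total


# Hypothetical toy allele frequencies for two subpopulations.
weak_structure = fst_single_locus([[0.4, 0.6], [0.6, 0.4]])
fixed_difference = fst_single_locus([[1.0, 0.0], [0.0, 1.0]])
identical = fst_single_locus([[0.2, 0.3, 0.5], [0.2, 0.3, 0.5]])
print(weak_structure, fixed_difference, identical)
```

Values near zero, like the within-lake estimate quoted above, mean that almost all of the heterozygosity is found within sampling localities and very little between them.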
Microsatellite and mtDNA analysis of lake trout, Salvelinus namaycush, from Great Bear Lake, Northwest Territories: impacts of historical and contemporary evolutionary forces on Arctic ecosystems.

PubMed

Harris, Les N; Howland, Kimberly L; Kowalchuk, Matthew W; Bajno, Robert; Lindsay, Melissa M; Taylor, Eric B

2012-01-01

Resolving the genetic population structure of species inhabiting pristine, high latitude ecosystems can provide novel insights into the post-glacial, evolutionary processes shaping the distribution of contemporary genetic variation. In this study, we assayed genetic variation in lake trout (Salvelinus namaycush) from Great Bear Lake (GBL), NT and one population outside of this lake (Sandy Lake, NT) at 11 microsatellite loci and the mtDNA control region (d-loop). Overall, population subdivision was low, but significant (global F(ST) θ = 0.025), and pairwise comparisons indicated that significance was heavily influenced by comparisons between GBL localities and Sandy Lake. Our data indicate that there is no obvious genetic structure among the various basins within GBL (global F(ST) = 0.002) despite the large geographic distances between sampling areas. We found evidence of low levels of contemporary gene flow among arms within GBL, but not between Sandy Lake and GBL. Coalescent analyses suggested that some historical gene flow occurred among arms within GBL and between GBL and Sandy Lake. It appears, therefore, that contemporary (ongoing dispersal and gene flow) and historical (historical gene flow and large founding and present-day effective population sizes) factors contribute to the lack of neutral genetic structure in GBL. Overall, our results illustrate the importance of history (e.g., post-glacial colonization) and contemporary dispersal ecology in shaping genetic population structure of Arctic faunas and provide a better understanding of the evolutionary ecology of long-lived salmonids in pristine, interconnected habitats.
Analysis of flavonoids and the flavonoid structural genes in brown fiber of upland cotton.

PubMed

Feng, Hongjie; Tian, Xinhui; Liu, Yongchang; Li, Yanjun; Zhang, Xinyu; Jones, Brian Joseph; Sun, Yuqiang; Sun, Jie

2013-01-01

As a result of changing consumer preferences, cotton (Gossypium hirsutum L.) from varieties with naturally colored fibers is becoming increasingly sought after in the textile industry. The molecular mechanisms leading to colored fiber development are still largely unknown, although it is expected that the color is derived from flavonoids. First, four key genes of the flavonoid biosynthetic pathway in cotton (GhC4H, GhCHS, GhF3'H, and GhF3'5'H) were cloned, and their expression profiles during the development of brown and white cotton fibers were studied by qRT-PCR. Then, the concentrations of four components of the flavonoid biosynthetic pathway (naringenin, quercetin, kaempferol and myricetin) in brown and white fibers were analyzed at different developmental stages by HPLC. The predicted proteins of the four flavonoid structural genes exhibit strong sequence similarity to their counterparts in various plant species. Transcript levels for all four genes were considerably higher in developing brown fibers than in white fibers from a near isogenic line (NIL). The contents of the four flavonoids (naringenin, quercetin, kaempferol and myricetin) were significantly higher in brown than in white fibers and corresponded to the biosynthetic gene expression levels. Flavonoid structural gene expression and flavonoid metabolism are important in the development of pigmentation in brown cotton fibers.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Spreitzer, Robert Joseph

Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) catalyzes the rate-limiting step of CO2 fixation in photosynthesis. However, it is a slow enzyme, and O2 competes with CO2 at the active site. Oxygenation initiates the photorespiratory pathway, which also results in the loss of CO2. If carboxylation could be increased or oxygenation decreased, an increase in net CO2 fixation would be realized. Because Rubisco provides the primary means by which carbon enters all life on earth, there is much interest in engineering Rubisco to increase the production of food and renewable energy. Rubisco is located in the chloroplasts of plants, and it is composed of two subunits. Much is known about the chloroplast-gene-encoded large subunit (rbcL gene), which contains the active site, but much less is known about the role of the nuclear-gene-encoded small subunit in Rubisco function (rbcS gene). Both subunits are coded by multiple genes in plants, which makes genetic engineering difficult. In the eukaryotic, green alga Chlamydomonas reinhardtii, it has been possible to eliminate all the Rubisco genes. These Rubisco-less mutants can be maintained by providing acetate as an alternative carbon source. In this project, focus has been placed on determining whether the small subunit might be a better genetic-engineering target for improving Rubisco. Analysis of a variable-loop structure (βA-βB loop) of the small subunit by genetic selection, directed mutagenesis, and construction of chimeras has shown that the small subunit can influence CO2/O2 specificity. X-ray crystal structures of engineered chimeric-loop enzymes have indicated that additional residues and regions of the small subunit may also contribute to Rubisco function. Structural dynamics of the small-subunit carboxyl terminus were also investigated. Alanine-scanning mutagenesis of the most-conserved small-subunit residues has identified a possible structural pathway between the small-subunit βA-βB loop and alpha-helix 8 of the large-subunit α/β-barrel active site. Hybrid enzymes composed of plant small subunits and Chlamydomonas large subunits were also created, and these enzymes show increased CO2/O2 specificity, further indicating that small subunits may be the key for ultimately engineering an improved Rubisco enzyme.
Single cell Hi-C reveals cell-to-cell variability in chromosome structure

PubMed Central

Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter

2013-01-01

Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610
The complete chloroplast genome sequence of Hibiscus syriacus.

PubMed

Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

2016-09-01

The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.
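The region lengths and gene counts reported above can be checked with simple arithmetic. The short sketch below assumes the usual quadripartite layout of a plastome (one large single-copy region, one small single-copy region and two copies of the inverted repeat); the function and variable names are illustrative only.

```python
# Region lengths in bp, taken from the abstract above.
LSC = 89_698   # large single-copy region
SSC = 19_831   # small single-copy region
IR = 25_745    # one inverted repeat; the genome carries two copies

def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome with two inverted repeats."""
    return lsc + ssc + 2 * ir

total_length = quadripartite_length(LSC, SSC, IR)

# Gene counts by class, also from the abstract.
gene_counts = {"protein-coding": 81, "rRNA": 4, "tRNA": 29}
total_genes = sum(gene_counts.values())

print(total_length)  # 161019, matching the reported genome size
print(total_genes)   # 114, matching the reported number of annotated genes
```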

Two-component signal transduction systems of Xanthomonas spp.: a lesson from genomics.

PubMed

Qian, Wei; Han, Zhong-Ji; He, Chaozu

2008-02-01

The two-component signal transduction systems (TCSTSs), consisting of a histidine kinase sensor (HK) and a response regulator (RR), are the dominant molecular mechanisms by which prokaryotes sense and respond to environmental stimuli. Genomes of Xanthomonas generally contain a large repertoire of TCSTS genes (approximately 92 to 121 for each genome), which encode diverse structural groups of HKs and RRs. Although a core set of 70 TCSTS genes (about two-thirds of the total), which accumulate point mutations at a slow rate, is shared by these genomes, the other genes, especially hybrid HKs, have experienced extensive genetic recombination, including genomic rearrangement, gene duplication, addition or deletion, and fusion or fission. These recombination events potentially promote the efficiency and complexity of TCSTSs in regulating gene expression. In addition, our analysis suggests that a co-evolutionary model, rather than a selfish operon model, is the major mechanism for the maintenance and microevolution of TCSTS genes in the genomes of Xanthomonas. Genomic annotation, secondary protein structure prediction, and comparative genomic analyses of TCSTS genes reviewed here provide insights into our understanding of signal networks in these important phytopathogenic bacteria.
Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

PubMed Central

Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

2010-01-01

A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057
An RNAi Screen for Genes Involved in Nanoscale Protrusion Formation on Corneal Lens in Drosophila melanogaster.

PubMed

Minami, Ryunosuke; Sato, Chiaki; Yamahama, Yumi; Kubo, Hideo; Hariyama, Takahiko; Kimura, Ken-Ichi

2016-12-01

The "moth-eye" structure, which is observed on the surface of corneal lens in several insects, supports anti-reflective and self-cleaning functions due to nanoscale protrusions known as corneal nipples. Although the morphology and function of the "moth-eye" structure are relatively well studied, the mechanism of protrusion formation from cell-secreted substances is unknown. In Drosophila melanogaster, a compound eye consists of approximately 800 facets, the surface of which is formed by the corneal lens with nanoscale protrusions. In the present study, we sought to identify genes involved in "moth-eye" structure formation in order to elucidate the developmental mechanism of the protrusions in Drosophila. We re-examined the aberrant patterns in classical glossy-eye mutants by scanning electron microscope and classified the aberrant patterns into groups. Next, we screened genes encoding putative structural cuticular proteins and genes involved in cuticular formation using eye specific RNAi silencing methods combined with the Gal4/UAS expression system. We identified 12 of 100 candidate genes, such as cuticular proteins family genes (Cuticular protein 23B and Cuticular protein 49Ah), cuticle secretion-related genes (Syntaxin 1A and Sec61 β subunit), ecdysone signaling and biosynthesis-related genes (Ecdysone receptor, Blimp-1, and shroud), and genes involved in cell polarity/cell architecture (Actin 5C, shotgun, armadillo, discs large1, and coracle). Although some of the genes we identified may affect corneal protrusion formation indirectly through general patterning defects in eye formation, these initial findings have encouraged us to more systematically explore the precise mechanisms underlying the formation of nanoscale protrusions in Drosophila.
Diversification of Root Hair Development Genes in Vascular Plants.

PubMed

Huang, Ling; Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui; Schiefelbein, John

2017-07-01

The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. © 2017 American Society of Plant Biologists. All Rights Reserved.
Diversification of Root Hair Development Genes in Vascular Plants

PubMed Central

Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui

2017-01-01

The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. PMID:28487476
Population genetic structure in migratory sandhill cranes and the role of Pleistocene glaciations.

PubMed

Jones, Kenneth L; Krapu, Gary L; Brandt, David A; Ashley, Mary V

2005-08-01

Previous studies of migratory sandhill cranes (Grus canadensis) have made significant progress explaining evolution of this group at the species scale, but have been unsuccessful in explaining the geographically partitioned variation in morphology seen on the population scale. The objectives of this study were to assess the population structure and gene flow patterns among migratory sandhill cranes using microsatellite DNA genotypes and mitochondrial DNA haplotypes of a large sample of individuals across three populations. In particular, we were interested in evaluating the roles of Pleistocene glaciation events and postglaciation gene flow in shaping the present-day population structure. Our results indicate substantial gene flow across regions of the Midcontinental population that are geographically adjacent, suggesting that gene flow for most of the region follows an isolation-by-distance model. Male-mediated gene flow and strong female philopatry may explain the differing patterns of nuclear and mitochondrial variation. Taken in context with precise geographical information on breeding locations, the morphologic and microsatellite DNA variation shows a gradation from the Arctic-nesting subspecies G. c. canadensis to the nonArctic subspecies G. c. tabida. Analogous to other Arctic-nesting birds, it is probable that the population structure seen in Midcontinental sandhill cranes reflects the result of postglacial secondary contact. Our data suggest that subspecies of migratory sandhills experience significant gene flow and therefore do not represent distinct and independent genetic entities.
Large-Scale Collection and Analysis of Full-Length cDNAs from Brachypodium distachyon and Integration with Pooideae Sequence Resources

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2013-01-01

A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698
RNA-Seq analysis of yak ovary: improving yak gene structure information and mining reproduction-related genes.

PubMed

Lan, DaoLiang; Xiong, XianRong; Wei, YanLi; Xu, Tong; Zhong, JinCheng; Zhi, XiangDong; Wang, Yong; Li, Jian

2014-09-01

RNA-Seq, a high-throughput (HT) sequencing technique, has been used effectively in large-scale transcriptomic studies, and is particularly useful for improving gene structure information and mining of new genes. In this study, RNA-Seq HT technology was employed to analyze the transcriptome of yak ovary. After Illumina-Solexa deep sequencing, 26826516 clean reads with a total of 4828772880 bp were obtained from the ovary library. Alignment analysis showed that 16992 yak genes mapped to the yak genome and 3734 of these genes were involved in alternative splicing. Gene structure refinement analysis showed that 7340 genes that were annotated in the yak genome could be extended at the 5' or 3' ends based on the alignments between the transcripts and the genome sequence. Novel transcript prediction analysis identified 6321 new transcripts with lengths ranging from 180 to 14884 bp, and 2267 of them were predicted to code proteins. BLAST analysis of the new transcripts showed that 1200–4933 mapped to the non-redundant (nr), nucleotide (nt) and/or SwissProt sequence databases. Comparative statistical analysis of the new mapped transcripts showed that the majority of them were similar to genes in Bos taurus (41.4%), Bos grunniens mutus (33.0%), Ovis aries (6.3%), Homo sapiens (2.8%), Mus musculus (1.6%) and other species. Functional analysis showed that these expressed genes were involved in various Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes pathways. GO analysis of the new transcripts found that the largest proportion of them was associated with reproduction. The results of this study will provide a basis for describing the normal transcriptome map of yak ovary and for future studies on yak breeding performance. Moreover, the results confirmed that RNA-Seq HT technology is highly advantageous in improving gene structure information and mining of new genes, as well as in providing valuable data to expand the yak genome information.
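A few summary statistics follow directly from the counts reported above. The sketch below derives the mean read length, the share of mapped genes showing alternative splicing, and the share of new transcripts predicted to be protein coding; the variable names are illustrative, and only numbers stated in the abstract are used.

```python
# Counts taken from the abstract above.
clean_reads = 26_826_516
total_bases = 4_828_772_880
mapped_genes = 16_992
spliced_genes = 3_734
new_transcripts = 6_321
coding_new_transcripts = 2_267

# Mean bases per clean read; the reported totals divide evenly.
mean_read_length = total_bases / clean_reads

# Fraction of mapped genes with evidence of alternative splicing.
spliced_fraction = spliced_genes / mapped_genes

# Fraction of novel transcripts predicted to code for proteins.
coding_fraction = coding_new_transcripts / new_transcripts

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"alternatively spliced: {spliced_fraction:.1%}")
print(f"novel transcripts predicted coding: {coding_fraction:.1%}")
```

The reported base total is exactly 180 times the read count, which suggests a uniform read length after cleaning, although the abstract does not state the read layout.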
Systematic analysis of viral genes responsible for differential virulence between American and Australian West Nile virus strains.

PubMed

Setoh, Yin Xiang; Prow, Natalie A; Rawle, Daniel J; Tan, Cindy Si En; Edmonds, Judith H; Hall, Roy A; Khromykh, Alexander A

2015-06-01

A variant Australian West Nile virus (WNV) strain, WNVNSW2011, emerged in 2011 causing an unprecedented outbreak of encephalitis in horses in south-eastern Australia. However, no human cases associated with this strain have yet been reported. Studies using mouse models for WNV pathogenesis showed that WNVNSW2011 was less virulent than the human-pathogenic American strain of WNV, New York 99 (WNVNY99). To identify viral genes and mutations responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, we constructed chimeric viruses with substitution of large genomic regions coding for the structural genes, non-structural genes and untranslated regions, as well as seven individual non-structural gene chimeras, using a modified circular polymerase extension cloning method. Our results showed that the complete non-structural region of WNVNSW2011, when substituted with that of WNVNY99, significantly enhanced viral replication and the ability to suppress type I IFN response in cells, resulting in higher virulence in mice. Analysis of the individual non-structural gene chimeras showed a predominant contribution of WNVNY99 NS3 to increased virus replication and evasion of IFN response in cells, and to virulence in mice. Other WNVNY99 non-structural proteins (NS2A, NS4B and NS5) were shown to contribute to the modulation of IFN response. Thus a combination of non-structural proteins, likely NS2A, NS3, NS4B and NS5, is primarily responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, and accumulative mutations within these proteins would likely be required for the Australian WNVNSW2011 strain to become significantly more virulent. © 2015 The Authors.
Pseudoscorpion mitochondria show rearranged genes and genome-wide reductions of RNA gene sizes and inferred structures, yet typical nucleotide composition bias

PubMed Central

2012-01-01

Background: Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results: We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion: The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

PubMed Central

Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

2014-01-01

The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with the birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled that in the eutherians compared but was absent in avian MHC, suggesting that the saltwater crocodile MHC has a gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and that of other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521
Papain-like cysteine proteases in Carica papaya: lineage-specific gene duplication and expansion.

PubMed

Liu, Juan; Sharma, Anupma; Niewiara, Marie Jamille; Singh, Ratnesh; Ming, Ray; Yu, Qingyi

2018-01-06

Papain-like cysteine proteases (PLCPs), a large group of cysteine proteases structurally related to papain, play important roles in plant development, senescence, and defense responses. Papain, the first cysteine protease whose structure was determined by X-ray crystallography, plays a crucial role in protecting papaya from herbivorous insects. Except for the four major PLCPs purified and characterized in papaya latex, the PLCPs in the papaya genome are largely unknown. We identified 33 PLCP genes in the papaya genome. Phylogenetic analysis clearly separated plant PLCP genes into nine subfamilies. PLCP genes are not equally distributed among the nine subfamilies and the number of PLCPs in each subfamily does not increase or decrease proportionally among the seven selected plant species. Papaya showed clear lineage-specific gene expansion in subfamily III. Interestingly, all four major PLCPs purified from papaya latex, including papain, chymopapain, glycyl endopeptidase and caricain, were grouped into the lineage-specific expansion branch in subfamily III. Mapping PLCP genes on chromosomes of five plant species revealed that lineage-specific expansions of PLCP genes were mostly derived from tandem duplications. We estimated the divergence time of papaya PLCP genes of subfamily III. The major duplication events leading to lineage-specific expansion of papaya PLCP genes in subfamily III were estimated at 48 MYA, 34 MYA, and 16 MYA. The gene expression patterns of the papaya PLCP genes in different tissues were assessed by transcriptome sequencing and qRT-PCR. Most of the papaya PLCP genes of subfamily III were expressed at high levels in leaf and green fruit tissues. Tandem duplications played the dominant role in affecting copy number of PLCPs in plants. Significant variations in size of the PLCP subfamilies among species may reflect genetic adaptation of plant species to different environments. 
The lineage-specific expansion of papaya PLCPs of subfamily III might have been promoted by the continuous reciprocal selective effects of herbivore attack and plant defense.
The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

PubMed Central

Mello, C.V.; Clayton, D.F.

2014-01-01

High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curation, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907
Impact of target mRNA structure on siRNA silencing efficiency: A large-scale study.

PubMed

Gredell, Joseph A; Berger, Angela K; Walton, S Patrick

2008-07-01

The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5'- and 3'-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5'-end or 3'-end were silenced, on average, approximately 10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. 
(c) 2008 Wiley Periodicals, Inc.
Impact of target mRNA structure on siRNA silencing efficiency: a large-scale study

PubMed Central

Gredell, Joseph A.; Berger, Angela K.; Walton, S. Patrick

2009-01-01

The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5’- and 3’-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5’-end or 3’-end were silenced, on average, ~10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. PMID:18306428
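The structure-based classification of target sites described in this abstract can be illustrated with a short sketch. This is not the authors' code. It assumes the predicted minimum free energy structure is available as a dot-bracket string in which `.` marks an unpaired base, and the function name `classify_target`, the 19-nt site length, the 4-nt end window, and the category labels are all hypothetical choices made for illustration.

```python
def classify_target(structure: str, start: int, length: int = 19,
                    end_window: int = 4) -> str:
    """Classify an mRNA target site by which parts are predicted unpaired.

    `structure` is a dot-bracket string written 5'->3'; '.' is an unpaired
    base and any other character is treated as paired. The site length,
    end window, and labels are illustrative assumptions.
    """
    if end_window <= 0 or 2 * end_window >= length:
        raise ValueError("end_window must leave a non-empty center region")
    site = structure[start:start + length]
    if start < 0 or len(site) < length:
        raise ValueError("target site falls outside the structure")

    # Are all bases in the first / last `end_window` positions unpaired?
    five_open = all(c == "." for c in site[:end_window])
    three_open = all(c == "." for c in site[-end_window:])

    # Fraction of unpaired bases in the remaining central region.
    center = site[end_window:-end_window]
    center_open = center.count(".") / len(center) > 0.5

    if five_open and three_open:
        return "both_ends_unpaired"
    if five_open:
        return "5prime_unpaired"
    if three_open:
        return "3prime_unpaired"
    if center_open:
        return "center_unpaired"
    return "mostly_paired"
```

A caller would pass the dot-bracket string for the full-length transcript together with the 0-based start of each candidate site, then compare silencing data across the resulting classes.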
Insights into structural variations and genome rearrangements in prokaryotic genomes.

PubMed

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

PubMed

Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

2014-12-01

Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).
Ubiquitous and gene-specific regulatory 5' sequences in a sea urchin histone DNA clone coding for histone protein variants.

PubMed Central

Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L

1980-01-01

The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes is identical to that in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs; a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clone h19 and h22 are compared areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone genes. PMID:7443547
Crystal Structure of Borrelia turicatae protein, BTA121, a differentially regulated gene in the tick-mammalian transmission cycle of relapsing fever spirochetes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Zhipu; Kelleher, Alan J.; Darwiche, Rabih

Tick-borne relapsing fever (RF) borreliosis is a neglected disease that is often misdiagnosed. RF species circulating in the United States include Borrelia turicatae, which is transmitted by argasid ticks. Environmental adaptation by RF Borrelia is poorly understood; however, our previous studies indicated differential regulation of B. turicatae genes localized on the 150 kb linear megaplasmid during the tick-mammalian transmission cycle, including bta121. This gene is up-regulated by B. turicatae in the tick versus the mammal, and the encoded protein (BTA121) is predicted to be surface localized. The structure of BTA121 was solved by single-wavelength anomalous dispersion (SAD) using selenomethionine-derivative protein. The topology of BTA121 is unique with four helical domains organized into two helical bundles. Due to the sequence similarity of several genes on the megaplasmid, BTA121 can serve as a model for their tertiary structures. BTA121 has large interconnected tunnels and cavities that can accommodate ligands, notably long parallel helices, which have a large hydrophobic central pocket. Preliminary in-vitro studies suggest that BTA121 binds lipids, notably palmitate with a similar order of binding affinity as tablysin-15, a known palmitate-binding protein. The reported data will guide mechanistic studies to determine the role of BTA121 in the tick-mammalian transmission cycle of B. turicatae.
Common fragile sites (CFS) and extremely large CFS genes are targets for human papillomavirus integrations and chromosome rearrangements in oropharyngeal squamous cell carcinoma.

PubMed

Gao, Ge; Johnson, Sarah H; Vasmatzis, George; Pauley, Christina E; Tombers, Nicole M; Kasperbauer, Jan L; Smith, David I

2017-01-01

Common fragile sites (CFS) are chromosome regions that are prone to form gaps or breaks in response to DNA replication stress. They are often found as hotspots for sister chromatid exchanges, deletions, and amplifications in different cancers. Many of the CFS regions are found to span genes whose genomic sequence is greater than 1 Mb, some of which have been demonstrated to function as important tumor suppressors. CFS regions are also hotspots for human papillomavirus (HPV) integrations in cervical cancer. We used mate-pair sequencing to examine HPV integration events and chromosomal structural variations in 34 oropharyngeal squamous cell carcinoma (OPSCC). We used endpoint PCR and Sanger sequencing to validate each HPV integration event and found HPV integrations preferentially occurred within CFS regions similar to what is observed in cervical cancer. We also found that many of the chromosomal alterations detected also occurred at or near the cytogenetic location of CFSs. Several large genes were also found to be recurrent targets of rearrangements, independent of HPV integrations, including CSMD1 (2.1Mb), LRP1B (1.9Mb), and LARGE1 (0.7Mb). Sanger sequencing revealed that the nucleotide sequences near to identified junction sites contained repetitive and AT-rich sequences that were shown to have the potential to form stem-loop DNA secondary structures that might stall DNA replication fork progression during replication stress. This could then cause increased instability in these regions which could lead to cancer development in human cells. Our findings suggest that CFSs and some specific large genes appear to play important roles in OPSCC. © 2016 Wiley Periodicals, Inc.

A curated catalog of canine and equine keratin genes

PubMed Central

Pujar, Shashikant; McGarvey, Kelly M.; Welle, Monika; Galichet, Arnaud; Müller, Eliane J.; Pruitt, Kim D.; Leeb, Tosso

2017-01-01

Keratins represent a large protein family with essential structural and functional roles in epithelial cells of skin, hair follicles, and other organs. During evolution the genes encoding keratins have undergone multiple rounds of duplication and humans have two clusters with a total of 55 functional keratin genes in their genomes. Due to the high similarity between different keratin paralogs and species-specific differences in gene content, the currently available keratin gene annotation in species with draft genome assemblies such as dog and horse is still imperfect. We compared the National Center for Biotechnology Information (NCBI) (dog annotation release 103, horse annotation release 101) and Ensembl (release 87) gene predictions for the canine and equine keratin gene clusters to RNA-seq data that were generated from adult skin of five dogs and two horses and from adult hair follicle tissue of one dog. Taking into consideration the knowledge on the conserved exon/intron structure of keratin genes, we annotated 61 putatively functional keratin genes in each of the dog and horse genomes. Subsequently, curators in the RefSeq group at NCBI reviewed their annotation of keratin genes in the dog and horse genomes (Annotation Release 104 and Annotation Release 102, respectively) and updated annotation and gene nomenclature of several keratin genes. The updates are now available in the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene). PMID:28846680
Immune and stress responses in oysters with insights on adaptation.

PubMed

Guo, Ximing; He, Yan; Zhang, Linlin; Lelong, Christophe; Jouaux, Aude

2015-09-01

Oysters are representative bivalve molluscs that are widely distributed in the world's oceans. As successful colonizers of estuaries and intertidal zones, oysters are remarkably resilient against harsh environmental conditions including wide fluctuations in temperature and salinity as well as prolonged air exposure. Oysters have no adaptive immunity but can thrive in microbe-rich estuaries as filter-feeders. These unique adaptations make oysters interesting models to study the evolution of host-defense systems. Recent advances in genomic studies including sequencing of the oyster genome have provided insights into oysters' immune and stress responses underlying their amazing resilience. Studies show that oyster genomes are highly polymorphic and complex, which may be key to their resilience. The oyster genome has a large gene repertoire that is enriched for immune and stress response genes. Thousands of genes are involved in oysters' immune and stress responses, through complex interactions, with many gene families expanded showing high sequence, structural and functional diversity. The high diversity of immune receptors and effectors may provide oysters with enhanced specificity in immune recognition and response to cope with diverse pathogens in the absence of adaptive immunity. Some members of expanded immune gene families have diverged to function at different temperatures and salinities or assumed new roles in abiotic stress response. Most canonical innate immunity pathways are conserved in oysters and supported by a large number of diverse and often novel genes. The great diversity in immune and stress response genes exhibited by expanded gene families as well as high sequence and structural polymorphisms may be central to oysters' adaptation to highly stressful and widely changing environments. Copyright © 2015 Elsevier Ltd. All rights reserved.
Structural characteristics of ScBx genes controlling the biosynthesis of hydroxamic acids in rye (Secale cereale L.).

PubMed

Bakera, Beata; Makowska, Bogna; Groszyk, Jolanta; Niziołek, Michał; Orczyk, Wacław; Bolibok-Brągoszewska, Hanna; Hromada-Judycka, Aneta; Rakoczy-Trojanowska, Monika

2015-08-01

Benzoxazinoids (BX) are major secondary metabolites of gramineous plants that play an important role in disease resistance and allelopathy. They also have many other unique properties including anti-bacterial and anti-fungal activity, and the ability to reduce alpha-amylase activity. The biosynthesis and modification of BX are controlled by the genes Bx1 to Bx10, GT and glu, and the majority of these Bx genes have been mapped in maize, wheat and rye. However, the genetic basis of BX biosynthesis remains largely uncharacterized apart from some data from maize and wheat. The aim of this study was to isolate, sequence and characterize five genes (ScBx1, ScBx2, ScBx3, ScBx4 and ScBx5) encoding enzymes involved in the synthesis of DIBOA, an important defense compound of rye. Using a modified 3D procedure of BAC library screening, seven BAC clones containing all of the ScBx genes were isolated and sequenced. Bioinformatic analyses of the resulting contigs were used to examine the structure and other features of these genes, including their promoters, introns and 3'UTRs. Comparative analysis showed that the ScBx genes are similar to those of other Poaceae species, especially to the TaBx genes. The polymorphisms present both in the coding sequences and non-coding regions of ScBx in relation to other Bx genes are predicted to have an impact on the expression, structure and properties of the encoded proteins.
The whole chloroplast genome of wild rice (Oryza australiensis).

PubMed

Wu, Zhiqiang; Ge, Song

2016-01-01

The whole chloroplast genome of wild rice (Oryza australiensis) is characterized in this study. The genome size is 135,224 bp, exhibiting a typical circular structure including a pair of 25,776 bp inverted repeats (IRa,b) separated by a large single-copy region (LSC) of 82,212 bp and a small single-copy region (SSC) of 12,470 bp. The overall GC content of the genome is 38.95%. 110 unique genes were annotated, including 76 protein-coding genes, 4 ribosomal RNA genes, and 30 tRNA genes. Among these, 18 are duplicated in the inverted repeat regions, 13 genes contain one intron, and 2 genes (rps12 and ycf3) have two introns.
Analysis of Craniocardiac Malformations in Xenopus using Optical Coherence Tomography

PubMed Central

Deniz, Engin; Jonas, Stephan; Hooper, Michael; Griffin, John N.; Choma, Michael A.; Khokha, Mustafa K.

2017-01-01

Birth defects affect 3% of children in the United States. Among the birth defects, congenital heart disease and craniofacial malformations are major causes of mortality and morbidity. Unfortunately, the genetic mechanisms underlying craniocardiac malformations remain largely uncharacterized. To address this, human genomic studies are identifying sequence variations in patients, resulting in numerous candidate genes. However, the molecular mechanisms of pathogenesis for most candidate genes are unknown. Therefore, there is a need for functional analyses in rapid and efficient animal models of human disease. Here, we coupled the frog Xenopus tropicalis with Optical Coherence Tomography (OCT) to create a fast and efficient system for testing craniocardiac candidate genes. OCT can image cross-sections of microscopic structures in vivo at resolutions approaching histology. Here, we identify optimal OCT imaging planes to visualize and quantitate Xenopus heart and facial structures establishing normative data. Next we evaluate known human congenital heart diseases: cardiomyopathy and heterotaxy. Finally, we examine craniofacial defects by a known human teratogen, cyclopamine. We recapitulate human phenotypes readily and quantify the functional and structural defects. Using this approach, we can quickly test human craniocardiac candidate genes for phenocopy as a critical first step towards understanding disease mechanisms of the candidate genes. PMID:28195132
The gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis contains a group I intron.

PubMed Central

De Wachter, R; Neefs, J M; Goris, A; Van de Peer, Y

1992-01-01

The nucleotide sequence of the gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis was determined. It revealed the presence of a group I intron with a length of 411 nucleotides. This is the third occurrence of such an intron discovered in a small subunit rRNA gene encoded by a eukaryotic nuclear genome. The other two occurrences are in Pneumocystis carinii, a fungus of uncertain taxonomic status, and Ankistrodesmus stipitatus, a green alga. The nucleotides of the conserved core structure of 101 group I intron sequences present in different genes and genome types were aligned and their evolutionary relatedness was examined. This revealed a cluster including all group I introns hitherto found in eukaryotic nuclear genes coding for small and large subunit rRNAs. A secondary structure model was designed for the area of the Ustilago maydis small ribosomal subunit RNA precursor where the intron is situated. It shows that the internal guide sequence pairing with the intron boundaries fits between two helices of the small subunit rRNA, and that minimal rearrangement of base pairs suffices to achieve the definitive secondary structure of the 18S rRNA upon splicing. PMID:1561081
The nuclear lamina as a gene-silencing hub.

PubMed

Shevelyov, Yuri Y; Nurminsky, Dmitry I

2012-01-01

There is accumulating evidence that the nuclear periphery is a transcriptionally repressive compartment. A surprisingly large fraction of the genome is either in transient or permanent contact with nuclear envelope, where the majority of genes are maintained in a silent state, waiting to be awakened during cell differentiation. The integrity of the nuclear lamina and the histone deacetylase activity appear to be essential for gene repression at the nuclear periphery. However, the molecular mechanisms of silencing, as well as the events that lead to the activation of lamina-tethered genes, require further elucidation. This review summarizes recent advances in understanding of the mechanisms that link nuclear architecture, local chromatin structure, and gene regulation.
How to train your microbe: methods for dynamically characterizing gene networks

PubMed Central

Castillo-Hair, Sebastian M.; Igoshin, Oleg A.; Tabor, Jeffrey J.

2015-01-01

Gene networks regulate biological processes dynamically. However, researchers have largely relied upon static perturbations, such as growth media variations and gene knockouts, to elucidate gene network structure and function. Thus, much of the regulation on the path from DNA to phenotype remains poorly understood. Recent studies have utilized improved genetic tools, hardware, and computational control strategies to generate precise temporal perturbations outside and inside of live cells. These experiments have, in turn, provided new insights into the organizing principles of biology. Here, we introduce the major classes of dynamical perturbations that can be used to study gene networks, and discuss technologies available for creating them in a wide range of microbial pathways. PMID:25677419
Sex-dependent association of common variants of microcephaly genes with brain structure.

PubMed

Rimol, Lars M; Agartz, Ingrid; Djurovic, Srdjan; Brown, Andrew A; Roddey, J Cooper; Kähler, Anna K; Mattingsdal, Morten; Athanasiu, Lavinia; Joyner, Alexander H; Schork, Nicholas J; Halgren, Eric; Sundet, Kjetil; Melle, Ingrid; Dale, Anders M; Andreassen, Ole A

2010-01-05

Loss-of-function mutations in the genes associated with primary microcephaly (MCPH) reduce human brain size by about two-thirds, without producing gross abnormalities in brain organization or physiology and leaving other organs largely unaffected [Woods CG, et al. (2005) Am J Hum Genet 76:717-728]. There is also evidence suggesting that MCPH genes have evolved rapidly in primates and humans and have been subjected to selection in recent human evolution [Vallender EJ, et al. (2008) Trends Neurosci 31:637-644]. Here, we show that common variants of MCPH genes account for some of the common variation in brain structure in humans, independently of disease status. We investigated the correlations of SNPs from four MCPH genes with brain morphometry phenotypes obtained with MRI. We found significant, sex-specific associations between common, nonexonic, SNPs of the genes CDK5RAP2, MCPH1, and ASPM, with brain volume or cortical surface area in an ethnically homogenous Norwegian discovery sample (n = 287), including patients with mental illness. The most strongly associated SNP findings were replicated in an independent North American sample (n = 656), which included patients with dementia. These results are consistent with the view that common variation in brain structure is associated with genetic variants located in nonexonic, presumably regulatory, regions.
Contrasting roles for MyoD in organizing myogenic promoter structures during embryonic skeletal muscle development.

PubMed

Cho, Ok Hyun; Mallappa, Chandrashekara; Hernández-Hernández, J Manuel; Rivera-Pérez, Jaime A; Imbalzano, Anthony N

2015-01-01

Among the complexities of skeletal muscle differentiation is a temporal distinction in the onset of expression of different lineage-specific genes. The lineage-determining factor MyoD is bound to myogenic genes at the onset of differentiation whether gene activation is immediate or delayed. How temporal regulation of differentiation-specific genes is established remains unclear. Using embryonic tissue, we addressed the molecular differences in the organization of the myogenin and muscle creatine kinase (MCK) gene promoters by examining regulatory factor binding as a function of both time and spatial organization during somitogenesis. At the myogenin promoter, binding of the homeodomain factor Pbx1 coincided with H3 hyperacetylation and was followed by binding of co-activators that modulate chromatin structure. MyoD and myogenin binding occurred subsequently, demonstrating that Pbx1 facilitates chromatin remodeling and modification before myogenic regulatory factor binding. At the same time, the MCK promoter was bound by HDAC2 and MyoD, and activating histone marks were largely absent. The association of HDAC2 and MyoD was confirmed by co-immunoprecipitation, proximity ligation assay (PLA), and sequential ChIP. MyoD differentially promotes activated and repressed chromatin structures at myogenic genes early after the onset of skeletal muscle differentiation in the developing mouse embryo. © 2014 Wiley Periodicals, Inc.
The novel product of a five-exon stargazin-related gene abolishes CaV2.2 calcium channel expression

PubMed Central

Moss, Fraser J.; Viard, Patricia; Davies, Anthony; Bertaso, Federica; Page, Karen M.; Graham, Alex; Cantí, Carles; Plumpton, Mary; Plumpton, Christopher; Clare, Jeffrey J.; Dolphin, Annette C.

2002-01-01

We have cloned and characterized a new member of the voltage-dependent Ca2+ channel γ subunit family, with a novel gene structure and striking properties. Unlike the genes of other potential γ subunits identified by their homology to the stargazin gene, CACNG7 is a five-, and not four-exon gene whose mRNA encodes a protein we have designated γ7. Expression of human γ7 has been localized specifically to brain. N-type current through CaV2.2 channels was almost abolished when co-expressed transiently with γ7 in either Xenopus oocytes or COS-7 cells. Furthermore, immunocytochemistry and western blots show that γ7 has this effect by causing a large reduction in expression of CaV2.2 rather than by interfering with trafficking or biophysical properties of the channel. No effect of transiently expressed γ7 was observed on pre-existing endogenous N-type calcium channels in sympathetic neurones. Low homology to the stargazin-like γ subunits, different gene structure and the unique functional properties of γ7 imply that it represents a distinct subdivision of the family of proteins identified by their structural and sequence homology to stargazin. PMID:11927536
Muscle Research and Gene Ontology: New standards for improved data integration.

PubMed

Feltrin, Erika; Campanaro, Stefano; Diehl, Alexander D; Ehler, Elisabeth; Faulkner, Georgine; Fordham, Jennifer; Gardin, Chiara; Harris, Midori; Hill, David; Knoell, Ralph; Laveder, Paolo; Mittempergher, Lorenza; Nori, Alessandra; Reggiani, Carlo; Sorrentino, Vincenzo; Volpe, Pompeo; Zara, Ivano; Valle, Giorgio; Deegan, Jennifer

2009-01-29

The Gene Ontology Project provides structured controlled vocabularies for molecular biology that can be used for the functional annotation of genes and gene products. In a collaboration between the Gene Ontology (GO) Consortium and the muscle biology community, we have made large-scale additions to the GO biological process and cellular component ontologies. The main focus of this ontology development work concerns skeletal muscle, with specific consideration given to the processes of muscle contraction, plasticity, development, and regeneration, and to the sarcomere and membrane-delimited compartments. Our aims were to update the existing structure to reflect current knowledge, and to resolve, in an accommodating manner, the ambiguity in the language used by the community. The updated muscle terminologies have been incorporated into the GO. There are now 159 new terms covering critical research areas, and 57 existing terms have been improved and reorganized to follow their usage in muscle literature. The revised GO structure should improve the interpretation of data from high-throughput (e.g. microarray and proteomic) experiments in the area of muscle science and muscle disease. We actively encourage community feedback on, and gene product annotation with these new terms. Please visit the Muscle Community Annotation Wiki http://wiki.geneontology.org/index.php/Muscle_Biology.
Early genetic consequences of defaunation in a large-seeded vertebrate-dispersed palm (Syagrus romanzoffiana)

PubMed Central

Giombini, M I; Bravo, S P; Sica, Y V; Tosto, D S

2017-01-01

Plant populations are seriously threatened by anthropogenic habitat disturbance. In particular, defaunation may disrupt plant-disperser mutualisms, thus reducing levels of seed-mediated gene flow and genetic variation in animal-dispersed plants. This may ultimately limit their adaptive potential and ability to cope with environmental change. Tropical forest remnants are typically deprived of medium to large vertebrates upon which many large-seeded plants rely for accomplishing effective seed dispersal. Our main goal was to examine the potential early genetic consequences of the loss of large vertebrates for large-seeded vertebrate-dispersed plants. We compared the genetic variation in early-stage individuals of the large-seeded palm Syagrus romanzoffiana between continuous protected forest and nearby partially defaunated fragments in the Atlantic Forest of South America. Using nine microsatellites, we found lower allelic richness and stronger fine-scale spatial genetic structure in the disturbed area. In addition, the percentage of dispersed recruits around conspecific adults was lower, although not significantly, in the disturbed area (median values: 0.0 vs 14.4%). On the other hand, no evidence of increased inbreeding or reduced pollen-mediated gene flow (selfing rate and diversity of pollen donors) was found in the disturbed area. Our findings are strongly suggestive of some early genetic consequences resulting from the limitation in contemporary gene flow via seeds, but not pollen, in defaunated areas. Plant-disperser mutualisms involving medium–large frugivores, which are seriously threatened in tropical systems, should therefore be protected to warrant the maintenance of seed-mediated gene flow and genetic diversity in large-seeded plants. PMID:28121308
Dendritic silica nanomaterials (KCC-1) with fibrous pore structure possess high DNA adsorption capacity and effectively deliver genes in vitro.

PubMed

Huang, Xiaoxi; Tao, Zhimin; Praskavich, John C; Goswami, Anandarup; Al-Sharab, Jafar F; Minko, Tamara; Polshettiwar, Vivek; Asefa, Tewodros

2014-09-16

The pore size and pore structure of nanoporous materials can affect the materials' physical properties, as well as potential applications in different areas, including catalysis, drug delivery, and biomolecular therapeutics. KCC-1, one of the newest members of silica nanomaterials, possesses fibrous, large pore, dendritic pore networks with wide pore entrances, large pore size distribution, spacious pore volume and large surface area--structural features that are conducive to adsorption and release of large guest molecules and biomacromolecules (e.g., proteins and DNAs). Here, we report the results of our comparative studies of adsorption of salmon DNA in a series of KCC-1-based nanomaterials that are functionalized with different organoamine groups on different parts of their surfaces (channel walls, external surfaces or both). For comparison the results of our studies of adsorption of salmon DNA in similarly functionalized, MCM-41 mesoporous silica nanomaterials with cylindrical pores, some of the most studied silica nanomaterials for drug/gene delivery, are also included. Our results indicate that, despite their relatively lower specific surface area, the KCC-1-based nanomaterials show higher adsorption capacity for DNA than the corresponding MCM-41-based nanomaterials, most likely because of KCC-1's large pores, wide pore mouths, fibrous pore network, and thereby more accessible and amenable structure for DNA molecules to diffuse through. Conversely, the MCM-41-based nanomaterials adsorb much less DNA, presumably because their outer surfaces/cylindrical channel pore entrances can get blocked by the DNA molecules, making the inner parts of the materials inaccessible. Moreover, experiments involving fluorescent dye-tagged DNAs suggest that the amine-grafted KCC-1 materials are better suited for delivering the DNAs adsorbed on their surfaces into cellular environments than their MCM-41 counterparts. Finally, cellular toxicity tests show that the KCC-1-based materials are biocompatible. On the basis of these results, the fibrous and porous KCC-1-based nanomaterials can be said to be more suitable to carry, transport, and deliver DNAs and genes than cylindrical porous nanomaterials such as MCM-41.
Gene coexpression measures in large heterogeneous samples using count statistics.

PubMed

Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

2014-11-18

With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
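The idea of counting local rank patterns can be illustrated with a small sketch. This is a simplified illustration, not the statistics defined by the authors: it assumes contiguous windows of a hypothetical length `k` and counts the windows in which two expression profiles share the same rank pattern, or exactly reversed patterns. The function names `rank_pattern` and `local_pattern_count` are made up for this example.

```python
from typing import Sequence, Tuple


def rank_pattern(values: Sequence[float]) -> Tuple[int, ...]:
    """Return the rank of each value within the window (0 = smallest).

    Ties are broken by position, because Python's sort is stable.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return tuple(ranks)


def local_pattern_count(x: Sequence[float], y: Sequence[float], k: int = 3) -> int:
    """Count contiguous windows of length k where x and y have matching
    (or exactly reversed) rank patterns.

    Assumption: contiguous windows only; this is an illustrative
    simplification, not the published statistic.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    if k < 2 or k > len(x):
        raise ValueError("k must be between 2 and the profile length")
    count = 0
    for start in range(len(x) - k + 1):
        px = rank_pattern(x[start:start + k])
        py = rank_pattern(y[start:start + k])
        py_reversed = tuple(k - 1 - r for r in py)  # negative association
        if px == py or px == py_reversed:
            count += 1
    return count
```

Because only ranks within each window are compared, a single extreme value changes at most the windows that contain it, which is one intuition for why rank-based counts tend to be robust against outliers.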
Ancient trade routes shaped the genetic structure of horses in eastern Eurasia.

PubMed

Warmuth, Vera M; Campana, Michael G; Eriksson, Anders; Bower, Mim; Barker, Graeme; Manica, Andrea

2013-11-01

Animal exchange networks have been shown to play an important role in determining gene flow among domestic animal populations. The Silk Road is one of the oldest continuous exchange networks in human history, yet its effectiveness in facilitating animal exchange across large geographical distances and topographically challenging landscapes has never been explicitly studied. Horses are known to have been traded along the Silk Roads; however, extensive movement of horses in connection with other human activities may have obscured the genetic signature of the Silk Roads. To investigate the role of the Silk Roads in shaping the genetic structure of horses in eastern Eurasia, we analysed microsatellite genotyping data from 455 village horses sampled from 17 locations. Using least-cost path methods, we compared the performance of models containing the Silk Roads as corridors for gene flow with models containing single landscape features. We also determined whether the recent isolation of former Soviet Union countries from the rest of Eurasia has affected the genetic structure of our samples. The overall level of genetic differentiation was low, consistent with historically high levels of gene flow across the study region. The spatial genetic structure was characterized by a significant, albeit weak, pattern of isolation by distance across the continent with no evidence for the presence of distinct genetic clusters. Incorporating landscape features considerably improved the fit of the data; however, when we controlled for geographical distance, only the correlation between genetic differentiation and the Silk Roads remained significant, supporting the effectiveness of this ancient trade network in facilitating gene flow across large geographical distances in a topographically complex landscape. © 2013 John Wiley & Sons Ltd.
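A pattern of isolation by distance is commonly assessed by correlating a matrix of pairwise genetic distances with a matrix of pairwise geographic (or least-cost path) distances. The sketch below shows a basic Mantel permutation test under that assumption; the abstract does not state which test the authors used, and the function names and default of 999 permutations are illustrative choices.

```python
import random
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value).

    Rows and columns of the genetic matrix are permuted together, which
    preserves the dependence among distances that share a population.
    """
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    at_least_as_large = 1  # the observed arrangement counts once
    labels = list(range(n))
    for _ in range(permutations):
        rng.shuffle(labels)
        permuted = [[genetic[labels[i]][labels[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)
```

Controlling for geographic distance, as described in the abstract, would require a partial version of this test that first removes the effect of a third matrix; that extension is not shown here.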
Structural and functional analysis of the finished genome of the recently isolated toxic Anabaena sp. WA102.

PubMed

Brown, Nathan M; Mueller, Ryan S; Shepardson, Jonathan W; Landry, Zachary C; Morré, Jeffrey T; Maier, Claudia S; Hardy, F Joan; Dreher, Theo W

2016-06-13

Very few closed genomes of the cyanobacteria that commonly produce toxic blooms in lakes and reservoirs are available, limiting our understanding of the properties of these organisms. A new anatoxin-a-producing member of the Nostocaceae, Anabaena sp. WA102, was isolated from a freshwater lake in Washington State, USA, in 2013 and maintained in non-axenic culture. The Anabaena sp. WA102 5.7 Mbp genome assembly has been closed with long-read, single-molecule sequencing and separately a draft genome assembly has been produced with short-read sequencing technology. The closed and draft genome assemblies are compared, showing a correlation between long repeats in the genome and the many gaps in the short-read assembly. Anabaena sp. WA102 encodes anatoxin-a biosynthetic genes, as does its close relative Anabaena sp. AL93 (also introduced in this study). These strains are distinguished by differences in the genes for light-harvesting phycobilins, with Anabaena sp. AL93 possessing a phycoerythrocyanin operon. Biologically relevant structural variants in the Anabaena sp. WA102 genome were detected only by long-read sequencing: a tandem triplication of the anaBCD promoter region in the anatoxin-a synthase gene cluster (not triplicated in Anabaena sp. AL93) and a 5-kbp deletion variant present in two-thirds of the population. The genome has a large number of mobile elements (160). Strikingly, there was no synteny with the genome of its nearest fully assembled relative, Anabaena sp. 90. Structural and functional genome analyses indicate that Anabaena sp. WA102 has a flexible genome. Genome closure, which can be readily achieved with long-read sequencing, reveals large scale (e.g., gene order) and local structural features that should be considered in understanding genome evolution and function.
Diversity of herbaceous plants and bacterial communities regulates soil resistome across forest biomes.

PubMed

Hu, Hang-Wei; Wang, Jun-Tao; Singh, Brajesh K; Liu, Yu-Rong; Chen, Yong-Liang; Zhang, Yu-Jing; He, Ji-Zheng

2018-04-24

Antibiotic resistance is ancient and prevalent in natural ecosystems and evolved long before the utilization of synthetic antibiotics started, but factors influencing the large-scale distribution patterns of natural antibiotic resistance genes (ARGs) remain largely unknown. Here, a large-scale investigation over 4000 km was performed to profile soil ARGs, plant communities and bacterial communities from 300 quadrats across five forest biomes with minimal human impact. We detected diverse and abundant ARGs in forests, including over 160 genes conferring resistance to eight major categories of antibiotics. The diversity of ARGs was strongly and positively correlated with the diversity of bacteria, herbaceous plants and mobile genetic elements (MGEs). The ARG composition was strongly correlated with the taxonomic structure of bacteria and herbs. Consistent with this strong correlation, structural equation modelling demonstrated that the positive effects of bacterial and herb communities on ARG patterns were maintained even when simultaneously accounting for multiple drivers (climate, spatial predictors and edaphic factors). These findings suggest a paradigm that the interactions between aboveground and belowground communities shape the large-scale distribution of soil resistomes, providing new knowledge for tackling the emerging environmental antibiotic resistance. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.
Analysis of Flavonoids and the Flavonoid Structural Genes in Brown Fiber of Upland Cotton

PubMed Central

Liu, Yongchang; Li, Yanjun; Zhang, Xinyu; Jones, Brian Joseph; Sun, Yuqiang; Sun, Jie

2013-01-01

Background As a result of changing consumer preferences, cotton (Gossypium hirsutum L.) from varieties with naturally colored fibers is becoming increasingly sought after in the textile industry. The molecular mechanisms leading to colored fiber development are still largely unknown, although it is expected that the color is derived from flavonoids. Experimental Design First, four key genes of the flavonoid biosynthetic pathway in cotton (GhC4H, GhCHS, GhF3′H, and GhF3′5′H) were cloned, and their expression profiles during the development of brown and white cotton fibers were studied by QRT-PCR. Then, the concentrations of four components of the flavonoid biosynthetic pathway (naringenin, quercetin, kaempferol and myricetin) in brown and white fibers were analyzed at different developmental stages by HPLC. Result The predicted proteins of the four flavonoid structural genes exhibit strong sequence similarity to their counterparts in various plant species. Transcript levels for all four genes were considerably higher in developing brown fibers than in white fibers from a near isogenic line (NIL). The contents of four flavonoids (naringenin, quercetin, kaempferol and myricetin) were significantly higher in brown than in white fibers and corresponded to the biosynthetic gene expression levels. Conclusions Flavonoid structural gene expression and flavonoid metabolism are important in the development of pigmentation in brown cotton fibers. PMID:23527031
Genetic and epigenetic alteration among three homoeologous genes of a class E MADS box gene in hexaploid wheat.

PubMed

Shitsukawa, Naoki; Tahira, Chikako; Kassai, Ken-Ichiro; Hirabayashi, Chizuru; Shimizu, Tomoaki; Takumi, Shigeo; Mochida, Keiichi; Kawaura, Kanako; Ogihara, Yasunari; Murai, Koji

2007-06-01

Bread wheat (Triticum aestivum) is a hexaploid species with A, B, and D ancestral genomes. Most bread wheat genes are present in the genome as triplicated homoeologous genes (homoeologs) derived from the ancestral species. Here, we report that both genetic and epigenetic alterations have occurred in the homoeologs of a wheat class E MADS box gene. Two class E genes are identified in wheat, wheat SEPALLATA (WSEP) and wheat LEAFY HULL STERILE1 (WLHS1), which are homologs of Os MADS45 and Os MADS1 in rice (Oryza sativa), respectively. The three wheat homoeologs of WSEP showed similar genomic structures and expression profiles. By contrast, the three homoeologs of WLHS1 showed genetic and epigenetic alterations. The A genome WLHS1 homoeolog (WLHS1-A) had a structural alteration that contained a large novel sequence in place of the K domain sequence. A yeast two-hybrid analysis and a transgenic experiment indicated that the WLHS1-A protein had no apparent function. The B and D genome homoeologs, WLHS1-B and WLHS1-D, respectively, had an intact MADS box gene structure, but WLHS1-B was predominantly silenced by cytosine methylation. Consequently, of the three WLHS1 homoeologs, only WLHS1-D functions in hexaploid wheat. This is a situation where three homoeologs are differentially regulated by genetic and epigenetic mechanisms.

Enzymatic Synthesis of Self-assembled Dicer Substrate RNA Nanostructures for Programmable Gene Silencing.

PubMed

Jang, Bora; Kim, Boyoung; Kim, Hyunsook; Kwon, Hyokyoung; Kim, Minjeong; Seo, Yunmi; Colas, Marion; Jeong, Hansaem; Jeong, Eun Hye; Lee, Kyuri; Lee, Hyukjin

2018-06-08

Enzymatic synthesis of RNA nanostructures is achieved by isothermal rolling circle transcription (RCT). Each arm of RNA nanostructures provides a functional role of Dicer substrate RNA inducing sequence specific RNA interference (RNAi). Three different RNAi sequences (GFP, RFP, and BFP) are incorporated within the three-arm junction RNA nanostructures (Y-RNA). The template and helper DNA strands are designed for the large-scale in vitro synthesis of RNA strands to prepare self-assembled Y-RNA. Interestingly, Dicer processing of Y-RNA is highly influenced by its physical structure and different gene silencing activity is achieved depending on its arm length and overhang. In addition, enzymatic synthesis allows the preparation of various Y-RNA structures using a single DNA template offering on demand regulation of multiple target genes.
Structural and functional partitioning of bread wheat chromosome 3B.

PubMed

Choulet, Frédéric; Alberti, Adriana; Theil, Sébastien; Glover, Natasha; Barbe, Valérie; Daron, Josquin; Pingault, Lise; Sourdille, Pierre; Couloux, Arnaud; Paux, Etienne; Leroy, Philippe; Mangenot, Sophie; Guilhot, Nicolas; Le Gouis, Jacques; Balfourier, Francois; Alaux, Michael; Jamilloux, Véronique; Poulain, Julie; Durand, Céline; Bellec, Arnaud; Gaspin, Christine; Safar, Jan; Dolezel, Jaroslav; Rogers, Jane; Vandepoele, Klaas; Aury, Jean-Marc; Mayer, Klaus; Berges, Hélène; Quesneville, Hadi; Wincker, Patrick; Feuillet, Catherine

2014-07-18

We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits. Copyright © 2014, American Association for the Advancement of Science.
Genetic Structure and Gene Flows within Horses: A Genealogical Study at the French Population Scale

PubMed Central

Pirault, Pauline; Danvy, Sophy; Verrier, Etienne; Leroy, Grégoire

2013-01-01

Since horse breeds constitute populations subject to variable and multiple outcrossing events, we analyzed the genetic structure and gene flows of horses raised in France. We used genealogical data, with a reference population of 547,620 horses born in France between 2002 and 2011, grouped according to 55 breed origins. On average, individuals had 6.3 equivalent generations known. Considering different population levels, the fixation index decreased from an overall species FIT of 1.37% to an average of −0.07% when considering the 55 origins, showing that most horse breeds constitute populations without genetic structure. We illustrate the complexity of gene flows existing among horse breeds: a few populations are closed to foreign influence, but most are subject to various levels of introgression. In particular, Thoroughbred and Arab breeds are largely used as introgression sources, since those two populations together explain 26% of founder origins within the overall horse population. When compared with molecular data, breeds with a small level of coancestry also showed low genetic distance; the gene pool of the breeds was probably impacted by their exchanges of breeding animals. PMID:23630596
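Fixation indices at different population levels are linked by Wright's standard relation (1 − FIT) = (1 − FIS)(1 − FST). The sketch below applies that relation to recover the between-subpopulation component from the other two. The function name and the input values are hypothetical and are not taken from the study.

```python
def fst_from_fit_fis(fit: float, fis: float) -> float:
    """Solve Wright's relation (1 - FIT) = (1 - FIS) * (1 - FST) for FST.

    fit: fixation index of individuals relative to the total population
    fis: fixation index of individuals relative to their subpopulation
    """
    if fis == 1.0:
        raise ValueError("FST is undefined when FIS equals 1")
    return 1.0 - (1.0 - fit) / (1.0 - fis)


# Hypothetical values chosen only to show the arithmetic.
example_fst = fst_from_fit_fis(fit=0.19, fis=0.10)
print(f"FST = {example_fst:.3f}")
```

A within-subpopulation index close to zero, or slightly negative, means that nearly all of the total index is attributable to differentiation between subpopulations rather than to non-random mating within them.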
Population Structure and Gene Flow of the Yellow Anaconda (Eunectes notaeus) in Northern Argentina

PubMed Central

McCartney-Melstad, Evan; Waller, Tomás; Micucci, Patricio A.; Barros, Mariano; Draque, Juan; Amato, George; Mendez, Martin

2012-01-01

Yellow anacondas (Eunectes notaeus) are large, semiaquatic boid snakes found in wetland systems in South America. These snakes are commercially harvested under a sustainable management plan in Argentina, so information regarding population structuring can be helpful for determination of management units. We evaluated genetic structure and migration using partial sequences from the mitochondrial control region and mitochondrial genes cyt-b and ND4 for 183 samples collected within northern Argentina. A group of landscape features and environmental variables including several treatments of temperature and precipitation were explored as potential drivers of observed genetic patterns. We found significant population structure between most putative population comparisons and bidirectional but asymmetric migration in several cases. The configuration of rivers and wetlands was found to be significantly associated with yellow anaconda population structure (IBD), and important for gene flow, although genetic distances were not significantly correlated with the environmental variables used here. More in-depth analyses of environmental data may be needed to fully understand the importance of environmental conditions on population structure and migration. These analyses indicate that our putative populations are demographically distinct and should be treated as such in Argentina's management plan for the harvesting of yellow anacondas. PMID:22675425
Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

PubMed

Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

2014-01-01

Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.
Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

PubMed Central

Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

2014-01-01

Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417
The walk is never random: subtle landscape effects shape gene flow in a continuous white-tailed deer population in the Midwestern United States

USGS Publications Warehouse

Robinson, Stacie J.; Samuel, Michael D.; Lopez, Davin L.; Shelton, Paul

2012-01-01

One of the pervasive challenges in landscape genetics is detecting gene flow patterns within continuous populations of highly mobile wildlife. Understanding population genetic structure within a continuous population can give insights into social structure, movement across the landscape and contact between populations, which influence ecological interactions, reproductive dynamics or pathogen transmission. We investigated the genetic structure of a large population of deer spanning the area of Wisconsin and Illinois, USA, affected by chronic wasting disease. We combined multiscale investigation, landscape genetic techniques and spatial statistical modelling to address the complex questions of landscape factors influencing population structure. We sampled over 2000 deer and used spatial autocorrelation and a spatial principal components analysis to describe the population genetic structure. We evaluated landscape effects on this pattern using a spatial autoregressive model within a model selection framework to test alternative hypotheses about gene flow. We found high levels of genetic connectivity, with gradients of variation across the large continuous population of white-tailed deer. At the fine scale, spatial clustering of related animals was correlated with the amount and arrangement of forested habitat. At the broader scale, impediments to dispersal were important to shaping genetic connectivity within the population. We found significant barrier effects of individual state and interstate highways and rivers. Our results offer an important understanding of deer biology and movement that will help inform the management of this species in an area where overabundance and disease spread are primary concerns.
Cationic lipids: molecular structure/ transfection activity relationships and interactions with biomembranes.

PubMed

Koynova, Rumiana; Tenchov, Boris

2010-01-01

Abstract Synthetic cationic lipids, which form complexes (lipoplexes) with polyanionic DNA, are presently the most widely used constituents of nonviral gene carriers. A large number of cationic amphiphiles have been synthesized and tested in transfection studies. However, due to the complexity of the transfection pathway, no general schemes have emerged for correlating the cationic lipid chemistry with their transfection efficacy and the approaches for optimizing their molecular structures are still largely empirical. Here we summarize data on the relationships between transfection activity and cationic lipid molecular structure and demonstrate that the transfection activity depends in a systematic way on the lipid hydrocarbon chain structure. A number of examples, including a large series of cationic phosphatidylcholine derivatives, show that optimum transfection is displayed by lipids with chain length of approximately 14 carbon atoms and that the transfection efficiency strongly increases with increase of chain unsaturation, specifically upon replacement of saturated with monounsaturated chains.
Large-scale sequence and structural comparisons of human naive and antigen-experienced antibody repertoires.

PubMed

DeKosky, Brandon J; Lungu, Oana I; Park, Daechan; Johnson, Erik L; Charab, Wissam; Chrysostomou, Constantine; Kuroda, Daisuke; Ellington, Andrew D; Ippolito, Gregory C; Gray, Jeffrey J; Georgiou, George

2016-05-10

Elucidating how antigen exposure and selection shape the human antibody repertoire is fundamental to our understanding of B-cell immunity. We sequenced the paired heavy- and light-chain variable regions (VH and VL, respectively) from large populations of single B cells combined with computational modeling of antibody structures to evaluate sequence and structural features of human antibody repertoires at unprecedented depth. Analysis of a dataset comprising 55,000 antibody clusters from CD19(+)CD20(+)CD27(-) IgM-naive B cells, >120,000 antibody clusters from CD19(+)CD20(+)CD27(+) antigen-experienced B cells, and >2,000 RosettaAntibody-predicted structural models across three healthy donors led to a number of key findings: (i) VH and VL gene sequences pair in a combinatorial fashion without detectable pairing restrictions at the population level; (ii) certain VH:VL gene pairs were significantly enriched or depleted in the antigen-experienced repertoire relative to the naive repertoire; (iii) antigen selection increased antibody paratope net charge and solvent-accessible surface area; and (iv) public heavy-chain third complementarity-determining region (CDR-H3) antibodies in the antigen-experienced repertoire showed signs of convergent paired light-chain genetic signatures, including shared light-chain third complementarity-determining region (CDR-L3) amino acid sequences and/or Vκ,λ-Jκ,λ genes. The data reported here address several longstanding questions regarding antibody repertoire selection and development and provide a benchmark for future repertoire-scale analyses of antibody responses to vaccination and disease.
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript

PubMed Central

Rose, Dominic; Stadler, Peter F.

2011-01-01

Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny, analyze patterns of sequence conservation, and examine the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
CCDB: a curated database of genes involved in cervix cancer.

PubMed

Agarwal, Subhash M; Raghav, Dhwani; Singh, Harinder; Raghava, G P S

2011-01-01

The Cervical Cancer gene DataBase (CCDB, http://crdd.osdd.net/raghava/ccdb) is a manually curated catalog of experimentally validated genes that are thought or known to be involved in the different stages of cervical carcinogenesis. Despite the large number of women currently affected by this malignancy, no database yet exists that catalogs information on genes associated with cervical cancer. Therefore, we have compiled 537 genes in CCDB that are linked with cervical cancer causation processes such as methylation, gene amplification, mutation, polymorphism and change in expression level, as evident from published literature. Each record contains gene details such as architecture (exon-intron structure), location, function, sequences (mRNA/CDS/protein), ontology, interacting partners, homology to other eukaryotic genomes, structure and links to other public databases, thus augmenting CCDB with external data. Also, manually curated literature references have been provided to support the inclusion of the gene in the database and establish its association with cervix cancer. In addition, CCDB provides information on microRNAs altered in cervical cancer as well as a search facility for querying, several browse options and an online tool for sequence similarity search, thereby providing researchers with easy access to the latest information on genes involved in cervix cancer.
Gene expression studies of developing bovine longissimus muscle from two different beef cattle breeds

PubMed Central

Lehnert, Sigrid A; Reverter, Antonio; Byrne, Keren A; Wang, Yonghong; Nattrass, Greg S; Hudson, Nicholas J; Greenwood, Paul L

2007-01-01

Background The muscle fiber number and fiber composition of muscle is largely determined during prenatal development. In order to discover genes that are involved in determining adult muscle phenotypes, we studied the gene expression profile of developing fetal bovine longissimus muscle from animals with two different genetic backgrounds using a bovine cDNA microarray. Fetal longissimus muscle was sampled at 4 stages of myogenesis and muscle maturation: primary myogenesis (d 60), secondary myogenesis (d 135), as well as beginning (d 195) and final stages (birth) of functional differentiation of muscle fibers. All fetuses and newborns (total n = 24) were from Hereford dams and crossed with either Wagyu (high intramuscular fat) or Piedmontese (GDF8 mutant) sires, genotypes that vary markedly in muscle and compositional characteristics later in postnatal life. Results We obtained expression profiles of three individuals for each time point and genotype to allow comparisons across time and between sire breeds. Quantitative reverse transcription-PCR analysis of RNA from developing longissimus muscle was able to validate the differential expression patterns observed for a selection of differentially expressed genes, with one exception. We detected large-scale changes in temporal gene expression between the four developmental stages in genes coding for extracellular matrix and for muscle fiber structural and metabolic proteins. FSTL1 and IGFBP5 were two genes implicated in growth and differentiation that showed developmentally regulated expression levels in fetal muscle. An abundantly expressed gene with no functional annotation was found to be developmentally regulated in the same manner as muscle structural proteins. We also observed differences in gene expression profiles between the two different sire breeds. Wagyu-sired calves showed higher expression of fatty acid binding protein 5 (FABP5) RNA at birth. The developing longissimus muscle of fetuses carrying the Piedmontese mutation shows an emphasis on glycolytic muscle biochemistry and a large-scale up-regulation of the translational machinery at birth. We also document evidence for timing differences in differentiation events between the two breeds. Conclusion Taken together, these findings provide a detailed description of molecular events accompanying skeletal muscle differentiation in the bovine, as well as gene expression differences that may underpin the phenotype differences between the two breeds. In addition, this study has highlighted a non-coding RNA, which is abundantly expressed and developmentally regulated in bovine fetal muscle. PMID:17697390
Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

PubMed Central

2010-01-01

Background Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, and little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes, pointing to the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
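The abstract above dates the two haplotypes from synonymous substitutions but gives neither the measured Ks nor the rate used. As a rough illustration only, the standard molecular-clock relation T = Ks / (2r) can be sketched as follows; the function name and both input values are hypothetical placeholders chosen for the example, not figures from the paper.

```python
def divergence_time_years(ks: float, rate_per_site_per_year: float) -> float:
    """Molecular-clock estimate T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between two sequences.
    rate_per_site_per_year: assumed synonymous substitution rate r.
    The factor 2 accounts for changes accumulating on both lineages.
    """
    if ks < 0:
        raise ValueError("ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)


# Hypothetical inputs for illustration; neither value is reported in the abstract.
example_ks = 0.009
example_rate = 4.5e-9
print(f"{divergence_time_years(example_ks, example_rate):.3g} years")
```

The estimate scales linearly with Ks and inversely with the assumed rate, so the choice of rate dominates the uncertainty of any such date.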
Dispersal and gene flow in the rare, parasitic Large Blue butterfly Maculinea arion.

PubMed

Ugelvig, L V; Andersen, A; Boomsma, J J; Nash, D R

2012-07-01

Dispersal is crucial for gene flow and often determines the long-term stability of meta-populations, particularly in rare species with specialized life cycles. Such species are often foci of conservation efforts because they suffer disproportionally from degradation and fragmentation of their habitat. However, detailed knowledge of effective gene flow through dispersal is often missing, so that conservation strategies have to be based on mark-recapture observations that are suspected to be poor predictors of long-distance dispersal. These constraints have been especially severe in the study of butterfly populations, where microsatellite markers have been difficult to develop. We used eight microsatellite markers to analyse genetic population structure of the Large Blue butterfly Maculinea arion in Sweden. During recent decades, this species has become an icon of insect conservation after massive decline throughout Europe and extinction in Britain followed by reintroduction of a seed population from the Swedish island of Öland. We find that populations are highly structured genetically, but that gene flow occurs over distances 15 times longer than the maximum distance recorded from mark-recapture studies, which can only be explained by maximum dispersal distances at least twice as large as previously accepted. However, we also find evidence that gaps between sites with suitable habitat exceeding ∼20km induce genetic erosion that can be detected from bottleneck analyses. Although further work is needed, our results suggest that M. arion can maintain fully functional metapopulations when they consist of optimal habitat patches that are no further apart than ∼10km. © 2012 Blackwell Publishing Ltd.
Distinctive Architecture of the Chloroplast Genome in the Chlorodendrophycean Green Algae Scherffelia dubia and Tetraselmis sp. CCMP 881.

PubMed

Turmel, Monique; de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude

2016-01-01

The Chlorodendrophyceae is a small class of green algae belonging to the core Chlorophyta, an assemblage that also comprises the Pedinophyceae, Trebouxiophyceae, Ulvophyceae and Chlorophyceae. Here we describe for the first time the chloroplast genomes of chlorodendrophycean algae (Scherffelia dubia, 137,161 bp; Tetraselmis sp. CCMP 881, 100,264 bp). Characterized by a very small single-copy (SSC) region devoid of any gene and an unusually large inverted repeat (IR), the quadripartite structures of the Scherffelia and Tetraselmis genomes are unique among all core chlorophytes examined thus far. The lack of genes in the SSC region is offset by the rich and atypical gene complement of the IR, which includes genes from the SSC and large single-copy regions of prasinophyte and streptophyte chloroplast genomes having retained an ancestral quadripartite structure. Remarkably, seven of the atypical IR-encoded genes have also been observed in the IRs of pedinophycean and trebouxiophycean chloroplast genomes, suggesting that they were already present in the IR of the common ancestor of all core chlorophytes. Considering that the relationships among the main lineages of the core Chlorophyta are still unresolved, we evaluated the impact of including the Chlorodendrophyceae in chloroplast phylogenomic analyses. The trees we inferred using data sets of 79 and 108 genes from 71 chlorophytes indicate that the Chlorodendrophyceae is a deep-diverging lineage of the core Chlorophyta, although the placement of this class relative to the Pedinophyceae remains ambiguous. Interestingly, some of our phylogenomic trees together with our comparative analysis of gene order data support the monophyly of the Trebouxiophyceae, thus offering further evidence that the previously observed affiliation between the Chlorellales and Pedinophyceae is the result of systematic errors in phylogenetic reconstruction.
Representing virus-host interactions and other multi-organism processes in the Gene Ontology.

PubMed

Foulger, R E; Osumi-Sutherland, D; McIntosh, B K; Hulo, C; Masson, P; Poux, S; Le Mercier, P; Lomax, J

2015-07-28

The Gene Ontology project is a collaborative effort to provide descriptions of gene products in a consistent and computable language, and in a species-independent manner. The Gene Ontology is designed to be applicable to all organisms but up to now has been largely under-utilized for prokaryotes and viruses, in part because of a lack of appropriate ontology terms. To address this issue, we have developed a set of Gene Ontology classes that are applicable to microbes and their hosts, improving both coverage and quality in this area of the Gene Ontology. Describing microbial and viral gene products brings with it the additional challenge of capturing both the host and the microbe. Recognising this, we have worked closely with annotation groups to test and optimize the GO classes, and we describe here a set of annotation guidelines that allow the controlled description of two interacting organisms. Building on the microbial resources already in existence such as ViralZone, UniProtKB keywords and MeGO, this project provides an integrated ontology to describe interactions between microbial species and their hosts, with mappings to the external resources above. Housing this information within the freely-accessible Gene Ontology project allows the classes and annotation structure to be utilized by a large community of biologists and users.
The nucleotide sequence of the entire ribosomal DNA operon and the structure of the large subunit rRNA of Giardia muris.

PubMed

van Keulen, H; Gutell, R R; Campbell, S R; Erlandsen, S L; Jarroll, E L

1992-10-01

The total nucleotide sequence of the rDNA of Giardia muris, an intestinal protozoan parasite of rodents, has been determined. The repeat unit is 7668 basepairs (bp) in size and consists of a spacer of 3314 bp, a small-subunit rRNA (SSU-rRNA) gene of 1429, and a large-subunit rRNA (LSU-rRNA) gene of 2698 bp. The spacer contains long direct repeats and is heterogeneous in size. The LSU-rRNA of G. muris was compared to that of the human intestinal parasite Giardia duodenalis, to the bird parasite Giardia ardeae, and to that of Escherichia coli. The LSU-rRNA has a size comparable to the 23S rRNA of E. coli but shows structural features typical for eukaryotes. Some variable regions are typically small and account for the overall smaller size of this rRNA. The structure of the G. muris LSU-rRNA is similar to that of the other Giardia rRNA, but each rRNA has characteristic features residing in a number of variable regions.
Genome-Wide Identification of the Alba Gene Family in Plants and Stress-Responsive Expression of the Rice Alba Genes.

PubMed

Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan

2018-03-28

Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionarily relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal locations of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure-function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants.
Microarray and Real-Time PCR Analyses of the Responses of High-Arctic Soil Bacteria to Hydrocarbon Pollution and Bioremediation Treatments

PubMed Central

Yergeau, Etienne; Arbour, Mélanie; Brousseau, Roland; Juck, David; Lawrence, John R.; Masson, Luke; Whyte, Lyle G.; Greer, Charles W.

2009-01-01

High-Arctic soils have low nutrient availability, low moisture content, and very low temperatures and, as such, they pose a particular problem in terms of hydrocarbon bioremediation. An in-depth knowledge of the microbiology involved in this process is likely to be crucial to understand and optimize the factors most influencing bioremediation. Here, we compared two distinct large-scale field bioremediation experiments, located at the Canadian high-Arctic stations of Alert (ex situ approach) and Eureka (in situ approach). Bacterial community structure and function were assessed using microarrays targeting the 16S rRNA genes of bacteria found in cold environments and hydrocarbon degradation genes as well as quantitative reverse transcriptase PCR targeting key functional genes. The results indicated a large difference between sampling sites in terms of both soil microbiology and decontamination rates. A rapid reorganization of the bacterial community structure and functional potential as well as rapid increases in the expression of alkane monooxygenases and polyaromatic hydrocarbon-ring-hydroxylating dioxygenases were observed 1 month after the bioremediation treatment commenced in the Alert soils. In contrast, no clear changes in community structure were observed in Eureka soils, while key gene expression increased after a relatively long lag period (1 year). Such discrepancies are likely caused by differences in bioremediation treatments (i.e., ex situ versus in situ), weathering of the hydrocarbons, indigenous microbial communities, and environmental factors such as soil humidity and temperature. In addition, this study demonstrates the value of molecular tools for the monitoring of polar bacteria and their associated functions during bioremediation. PMID:19684169
Mapping the Schizophrenia Genes by Neuroimaging: The Opportunities and the Challenges

PubMed Central

2018-01-01

Schizophrenia (SZ) is a heritable brain disease originating from a complex interaction of genetic and environmental factors. The genes underpinning the neurobiology of SZ are largely unknown, but recent data provide strong evidence that genetic variations, such as single nucleotide polymorphisms, make the brain vulnerable to SZ. Structural and functional brain mapping of these genetic variations is essential for the development of agents and tools for better diagnosis, treatment and prevention of SZ. To this end, neuroimaging methods in combination with genetic analysis have been increasingly used for almost 20 years. This invited paper outlines the opportunities of this approach, so-called imaging genetics, along with its limitations for SZ research. While problems such as reproducibility, genetic effect size, specificity and sensitivity exist, opportunities such as multivariate analysis, the development of multisite consortia for large-scale data collection, and the emergence of non-candidate-gene (hypothesis-free) approaches in neuroimaging genetics are likely to contribute to rapid progress in gene discovery, in addition to gene validation studies related to SZ. PMID:29324666

Frequent loss of lineages and deficient duplications accounted for low copy number of disease resistance genes in Cucurbitaceae

PubMed Central

2013-01-01

Background The sequenced genomes of cucumber, melon and watermelon have relatively few R-genes, with 70, 75 and 55 copies only, respectively. The mechanism for low copy number of R-genes in Cucurbitaceae genomes remains unknown. Results Manual annotation of R-genes in the sequenced genomes of Cucurbitaceae species showed that approximately half of them are pseudogenes. Comparative analysis of R-genes showed frequent loss of R-gene loci in different Cucurbitaceae species. Phylogenetic analysis, data mining and PCR cloning using degenerate primers indicated that Cucurbitaceae has limited number of R-gene lineages (subfamilies). Comparison between R-genes from Cucurbitaceae and those from poplar and soybean suggested frequent loss of R-gene lineages in Cucurbitaceae. Furthermore, the average number of R-genes per lineage in Cucurbitaceae species is approximately 1/3 that in soybean or poplar. Therefore, both loss of lineages and deficient duplications in extant lineages accounted for the low copy number of R-genes in Cucurbitaceae. No extensive chimeras of R-genes were found in any of the sequenced Cucurbitaceae genomes. Nevertheless, one lineage of R-genes from Trichosanthes kirilowii, a wild Cucurbitaceae species, exhibits chimeric structures caused by gene conversions, and may contain a large number of distinct R-genes in natural populations. Conclusions Cucurbitaceae species have limited number of R-gene lineages and each genome harbors relatively few R-genes. The scarcity of R-genes in Cucurbitaceae species was due to frequent loss of R-gene lineages and infrequent duplications in extant lineages. The evolutionary mechanisms for large variation of copy number of R-genes in different plant species were discussed. PMID:23682795
Social cohesion among kin, gene flow without dispersal and the evolution of population genetic structure in the killer whale (Orcinus orca).

PubMed

Pilot, M; Dahlheim, M E; Hoelzel, A R

2010-01-01

In social species, breeding system and gregarious behavior are key factors influencing the evolution of large-scale population genetic structure. The killer whale is a highly social apex predator showing genetic differentiation in sympatry between populations of foraging specialists (ecotypes), and low levels of genetic diversity overall. Our comparative assessments of kinship, parentage and dispersal reveal high levels of kinship within local populations and ongoing male-mediated gene flow among them, including among ecotypes that are maximally divergent within the mtDNA phylogeny. Dispersal from natal populations was rare, implying that gene flow occurs without dispersal, as a result of reproduction during temporary interactions. Discordance between nuclear and mitochondrial phylogenies was consistent with earlier studies suggesting a stochastic basis for the magnitude of mtDNA differentiation between matrilines. Taken together our results show how the killer whale breeding system, coupled with social, dispersal and foraging behaviour, contributes to the evolution of population genetic structure.
Dissecting gene-environment interactions: A penalized robust approach accounting for hierarchical structures.

PubMed

Wu, Cen; Jiang, Yu; Ren, Jie; Cui, Yuehua; Ma, Shuangge

2018-02-10

Identification of gene-environment (G × E) interactions associated with disease phenotypes has posed a great challenge in high-throughput cancer studies. The existing marginal identification methods have suffered from not being able to accommodate the joint effects of a large number of genetic variants, while some of the joint-effect methods have been limited by failing to respect the "main effects, interactions" hierarchy, by ignoring data contamination, and by using inefficient selection techniques under complex structural sparsity. In this article, we develop an effective penalization approach to identify important G × E interactions and main effects, which can account for the hierarchical structures of the 2 types of effects. Possible data contamination is accommodated by adopting the least absolute deviation loss function. The advantage of the proposed approach over the alternatives is convincingly demonstrated in both simulation and a case study on lung cancer prognosis with gene expression measurements and clinical covariates under the accelerated failure time model. Copyright © 2017 John Wiley & Sons, Ltd.
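The abstract above relies on the least absolute deviation (LAD) loss to tolerate contaminated data. The sketch below is not the authors' penalized method; it only illustrates, with made-up numbers and a constant-only model, why LAD is less sensitive to an outlier than squared loss: the squared loss is minimised by the mean, the LAD loss by the median.

```python
from statistics import mean, median


def squared_loss(values, c):
    """Sum of squared deviations of the values from a constant fit c."""
    return sum((v - c) ** 2 for v in values)


def lad_loss(values, c):
    """Sum of absolute deviations of the values from a constant fit c."""
    return sum(abs(v - c) for v in values)


# Made-up responses; the last value of the second list plays the role of contamination.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 50.0]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    # mean() minimises the squared loss, median() minimises the LAD loss.
    print(name, "squared-loss fit:", mean(data), "LAD fit:", median(data))
```

A single contaminated response drags the squared-loss fit far from the bulk of the data, while the LAD fit stays put; the same robustness argument carries over to regression settings.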
Structural basis for gene regulation by a B12-dependent photoreceptor

PubMed Central

Jost, Marco; Fernández-Zapata, Jésus; Polanco, María Carmen; Ortiz-Guerrero, Juan Manuel; Chen, Percival Yang-Ting; Kang, Gyunghoon; Padmanabhan, S.; Elías-Arnanz, Montserrat; Drennan, Catherine L.

2015-01-01

Summary Photoreceptor proteins enable organisms to sense and respond to light. The newly discovered CarH-type photoreceptors use a vitamin B12 derivative, adenosylcobalamin, as the light-sensing chromophore to mediate light-dependent gene regulation. Here, we present crystal structures of Thermus thermophilus CarH in all three relevant states: in the dark, both free and bound to operator DNA, and after light exposure. These structures provide a visualization of how adenosylcobalamin mediates CarH tetramer formation in the dark, how this tetramer binds to the promoter −35 element to repress transcription, and how light exposure leads to a large-scale conformational change that activates transcription. In addition to the remarkable functional repurposing of adenosylcobalamin from an enzyme cofactor to a light sensor, we find that nature also repurposed two independent protein modules in assembling CarH. These results expand the biological role of vitamin B12 and provide fundamental insight into a new mode of light-dependent gene regulation. PMID:26416754
Evolutionary characterization of the West Nile Virus complete genome.

PubMed

Gray, R R; Veras, N M C; Santos, L A; Salemi, M

2010-07-01

The spatial dynamics of the West Nile Virus epidemic in North America are largely unknown. Previous studies that investigated the evolutionary history of the virus used sequence data from the structural genes (prM and E); however, these regions may lack phylogenetic information and obscure true evolutionary relationships. This study systematically evaluated the evolutionary patterns in the eleven genes of the WNV genome in order to determine which region(s) were most phylogenetically informative. We found that while the E region lacks resolution and can potentially result in misleading conclusions, the full NS3 or NS5 regions have strong phylogenetic signal. Furthermore, we show that geographic structure of WNV infection within the US is more pronounced than previously reported in studies that used the structural genes. We conclude that future evolutionary studies should focus on NS3 and NS5 in order to maximize the available sequences while retaining maximal interpretative power to infer temporal and geographic trends among WNV strains. Copyright 2010 Elsevier Inc. All rights reserved.
Structural basis for gene regulation by a B12-dependent photoreceptor

DOE PAGES

Jost, Marco; Fernández-Zapata, Jésus; Polanco, María Carmen; ...

2015-09-28

Photoreceptor proteins enable organisms to sense and respond to light. The newly discovered CarH-type photoreceptors use a vitamin B12 derivative, adenosylcobalamin, as the light-sensing chromophore to mediate light-dependent gene regulation. In this paper, we present crystal structures of Thermus thermophilus CarH in all three relevant states: in the dark, both free and bound to operator DNA, and after light exposure. These structures provide visualizations of how adenosylcobalamin mediates CarH tetramer formation in the dark, how this tetramer binds to the promoter -35 element to repress transcription, and how light exposure leads to a large-scale conformational change that activates transcription. In addition to the remarkable functional repurposing of adenosylcobalamin from an enzyme cofactor to a light sensor, we find that nature also repurposed two independent protein modules in assembling CarH. Finally, these results expand the biological role of vitamin B12 and provide fundamental insight into a new mode of light-dependent gene regulation.
The Sequence and Analysis of Duplication Rich Human Chromosome 16

DOE R&D Accomplishments Database

Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

2004-01-01

We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Price, Morgan N.; Arkin, Adam P.; Alm, Eric J.

Operons are a major feature of all prokaryotic genomes, but how and why operon structures vary is not well understood. To elucidate the life-cycle of operons, we compared gene order between Escherichia coli K12 and its relatives and identified the recently formed and destroyed operons in E. coli. This allowed us to determine how operons form, how they become closely spaced, and how they die. Our findings suggest that operon evolution is driven by selection on gene expression patterns. First, both operon creation and operon destruction lead to large changes in gene expression patterns. For example, the removal of lysA and ruvA from ancestral operons that contained essential genes allowed their expression to respond to lysine levels and DNA damage, respectively. Second, some operons have undergone accelerated evolution, with multiple new genes being added during a brief period. Third, although most operons are closely spaced because of a neutral bias towards deletion and because of selection against large overlaps, highly expressed operons tend to be widely spaced because of regulatory fine-tuning by intervening sequences. Although operon evolution seems to be adaptive, it need not be optimal: new operons often comprise functionally unrelated genes that were already in proximity before the operon formed.
The architecture of chicken chromosome territories changes during differentiation

PubMed Central

Stadler, Sonja; Schnapp, Verena; Mayer, Robert; Stein, Stefan; Cremer, Christoph; Bonifer, Constanze; Cremer, Thomas; Dietzel, Steffen

2004-01-01

Background Between cell divisions the chromatin fiber of each chromosome is restricted to a subvolume of the interphase cell nucleus called chromosome territory. The internal organization of these chromosome territories is still largely unknown. Results We compared the large-scale chromatin structure of chromosome territories between several hematopoietic chicken cell types at various differentiation stages. Chromosome territories were labeled by fluorescence in situ hybridization in structurally preserved nuclei, recorded by confocal microscopy and evaluated visually and by quantitative image analysis. Chromosome territories in multipotent myeloid precursor cells appeared homogeneously stained and compact. The inactive lysozyme gene as well as the centromere of the lysozyme gene harboring chromosome located to the interior of the chromosome territory. In further differentiated cell types such as myeloblasts, macrophages and erythroblasts chromosome territories appeared increasingly diffuse, disaggregating to separable substructures. The lysozyme gene, which is gradually activated during the differentiation to activated macrophages, as well as the centromere were relocated increasingly to more external positions. Conclusions Our results reveal a cell type specific constitution of chromosome territories. The data suggest that a repositioning of chromosomal loci during differentiation may be a consequence of general changes in chromosome territory morphology, not necessarily related to transcriptional changes. PMID:15555075
Selective modes determine evolutionary rates, gene compactness and expression patterns in Brassica.

PubMed

Guo, Yue; Liu, Jing; Zhang, Jiefu; Liu, Shengyi; Du, Jianchang

2017-07-01

It has been well documented that most nuclear protein-coding genes in organisms can be classified into two categories: positively selected genes (PSGs) and negatively selected genes (NSGs). The characteristics and evolutionary fates of different types of genes, however, have been poorly understood. In this study, the rates of nonsynonymous substitution (Ka) and the rates of synonymous substitution (Ks) were investigated by comparing the orthologs between the two sequenced Brassica species, Brassica rapa and Brassica oleracea, and the evolutionary rates, gene structures, expression patterns, and codon bias were compared between PSGs and NSGs. The resulting data show that PSGs have higher protein evolutionary rates, lower synonymous substitution rates, shorter gene length, fewer exons, higher functional specificity, lower expression level, higher tissue-specific expression and stronger codon bias than NSGs. Although the quantities and values are different, the relative features of PSGs and NSGs have been largely verified in the model species Arabidopsis. These data suggest that PSGs and NSGs differ not only under selective pressure (Ka/Ks), but also in their evolutionary, structural and functional properties, indicating that selective modes may serve as a determinant factor for measuring evolutionary rates, gene compactness and expression patterns in Brassica. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
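The abstract above separates genes into PSGs and NSGs by the ratio of nonsynonymous to synonymous substitution rates. A minimal sketch of such a classification is given below, assuming the conventional cut-off of Ka/Ks = 1; the abstract does not state the threshold or procedure actually used, and the function name and example values are hypothetical.

```python
def classify_by_selection(ka: float, ks: float, threshold: float = 1.0) -> str:
    """Label an ortholog pair by its Ka/Ks ratio.

    Returns "PSG" when Ka/Ks exceeds the threshold, "NSG" otherwise,
    and "undefined" when Ks is zero or negative (ratio not computable).
    The threshold of 1.0 is a common convention, assumed here.
    """
    if ks <= 0:
        return "undefined"
    return "PSG" if ka / ks > threshold else "NSG"


# Hypothetical (Ka, Ks) pairs for illustration only.
example_pairs = {"gene_a": (0.30, 0.20), "gene_b": (0.05, 0.50), "gene_c": (0.10, 0.0)}
for gene, (ka, ks) in example_pairs.items():
    print(gene, classify_by_selection(ka, ks))
```

Pairs with Ks of zero are reported separately rather than forced into either class, since the ratio is not defined for them.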
Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

PubMed

Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

2018-03-01

Functional disruptions of susceptibility genes by large genomic structure variant (SV) deletions in germlines are known to be associated with cancer risk. However, few studies have been conducted to systematically search for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated in blood samples from 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early cancer onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of the BRCA1 (approximately 80 kb in two patients) and TP53 (∼1.6 kb in two patients) genes, and of intronic regions (∼1 kb) of the PALB2 (one patient), PTEN (three patients) and RAD51C (one patient) genes. We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes and the identification of such SV deletions may improve clinical testing.
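The abstract above keeps only deletions reported by at least three of the SV callers. How calls from different tools were matched is not described, so the sketch below assumes a 50% reciprocal-overlap criterion; the caller labels, coordinates and function names are hypothetical, and parsing of each tool's real output is left out.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0.0
    len_a, len_b = a[2] - a[1], b[2] - b[1]
    if len_a <= 0 or len_b <= 0:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / len_a, shared / len_b)


def consensus_deletions(calls_by_tool, min_tools=3, min_overlap=0.5):
    """Return one representative call per deletion supported by >= min_tools callers."""
    kept = []
    for calls in calls_by_tool.values():
        for call in calls:
            # Skip calls already represented by a kept deletion.
            if any(reciprocal_overlap(call, k) >= min_overlap for k in kept):
                continue
            # Count distinct callers with a matching call (including this one).
            supporting = {
                tool
                for tool, others in calls_by_tool.items()
                if any(reciprocal_overlap(call, o) >= min_overlap for o in others)
            }
            if len(supporting) >= min_tools:
                kept.append(call)
    return kept


# Hypothetical calls for illustration only.
example_calls = {
    "caller_a": [("chr17", 1000, 2600), ("chr2", 500, 900)],
    "caller_b": [("chr17", 1050, 2650)],
    "caller_c": [("chr17", 990, 2580)],
    "caller_d": [("chr2", 5000, 6000)],
}
print(consensus_deletions(example_calls))
```

Requiring agreement between several callers trades sensitivity for specificity, which fits the abstract's follow-up validation of the retained deletions by qPCR.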
Muscular Dystrophy with Ribitol-Phosphate Deficiency: A Novel Post-Translational Mechanism in Dystroglycanopathy

PubMed Central

Kanagawa, Motoi; Toda, Tatsushi

2017-01-01

Muscular dystrophy is a group of genetic disorders characterized by progressive muscle weakness. In the early 2000s, a new classification of muscular dystrophy, dystroglycanopathy, was established. Dystroglycanopathy often associates with abnormalities in the central nervous system. Currently, at least eighteen genes have been identified that are responsible for dystroglycanopathy, and despite its genetic heterogeneity, its common biochemical feature is abnormal glycosylation of alpha-dystroglycan. Abnormal glycosylation of alpha-dystroglycan reduces its binding activities to ligand proteins, including laminins. In just the last few years, remarkable progress has been made in determining the sugar chain structures and gene functions associated with dystroglycanopathy. The normal sugar chain contains tandem structures of ribitol-phosphate, a pentose alcohol that was previously unknown in humans. The dystroglycanopathy genes fukutin, fukutin-related protein (FKRP), and isoprenoid synthase domain-containing protein (ISPD) encode essential enzymes for the synthesis of this structure: fukutin and FKRP transfer ribitol-phosphate onto sugar chains of alpha-dystroglycan, and ISPD synthesizes CDP-ribitol, a donor substrate for fukutin and FKRP. These findings resolved long-standing questions and established a disease subgroup that is ribitol-phosphate deficient, which describes a large population of dystroglycanopathy patients. Here, we review the history of dystroglycanopathy, the properties of the sugar chain structure of alpha-dystroglycan, dystroglycanopathy gene functions, and therapeutic strategies. PMID:29081423
Analysis of the functional gene structure and metabolic potential of microbial community in high arsenic groundwater.

PubMed

Li, Ping; Jiang, Zhou; Wang, Yanhong; Deng, Ye; Van Nostrand, Joy D; Yuan, Tong; Liu, Han; Wei, Dazhun; Zhou, Jizhong

2017-10-15

Microbial functional potential in high arsenic (As) groundwater ecosystems remains largely unknown. In this study, the microbial community functional composition of nineteen groundwater samples was investigated using a functional gene array (GeoChip 5.0). Samples were divided into low and high As groups based on the clustering analysis of geochemical parameters and microbial functional structures. The results showed that As related genes (arsC, arrA), sulfate related genes (dsrA and dsrB), nitrogen cycling related genes (ureC, amoA, and hzo) and methanogen genes (mcrA, hdrB) in groundwater samples were correlated with As, SO₄²⁻, NH₄⁺ or CH₄ concentrations, respectively. Canonical correspondence analysis (CCA) results indicated that some geochemical parameters including As, total organic content, SO₄²⁻, NH₄⁺, oxidation-reduction potential (ORP) and pH were important factors shaping the functional microbial community structures. Alkaline and reducing conditions with relatively low SO₄²⁻, ORP, and high NH₄⁺, as well as SO₄²⁻ and Fe reduction and ammonification involved in microbially-mediated geochemical processes could be associated with As enrichment in groundwater. This study provides an overall picture of functional microbial communities in high As groundwater aquifers, and also provides insights into the critical role of microorganisms in As biogeochemical cycling. Copyright © 2017 Elsevier Ltd. All rights reserved.
Measuring semantic similarities by combining gene ontology annotations and gene co-function networks

DOE PAGES

Peng, Jiajie; Uygun, Sahra; Kim, Taehyong; ...

2015-02-14

Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.
GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes.

PubMed

van den Broek, Evert; van Lieshout, Stef; Rausch, Christian; Ylstra, Bauke; van de Wiel, Mark A; Meijer, Gerrit A; Fijneman, Remond J A; Abeln, Sanne

2016-01-01

Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' is developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
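The first two steps described in this abstract (collecting breakpoints from segmented copy number profiles, then mapping them to genes) can be sketched in Python as below. This is an illustration only and does not reproduce the GeneBreak R package or its API: the data layout and function names are hypothetical, and the covariate-corrected statistics and multiple testing correction are omitted.

```python
from collections import Counter


def breakpoints_from_segments(segments):
    """Return (chrom, position) pairs where the copy number state changes.

    segments: position-sorted list of (chrom, start, end, state) for one sample.
    A breakpoint is placed at the start of a segment whose state differs from
    the preceding segment on the same chromosome.
    """
    points = []
    for prev, cur in zip(segments, segments[1:]):
        if prev[0] == cur[0] and prev[3] != cur[3]:
            points.append((cur[0], cur[1]))
    return points


def count_breakpoint_genes(samples, genes):
    """Count, per gene, the samples with at least one breakpoint inside it.

    samples: dict of sample_id -> segments; genes: dict of name -> (chrom, start, end).
    """
    counts = Counter()
    for segments in samples.values():
        hit = set()  # each gene is counted at most once per sample
        for chrom, pos in breakpoints_from_segments(segments):
            for name, (g_chrom, g_start, g_end) in genes.items():
                if g_chrom == chrom and g_start <= pos <= g_end:
                    hit.add(name)
        counts.update(hit)
    return counts
```

The resulting per-gene sample counts are the raw input that a recurrence test would then assess; that statistical step is deliberately left out of this sketch.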
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. 
Although large portion of sequence context was shared by mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes located on the edges of highly-rearranged CMS-specific DNA regions and near to repeat sequences. These characteristics were detected among CMS-associated genes in other species, implying a common mechanism might be involved in the evolution of CMS-associated genes.
Structure of large dsDNA viruses

PubMed Central

Klose, Thomas; Rossmann, Michael G.

2015-01-01

Nucleocytoplasmic large dsDNA viruses (NCLDVs) encompass an ever-increasing group of large eukaryotic viruses, infecting a wide variety of organisms. The set of core genes shared by all these viruses includes a major capsid protein with a double jelly-roll fold forming an icosahedral capsid, which surrounds a double layer membrane that contains the viral genome. Furthermore, some of these viruses, such as the members of the Mimiviridae and Phycodnaviridae have a unique vertex that is used during infection to transport DNA into the host. PMID:25003382
Complete Chloroplast Genome of Pinus massoniana (Pinaceae): Gene Rearrangements, Loss of ndh Genes, and Short Inverted Repeats Contraction, Expansion.

PubMed

Ni, ZhouXian; Ye, YouJu; Bai, Tiandao; Xu, Meng; Xu, Li-An

2017-09-11

The chloroplast genome (CPG) of Pinus massoniana belonging to the genus Pinus (Pinaceae), which is a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, ndh genes loss, and the contraction and expansion of short inverted repeats (IRs). P. massoniana CPG has a typical quadripartite structure that includes large single copy (LSC) (65,563 bp), small single copy (SSC) (53,230 bp) and two IRs (IRa and IRb, 485 bp). The 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in CPG were mononucleotides motifs of A/T types and located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region; P. massoniana CPG lacks all 11 intact ndh genes (four ndh genes lost completely; the five remained truncated as pseudogenes; and the other two ndh genes remain as pseudogenes because of short insertions or deletions). A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". The results of phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that the whole CPG sequences could be used as a powerful tool in phylogenetic analyses.
The cryo-electron microscopy structure of huntingtin

NASA Astrophysics Data System (ADS)

Guo, Qiang; Huang, Bin; Cheng, Jingdong; Seefelder, Manuel; Engler, Tatjana; Pfeifer, Günter; Oeckl, Patrick; Otto, Markus; Moser, Franziska; Maurer, Melanie; Pautsch, Alexander; Baumeister, Wolfgang; Fernández-Busnadiego, Rubén; Kochanek, Stefan

2018-03-01

Huntingtin (HTT) is a large (348 kDa) protein that is essential for embryonic development and is involved in diverse cellular activities such as vesicular transport, endocytosis, autophagy and the regulation of transcription. Although an integrative understanding of the biological functions of HTT is lacking, the large number of identified HTT interactors suggests that it serves as a protein-protein interaction hub. Furthermore, Huntington’s disease is caused by a mutation in the HTT gene, resulting in a pathogenic expansion of a polyglutamine repeat at the amino terminus of HTT. However, only limited structural information regarding HTT is currently available. Here we use cryo-electron microscopy to determine the structure of full-length human HTT in a complex with HTT-associated protein 40 (HAP40; encoded by three F8A genes in humans) to an overall resolution of 4 Å. HTT is largely α-helical and consists of three major domains. The amino- and carboxy-terminal domains contain multiple HEAT (huntingtin, elongation factor 3, protein phosphatase 2A and lipid kinase TOR) repeats arranged in a solenoid fashion. These domains are connected by a smaller bridge domain containing different types of tandem repeats. HAP40 is also largely α-helical and has a tetratricopeptide repeat-like organization. HAP40 binds in a cleft and contacts the three HTT domains by hydrophobic and electrostatic interactions, thereby stabilizing the conformation of HTT. These data rationalize previous biochemical results and pave the way for improved understanding of the diverse cellular functions of HTT.
Structure and evolution of cereal genomes.

PubMed

Paterson, Andrew H; Bowers, John E; Peterson, Daniel G; Estill, James C; Chapman, Brad A

2003-12-01

The cereal species, of central importance to our diet, began to diverge 50-70 million years ago. For the past few thousand years, these species have undergone largely parallel selection regimes associated with domestication and improvement. The rice genome sequence provides a platform for organizing information about diverse cereals, and together with genetic maps and sequence samples from other cereals is yielding new insights into both the shared and the independent dimensions of cereal evolution. New data and population-based approaches are identifying genes that have been involved in cereal improvement. Reduced-representation sequencing promises to accelerate gene discovery in many large-genome cereals, and to better link the under-explored genomes of 'orphan' cereals with state-of-the-art knowledge.

MACF1 gene structure: a hybrid of plectin and dystrophin.

PubMed

Gong, T W; Besirli, C G; Lomax, M I

2001-11-01

Mammalian MACF1 (Macrophin1; previously named ACF7) is a giant cytoskeletal linker protein with three known isoforms that arise by alternative splicing. We isolated a 19.1-kb cDNA encoding a fourth isoform (MACF1-4) with a unique N-terminus. Instead of an N-terminal actin-binding domain found in the other three isoforms, MACF1-4 has eight plectin repeats. The MACF1 gene is located on human Chr 1p32, contains at least 102 exons, spans over 270 kb, and gives rise to four major isoforms with different N-termini. The genomic organization of the actin-binding domain is highly conserved in mammalian genes for both plectin and BPAG1. All eight plectin repeats are encoded by one large exon; this feature is similar to the genomic structure of plectin. The intron positions within spectrin repeats in MACF1 are very similar to those in the dystrophin gene. This demonstrates that MACF1 has characteristic features of genes for two classes of cytoskeletal proteins, i.e., plectin and dystrophin.
Single molecule real-time sequencing of Xanthomonas oryzae genomes reveals a dynamic structure and complex TAL (transcription activator-like) effector gene relationships

PubMed Central

Booher, Nicholas J.; Carpenter, Sara C. D.; Sebra, Robert P.; Wang, Li; Salzberg, Steven L.; Leach, Jan E.

2015-01-01

Pathogen-injected, direct transcriptional activators of host genes, TAL (transcription activator-like) effectors play determinative roles in plant diseases caused by Xanthomonas spp. A large domain of nearly identical, 33–35 aa repeats in each protein mediates DNA recognition. This modularity makes TAL effectors customizable and thus important also in biotechnology. However, the repeats render TAL effector (tal) genes nearly impossible to assemble using next-generation, short reads. Here, we demonstrate that long-read, single molecule real-time (SMRT) sequencing solves this problem. Taking an ensemble approach to first generate local, tal gene contigs, we correctly assembled de novo the genomes of two strains of the rice pathogen X. oryzae completed previously using the Sanger method and even identified errors in those references. Sequencing two more strains revealed a dynamic genome structure and a striking plasticity in tal gene content. Our results pave the way for population-level studies to inform resistance breeding, improve biotechnology and probe TAL effector evolution. PMID:27148456
Genomic Data Quality Impacts Automated Detection of Lateral Gene Transfer in Fungi

PubMed Central

Dupont, Pierre-Yves; Cox, Murray P.

2017-01-01

Lateral gene transfer (LGT, also known as horizontal gene transfer), an atypical mechanism of transferring genes between species, has almost become the default explanation for genes that display an unexpected composition or phylogeny. Numerous methods of detecting LGT events all rely on two fundamental strategies: primary structure composition or gene tree/species tree comparisons. Discouragingly, the results of these different approaches rarely coincide. With the wealth of genome data now available, detection of laterally transferred genes is increasingly being attempted in large uncurated eukaryotic datasets. However, detection methods depend greatly on the quality of the underlying genomic data, which are typically complex for eukaryotes. Furthermore, given the automated nature of genomic data collection, it is typically impractical to manually verify all protein or gene models, orthology predictions, and multiple sequence alignments, requiring researchers to accept a substantial margin of error in their datasets. Using a test case comprising plant-associated genomes across the fungal kingdom, this study reveals that composition- and phylogeny-based methods have little statistical power to detect laterally transferred genes. In particular, phylogenetic methods reveal extreme levels of topological variation in fungal gene trees, the vast majority of which show departures from the canonical species tree. Therefore, it is inherently challenging to detect LGT events in typical eukaryotic genomes. This finding is in striking contrast to the large number of claims for laterally transferred genes in eukaryotic species that routinely appear in the literature, and questions how many of these proposed examples are statistically well supported. PMID:28235827
Immunoglobulin superfamily members encoded by viruses and their multiple roles in immune evasion.

PubMed

Farré, Domènec; Martínez-Vicente, Pablo; Engel, Pablo; Angulo, Ana

2017-05-01

Pathogens have developed a plethora of strategies to undermine host immune defenses in order to guarantee their survival. For large DNA viruses, these immune evasion mechanisms frequently rely on the expression of genes acquired from host genomes. Horizontally transferred genes include members of the immunoglobulin superfamily, whose products constitute the most diverse group of proteins of vertebrate genomes. Their promiscuous immunoglobulin domains, which comprise the building blocks of these molecules, are involved in a large variety of functions mediated by ligand-binding interactions. The flexible structural nature of the immunoglobulin domains makes them appealing targets for viral capture due to their capacity to generate high functional diversity. Here, we present an up-to-date review of immunoglobulin superfamily gene homologs encoded by herpesviruses, poxviruses, and adenoviruses, that include CD200, CD47, Fc receptors, interleukin-1 receptor 2, interleukin-18 binding protein, CD80, carcinoembryonic antigen-related cell adhesion molecules, and signaling lymphocyte activation molecules. We discuss their distinct structural attributes, binding properties, and functions, shaped by evolutionary pressures to disarm specific immune pathways. We include several novel genes identified from extensive genome database surveys. An understanding of the properties and modes of action of these viral proteins may guide the development of novel immune-modulatory therapeutic tools. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Learning the Structure of Biomedical Relationships from Unstructured Text

PubMed Central

Percha, Bethany; Altman, Russ B.

2015-01-01

The published biomedical research literature encompasses most of our understanding of how drugs interact with gene products to produce physiological responses (phenotypes). Unfortunately, this information is distributed throughout the unstructured text of over 23 million articles. The creation of structured resources that catalog the relationships between drugs and genes would accelerate the translation of basic molecular knowledge into discoveries of genomic biomarkers for drug response and prediction of unexpected drug-drug interactions. Extracting these relationships from natural language sentences on such a large scale, however, requires text mining algorithms that can recognize when different-looking statements are expressing similar ideas. Here we describe a novel algorithm, Ensemble Biclustering for Classification (EBC), that learns the structure of biomedical relationships automatically from text, overcoming differences in word choice and sentence structure. We validate EBC's performance against manually-curated sets of (1) pharmacogenomic relationships from PharmGKB and (2) drug-target relationships from DrugBank, and use it to discover new drug-gene relationships for both knowledge bases. We then apply EBC to map the complete universe of drug-gene relationships based on their descriptions in Medline, revealing unexpected structure that challenges current notions about how these relationships are expressed in text. For instance, we learn that newer experimental findings are described in consistently different ways than established knowledge, and that seemingly pure classes of relationships can exhibit interesting chimeric structure. The EBC algorithm is flexible and adaptable to a wide range of problems in biomedical text mining. PMID:26219079
Chromatin accessibility and guide sequence secondary structure affect CRISPR-Cas9 gene editing efficiency.

PubMed

Jensen, Kristopher Torp; Fløe, Lasse; Petersen, Trine Skov; Huang, Jinrong; Xu, Fengping; Bolund, Lars; Luo, Yonglun; Lin, Lin

2017-07-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (CRISPR-Cas9) systems have emerged as the method of choice for genome editing, but large variations in on-target efficiencies continue to limit their applicability. Here, we investigate the effect of chromatin accessibility on Cas9-mediated gene editing efficiency for 20 gRNAs targeting 10 genomic loci in HEK293T cells using both SpCas9 and the eSpCas9(1.1) variant. Our study indicates that gene editing is more efficient in euchromatin than in heterochromatin, and we validate this finding in HeLa cells and in human fibroblasts. Furthermore, we investigate the gRNA sequence determinants of CRISPR-Cas9 activity using a surrogate reporter system and find that the efficiency of Cas9-mediated gene editing is dependent on guide sequence secondary structure formation. This knowledge can aid in the further improvement of tools for gRNA design. © 2017 Federation of European Biochemical Societies.
A model for evolution and regulation of nicotine biosynthesis regulon in tobacco.

PubMed

Kajikawa, Masataka; Sierro, Nicolas; Hashimoto, Takashi; Shoji, Tsubasa

2017-06-03

In tobacco, the defense alkaloid nicotine is produced in roots and accumulates mainly in leaves. Signaling mediated by jasmonates (JAs) induces the formation of nicotine via a series of structural genes that constitute a regulon and are coordinated by JA-responsive transcription factors of the ethylene response factor (ERF) family. Early steps in the pyrrolidine and pyridine biosynthesis pathways likely arose through duplication of the polyamine and nicotinamide adenine dinucleotide (NAD) biosynthetic pathways, respectively, followed by recruitment of duplicated primary metabolic genes into the nicotine biosynthesis regulon. Transcriptional regulation of nicotine biosynthesis by ERF and cooperatively-acting MYC2 transcription factors is implied by the frequency of cognate cis-regulatory elements for these factors in the promoter regions of the downstream structural genes. Indeed, a mutant tobacco with low nicotine content was found to have a large chromosomal deletion in a cluster of closely related ERF genes at the nicotine-controlling NICOTINE2 (NIC2) locus.
Phage phenomics: Physiological approaches to characterize novel viral proteins

ScienceCinema

Sanchez, Savannah E. [San Diego State Univ., San Diego, CA (United States); Cuevas, Daniel A. [San Diego State Univ., San Diego, CA (United States); Rostron, Jason E. [San Diego State Univ., San Diego, CA (United States); Liang, Tiffany Y. [San Diego State Univ., San Diego, CA (United States); Pivaroff, Cullen G. [San Diego State Univ., San Diego, CA (United States); Haynes, Matthew R. [San Diego State Univ., San Diego, CA (United States); Nulton, Jim [San Diego State Univ., San Diego, CA (United States); Felts, Ben [San Diego State Univ., San Diego, CA (United States); Bailey, Barbara A. [San Diego State Univ., San Diego, CA (United States); Salamon, Peter [San Diego State Univ., San Diego, CA (United States); Edwards, Robert A. [San Diego State Univ., San Diego, CA (United States); Argonne National Lab. (ANL), Argonne, IL (United States); Burgin, Alex B. [Broad Institute, Cambridge, MA (United States); Segall, Anca M. [San Diego State Univ., San Diego, CA (United States); Rohwer, Forest [San Diego State Univ., San Diego, CA (United States)

2018-06-21

Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60 - 95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides bi-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Thus, representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
Evolutionary expansion and divergence in a large family of primate-specific zinc finger transcription factor genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hamilton, A T; Huntley, S; Tran-Gyamfi, M

Although most genes are conserved as one-to-one orthologs in different mammalian orders, certain gene families have evolved to comprise different numbers and types of protein-coding genes through independent series of gene duplications, divergence and gene loss in each evolutionary lineage. One such family encodes KRAB-zinc finger (KRAB-ZNF) genes, which are likely to function as transcriptional repressors. One KRAB-ZNF subfamily, the ZNF91 clade, has expanded specifically in primates to comprise more than 110 loci in the human genome, yielding large gene clusters in human chromosomes 19 and 7 and smaller clusters or isolated copies at other chromosomal locations. Although phylogenetic analysis indicates that many of these genes arose before the split between old world monkeys and new world monkeys, the ZNF91 subfamily has continued to expand and diversify throughout the evolution of apes and humans. The paralogous loci are distinguished by sequence divergence within their zinc finger arrays indicating a selection for proteins with different DNA binding specificities. RT-PCR and in situ hybridization data show that some of these ZNF genes can have tissue-specific expression patterns; however, many KRAB-ZNFs that are near-ubiquitous could also be playing very specific roles in halting target pathways in all tissues except for a few, where the target is released by the absence of its repressor. The number of variant KRAB-ZNF proteins is increased not only because of the large number of loci, but also because many loci can produce multiple splice variants, which because of the modular structure of these genes may have separate and perhaps even conflicting regulatory roles. The lineage-specific duplication and rapid divergence of this family of transcription factor genes suggests a role in determining species-specific biological differences and the evolution of novel primate traits.
Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes

PubMed Central

Law, MeiYee; Childs, Kevin L.; Campbell, Michael S.; Stein, Joshua C.; Olson, Andrew J.; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M.; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

2015-01-01

The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. PMID:25384563
Effects of DNA Methylation and Chromatin State on Rates of Molecular Evolution in Insects.

PubMed

Glastad, Karl M; Goodisman, Michael A D; Yi, Soojin V; Hunt, Brendan G

2015-12-04

Epigenetic information is widely appreciated for its role in gene regulation in eukaryotic organisms. However, epigenetic information can also influence genome evolution. Here, we investigate the effects of epigenetic information on gene sequence evolution in two disparate insects: the fly Drosophila melanogaster, which lacks substantial DNA methylation, and the ant Camponotus floridanus, which possesses a functional DNA methylation system. We found that DNA methylation was positively correlated with the synonymous substitution rate in C. floridanus, suggesting a key effect of DNA methylation on patterns of gene evolution. However, our data suggest the link between DNA methylation and elevated rates of synonymous substitution was explained, in large part, by the targeting of DNA methylation to genes with signatures of transcriptionally active chromatin, rather than the mutational effect of DNA methylation itself. This phenomenon may be explained by an elevated mutation rate for genes residing in transcriptionally active chromatin, or by increased structural constraints on genes in inactive chromatin. This result highlights the importance of chromatin structure as the primary epigenetic driver of genome evolution in insects. Overall, our study demonstrates how different epigenetic systems contribute to variation in the rates of coding sequence evolution. Copyright © 2016 Glastad et al.
Genome structure of Rosa multiflora, a wild ancestor of cultivated roses

PubMed Central

Nakamura, Noriko; Hirakawa, Hideki; Sato, Shusei; Otagaki, Shungo; Matsumoto, Shogo; Tabata, Satoshi; Tanaka, Yoshikazu

2018-01-01

The draft genome sequence of a wild rose (Rosa multiflora Thunb.) was determined using Illumina MiSeq and HiSeq platforms. The total length of the scaffolds was 739,637,845 bp, consisting of 83,189 scaffolds, which was close to the 711 Mbp length estimated by k-mer analysis. The N50 length of the scaffolds was 90,830 bp, and the longest scaffold was 1,133,259 bp. The average GC content of the scaffolds was 38.9%. After gene prediction, 67,380 candidates exhibiting sequence homology to known genes and domains were extracted, which included complete and partial gene structures. This large number of genes for a diploid plant may reflect heterogeneity of the genome originating from self-incompatibility in R. multiflora. According to CEGMA analysis, 91.9% and 98.0% of the core eukaryotic genes were completely and partially conserved in the scaffolds, respectively. Genes presumably involved in flower color, scent and flowering are assigned. The results of this study will serve as a valuable resource for fundamental and applied research in the rose, including breeding and phylogenetic study of cultivated roses. PMID:29045613
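For readers unfamiliar with the N50 statistic quoted in this record: N50 is the scaffold length at which scaffolds of that length or longer hold at least half of the total assembled bases. The following is a minimal Python sketch under stated assumptions. The function name `n50` and the toy lengths are illustrative only; they are not the R. multiflora scaffolds and not the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not real data.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # 80: the two longest scaffolds already cover 180 of 300 bp
```

Sorting from longest to shortest matters here: the statistic rewards assemblies in which a few long scaffolds carry most of the sequence.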
Water-soluble polymers bearing phosphorylcholine group and other zwitterionic groups for carrying DNA derivatives.

PubMed

Lin, Xiaojie; Ishihara, Kazuhiko

2014-01-01

Water-soluble polymers with equal positive and negative charges in the same monomer unit, such as the phosphorylcholine group and other zwitterionic groups, exhibit promising potential in gene delivery with appreciable transfection efficiency, compared with the traditional poly(ethylene glycol)-based polycation-gene complexes. These zwitterionic polymers with various architectural structures and properties have been synthesized by various polymerization methods, such as conventional radical polymerization, atom-transfer radical-polymerization, reversible addition-fragmentation chain-transfer polymerization, and nitroxide-mediated radical polymerization. These techniques have been used to efficiently facilitate gene therapy by fabrication of non-viral vectors with high cytocompatibility, large gene-carrying capacity, effective cell-membrane permeability, and in vivo gene-loading/releasing functionality. Zwitterionic polymer-based gene delivery vector systems can be categorized into soluble-polymer/gene mixing, molecular self-assembly, and polymer-gene conjugation systems. This review describes the preparation and characterization of various zwitterionic polymer-based gene delivery vectors, specifically water-soluble phospholipid polymers for carrying gene derivatives.
Association mapping of starch chain length distribution and amylose content in pea (Pisum sativum L.) using carbohydrate metabolism candidate genes.

PubMed

Carpenter, Margaret A; Shaw, Martin; Cooper, Rebecca D; Frew, Tonya J; Butler, Ruth C; Murray, Sarah R; Moya, Leire; Coyne, Clarice J; Timmerman-Vaughan, Gail M

2017-08-01

Although starch consists of large macromolecules composed of glucose units linked by α-1,4-glycosidic linkages with α-1,6-glycosidic branchpoints, variation in starch structural and functional properties is found both within and between species. Interest in starch genetics is based on the importance of starch in food and industrial processes, with the potential of genetics to provide novel starches. The starch metabolic pathway is complex but has been characterized in diverse plant species, including pea. To understand how allelic variation in the pea starch metabolic pathway affects starch structure and percent amylose, partial sequences of 25 candidate genes were characterized for polymorphisms using a panel of 92 diverse pea lines. Variation in the percent amylose composition of extracted seed starch and (amylopectin) chain length distribution, one measure of starch structure, were characterized for these lines. Association mapping was undertaken to identify polymorphisms associated with the variation in starch chain length distribution and percent amylose, using a mixed linear model that incorporated population structure and kinship. Associations were found for polymorphisms in seven candidate genes plus Mendel's r locus (which conditions the round versus wrinkled seed phenotype). The genes with associated polymorphisms are involved in the substrate supply, chain elongation and branching stages of the pea carbohydrate and starch metabolic pathways. The association of polymorphisms in carbohydrate and starch metabolic genes with variation in amylopectin chain length distribution and percent amylose may help to guide manipulation of pea seed starch structural and functional properties through plant breeding.
A Subset of Autism-Associated Genes Regulate the Structural Stability of Neurons

PubMed Central

Lin, Yu-Chih; Frei, Jeannine A.; Kilander, Michaela B. C.; Shen, Wenjuan; Blatt, Gene J.

2016-01-01

Autism spectrum disorder (ASD) comprises a range of neurological conditions that affect individuals’ ability to communicate and interact with others. People with ASD often exhibit marked qualitative difficulties in social interaction, communication, and behavior. Alterations in neurite arborization and dendritic spine morphology, including size, shape, and number, are hallmarks of almost all neurological conditions, including ASD. As experimental evidence has emerged in recent years, it has become clear that although there is broad heterogeneity of identified autism risk genes, many of them converge into similar cellular pathways, including those regulating neurite outgrowth, synapse formation and spine stability, and synaptic plasticity. These mechanisms together regulate the structural stability of neurons and are vulnerable targets in ASD. In this review, we discuss the current understanding of those autism risk genes that affect the structural connectivity of neurons. We sub-categorize them into (1) cytoskeletal regulators, e.g., motors and small RhoGTPase regulators; (2) adhesion molecules, e.g., cadherins, NCAM, and neurexin superfamily; (3) cell surface receptors, e.g., glutamatergic receptors and receptor tyrosine kinases; (4) signaling molecules, e.g., protein kinases and phosphatases; and (5) synaptic proteins, e.g., vesicle and scaffolding proteins. Although the roles of some of these genes in maintaining neuronal structural stability are well studied, how mutations contribute to the autism phenotype is still largely unknown. Investigating whether and how the neuronal structure and function are affected when these genes are mutated will provide insights toward developing effective interventions aimed at improving the lives of people with autism and their families. PMID:27909399
Environmental factors that shape biofilm formation.

PubMed

Toyofuku, Masanori; Inaba, Tomohiro; Kiyokawa, Tatsunori; Obana, Nozomu; Yawata, Yutaka; Nomura, Nobuhiko

2016-01-01

Cells respond to the environment and alter gene expression. Recent studies have revealed the social aspects of bacterial life, such as biofilm formation. Biofilm formation is largely affected by the environment, and the mechanisms by which the gene expression of individual cells affects biofilm development have attracted interest. Environmental factors determine the cell's decision to form or leave a biofilm. In addition, the biofilm structure largely depends on the environment, implying that biofilms are shaped to adapt to local conditions. Second messengers such as cAMP and c-di-GMP are key factors that link environmental factors with gene regulation. Cell-to-cell communication is also an important factor in shaping the biofilm. In this short review, we will introduce the basics of biofilm formation and further discuss environmental factors that shape biofilm formation. Finally, the state-of-the-art tools that allow us to investigate biofilms under various conditions are discussed.
A combinatorial code for pattern formation in Drosophila oogenesis.

PubMed

Yakoby, Nir; Bristow, Christopher A; Gong, Danielle; Schafer, Xenia; Lembong, Jessica; Zartman, Jeremiah J; Halfon, Marc S; Schüpbach, Trudi; Shvartsman, Stanislav Y

2008-11-01

Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggests that they follow a simple combinatorial code based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of inductive signals, provided by the highly conserved epidermal growth factor receptor and bone morphogenetic protein signaling pathways. We demonstrate the validity of the code by testing it against a set of patterns obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguish 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterize their joint dynamics over four stages of oogenesis. The proposed combinatorial framework allows systematic analysis of the diversity and dynamics of two-dimensional transcriptional patterns and guides future studies of gene regulation.
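The set operations named in this abstract (union, difference, intersection, and addition) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two building blocks, their names, and the toy strip of cells are hypothetical and are not the six blocks defined by the authors. In particular, "addition" is modelled here as summing per-cell expression levels, which is only one plausible reading of the abstract.

```python
from collections import Counter

# Hypothetical building blocks: each is a set of (row, col) cells on a toy 1x6 strip.
block_a = {(0, 0), (0, 1), (0, 2), (0, 3)}
block_b = {(0, 2), (0, 3), (0, 4)}

# Boolean combinations: a cell is either in the resulting pattern or not.
union = block_a | block_b          # expressed where either block is present
difference = block_a - block_b     # expressed in A but excluded from B
intersection = block_a & block_b   # expressed only where both blocks overlap

# Assumption: "addition" keeps track of levels, so overlapping cells score 2.
addition = Counter(block_a) + Counter(block_b)

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
print(sorted(addition.items()))
```

Representing each pattern as a set of cells keeps the Boolean operations one-liners, while the `Counter` form separates graded overlap from simple presence or absence.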
Cell-autonomous correction of ring chromosomes in human induced pluripotent stem cells

NASA Astrophysics Data System (ADS)

Bershteyn, Marina; Hayashi, Yohei; Desachy, Guillaume; Hsiao, Edward C.; Sami, Salma; Tsang, Kathryn M.; Weiss, Lauren A.; Kriegstein, Arnold R.; Yamanaka, Shinya; Wynshaw-Boris, Anthony

2014-03-01

Ring chromosomes are structural aberrations commonly associated with birth defects, mental disabilities and growth retardation. Rings form after fusion of the long and short arms of a chromosome, and are sometimes associated with large terminal deletions. Owing to the severity of these large aberrations that can affect multiple contiguous genes, no possible therapeutic strategies for ring chromosome disorders have been proposed. During cell division, ring chromosomes can exhibit unstable behaviour leading to continuous production of aneuploid progeny with low viability and high cellular death rate. The overall consequences of this chromosomal instability have been largely unexplored in experimental model systems. Here we generated human induced pluripotent stem cells (iPSCs) from patient fibroblasts containing ring chromosomes with large deletions and found that reprogrammed cells lost the abnormal chromosome and duplicated the wild-type homologue through the compensatory uniparental disomy (UPD) mechanism. The karyotypically normal iPSCs with isodisomy for the corrected chromosome outgrew co-existing aneuploid populations, enabling rapid and efficient isolation of patient-derived iPSCs devoid of the original chromosomal aberration. Our results suggest a fundamentally different function for cellular reprogramming as a means of 'chromosome therapy' to reverse combined loss-of-function across many genes in cells with large-scale aberrations involving ring structures. In addition, our work provides an experimentally tractable human cellular system for studying mechanisms of chromosomal number control, which is of critical relevance to human development and disease.
Identification and characterization of novel mutations of the major Fanconi anemia gene FANCA in the Japanese population.

PubMed

Yagasaki, Hiroshi; Hamanoue, Satoshi; Oda, Tsukasa; Nakahata, Tatsutoshi; Asano, Shigetaka; Yamashita, Takayuki

2004-12-01

Fanconi anemia (FA) is a rare autosomal recessive disorder of hematopoiesis, with at least 11 complementation groups. FANCA, a gene for group A, accounts for the majority of FA patients. Previous studies of FANCA mutations revealed high allelic heterogeneity, frequent occurrence of large deletions, and interpopulation differences. However, systematic mutational analysis, including gene dosage assay to detect large deletions, has not been documented for Asian populations. A newly developed TaqMan quantitative PCR-based gene dosage assay, combined with sequencing of exons and cDNA fragments, allowed for detection of 48 mutant alleles of FANCA in 27 (77%) of 35 unrelated Japanese FA families with no detectable mutations in FANCC or FANCG. We identified 29 different mutations (21 nucleotide substitutions or small deletions/insertions and eight large deletions), at least 20 of which were novel. The FANCA mutational spectrum of the Japanese was different from that of other ethnic groups so far studied. This is the largest scale of mutation analysis of FANCA in the Japanese population. Characterization of these mutations provided new information regarding the mutagenesis mechanisms and structure-function relationship of FANCA. Specifically, our data suggest that diverse mechanisms including nonhomologous recombination as well as Alu-mediated homologous recombination are involved in the generation of large deletions in FANCA. Copyright 2004 Wiley-Liss, Inc.

Large-scale sequence and structural comparisons of human naive and antigen-experienced antibody repertoires

PubMed Central

DeKosky, Brandon J.; Lungu, Oana I.; Park, Daechan; Johnson, Erik L.; Charab, Wissam; Chrysostomou, Constantine; Kuroda, Daisuke; Ellington, Andrew D.; Ippolito, Gregory C.; Gray, Jeffrey J.; Georgiou, George

2016-01-01

Elucidating how antigen exposure and selection shape the human antibody repertoire is fundamental to our understanding of B-cell immunity. We sequenced the paired heavy- and light-chain variable regions (VH and VL, respectively) from large populations of single B cells combined with computational modeling of antibody structures to evaluate sequence and structural features of human antibody repertoires at unprecedented depth. Analysis of a dataset comprising 55,000 antibody clusters from CD19+CD20+CD27− IgM-naive B cells, >120,000 antibody clusters from CD19+CD20+CD27+ antigen–experienced B cells, and >2,000 RosettaAntibody-predicted structural models across three healthy donors led to a number of key findings: (i) VH and VL gene sequences pair in a combinatorial fashion without detectable pairing restrictions at the population level; (ii) certain VH:VL gene pairs were significantly enriched or depleted in the antigen-experienced repertoire relative to the naive repertoire; (iii) antigen selection increased antibody paratope net charge and solvent-accessible surface area; and (iv) public heavy-chain third complementarity-determining region (CDR-H3) antibodies in the antigen-experienced repertoire showed signs of convergent paired light-chain genetic signatures, including shared light-chain third complementarity-determining region (CDR-L3) amino acid sequences and/or Vκ,λ–Jκ,λ genes. The data reported here address several longstanding questions regarding antibody repertoire selection and development and provide a benchmark for future repertoire-scale analyses of antibody responses to vaccination and disease. PMID:27114511
“Guilt by Association” Is the Exception Rather Than the Rule in Gene Networks

PubMed Central

Gillis, Jesse; Pavlidis, Paul

2012-01-01

Gene networks are commonly interpreted as encoding functional information in their connections. An extensively validated principle called guilt by association states that genes which are associated or interacting are more likely to share function. Guilt by association provides the central top-down principle for analyzing gene networks in functional terms or assessing their quality in encoding functional information. In this work, we show that functional information within gene networks is typically concentrated in only a very few interactions whose properties cannot be reliably related to the rest of the network. In effect, the apparent encoding of function within networks has been largely driven by outliers whose behaviour cannot even be generalized to individual genes, let alone to the network at large. While experimentalist-driven analysis of interactions may use prior expert knowledge to focus on the small fraction of critically important data, large-scale computational analyses have typically assumed that high-performance cross-validation in a network is due to a generalizable encoding of function. Because we find that gene function is not systemically encoded in networks, but dependent on specific and critical interactions, we conclude it is necessary to focus on the details of how networks encode function and what information computational analyses use to extract functional meaning. We explore a number of consequences of this and find that network structure itself provides clues as to which connections are critical and that systemic properties, such as scale-free-like behaviour, do not map onto the functional connectivity within networks. PMID:22479173
β-Cell-Specific Mafk Overexpression Impairs Pancreatic Endocrine Cell Development

PubMed Central

Abdellatif, Ahmed M.; Oishi, Hisashi; Itagaki, Takahiro; Jung, Yunshin; Shawki, Hossam H.; Okita, Yukari; Hasegawa, Yoshikazu; Suzuki, Hiroyuki; El-Morsy, Salah E.; El-Sayed, Mesbah A.; Shoaib, Mahmoud B.; Sugiyama, Fumihiro; Takahashi, Satoru

2016-01-01

The MAF family transcription factors are homologs of v-Maf, the oncogenic component of the avian retrovirus AS42. They are subdivided into 2 groups, small and large MAF proteins, according to their structure, function, and molecular size. MAFK is a member of the small MAF family and acts as a dominant negative form of large MAFs. In previous research we generated transgenic mice that overexpress MAFK in order to suppress the function of large MAF proteins in pancreatic β-cells. These mice developed hyperglycemia in adulthood due to impairment of glucose-stimulated insulin secretion. The aim of the current study is to examine the effects of β-cell-specific Mafk overexpression in endocrine cell development. The developing islets of Mafk-transgenic embryos appeared to be disorganized with an inversion of total numbers of insulin+ and glucagon+ cells due to reduced β-cell proliferation. Gene expression analysis by quantitative RT-PCR revealed decreased levels of β-cell-related genes whose expressions are known to be controlled by large MAF proteins. Additionally, these changes were accompanied with a significant increase in key β-cell transcription factors likely due to compensatory mechanisms that might have been activated in response to the β-cell loss. Finally, microarray comparison of gene expression profiles between wild-type and transgenic pancreata revealed alteration of some uncharacterized genes including Pcbd1, Fam132a, Cryba2, and Npy, which might play important roles during pancreatic endocrine development. Taken together, these results suggest that Mafk overexpression impairs endocrine development through a regulation of numerous β-cell-related genes. The microarray analysis provided a unique data set of differentially expressed genes that might contribute to a better understanding of the molecular basis that governs the development and function of endocrine pancreas. PMID:26901059
Subunit association of gamma-glutamyltranspeptidase of Escherichia coli K-12.

PubMed

Hashimoto, W; Suzuki, H; Nohara, S; Tachi, H; Yamamoto, K; Kumagai, H

1995-12-01

gamma-Glutamyltranspeptidase [EC 2.3.2.2] of Escherichia coli K-12 consists of one large subunit and one small subunit, which can be separated from each other by high-performance liquid chromatography. Using ion spray mass spectrometry, the masses of the large and the small subunit were determined to be 39,207 and 20,015, respectively. The large subunit exhibited no gamma-glutamyltranspeptidase activity and the small subunit had little enzymatic activity, but a mixture of the two subunits showed partial recovery of the enzymatic activity. The results of native-polyacrylamide gel electrophoresis suggested that they could partially recombine, and that the recombined dimer exhibited enzymatic activity. The gene of gamma-glutamyltranspeptidase encoded a signal peptide, and the large and small subunits in a single open reading frame in that order. Two kinds of plasmid were constructed encoding the signal peptide and either the large or the small subunit. A gamma-glutamyltranspeptidase-less mutant of E. coli K-12 was transformed with each plasmid or with both of them. The strain harboring the plasmid encoding each subunit produced a small amount of the corresponding subunit protein in the periplasmic space but exhibited no enzymatic activity. The strain transformed with both plasmids together exhibited the enzymatic activity, but its specific activity was approximately 3% of that of a strain harboring a plasmid encoding the intact structural gene. These results indicate that a portion of the separated large and small subunits can be reconstituted in vitro and exhibit the enzymatic activity, and that the expressed large and small subunits independently are able to associate in vivo and be folded into an active structure, though the specific activity of the associated subunits was much lower than that of native enzyme. 
This suggests that the synthesis of gamma-glutamyltranspeptidase in a single precursor polypeptide and subsequent processing are more effective to construct the intact structure of gamma-glutamyltranspeptidase than the association of the separated large and small subunits.
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

PubMed Central

Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

2012-01-01

Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms. PMID:22807664
Phylogenetics and evolution of Su(var)3-9 SET genes in land plants: rapid diversification in structure and function.

PubMed

Zhu, Xinyu; Ma, Hong; Chen, Zhiduan

2011-03-09

Plants contain numerous Su(var)3-9 homologues (SUVH) and related (SUVR) genes, some of which await functional characterization. Although there have been studies on the evolution of plant Su(var)3-9 SET genes, a systematic evolutionary study including major land plant groups has not been reported. Large-scale phylogenetic and evolutionary analyses can help to elucidate the underlying molecular mechanisms and contribute to improving genome annotation. Putative orthologs of plant Su(var)3-9 SET protein sequences were retrieved from major representatives of land plants. A novel clustering that included most members analyzed, henceforth referred to as core Su(var)3-9 homologues and related (cSUVHR) gene clade, was identified as well as all orthologous groups previously identified. Our analysis showed that plant Su(var)3-9 SET proteins possessed a variety of domain organizations, and can be classified into five types and ten subtypes. Plant Su(var)3-9 SET genes also exhibit a wide range of gene structures among different paralogs within a family, even in the regions encoding conserved PreSET and SET domains. We also found that the majority of SUVH members were intronless and formed three subclades within the SUVH clade. A detailed phylogenetic analysis of the plant Su(var)3-9 SET genes was performed. A novel deep phylogenetic relationship including most plant Su(var)3-9 SET genes was identified. Additional domains such as SAR, ZnF_C2H2 and WIYLD were integrated early into the primordial PreSET/SET/PostSET domain organization. At least three classes of gene structures had been formed before the divergence of Physcomitrella patens (moss) from other land plants. One or multiple retroposition events might have occurred among SUVH genes with the donor genes leading to the V-2 orthologous group. The structural differences among evolutionary groups of plant Su(var)3-9 SET genes with different functions were described, contributing to the design of further experimental studies.
Computational prediction and experimental verification of HVA1-like abscisic acid responsive promoters in rice (Oryza sativa).

PubMed

Ross, Christian; Shen, Qingxi J

2006-09-01

Abscisic acid (ABA) is one of the central plant hormones, responsible for controlling both maturation and germination in seeds, as well as mediating adaptive responses to desiccation, injury, and pathogen infection in vegetative tissues. Thorough analyses of two barley genes, HVA1 and HVA22, indicate that their response to ABA relies on the interaction of two cis-acting elements in their promoters, an ABA response element (ABRE) and a coupling element (CE). Together, they form an ABA response promoter complex (ABRC). Comparison of promoters of barley HVA1 and its rice orthologue indicates that the structures and sequences of their ABRCs are highly similar. Prediction of ABA responsive genes in the rice genome is then tractable to a bioinformatics approach based on the structures of the well-defined barley ABRCs. Here we describe a model developed based on the consensus, inter-element spacing and orientations of experimentally determined ABREs and CEs. Our search of the rice promoter database for promoters that fit the model has generated a partial list of genes in rice that have a high likelihood of being involved in the ABA signaling network. The ABA inducibility of some of the rice genes identified was validated with quantitative reverse transcription PCR (QPCR). By limiting our input data to known enhancer modules and experimentally derived rules, we have generated a high confidence subset of ABA-regulated genes. The results suggest that the pathways by which cereals respond to biotic and abiotic stresses overlap significantly, and that regulation is not confined to the level of transcription. The large fraction of putative regulatory genes carrying HVA1-like enhancer modules in their promoters suggests the ABA signal enters at multiple points into a complex regulatory network that remains largely unmapped.
Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion

PubMed Central

Cuscó, Ivon; Corominas, Roser; Bayés, Mònica; Flores, Raquel; Rivera-Brugués, Núria; Campuzano, Victoria; Pérez-Jurado, Luis A.

2008-01-01

Large copy number variants (CNVs) have been recently found as structural polymorphisms of the human genome of still unknown biological significance. CNVs are significantly enriched in regions with segmental duplications or low-copy repeats (LCRs). Williams-Beuren syndrome (WBS) is a neurodevelopmental disorder caused by a heterozygous deletion of contiguous genes at 7q11.23 mediated by nonallelic homologous recombination (NAHR) between large flanking LCRs and facilitated by a structural variant of the region, a ∼2-Mb paracentric inversion present in 20%–25% of WBS-transmitting progenitors. We now report that eight out of 180 (4.44%) WBS-transmitting progenitors are carriers of a CNV, displaying a chromosome with large deletion of LCRs. The prevalence of this CNV among control individuals and non-transmitting progenitors is much lower (1%, n = 600), thus indicating that it is a predisposing factor for the WBS deletion (odds ratio 4.6-fold, P = 0.002). LCR duplications were found in 2.22% of WBS-transmitting progenitors but also in 1.16% of controls, which implies a non–statistically significant increase in WBS-transmitting progenitors. We have characterized the organization and breakpoints of these CNVs, encompassing ∼100–300 kb of genomic DNA and containing several pseudogenes but no functional genes. Additional structural variants of the region have also been defined, all generated by NAHR between different blocks of segmental duplications. Our data further illustrate the highly dynamic structure of regions rich in segmental duplications, such as the WBS locus, and indicate that large CNVs can act as susceptibility alleles for disease-associated genomic rearrangements in the progeny. PMID:18292220
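The reported odds ratio can be checked with a short worked calculation. The sketch below is illustrative rather than the authors' analysis: the function name `odds_ratio` is hypothetical, and the control count of 6 carriers is an assumption inferred from the stated prevalence of 1% among n = 600, since the abstract does not give the raw control count.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio from a 2x2 table of carriers vs. non-carriers."""
    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    # (a / b) / (c / d) written as a single fraction to avoid intermediate rounding.
    return (case_carriers * control_non) / (case_non * control_carriers)


# 8 of 180 WBS-transmitting progenitors carry the LCR deletion (from the abstract).
# 6 of 600 controls is an assumption derived from "1%, n = 600".
value = odds_ratio(8, 180, 6, 600)
print(round(value, 1))  # 4.6
```

Under that assumption the result agrees with the 4.6-fold odds ratio quoted in the abstract.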
Polytene Chromosomes - A Portrait of Functional Organization of the Drosophila Genome.

PubMed

Zykova, Tatyana Yu; Levitsky, Victor G; Belyaeva, Elena S; Zhimulev, Igor F

2018-04-01

This mini-review is devoted to the problem of the genetic meaning of the main polytene chromosome structures - bands and interbands. Generally, densely packed chromatin forms black bands, moderately condensed regions form grey loose bands, whereas decondensed regions of the genome appear as interbands. Recent progress in the annotation of the Drosophila genome and epigenome has made it possible to compare the banding pattern and the structural organization of genes, as well as their activity. This was greatly aided by our ability to establish the borders of bands and interbands on the physical map, which allowed us to perform comprehensive side-by-side comparisons of cytology, genetic and epigenetic maps and to uncover the association between the morphological structures and the functional domains of the genome. These studies largely conclude that interbands correspond to the 5'-ends of housekeeping genes that are active across all cell types. Interbands are enriched with proteins involved in transcription and nucleosome remodeling, as well as with active histone modifications. Notably, most of the replication origins map to interband regions. As for grey loose bands adjacent to interbands, they typically host the bodies of housekeeping genes. Thus, the bipartite structure composed of an interband and an adjacent grey band functions as a standalone genetic unit. Finally, black bands harbor tissue-specific genes with narrow temporal and tissue expression profiles. Thus, the uniform and permanent activity of interbands combined with the inactivity of genes in bands forms the basis of the universal banding pattern observed in various Drosophila tissues.
Co-Option and De Novo Gene Evolution Underlie Molluscan Shell Diversity

PubMed Central

Aguilera, Felipe; McDougall, Carmel

2017-01-01

Abstract Molluscs fabricate shells of incredible diversity and complexity by localized secretions from the dorsal epithelium of the mantle. Although distantly related molluscs express remarkably different secreted gene products, it remains unclear if the evolution of shell structure and pattern is underpinned by the differential co-option of conserved genes or the integration of lineage-specific genes into the mantle regulatory program. To address this, we compare the mantle transcriptomes of 11 bivalves and gastropods of varying relatedness. We find that each species, including four Pinctada (pearl oyster) species that diverged within the last 20 Ma, expresses a unique mantle secretome. Lineage- or species-specific genes comprise a large proportion of each species’ mantle secretome. A majority of these secreted proteins have unique domain architectures that include repetitive, low complexity domains (RLCDs), which evolve rapidly, and have a proclivity to expand, contract and rearrange in the genome. There are also a large number of secretome genes expressed in the mantle that arose before the origin of gastropods and bivalves. Each species expresses a unique set of these more ancient genes consistent with their independent co-option into these mantle gene regulatory networks. From this analysis, we infer lineage-specific secretomes underlie shell diversity, and include both rapidly evolving RLCD-containing proteins, and the continual recruitment and loss of both ancient and recently evolved genes into the periphery of the regulatory network controlling gene expression in the mantle epithelium. PMID:28053006
Genome Reduction Uncovers a Large Dispensable Genome and Adaptive Role for Copy Number Variation in Asexually Propagated Solanum tuberosum

PubMed Central

Hardigan, Michael A.; Crisovan, Emily; Hamilton, John P.; Laimbeer, Parker; Leisner, Courtney P.; Manrique-Carpintero, Norma C.; Newton, Linsey; Pham, Gina M.; Vaillancourt, Brieanne; Zeng, Zixian; Jiang, Jiming

2016-01-01

Clonally reproducing plants have the potential to bear a significantly greater mutational load than sexually reproducing species. To investigate this possibility, we examined the breadth of genome-wide structural variation in a panel of monoploid/doubled monoploid clones generated from native populations of diploid potato (Solanum tuberosum), a highly heterozygous asexually propagated plant. As rare instances of purely homozygous clones, they provided an ideal set for determining the degree of structural variation tolerated by this species and deriving its minimal gene complement. Extensive copy number variation (CNV) was uncovered, impacting 219.8 Mb (30.2%) of the potato genome with nearly 30% of genes subject to at least partial duplication or deletion, revealing the highly heterogeneous nature of the potato genome. Dispensable genes (>7000) were associated with limited transcription and/or a recent evolutionary history, with lower deletion frequency observed in genes conserved across angiosperms. Association of CNV with plant adaptation was highlighted by enrichment in gene clusters encoding functions for environmental stress response, with gene duplication playing a part in species-specific expansions of stress-related gene families. This study revealed unique impacts of CNV in a species with asexual reproductive habits and how CNV may drive adaption through evolution of key stress pathways. PMID:26772996
Stability and structural properties of gene regulation networks with coregulation rules.

PubMed

Warrell, Jonathan; Mhlanga, Musa

2017-05-07

Coregulation of the expression of groups of genes has been extensively demonstrated empirically in bacterial and eukaryotic systems. Such coregulation can arise through the use of shared regulatory motifs, which allow the coordinated expression of modules (and module groups) of functionally related genes across the genome. Coregulation can also arise through the physical association of multi-gene complexes through chromosomal looping, which are then transcribed together. We present a general formalism for modeling coregulation rules in the framework of Random Boolean Networks (RBN), and develop specific models for transcription factor networks with modular structure (including module groups, and multi-input modules (MIM) with autoregulation) and multi-gene complexes (including hierarchical differentiation between multi-gene complex members). We develop a mean-field approach to analyse the dynamical stability of large networks incorporating coregulation, and show that autoregulated MIM and hierarchical gene-complex models can achieve greater stability than networks without coregulation whose rules have matching activation frequency. We provide further analysis of the stability of small networks of both kinds through simulations. We also characterize several general properties of the transients and attractors in the hierarchical coregulation model, and show using simulations that the steady-state distribution factorizes hierarchically as a Bayesian network in a Markov Jump Process analogue of the RBN model. Copyright © 2017. Published by Elsevier Ltd.
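To make the Random Boolean Network framework named in the abstract above concrete, here is a minimal sketch of a plain synchronous RBN with its transient and attractor lengths. It does not implement the authors' coregulation rules, modules, or mean-field analysis; the function names, network size, and seed are all illustrative assumptions.

```python
import random

def make_rbn(n_nodes, k_inputs, seed=0):
    """Build a random Boolean network: each node gets k inputs and a random truth table."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_nodes)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state of its inputs."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for i in node_inputs:
            index = (index << 1) | state[i]  # pack input bits into a table index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    transient = seen[state]
    return transient, t - transient

inputs, tables = make_rbn(n_nodes=8, k_inputs=2, seed=1)
transient_length, cycle_length = find_attractor((0,) * 8, inputs, tables)
print(transient_length, cycle_length)
```

Because the state space is finite (2^8 states here), the loop always terminates. A coregulated variant would constrain how the truth tables or input sets are drawn, which is where the models described in the abstract would differ from this sketch.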
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

PubMed Central

Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

2001-01-01

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Evolution of a horizontally acquired legume gene, albumin 1, in the parasitic plant Phelipanche aegyptiaca and related species.

PubMed

Zhang, Yeting; Fernandez-Aparicio, Monica; Wafula, Eric K; Das, Malay; Jiao, Yuannian; Wickett, Norman J; Honaas, Loren A; Ralph, Paula E; Wojciechowski, Martin F; Timko, Michael P; Yoder, John I; Westwood, James H; Depamphilis, Claude W

2013-02-20

Parasitic plants, represented by several thousand species of angiosperms, use modified structures known as haustoria to tap into photosynthetic host plants and extract nutrients and water. As a result of their direct plant-plant connections with their host plant, parasitic plants have special opportunities for horizontal gene transfer, the nonsexual transmission of genetic material across species boundaries. There is increasing evidence that parasitic plants have served as recipients and donors of horizontal gene transfer (HGT), but the long-term impacts of eukaryotic HGT in parasitic plants are largely unknown. Here we show that a gene encoding albumin 1 KNOTTIN-like protein, closely related to the albumin 1 genes only known from papilionoid legumes, where they serve dual roles as food storage and insect toxin, was found in Phelipanche aegyptiaca and related parasitic species of family Orobanchaceae, and was likely acquired by a Phelipanche ancestor via HGT from a legume host based on phylogenetic analyses. The KNOTTINs are well known for their unique "disulfide through disulfide knot" structure and have been extensively studied in various contexts, including drug design. Genomic sequences from nine related parasite species were obtained, and 3D protein structure simulation tests and evolutionary constraint analyses were performed. The parasite gene we identified here retains the intron structure, six highly conserved cysteine residues necessary to form a KNOTTIN protein, and displays levels of purifying selection like those seen in legumes. The albumin 1 xenogene has evolved through >150 speciation events over ca. 16 million years, forming a small family of differentially expressed genes that may confer novel functions in the parasites. 
Moreover, further data show that a distantly related parasitic plant, Cuscuta, obtained two copies of albumin 1 KNOTTIN-like genes from legumes through a separate HGT event, suggesting that legume KNOTTIN structures have been repeatedly co-opted by parasitic plants. The HGT-derived albumins in Phelipanche represent a novel example of how plants can acquire genes from other plants via HGT that then go on to duplicate, evolve, and retain the specialized features required to perform a unique host-derived function.
Aggregating Data for Computational Toxicology Applications ...

EPA Pesticide Factsheets

Computational toxicology combines data from high-throughput test methods, chemical structure analyses and other biological domains (e.g., genes, proteins, cells, tissues) with the goals of predicting and understanding the underlying mechanistic causes of chemical toxicity and for predicting toxicity of new chemicals and products. A key feature of such approaches is their reliance on knowledge extracted from large collections of data and data sets in computable formats. The U.S. Environmental Protection Agency (EPA) has developed a large data resource called ACToR (Aggregated Computational Toxicology Resource) to support these data-intensive efforts. ACToR comprises four main repositories: core ACToR (chemical identifiers and structures, and summary data on hazard, exposure, use, and other domains), ToxRefDB (Toxicity Reference Database, a compilation of detailed in vivo toxicity data from guideline studies), ExpoCastDB (detailed human exposure data from observational studies of selected chemicals), and ToxCastDB (data from high-throughput screening programs, including links to underlying biological information related to genes and pathways). The EPA DSSTox (Distributed Structure-Searchable Toxicity) program provides expert-reviewed chemical structures and associated information for these and other high-interest public inventories. Overall, the ACToR system contains information on about 400,000 chemicals from 1100 different sources. The entire system is built usi
Local scale connectivity in the cave-dwelling brooding fish Apogon imberbis

NASA Astrophysics Data System (ADS)

Muths, Delphine; Rastorgueff, Pierre-Alexandre; Selva, Marjorie; Chevaldonné, Pierre

2015-01-01

A lower degree of population connectivity is generally expected for species living in a naturally fragmented habitat than for species living in a continuum of suitable environment. Due to clear-cut environmental conditions with the surrounding littoral zone, underwater marine caves of the Mediterranean Sea constitute a good model to explore the effect of habitat discontinuity on the population structure of their inhabitants. With this goal, the genetic population structure of Apogon imberbis, a mouth-brooding teleost, was explored using the mitochondrial cytochrome b gene and 7 nuclear microsatellite loci from 164 fishes sampled at the micro-scale (ca. 40 km) of the Marseille area (Bay of Marseille and Calanques coast, in NW Mediterranean). Both marker types indicated a low level of genetic structure within the studied area. We propose that each suitable crack and cavity is used as a stepping-stone habitat between disconnected large cave-habitats. This, together with larval dispersal, ensures enough gene flow between caves to homogenize the genetic pattern at microscale while isolation by distance and by open waters could explain the small structure observed. The present study indicates that the effect of natural fragmentation in connectivity disruption can largely be counter-balanced by life history traits and overlooked details in habitat preferences.
Introduction to bioinformatics.

PubMed

Can, Tolga

2014-01-01

Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data are usually represented as matrices, and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs, and graph-theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
DSSTox chemical-index files for exposure-related ...

EPA Pesticide Factsheets

The Distributed Structure-Searchable Toxicity (DSSTox) ARYEXP and GEOGSE files are newly published, structure-annotated files of the chemical-associated and chemical exposure-related summary experimental content contained in the ArrayExpress Repository and Gene Expression Omnibus (GEO) Series (based on data extracted on September 20, 2008). ARYEXP and GEOGSE contain 887 and 1064 unique chemical substances mapped to 1835 and 2381 chemical exposure-related experiment accession IDs, respectively. The standardized files allow one to assess, compare and search the chemical content in each resource, in the context of the larger DSSTox toxicology data network, as well as across large public cheminformatics resources such as PubChem (http://pubchem.ncbi.nlm.nih.gov).
Changes in bacterial community of anthracene bioremediation in municipal solid waste composting soil

PubMed Central

Zhang, Shu-ying; Wang, Qing-feng; Wan, Rui; Xie, Shu-guang

2011-01-01

Polycyclic aromatic hydrocarbons (PAHs) are common contaminants in a municipal solid waste (MSW) composting site. Knowledge of changes in microbial structure is useful to identify particular PAH degraders. However, the microbial community in the MSW composting soil and its change associated with prolonged exposure to PAHs and subsequent biodegradation remain largely unknown. In this study, anthracene was selected as a model compound. The bacterial community structure was investigated using terminal restriction fragment length polymorphism (TRFLP) and 16S rRNA gene clone library analysis. The two molecular tools revealed a large shift of bacterial community structure after anthracene amendment and subsequent biodegradation. Genera Methylophilus, Mesorhizobium, and Terrimonas had potential links to anthracene biodegradation, suggesting a consortium playing an active role. PMID:21887852
The fine-scale genetic structure of the two-spotted spider mite in a commercial greenhouse.

PubMed

Uesugi, R; Kunimoto, Y; Osakabe, Mh

2009-02-01

The fine-scale genetic structure of Tetranychus urticae Koch was studied to estimate local gene flow within a rose tree habitat in a commercial greenhouse using seven microsatellite markers. Two beds of rose trees with different population densities were selected and 18 consecutive quadrats of 1.2 m length were sequentially established in each bed. Heterozygote deficiency was positive within quadrats, which was most likely a result of the Wahlund effect because the mites usually form small breeding colonies. Low population density and frequent inbreeding could also accelerate genetic differentiation among the breeding colonies. A short-range (2.4-3.6 m) positive autocorrelation and clear genetic cline among quadrat populations was detected within a bed. This suggests that gene flow was limited to a short range even if population density was substantially increased. Therefore, large-scale dispersal such as aerial dispersal contributed very little to gene flow in the greenhouse.

Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures

PubMed Central

Naughton, Catherine; Avlonitis, Nicolaos; Corless, Samuel; Prendergast, James G.; Mati, Ioulia K.; Eijk, Paul P.; Cockroft, Scott L.; Bradley, Mark; Ylstra, Bauke; Gilbert, Nick

2013-01-01

DNA supercoiling is an inherent consequence of twisting DNA and is critical for regulating gene expression and DNA replication. However, DNA supercoiling at a genomic scale in human cells is uncharacterized. To map supercoiling we used biotinylated-trimethylpsoralen as a DNA structure probe to show the genome is organized into supercoiling domains. Domains are formed and remodeled by RNA polymerase and topoisomerase activities and are flanked by GC-AT boundaries and CTCF binding sites. Under-wound domains are transcriptionally active, enriched in topoisomerase I, “open” chromatin fibers and DNaseI sites, but are depleted of topoisomerase II. Furthermore DNA supercoiling impacts on additional levels of chromatin compaction as under-wound domains are cytologically decondensed, topologically constrained, and decompacted by transcription of short RNAs. We suggest that supercoiling domains create a topological environment that facilitates gene activation providing an evolutionary purpose for clustering genes along chromosomes. PMID:23416946
Nucleic acids encoding plant glutamine phenylpyruvate transaminase (GPT) and uses thereof

DOEpatents

Unkefer, Pat J.; Anderson, Penelope S.; Knight, Thomas J.

2016-03-29

Glutamine phenylpyruvate transaminase (GPT) proteins, nucleic acid molecules encoding GPT proteins, and uses thereof are disclosed. Provided herein are various GPT proteins and GPT gene coding sequences isolated from a number of plant species. As disclosed herein, GPT proteins share remarkable structural similarity within plant species, and are active in catalyzing the synthesis of 2-hydroxy-5-oxoproline (2-oxoglutaramate), a powerful signal metabolite which regulates the function of a large number of genes involved in the photosynthesis apparatus, carbon fixation and nitrogen metabolism.
Comparative genomic analysis of the MHC: the evolution of class I duplication blocks, diversity and complexity from shark to man.

PubMed

Kulski, Jerzy K; Shiina, Takashi; Anzai, Tatsuya; Kohara, Sakae; Inoko, Hidetoshi

2002-12-01

The major histocompatibility complex (MHC) genomic region is composed of a group of linked genes involved functionally with the adaptive and innate immune systems. The class I and class II genes are intrinsic features of the MHC and have been found in all the jawed vertebrates studied so far. The MHC genomic regions of the human and the chicken (B locus) have been fully sequenced and mapped, and the mouse MHC sequence is almost finished. Information on the MHC genomic structures (size, complexity, genic and intergenic composition and organization, gene order and number) of other vertebrates is largely limited or nonexistent. Therefore, we are mapping, sequencing and analyzing the MHC genomic regions of different human haplotypes and at least eight nonhuman species. Here, we review our progress with these sequences and compare the human MHC structure with that of the nonhuman primates (chimpanzee and rhesus macaque), other mammals (pigs, mice and rats) and nonmammalian vertebrates such as birds (chicken and quail), bony fish (medaka, pufferfish and zebrafish) and cartilaginous fish (nurse shark). This comparison reveals a complex MHC structure for mammals and a relatively simpler design for nonmammalian animals with a hypothetical prototypic structure for the shark. In the mammalian MHC, there are two to five different class I duplication blocks embedded within a framework of conserved nonclass I and/or nonclass II genes. With a few exceptions, the class I framework genes are absent from the MHC of birds, bony fish and sharks. Comparative genomics of the MHC reveal a highly plastic region with major structural differences between the mammalian and nonmammalian vertebrates. Additional genomic data are needed on animals of the reptilia, crocodilia and marsupial classes to find the origins of the class I framework genes and examples of structures that may be intermediate between the simple and complex MHC organizations of birds and mammals, respectively.
The Role of Retrotransposons in Gene Family Expansions in the Human and Mouse Genomes

PubMed Central

Janoušek, Václav; Laukaitis, Christina M.; Yanchukov, Alexey

2016-01-01

Abstract Retrotransposons comprise a large portion of mammalian genomes. They contribute to structural changes and more importantly to gene regulation. The expansion and diversification of gene families have been implicated as sources of evolutionary novelties. Given the roles retrotransposons play in genomes, their contribution to the evolution of gene families warrants further exploration. In this study, we found a significant association between two major retrotransposon classes, LINEs and LTRs, and lineage-specific gene family expansions in both the human and mouse genomes. The distribution and diversity differ between LINEs and LTRs, suggesting that each has a distinct involvement in gene family expansion. LTRs are associated with open chromatin sites surrounding the gene families, supporting their involvement in gene regulation, whereas LINEs may play a structural role promoting gene duplication. Our findings also suggest that gene family expansions, especially in the mouse genome, undergo two phases. The first phase is characterized by elevated deposition of LTRs and their utilization in reshaping gene regulatory networks. The second phase is characterized by rapid gene family expansion due to continuous accumulation of LINEs and it appears that, in some instances at least, this could become a runaway process. We provide an example in which this has happened and we present a simulation supporting the possibility of the runaway process. Altogether we provide evidence of the contribution of retrotransposons to the expansion and evolution of gene families. Our findings emphasize the putative importance of these elements in diversification and adaptation in the human and mouse lineages. PMID:27503295
Transcript analysis of the extended hyp-operon in the cyanobacteria Nostoc sp. strain PCC 7120 and Nostoc punctiforme ATCC 29133

PubMed Central

2011-01-01

Background Cyanobacteria harbor two [NiFe]-type hydrogenases consisting of a large and a small subunit, the Hup- and Hox-hydrogenase, respectively. Insertion of ligands and correct folding of nickel-iron hydrogenases require assistance of accessory maturation proteins (encoded by the hyp-genes). The intergenic region between the structural genes encoding the uptake hydrogenase (hupSL) and the accessory maturation proteins (hyp genes) in the cyanobacteria Nostoc PCC 7120 and N. punctiforme was analysed using molecular methods. Findings The five ORFs, located in between the uptake hydrogenase structural genes and the hyp-genes, can form a transcript with the hyp-genes. An identical genomic localization of these ORFs is found in other filamentous, N2-fixing cyanobacterial strains. In N. punctiforme and Nostoc PCC 7120 the ORFs upstream of the hyp-genes showed transcript level profiles similar to those of hupS (hydrogenase structural gene), nifD (nitrogenase structural gene), hypC and hypF (accessory hydrogenase maturation genes) after nitrogen depletion. In silico analyses showed that these ORFs in N. punctiforme harbor the same conserved regions as their homologues in Nostoc PCC 7120 and that they, like their homologues in Nostoc PCC 7120, can be transcribed together with the hyp-genes forming a larger extended hyp-operon. DNA binding studies showed interactions of the transcriptional regulators CalA and CalB with the promoter regions of the extended hyp-operon in N. punctiforme and Nostoc PCC 7120. Conclusions The five ORFs upstream of the hyp-genes in several filamentous N2-fixing cyanobacteria have an identical genomic localization, in between the genes encoding the uptake hydrogenase and the maturation protein genes. In N. punctiforme and Nostoc PCC 7120 they are transcribed as one operon and may form transcripts together with the hyp-genes. 
The expression pattern of the five ORFs within the extended hyp-operon in both Nostoc punctiforme and Nostoc PCC 7120 is similar to the expression patterns of hupS, nifD, hypF and hypC. CalA, a known transcription factor, interacts with the promoter region between hupSL and the five ORFs in the extended hyp-operon in both Nostoc strains. PMID:21672234
Beyond the known functions of the CCR4-NOT complex in gene expression regulatory mechanisms: New structural insights to unravel CCR4-NOT mRNA processing machinery.

PubMed

Ukleja, Marta; Valpuesta, José María; Dziembowski, Andrzej; Cuellar, Jorge

2016-10-01

Large protein assemblies are usually the effectors of major cellular processes. The intricate cell homeostasis network is divided into numerous interconnected pathways, each controlled by a set of protein machines. One of these master regulators is the CCR4-NOT complex, which ultimately controls protein expression levels. This multisubunit complex assembles around a scaffold platform, which enables a wide variety of well-studied functions from mRNA synthesis to transcript decay, as well as other tasks still being identified. Solving the structure of the entire CCR4-NOT complex will help to define the distribution of its functions. The recently published three-dimensional reconstruction of the complex, in combination with the known crystal structures of some of the components, has begun to address this. Methodological improvements in structural biology, especially in cryoelectron microscopy, encourage further structural and protein-protein interaction studies, which will advance our comprehension of the gene expression machinery. © 2016 WILEY Periodicals, Inc.
Physical structure and chromosomal localization of a gene encoding human p58clk-1, a cell division control related protein kinase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eipers, P.G.

1992-01-01

The gene for the human p58clk-1 protein kinase, a cell division control-related gene, has been mapped by somatic cell hybrid analyses, in situ localization with the chromosomal gene, and nested polymerase chain reaction amplification of microdissected chromosomes. These studies indicate that the expressed p58clk-1 chromosomal gene maps to 1p36, while a highly related p58clk-1 sequence of unknown nature maps to chromosome 15. Assignment of a p34cdc2-related gene to 1p36 region, including neuroblastoma, ductal carcinoma of the breast, malignant melanoma, Merkel cell carcinoma and endocrine neoplasia among others. Aberrant expression of this protein kinase negatively regulates normal cellular growth. The p58clk-1 protein contains a central domain of 299 amino acids that is 46% identical to human p34cdc2, the master mitotic protein kinase. This dissertation details the complete structure of the p58clk-1 chromosomal gene, including its putative promoter region, transcriptional start sites, exonic sequences, and intron/exon boundary sequences. The gene is 10 kb in size and contains 12 exons and 11 introns. Interestingly, the rather large 2.0 kb 3' untranslated region is interrupted by an intron that separates a region containing numerous AUUUA destabilization motifs from the coding region. Furthermore, the expression of this gene in normal human tissues, as well as several human tumor cell samples and lines, is examined. The origin of multiple human transcripts from the same chromosomal gene, and the possible differential stability of these various transcripts, is discussed with regard to the transcriptional and post-transcriptional regulation of this gene. This is the first report of the chromosomal gene structure of a member of the p34cdc2 supergene family.
Clinically relevant morphological structures in breast cancer represent transcriptionally distinct tumor cell populations with varied degrees of epithelial-mesenchymal transition and CD44+CD24- stemness

PubMed Central

Denisov, Evgeny V.; Skryabin, Nikolay A.; Gerashchenko, Tatiana S.; Tashireva, Lubov A.; Wilhelm, Jochen; Buldakov, Mikhail A.; Sleptcov, Aleksei A.; Lebedev, Igor N.; Vtorushin, Sergey V.; Zavyalova, Marina V.; Cherdyntseva, Nadezhda V.; Perelmuter, Vladimir M.

2017-01-01

Intratumor morphological heterogeneity in breast cancer is represented by different morphological structures (tubular, alveolar, solid, trabecular, and discrete) and contributes to poor prognosis; however, the mechanisms involved remain unclear. In this study, we performed 3D imaging, laser microdissection-assisted array comparative genomic hybridization and gene expression microarray analysis of different morphological structures and examined their association with the standard immunohistochemistry scorings and CD44+CD24- cancer stem cells. We found that the intratumor morphological heterogeneity is not associated with chromosomal aberrations. By contrast, morphological structures were characterized by specific gene expression profiles and signaling pathways and significantly differed in progesterone receptor and Ki-67 expression. Most importantly, we observed significant differences between structures in the number of expressed genes of the epithelial and mesenchymal phenotypes and the association with cancer invasion pathways. Tubular (tube-shaped) and alveolar (spheroid-shaped) structures were transcriptionally similar and demonstrated co-expression of epithelial and mesenchymal markers. Solid (large shapeless) structures retained epithelial features but demonstrated an increase in mesenchymal traits and collective cell migration hallmarks. Mesenchymal genes and cancer invasion pathways, as well as Ki-67 expression, were enriched in trabecular (one/two rows of tumor cells) and discrete groups (single cells and/or arrangements of 2-5 cells). Surprisingly, the number of CD44+CD24- cells was found to be the lowest in discrete groups and the highest in alveolar and solid structures. Overall, our findings indicate the association of intratumor morphological heterogeneity in breast cancer with the epithelial-mesenchymal transition and CD44+CD24- stemness and the appeal of this heterogeneity as a model for the study of cancer invasion. PMID:28977854
The limitations of simple gene set enrichment analysis assuming gene independence.

PubMed

Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P

2016-02-01

Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. (2009) as a serious contender. The argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, because of the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. © The Author(s) 2012.
PPARα-DEPENDENT GENE EXPRESSION CHANGES IN THE MOUSE LIVER AFTER EXPOSURE TO PEROXISOME PROLIFERATORS
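The variance-inflation argument can be illustrated with a small calculation. The sketch below is not the authors' benchmark. It assumes a gene set of n genes whose per-gene scores have unit variance and share a common pairwise correlation rho (both simplifications, and the function names are hypothetical). Under those assumptions the variance of the set's mean score is (1 + (n - 1) * rho) / n, rather than the 1 / n expected under independence.

```python
def mean_score_variance(n_genes: int, rho: float) -> float:
    """Variance of the mean of n_genes unit-variance scores
    that share a common pairwise correlation rho (assumed model)."""
    return (1.0 + (n_genes - 1) * rho) / n_genes


def variance_inflation(n_genes: int, rho: float) -> float:
    """Ratio of the correlated variance to the independent variance 1 / n_genes."""
    return 1.0 + (n_genes - 1) * rho


if __name__ == "__main__":
    # Illustrative values only; these are not taken from the benchmark datasets.
    for rho in (0.0, 0.05, 0.2):
        factor = variance_inflation(100, rho)
        print(f"n=100, rho={rho}: variance inflated by a factor of {factor:.2f}")
```

Even a modest average correlation inflates the variance several-fold for a set of 100 genes, which is why a null distribution that assumes independence would tend to overstate significance in this simplified model.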

EPA Science Inventory

Peroxisome proliferators (PP) are a large class of structurally diverse chemicals that mediate their effects in the liver mainly through the PP-activated receptor α (PPARα). Development of PP-induced hepatocarcinogenesis in mouse liver is known to be dependent on PPAR...
GeoChip-Based Analysis of the Functional Gene Diversity and Metabolic Potential of Microbial Communities in Acid Mine Drainage

PubMed Central

Xie, Jianping; He, Zhili; Liu, Xinxing; Liu, Xueduan; Van Nostrand, Joy D.; Deng, Ye; Wu, Liyou; Zhou, Jizhong; Qiu, Guanzhou

2011-01-01

Acid mine drainage (AMD) is an extreme environment, usually with low pH and high concentrations of metals. Although the phylogenetic diversity of AMD microbial communities has been examined extensively, little is known about their functional gene diversity and metabolic potential. In this study, a comprehensive functional gene array (GeoChip 2.0) was used to analyze the functional diversity, composition, structure, and metabolic potential of AMD microbial communities from three copper mines in China. GeoChip data indicated that these microbial communities were functionally diverse as measured by the number of genes detected, gene overlapping, unique genes, and various diversity indices. Almost all key functional gene categories targeted by GeoChip 2.0 were detected in the AMD microbial communities, including carbon fixation, carbon degradation, methane generation, nitrogen fixation, nitrification, denitrification, ammonification, nitrogen reduction, sulfur metabolism, metal resistance, and organic contaminant degradation, which suggested that the functional gene diversity was higher than was previously thought. Mantel test results indicated that AMD microbial communities are shaped largely by surrounding environmental factors (e.g., S, Mg, and Cu). Functional genes (e.g., narG and norB) and several key functional processes (e.g., methane generation, ammonification, denitrification, sulfite reduction, and organic contaminant degradation) were significantly (P < 0.10) correlated with environmental variables. This study presents an overview of functional gene diversity and the structure of AMD microbial communities and also provides insights into our understanding of metabolic potential in AMD ecosystems. PMID:21097602
Water insoluble and soluble lipids for gene delivery.

PubMed

Mahato, Ram I

2005-04-05

Among various synthetic gene carriers currently in use, liposomes composed of cationic lipids and co-lipids remain the most efficient transfection reagents. Physicochemical properties of lipid/plasmid complexes, such as cationic lipid structure, cationic lipid to co-lipid ratio, charge ratio, particle size and zeta potential have significant influence on gene expression and biodistribution. However, most cationic lipids are toxic and cationic liposomes/plasmid complexes do not disperse well inside the target tissues because of their large particle size. To overcome the problems associated with cationic lipids, we designed water soluble lipopolymers for gene delivery to various cells and tissues. This review provides a critical discussion on how the components of water insoluble and soluble lipids affect their transfection efficiency and biodistribution of lipid/plasmid complexes.
Automated Protocol for Large-Scale Modeling of Gene Expression Data.

PubMed

Hall, Michelle Lynn; Calkins, David; Sherman, Woody

2016-11-28

With the continued rise of phenotypic- and genotypic-based screening projects, computational methods to analyze, process, and ultimately make predictions in this field take on growing importance. Here we show how automated machine learning workflows can produce models that are predictive of differential gene expression as a function of a compound structure using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy of greater than 70% across a highly diverse ∼1000 gene expression profile. In contrast to the usual in silico design paradigm, where one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization for desired multitarget gene expression profiles.
The complete chloroplast genome of salt cress (Eutrema salsugineum).

PubMed

Guo, Xinyi; Hao, Guoqian; Ma, Tao

2016-07-01

The complete chloroplast (cp) sequence of the salt cress (Eutrema salsugineum), a plant well-adapted to salt stress, is presented in this study. The circular molecule is 153,407 bp in length and exhibits a typical quadripartite structure containing an 83,894 bp large single copy (LSC) region, a 17,607 bp small single copy (SSC) region, and two 25,953 bp inverted repeats (IRs). The salt cress cp genome contains 135 known genes, including 87 protein-coding genes, 8 ribosomal RNA genes, and 40 tRNA genes; 21 of these are located in the inverted repeat region. As expected, phylogenetic analysis supports the idea that E. salsugineum is sister to Brassiceae species within the Brassicaceae family.
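The reported region sizes can be checked against the total genome length, since a quadripartite chloroplast genome is the sum of the LSC, the SSC and the two inverted-repeat copies. The following is a minimal sketch of that arithmetic using only the figures quoted in the abstract; the variable and function names are illustrative.

```python
# Figures quoted in the abstract (base pairs).
LSC = 83_894               # large single copy region
SSC = 17_607               # small single copy region
IR = 25_953                # one inverted repeat; the genome carries two copies
REPORTED_LENGTH = 153_407


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Gene categories quoted in the abstract.
gene_counts = {"protein_coding": 87, "rRNA": 8, "tRNA": 40}

if __name__ == "__main__":
    print(quadripartite_length(LSC, SSC, IR) == REPORTED_LENGTH)  # True
    print(sum(gene_counts.values()))                              # 135
```

Both checks agree with the abstract: the four regions sum to 153,407 bp and the three gene categories sum to 135 genes.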
Molecular basis of the polydispersity of mucins: implications for the generation of saccharide diversity.

PubMed

Bhavanandan, V P; Gupta, D; Woitach, J; Guo, X; Jiang, W

1999-06-01

Secreted epithelial mucins are large macromolecules which exhibit extreme polydispersity, the molecular basis of which is not fully understood. We have obtained partial sequences of two genes (BSM1 and BSM2) coding for two distinct molecules. This is the first time that such closely-related genes have been identified for any mucin from an animal. We propose that a combination of multiple homologous genes, alternative splicing, differential glycosylation, and additional post-translational processing all contribute to the extreme polydispersity of mucins. The multiple domain structure and non-identical tandem repeats are also very important for the generation of the saccharide diversities of mucins.
Higher-order organisation of extremely amplified, potentially functional and massively methylated 5S rDNA in European pikes (Esox sp.).

PubMed

Symonová, Radka; Ocalewicz, Konrad; Kirtiklis, Lech; Delmastro, Giovanni Battista; Pelikánová, Šárka; Garcia, Sonia; Kovařík, Aleš

2017-05-18

Pikes represent an important genus (Esox) harbouring a pre-duplication karyotype (2n = 2x = 50) of economically important salmonid pseudopolyploids. Here, we have characterized the 5S ribosomal RNA genes (rDNA) in Esox lucius and its closely related E. cisalpinus using cytogenetic, molecular and genomic approaches. Intragenomic homogeneity and copy number were estimated using Illumina reads. The higher-order structure of rDNA arrays was investigated by the analysis of long PacBio reads. The position of loci on chromosomes was determined by FISH. DNA methylation was analysed by methylation-sensitive restriction enzymes. The 5S rDNA loci occupy exclusively (peri)centromeric regions on 30-38 acrocentric chromosomes in both E. lucius and E. cisalpinus. The large number of loci is accompanied by extreme amplification of genes (>20,000 copies), which is to the best of our knowledge one of the highest copy numbers of rRNA genes in animals ever reported. Conserved secondary structures of predicted 5S rRNAs indicate that most of the amplified genes are potentially functional. Only a few SNPs were found in genic regions, indicating their high homogeneity, while intergenic spacers were more heterogeneous and several families were identified. Analysis of 10-30 kb-long molecules sequenced by the PacBio technology (containing about 40% of total 5S rDNA) revealed that the vast majority (96%) of genes are organised in large several kilobase-long blocks. Dispersed genes or short tandems were less common (4%). The adjacent 5S blocks were directly linked, separated by intervening DNA and even inverted. The 5S units differing in the intergenic spacers formed both homogeneous and heterogeneous (mixed) blocks, indicating a variable degree of homogenisation between the loci. Both E. lucius and E. cisalpinus 5S rDNA was heavily methylated at CG dinucleotides.
Extreme amplification of 5S rRNA genes in the Esox genome occurred in the absence of significant pseudogenisation suggesting its recent origin and/or intensive homogenisation processes. The dense methylation of units indicates that powerful epigenetic mechanisms have evolved in this group of fish to silence amplified genes. We discuss how the higher-order repeat structures impact on homogenisation of 5S rDNA in the genome.
Population Structure and Domestication Revealed by High-Depth Resequencing of Korean Cultivated and Wild Soybean Genomes

PubMed Central

Chung, Won-Hyong; Jeong, Namhee; Kim, Jiwoong; Lee, Woo Kyu; Lee, Yun-Gyeong; Lee, Sang-Heon; Yoon, Woongchang; Kim, Jin-Hyun; Choi, Ik-Young; Choi, Hong-Kyu; Moon, Jung-Kyung; Kim, Namshin; Jeong, Soon-Chun

2014-01-01

Despite the importance of soybean as a major crop, genome-wide variation and evolution of cultivated soybeans are largely unknown. Here, we catalogued genome variation in an annual soybean population by high-depth resequencing of 10 cultivated and 6 wild accessions and obtained 3.87 million high-quality single-nucleotide polymorphisms (SNPs) after excluding the sites with missing data in any accession. Nuclear genome phylogeny supported a single origin for the cultivated soybeans. We identified 10-fold longer linkage disequilibrium (LD) in the wild soybean relative to wild maize and rice. Despite the small population size, the long LD and large SNP data allowed us to identify 206 candidate domestication regions with significantly lower diversity in the cultivated, but not in the wild, soybeans. Some of the genes in these candidate regions were associated with soybean homologues of canonical domestication genes. However, several examples, which are likely specific to soybean or eudicot crop plants, were also observed. Consequently, the variation data identified in this study should be valuable for breeding and for identifying agronomically important genes in soybeans. However, the long LD of wild soybeans may hinder pinpointing causal gene(s) in the candidate regions. PMID:24271940
New Genes and New Insights from Old Genes: Update on Alzheimer Disease

PubMed Central

Ringman, John M.; Coppola, Giovanni

2013-01-01

Purpose of Review: This article discusses the current status of knowledge regarding the genetic basis of Alzheimer disease (AD) with a focus on clinically relevant aspects. Recent Findings: The genetic architecture of AD is complex, as it includes multiple susceptibility genes and likely nongenetic factors. Rare but highly penetrant autosomal dominant mutations explain a small minority of the cases but have allowed tremendous advances in understanding disease pathogenesis. The identification of a strong genetic risk factor, APOE, reshaped the field and introduced the notion of genetic risk for AD. More recently, large-scale genome-wide association studies are adding to the picture a number of common variants with very small effect sizes. Large-scale resequencing studies are expected to identify additional risk factors, including rare susceptibility variants and structural variation. Summary: Genetic assessment is currently of limited utility in clinical practice because of the low frequency (Mendelian mutations) or small effect size (common risk factors) of the currently known susceptibility genes. However, genetic studies are identifying with confidence a number of novel risk genes, and this will further our understanding of disease biology and possibly the identification of therapeutic targets. PMID:23558482
Domain organization, genomic structure, evolution, and regulation of expression of the aggrecan gene family.

PubMed

Schwartz, N B; Pirok, E W; Mensch, J R; Domowicz, M S

1999-01-01

Proteoglycans are complex macromolecules, consisting of a polypeptide backbone to which are covalently attached one or more glycosaminoglycan chains. Molecular cloning has allowed identification of the genes encoding the core proteins of various proteoglycans, leading to a better understanding of the diversity of proteoglycan structure and function, as well as to the evolution of a classification of proteoglycans on the basis of emerging gene families that encode the different core proteins. One such family includes several proteoglycans that have been grouped with aggrecan, the large aggregating chondroitin sulfate proteoglycan of cartilage, based on a high number of sequence similarities within the N- and C-terminal domains. Thus far these proteoglycans include versican, neurocan, and brevican. It is now apparent that these proteins, as a group, are truly a gene family with shared structural motifs on the protein and nucleotide (mRNA) levels, and with nearly identical genomic organizations. Clearly a common ancestral origin is indicated for the members of the aggrecan family of proteoglycans. However, differing patterns of amplification and divergence have also occurred within certain exons across species and family members, leading to the class-characteristic protein motifs in the central carbohydrate-rich region exclusively. Thus the overall domain organization strongly suggests that sequence conservation in the terminal globular domains underlies common functions, whereas differences in the central portions of the genes account for functional specialization among the members of this gene family.
Joint genetic analysis of hippocampal size in mouse and human identifies a novel gene linked to neurodegenerative disease.

PubMed

Ashbrook, David G; Williams, Robert W; Lu, Lu; Stein, Jason L; Hibar, Derrek P; Nichols, Thomas E; Medland, Sarah E; Thompson, Paul M; Hager, Reinmar

2014-10-03

Variation in hippocampal volume has been linked to significant differences in memory, behavior, and cognition among individuals. To identify genetic variants underlying such differences and associated disease phenotypes, multinational consortia such as ENIGMA have used large magnetic resonance imaging (MRI) data sets in human GWAS studies. In addition, mapping studies in mouse model systems have identified genetic variants for brain structure variation with great power. A key challenge is to understand how genetically based differences in brain structure lead to the propensity to develop specific neurological disorders. We combine the largest human GWAS of brain structure with the largest mammalian model system, the BXD recombinant inbred mouse population, to identify novel genetic targets influencing brain structure variation that are linked to increased risk for neurological disorders. We first use a novel cross-species, comparative analysis using mouse and human genetic data to identify a candidate gene, MGST3, associated with adult hippocampus size in both systems. We then establish the coregulation and function of this gene in a comprehensive systems-analysis. We find that MGST3 is associated with hippocampus size and is linked to a group of neurodegenerative disorders, such as Alzheimer's.

Abundance and functional diversity of riboswitches in microbial communities

PubMed Central

Kazanov, Marat D; Vitreschak, Alexey G; Gelfand, Mikhail S

2007-01-01

Background: Several recently completed large-scale environmental sequencing projects produced a large amount of genetic information about microbial communities ('metagenomes') which is not biased towards cultured organisms. It is a good source for estimation of the abundance of genes and regulatory structures in both known and unknown members of microbial communities. In this study we consider the distribution of RNA regulatory structures, riboswitches, in the Sargasso Sea, Minnesota Soil and Whale Falls metagenomes. Results: Over three hundred riboswitches were found in about 2 Gbp of metagenome DNA sequences. The abundance of riboswitches in metagenomes was highest for the TPP, B12 and GCVT riboswitches; the S-box, RFN, YKKC/YXKD, YYBP/YKOY regulatory elements showed lower but significant abundance, while the LYS, G-box, GLMS and YKOK riboswitches were rare. Regions downstream of identified riboswitches were scanned for open reading frames. Comparative analysis of identified ORFs revealed new riboswitch-regulated functions for several classes of riboswitches. In particular, we have observed phosphoserine aminotransferase serC (COG1932) and malate synthase glcB (COG2225) to be regulated by the glycine (GCVT) riboswitch; fatty acid desaturase ole1 (COG1398), by the cobalamin (B12) riboswitch; 5-methylthioribose-1-phosphate isomerase ykrS (COG0182), by the SAM-riboswitch. We also identified conserved riboswitches upstream of genes of unknown function: thiamine (TPP), cobalamin (B12), and glycine (GCVT, upstream of genes from COG4198). Conclusion: This study demonstrates applicability of bioinformatics to the analysis of RNA regulatory structures in metagenomes. PMID:17908319
The complete mitochondrial genome of the styloperlid stonefly species Styloperla spinicercia Wu (Insecta: Plecoptera) with family-level phylogenetic analyses of the Pteronarcyoidea.

PubMed

Wang, Ying; Cao, Jinjun; Li, Weihai

2017-03-13

We present the complete mitochondrial (mt) genome sequence of the stonefly, Styloperla spinicercia Wu, 1935 (Plecoptera: Styloperlidae), the type species of the genus Styloperla and the first complete mt genome for the family Styloperlidae. The genome is circular, 16,129 base pairs long, has an A+T content of 70.7%, and contains 37 genes including the large and small ribosomal RNA (rRNA) subunits, 13 protein coding genes (PCGs), 22 tRNA genes and a large non-coding region (CR). All of the PCGs use the standard initiation codon ATN except ND1 and ND5, which start with TTG and GTG. Twelve of the PCGs stop with conventional terminal codons TAA and TAG, except ND5 which shows an incomplete terminator signal T. All tRNAs have the classic clover-leaf structures with the dihydrouridine (DHU) arm of tRNASer(AGN) forming a simple loop. Secondary structures of the two ribosomal RNAs are presented with reference to previous models. The structural elements and the variable numbers of tandem repeats are described within the control region. Phylogenetic analyses using both Bayesian (BI) and Maximum Likelihood (ML) methods support the previous hypotheses regarding family level relationships within the Pteronarcyoidea. The genetic distance calculated based on 13 PCGs and two rRNAs between Styloperla sp. and S. spinicercia is provided and interspecific divergence is discussed.
Identification and expression profiling analysis of calmodulin-binding transcription activator genes in maize (Zea mays L.) under abiotic and biotic stresses

PubMed Central

Yue, Runqing; Lu, Caixia; Sun, Tao; Peng, Tingting; Han, Xiaohua; Qi, Jianshuang; Yan, Shufeng; Tie, Shuanggui

2015-01-01

The calmodulin-binding transcription activators (CAMTA) play critical roles in plant growth and responses to environmental stimuli. However, how CAMTAs function in responses to abiotic and biotic stresses in maize (Zea mays L.) is largely unknown. In this study, we first identified all the CAMTA homologous genes in the whole genome of maize. The results showed that nine ZmCAMTA genes had highly diversified gene structures and tissue-specific expression patterns. Many ZmCAMTA genes displayed high expression levels in the roots. We then surveyed the distribution of stress-related cis-regulatory elements in the −1.5 kb promoter regions of ZmCAMTA genes. Notably, a large number of stress-related elements were present in the promoter regions of some ZmCAMTA genes, indicating a genetic basis of stress expression regulation of these genes. Quantitative real-time PCR was used to test the expression of ZmCAMTA genes under several abiotic stresses (drought, salt, and cold), various stress-related hormones [abscisic acid, auxin, salicylic acid (SA), and jasmonic acid] and biotic stress [rice black-streaked dwarf virus (RBSDV) infection]. Furthermore, the expression pattern of ZmCAMTA genes under RBSDV infection was analyzed to investigate their potential roles in responses of different cultivated maize varieties to RBSDV. The expression of most ZmCAMTA genes responded to both abiotic and biotic stresses. The data will help us to understand the roles of CAMTA-mediated Ca2+ signaling in maize tolerance to environmental stresses. PMID:26284092
GeneView: a comprehensive semantic search engine for PubMed.

PubMed

Thomas, Philippe; Starlinger, Johannes; Vowinkel, Alexander; Arzt, Sebastian; Leser, Ulf

2012-07-01

Research results are primarily published in scientific literature and curation efforts cannot keep up with the rapid growth of published literature. The plethora of knowledge remains hidden in large text repositories like MEDLINE. Consequently, life scientists have to spend a great amount of time searching for specific information. The enormous ambiguity among most names of biomedical objects such as genes, chemicals and diseases often produces too large and unspecific search results. We present GeneView, a semantic search engine for biomedical knowledge. GeneView is built upon a comprehensively annotated version of PubMed abstracts and openly available PubMed Central full texts. This semi-structured representation of biomedical texts enables a number of features extending classical search engines. For instance, users may search for entities using unique database identifiers or they may rank documents by the number of specific mentions they contain. Annotation is performed by a multitude of state-of-the-art text-mining tools for recognizing mentions from 10 entity classes and for identifying protein-protein interactions. GeneView currently contains annotations for >194 million entities from 10 classes for ∼21 million citations with 271,000 full text bodies. GeneView can be searched at http://bc3.informatik.hu-berlin.de/.
Population Structure of Humpback Whales from Their Breeding Grounds in the South Atlantic and Indian Oceans

PubMed Central

Rosenbaum, Howard C.; Pomilla, Cristina; Mendez, Martin; Leslie, Matthew S.; Best, Peter B.; Findlay, Ken P.; Minton, Gianna; Ersts, Peter J.; Collins, Timothy; Engel, Marcia H.; Bonatto, Sandro L.; Kotze, Deon P. G. H.; Meÿer, Mike; Barendse, Jaco; Thornton, Meredith; Razafindrakoto, Yvette; Ngouessono, Solange; Vely, Michel; Kiszka, Jeremy

2009-01-01

Although humpback whales are among the best-studied of the large whales, population boundaries in the Southern Hemisphere (SH) have remained largely untested. We assess population structure of SH humpback whales using 1,527 samples collected from whales at fourteen sampling sites within the Southwestern and Southeastern Atlantic, the Southwestern Indian Ocean, and Northern Indian Ocean (Breeding Stocks A, B, C and X, respectively). Evaluation of mtDNA population structure and migration rates was carried out under different statistical frameworks. Using all genetic evidence, the results suggest significant degrees of population structure between all ocean basins, with the Southwestern and Northern Indian Ocean most differentiated from each other. Effective migration rates were highest between the Southeastern Atlantic and the Southwestern Indian Ocean, followed by rates within the Southeastern Atlantic, and the lowest between the Southwestern and Northern Indian Ocean. At finer scales, very low gene flow was detected between the two neighbouring sub-regions in the Southeastern Atlantic, compared to high gene flow for whales within the Southwestern Indian Ocean. Our genetic results support the current management designations proposed by the International Whaling Commission of Breeding Stocks A, B, C, and X as four strongly structured populations. The population structure patterns found in this study are likely to have been influenced by a combination of long-term maternally directed fidelity of migratory destinations, along with other ecological and oceanographic features in the region. PMID:19812698
Population structure of humpback whales from their breeding grounds in the South Atlantic and Indian Oceans.

PubMed

Rosenbaum, Howard C; Pomilla, Cristina; Mendez, Martin; Leslie, Matthew S; Best, Peter B; Findlay, Ken P; Minton, Gianna; Ersts, Peter J; Collins, Timothy; Engel, Marcia H; Bonatto, Sandro L; Kotze, Deon P G H; Meÿer, Mike; Barendse, Jaco; Thornton, Meredith; Razafindrakoto, Yvette; Ngouessono, Solange; Vely, Michel; Kiszka, Jeremy

2009-10-08

Although humpback whales are among the best-studied of the large whales, population boundaries in the Southern Hemisphere (SH) have remained largely untested. We assess population structure of SH humpback whales using 1,527 samples collected from whales at fourteen sampling sites within the Southwestern and Southeastern Atlantic, the Southwestern Indian Ocean, and Northern Indian Ocean (Breeding Stocks A, B, C and X, respectively). Evaluation of mtDNA population structure and migration rates was carried out under different statistical frameworks. Using all genetic evidence, the results suggest significant degrees of population structure between all ocean basins, with the Southwestern and Northern Indian Ocean most differentiated from each other. Effective migration rates were highest between the Southeastern Atlantic and the Southwestern Indian Ocean, followed by rates within the Southeastern Atlantic, and the lowest between the Southwestern and Northern Indian Ocean. At finer scales, very low gene flow was detected between the two neighbouring sub-regions in the Southeastern Atlantic, compared to high gene flow for whales within the Southwestern Indian Ocean. Our genetic results support the current management designations proposed by the International Whaling Commission of Breeding Stocks A, B, C, and X as four strongly structured populations. The population structure patterns found in this study are likely to have been influenced by a combination of long-term maternally directed fidelity of migratory destinations, along with other ecological and oceanographic features in the region.
Exome sequencing supports a de novo mutational paradigm for schizophrenia

PubMed Central

Xu, Bin; Roos, J. Louw; Dexheimer, Phillip; Boone, Braden; Plummer, Brooks; Levy, Shawn; Gogos, Joseph A.; Karayiorgou, Maria

2011-01-01

Despite high heritability, a large fraction of cases with schizophrenia do not have a family history of the disease (sporadic cases). Here, we examine the possibility that rare de novo protein-altering mutations contribute to the genetic component of schizophrenia by sequencing the exome of 53 sporadic cases, 22 unaffected controls and their parents. We identified 40 de novo mutations in 27 patients affecting 40 genes including a potentially disruptive mutation in DGCR2, a gene removed by the recurrent schizophrenia-predisposing 22q11.2 microdeletion. Comparison to rare inherited variants revealed that the identified de novo mutations show a large excess of nonsynonymous changes in cases, as well as a greater potential to affect protein structure and function. Our analysis reveals a major role of de novo mutations in schizophrenia and also a large mutational target, which together provide a plausible explanation for the high global incidence and persistence of the disease. PMID:21822266
Functions of the gene products of Escherichia coli.

PubMed Central

Riley, M

1993-01-01

A list of currently identified gene products of Escherichia coli is given, together with a bibliography that provides pointers to the literature on each gene product. A scheme to categorize cellular functions is used to classify the gene products of E. coli so far identified. A count shows that the numbers of genes concerned with small-molecule metabolism are on the same order as the numbers concerned with macromolecule biosynthesis and degradation. One large category is the category of tRNAs and their synthetases. Another is the category of transport elements. The categories of cell structure and cellular processes other than metabolism are smaller. Other subjects discussed are the occurrence in the E. coli genome of redundant pairs and groups of genes of identical or closely similar function, as well as variation in the degree of density of genetic information in different parts of the genome. PMID:7508076
The active gene that encodes human High Mobility Group 1 protein (HMG1) contains introns and maps to chromosome 13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ferrari, S.; Finelli, P.; Rocchi, M.

The human genome contains a large number of sequences related to the cDNA for High Mobility Group 1 protein (HMG1), which so far has hampered the cloning and mapping of the active HMG1 gene. We show that the human HMG1 gene contains introns, while the HMG1-related sequences do not and most likely are retrotransposed pseudogenes. We identified eight YACs from the ICI and CEPH libraries that contain the human HMG1 gene. The HMG1 gene is similar in structure to the previously characterized murine homologue and maps to human chromosome 13q12, as determined by in situ hybridization. The mouse Hmg1 gene maps to the telomeric region of murine Chromosome 5, which is syntenic to the human 13q12 band. 18 refs., 3 figs.
Bacterial phylogeny structures soil resistomes across habitats

NASA Astrophysics Data System (ADS)

Forsberg, Kevin J.; Patel, Sanket; Gibson, Molly K.; Lauber, Christian L.; Knight, Rob; Fierer, Noah; Dantas, Gautam

2014-05-01

Ancient and diverse antibiotic resistance genes (ARGs) have previously been identified from soil, including genes identical to those in human pathogens. Despite the apparent overlap between soil and clinical resistomes, factors influencing ARG composition in soil and their movement between genomes and habitats remain largely unknown. General metagenome functions often correlate with the underlying structure of bacterial communities. However, ARGs are proposed to be highly mobile, prompting speculation that resistomes may not correlate with phylogenetic signatures or ecological divisions. To investigate these relationships, we performed functional metagenomic selections for resistance to 18 antibiotics from 18 agricultural and grassland soils. The 2,895 ARGs we discovered were mostly new, and represent all major resistance mechanisms. We demonstrate that distinct soil types harbour distinct resistomes, and that the addition of nitrogen fertilizer strongly influenced soil ARG content. Resistome composition also correlated with microbial phylogenetic and taxonomic structure, both across and within soil types. Consistent with this strong correlation, mobility elements (genes responsible for horizontal gene transfer between bacteria such as transposases and integrases) syntenic with ARGs were rare in soil by comparison with sequenced pathogens, suggesting that ARGs may not transfer between soil bacteria as readily as is observed between human pathogens. Together, our results indicate that bacterial community composition is the primary determinant of soil ARG content, challenging previous hypotheses that horizontal gene transfer effectively decouples resistomes from phylogeny.
Distinct structural transitions of chromatin topological domains correlate with coordinated hormone-induced gene regulation

PubMed Central

Le Dily, François; Baù, Davide; Pohl, Andy; Vicent, Guillermo P.; Serra, François; Soronellas, Daniel; Castellano, Giancarlo; Wright, Roni H.G.; Ballare, Cecilia; Filion, Guillaume; Marti-Renom, Marc A.

2014-01-01

The human genome is segmented into topologically associating domains (TADs), but the role of this conserved organization during transient changes in gene expression is not known. Here we describe the distribution of progestin-induced chromatin modifications and changes in transcriptional activity over TADs in T47D breast cancer cells. Using ChIP-seq (chromatin immunoprecipitation combined with high-throughput sequencing), Hi-C (chromosome capture followed by high-throughput sequencing), and three-dimensional (3D) modeling techniques, we found that the borders of the ∼2000 TADs in these cells are largely maintained after hormone treatment and that up to 20% of the TADs could be considered as discrete regulatory units where the majority of the genes are either transcriptionally activated or repressed in a coordinated fashion. The epigenetic signatures of the TADs are homogeneously modified by hormones in correlation with the transcriptional changes. Hormone-induced changes in gene activity and chromatin remodeling are accompanied by differential structural changes for activated and repressed TADs, as reflected by specific and opposite changes in the strength of intra-TAD interactions within responsive TADs. Indeed, 3D modeling of the Hi-C data suggested that the structure of TADs was modified upon treatment. The differential responses of TADs to progestins and estrogens suggest that TADs could function as “regulons” to enable spatially proximal genes to be coordinately transcribed in response to hormones. PMID:25274727
Nuclear microsatellites reveal contrasting patterns of genetic structure between western and southeastern European populations of the common ash (Fraxinus excelsior L.).

PubMed

Heuertz, Myriam; Hausman, Jean-François; Hardy, Olivier J; Vendramin, Giovanni G; Frascaria-Lacoste, Nathalie; Vekemans, Xavier

2004-05-01

To determine extant patterns of population genetic structure in common ash and gain insight into postglacial recolonization processes, we applied multilocus-based Bayesian approaches to data from 36 European populations genotyped at five nuclear microsatellite loci. We identified two contrasting patterns in terms of population genetic structure: (1) a large area from the British Isles to Lithuania throughout central Europe constituted effectively a single deme, whereas (2) strong genetic differentiation occurred over short distances in Sweden and southeastern Europe. Concomitant geographical variation was observed in estimates of allelic richness and genetic diversity, which were lowest in populations from southeastern Europe, that is, in regions close to putative ice age refuges, but high in western and central Europe, that is, in more recently recolonized areas. We suggest that in southeastern Europe, restricted postglacial gene flow caused by a rapid expansion of refuge populations in a mountainous topography is responsible for the observed strong genetic structure. In contrast, admixture of previously differentiated gene pools and high gene flow at the onset of postglacial recolonization of western and central Europe would have homogenized the genetic structure and raised the levels of genetic diversity above values in the refuges.
Analysis of the Functions of Recombination-Related Genes in the Generation of Large Chromosomal Deletions by Loop-Out Recombination in Aspergillus oryzae

PubMed Central

Ogawa, Masahiro; Koyama, Yasuji

2012-01-01

Loop-out-type recombination is a type of intrachromosomal recombination followed by the excision of a chromosomal region. The detailed mechanism underlying this recombination and the genes involved in loop-out recombination remain unknown. In the present study, we investigated the functions of ku70, ligD, rad52, rad54, and rdh54 in the construction of large chromosomal deletions via loop-out recombination and the effect of the position of the targeted chromosomal region on the efficiency of loop-out recombination in Aspergillus oryzae. The efficiency of generation of large chromosomal deletions in the near-telomeric region of chromosome 3, including the aflatoxin gene cluster, was compared with that in the near-centromeric region of chromosome 8, including the tannase gene. In the Δku70 and Δku70-rdh54 strains, only precise loop-out recombination occurred in the near-telomeric region. In contrast, in the ΔligD, Δku70-rad52, and Δku70-rad54 strains, unintended chromosomal deletions by illegitimate loop-out recombination occurred in the near-telomeric region. In addition, large chromosomal deletions via loop-out recombination were efficiently achieved in the near-telomeric region, but barely achieved in the near-centromeric region, in the Δku70 strain. Induction of DNA double-strand breaks by I-SceI endonuclease facilitated large chromosomal deletions in the near-centromeric region. These results indicate that ligD, rad52, and rad54 play a role in the generation of large chromosomal deletions via precise loop-out-type recombination in the near-telomeric region and that loop-out recombination between distant sites is restricted in the near-centromeric region by chromosomal structure. PMID:22286092
Long-range gene flow and the effects of climatic and ecological factors on genetic structuring in a large, solitary carnivore: the Eurasian lynx.

PubMed

Ratkiewicz, Mirosław; Matosiuk, Maciej; Saveljev, Alexander P; Sidorovich, Vadim; Ozolins, Janis; Männil, Peep; Balciauskas, Linas; Kojola, Ilpo; Okarma, Henryk; Kowalczyk, Rafał; Schmidt, Krzysztof

2014-01-01

Due to their high mobility, large terrestrial predators are potentially capable of maintaining high connectivity, and therefore low genetic differentiation among populations. However, previous molecular studies have provided contradictory findings in relation to this. To elucidate patterns of genetic structure in large carnivores, we studied the genetic variability of the Eurasian lynx, Lynx lynx, throughout north-eastern Europe using microsatellite, mitochondrial DNA control region and Y chromosome-linked markers. Using SAMOVA we found analogous patterns of genetic structure based on both mtDNA and microsatellites, which coincided with relatively little evidence for male-biased dispersal. No polymorphism was found for the cytochrome b and ATP6 mtDNA genes or for the Y chromosome-linked markers. Lynx inhabiting a large area encompassing Finland, the Baltic countries and western Russia formed a single genetic unit, while some marginal populations were clearly divergent from others. The existence of a migration corridor was suggested to correspond with distribution of continuous forest cover. The lowest variability (in both markers) was found in lynx from Norway and Białowieża Primeval Forest (BPF), which coincided with a recent demographic bottleneck (Norway) or high habitat fragmentation (BPF). The Carpathian population, being monomorphic for the control region, showed relatively high microsatellite diversity, suggesting the effect of a past bottleneck (e.g. during Last Glacial Maximum) on its present genetic composition. Genetic structuring for the mtDNA control region was best explained by latitude and snow cover depth. Microsatellite structuring correlated with the lynx's main prey, especially the proportion of red deer (Cervus elaphus) in its diet. Eurasian lynx are capable of maintaining panmictic populations across eastern Europe unless they are severely limited by habitat continuity or a reduction in numbers. Different correlations of mtDNA and microsatellite population divergence patterns with climatic and ecological factors may suggest separate selective pressures acting on males and females in this solitary carnivore.
'Click' synthesized sterol-based cationic lipids as gene carriers, and the effect of skeletons and headgroups on gene delivery.

PubMed

Sheng, Ruilong; Luo, Ting; Li, Hui; Sun, Jingjing; Wang, Zhao; Cao, Amin

2013-11-01

In this work, we have successfully prepared a series of new sterol-based cationic lipids (1-4) via an efficient 'Click' chemistry approach. The pDNA binding affinity of these lipids was examined by EB displacement and agarose-gel retardation assay. The average particle sizes and surface charges of the sterol-based cationic lipids/pDNA lipoplexes were analyzed by dynamic laser light scattering instrument (DLS), and the morphologies of the lipoplexes were observed by atomic force microscopy (AFM). The cytotoxicity of the lipids was examined by MTT and LDH assay, and the gene transfection efficiencies of these lipid carriers were investigated by luciferase gene transfection assay in various cell lines. In addition, the intracellular uptake and trafficking/localization behavior of the Cy3-DNA loaded lipoplexes were preliminarily studied by fluorescence microscopy. The results demonstrated that the pDNA loading capacity, lipoplex particle size, zeta potential and morphology of the sterol lipids/pDNA lipoplexes depended largely on the molecular structure factors including sterol-skeletons and headgroups. Furthermore, the sterol-based lipids showed quite different cytotoxicity and gene transfection efficacy in A549 and HeLa cells. Interestingly, it was found that the cholesterol-bearing lipids 1 and 2 showed 7 to 10^4 times higher transfection capability than their lithocholate-bearing counterparts 3 and 4 in A549 and HeLa cell lines, suggesting that the gene transfection capacity strongly relied on the structure of sterol skeletons. Moreover, the study on the structure-activity relationships of these sterol-based cationic lipid gene carriers provided a possible approach for developing low-cytotoxicity, high-efficiency lipid gene carriers by selecting suitable sterol hydrophobes and cationic headgroups. Copyright © 2013 Elsevier Ltd. All rights reserved.
SNPs in stress-responsive rice genes: validation, genotyping, functional relevance and population structure

PubMed Central

2012-01-01

Background Single nucleotide polymorphism (SNP) validation and large-scale genotyping are required to maximize the use of DNA sequence variation and determine the functional relevance of candidate genes for complex stress tolerance traits through genetic association in rice. We used the bead array platform-based Illumina GoldenGate assay to validate and genotype SNPs in a select set of stress-responsive genes to understand their functional relevance and study the population structure in rice. Results Of the 384 putative SNPs assayed, we successfully validated and genotyped 362 (94.3%). Of these 325 (84.6%) showed polymorphism among the 91 rice genotypes examined. Physical distribution, degree of allele sharing, admixtures and introgression, and amino acid replacement of SNPs in 263 abiotic and 62 biotic stress-responsive genes provided clues for identification and targeted mapping of trait-associated genomic regions. We assessed the functional and adaptive significance of validated SNPs in a set of contrasting drought tolerant upland and sensitive lowland rice genotypes by correlating their allelic variation with amino acid sequence alterations in catalytic domains and three-dimensional secondary protein structure encoded by stress-responsive genes. We found a strong genetic association among SNPs in the nine stress-responsive genes with upland and lowland ecological adaptation. Higher nucleotide diversity was observed in indica accessions compared with other rice sub-populations based on different population genetic parameters. The inferred ancestry of 16% among rice genotypes was derived from admixed populations with the maximum between upland aus and wild Oryza species. Conclusions SNPs validated in biotic and abiotic stress-responsive rice genes can be used in association analyses to identify candidate genes and develop functional markers for stress tolerance in rice. PMID:22921105
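The percentages quoted in this abstract can be re-derived from the stated counts. The following is a minimal sketch, assuming both percentages use the 384 assayed SNPs as the denominator; the variable and function names are illustrative and do not come from the paper.

```python
# Counts as stated in the abstract above.
assayed = 384              # putative SNPs assayed
validated = 362            # successfully validated and genotyped
polymorphic = 325          # polymorphic among the 91 rice genotypes
abiotic, biotic = 263, 62  # SNPs in abiotic / biotic stress-responsive genes


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: both percentages are taken relative to the 384 assayed SNPs.
validated_pct = pct(validated, assayed)
polymorphic_pct = pct(polymorphic, assayed)

print(validated_pct, polymorphic_pct, abiotic + biotic)
```

Under that assumption the rounded values match the reported 94.3% and 84.6%, and the abiotic and biotic gene counts sum to the 325 polymorphic SNPs.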
Universal and idiosyncratic characteristic lengths in bacterial genomes

NASA Astrophysics Data System (ADS)

Junier, Ivan; Frémont, Paul; Rivoire, Olivier

2018-05-01

In condensed matter physics, simplified descriptions are obtained by coarse-graining the features of a system at a certain characteristic length, defined as the typical length beyond which some properties are no longer correlated. From a physics standpoint, in vitro DNA has thus a characteristic length of 300 base pairs (bp), the Kuhn length of the molecule beyond which correlations in its orientations are typically lost. From a biology standpoint, in vivo DNA has a characteristic length of 1000 bp, the typical length of genes. Since bacteria live in very different physico-chemical conditions and since their genomes lack translational invariance, whether larger, universal characteristic lengths exist is a non-trivial question. Here, we examine this problem by leveraging the large number of fully sequenced genomes available in public databases. By analyzing GC content correlations and the evolutionary conservation of gene contexts (synteny) in hundreds of bacterial chromosomes, we conclude that a fundamental characteristic length around 10–20 kb can be defined. This characteristic length reflects elementary structures involved in the coordination of gene expression, which are present all along the genome of nearly all bacteria. Technically, reaching this conclusion required us to implement methods that are insensitive to the presence of large idiosyncratic genomic features, which may co-exist along these fundamental universal structures.
Large scale genomic reorganization of topological domains at the HoxD locus.

PubMed

Fabre, Pierre J; Leleu, Marion; Mormann, Benjamin H; Lopez-Delisle, Lucille; Noordermeer, Daan; Beccari, Leonardo; Duboule, Denis

2017-08-07

The transcriptional activation of HoxD genes during mammalian limb development involves dynamic interactions with two topologically associating domains (TADs) flanking the HoxD cluster. In particular, the activation of the most posterior HoxD genes in developing digits is controlled by regulatory elements located in the centromeric TAD (C-DOM) through long-range contacts. To assess the structure-function relationships underlying such interactions, we measured compaction levels and TAD discreteness using a combination of chromosome conformation capture (4C-seq) and DNA FISH. We assessed the robustness of the TAD architecture by using a series of genomic deletions and inversions that impact the integrity of this chromatin domain and that remodel long-range contacts. We report multi-partite associations between HoxD genes and up to three enhancers. We find that the loss of native chromatin topology leads to the remodeling of TAD structure following distinct parameters. Our results reveal that the recomposition of TAD architectures after large genomic re-arrangements is dependent on a boundary-selection mechanism in which CTCF mediates the gating of long-range contacts in combination with genomic distance and sequence specificity. Accordingly, the building of a recomposed TAD at this locus depends on distinct functional and constitutive parameters.
Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lykidis, Athanasios

2006-12-01

Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage-specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage-specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.
PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.

PubMed

Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D

2017-01-04

The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes.

PubMed

Law, MeiYee; Childs, Kevin L; Campbell, Michael S; Stein, Joshua C; Olson, Andrew J; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M; Lawrence, Carolyn J; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

2015-01-01

The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. © 2015 American Society of Plant Biologists. All Rights Reserved.
Gene therapy in large animal models of human cardiovascular genetic disease.

PubMed

Sleeper, Meg M; Bish, Lawrence T; Sweeney, H Lee

2009-01-01

Several naturally occurring animal models for human genetic heart diseases offer an excellent opportunity to evaluate potential novel therapies, including gene therapy. Some of these diseases--especially those that result in a structural defect during development (e.g., patent ductus arteriosus, pulmonic stenosis)--would likely be difficult to treat with a therapeutic gene transfer approach. However, the ability to transduce a significant proportion of the myocardial cells should make the various forms of inherited cardiomyopathy amenable to a therapeutic gene transfer approach. Adeno-associated virus may be the ideal vector for cardiac gene therapy since its low immunogenicity allows for stable transgene expression, a crucial factor when considering treatment of a chronic disease. Cardiomyopathies are a major cause of morbidity and mortality in both children and adults, and large animal models are available for the major forms of inherited cardiomyopathy (dilated cardiomyopathy, hypertrophic cardiomyopathy, and arrhythmogenic right ventricular cardiomyopathy). One of these animal models, juvenile dilated cardiomyopathy of Portuguese water dogs, offers an effective means to assess the efficacy of therapeutic gene transfer to alter the course of cardiomyopathy and heart failure. Correction of the abnormal metabolic processes that occur with heart failure (e.g., calcium metabolism, apoptosis) could normalize diseased myocardial function. Gene therapy may offer a promising new approach for the treatment of cardiac disease in both veterinary and human clinical settings.
Convergence of Domain Architecture, Structure, and Ligand Affinity in Animal and Plant RNA-Binding Proteins.

PubMed

Dias, Raquel; Manny, Austin; Kolaczkowski, Oralia; Kolaczkowski, Bryan

2017-06-01

Reconstruction of ancestral protein sequences using phylogenetic methods is a powerful technique for directly examining the evolution of molecular function. Although ancestral sequence reconstruction (ASR) is itself very efficient, downstream functional and structural studies necessary to characterize when and how changes in molecular function occurred are often costly and time-consuming, currently limiting ASR studies to examining a relatively small number of discrete functional shifts. As a result, we have very little direct information about how molecular function evolves across large protein families. Here we develop an approach combining ASR with structure and function prediction to efficiently examine the evolution of ligand affinity across a large family of double-stranded RNA binding proteins (DRBs) spanning animals and plants. We find that the characteristic domain architecture of DRBs, consisting of 2-3 tandem double-stranded RNA binding motifs (dsrms), arose independently in early animal and plant lineages. The affinity with which individual dsrms bind double-stranded RNA appears to have increased and decreased often across both animal and plant phylogenies, primarily through convergent structural mechanisms involving RNA-contact residues within the β1-β2 loop and a small region of α2. These studies provide some of the first direct information about how protein function evolves across large gene families and suggest that changes in molecular function may occur often and unassociated with major phylogenetic events, such as gene or domain duplications. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genome-wide analysis of WRKY gene family in Cucumis sativus

PubMed Central

2011-01-01

Background WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985
Genome-wide analysis of WRKY gene family in Cucumis sativus.

PubMed

Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

2011-09-28

WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
Plant rhabdoviruses.

PubMed

Redinbaugh, M G; Hogenhout, S A

2005-01-01

This chapter provides an overview of plant rhabdovirus structure and taxonomy, genome structure, protein function, and insect and plant infection. It is focused on recent research and unique aspects of rhabdovirus biology. Plant rhabdoviruses are transmitted by aphid, leafhopper or planthopper vectors, and the viruses replicate in both their insect and plant hosts. The two plant rhabdovirus genera, Nucleorhabdovirus and Cytorhabdovirus, can be distinguished on the basis of their intracellular site of morphogenesis in plant cells. All plant rhabdoviruses carry analogs of the five core genes: the nucleocapsid (N), phosphoprotein (P), matrix (M), glycoprotein (G) and large or polymerase (L). However, compared to vesiculoviruses that are composed of the five core genes, all plant rhabdoviruses encode more than these five genes, at least one of which is inserted between the P and M genes in the rhabdoviral genome. Interestingly, while these extra genes are not similar among plant rhabdoviruses, two encode proteins with similarity to the 30K superfamily of plant virus movement proteins. Analysis of nucleorhabdoviral protein sequences revealed nuclear localization signals for the N, P, M and L proteins, consistent with virus replication and morphogenesis of these viruses in the nucleus. Plant and insect factors that limit virus infection and transmission are discussed.
RedeR: R/Bioconductor package for representing modular structures, nested networks and multiple levels of hierarchical associations

PubMed Central

2012-01-01

Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html. PMID:22531049
Genome-wide Identification and Expression Analysis of the CDPK Gene Family in Grape, Vitis spp.

PubMed

Zhang, Kai; Han, Yong-Tao; Zhao, Feng-Li; Hu, Yang; Gao, Yu-Rong; Ma, Yan-Fei; Zheng, Yi; Wang, Yue-Jin; Wen, Ying-Qiang

2015-06-30

Calcium-dependent protein kinases (CDPKs) play vital roles in plant growth and development, biotic and abiotic stress responses, and hormone signaling. Little is known about the CDPK gene family in grapevine. In this study, we performed a genome-wide analysis of the 12X grape genome (Vitis vinifera) and identified nineteen CDPK genes. Comparison of the structures of grape CDPK genes allowed us to examine their functional conservation and differentiation. Segmentally duplicated grape CDPK genes showed high structural conservation and contributed to gene family expansion. Additional comparisons between grape and Arabidopsis thaliana demonstrated that several grape CDPK genes occurred in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grapevine and Arabidopsis. Phylogenetic analysis divided the grape CDPK genes into four groups. Furthermore, we examined the expression of the corresponding nineteen homologous CDPK genes in the Chinese wild grape (Vitis pseudoreticulata) under various conditions, including biotic stress, abiotic stress, and hormone treatments. The expression profiles derived from reverse transcription and quantitative PCR suggested that a large number of VpCDPKs responded to various stimuli on the transcriptional level, indicating their versatile roles in the responses to biotic and abiotic stresses. Moreover, we examined the subcellular localization of VpCDPKs by transiently expressing six VpCDPK-GFP fusion proteins in Arabidopsis mesophyll protoplasts; this revealed high variability consistent with potential functional differences. Taken as a whole, our data provide significant insights into the evolution and function of grape CDPKs and a framework for future investigation of grape CDPK genes.
Chloroplast genomes of Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea: Structures and comparative analysis.

PubMed

Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung

2017-08-08

We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
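The region lengths reported in this abstract can be cross-checked against the stated genome sizes. The following is a minimal sketch, assuming the usual quadripartite plastome layout (one large single-copy region, one small single-copy region, two inverted-repeat copies) and assuming that the first value of each reported pair belongs to the same genome, which the abstract does not state explicitly. The labels `genome_1` and `genome_2` are hypothetical.

```python
# Region lengths in bp, as reported in the abstract.
# Assumption: the first value of each reported pair belongs to one genome
# and the second value to the other; the labels are hypothetical.
regions = {
    "genome_1": {"LSC": 84_197, "SSC": 17_738, "IR": 26_264},
    "genome_2": {"LSC": 84_158, "SSC": 17_813, "IR": 26_259},
}


def total_length(r):
    """Total size of a quadripartite plastome: LSC + SSC + two IR copies."""
    return r["LSC"] + r["SSC"] + 2 * r["IR"]


totals = {name: total_length(r) for name, r in regions.items()}
for name, bp in totals.items():
    print(f"{name}: {bp} bp")

# Gene counts reported in the abstract: protein-coding + rRNA + tRNA.
gene_total = 85 + 8 + 37
print(f"genes: {gene_total}")
```

Under these assumptions both sums fall inside the reported 154.4~154.5 kbp range, and the three gene categories add up to the stated 130 genes.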
The complete mitochondrial genome of Arctic Calanus hyperboreus (Copepoda, Calanoida) reveals characteristic patterns in calanoid mitochondrial genome.

PubMed

Kim, Sanghee; Lim, Byung-Jin; Min, Gi-Sik; Choi, Han-Gu

2013-05-10

Copepoda is the most diverse and abundant group of crustaceans, but its phylogenetic relationships are ambiguous. Mitochondrial (mt) genomes are useful for studying evolutionary history, but only six complete Copepoda mt genomes have been made available and these have extremely rearranged genome structures. This study determined the mt genome of Calanus hyperboreus, making it the first reported Arctic copepod mt genome and the first complete mt genome of a calanoid copepod. The mt genome of C. hyperboreus is 17,910 bp in length and it contains the entire set of 37 mt genes, including 13 protein-coding genes, 2 rRNAs, and 22 tRNAs. It has a very unusual gene structure, including the longest control region reported for a crustacean, a large tRNA gene cluster, and reversed GC skews in 11 out of 13 protein-coding genes (84.6%). Despite the unusual features, comparing this genome to published copepod genomes revealed retained pan-crustacean features, as well as a conserved calanoid-specific pattern. Our data provide a foundation for exploring the calanoid pattern and the mechanisms of mt gene rearrangement in the evolutionary history of the copepod mt genome. Copyright © 2012 Elsevier B.V. All rights reserved.
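The abstract reports reversed GC skews without giving a formula. As a hedged illustration, the sketch below uses the conventional single-strand definition, (G - C) / (G + C); whether the authors used exactly this definition is an assumption. The function name `gc_skew` and the toy sequence are hypothetical and not taken from the C. hyperboreus genome.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for one strand; 0.0 if the strand has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


# Toy sequence for illustration only (4 G, 1 C).
example = "ATGGGCGT"
skew = gc_skew(example)
print(f"GC skew: {skew:.2f}")

# Share of protein-coding genes with reversed skew, as reported: 11 of 13.
reversed_share = round(100 * 11 / 13, 1)
print(f"reversed share: {reversed_share}%")
```

A positive value means G outnumbers C on the strand examined; a reversal is a sign change relative to the pattern seen in related genomes.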
Sex-biased gene flow among elk in the greater Yellowstone ecosystem

USGS Publications Warehouse

Hand, Brian K.; Chen, Shanyuan; Anderson, Neil; Beja-Pereira, Albano; Cross, Paul C.; Ebinger, Michael R.; Edwards, Hank; Garrott, Robert A.; Kardos, Marty D.; Kauffman, Matthew J.; Landguth, Erin L.; Middleton, Arthur; Scurlock, Brandon M.; White, P.J.; Zager, Pete; Schwartz, Michael K.; Luikart, Gordon

2014-01-01

We quantified patterns of population genetic structure to help understand gene flow among elk populations across the Greater Yellowstone Ecosystem. We sequenced 596 base pairs of the mitochondrial control region of 380 elk from eight populations. Analysis revealed high mitochondrial DNA variation within populations, averaging 13.0 haplotypes with high mean gene diversity (0.85). The genetic differentiation among populations for mitochondrial DNA was relatively high (FST = 0.161; P = 0.001) compared to genetic differentiation for nuclear microsatellite data (FST = 0.002; P = 0.332), which suggested relatively low female gene flow among populations. The estimated ratio of male to female gene flow (mm/mf = 46) was among the highest we have seen reported for large mammals. Genetic distance (for mitochondrial DNA pairwise FST) was not significantly correlated with geographic (Euclidean) distance between populations (Mantel's r = 0.274, P = 0.168). Large mitochondrial DNA genetic distances (e.g., FST > 0.2) between some of the geographically closest populations (<65 km) suggested behavioral factors and/or landscape features might shape female gene flow patterns. Given the strong sex-biased gene flow, future research and conservation efforts should consider the sexes separately when modeling corridors of gene flow or predicting spread of maternally transmitted diseases. The growing availability of genetic data to compare male vs. female gene flow provides many exciting opportunities to explore the magnitude, causes, and implications of sex-biased gene flow likely to occur in many species.
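The abstract cites a Mantel test between genetic and geographic distance. The sketch below shows the general idea of that test, a Pearson correlation between the off-diagonal entries of two distance matrices with a permutation p-value. It is an illustration only: the function names, the one-sided p-value, the number of permutations and the toy 4 x 4 matrices are all assumptions, and none of the numbers come from the elk data.

```python
import random
from itertools import combinations
from statistics import mean


def upper_triangle(m):
    """Flatten the off-diagonal upper triangle of a square matrix."""
    return [m[i][j] for i, j in combinations(range(len(m)), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def mantel(d1, d2, permutations=999, seed=0):
    """Return Mantel's r and a one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    hits = 0
    for _ in range(permutations):
        p = list(range(n))
        rng.shuffle(p)
        # Permute rows and columns of d2 together to keep it a valid matrix.
        d2p = [[d2[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(d2p)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy matrices (geographic km, pairwise FST); not study data.
geo = [[0, 10, 20, 30], [10, 0, 10, 20], [20, 10, 0, 10], [30, 20, 10, 0]]
gen = [
    [0.00, 0.05, 0.21, 0.10],
    [0.05, 0.00, 0.08, 0.22],
    [0.21, 0.08, 0.00, 0.03],
    [0.10, 0.22, 0.03, 0.00],
]
r, p_value = mantel(geo, gen)
r_same, _ = mantel(geo, geo)
print(f"r = {r:.3f}, p = {p_value:.3f}")
```

Permuting rows and columns jointly, rather than shuffling individual entries, respects the dependence among pairwise distances that share a population.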
Structural covariance networks are coupled to expression of genes enriched in supragranular layers of the human cortex.

PubMed

Romero-Garcia, Rafael; Whitaker, Kirstie J; Váša, František; Seidlitz, Jakob; Shinn, Maxwell; Fonagy, Peter; Dolan, Raymond J; Jones, Peter B; Goodyer, Ian M; Bullmore, Edward T; Vértes, Petra E

2018-05-01

Complex network topology is characteristic of many biological systems, including anatomical and functional brain networks (connectomes). Here, we first constructed a structural covariance network from MRI measures of cortical thickness on 296 healthy volunteers, aged 14-24 years. Next, we designed a new algorithm for matching sample locations from the Allen Brain Atlas to the nodes of the SCN. Subsequently we used this to define transcriptomic brain networks by estimating gene co-expression between pairs of cortical regions. Finally, we explored the hypothesis that transcriptional networks and structural MRI connectomes are coupled. A transcriptional brain network (TBN) and a structural covariance network (SCN) were correlated across connection weights and showed qualitatively similar complex topological properties: assortativity, small-worldness, modularity, and a rich-club. In both networks, the weight of an edge was inversely related to the anatomical (Euclidean) distance between regions. There were differences between networks in degree and distance distributions: the transcriptional network had a less fat-tailed degree distribution and a less positively skewed distance distribution than the SCN. However, cortical areas connected to each other within modules of the SCN had significantly higher levels of whole genome co-expression than expected by chance. Nodes connected in the SCN had especially high levels of expression and co-expression of a human supragranular enriched (HSE) gene set that has been specifically located to supragranular layers of human cerebral cortex and is known to be important for large-scale, long-distance cortico-cortical connectivity. This coupling of brain transcriptome and connectome topologies was largely but not entirely accounted for by the common constraint of physical distance on both networks. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Gene flow and pathogen transmission among bobcats (Lynx rufus) in a fragmented urban landscape

USGS Publications Warehouse

Lee, Justin S.; Ruell, Emily W.; Boydston, Erin E.; Lyren, Lisa M.; Alonso, Robert S.; Troyer, Jennifer L.; Crooks, Kevin R.; VandeWoude, Sue

2012-01-01

Urbanization can result in the fragmentation of once contiguous natural landscapes into a patchy habitat interspersed within a growing urban matrix. Animals living in fragmented landscapes often have reduced movement among habitat patches because of avoidance of intervening human development, which potentially leads to both reduced gene flow and pathogen transmission between patches. Mammalian carnivores with large home ranges, such as bobcats (Lynx rufus), may be particularly sensitive to habitat fragmentation. We performed genetic analyses on bobcats and their directly transmitted viral pathogen, feline immunodeficiency virus (FIV), to investigate the effects of urbanization on bobcat movement. We predicted that urban development, including major freeways, would limit bobcat movement and result in genetically structured host and pathogen populations. We analysed molecular markers from 106 bobcats and 19 FIV isolates from seropositive animals in urban southern California. Our findings indicate that reduced gene flow between two primary habitat patches has resulted in genetically distinct bobcat subpopulations separated by urban development including a major highway. However, the distribution of genetic diversity among FIV isolates determined through phylogenetic analyses indicates that pathogen genotypes are less spatially structured--exhibiting a more even distribution between habitat fragments. We conclude that the types of movement and contact sufficient for disease transmission occur with enough frequency to preclude structuring among the viral population, but that the bobcat population is structured owing to low levels of effective bobcat migration resulting in gene flow. We illustrate the utility in using multiple molecular markers that differentially detect movement and gene flow between subpopulations when assessing connectivity.
Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

This proposal develops scalable R / Bioconductor software infrastructure and data resources to integrate complex, heterogeneous, and large cancer genomic experiments. The falling cost of genomic assays facilitates collection of multiple data types (e.g., gene and transcript expression, structural variation, copy number, methylation, and microRNA data) from a set of clinical specimens. Furthermore, substantial resources are now available from large consortium activities like The Cancer Genome Atlas (TCGA).
Phage phenomics: Physiological approaches to characterize novel viral proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sanchez, Savannah E.; Cuevas, Daniel A.; Rostron, Jason E.

Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60 - 95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides bi-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Thus, representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
Mitochondrial disease associated with complex I (NADH-CoQ oxidoreductase) deficiency.

PubMed

Scheffler, Immo E

2015-05-01

Mitochondrial diseases due to a reduced capacity for oxidative phosphorylation were first identified more than 20 years ago, and their incidence is now recognized to be quite significant. In a large proportion of cases the problem can be traced to a complex I (NADH-CoQ oxidoreductase) deficiency (Phenotype MIM #252010). Because the complex consists of 44 subunits, there are many potential targets for pathogenic mutations, both on the nuclear and mitochondrial genomes. Surprisingly, however, almost half of the complex I deficiencies are due to defects in as yet unidentified genes that encode proteins other than the structural proteins of the complex. This review attempts to summarize what we know about the molecular basis of complex I deficiencies: mutations in the known structural genes, and mutations in an increasing number of genes encoding "assembly factors", that is, proteins required for the biogenesis of a functional complex I that are not found in the final complex I. More such genes must be identified before definitive genetic counselling can be applied in all cases of affected families.
Phage phenomics: Physiological approaches to characterize novel viral proteins

DOE PAGES

Sanchez, Savannah E.; Cuevas, Daniel A.; Rostron, Jason E.; ...

2015-06-11

Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60 - 95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides bi-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Thus, representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
Profile of the genes expressed in the human peripheral retina, macula, and retinal pigment epithelium determined through serial analysis of gene expression (SAGE)

PubMed Central

Sharon, Dror; Blackshaw, Seth; Cepko, Constance L.; Dryja, Thaddeus P.

2002-01-01

We used the serial analysis of gene expression (SAGE) technique to catalogue and measure the relative levels of expression of the genes expressed in the human peripheral retina, macula, and retinal pigment epithelium (RPE) from one or both of two humans, aged 88 and 44 years. The cone photoreceptor contribution to all transcription in the retina was found to be similar in the macula versus the retinal periphery, whereas the rod contribution was greater in the periphery versus the macula. Genes encoding structural proteins for axons were found to be expressed at higher levels in the macula versus the retinal periphery, probably reflecting the large proportion of ganglion cells in the central retina. In comparison with the younger eye, the peripheral retina of the older eye had a substantially higher proportion of mRNAs from genes encoding proteins involved in iron metabolism or protection against oxidative damage and a substantially lower proportion of mRNAs from genes encoding proteins involved in rod phototransduction. These differences may reflect the difference in age between the two donors or merely interindividual variation. The RPE library had numerous previously unencountered tags, suggesting that this cell type has a large, idiosyncratic repertoire of expressed genes. Comparison of these libraries with 100 reported nonocular SAGE libraries revealed 89 retina-specific or enriched genes expressed at substantial levels, of which 14 are known to cause a retinal disease and 53 are RPE-specific genes. We expect that these libraries will serve as a resource for understanding the relative expression levels of genes in the retina and the RPE and for identifying additional disease genes. PMID:11756676
Social interactions predict genetic diversification: an experimental manipulation in shorebirds.

PubMed

Cunningham, Charles; Parra, Jorge E; Coals, Lucy; Beltrán, Marcela; Zefania, Sama; Székely, Tamás

2018-01-01

Mating strategy and social behavior influence gene flow and hence affect levels of genetic differentiation and potentially speciation. Previous genetic analyses of closely related plovers Charadrius spp. found strikingly different population genetic structure in Madagascar: Kittlitz's plovers are spatially homogenous whereas white-fronted plovers have well segregated and geographically distinct populations. Here, we test the hypotheses that Kittlitz's plovers are spatially interconnected and have extensive social interactions that facilitate gene flow, whereas white-fronted plovers are spatially discrete and have limited social interactions. By experimentally removing mates from breeding pairs and observing the movements of mate-searching plovers in both species, we compare the spatial behavior of Kittlitz's and white-fronted plovers within a breeding season. The behavior of experimental birds was largely consistent with expectations: Kittlitz's plovers travelled further, sought new mates in larger areas, and interacted with more individuals than white-fronted plovers; however, there was no difference in breeding dispersal. These results suggest that mating strategies, through spatial behavior and social interactions, are predictors of gene flow and thus genetic differentiation and speciation. Our study highlights the importance of using social behavior to understand gene flow. However, further work is needed to investigate the relative importance of social structure, as well as intra- and inter-season dispersal, in influencing the genetic structures of populations.
Structure and function of the adhesive type IV pilus of Sulfolobus acidocaldarius

PubMed Central

Henche, Anna-Lena; Ghosh, Abhrajyoti; Yu, Xiong; Jeske, Torsten; Egelman, Edward; Albers, Sonja-Verena

2014-01-01

Archaea display a variety of type IV pili on their surface and employ them in different physiological functions. In the crenarchaeon Sulfolobus acidocaldarius the most abundant surface structure is the aap pilus (archaeal adhesive pilus). The construction of in-frame deletions of the aap genes revealed that all five genes (aapA, aapX, aapE, aapF, aapB) are indispensable for assembly of the pilus, and an impact on surface motility and biofilm formation was observed. Our analyses revealed that there exists a regulatory cross-talk between the expression of aap genes and archaella (formerly archaeal flagella) genes during different growth phases. The structure of the aap pilus is entirely different from the known bacterial type IV pili as well as other archaeal type IV pili. An aap pilus displayed three-stranded helices where there is a rotation per subunit of ~ 138° and a rise per subunit of ~ 5.7 Å. The filaments have a diameter of ~ 110 Å and the resolution was judged to be ~ 9 Å. We concluded that small changes in sequence might be amplified by large changes in higher-order packing. Our finding of an extraordinary stability of aap-pili possibly represents an adaptation to harsh environments that S. acidocaldarius encounters. PMID:23078543
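The two helical parameters quoted in this abstract determine how many subunits make up one full turn and how far the helix advances per turn. The sketch below is simple arithmetic on the approximate reported values, assuming they describe a single one-start helix; the abstract also mentions three-stranded helices and does not give these derived quantities, so treat the results as illustrative only.

```python
# Approximate values quoted in the abstract.
rotation_deg = 138.0   # rotation per subunit, degrees
rise_angstrom = 5.7    # axial rise per subunit, angstroms

# Assumption: a simple one-start helical description.
subunits_per_turn = 360.0 / rotation_deg
pitch_angstrom = subunits_per_turn * rise_angstrom

print(f"subunits per turn: {subunits_per_turn:.2f}")
print(f"pitch: {pitch_angstrom:.1f} angstroms")
```

Because both inputs are given only approximately, the derived values carry the same uncertainty.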

[Botulism: structure and function of botulinum toxin and its clinical application].

PubMed

Oguma, Keiji; Yamamoto, Yumiko; Suzuki, Tomonori; Fatmawati, Ni Nengah Dwi; Fujita, Kumiko

2012-08-01

Clostridium botulinum produces seven immunologically distinct poisonous neurotoxins, A to G, with molecular masses of approximately 150 kDa. In acidic foods and culture fluid, the neurotoxins associate with non-toxic components and form large complexes designated progenitor toxins. The progenitor toxins are found in three forms named LL, L, and M. These neurotoxins and progenitor toxins were purified, and whole nucleotide sequences of their structural genes were determined. In this manuscript, the structure and function of these toxins and their clinical application are described.
BAC mediated transgenic Large White boars with FSHα/β genes from Chinese Erhualian pigs.

PubMed

Xu, Pan; Li, Qiuyan; Jiang, Kai; Yang, Qiang; Bi, Mingjun; Jiang, Chao; Wang, Xiaopeng; Wang, Chengbin; Li, Longyun; Qiao, Chuanmin; Gong, Huanfa; Xing, Yuyun; Ren, Jun

2016-10-01

Follicle-stimulating hormone (FSH) is a critical hormone regulating reproduction in mammals. Transgenic mice show that overexpression of FSH can improve female fecundity. Using a bacterial artificial chromosome (BAC) system and somatic cell nuclear transfer, we herein generated 67 Large White transgenic (TG) boars harboring FSHα/β genes from Chinese Erhualian pigs, the most prolific breed in the world. We selected two F0 TG boars for further breeding and conducted molecular characterization and biosafety assessment for F1 boars. We showed that 8-9 copies of exogenous FSHα and 5-6 copies of exogenous FSHβ were integrated into the genome of transgenic pigs. The inheritance of exogenous genes conforms to Mendel's law of segregation. TG boars had higher levels of serum FSH, FSHα mRNA in multiple tissues, FSHβ protein in pituitary and more germ cells per seminiferous tubule compared with their wild-type half sibs without any reproductive defects. Analysis of growth curve, hematological and biochemical parameters and histopathology illustrated that TG boars grew healthily and normally. By applying 16S rRNA gene sequencing, we demonstrated that exogenous genes had no impact on the bacterial community structures of pig guts. Moreover, foreign gene drift did not occur as verified by horizontal gene transfer. Our findings indicate that overexpression of FSH could improve spermatogenesis ability of boars. This work provides insight into the effect of FSHα/β genes on male reproductive performance in pigs by a BAC-mediated transgenic approach.
Metagenomic analysis reveals significant changes of microbial compositions and protective functions during drinking water treatment.

PubMed

Chao, Yuanqing; Ma, Liping; Yang, Ying; Ju, Feng; Zhang, Xu-Xiang; Wu, Wei-Min; Zhang, Tong

2013-12-19

The metagenomic approach was applied to characterize variations of microbial structure and functions in raw (RW) and treated water (TW) in a drinking water treatment plant (DWTP) at Pearl River Delta, China. Microbial structure was significantly influenced by the treatment processes, shifting from Gammaproteobacteria and Betaproteobacteria in RW to Alphaproteobacteria in TW. Further functional analysis indicated the basic metabolic functions of microorganisms in TW did not vary considerably. However, protective functions, i.e. glutathione synthesis genes in 'oxidative stress' and 'detoxification' subsystems, significantly increased, revealing the surviving bacteria may have higher chlorine resistance. Similar results were also found in glutathione metabolism pathway, which identified the major reaction for glutathione synthesis and supported more genes for glutathione metabolism existed in TW. This metagenomic study largely enhanced our knowledge about the influences of treatment processes, especially chlorination, on bacterial community structure and protective functions (e.g. glutathione metabolism) in ecosystems of DWTPs.
Resistance to malaria through structural variation of red blood cell invasion receptors

PubMed Central

Leffler, Ellen M.; Band, Gavin; Busby, George B.J.; Kivinen, Katja; Le, Quang Si; Clarke, Geraldine M.; Bojang, Kalifa A.; Conway, David J.; Jallow, Muminatou; Sisay-Joof, Fatoumatta; Bougouma, Edith C.; Mangano, Valentina D.; Modiano, David; Sirima, Sodiomon B.; Achidi, Eric; Apinjoh, Tobias O.; Marsh, Kevin; Ndila, Carolyne M.; Peshu, Norbert; Williams, Thomas N.; Drakeley, Chris; Manjurano, Alphaxard; Reyburn, Hugh; Riley, Eleanor; Kachala, David; Molyneux, Malcolm; Nyirongo, Vysaul; Taylor, Terrie; Thornton, Nicole; Tilley, Louise; Grimsley, Shane; Drury, Eleanor; Stalker, Jim; Cornelius, Victoria; Hubbart, Christina; Jeffreys, Anna E.; Rowlands, Kate; Rockett, Kirk A.; Spencer, Chris C.A.; Kwiatkowski, Dominic P.

2017-01-01

The malaria parasite Plasmodium falciparum invades human red blood cells via interactions between host and parasite surface proteins. By analyzing genome sequence data from human populations, including 1269 individuals from sub-Saharan Africa, we identify a diverse array of large copy number variants affecting the host invasion receptor genes GYPA and GYPB. We find that a nearby association with severe malaria is explained by a complex structural rearrangement involving the loss of GYPB and gain of two GYPB-A hybrid genes, which encode a serologically distinct blood group antigen known as Dantu. This variant reduces the risk of severe malaria by 40% and has recently risen in frequency in parts of Kenya, yet it appears to be absent from west Africa. These findings link structural variation of red blood cell invasion receptors with natural resistance to severe malaria. PMID:28522690
Crystal structure of Bacillus subtilis YabJ, a purine regulatory protein and member of the highly conserved YjgF family

PubMed Central

Sinha, Sangita; Rappu, Pekka; Lange, S. C.; Mäntsälä, Pekka; Zalkin, Howard; Smith, Janet L.

1999-01-01

The yabJ gene in Bacillus subtilis is required for adenine-mediated repression of purine biosynthetic genes in vivo and codes for an acid-soluble, 14-kDa protein. The molecular mechanism of YabJ is unknown. YabJ is a member of a large, widely distributed family of proteins of unknown biochemical function. The 1.7-Å crystal structure of YabJ reveals a trimeric organization with extensive buried hydrophobic surface and an internal water-filled cavity. The most important finding in the structure is a deep, narrow cleft between subunits lined with nine side chains that are invariant among the 25 most similar homologs. This conserved site is proposed to be a binding or catalytic site for a ligand or substrate that is common to YabJ and other members of the YER057c/YjgF/UK114 family of proteins. PMID:10557275
Resistance to malaria through structural variation of red blood cell invasion receptors.

PubMed

Leffler, Ellen M; Band, Gavin; Busby, George B J; Kivinen, Katja; Le, Quang Si; Clarke, Geraldine M; Bojang, Kalifa A; Conway, David J; Jallow, Muminatou; Sisay-Joof, Fatoumatta; Bougouma, Edith C; Mangano, Valentina D; Modiano, David; Sirima, Sodiomon B; Achidi, Eric; Apinjoh, Tobias O; Marsh, Kevin; Ndila, Carolyne M; Peshu, Norbert; Williams, Thomas N; Drakeley, Chris; Manjurano, Alphaxard; Reyburn, Hugh; Riley, Eleanor; Kachala, David; Molyneux, Malcolm; Nyirongo, Vysaul; Taylor, Terrie; Thornton, Nicole; Tilley, Louise; Grimsley, Shane; Drury, Eleanor; Stalker, Jim; Cornelius, Victoria; Hubbart, Christina; Jeffreys, Anna E; Rowlands, Kate; Rockett, Kirk A; Spencer, Chris C A; Kwiatkowski, Dominic P

2017-06-16

The malaria parasite Plasmodium falciparum invades human red blood cells by a series of interactions between host and parasite surface proteins. By analyzing genome sequence data from human populations, including 1269 individuals from sub-Saharan Africa, we identify a diverse array of large copy-number variants affecting the host invasion receptor genes GYPA and GYPB. We find that a nearby association with severe malaria is explained by a complex structural rearrangement involving the loss of GYPB and gain of two GYPB-A hybrid genes, which encode a serologically distinct blood group antigen known as Dantu. This variant reduces the risk of severe malaria by 40% and has recently increased in frequency in parts of Kenya, yet it appears to be absent from west Africa. These findings link structural variation of red blood cell invasion receptors with natural resistance to severe malaria. Copyright © 2017, American Association for the Advancement of Science.
Functionally Relevant Microsatellite Markers From Chickpea Transcription Factor Genes for Efficient Genotyping Applications and Trait Association Mapping

PubMed Central

Kujur, Alice; Bajaj, Deepak; Saxena, Maneesha S.; Tripathi, Shailesh; Upadhyaya, Hari D.; Gowda, C.L.L.; Singh, Sube; Jain, Mukesh; Tyagi, Akhilesh K.; Parida, Swarup K.

2013-01-01

We developed 1108 transcription factor gene-derived microsatellite (TFGMS) and 161 transcription factor functional domain-associated microsatellite (TFFDMS) markers from 707 TFs of chickpea. The robust amplification efficiency (96.5%) and high intra-specific polymorphic potential (34%) detected by markers suggest their immense utilities in efficient large-scale genotyping applications, including construction of both physical and functional transcript maps and understanding population structure. Candidate gene-based association analysis revealed strong genetic association of TFFDMS markers with three major seed and pod traits. Further, TFGMS markers in the 5′ untranslated regions of TF genes showing differential expression during seed development had higher trait association potential. The significance of TFFDMS markers was demonstrated by correlating their allelic variation with amino acid sequence expansion/contraction in the functional domain and alteration of secondary protein structure encoded by genes. The seed weight-associated markers were validated through traditional bi-parental genetic mapping. The determination of gene-specific linkage disequilibrium (LD) patterns in desi and kabuli based on single nucleotide polymorphism-microsatellite marker haplotypes revealed extended LD decay, enhanced LD resolution and trait association potential of genes. The evolutionary history of a strong seed-size/weight-associated TF based on natural variation and haplotype sharing among desi, kabuli and wild unravelled useful information having implication for seed-size trait evolution during chickpea domestication. PMID:23633531
Structural Characterization and Evolutionary Relationship of High-Molecular-Weight Glutenin Subunit Genes in Roegneria nakaii and Roegneria alashanica.

PubMed

Zhang, Lujun; Li, Zhixin; Fan, Renchun; Wei, Bo; Zhang, Xiangqi

2016-07-19

The Roegneria of Triticeae is a large genus including about 130 allopolyploid species. Little is known about its high-molecular-weight glutenin subunits (HMW-GSs). Here, we reported six novel HMW-GS genes from R. nakaii and R. alashanica. Sequencing indicated that Rny1, Rny3, and Ray1 possessed intact open reading frames (ORFs), whereas Rny2, Rny4, and Ray2 harbored in-frame stop codons. All six genes possessed a primary structure similar to that of known HMW-GSs, while showing some unique characteristics. Their coding regions were significantly shorter than Glu-1 genes in wheat. The amino acid sequences revealed that all six genes were intermediate towards the y-type. The phylogenetic analysis showed that the HMW-GSs from species with St, StY, or StH genome(s) clustered in an independent clade, varying from the typical x- and y-type clusters. Thus, the Glu-1 locus in R. nakaii and R. alashanica is a very primitive glutenin locus across evolution. The six genes split phylogenetically into two groups that clustered in different clades; each of the two clades included the HMW-GSs from species with St (diploid and tetraploid species), StY, and StH genomes. Hence, it is concluded that the six Roegneria HMW-GS genes are from two St genomes undergoing slight differentiation.
Large-scale structural alteration of brain in epileptic children with SCN1A mutation.

PubMed

Lee, Yun-Jeong; Yum, Mi-Sun; Kim, Min-Jee; Shim, Woo-Hyun; Yoon, Hee Mang; Yoo, Il Han; Lee, Jiwon; Lim, Byung Chan; Kim, Ki Joong; Ko, Tae-Sung

2017-01-01

Mutations in the SCN1A gene encoding the alpha 1 subunit of the voltage-gated sodium channel are associated with several epilepsy syndromes including genetic epilepsy with febrile seizures plus (GEFS+) and severe myoclonic epilepsy of infancy (SMEI). However, in most patients with SCN1A mutation, brain imaging has reported normal or non-specific findings including cerebral or cerebellar atrophy. The aim of this study was to investigate differences in brain morphometry in epileptic children with SCN1A mutation compared to healthy control subjects. We obtained cortical morphology (thickness and surface area) and brain volume (global, subcortical, and regional) measurements using FreeSurfer (version 5.3.0, https://surfer.nmr.mgh.harvard.edu) and compared measurements of children with epilepsy and SCN1A gene mutation (n = 21) with those of age- and gender-matched healthy controls (n = 42). Compared to the healthy control group, children with epilepsy and SCN1A gene mutation exhibited smaller total brain, total gray matter and white matter, cerebellar white matter, and subcortical volumes, as well as mean surface area and mean cortical thickness. A regional analysis revealed significantly reduced gray matter volume in the patient group in the bilateral inferior parietal, left lateral orbitofrontal, left precentral, right postcentral, right isthmus cingulate, right middle temporal area with smaller surface area and white matter volume in some of these areas. However, the regional cortical thickness was not significantly different between the two groups. This study showed large-scale developmental brain changes in patients with epilepsy and SCN1A gene mutation, which may be associated with the core symptoms of the patients. Further longitudinal MRI studies with larger cohorts are required to confirm the effect of SCN1A gene mutation on structural brain development.
A pathway-based network analysis of hypertension-related genes

NASA Astrophysics Data System (ADS)

Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

2016-02-01

The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, and it is especially useful for finding core functions and global behavior of biological systems. Hypertension is a complex disease with many contributing factors, including genetic, physiological, psychological and even social ones. In this paper, based on information about biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationships between genes. Statistical and topological characteristics show that the network has the small-world but not the scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. Using a threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on combinations of drugs acting on multiple genes.
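The hub-selection step described in the abstract above (keep genes whose integrated centrality exceeds 0.71) can be illustrated with a minimal sketch. The abstract does not define "integrated centrality", so the combined score below (the mean of min-max rescaled degree and closeness centrality) is purely an assumption made for illustration, and the gene labels `g1` to `g6` and the toy edge list are hypothetical rather than taken from the study.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nb) / (n - 1) for v, nb in adj.items()}


def closeness_centrality(adj):
    """(Reachable nodes) / (sum of shortest-path lengths), via BFS."""
    result = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[src] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def rescale(scores):
    """Min-max rescale scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hub_genes(adj, threshold=0.71):
    """Return genes whose combined score exceeds the threshold.

    ASSUMPTION: the combined score is the mean of rescaled degree and
    closeness centrality; the paper's own definition may differ.
    """
    deg = rescale(degree_centrality(adj))
    clo = rescale(closeness_centrality(adj))
    combined = {g: (deg[g] + clo[g]) / 2 for g in adj}
    return sorted(g for g, s in combined.items() if s > threshold)


# Hypothetical toy network: g1 is connected to most other genes.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g1", "g5"), ("g5", "g6")]
print(hub_genes(build_adjacency(toy_edges)))  # ['g1']
```

Rescaling each measure before averaging keeps one centrality from dominating the other; a different weighting or a different set of centralities would change which genes pass the cut-off.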
A multifaceted FISH approach to study endogenous RNAs and DNAs in native nuclear and cell structures.

PubMed

Byron, Meg; Hall, Lisa L; Lawrence, Jeanne B

2013-01-01

Fluorescence in situ hybridization (FISH) is not a singular technique, but a battery of powerful and versatile tools for examining the distribution of endogenous genes and RNAs in precise context with each other and in relation to specific proteins or cell structures. This unit offers the details of highly sensitive and successful protocols that were initially developed largely in our lab and honed over a number of years. Our emphasis is on analysis of nuclear RNAs and DNA to address specific biological questions about nuclear structure, pre-mRNA metabolism, or the role of noncoding RNAs; however, cytoplasmic RNA detection is also discussed. Multifaceted molecular cytological approaches bring precise resolution and sensitive multicolor detection to illuminate the organization and functional roles of endogenous genes and their RNAs within the native structure of fixed cells. Solutions to several common technical pitfalls are discussed, as are cautions regarding the judicious use of digital imaging and the rigors of analyzing and interpreting complex molecular cytological results.
Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

PubMed Central

2012-01-01

Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family includes numerous members in many fully sequenced plant genomes. Only two genes from the large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of the identified putative conserved clade-common and clade-specific peptide motifs, and their location on the predicted three-dimensional protein structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the three-dimensional protein model and the location of the phylogenetically specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures and expression profiling, was used to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis presents, for the first time, fundamental information on the organization of the rice OsGELP gene family. By combining the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members of the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Molecular responses in root-associative rhizospheric bacteria to variations in plant exudates

NASA Astrophysics Data System (ADS)

Abdoun, Hamid; McMillan, Mary; Pereg, Lily

2015-04-01

Plant exudates are a major factor in the interface of plant-soil-microbe interactions and it is well documented that the microbial community structure in the rhizosphere is largely influenced by the particular exudates excreted by various plants. Azospirillum brasilense is a plant growth promoting rhizobacterium that is known to interact with a large number of plants, including important food crops. The regulatory gene flcA has an important role in this interaction as it controls morphological differentiation of the bacterium that is essential for attachment to root surfaces. Being a response regulatory gene, flcA mediates the response of the bacterial cell to signals from the surrounding rhizosphere. This makes this regulatory gene a good candidate for analysis of the response of bacteria to rhizospheric alterations, in this case, variations in root exudates. We will report on our studies on the response of Azospirillum, an ecologically, scientifically and agriculturally important bacterial genus, to variations in the rhizosphere.
Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava

PubMed Central

Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

2016-01-01

The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava. PMID:26904033
Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava.

PubMed

Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

2016-01-01

The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava.
From the Cover: Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features

NASA Astrophysics Data System (ADS)

Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé

2006-08-01

The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families, make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, I-Min; Chu, Ken; Ratner, Anna

2014-10-28

In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.
Combining functional and structural genomics to sample the essential Burkholderia structome.

PubMed

Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

2013-01-01

The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult-to-crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, giving an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request.
[Radiation biology of structurally different Drosophila genes. Report 2. The vestigial gene: molecular characteristics of chromosome mutations].

PubMed

Afanas'eva, K P; Aleksandrova, M V; Aleksandrov, I D; Korablinova, S V

2012-01-01

The results of a PCR assay of mutation lesions in each of 16 fragments overlapping the entire vestigial (vg) gene of Drosophila melanogaster are presented for 52 gamma-ray-, neutron- and neutron + gamma-ray-induced vg mutants having an inversion or translocation breakpoint within the vg microregion. Four of the 52 mutants studied were found to have large deletions of about 200 kb covering the entire vg gene as well as the adjacent sca and l(2)C gene markers. Of the remaining 48 mutants, 23 (47.9%) were found to have a wild-type gene structure, showing that the exchange breakpoints are located outside the vg gene. The other 25 display intragenic lesions of differing complexity, detected by PCR as the absence of (i) one fragment, (ii) two or more (6-7) adjacent fragments, or (iii) several lesions of type (i), or of types (i) and (ii), simultaneously, separated by normal gene regions. Importantly, 6 of the 25 mutants have the breakpoint inside the vg gene and display type (i) or (ii) lesions in the gene regions containing the putative break, whereas 5 others of the 25 with the above lesions have the exchange breakpoint outside the vg gene. Therefore, the breakpoints underlying either inversions or translocations induced by low- and high-LET radiation are likely to be located both within and outside the gene under study. The formation of exchanges is thus accompanied by DNA deletions of various sizes at the exchange breakpoints. A molecular model of the formation of such exchange-deletion rearrangements is elaborated and presented. In addition, the concept of a predominantly clustered action of both low- and high-LET radiation on the germ cell genome is suggested as a summary of the presented results. The ability of ionizing radiation to induce clusters of genetic alterations in the form of hidden DNA damage as well as gene/chromosome mutations is determined by the track structure and the hierarchical organization of the genome. To detect the quality and frequency patterns of all components of the cluster, molecular, genetic and cytological techniques need to be used jointly.
Functional analysis of the Brassica napus L. phytoene synthase (PSY) gene family.

PubMed

López-Emparán, Ada; Quezada-Martinez, Daniela; Zúñiga-Bustos, Matías; Cifuentes, Víctor; Iñiguez-Luy, Federico; Federico, María Laura

2014-01-01

Phytoene synthase (PSY) has been shown to catalyze the first committed and rate-limiting step of carotenogenesis in several crop species, including Brassica napus L. Due to its pivotal role, PSY has been a prime target for breeding and metabolic engineering the carotenoid content of seeds, tubers, fruits and flowers. In Arabidopsis thaliana, PSY is encoded by a single copy gene but small PSY gene families have been described in monocot and dicotyledonous species. We have recently shown that PSY genes have been retained in a triplicated state in the A- and C-Brassica genomes, with each paralogue mapping to syntenic locations in each of the three "Arabidopsis-like" subgenomes. Most importantly, we have shown that in B. napus all six members are expressed, exhibiting overlapping redundancy and signs of subfunctionalization among photosynthetic and non photosynthetic tissues. The question of whether this large PSY family actually encodes six functional enzymes remained to be answered. Therefore, the objectives of this study were to: (i) isolate, characterize and compare the complete protein coding sequences (CDS) of the six B. napus PSY genes; (ii) model their predicted tridimensional enzyme structures; (iii) test their phytoene synthase activity in a heterologous complementation system and (iv) evaluate their individual expression patterns during seed development. This study further confirmed that the six B. napus PSY genes encode proteins with high sequence identity, which have evolved under functional constraint. Structural modeling demonstrated that they share similar tridimensional protein structures with a putative PSY active site. Significantly, all six B. napus PSY enzymes were found to be functional. 
Taking into account the specific patterns of expression exhibited by these PSY genes during seed development and recent knowledge of PSY suborganellar localization, the selection of transgene candidates for metabolic engineering the carotenoid content of oilseeds is discussed.

Whole genome comparison between table and wine grapes reveals a comprehensive catalog of structural variants

PubMed Central

2014-01-01

Background Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grape and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. ‘Sultanina’ plays a pivotal role in modern table grape breeding providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisins production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. Results Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in ‘Sultanina’ and six SNPs in other 23 table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. Conclusions This work produced the first structural variants and SNPs catalog for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker assisted breeding in table grapes. PMID:24397443
An Integrative Framework for Bayesian Variable Selection with Informative Priors for Identifying Genes and Pathways

PubMed Central

Ander, Bradley P.; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R.; Yang, Xiaowei

2013-01-01

The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with ‘large p, small n’ problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables the user to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed. PMID:23844055
An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.

PubMed

Peng, Bin; Zhu, Dianwen; Ander, Bradley P; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R; Yang, Xiaowei

2013-01-01

The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with 'large p, small n' problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables the user to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed.
Morphological and genetic analysis of four color morphs of bean leaf beetle, Cerotoma trifurcata (Coleoptera: Chrysomelidae)

USDA-ARS's Scientific Manuscript database

Bean leaf beetle (BLB) exhibits a relatively large amount of morphological variation in terms of color but little is known about the underlying genetic structure and gene flow. Genetic variation among four color phenotypes of the BLB was analyzed using amplified fragment length polymorphisms (AFLP) ...
Filtering Gene Ontology semantic similarity for identifying protein complexes in large protein interaction networks.

PubMed

Wang, Jian; Xie, Dong; Lin, Hongfei; Yang, Zhihao; Zhang, Yijia

2012-06-21

Protein complexes are important in many biological processes, and various computational approaches have been developed to identify complexes from protein-protein interaction (PPI) networks. However, the high false-positive rate of PPIs makes identification challenging. A protein semantic similarity measure is proposed in this study, based on the ontology structure of Gene Ontology (GO) terms and GO annotations, to estimate the reliability of interactions in PPI networks. Interaction pairs with low GO semantic similarity are removed from the network as unreliable interactions. Then, a cluster-expanding algorithm is used to detect complexes with a core-attachment structure on the filtered network. Our method is applied to three different yeast PPI networks. The effectiveness of our method is examined on two benchmark complex datasets. Experimental results show that our method performed better than other state-of-the-art approaches in most evaluation metrics. The method detects protein complexes from large-scale PPI networks by filtering on GO semantic similarity. Removing interactions with low GO similarity significantly improves the performance of complex identification. The expanding strategy is also effective in identifying attachment proteins of complexes.
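The filtering step in the abstract above (drop interaction pairs whose GO semantic similarity is low) can be sketched as follows. The abstract does not give the similarity formula, so this sketch substitutes a plain Jaccard overlap of annotation sets as a stand-in; the protein names, term labels, and the 0.5 cut-off are all hypothetical and chosen only to make the example runnable.

```python
def jaccard(terms_a, terms_b):
    """Overlap of two annotation sets: |A & B| / |A | B| (0.0 if both empty)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def filter_interactions(edges, annotations, min_similarity):
    """Keep only interactions whose annotation similarity reaches the cut-off.

    ASSUMPTION: Jaccard overlap stands in for the GO-structure-based
    semantic similarity measure used in the study.
    """
    kept = []
    for p, q in edges:
        sim = jaccard(annotations.get(p, set()), annotations.get(q, set()))
        if sim >= min_similarity:
            kept.append((p, q))
    return kept


# Hypothetical proteins and annotation terms.
annotations = {
    "pA": {"t1", "t2", "t3"},
    "pB": {"t1", "t2"},
    "pC": {"t9"},
}
edges = [("pA", "pB"), ("pA", "pC"), ("pB", "pC")]

print(filter_interactions(edges, annotations, min_similarity=0.5))
# [('pA', 'pB')]
```

A measure that uses the ontology graph itself (for example, shared ancestors of terms) would score related but non-identical terms above zero, which a set overlap like this cannot do. Complex detection would then run on the filtered edge list.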
Gene expression of Caenorhabditis elegans neurons carries information on their synaptic connectivity.

PubMed

Kaufman, Alon; Dror, Gideon; Meilijson, Isaac; Ruppin, Eytan

2006-12-08

The claim that genetic properties of neurons significantly influence their synaptic network structure is a common notion in neuroscience. The nematode Caenorhabditis elegans provides an exciting opportunity to approach this question in a large-scale quantitative manner. Its synaptic connectivity network has been identified, and, combined with cellular studies, we currently have characteristic connectivity and gene expression signatures for most of its neurons. By using two complementary analysis assays we show that the expression signature of a neuron carries significant information about its synaptic connectivity signature, and identify a list of putative genes predicting neural connectivity. The current study rigorously quantifies the relation between gene expression and synaptic connectivity signatures in the C. elegans nervous system and identifies subsets of neurons where this relation is highly marked. The results presented and the genes identified provide a promising starting point for further, more detailed computational and experimental investigations.
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

PubMed Central

Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

2018-01-01

Abstract Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
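For readers unfamiliar with the algorithm that HipMCL parallelizes, the following is a minimal single-machine sketch of Markov Clustering: alternate expansion (squaring the column-stochastic matrix) and inflation (element-wise powering followed by column renormalization) until the matrix stops changing. It is a dense, pure-Python illustration and not HipMCL's distributed implementation; the inflation value, tolerance, cluster read-out rule, and toy graph are assumptions.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    for j in range(n):
        col_sum = sum(m[i][j] for i in range(n))
        if col_sum > 0:
            for i in range(n):
                m[i][j] /= col_sum
    return m


def expand(m):
    """Expansion step: matrix square, i.e. two-step random-walk flow."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: element-wise power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, as is customary for MCL, then make columns stochastic.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Read clusters off the rows that still carry flow (simplified rule).
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two disjoint triangles, nodes 0-2 and 3-5.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(two_triangles))  # [[0, 1, 2], [3, 4, 5]]
```

The expansion step here costs O(n^3) time and O(n^2) memory on a dense matrix, which gives some intuition for the running-time and memory bottleneck that the abstract describes for large networks.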
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE PAGES

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

2018-01-05

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
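Illustrative sketch (not part of the record above, and not HipMCL's distributed implementation): the core Markov clustering loop alternates expansion (squaring a column-stochastic matrix) with inflation (raising entries to a power and renormalizing columns) until the matrix stops changing. The function names, the default inflation value of 2.0, the added self-loops, and the dense list-of-lists representation are assumptions chosen for a small single-machine example.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Toy dense Markov clustering; returns clusters as sorted tuples of node indices."""
    n = len(adjacency)
    # Self-loops (an assumption commonly made in MCL) keep every column sum positive.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: one matrix squaring, i.e. two steps of the random walk.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power followed by column renormalization.
        inflated = normalize_columns([[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row with non-negligible mass lists the members of one cluster.
    clusters = {tuple(j for j in range(n) if row[j] > 1e-6) for row in m}
    clusters.discard(())
    return sorted(clusters)
```

For example, calling mcl on the adjacency matrix of two disconnected triangles returns one cluster per triangle. The dense matrix product costs O(n^3) per iteration, which is the kind of cost that makes sparse and distributed implementations necessary at the network sizes quoted in the abstract.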
Chromatin as active matter

NASA Astrophysics Data System (ADS)

Agrawal, Ankit; Ganai, Nirmalendu; Sengupta, Surajit; Menon, Gautam I.

2017-01-01

Active matter models describe a number of biophysical phenomena at the cell and tissue scale. Such models explore the macroscopic consequences of driving specific soft condensed matter systems of biological relevance out of equilibrium through ‘active’ processes. Here, we describe how active matter models can be used to study the large-scale properties of chromosomes contained within the nuclei of human cells in interphase. We show that polymer models for chromosomes that incorporate inhomogeneous activity reproduce many general, yet little understood, features of large-scale nuclear architecture. These include: (i) the spatial separation of gene-rich, low-density euchromatin, predominantly found towards the centre of the nucleus, vis-à-vis gene-poor, denser heterochromatin, typically enriched in proximity to the nuclear periphery, (ii) the differential positioning of individual gene-rich and gene-poor chromosomes, (iii) the formation of chromosome territories, as well as (iv) the weak size-dependence of the positions of individual chromosome centres-of-mass relative to the nuclear centre that is seen in some cell types. Such structuring is induced purely by the combination of activity and confinement and is absent in thermal equilibrium. We systematically explore active matter models for chromosomes, discussing how our model can be generalized to study variations in chromosome positioning across different cell types. The approach and model we outline here represent a preliminary attempt towards a quantitative, first-principles description of the large-scale architecture of the cell nucleus.
Unstable genomes elevate transcriptome dynamics

PubMed Central

Stevens, Joshua B.; Liu, Guo; Abdallah, Batoul Y.; Horne, Steven D.; Ye, Karen J.; Bremer, Steven W.; Ye, Christine J.; Krawetz, Stephen A.; Heng, Henry H.

2015-01-01

The challenge of identifying common expression signatures in cancer is well known; however, the reason behind this is largely unclear. Traditionally, variation in expression signatures has been attributed to technological problems; however, recent evidence suggests that chromosome instability (CIN) and resultant karyotypic heterogeneity may be a large contributing factor. Using a well-defined model of immortalization, we systematically compared the pattern of genome alteration and expression dynamics during somatic evolution. Co-measurement of global gene expression and karyotypic alteration throughout the immortalization process reveals that karyotype changes influence gene expression as major structural and numerical karyotypic alterations result in large gene expression deviation. Replicate samples from stages with stable genomes are more similar to each other than are replicate samples with karyotypic heterogeneity. Karyotypic and gene expression change during immortalization is dynamic as each stage of progression has a unique expression pattern. This was further verified by comparing global expression in two replicates grown in one flask with known karyotypes. Replicates with higher karyotypic instability were found to be less similar than replicates with stable karyotypes. These data illustrate that the karyotype, transcriptome, and transcriptome-determined pathways are in constant flux during somatic cellular evolution (particularly during the macroevolutionary phase) and this flux is an inextricable feature of CIN and essential for cancer formation. The findings presented here underscore the importance of understanding the evolutionary process of cancer in order to design improved treatment modalities. PMID:24122714
Contribution of above- and below-ground plant traits to the structure and function of grassland soil microbial communities

PubMed Central

Legay, N.; Baxendale, C.; Grigulis, K.; Krainer, U.; Kastl, E.; Schloter, M.; Bardgett, R. D.; Arnoldi, C.; Bahn, M.; Dumont, M.; Poly, F.; Pommier, T.; Clément, J. C.; Lavorel, S.

2014-01-01

Background and Aims Abiotic properties of soil are known to be major drivers of the microbial community within it. Our understanding of how soil microbial properties are related to the functional structure and diversity of plant communities, however, is limited and largely restricted to above-ground plant traits, with the role of below-ground traits being poorly understood. This study investigated the relative contributions of soil abiotic properties and plant traits, both above-ground and below-ground, to variations in microbial processes involved in grassland nitrogen turnover. Methods In mountain grasslands distributed across three European sites, a correlative approach was used to examine the role of a large range of plant functional traits and soil abiotic factors on microbial variables, including gene abundance of nitrifiers and denitrifiers and their potential activities. Key Results Direct effects of soil abiotic parameters were found to have the most significant influence on the microbial groups investigated. Indirect pathways via plant functional traits contributed substantially to explaining the relative abundance of fungi and bacteria and gene abundances of the investigated microbial communities, while they explained little of the variance in microbial activities. Gene abundances of nitrifiers and denitrifiers were most strongly related to below-ground plant traits, suggesting that they were the most relevant traits for explaining variation in community structure and abundances of soil microbes involved in nitrification and denitrification. Conclusions The results suggest that consideration of plant traits, and especially below-ground traits, increases our ability to describe variation in the abundances and the functional characteristics of microbial communities in grassland soils. PMID:25122656
Nephrogenic diabetes insipidus: an X chromosome-linked dominant inheritance pattern with a vasopressin type 2 receptor gene that is structurally normal.

PubMed Central

Friedman, E; Bale, A E; Carson, E; Boson, W L; Nordenskjöld, M; Ritzén, M; Ferreira, P C; Jammal, A; De Marco, L

1994-01-01

Nephrogenic diabetes insipidus is a rare hereditary disorder, most commonly transmitted in an X chromosome-linked recessive manner and characterized by the lack of renal response to the action of antidiuretic hormone [Arg8]vasopressin. The vasopressin type 2 receptor (V2R) has been suggested to be the gene that causes the disease, and its role in disease pathogenesis is supported by mutations within this gene in affected individuals. Using the PCR, denaturing gradient gel electrophoresis, and direct DNA sequencing, we examined the V2R gene in four unrelated kindreds. In addition, linkage analysis with chromosome Xq28 markers was done in one large Brazilian kindred with an apparent unusual X chromosome-linked dominant inheritance pattern. In one family, a mutation in codon 280, causing a Tyr→Cys substitution in the sixth transmembrane domain of the receptor, was found. In the other three additional families with nephrogenic diabetes insipidus, the V2R-coding region was normal in sequence. In one large Brazilian kindred displaying an unusual X chromosome-linked dominant mode of inheritance, the disease-related gene was localized to the same region of the X chromosome as the V2R, but no mutations were found, thus raising the possibility that this disease is caused by a gene other than V2R. PMID:8078903
Uncovering the functional constraints underlying the genomic organization of the odorant-binding protein genes.

PubMed

Librado, Pablo; Rozas, Julio

2013-01-01

Animal olfactory systems have a critical role for the survival and reproduction of individuals. In insects, the odorant-binding proteins (OBPs) are encoded by a moderately sized gene family, and mediate the first steps of the olfactory processing. Most OBPs are organized in clusters of a few paralogs, which are conserved over time. Currently, the biological mechanism explaining the close physical proximity among OBPs is not yet established. Here, we conducted a comprehensive study aiming to gain insights into the mechanisms underlying the OBP genomic organization. We found that the OBP clusters are embedded within large conserved arrangements. These organizations also include other non-OBP genes, which often encode proteins integral to plasma membrane. Moreover, the conservation degree of such large clusters is related to the following: 1) the promoter architecture of the confined genes, 2) a characteristic transcriptional environment, and 3) the chromatin conformation of the chromosomal region. Our results suggest that chromatin domains may restrict the location of OBP genes to regions having the appropriate transcriptional environment, leading to the OBP cluster structure. However, the appropriate transcriptional environment for OBP and the other neighbor genes is not dominated by reduced levels of expression noise. Indeed, the stochastic fluctuations in the OBP transcript abundance may have a critical role in the combinatorial nature of the olfactory coding process.
Chemical and structural biology of protein lysine deacetylases

PubMed Central

YOSHIDA, Minoru; KUDO, Norio; KOSONO, Saori; ITO, Akihiro

2017-01-01

Histone acetylation is a reversible posttranslational modification that plays a fundamental role in regulating eukaryotic gene expression and chromatin structure/function. Key enzymes for removing acetyl groups from histones are metal (zinc)-dependent and NAD+-dependent histone deacetylases (HDACs). The molecular functions of HDACs have been extensively characterized by various approaches including chemical, molecular, and structural biology, which demonstrated that HDACs regulate cell proliferation, differentiation, and metabolic homeostasis, and that their alterations are deeply involved in various human disorders including cancer. Notably, drug discovery efforts have achieved success in developing HDAC-targeting therapeutics for treatment of several cancers. However, recent advancements in proteomics technology have revealed much broader aspects of HDACs beyond gene expression control. Not only histones but also a large number of cellular proteins are subject to acetylation by histone acetyltransferases (HATs) and deacetylation by HDACs. Furthermore, some of their structures can flexibly accept and hydrolyze other acyl groups on protein lysine residues. This review mainly focuses on structural aspects of HDAC enzymatic activity regulated by interaction with substrates, co-factors, small molecule inhibitors, and activators. PMID:28496053
Chromosome structures: reduction of certain problems with unequal gene content and gene paralogs to integer linear programming.

PubMed

Lyubetsky, Vassily; Gershgorin, Roman; Gorbunov, Konstantin

2017-12-06

Chromosome structure is a very limited model of the genome including the information about its chromosomes such as their linear or circular organization, the order of genes on them, and the DNA strand encoding a gene. Gene lengths, nucleotide composition, and intergenic regions are ignored. Although highly incomplete, such structure can be used in many cases, e.g., to reconstruct phylogeny and evolutionary events, to identify gene synteny, regulatory elements and promoters (considering highly conserved elements), etc. Three problems are considered; all assume unequal gene content and the presence of gene paralogs. The distance problem is to determine the minimum number of operations required to transform one chromosome structure into another and the corresponding transformation itself including the identification of paralogs in two structures. We use the DCJ model which is one of the most studied combinatorial rearrangement models. Double-, sesqui-, and single-operations as well as deletion and insertion of a chromosome region are considered in the model; the single ones comprise cut and join. In the reconstruction problem, a phylogenetic tree with chromosome structures in the leaves is given. It is necessary to assign the structures to inner nodes of the tree to minimize the sum of distances between terminal structures of each edge and to identify the mutual paralogs in a fairly large set of structures. A linear algorithm is known for the distance problem without paralogs, while the presence of paralogs makes it NP-hard. If paralogs are allowed but the insertion and deletion operations are missing (and special constraints are imposed), the reduction of the distance problem to integer linear programming is known. Apparently, the reconstruction problem is NP-hard even in the absence of paralogs. The problem of contigs is to find the optimal arrangements for each given set of contigs, which also includes the mutual identification of paralogs. 
We proved that these problems can be reduced to integer linear programming formulations, which allows an algorithm to redefine the problems to implement a very special case of the integer linear programming tool. The results were tested on synthetic and biological samples. Three well-known problems were reduced to a very special case of integer linear programming, which is a new method of their solutions. Integer linear programming is clearly among the main computational methods and, as generally accepted, is fast on average; in particular, computation systems specifically targeted at it are available. The challenges are to reduce the size of the corresponding integer linear programming formulations and to incorporate a more detailed biological concept in our model of the reconstruction.
Population structure and strong divergent selection shape phenotypic diversification in maize landraces.

PubMed

Pressoir, G; Berthaud, J

2004-02-01

To conserve the long-term selection potential of maize, it is necessary to investigate past and present evolutionary processes that have shaped quantitative trait variation. Understanding the dynamics of quantitative trait evolution is crucial to future crop breeding. We characterized population differentiation of maize landraces from the State of Oaxaca, Mexico for quantitative traits and molecular markers. Qst values were much higher than Fst values obtained for molecular markers. While low values of Fst (0.011 within-village and 0.003 among-villages) suggest that considerable gene flow occurred among the studied populations, high levels of population differentiation for quantitative traits were observed (ie an among-village Qst value of 0.535 for kernel weight). Our results suggest that although quantitative traits appear to be under strong divergent selection, a considerable amount of gene flow occurs among populations. Furthermore, we characterized nonproportional changes in the G matrix structure both within and among villages that are consequences of farmer selection. As a consequence of these differences in the G matrix structure, the response to multivariate selection will be different from one population to another. Large changes in the G matrix structure could indicate that farmers select for genes of major and pleiotropic effect. Farmers' decision and selection strategies have a great impact on phenotypic diversification in maize landraces.
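Illustrative sketch (not part of the record above): the comparison between Qst and Fst reported in the abstract rests on two standard ratios. For a diploid, randomly mating species, Qst is commonly estimated as the between-population additive genetic variance divided by the sum of that variance and twice the within-population additive variance; Fst for a biallelic marker can be written as (H_T - H_S) / H_T. The function names, the equal weighting of subpopulations, and all input values shown in the usage note are hypothetical and are not taken from the study.

```python
def qst(var_between, var_within):
    """Qst for a diploid, randomly mating species: Vb / (Vb + 2 * Vw)."""
    return var_between / (var_between + 2.0 * var_within)


def fst(allele_freqs):
    """Fst for one biallelic locus from per-population frequencies of one allele.

    Assumes equally weighted subpopulations (a simplification).
    """
    k = len(allele_freqs)
    # H_S: mean expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    # H_T: expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    return (h_t - h_s) / h_t
```

With these definitions, fst([0.5, 0.5]) is 0 (no differentiation) and fst([0.0, 1.0]) is 1 (complete differentiation). A Qst well above the Fst of neutral markers, as the abstract reports, is conventionally read as evidence of divergent selection on the trait.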
Identification and expression profiling analysis of TCP family genes involved in growth and development in maize.

PubMed

Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu

2017-10-01

The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, which is implicated in DNA binding and protein-protein interactions and known as the TCP domain. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) has been conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. What's more, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in development stage in plant growing conditions. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers.

PubMed

Campbell, Kieran R; Yau, Christopher

2017-03-15

Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov-Chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses.
The Genetic Diversity and Structure of Linkage Disequilibrium of the MTHFR Gene in Populations of Northern Eurasia.

PubMed

Trifonova, E A; Eremina, E R; Urnov, F D; Stepanov, V A

2012-01-01

The structure of the haplotypes and linkage disequilibrium (LD) of the methylenetetrahydrofolate reductase gene (MTHFR) in 9 population groups from Northern Eurasia and populations of the international HapMap project was investigated in the present study. The data suggest that the architecture of LD in the human genome is largely determined by the evolutionary history of populations; however, the results of phylogenetic and haplotype analyses seem to suggest that in fact there may be a common "old" mechanism for the formation of certain patterns of LD. Variability in the structure of LD and the level of diversity of MTHFR haplotypes cause a certain set of tagSNPs with an established prognostic significance for each population. In our opinion, the results obtained in the present study are of considerable interest for understanding multiple genetic phenomena: namely, the association of interpopulation differences in the patterns of LD with structures possessing a genetic susceptibility to complex diseases, and the functional significance of the pleiotropic MTHFR gene effect. Summarizing the results of this study, a conclusion can be made that the genetic variability analysis with emphasis on the structure of LD in human populations is a powerful tool that can make a significant contribution to such areas of biomedical science as human evolutionary biology, functional genomics, genetics of complex diseases, and pharmacogenomics.
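Illustrative sketch (not part of the record above): pairwise linkage disequilibrium between two biallelic SNPs is usually summarized by the coefficient D = p_AB - p_A * p_B and by the squared correlation r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The function below computes both from phased haplotype counts; the function name, argument names, and the counts in the usage note are hypothetical and do not come from the MTHFR study.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r^2) for two biallelic loci given phased haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total    # frequency of allele B at locus 2
    d = p_AB - p_A * p_B
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    # r^2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2
```

Counts of (50, 0, 0, 50) give D = 0.25 and r^2 = 1 (complete LD), while (25, 25, 25, 25) give D = 0 and r^2 = 0 (linkage equilibrium). Tag-SNP selection of the kind mentioned in the abstract typically groups SNPs whose pairwise r^2 exceeds a chosen threshold.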

Ecological and genetic determinants of plasmid distribution in Escherichia coli.

PubMed

Medaney, Frances; Ellis, Richard J; Raymond, Ben

2016-11-01

Bacterial plasmids are important carriers of virulence and antibiotic resistance genes. Nevertheless, little is known of the determinants of plasmid distribution in bacterial populations. Here the factors affecting the diversity and distribution of the large plasmids of Escherichia coli were explored in cattle grazing on semi-natural grassland, a set of populations with low frequencies of antibiotic resistance genes. Critically, the population genetic structure of bacterial hosts was characterized. This revealed structured E. coli populations with high diversity between sites and individuals but low diversity within cattle hosts. Plasmid profiles, however, varied considerably within the same E. coli genotype. Both ecological and genetic factors affected plasmid distribution: plasmid profiles were affected by site, E. coli diversity, E. coli genotype and the presence of other large plasmids. Notably, 3/26 E. coli serotypes accounted for half the observed plasmid-free isolates, indicating that within-species variation can substantially affect carriage of the major conjugative plasmids. The observed population structure suggests that most of the opportunities for within-species plasmid transfer occur between different individuals of the same genotype and supports recent experimental work indicating that plasmid-host coevolution and epistatic interactions on fitness costs are likely to be important in determining occupancy. © 2016 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.
Gene expression profiles of changes underlying different-sized human rotator cuff tendon tears.

PubMed

Chaudhury, Salma; Xia, Zhidao; Thakkar, Dipti; Hakimi, Osnat; Carr, Andrew J

2016-10-01

Progressive cellular and extracellular matrix (ECM) changes related to age and disease severity have been demonstrated in rotator cuff tendon tears. Larger rotator cuff tears demonstrate structural abnormalities that potentially adversely influence healing potential. This study aimed to gain greater insight into the relationship of pathologic changes to tear size by analyzing gene expression profiles from normal rotator cuff tendons, small rotator cuff tears, and large rotator cuff tears. We analyzed gene expression profiles of 28 human rotator cuff tendons using microarrays representing the entire genome; 11 large and 5 small torn rotator cuff tendon specimens were obtained intraoperatively from tear edges, which we compared with 12 age-matched normal controls. We performed real-time polymerase chain reaction and immunohistochemistry for validation. Torn rotator cuff tendons demonstrated upregulation of a number of key genes, such as matrix metalloproteinase 3, 10, 12, 13, 15, 21, and 25; a disintegrin and metalloproteinase (ADAM) 12, 15, and 22; and aggrecan. Amyloid was downregulated in all tears. Small tears displayed upregulation of bone morphogenetic protein 5. Chemokines and cytokines that may play a role in chemotaxis were altered; interleukins 3, 10, 13, and 15 were upregulated in tears, whereas interleukins 1, 8, 11, 18, and 27 were downregulated. The gene expression profiles of normal controls and small and large rotator cuff tear groups differ significantly. Extracellular matrix remodeling genes were found to contribute to rotator cuff tear pathogenesis. Rotator cuff tears displayed upregulation of a number of matrix metalloproteinase (3, 10, 12, 13, 15, 21, and 25), a disintegrin and metalloproteinase (ADAM 12, 15, and 22) genes, and downregulation of some interleukins (1, 8, and 27), which play important roles in chemotaxis. 
These gene products may potentially have a role as biomarkers of failure of healing or therapeutic targets to improve tendon healing. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Abnormal behavior associated with a point mutation in the structural gene for monoamine oxidase A

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunner, H.G.; Nelen, M.; Ropers, H.H.

1993-10-22

Genetic and metabolic studies have been done on a large kindred in which several males are affected by a syndrome of borderline mental retardation and abnormal behavior. The types of behavior that occurred include impulsive aggression, arson, attempted rape, and exhibitionism. Analysis of 24-hour urine samples indicated markedly disturbed monoamine metabolism. This syndrome was associated with a complete and selective deficiency of enzymatic activity of monoamine oxidase A (MAOA). In each of five affected males, a point mutation was identified in the eighth exon of the MAOA structural gene, which changes a glutamine to a termination codon. Thus, isolated complete MAOA deficiency in this family is associated with a recognizable behavioral phenotype that includes disturbed regulation of impulsive aggression.
Limits to gene flow in a cosmopolitan marine planktonic diatom.

PubMed

Casteleyn, Griet; Leliaert, Frederik; Backeljau, Thierry; Debeer, Ann-Eline; Kotaki, Yuichi; Rhodes, Lesley; Lundholm, Nina; Sabbe, Koen; Vyverman, Wim

2010-07-20

The role of geographic isolation in marine microbial speciation is hotly debated because of the high dispersal potential and large population sizes of planktonic microorganisms and the apparent lack of strong dispersal barriers in the open sea. Here, we show that gene flow between distant populations of the globally distributed, bloom-forming diatom species Pseudo-nitzschia pungens (clade I) is limited and follows a strong isolation by distance pattern. Furthermore, phylogenetic analysis implies that under appropriate geographic and environmental circumstances, like the pronounced climatic changes in the Pleistocene, population structuring may lead to speciation and hence may play an important role in diversification of marine planktonic microorganisms. A better understanding of the factors that control population structuring is thus essential to reveal the role of allopatric speciation in marine microorganisms.
Bird migratory flyways influence the phylogeography of the invasive brine shrimp Artemia franciscana in its native American range

PubMed Central

Muñoz, Joaquín; Amat, Francisco; Green, Andy J.; Figuerola, Jordi

2013-01-01

Since Darwin’s time, waterbirds have been considered an important vector for the dispersal of continental aquatic invertebrates. Bird movements have facilitated the worldwide invasion of the American brine shrimp Artemia franciscana, transporting cysts (diapausing eggs), and favouring rapid range expansions from introduction sites. Here we address the impact of bird migratory flyways on the population genetic structure and phylogeography of A. franciscana in its native range in the Americas. We examined sequence variation for two mitochondrial gene fragments (COI and 16S for a subset of the data) in a large set of population samples representing the entire native range of A. franciscana. Furthermore, we performed Mantel tests and redundancy analyses (RDA) to test the role of flyways, geography and human introductions on the phylogeography and population genetic structure at a continental scale. A. franciscana mitochondrial DNA was very diverse, with two main clades, largely corresponding to Pacific and Atlantic populations, mirroring American bird flyways. There was a high degree of regional endemism, with populations subdivided into at least 12 divergent, geographically restricted and largely allopatric mitochondrial lineages, and high levels of population structure (ΦST of 0.92), indicating low ongoing gene flow. We found evidence of human-mediated introductions in nine out of 39 populations analysed. Once these populations were removed, Mantel tests revealed a strong association between genetic variation and geographic distance (i.e., isolation-by-distance pattern). RDA showed that shared bird flyways explained around 20% of the variance in genetic distance between populations and this was highly significant, once geographic distance was controlled for. The variance explained increased to 30% when the factor human introduction was included in the model. 
Our findings suggest that bird-mediated transport of brine shrimp propagules does not result in substantial ongoing gene flow; instead, it had a significant historical role on the current species phylogeography, facilitating the colonisation of new aquatic environments as they become available along their main migratory flyways. PMID:24255814
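Illustrative sketch (not part of the record above): a Mantel test of the kind used to detect isolation by distance correlates the off-diagonal entries of two distance matrices and assesses significance by jointly permuting the rows and columns of one of them. The version below uses Pearson correlation and a one-sided permutation p-value; the function names, the default of 999 permutations, and the fixed seed are assumptions, and it does not reproduce the study's redundancy analyses or any control for covariates.

```python
import random


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after relabelling samples by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels, i.e. rows and columns together
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the observed value counts as one permutation.
    return observed, (hits + 1) / (permutations + 1)
```

Permuting sample labels rather than individual matrix cells preserves the dependence among distances that share a population, which is why the test uses permutation instead of a standard correlation p-value.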
PRGdb: a bioinformatics platform for plant resistance gene analysis

PubMed Central

Sanseverino, Walter; Roma, Guglielmo; De Simone, Marco; Faino, Luigi; Melito, Sara; Stupka, Elia; Frusciante, Luigi; Ercolano, Maria Raffaella

2010-01-01

PRGdb is a web accessible open-source (http://www.prgdb.org) database that represents the first bioinformatic resource providing a comprehensive overview of resistance genes (R-genes) in plants. PRGdb holds more than 16 000 known and putative R-genes belonging to 192 plant species challenged by 115 different pathogens and linked with useful biological information. The complete database includes a set of 73 manually curated reference R-genes, 6308 putative R-genes collected from NCBI and 10463 computationally predicted putative R-genes. Thanks to a user-friendly interface, data can be examined using different query tools. A home-made prediction pipeline called Disease Resistance Analysis and Gene Orthology (DRAGO), based on reference R-gene sequence data, was developed to search for plant resistance genes in public datasets such as Unigene and Genbank. New putative R-gene classes containing unknown domain combinations were discovered and characterized. The development of the PRG platform represents an important starting point to conduct various experimental tasks. The inferred cross-link between genomic and phenotypic information allows access to a large body of information to find answers to several biological questions. The database structure also permits easy integration with other data types and opens up prospects for future implementations. PMID:19906694
Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data.

PubMed

Chan, Kuang-Lim; Rosli, Rozana; Tatarinova, Tatiana V; Hogan, Michael; Firdaus-Raih, Mohd; Low, Eng-Ti Leslie

2017-01-27

Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed using various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion. We present an automated gene prediction pipeline, Seqping, that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using the GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure). Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.
Whole-Gene Positive Selection, Elevated Synonymous Substitution Rates, Duplication, and Indel Evolution of the Chloroplast clpP1 Gene

PubMed Central

Erixon, Per; Oxelman, Bengt

2008-01-01

Background Synonymous DNA substitution rates in the plant chloroplast genome are generally relatively slow and lineage dependent. Non-synonymous rates are usually even slower due to purifying selection acting on the genes. Positive selection is expected to speed up non-synonymous substitution rates, whereas synonymous rates are expected to be unaffected. Until recently, positive selection has seldom been observed in chloroplast genes, and large-scale structural rearrangements leading to gene duplications are hitherto supposed to be rare. Methodology/Principal Findings We found high substitution rates in the exons of the plastid clpP1 gene in Oenothera (the Evening Primrose family) and three separate lineages in the tribe Sileneae (Caryophyllaceae, the Carnation family). Introns have been lost in some of the lineages, but where present, the intron sequences have substitution rates similar to those found in other introns of their genomes. The elevated substitution rates of clpP1 are associated with statistically significant whole-gene positive selection in three branches of the phylogeny. In two of the lineages we found multiple copies of the gene. Neighboring genes present in the duplicated fragments do not show signs of elevated substitution rates or positive selection. Although non-synonymous substitutions account for most of the increase in substitution rates, synonymous rates are also markedly elevated in some lineages. Whereas plant clpP1 genes experiencing negative (purifying) selection are characterized by having very conserved lengths, genes under positive selection often have large insertions of more or less repetitive amino acid sequence motifs. Conclusions/Significance We found positive selection of the clpP1 gene in various plant lineages to be correlated with repeated duplication of the clpP1 gene and surrounding regions, repetitive amino acid sequences, and increase in synonymous substitution rates.
The present study sheds light on the controversial issue of whether negative or positive selection is to be expected after gene duplications by providing evidence for the latter alternative. The observed increase in synonymous substitution rates in some of the lineages indicates that the detection of positive selection may be obscured under such circumstances. Future studies are required to explore the functional significance of the large inserted repeated amino acid motifs, as well as the possibility that synonymous substitution rates may be affected by positive selection. PMID:18167545
Major psychological factors affecting acceptance of gene-recombination technology.

PubMed

Tanaka, Yutaka

2004-12-01

The purpose of this study was to verify the validity of a causal model that was made to predict the acceptance of gene-recombination technology. A structural equation model was used as a causal model. First of all, based on preceding studies, the factors of perceived risk, perceived benefit, and trust were set up as important psychological factors determining acceptance of gene-recombination technology in the structural equation model. An additional factor, "sense of bioethics," which I consider to be important for acceptance of biotechnology, was added to the model. Based on previous studies, trust was set up to have an indirect influence on the acceptance of gene-recombination technology through perceived risk and perceived benefit in the model. Participants were 231 undergraduate students in Japan who answered a questionnaire with a 5-point bipolar scale. The results indicated that the proposed model fits the data well, and showed that acceptance of gene-recombination technology is explained largely by four factors, that is, perceived risk, perceived benefit, trust, and sense of bioethics, whether the technology is applied to plants, animals, or human beings. However, the relative importance of the four factors was found to vary depending on whether the gene-recombination technology was applied to plants, animals, or human beings. Specifically, the factor of sense of bioethics is the most important factor in acceptance of plant gene-recombination technology and animal gene-recombination technology, and the factors of trust and perceived risk are the most important factors in acceptance of human being gene-recombination technology.
Contemporary gene flow and mating system of Arabis alpina in a Central European alpine landscape

PubMed Central

Buehler, D.; Graf, R.; Holderegger, R.; Gugerli, F.

2012-01-01

Background and Aims Gene flow is important in counteracting the divergence of populations but also in spreading genes among populations. However, contemporary gene flow is not well understood across alpine landscapes. The aim of this study was to estimate contemporary gene flow through pollen and to examine the realized mating system in the alpine perennial plant, Arabis alpina (Brassicaceae). Methods An entire sub-alpine to alpine landscape of 2 km² was exhaustively sampled in the Swiss Alps. Eighteen nuclear microsatellite loci were used to genotype 595 individuals and 499 offspring from 49 maternal plants. Contemporary gene flow by pollen was estimated from paternity analysis, matching the genotypes of maternal plants and offspring to the pool of likely father plants. Realized mating patterns and genetic structure were also estimated. Key Results Paternity analysis revealed several long-distance gene flow events (≤1 km). However, most outcrossing pollen was dispersed close to the mother plants, and 84 % of all offspring were selfed. Individuals that were spatially close were more related than by chance and were also more likely to be connected by pollen dispersal. Conclusions In the alpine landscape studied, genetic structure occurred on small spatial scales as expected for alpine plants. However, gene flow also covered large distances. This makes it plausible for alpine plants to spread beneficial alleles at least via pollen across landscapes at a short time scale. Thus, gene flow potentially facilitates rapid adaptation in A. alpina likely to be required under ongoing climate change. PMID:22492332
Genome-Wide Identification and Comprehensive Expression Profiling of Ribosomal Protein Small Subunit (RPS) Genes and their Comparative Analysis with the Large Subunit (RPL) Genes in Rice

PubMed Central

Saha, Anusree; Das, Shubhajit; Moin, Mazahar; Dutta, Mouboni; Bakshi, Achala; Madhav, M. S.; Kirti, P. B.

2017-01-01

Ribosomal proteins (RPs) are indispensable in ribosome biogenesis and protein synthesis, and play a crucial role in diverse developmental processes. Our previous studies on Ribosomal Protein Large subunit (RPL) genes provided insights into their stress responsive roles in rice. In the present study, we have explored the developmental and stress regulated expression patterns of Ribosomal Protein Small (RPS) subunit genes for their differential expression in a spatiotemporal and stress dependent manner. We have also performed an in silico analysis of gene structure, cis-elements in upstream regulatory regions, protein properties and phylogeny. Expression studies of the 34 RPS genes in 13 different tissues of rice covering major growth and developmental stages revealed that their expression was substantially elevated, mostly in shoots and leaves, indicating their possible involvement in the development of vegetative organs. The majority of the RPS genes have manifested significant expression under all abiotic stress treatments with ABA, PEG, NaCl, and H2O2. Infection with important rice pathogens, Xanthomonas oryzae pv. oryzae (Xoo) and Rhizoctonia solani, also induced the up-regulation of several of the RPS genes. RPS4, 13a, 18a, and 4a have shown higher transcript levels under all the abiotic stresses, whereas RPS4 is up-regulated in both the biotic stress treatments. The information obtained from the present investigation would be useful in appreciating the possible stress-regulatory attributes of the genes coding for rice ribosomal small subunit proteins apart from their functions as house-keeping proteins. A detailed functional analysis of independent genes is required to study their roles in stress tolerance and generating stress-tolerant crops. PMID:28966624
Dietary Supplement of Large Yellow Tea Ameliorates Metabolic Syndrome and Attenuates Hepatic Steatosis in db/db Mice

PubMed Central

Teng, Yun; Li, Daxiang; Guruvaiah, Ponmari; Xu, Na; Xie, Zhongwen

2018-01-01

Yellow tea has been widely recognized for its health benefits. However, its effects and mechanism are largely unknown. The current study investigated the mechanism of dietary supplements of large yellow tea and its effects on metabolic syndrome and hepatic steatosis in male db/db mice. Our data showed that dietary supplements of large yellow tea and water extract significantly reduced water intake and food consumption, lowered the serum total and low-density lipoprotein cholesterol and triglyceride levels, and significantly reduced blood glucose level and increased glucose tolerance in db/db mice when compared to untreated db/db mice. In addition, the dietary supplement of large yellow tea prevented fatty liver formation and restored the normal hepatic structure of db/db mice. Furthermore, the dietary supplement of large yellow tea markedly reduced the expression of the lipid synthesis-related genes fatty acid synthase, sterol regulatory element-binding transcription factor 1 and acetyl-CoA carboxylase α, as well as fatty acid synthase and sterol response element-binding protein 1 expression, while the lipid catabolic genes were not altered in the liver of db/db mice. This study substantiated that the dietary supplement of large yellow tea has potential as a food additive for ameliorating type 2 diabetes-associated symptoms. PMID:29329215
Genome-Wide Analyses of the NAC Transcription Factor Gene Family in Pepper (Capsicum annuum L.): Chromosome Location, Phylogeny, Structure, Expression Patterns, Cis-Elements in the Promoter, and Interaction Network

PubMed Central

Diao, Weiping; Snyder, John C.; Liu, Jinbing; Pan, Baogui; Guo, Guangjun; Ge, Wei; Dawood, Mohammad Hasan Salman Ali

2018-01-01

The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or fewer introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns.
Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with different types of stress. Our results also showed that CaNAC36 plays an important role in the interaction network, interacting with 48 genes. Most of these genes are in the mitogen-activated protein kinase (MAPK) family. Taken together, our results provide a platform for further studies to identify the biological functions of CaNAC genes. PMID:29596349
microRNAs Databases: Developmental Methodologies, Structural and Functional Annotations.

PubMed

Singh, Nagendra Kumar

2017-09-01

microRNA (miRNA) is an endogenous, evolutionarily conserved non-coding RNA involved in post-transcriptional gene repression and mRNA cleavage through formation of the RNA-induced silencing complex (RISC). In RISC, the miRNA binds its target mRNA by complementary base pairing together with the Argonaute protein complex, causing gene repression or endonucleolytic cleavage of the mRNA; this can result in many diseases and syndromes. Following the discovery of the miRNAs lin-4 and let-7, large numbers of miRNAs involved in various biological and metabolic processes were discovered by low-throughput and high-throughput experimental techniques together with computational methods. miRNAs are important non-coding RNAs for understanding complex biological phenomena in organisms because they control gene regulation. This paper reviews miRNA databases with structural and functional annotations developed by various researchers. These databases contain structural and functional information on animal, plant and virus miRNAs, including miRNA-associated diseases, stress resistance in plants, the roles of miRNAs in various biological processes, the effect of miRNA interactions on drugs and the environment, the effect of variance on miRNAs, miRNA gene expression analysis, and miRNA sequences and structures. This review focuses on the development methodology of miRNA databases, such as the computational tools and methods used to extract miRNA annotations from different resources or through experiments. This study also discusses the efficiency of the user interface design of each database, along with its current entries and miRNA annotations (pathways, gene ontology, disease ontology, etc.). An integrated schematic diagram of the database construction process is also presented, along with tabular and graphical comparisons of the various types of entries in different databases. The aim of this paper is to present the importance of miRNA-related resources in a single place.
Geographic origin is not supported by the genetic variability found in a large living collection of Jatropha curcas with accessions from three continents

PubMed Central

Maghuly, Fatemeh; Jankowicz-Cieslak, Joanna; Pabinger, Stephan; Till, Bradley J; Laimer, Margit

2015-01-01

Increasing economic interest in Jatropha curcas requires a major research focus on the genetic background and geographic origin of this non-edible biofuel crop. To determine the worldwide genetic structure of this species, amplified fragment length polymorphisms, inter simple sequence repeats, and novel single nucleotide polymorphisms (SNPs) were employed for a large collection of 907 J. curcas accessions and related species (RS) from three continents, 15 countries and 53 regions. PCoA, phenogram, and cophenetic analyses separated RS from two J. curcas groups. Accessions from Mexico, Bolivia, Paraguay, Kenya, and Ethiopia with unknown origins were found in both groups. In general, there was a considerable overlap between individuals from different regions and countries. The Bayesian approach using structure demonstrated two groups with a low genetic variation. Analysis of molecular variance revealed significant variation among individuals within populations. SNPs found by in silico analyses of Δ12 fatty acid desaturase indicated possible changes in gene expression and thus in fatty acid profiles. SNP variation was higher in the curcin gene compared to genes involved in oil production. Novel SNPs allowed separating toxic, non-toxic, and Mexican accessions. The present study confirms that human activities had a major influence on the genetic diversity of J. curcas, not only because of domestication, but also because of biased selection. PMID:25511658
Computerized image analysis for quantitative neuronal phenotyping in zebrafish.

PubMed

Liu, Tianming; Lu, Jianfeng; Wang, Ye; Campbell, William A; Huang, Ling; Zhu, Jinmin; Xia, Weiming; Wong, Stephen T C

2006-06-15

An integrated microscope image analysis pipeline is developed for automatic analysis and quantification of phenotypes in zebrafish with altered expression of Alzheimer's disease (AD)-linked genes. We hypothesize that a slight impairment of neuronal integrity in a large number of zebrafish carrying the mutant genotype can be detected through the computerized image analysis method. Key functionalities of our zebrafish image processing pipeline include quantification of neuron loss in zebrafish embryos due to knockdown of AD-linked genes, automatic detection of defective somites, and quantitative measurement of gene expression levels in zebrafish with altered expression of AD-linked genes or treatment with a chemical compound. These quantitative measurements enable the archival of analyzed results and relevant meta-data. The structured database is organized for statistical analysis and data modeling to better understand neuronal integrity and phenotypic changes of zebrafish under different perturbations. Our results show that the computerized analysis is comparable to manual counting with equivalent accuracy and improved efficacy and consistency. Development of such an automated data analysis pipeline represents a significant step forward to achieve accurate and reproducible quantification of neuronal phenotypes in large scale or high-throughput zebrafish imaging studies.
New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

PubMed

Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

2011-11-01

Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.
Tick-Borne Encephalitis Virus Structural Proteins Are the Primary Viral Determinants of Non-Viraemic Transmission between Ticks whereas Non-Structural Proteins Affect Cytotoxicity.

PubMed

Khasnatinov, Maxim A; Tuplin, Andrew; Gritsun, Dmitri J; Slovak, Mirko; Kazimirova, Maria; Lickova, Martina; Havlikova, Sabina; Klempa, Boris; Labuda, Milan; Gould, Ernest A; Gritsun, Tamara S

2016-01-01

Over 50 million humans live in areas of potential exposure to tick-borne encephalitis virus (TBEV). The disease exhibits an estimated 16,000 cases recorded annually over 30 European and Asian countries. Conventionally, TBEV transmission to Ixodes spp. ticks occurs whilst feeding on viraemic animals. However, an alternative mechanism of non-viraemic transmission (NVT) between infected and uninfected ticks co-feeding on the same transmission-competent host, has also been demonstrated. Here, using laboratory-bred I. ricinus ticks, we demonstrate low and high efficiency NVT for TBEV strains Vasilchenko (Vs) and Hypr, respectively. These virus strains share high sequence similarity but are classified as two TBEV subtypes. The Vs strain is a Siberian subtype, naturally associated with I. persulcatus ticks whilst the Hypr strain is a European subtype, transmitted by I. ricinus ticks. In mammalian cell culture (porcine kidney cell line PS), Vs and Hypr induce low and high cytopathic effects (cpe), respectively. Using reverse genetics, we engineered a range of viable Vs/Hypr chimaeric strains, with substituted genes. No significant differences in replication rate were detected between wild-type and chimaeric viruses in cell culture. However, the chimaeric strain Vs[Hypr str] (Hypr structural and Vs non-structural genomic regions) demonstrated high efficiency NVT in I. ricinus whereas the counterpart Hypr[Vs str] was not transmitted by NVT, indicating that the virion structural proteins largely determine TBEV NVT transmission efficiency between ticks. In contrast, in cell culture, the extent of cpe was largely determined by the non-structural region of the TBEV genome. Chimaeras with Hypr non-structural genes were more cytotoxic for PS cells when compared with Vs genome-based chimaeras.
Tick-Borne Encephalitis Virus Structural Proteins Are the Primary Viral Determinants of Non-Viraemic Transmission between Ticks whereas Non-Structural Proteins Affect Cytotoxicity

PubMed Central

Khasnatinov, Maxim A.; Tuplin, Andrew; Gritsun, Dmitri J.; Slovak, Mirko; Kazimirova, Maria; Lickova, Martina; Havlikova, Sabina; Klempa, Boris; Gould, Ernest A.

2016-01-01

Over 50 million humans live in areas of potential exposure to tick-borne encephalitis virus (TBEV). The disease exhibits an estimated 16,000 cases recorded annually over 30 European and Asian countries. Conventionally, TBEV transmission to Ixodes spp. ticks occurs whilst feeding on viraemic animals. However, an alternative mechanism of non-viraemic transmission (NVT) between infected and uninfected ticks co-feeding on the same transmission-competent host, has also been demonstrated. Here, using laboratory-bred I. ricinus ticks, we demonstrate low and high efficiency NVT for TBEV strains Vasilchenko (Vs) and Hypr, respectively. These virus strains share high sequence similarity but are classified as two TBEV subtypes. The Vs strain is a Siberian subtype, naturally associated with I. persulcatus ticks whilst the Hypr strain is a European subtype, transmitted by I. ricinus ticks. In mammalian cell culture (porcine kidney cell line PS), Vs and Hypr induce low and high cytopathic effects (cpe), respectively. Using reverse genetics, we engineered a range of viable Vs/Hypr chimaeric strains, with substituted genes. No significant differences in replication rate were detected between wild-type and chimaeric viruses in cell culture. However, the chimaeric strain Vs[Hypr str] (Hypr structural and Vs non-structural genomic regions) demonstrated high efficiency NVT in I. ricinus whereas the counterpart Hypr[Vs str] was not transmitted by NVT, indicating that the virion structural proteins largely determine TBEV NVT transmission efficiency between ticks. In contrast, in cell culture, the extent of cpe was largely determined by the non-structural region of the TBEV genome. Chimaeras with Hypr non-structural genes were more cytotoxic for PS cells when compared with Vs genome-based chimaeras. PMID:27341437
Natural variation in genes potentially involved in plant architecture and adaptation in switchgrass (Panicum virgatum L.).

PubMed

Bahri, Bochra A; Daverdin, Guillaume; Xu, Xiangyang; Cheng, Jan-Fang; Barry, Kerrie W; Brummer, E Charles; Devos, Katrien M

2018-06-14

Advances in genomic technologies have expanded our ability to accurately and exhaustively detect natural genomic variants that can be applied in crop improvement and to increase our knowledge of plant evolution and adaptation. Switchgrass (Panicum virgatum L.), an allotetraploid (2n = 4× = 36) perennial C4 grass (Poaceae family) native to North America and a feedstock crop for cellulosic biofuel production, has a large potential for genetic improvement due to its high genotypic and phenotypic variation. In this study, we analyzed single nucleotide polymorphism (SNP) variation in 372 switchgrass genotypes belonging to 36 accessions for 12 genes putatively involved in biomass production to investigate signatures of selection that could have led to ecotype differentiation and to population adaptation to geographic zones. A total of 11,682 SNPs were mined from ~ 15 Gb of sequence data, out of which 251 SNPs were retained after filtering. Population structure analysis largely grouped upland accessions into one subpopulation and lowland accessions into two additional subpopulations. The most frequent SNPs were in homozygous state within accessions. Sixty percent of the exonic SNPs were non-synonymous and, of these, 45% led to non-conservative amino acid changes. The non-conservative SNPs were largely in linkage disequilibrium with one haplotype being predominantly present in upland accessions while the other haplotype was commonly present in lowland accessions. Tajima's test of neutrality indicated that PHYB, a gene involved in photoperiod response, was under positive selection in the switchgrass population. PHYB carried a SNP leading to a non-conservative amino acid change in the PAS domain, a region that acts as a sensor for light and oxygen in signal transduction. Several non-conservative SNPs in genes potentially involved in plant architecture and adaptation have been identified and led to population structure and genetic differentiation of ecotypes in switchgrass. 
We suggest here that PHYB is a key gene involved in switchgrass natural selection. Further analyses are needed to determine whether any of the non-conservative SNPs identified play a role in the differential adaptation of upland and lowland switchgrass.
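As a hedged illustration of the neutrality test named in the abstract above, the following sketch computes Tajima's D from a small set of aligned, equal-length sequences using the standard formula (Tajima 1989). The function name `tajimas_d` and the toy alignments are hypothetical; the study's actual software, filtering, and handling of polyploid genotypes are not described in the abstract and are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences (illustrative sketch).

    Returns NaN when there are no segregating sites, because the
    statistic is undefined in that case.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return float("nan")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two diversity estimators, scaled by its
    # estimated standard deviation.
    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy alignments (hypothetical): an excess of rare variants gives D < 0,
# while two balanced haplotypes give D > 0.
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare_variants), tajimas_d(balanced))
```

The sign of D only indicates a departure from neutral expectations; demographic history can produce similar values, so an interpretation in terms of selection needs additional evidence.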

Induction of Interferon-Stimulated Genes by Simian Virus 40 T Antigens

PubMed Central

Rathi, Abhilasha V.; Cantalupo, Paul G.; Sarkar, Saumendra N.; Pipas, James M.

2010-01-01

Simian virus 40 (SV40) large T antigen (TAg) is a multifunctional oncoprotein essential for productive viral infection and for cellular transformation. We have used microarray analysis to examine the global changes in cellular gene expression induced by wild-type T antigen (TAgwt) and TAg-mutants in mouse embryo fibroblasts (MEFs). The expression profile of approximately 800 cellular genes was altered by TAgwt and a truncated TAg (TAgN136), including many genes that influence cell cycle, DNA-replication, transcription, chromatin structure and DNA repair. Unexpectedly, we found a significant number of immune response genes upregulated by TAgwt including many interferon stimulated genes (ISGs) such as ISG56, OAS, Rsad2, Ifi27 and Mx1. Additionally, we also observed activation of STAT1 by TAgwt. Our genetic studies using several TAg mutants reveal an unexplored function of TAg and indicate that the LXCXE motif and p53 binding are required for the upregulation of ISGs. PMID:20692676
Complete Mitochondrial Genome of Eruca sativa Mill. (Garden Rocket)

PubMed Central

Yang, Qing; Chang, Shengxin; Chen, Jianmei; Hu, Maolong; Guan, Rongzhan

2014-01-01

Eruca sativa (Cruciferae family) is an ancient crop of great economic and agronomic importance. Here, the complete mitochondrial genome of Eruca sativa was sequenced and annotated. The circular molecule is 247 696 bp long, with a G+C content of 45.07%, containing 33 protein-coding genes, three rRNA genes, and 18 tRNA genes. The Eruca sativa mitochondrial genome may be divided into six master circles and four subgenomic molecules via three pairwise large repeats, resulting in a more dynamic structure of the Eruca sativa mtDNA compared with other cruciferous mitotypes. Comparison with the Brassica napus mtDNA revealed that most of the genes with known function are conserved between these two mitotypes except for the ccmFN2 and rrn18 genes, and 27 point mutations were scattered in the 14 protein-coding genes. Analysis of evolutionary relationships suggested that Eruca sativa is more closely related to the Brassica species and to Raphanus sativus than to Arabidopsis thaliana. PMID:25157569
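A G+C content figure such as the one reported above is the percentage of G and C bases among the unambiguous bases of a sequence. The short sketch below shows that calculation; the function name `gc_content` and the example string are hypothetical, and ambiguous symbols such as N are assumed to be excluded from the denominator.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy sequence (hypothetical, not from the Eruca sativa genome).
print(round(gc_content("ATGCGCNNTA"), 2))
```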
Mapping cis- and trans-regulatory effects across multiple tissues in twins

PubMed Central

Grundberg, Elin; Small, Kerrin S.; Hedman, Åsa K.; Nica, Alexandra C.; Buil, Alfonso; Keildson, Sarah; Bell, Jordana T.; Yang, Tsun-Po; Meduri, Eshwar; Barrett, Amy; Nisbett, James; Sekowska, Magdalena; Wilk, Alicja; Shin, So-Youn; Glass, Daniel; Travers, Mary; Min, Josine L.; Ring, Sue; Ho, Karen; Thorleifsson, Gudmar; Kong, Augustine; Thorsteindottir, Unnur; Ainali, Chrysanthi; Dimas, Antigone S.; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lowe, Christopher E.; Di Meglio, Paola; Montgomery, Stephen B.; Parts, Leopold; Potter, Simon; Surdulescu, Gabriela; Tsaprouni, Loukia; Tsoka, Sophia; Bataille, Veronique; Durbin, Richard; Nestle, Frank O.; O’Rahilly, Stephen; Soranzo, Nicole; Lindgren, Cecilia M.; Zondervan, Krina T.; Ahmadi, Kourosh R.; Schadt, Eric E.; Stefansson, Kari; Smith, George Davey; McCarthy, Mark I.; Deloukas, Panos; Dermitzakis, Emmanouil T.; Spector, Tim D.

2013-01-01

Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many eQTL studies typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis-effect on expression cannot be accounted for by common cis-variants, a finding which exposes the contribution of low frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene and identify several replicating trans-variants which act predominantly in a tissue-restricted manner and may regulate the transcription of many genes. PMID:22941192
Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.

PubMed

Huang, Ying; Chen, Shi-Yi; Deng, Feilong

2016-01-01

In silico analysis of DNA sequences is an important area of computational biology in the post-genomic era. Over the past two decades, computational approaches for ab initio prediction of gene structure from genome sequence alone have greatly facilitated our understanding of a variety of biological questions. Although the computational prediction of protein-coding genes is already well established, challenges remain in robustly finding non-coding RNA genes, such as miRNA and lncRNA. The two main aspects of ab initio gene prediction are the computed values that describe sequence features and the algorithm used to train the discriminant function; different combinations of the two are employed in various bioinformatic tools. Herein, we briefly review these well-characterized sequence features in eukaryote genomes and their applications to ab initio gene prediction. The main purpose of this article is to provide an overview for beginners who aim to develop related bioinformatic tools.
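As an illustration of what a "computed value describing a sequence feature" can look like, the sketch below turns a DNA sequence into a vector of k-mer frequencies, one common kind of input for a discriminant function. It is an assumption-laden example rather than a feature set taken from the review; the function name `kmer_frequencies` and the choice k = 3 are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 3) -> dict[str, float]:
    """Return the relative frequency of every A/C/G/T k-mer in seq.

    Windows containing other symbols (for example N) are skipped,
    which is a simplification made for this sketch.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence has no valid k-mers")
    return {m: counts[m] / total for m in kmers}


if __name__ == "__main__":
    features = kmer_frequencies("ATGATG", k=3)  # hypothetical toy input
    # Show only the k-mers that actually occur in the toy sequence.
    print({m: f for m, f in features.items() if f > 0})
```

A fixed-length vector like this can be passed to any standard classifier, which is the sense in which feature choice and training algorithm are combined in different tools.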
High-LET Patterns of DSBs in DNA Loops, the HPRT Gene and Phosphorylation Foci

NASA Technical Reports Server (NTRS)

Ponomarev, Artem L.; Huff, Janice L.; Cucinotta, Francis A.

2007-01-01

We present new results obtained with our model based on the track structure and chromatin geometry that predicts the DSB spatial and genomic distributions in a cell nucleus with the full genome represented. The model generates stochastic patterns of DSBs in the physical space of the nucleus filled with the realistic configuration of human chromosomes. The model was re-used to find the distribution of DSBs in a physical volume corresponding to a visible phosphorylation focus believed to be associated with a DSB. The data show whether there must be more than one DSB per focus due to the finite size of the visible focus, even if a single DSB is radiochemically responsible for the phosphorylation of DNA in its vicinity. The same model can predict patterns of closely located DSBs in a given gene, or in a DNA loop, one of the large-scale chromatin structures. We demonstrate, using the HPRT gene as an example, how different sorts of radiation lead to a proximity effect in DSB locations, which is important for modeling gene deletions. The spectrum of intron deletions and total gene deletions was simulated for the HPRT gene. The same proximity effect of DSBs in a loop can hinder DSB restitution, as parts of the loop between DSBs are deleted with a higher likelihood. The distributions of DSBs and deletions of DNA in a loop are presented.
Comparative analyses identify molecular signature of MRI-classified SVZ-associated glioblastoma

PubMed Central

Lin, Chin-Hsing Annie; Rhodes, Christopher T.; Lin, ChenWei; Phillips, Joanna J.; Berger, Mitchel S.

2017-01-01

ABSTRACT Glioblastoma (GBM) is a highly aggressive brain cancer with limited therapeutic options. While efforts to identify genes responsible for GBM have revealed mutations and aberrant gene expression associated with distinct types of GBM, patients with GBM are often diagnosed and classified based on MRI features. Therefore, we seek to identify molecular representatives in parallel with MRI classification for group I and group II primary GBM associated with the subventricular zone (SVZ). As group I and II GBM contain a stem-like signature, we compared gene expression profiles between these 2 groups of primary GBM and endogenous neural stem progenitor cells to reveal dysregulation of cell cycle, chromatin status, cellular morphogenesis, and signaling pathways in these 2 types of MRI-classified GBM. In the absence of IDH mutation, several genes associated with metabolism are differentially expressed in these subtypes of primary GBM, implying that metabolic reprogramming occurs in the tumor microenvironment. Furthermore, histone lysine methyltransferase EZH2 was upregulated while histone lysine demethylases KDM2 and KDM4 were downregulated in both group I and II primary GBM. Lastly, we identified 9 common genes across large data sets of gene expression profiles among MRI-classified group I/II GBM, a large cohort of GBM subtypes from TCGA, and glioma stem cells by unsupervised clustering comparison. These commonly upregulated genes have known functions in cell cycle, centromere assembly, chromosome segregation, and mitotic progression. Our findings highlight altered expression of genes important in chromosome integrity across all GBM, suggesting a common mechanism of disrupted fidelity of chromosome structure in GBM. PMID:28278055
Sequence and Structure Analysis of Distantly-Related Viruses Reveals Extensive Gene Transfer between Viruses and Hosts and among Viruses

PubMed Central

Caprari, Silvia; Metzler, Saskia; Lengauer, Thomas; Kalinina, Olga V.

2015-01-01

The origin and evolution of viruses is a subject of ongoing debate. In this study, we provide a full account of the evolutionary relationships between proteins of significant sequence and structural similarity found in viruses that belong to different classes according to the Baltimore classification. We show that such proteins can be found in viruses from all Baltimore classes. For protein families that include these proteins, we observe two patterns of the taxonomic spread. In the first pattern, they can be found in a large number of viruses from all implicated Baltimore classes. In the other pattern, the instances of the corresponding protein in species from each Baltimore class are restricted to a few compact clades. Proteins with the first pattern of distribution are products of so-called viral hallmark genes reported previously. Additionally, this pattern is displayed by the envelope glycoproteins from Flaviviridae and Bunyaviridae and helicases of superfamilies 1 and 2 that have homologs in cellular organisms. The second pattern can often be explained by horizontal gene transfer from the host or between viruses, an example being Orthomyxoviridae and Coronaviridae hemagglutinin esterases. Another facet of horizontal gene transfer comprises multiple independent introduction events of genes from cellular organisms into otherwise unrelated viruses. PMID:26492264
Detecting Role Errors in the Gene Hierarchy of the NCI Thesaurus

PubMed Central

Min, Hua; Cohen, Barry; Halper, Michael; Oren, Marc; Perl, Yehoshua

2008-01-01

Gene terminologies are playing an increasingly important role in the ever-growing field of genomic research. While errors in large, complex terminologies are inevitable, gene terminologies are even more susceptible to them due to the rapid growth of genomic knowledge and the nature of its discovery. It is therefore very important to establish quality-assurance protocols for such genomic-knowledge repositories. Different kinds of terminologies oftentimes require auditing methodologies adapted to their particular structures. In light of this, an auditing methodology tailored to the characteristics of the NCI Thesaurus’s (NCIT’s) Gene hierarchy is presented. The Gene hierarchy is of particular interest to the NCIT’s designers due to the primary role of genomics in current cancer research. This multiphase methodology focuses on detecting role-errors, such as missing roles or roles with incorrect or incomplete target structures, occurring within that hierarchy. The methodology is based on two kinds of abstraction networks, called taxonomies, that highlight the role distribution among concepts within the IS-A (subsumption) hierarchy. These abstract views tend to highlight portions of the hierarchy having a higher concentration of errors. The errors found during an application of the methodology are reported. Hypotheses pertaining to the efficacy of our methodology are investigated. PMID:19221606
Functional and Structural Characterization of FAU Gene/Protein from Marine Sponge Suberites domuncula

PubMed Central

Perina, Dragutin; Korolija, Marina; Popović Hadžija, Marijana; Grbeša, Ivana; Belužić, Robert; Imešek, Mirna; Morrow, Christine; Marjanović, Melanija Posavec; Bakran-Petricioli, Tatjana; Mikoč, Andreja; Ćetković, Helena

2015-01-01

Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (FAU) gene is down-regulated in human prostate, breast and ovarian cancers. Moreover, its dysregulation is associated with poor prognosis in breast cancer. Sponges (Porifera) are animals without tissues which branched off first from the common ancestor of all metazoans. A large majority of genes implicated in human cancers have their homologues in the sponge genome. Our study suggests that FAU gene from the sponge Suberites domuncula reflects characteristics of the FAU gene from the metazoan ancestor, which have changed only slightly during the course of animal evolution. We found pro-apoptotic activity of sponge FAU protein. The same as its human homologue, sponge FAU increases apoptosis in human HEK293T cells. This indicates that the biological functions of FAU, usually associated with “higher” metazoans, particularly in cancer etiology, possess a biochemical background established early in metazoan evolution. The ancestor of all animals possibly possessed FAU protein with the structure and function similar to evolutionarily more recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis. It provides an opportunity to use pre-bilaterian animals as a simpler model for studying complex interactions in human cancerogenesis. PMID:26198235
The complete mitochondrial genome of Gryllotalpa unispina Saussure, 1874 (Orthoptera: Gryllotalpoidea: Gryllotalpidae).

PubMed

Zhang, Yulong; Shao, Dandan; Cai, Miao; Yin, Hong; Zhang, Daochuan

2016-01-01

The complete mitochondrial genome of Gryllotalpa unispina was 15,513 bp in length and contained 70.9% AT. All G. unispina protein-coding sequences except for nad2 started with a typical ATN codon. The usual termination codons (TAA) and incomplete stop codons (T) were found in the 13 protein-coding genes. All tRNA genes were folded into the typical cloverleaf secondary structure, except trnS(AGN), which lacks the dihydrouridine arm. The sizes of the large and small ribosomal RNA genes were 1245 and 725 bp, respectively. The A + T-rich region was 917 bp in length with an A + T content of 76.8%. The orientation and gene order of the G. unispina mitogenome were identical to those of G. orientalis and G. pluvialis; the "DK rearrangement" that has been widely reported in Caelifera was not observed.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene.

PubMed

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R

2016-07-03

Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene

PubMed Central

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R.

2016-01-01

ABSTRACT Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish. PMID:27222321
Co-Flocculation of Yeast Species, a New Mechanism to Govern Population Dynamics in Microbial Ecosystems

PubMed Central

Rossouw, Debra; Bagheri, Bahareh; Setati, Mathabatha Evodia; Bauer, Florian Franz

2015-01-01

Flocculation has primarily been studied as an important technological property of Saccharomyces cerevisiae yeast strains in fermentation processes such as brewing and winemaking. These studies have led to the identification of a group of closely related genes, referred to as the FLO gene family, which controls the flocculation phenotype. All naturally occurring S. cerevisiae strains assessed thus far possess at least four independent copies of structurally similar FLO genes, namely FLO1, FLO5, FLO9 and FLO10. The genes appear to differ primarily by the degree of flocculation induced by their expression. However, the reason for the existence of a large family of very similar genes, all involved in the same phenotype, has remained unclear. In natural ecosystems, and in wine production, S. cerevisiae grows together and competes with a large number of other Saccharomyces and many more non-Saccharomyces yeast species. Our data show that many strains of such wine-related non-Saccharomyces species, some of which have recently attracted significant biotechnological interest as they contribute positively to fermentation and wine character, were able to flocculate efficiently. The data also show that both flocculent and non-flocculent S. cerevisiae strains formed mixed species flocs (a process hereafter referred to as co-flocculation) with some of these non-Saccharomyces yeasts. This ability of yeast strains to impact flocculation behaviour of other species in mixed inocula has not been described previously. Further investigation into the genetic regulation of co-flocculation revealed that different FLO genes impact differently on such adhesion phenotypes, favouring adhesion with some species while excluding other species from such mixed flocs. The data therefore strongly suggest that FLO genes govern the selective association of S. cerevisiae with specific species of non-Saccharomyces yeasts, and may therefore be drivers of ecosystem organisational patterns. Our data provide, for the first time, insights into the role of the FLO gene family beyond intraspecies cellular association, and suggest a wider evolutionary role for the FLO genes. Such a role would explain the evolutionary persistence of a large multigene family of genes with apparently similar function. PMID:26317200
Evolutionary origins, molecular cloning and expression of carotenoid hydroxylases in eukaryotic photosynthetic algae

PubMed Central

2013-01-01

Background: Xanthophylls, oxygenated derivatives of carotenes, play critical roles in the photosynthetic apparatus of cyanobacteria, algae, and higher plants. Although the xanthophyll biosynthetic pathway of algae is largely unknown, it is of particular interest because algae have a very complicated evolutionary history. Carotenoid hydroxylase (CHY) is an important protein that plays essential roles in xanthophyll biosynthesis. With the availability of 18 sequenced algal genomes, we performed a comprehensive comparative analysis of chy genes and explored their distribution, structure, evolution, origins, and expression. Results: Overall, 60 putative chy genes were identified and classified into two major subfamilies (bch and cyp97) according to their domain structures. Genes in the bch subfamily were found in 10 green algae and 1 red alga, but were absent in other algae. In the phylogenetic tree, bch genes of green algae and higher plants share a common ancestor and are of non-cyanobacterial origin, whereas that of red algae is of cyanobacterial origin. The homologs of cyp97a/c genes were widespread only in green algae, while cyp97b paralogs were seen in most algae. Phylogenetic analysis of cyp97 genes supported the hypothesis that cyp97b is an ancient gene that originated before the formation of extant algal groups. The cyp97a gene is more closely related to cyp97c in evolution than to cyp97b. The two cyp97 genes were isolated from the green alga Haematococcus pluvialis, and transcriptional expression profiles of chy genes were observed under high light stress of different wavelengths. Conclusions: Green algae received a β-xanthophyll biosynthetic pathway from host organisms. Although red algae inherited the pathway from cyanobacteria during primary endosymbiosis, its origin remains unclear in Chromalveolates. The α-xanthophyll biosynthetic pathway is a common feature in green algae and higher plants. The origination of cyp97a/c is most likely due to gene duplication before the divergence of green algae and higher plants. Protein domain structures and expression analyses in the green alga H. pluvialis indicate that various chy genes respond to light in different ways. Knowledge of the evolution of chy genes in photosynthetic eukaryotes provides information for future gene cloning and functional investigation of chy genes in algae. PMID:23834441
Available nitrogen is the key factor influencing soil microbial functional gene diversity in tropical rainforest.

PubMed

Cong, Jing; Liu, Xueduan; Lu, Hui; Xu, Han; Li, Yide; Deng, Ye; Li, Diqiang; Zhang, Yuguang

2015-08-20

Tropical rainforests harbor over 50% of all known plant and animal species and provide a variety of key resources and ecosystem services to humans, largely mediated by metabolic activities of soil microbial communities. A deep analysis of soil microbial communities and their roles in ecological processes would improve our understanding of biogeochemical elemental cycles. However, soil microbial functional gene diversity in tropical rainforests and its causative factors remain unclear. GeoChip, which contains almost all of the key functional genes related to biogeochemical cycles, can be used as a specific and sensitive tool for studying microbial gene diversity and metabolic potential. In this study, soil microbial functional gene diversity in tropical rainforest was analyzed by using GeoChip technology. Gene categories detected in the tropical rainforest soils were related to different biogeochemical processes, such as carbon (C), nitrogen (N) and phosphorus (P) cycling. The detected genes related to C and P cycling were, by relative abundance, mostly derived from cultured bacteria. C degradation gene categories for substrates ranging from labile C to recalcitrant C were all detected, and gene abundances in many recalcitrant C degradation gene categories were significantly (P < 0.05) different among three sampling sites. The relative abundance of detected genes related to N cycling was significantly (P < 0.05) different, and these genes were mostly derived from uncultured bacteria. The gene categories related to ammonification had a high relative abundance. Both canonical correspondence analysis and multivariate regression tree analysis showed that soil available N was most strongly correlated with soil microbial functional gene structure. Overall, high microbial functional gene diversity and different soil microbial metabolic potential for different biogeochemical processes were considered to exist in tropical rainforest. Soil available N could be the key factor in shaping the soil microbial functional gene structure and metabolic potential.
Cloning, Purification, and Characterization of a Heterodimeric β-Galactosidase from Lactobacillus kefiranofaciens ZW3.

PubMed

He, Xi; Han, Ning; Wang, Yan-Ping

2016-01-01

Lactobacillus kefiranofaciens ZW3 was obtained from kefir grains, which have high lactose hydrolytic activity. In this study, a heterodimeric LacLM-type β-galactosidase gene (lacLM) from ZW3 was isolated, which was composed of two overlapping genes, lacL (1,884 bp) and lacM (960 bp) encoding large and small subunits with calculated molecular masses of 73,620 and 35,682 Da, respectively. LacLM, LacL, and LacM were expressed in Escherichia coli BL21(DE3) and these recombinant proteins were purified and characterized. The results showed that, compared with the recombinant holoenzyme, the recombinant large subunit exhibits obviously lower thermostability and hydrolytic activity. Moreover, the optimal temperature and pH of the holoenzyme and large subunit are 60°C and 7.0, and 50°C and 8.0, respectively. However, the recombinant small subunit alone has no activity. Interestingly, the activity and thermostability of the large subunit were greatly improved after mixing it with the recombinant small subunit. Therefore, the results suggest that the small subunit might play an important role in maintaining the stability of the structure of the catalytic center located in the large subunit.
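The gene lengths and calculated subunit masses quoted in the record above can be cross-checked with simple arithmetic: an open reading frame of n bp encodes n/3 codons. The sketch below assumes that each stated gene length includes one stop codon, which the abstract does not confirm; the helper name `residues_from_orf` is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino-acid residues encoded by an ORF of orf_bp base pairs.

    Assumption of this sketch: when includes_stop is True, the final
    codon is a stop codon and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # Gene lengths (bp) and calculated masses (Da) as quoted in the abstract.
    subunits = [("LacL", 1884, 73620), ("LacM", 960, 35682)]
    for name, bp, mass_da in subunits:
        n = residues_from_orf(bp)
        print(f"{name}: {n} residues, {mass_da / n:.1f} Da per residue")
```

Under this assumption both subunits come out between 110 and 120 Da per residue, close to the commonly used average of about 110 Da, so the quoted lengths and masses are mutually consistent.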
On the Sequence-Directed Nature of Human Gene Mutation: The Role of Genomic Architecture and the Local DNA Sequence Environment in Mediating Gene Mutations Underlying Human Inherited Disease

PubMed Central

Cooper, David N.; Bacolla, Albino; Férec, Claude; Vasquez, Karen M.; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min

2011-01-01

Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher-order features of the genomic architecture. The human genome is now recognized to contain ‘pervasive architectural flaws’ in that certain DNA sequences are inherently mutation-prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of non-canonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair, and may serve to increase mutation frequencies in generalized fashion (i.e. both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. PMID:21853507
Shaping skeletal growth by modular regulatory elements in the Bmp5 gene.

PubMed

Guenther, Catherine; Pantalena-Filho, Luiz; Kingsley, David M

2008-12-01

Cartilage and bone are formed into a remarkable range of shapes and sizes that underlie many anatomical adaptations to different lifestyles in vertebrates. Although the morphological blueprints for individual cartilage and bony structures must somehow be encoded in the genome, we currently know little about the detailed genomic mechanisms that direct precise growth patterns for particular bones. We have carried out large-scale enhancer surveys to identify the regulatory architecture controlling developmental expression of the mouse Bmp5 gene, which encodes a secreted signaling molecule required for normal morphology of specific skeletal features. Although Bmp5 is expressed in many skeletal precursors, different enhancers control expression in individual bones. Remarkably, we show here that different enhancers also exist for highly restricted spatial subdomains along the surface of individual skeletal structures, including ribs and nasal cartilages. Transgenic, null, and regulatory mutations confirm that these anatomy-specific sequences are sufficient to trigger local changes in skeletal morphology and are required for establishing normal growth rates on separate bone surfaces. Our findings suggest that individual bones are composite structures whose detailed growth patterns are built from many smaller lineage and gene expression domains. Individual enhancers in BMP genes provide a genomic mechanism for controlling precise growth domains in particular cartilages and bones, making it possible to separately regulate skeletal anatomy at highly specific locations in the body.
Intraspecific variation in mitochondrial genome sequence, structure, and gene content in Silene vulgaris, an angiosperm with pervasive cytoplasmic male sterility.

PubMed

Sloan, Daniel B; Müller, Karel; McCauley, David E; Taylor, Douglas R; Storchová, Helena

2012-12-01

In angiosperms, mitochondrial-encoded genes can cause cytoplasmic male sterility (CMS), resulting in the coexistence of female and hermaphroditic individuals (gynodioecy). We compared four complete mitochondrial genomes from the gynodioecious species Silene vulgaris and found unprecedented amounts of intraspecific diversity for plant mitochondrial DNA (mtDNA). Remarkably, only about half of overall sequence content is shared between any pair of genomes. The four mtDNAs range in size from 361 to 429 kb and differ in gene complement, with rpl5 and rps13 being intact in some genomes but absent or pseudogenized in others. The genomes exhibit essentially no conservation of synteny and are highly repetitive, with evidence of reciprocal recombination occurring even across short repeats (< 250 bp). Some mitochondrial genes exhibit atypically high degrees of nucleotide polymorphism, while others are invariant. The genomes also contain a variable number of small autonomously mapping chromosomes, which have only recently been identified in angiosperm mtDNA. Southern blot analysis of one of these chromosomes indicated a complex in vivo structure consisting of both monomeric circles and multimeric forms. We conclude that S. vulgaris harbors an unusually large degree of variation in mtDNA sequence and structure and discuss the extent to which this variation might be related to CMS. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

PubMed Central

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-01-01

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

Population structure of Columbia spotted frogs (Rana luteiventris) is strongly affected by the landscape

USGS Publications Warehouse

Funk, W.C.; Blouin, M.S.; Corn, P.S.; Maxell, B.A.; Pilliod, D.S.; Amish, S.; Allendorf, F.W.

2005-01-01

Landscape features such as mountains, rivers, and ecological gradients may strongly affect patterns of dispersal and gene flow among populations and thereby shape population dynamics and evolutionary trajectories. The landscape may have a particularly strong effect on patterns of dispersal and gene flow in amphibians because amphibians are thought to have poor dispersal abilities. We examined genetic variation at six microsatellite loci in Columbia spotted frogs (Rana luteiventris) from 28 breeding ponds in western Montana and Idaho, USA, in order to investigate the effects of landscape structure on patterns of gene flow. We were particularly interested in addressing three questions: (i) do ridges act as barriers to gene flow? (ii) is gene flow restricted between low and high elevation ponds? (iii) does a pond equal a 'randomly mating population' (a deme)? We found that mountain ridges and elevational differences were associated with increased genetic differentiation among sites, suggesting that gene flow is restricted by ridges and elevation in this species. We also found that populations of Columbia spotted frogs generally include more than a single pond except for very isolated ponds. There was also evidence for surprisingly high levels of gene flow among low elevation sites separated by large distances. Moreover, genetic variation within populations was strongly negatively correlated with elevation, suggesting effective population sizes are much smaller at high elevation than at low elevation. Our results show that landscape features have a profound effect on patterns of genetic variation in Columbia spotted frogs.
Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.

PubMed

Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean

2012-12-01

Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine if the coding region of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation-domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

PubMed

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and provides an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species.
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
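The three gene-pair orientations named in the LCGbase abstract (co-directional, convergent, divergent) can be illustrated with a short sketch. It assumes each gene is described only by its strand ('+' or '-') and that the pair is given in left-to-right chromosomal order. The function name and return labels are hypothetical and are not taken from LCGbase itself.

```python
def classify_gene_pair(left_strand: str, right_strand: str) -> str:
    """Classify two adjacent genes by their relative transcription direction.

    Strands are '+' or '-', listed in left-to-right chromosomal order.
    Hypothetical helper for illustration only; not part of LCGbase.
    """
    for strand in (left_strand, right_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    if left_strand == right_strand:
        return "co-directional"  # both genes transcribed the same way
    if left_strand == "+":
        return "convergent"      # + then -: transcription runs toward each other
    return "divergent"           # - then +: transcription runs away from each other


if __name__ == "__main__":
    for pair in [("+", "+"), ("+", "-"), ("-", "+"), ("-", "-")]:
        print(pair, classify_gene_pair(*pair))
```

A real cluster definition would also need gene coordinates, for example the distance between transcription start sites that the abstract mentions as a tunable parameter.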
Comparative genomics reveals insights into avian genome evolution and adaptation

PubMed Central

Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

2015-01-01

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712
The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis.

PubMed

Duan, Naibin; Sun, Honghe; Wang, Nan; Fei, Zhangjun; Chen, Xuesen

2016-07-01

The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis, a widely used apple rootstock, was determined using the Illumina high-throughput sequencing approach. The genome is 422,555 bp in length and has a GC content of 45.21%. It is separated by a pair of inverted repeats of 32,504 bp, to form a large single copy region of 213,055 bp and a small single copy region of 144,492 bp. The genome contains 38 protein-coding genes, four pseudogenes, 25 tRNA genes, and three rRNA genes. The genome is 25,608 bp longer than that of M. domestica, and several structural variations between these two mitogenomes were detected.
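The region sizes quoted in this abstract can be checked against the reported total with simple arithmetic. The sketch below uses only the figures given above; the constant and function names are illustrative, and the M. domestica length it derives is implied by the stated difference rather than quoted in the abstract.

```python
# Figures quoted in the abstract, in base pairs.
TOTAL_LENGTH = 422_555
INVERTED_REPEAT = 32_504        # length of one copy; the genome carries a pair
LARGE_SINGLE_COPY = 213_055
SMALL_SINGLE_COPY = 144_492
EXCESS_OVER_M_DOMESTICA = 25_608


def assembled_length(lsc: int, ssc: int, inverted_repeat: int) -> int:
    """Sum the two single-copy regions and both inverted-repeat copies."""
    return lsc + ssc + 2 * inverted_repeat


def implied_m_domestica_length(total: int, excess: int) -> int:
    """Length implied for the M. domestica mitogenome (derived, not quoted)."""
    return total - excess


if __name__ == "__main__":
    print(assembled_length(LARGE_SINGLE_COPY, SMALL_SINGLE_COPY, INVERTED_REPEAT))
    print(implied_m_domestica_length(TOTAL_LENGTH, EXCESS_OVER_M_DOMESTICA))
```

The four regions sum to the reported genome length, so the quoted figures are internally consistent.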
Biofilm density and detection of biofilm-producing genes in methicillin-resistant Staphylococcus aureus strains.

PubMed

Szczuka, Ewa; Urbańska, Katarzyna; Pietryka, Marta; Kaznowski, Adam

2013-01-01

Many serious diseases caused by Staphylococcus aureus appear to be associated with biofilms. Therefore, we investigated the biofilm-forming ability of the methicillin-resistant S. aureus (MRSA) isolates collected from hospitalized patients. As many as 96% of strains had the ability to form biofilm in vitro. The majority of S. aureus strains formed biofilm via an ica-dependent mechanism. However, 23% of MRSA isolates formed biofilm via an ica-independent mechanism. Half of these strains carried fnbB genes encoding the surface protein fibronectin-binding protein B, which is involved in intercellular accumulation and biofilm development in S. aureus strains. The biofilm structures were examined via confocal laser scanning microscopy (CLSM) and three-dimensional structures were reconstructed. The images obtained in CLSM revealed that the biofilm created by ica-positive strains was different from biofilm formed by ica-negative strains. The MRSA population showed a large genetic diversity and we did not find a single clone that occurred preferentially in the hospital environment. Our results demonstrated the variation in genes encoding adhesins for the host matrix proteins (elastin, laminin, collagen, fibronectin, and fibrinogen) and in the gene involved in biofilm formation (icaA) within the majority of S. aureus clones.
Higher impact of female than male migration on population structure in large mammals.

PubMed

Tiedemann, R; Hardy, O; Vekemans, X; Milinkovitch, M C

2000-08-01

We simulated large mammal populations using an individual-based stochastic model under various sex-specific migration schemes and life history parameters from the blue whale and the Asian elephant. Our model predicts that genetic structure at nuclear loci is significantly more influenced by female than by male migration. We identified requisite comigration of mother and offspring during gravidity and lactation as the primary cause of this phenomenon. In addition, our model predicts that the common assumption that geographical patterns of mitochondrial DNA (mtDNA) could be translated into female migration rates (Nmf) will cause biased estimates of maternal gene flow when extensive male migration occurs and male mtDNA haplotypes are included in the analysis.
A Review of Computational Intelligence Methods for Eukaryotic Promoter Prediction.

PubMed

Singh, Shailendra; Kaur, Sukhbir; Goel, Neelam

2015-01-01

In past decades, prediction of genes in DNA sequences has attracted the attention of many researchers, but because of the complex structure of DNA it is extremely difficult to locate a gene's position correctly. DNA contains a large number of regulatory regions that help in the transcription of a gene. The promoter is one such region, and finding its location is a challenging problem. Various computational methods for promoter prediction have been developed over the past few years. This paper reviews these promoter prediction methods. Several difficulties and pitfalls encountered by these methods are also detailed, along with future research directions.
Of extracellular matrix, scaffolds, and signaling: Tissue architecture regulates development, homeostasis, and cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nelson, Celeste M.; Bissell, Mina J.

2006-03-09

The microenvironment surrounding cells influences gene expression, such that a cell's behavior is largely determined by its interactions with the extracellular matrix, neighboring cells, and soluble cues released locally or by distant tissues. We describe the essential role of context and organ structure in directing mammary gland development and differentiated function, and in determining response to oncogenic insults including mutations. We expand on the concept of 'dynamic reciprocity' to present an integrated view of development, cancer, and aging, and posit that genes are like piano keys: while essential, it is the context that makes the music.
Diseases and Molecular Diagnostics: A Step Closer to Precision Medicine.

PubMed

Dwivedi, Shailendra; Purohit, Purvi; Misra, Radhieka; Pareek, Puneet; Goel, Apul; Khattri, Sanjay; Pant, Kamlesh Kumar; Misra, Sanjeev; Sharma, Praveen

2017-10-01

The current advent of molecular technologies together with a multidisciplinary interplay of several fields led to the development of genomics, which concentrates on the detection of pathogenic events at the genome level. The structural and functional genomics approaches have now pinpointed the technical challenge in the exploration of disease-related genes and the recognition of their structural alterations or elucidation of gene function. Various promising technologies and diagnostic applications of structural genomics are currently preparing a large database of disease-genes, genetic alterations etc., by mutation scanning and DNA chip technology. Furthermore, functional genomics is also exploring expression genetics (hybridization-, PCR- and sequence-based technologies), two-hybrid technology, and next generation sequencing with bioinformatics and computational biology. Advances in microarray "chip" technology have allowed the parallel analysis of gene expression patterns of thousands of genes simultaneously. Sequence information collected from the genomes of many individuals is leading to the rapid discovery of single nucleotide polymorphisms or SNPs. Further advances in genetic engineering have also revolutionized immunoassay biotechnology via engineering of antibody-encoding genes and the phage display technology. Biotechnology plays an important role in the development of diagnostic assays in response to an outbreak or critical disease response need. However, there is also a need to pinpoint various obstacles and issues related to the commercialization and widespread dispersal of genetic knowledge derived from the exploitation of the biotechnology industry and the development and marketing of diagnostic services. Implementation of genetic criteria for patient selection and individual assessment of the risks and benefits of treatment emerges as a major challenge to the pharmaceutical industry.
Thus this field is revolutionizing the current era and may open new vistas in the field of disease management.
[FOXP2 and the molecular biology of language: new evidence. II. Molecular aspects and implications for the ontogenesis and phylogeny of language].

PubMed

Benítez-Burraco, A

FOXP2 is the first gene linked to a hereditary variant of specific language impairment and seems to code for a transcriptional repressor that intervenes in the regulation of the development and the functioning of certain thalamic-cortical-striatal circuits. In the last three years, significant progress has been made in the determination of the structural and functional properties of the gene. These advances essentially have to do with the precise analysis of the most important structural motifs of the protein that it codes for and the main parameters that determine its interaction with DNA. They also concern the determination of the functional and behavioural properties in vivo of the main isoforms of the FOXP2 protein, the exact determination of the pattern of expression of new orthologues of the gene, and the identification of the different target genes for factor FOXP2. This new evidence suggests that the FOXP2 protein has a high degree of versatility in vivo when it comes to binding to DNA; that its different isoforms are biologically functional; and that the FOXP2 gene is functional during embryonic development and during the adult phase. It also suggests that it is involved in the development and/or functioning of the thalamic-cortical-striatal circuits associated with motor planning, sequential behaviour and procedural learning (a significant saving in developmental terms of the regulatory mechanism in which the gene is involved), as well as the accuracy of the models of linguistic processing that consider language to be, to a large extent, the result of an interaction between certain cortical and subcortical structures.
Structural and functional annotation of the porcine immunome

PubMed Central

2013-01-01

Background The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems. Results The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. 
A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome. Conclusions This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response. PMID:23676093
Comparative Analysis of AGPase Genes and Encoded Proteins in Eight Monocots and Three Dicots with Emphasis on Wheat

PubMed Central

Batra, Ritu; Saripalli, Gautam; Mohan, Amita; Gupta, Saurabh; Gill, Kulvinder S.; Varadwaj, Pritish K.; Balyan, Harindra S.; Gupta, Pushpendra K.

2017-01-01

ADP-glucose pyrophosphorylase (AGPase) is a heterotetrameric enzyme with two large subunits (LS) and two small subunits (SS). It plays a critical role in starch biosynthesis. We are reporting here detailed structure, function and evolution of the genes encoding the LS and the SS among monocots and dicots. “True” orthologs of maize Sh2 (AGPase LS) and Bt2 (AGPase SS) were identified in seven other monocots and three dicots; structure of the enzyme at protein level was also studied. Novel findings of the current study include the following: (i) at the DNA level, the genes controlling the SS are more conserved than those controlling the LS; the variation in both is mainly due to intron number, intron length and intron phase distribution; (ii) at protein level, the SS genes are more conserved relative to those for LS; (iii) “QTCL” motif present in SS showed evolutionary differences in AGPase belonging to wheat 7BS, T. urartu, rice and sorghum, while “LGGG” motif in LS was present in all species except T. urartu and chickpea; SS provides thermostability to AGPase, while LS is involved in regulation of AGPase activity; (iv) heterotetrameric structure of AGPase was predicted and analyzed in real time environment through molecular dynamics simulation for all the species; (v) several cis-acting regulatory elements were identified in the AGPase promoters with their possible role in regulating spatial and temporal expression (endosperm and leaf tissue) and also the expression, in response to abiotic stresses; and (vi) expression analysis revealed downregulation of both subunits under conditions of heat and drought stress. The results of the present study have allowed better understanding of structure and evolution of the genes and the encoded proteins and provided clues for exploitation of variability in these genes for engineering thermostable AGPase. PMID:28174576
Large-Scale Genetic Structuring of a Widely Distributed Carnivore - The Eurasian Lynx (Lynx lynx)

PubMed Central

Rueness, Eli K.; Naidenko, Sergei; Trosvik, Pål; Stenseth, Nils Chr.

2014-01-01

Over the last decades the phylogeography and genetic structure of a multitude of species inhabiting Europe and North America have been described. The flora and fauna of the vast landmasses of north-eastern Eurasia are still largely unexplored in this respect. The Eurasian lynx is a large felid that is relatively abundant over much of the Russian sub-continent and the adjoining countries. Analyzing 148 museum specimens collected throughout its range over the last 150 years we have described the large-scale genetic structuring in this highly mobile species. We have investigated the spatial genetic patterns using mitochondrial DNA sequences (D-loop and cytochrome b) and 11 microsatellite loci, and describe three phylogenetic clades and a clear structuring along an east-west gradient. The most likely scenario is that the contemporary Eurasian lynx populations originated in central Asia and that parts of Europe were inhabited by lynx during the Pleistocene. After the Last Glacial Maximum (LGM) range expansions led to colonization of north-western Siberia and Scandinavia from the Caucasus and north-eastern Siberia from a refugium further east. No evidence of a Beringian refugium could be detected in our data. We observed restricted gene flow and suggest that future studies of the Eurasian lynx explore to what extent the contemporary population structure may be explained by ecological variables. PMID:24695745
Dispersal ability and habitat requirements determine landscape-level genetic patterns in desert aquatic insects.

PubMed

Phillipsen, Ivan C; Kirk, Emily H; Bogan, Michael T; Mims, Meryl C; Olden, Julian D; Lytle, David A

2015-01-01

Species occupying the same geographic range can exhibit remarkably different population structures across the landscape, ranging from highly diversified to panmictic. Given limitations on collecting population-level data for large numbers of species, ecologists seek to identify proximate organismal traits (such as dispersal ability, habitat preference and life history) that are strong predictors of realized population structure. We examined how dispersal ability and habitat structure affect the regional balance of gene flow and genetic drift within three aquatic insects that represent the range of dispersal abilities and habitat requirements observed in desert stream insect communities. For each species, we tested for linear relationships between genetic distances and geographic distances using Euclidean and landscape-based metrics of resistance. We found that the moderate-disperser Mesocapnia arizonensis (Plecoptera: Capniidae) has a strong isolation-by-distance pattern, suggesting migration-drift equilibrium. By contrast, population structure in the flightless Abedus herberti (Hemiptera: Belostomatidae) is influenced by genetic drift, while gene flow is the dominant force in the strong-flying Boreonectes aequinoctialis (Coleoptera: Dytiscidae). The best-fitting landscape model for M. arizonensis was based on Euclidean distance. Analyses also identified a strong spatial scale-dependence, where landscape genetic methods only performed well for species that were intermediate in dispersal ability. Our results highlight the fact that when either gene flow or genetic drift dominates in shaping population structure, no detectable relationship between genetic and geographic distances is expected at certain spatial scales. This study provides insight into how gene flow and drift interact at the regional scale for these insects as well as the organisms that share similar habitats and dispersal abilities. © 2014 John Wiley & Sons Ltd.
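A linear relationship between genetic and geographic distance matrices is commonly assessed with a Mantel-style permutation test. The abstract does not name the test the authors used, so the sketch below is an assumption for illustration: it correlates the upper triangles of two distance matrices and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names and toy data are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Pairwise entries above the diagonal after reordering sites by `order`."""
    pairs = combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel sites, permuting rows and columns together
        if pearson(upper_triangle(genetic, perm), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


if __name__ == "__main__":
    # Toy data: five sites on a line, genetic distance proportional to distance.
    positions = [0, 1, 2, 4, 7]
    geographic = [[abs(a - b) for b in positions] for a in positions]
    genetic = [[0.02 * d for d in row] for row in geographic]
    r, p = mantel(genetic, geographic)
    print(f"r = {r:.3f}, p = {p:.3f}")
```

Replacing the Euclidean matrix with a landscape-resistance matrix gives the kind of comparison between distance metrics that the abstract describes.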
Fast ancestral gene order reconstruction of genomes with unequal gene content.

PubMed

Feijão, Pedro; Araujo, Eloi

2016-11-11

During evolution, genomes are modified by large scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present on each ancestral genome, and then finding a set of non conflicting adjacencies that optimize some given function, usually trying to maximize total weight and minimizing character changes in the tree. We call this type of methods homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes, that arise in DCJ rearrangement scenarios. This method showed better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. 
In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content, while running much faster, making it more scalable to larger datasets. Studying ancestral reconstruction problems under a new light, using the concept of intermediate genomes, allows the design of very fast algorithms by greatly reducing the solution search space, while also giving very good results. The algorithms introduced in this paper were implemented in an open-source software called RINGO (ancestral Reconstruction with INtermediate GenOmes), available at https://github.com/pedrofeijao/RINGO .
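The homology-based idea described in this abstract rests on finding adjacencies between genes that are conserved across genomes. The sketch below shows one common way to represent this, assuming each genome is a single linear chromosome written as a list of signed integers (sign gives orientation) and each gene has a tail and a head extremity. It is an illustration of the concept only, not RINGO's implementation, and all names are hypothetical.

```python
def extremities(gene: int):
    """Return (left, right) extremities of a signed, nonzero gene as read
    along the chromosome. A forward gene reads tail -> head; a reversed
    gene reads head -> tail."""
    name = abs(gene)
    tail, head = (name, "t"), (name, "h")
    return (tail, head) if gene > 0 else (head, tail)


def adjacencies(chromosome):
    """Set of adjacencies (unordered extremity pairs) in a linear chromosome."""
    result = set()
    for a, b in zip(chromosome, chromosome[1:]):
        _, right_of_a = extremities(a)
        left_of_b, _ = extremities(b)
        result.add(frozenset((right_of_a, left_of_b)))
    return result


def conserved_adjacencies(genome_a, genome_b):
    """Adjacencies present in both genomes."""
    return adjacencies(genome_a) & adjacencies(genome_b)


if __name__ == "__main__":
    ancestor_like = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # segment (2, 3) has been inverted
    for adjacency in sorted(map(sorted, conserved_adjacencies(ancestor_like, inverted))):
        print(adjacency)
```

In this toy example the inversion breaks the adjacencies at its two endpoints but preserves the one inside the inverted segment, which is why adjacencies are a useful unit of conserved structure.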
Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

PubMed Central

Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

2015-01-01

The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952
Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

PubMed

Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

2015-04-28

The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.
Genetic structure and signatures of selection in grey reef sharks (Carcharhinus amblyrhynchos).

PubMed

Momigliano, P; Harcourt, R; Robbins, W D; Jaiteh, V; Mahardika, G N; Sembiring, A; Stow, A

2017-09-01

With overfishing reducing the abundance of marine predators in multiple marine ecosystems, knowledge of genetic structure and local adaptation may provide valuable information to assist sustainable management. Despite recent technological advances, most studies on sharks have used small sets of neutral markers to describe their genetic structure. We used 5517 nuclear single-nucleotide polymorphisms (SNPs) and a mitochondrial DNA (mtDNA) gene to characterize patterns of genetic structure and detect signatures of selection in grey reef sharks (Carcharhinus amblyrhynchos). Using samples from Australia, Indonesia and oceanic reefs in the Indian Ocean, we established that large oceanic distances represent barriers to gene flow, whereas genetic differentiation on continental shelves follows an isolation by distance model. In Australia and Indonesia differentiation at nuclear SNPs was weak, with coral reefs acting as stepping stones maintaining connectivity across large distances. Differentiation of mtDNA was stronger, and more pronounced in females, suggesting sex-biased dispersal. Four independent tests identified a set of loci putatively under selection, indicating that grey reef sharks in eastern Australia are likely under different selective pressures to those in western Australia and Indonesia. Genetic distances averaged across all loci were uncorrelated with genetic distances calculated from outlier loci, supporting the conclusion that different processes underpin genetic divergence in these two data sets. This pattern of heterogeneous genomic differentiation, suggestive of local adaptation, has implications for the conservation of grey reef sharks; furthermore, it highlights that marine species showing little genetic differentiation at neutral loci may exhibit patterns of cryptic genetic structure driven by local selection.
Homogenous Population Genetic Structure of the Non-Native Raccoon Dog (Nyctereutes procyonoides) in Europe as a Result of Rapid Population Expansion

PubMed Central

Drygala, Frank; Korablev, Nikolay; Ansorge, Hermann; Fickel, Joerns; Isomursu, Marja; Elmeros, Morten; Kowalczyk, Rafał; Baltrunaite, Laima; Balciauskas, Linas; Saarma, Urmas; Schulze, Christoph; Borkenhagen, Peter; Frantz, Alain C.

2016-01-01

The extent of gene flow during the range expansion of non-native species influences the amount of genetic diversity retained in expanding populations. Here, we analyse the population genetic structure of the raccoon dog (Nyctereutes procyonoides) in north-eastern and central Europe. This invasive species is of management concern because it is highly susceptible to fox rabies and an important secondary host of the virus. We hypothesized that the large number of introduced animals and the species’ dispersal capabilities led to high population connectivity and maintenance of genetic diversity throughout the invaded range. We genotyped 332 tissue samples from seven European countries using 16 microsatellite loci. Different algorithms identified three genetic clusters corresponding to Finland, Denmark and a large ‘central’ population that reached from introduction areas in western Russia to northern Germany. Cluster assignments provided evidence of long-distance dispersal. The results of an Approximate Bayesian Computation analysis supported a scenario of equal effective population sizes among different pre-defined populations in the large central cluster. Our results are in line with strong gene flow and secondary admixture between neighbouring demes leading to reduced genetic structuring, probably a result of the species’ fairly rapid population expansion after introduction. The results presented here are remarkable in the sense that we identified a homogenous genetic cluster inhabiting an area stretching over more than 1500 km. They are also relevant for disease management, as in the event of a significant rabies outbreak, there is a great risk of a rapid virus spread among raccoon dog populations. PMID:27064784

Canine candidate genes for dilated cardiomyopathy: annotation of and polymorphic markers for 14 genes

PubMed Central

Wiersma, Anje C; Leegwater, Peter AJ; van Oost, Bernard A; Ollier, William E; Dukes-McEwan, Joanna

2007-01-01

Background: Dilated cardiomyopathy is a myocardial disease occurring in humans and domestic animals and is characterized by dilatation of the left ventricle, reduced systolic function and increased sphericity of the left ventricle. Dilated cardiomyopathy has been observed in several, mostly large and giant, dog breeds, such as the Dobermann and the Great Dane. A number of genes have been identified, which are associated with dilated cardiomyopathy in the human, mouse and hamster. These genes mainly encode structural proteins of the cardiac myocyte. Results: We present the annotation of, and marker development for, 14 of these genes of the dog genome, i.e. α-cardiac actin, caveolin 1, cysteine-rich protein 3, desmin, lamin A/C, LIM-domain binding factor 3, myosin heavy polypeptide 7, phospholamban, sarcoglycan δ, titin cap, α-tropomyosin, troponin I, troponin T and vinculin. A total of 33 Single Nucleotide Polymorphisms were identified for these canine genes and 11 polymorphic microsatellite repeats were developed. Conclusion: The presented polymorphisms provide a tool to investigate the role of the corresponding genes in canine Dilated Cardiomyopathy by linkage analysis or association studies. PMID:17949487
Expressing genes do not forget their LINEs: transposable elements and gene expression

PubMed Central

Kines, Kristine J.; Belancio, Victoria P.

2012-01-01

Historically the accumulated mass of mammalian transposable elements (TEs), particularly those located within gene boundaries, was viewed as a genetic burden potentially detrimental to the genomic landscape. This notion has been strengthened by the discovery that transposable sequences can alter the architecture of the transcriptome, not only through insertion, but also long after the integration process is completed. Insertions previously considered harmless are now known to impact the expression of host genes via modification of the transcript quality or quantity, transcriptional interference, or by the control of pathways that affect the mRNA life-cycle. Conversely, several examples of the evolutionary advantageous impact of TEs on the host gene structure that diversified the cellular transcriptome are reported. TE-induced changes in gene expression can be tissue- or disease-specific, raising the possibility that the impact of TE sequences may vary during development, among normal cell types, and between normal and disease-affected tissues. The understanding of the rules and abundance of TE-interference with gene expression is in its infancy, and its contribution to human disease and/or evolution remains largely unexplored. PMID:22201807
The Complete Mitochondrial Genome of the Land Snail Cornu aspersum (Helicidae: Mollusca): Intra-Specific Divergence of Protein-Coding Genes and Phylogenetic Considerations within Euthyneura

PubMed Central

Gaitán-Espitia, Juan Diego; Nespolo, Roberto F.; Opazo, Juan C.

2013-01-01

The complete sequences of three mitochondrial genomes from the land snail Cornu aspersum were determined. The mitogenome has a length of 14050 bp, and it encodes 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes. It also includes nine small intergene spacers, and a large AT-rich intergenic spacer. The intra-specific divergence analysis revealed that COX1 has the lower genetic differentiation, while the most divergent genes were NADH1, NADH3 and NADH4. With the exception of Euhadra herklotsi, the structural comparisons showed the same gene order within the family Helicidae, and nearly identical gene organization to that found in order Pulmonata. Phylogenetic reconstruction recovered Basommatophora as polyphyletic group, whereas Eupulmonata and Pulmonata as paraphyletic groups. Bayesian and Maximum Likelihood analyses showed that C. aspersum is a close relative of Cepaea nemoralis, and with the other Helicidae species form a sister group of Albinaria caerulea, supporting the monophyly of the Stylommatophora clade. PMID:23826260
Canine candidate genes for dilated cardiomyopathy: annotation of and polymorphic markers for 14 genes.

PubMed

Wiersma, Anje C; Leegwater, Peter Aj; van Oost, Bernard A; Ollier, William E; Dukes-McEwan, Joanna

2007-10-19

Dilated cardiomyopathy is a myocardial disease occurring in humans and domestic animals and is characterized by dilatation of the left ventricle, reduced systolic function and increased sphericity of the left ventricle. Dilated cardiomyopathy has been observed in several, mostly large and giant, dog breeds, such as the Dobermann and the Great Dane. A number of genes have been identified, which are associated with dilated cardiomyopathy in the human, mouse and hamster. These genes mainly encode structural proteins of the cardiac myocyte. We present the annotation of, and marker development for, 14 of these genes of the dog genome, i.e. alpha-cardiac actin, caveolin 1, cysteine-rich protein 3, desmin, lamin A/C, LIM-domain binding factor 3, myosin heavy polypeptide 7, phospholamban, sarcoglycan delta, titin cap, alpha-tropomyosin, troponin I, troponin T and vinculin. A total of 33 Single Nucleotide Polymorphisms were identified for these canine genes and 11 polymorphic microsatellite repeats were developed. The presented polymorphisms provide a tool to investigate the role of the corresponding genes in canine Dilated Cardiomyopathy by linkage analysis or association studies.
Spatial expression of Hox cluster genes in the ontogeny of a sea urchin

NASA Technical Reports Server (NTRS)

Arenas-Mena, C.; Cameron, A. R.; Davidson, E. H.

2000-01-01

The Hox cluster of the sea urchin Strongylocentrotus purpuratus contains ten genes in a 500 kb span of the genome. Only two of these genes are expressed during embryogenesis, while all of eight genes tested are expressed during development of the adult body plan in the larval stage. We report the spatial expression during larval development of the five 'posterior' genes of the cluster: SpHox7, SpHox8, SpHox9/10, SpHox11/13a and SpHox11/13b. The five genes exhibit a dynamic, largely mesodermal program of expression. Only SpHox7 displays extensive expression within the pentameral rudiment itself. A spatially sequential and colinear arrangement of expression domains is found in the somatocoels, the paired posterior mesodermal structures that will become the adult perivisceral coeloms. No such sequential expression pattern is observed in endodermal, epidermal or neural tissues of either the larva or the presumptive juvenile sea urchin. The spatial expression patterns of the Hox genes illuminate the evolutionary process by which the pentameral echinoderm body plan emerged from a bilateral ancestor.
Aggregating Data for Computational Toxicology Applications: The U.S. Environmental Protection Agency (EPA) Aggregated Computational Toxicology Resource (ACToR) System

PubMed Central

Judson, Richard S.; Martin, Matthew T.; Egeghy, Peter; Gangwal, Sumit; Reif, David M.; Kothiya, Parth; Wolf, Maritja; Cathey, Tommy; Transue, Thomas; Smith, Doris; Vail, James; Frame, Alicia; Mosher, Shad; Cohen Hubal, Elaine A.; Richard, Ann M.

2012-01-01

Computational toxicology combines data from high-throughput test methods, chemical structure analyses and other biological domains (e.g., genes, proteins, cells, tissues) with the goals of predicting and understanding the underlying mechanistic causes of chemical toxicity and for predicting toxicity of new chemicals and products. A key feature of such approaches is their reliance on knowledge extracted from large collections of data and data sets in computable formats. The U.S. Environmental Protection Agency (EPA) has developed a large data resource called ACToR (Aggregated Computational Toxicology Resource) to support these data-intensive efforts. ACToR comprises four main repositories: core ACToR (chemical identifiers and structures, and summary data on hazard, exposure, use, and other domains), ToxRefDB (Toxicity Reference Database, a compilation of detailed in vivo toxicity data from guideline studies), ExpoCastDB (detailed human exposure data from observational studies of selected chemicals), and ToxCastDB (data from high-throughput screening programs, including links to underlying biological information related to genes and pathways). The EPA DSSTox (Distributed Structure-Searchable Toxicity) program provides expert-reviewed chemical structures and associated information for these and other high-interest public inventories. Overall, the ACToR system contains information on about 400,000 chemicals from 1100 different sources. The entire system is built using open source tools and is freely available to download. This review describes the organization of the data repository and provides selected examples of use cases. PMID:22408426
Aggregating data for computational toxicology applications: The U.S. Environmental Protection Agency (EPA) Aggregated Computational Toxicology Resource (ACToR) System.

PubMed

Judson, Richard S; Martin, Matthew T; Egeghy, Peter; Gangwal, Sumit; Reif, David M; Kothiya, Parth; Wolf, Maritja; Cathey, Tommy; Transue, Thomas; Smith, Doris; Vail, James; Frame, Alicia; Mosher, Shad; Cohen Hubal, Elaine A; Richard, Ann M

2012-01-01

Computational toxicology combines data from high-throughput test methods, chemical structure analyses and other biological domains (e.g., genes, proteins, cells, tissues) with the goals of predicting and understanding the underlying mechanistic causes of chemical toxicity and for predicting toxicity of new chemicals and products. A key feature of such approaches is their reliance on knowledge extracted from large collections of data and data sets in computable formats. The U.S. Environmental Protection Agency (EPA) has developed a large data resource called ACToR (Aggregated Computational Toxicology Resource) to support these data-intensive efforts. ACToR comprises four main repositories: core ACToR (chemical identifiers and structures, and summary data on hazard, exposure, use, and other domains), ToxRefDB (Toxicity Reference Database, a compilation of detailed in vivo toxicity data from guideline studies), ExpoCastDB (detailed human exposure data from observational studies of selected chemicals), and ToxCastDB (data from high-throughput screening programs, including links to underlying biological information related to genes and pathways). The EPA DSSTox (Distributed Structure-Searchable Toxicity) program provides expert-reviewed chemical structures and associated information for these and other high-interest public inventories. Overall, the ACToR system contains information on about 400,000 chemicals from 1100 different sources. The entire system is built using open source tools and is freely available to download. This review describes the organization of the data repository and provides selected examples of use cases.
Genomewide analysis of TCP transcription factor gene family in Malus domestica.

PubMed

Xu, Ruirui; Sun, Peng; Jia, Fengjuan; Lu, Longtao; Li, Yuanyuan; Zhang, Shizhong; Huang, Jinguang

2014-12-01

Teosinte branched 1/cycloidea/proliferating cell factor 1 (TCP) proteins are a large family of transcriptional regulators in angiosperms. They are involved in various biological processes, including development and plant metabolism pathways. In this study, a total of 52 TCP genes were identified in the apple (Malus domestica) genome. Bioinformatic methods were employed to predict and analyse the gene classification, gene structure, chromosome location, sequence alignment and conserved domains of MdTCP proteins. Expression analysis from microarray data showed that the expression levels of 28 and 51 MdTCP genes changed during the ripening and rootstock-scion interaction processes, respectively. The expression patterns of 12 selected MdTCP genes were analysed in different tissues and in response to abiotic stresses. All of the selected genes were detected in at least one of the tissues tested, and most of them were modulated by adverse treatments, indicating that the MdTCPs were involved in various developmental and physiological processes. To the best of our knowledge, this is the first genomewide analysis of the apple TCP gene family. These results provide valuable information for studies on functions of the TCP transcription factor genes in apple.
Determining Semantically Related Significant Genes.

PubMed

Taha, Kamal

2014-01-01

GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.
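The lowest common ancestor (LCA) mentioned in this abstract can be illustrated on a small directed acyclic graph. The sketch below is not RSGSearch and does not use real GO terms: the `PARENTS` map and the term names are hypothetical, and "lowest" is taken here to mean the common ancestor(s) with the greatest depth, which is only one of several possible definitions in a graph where a term may have more than one parent.

```python
from collections import deque

# Hypothetical toy DAG (child -> list of parents); these are not real GO terms.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a"],
    "d": ["a", "b"],
    "e": ["c"],
}

def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = {term}
    queue = deque([term])
    while queue:
        for parent in PARENTS[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

def depth(term):
    """Length of the longest path from the term up to a root."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)

def lowest_common_ancestors(x, y):
    """Common ancestors of x and y that lie deepest in the DAG."""
    common = ancestors(x) & ancestors(y)
    deepest = max(depth(t) for t in common)
    return {t for t in common if depth(t) == deepest}

print(lowest_common_ancestors("e", "d"))
```

In this toy graph the terms `e` and `d` share the ancestors `a` and `root`, and `a` is the deeper of the two.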
Microbial Community and Functional Gene Changes in Arctic Tundra Soils in a Microcosm Warming Experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Ziming; Yang, Sihang; Van Nostrand, Joy D.

Microbial decomposition of soil organic carbon (SOC) in the thawing Arctic permafrost is one of the most important, but poorly understood, processes in determining the greenhouse gases feedback of tundra ecosystems to climate. Here in this paper, we examine changes in microbial community structure during an anoxic incubation at either –2 or 8 °C for up to 122 days using both an organic and a mineral soil collected from the Barrow Environmental Observatory in northern Alaska, USA. Soils were characterized for SOC chemistry, and GeoChips were used to determine microbial community structure and functional genes associated with C degradation and Fe(III) reduction. We observed notable decreases in functional gene diversity (at P < 0.05) in response to warming at 8 °C, particularly in the organic soil. A number of genes associated with SOC degradation, fermentation, methanogenesis, and iron cycling decreased significantly (P < 0.05) after 122 days of incubation, which coincided well with decreasing labile SOC content, soil respiration, methane production, and iron reduction. The soil type (i.e., organic vs. mineral) and the availability of labile SOC were among the most significant environmental factors impacting the functional community structure. In contrast, the functional structure was largely unchanged in the –2 °C incubation due to low microbial activity resulting in less competition or exclusion. These results demonstrate the vulnerability of SOC in Arctic tundra to warming, facilitated by iron reduction and methanogenesis, and the importance of microbial communities in moderating such vulnerability.
Microbial Community and Functional Gene Changes in Arctic Tundra Soils in a Microcosm Warming Experiment

DOE PAGES

Yang, Ziming; Yang, Sihang; Van Nostrand, Joy D.; ...

2017-09-19

Microbial decomposition of soil organic carbon (SOC) in the thawing Arctic permafrost is one of the most important, but poorly understood, processes in determining the greenhouse gases feedback of tundra ecosystems to climate. Here in this paper, we examine changes in microbial community structure during an anoxic incubation at either –2 or 8 °C for up to 122 days using both an organic and a mineral soil collected from the Barrow Environmental Observatory in northern Alaska, USA. Soils were characterized for SOC chemistry, and GeoChips were used to determine microbial community structure and functional genes associated with C degradation and Fe(III) reduction. We observed notable decreases in functional gene diversity (at P < 0.05) in response to warming at 8 °C, particularly in the organic soil. A number of genes associated with SOC degradation, fermentation, methanogenesis, and iron cycling decreased significantly (P < 0.05) after 122 days of incubation, which coincided well with decreasing labile SOC content, soil respiration, methane production, and iron reduction. The soil type (i.e., organic vs. mineral) and the availability of labile SOC were among the most significant environmental factors impacting the functional community structure. In contrast, the functional structure was largely unchanged in the –2 °C incubation due to low microbial activity resulting in less competition or exclusion. These results demonstrate the vulnerability of SOC in Arctic tundra to warming, facilitated by iron reduction and methanogenesis, and the importance of microbial communities in moderating such vulnerability.
Genetic structure of cougar populations across the Wyoming basin: Metapopulation or megapopulation

USGS Publications Warehouse

Anderson, C.R.; Lindzey, F.G.; McDonald, D.B.

2004-01-01

We examined the genetic structure of 5 Wyoming cougar (Puma concolor) populations surrounding the Wyoming Basin, as well as a population from southwestern Colorado. When using 9 microsatellite DNA loci, observed heterozygosity was similar among populations (HO = 0.49-0.59) and intermediate to that of other large carnivores. Estimates of genetic structure (FST = 0.028, RST = 0.029) and number of migrants per generation (Nm) suggested high gene flow. Nm was lowest between distant populations and highest among adjacent populations. Examination of these data, plus Mantel test results of genetic versus geographic distance (P ≤ 0.01), suggested both isolation by distance and an effect of habitat matrix. Bayesian assignment to population based on individual genotypes showed that cougars in this region were best described as a single panmictic population. Total effective population size for cougars in this region ranged from 1,797 to 4,532 depending on mutation model and analytical method used. Based on measures of gene flow, extinction risk in the near future appears low. We found no support for the existence of metapopulation structure among cougars in this region.
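A rough sense of how a low FST translates into many migrants per generation can be obtained from Wright's island-model approximation, Nm ≈ (1 − FST) / (4 FST). The abstract does not say which estimator the authors used, so the sketch below is only an illustration under that assumption; the function name is hypothetical.

```python
def migrants_per_generation(fst: float) -> float:
    """Wright's island-model approximation: Nm ~ (1 - FST) / (4 * FST).

    Assumes an idealized island model at migration-drift equilibrium,
    which real populations only approximate.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)

# FST value quoted in the abstract above.
print(round(migrants_per_generation(0.028), 2))  # 8.68
```

Under this approximation, any value well above one migrant per generation is usually read as gene flow strong enough to limit differentiation by drift.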
Predicting Rat and Human Pregnane X Receptor Activators Using Bayesian Classification Models.

PubMed

AbdulHameed, Mohamed Diwan M; Ippolito, Danielle L; Wallqvist, Anders

2016-10-17

The pregnane X receptor (PXR) is a ligand-activated transcription factor that acts as a master regulator of metabolizing enzymes and transporters. To avoid adverse drug-drug interactions and diseases such as steatosis and cancers associated with PXR activation, identifying drugs and chemicals that activate PXR is of crucial importance. In this work, we developed ligand-based predictive computational models for both rat and human PXR activation, which allowed us to identify potentially harmful chemicals and evaluate species-specific effects of a given compound. We utilized a large publicly available data set of nearly 2000 compounds screened in cell-based reporter gene assays to develop Bayesian quantitative structure-activity relationship models using physicochemical properties and structural descriptors. Our analysis showed that PXR activators tend to be hydrophobic and significantly different from nonactivators in terms of their physicochemical properties such as molecular weight, logP, number of rings, and solubility. Our Bayesian models, evaluated by using 5-fold cross-validation, displayed a sensitivity of 75% (76%), specificity of 76% (75%), and accuracy of 89% (89%) for human (rat) PXR activation. We identified structural features shared by rat and human PXR activators as well as those unique to each species. We compared rat in vitro PXR activation data to in vivo data by using DrugMatrix, a large toxicogenomics database with gene expression data obtained from rats after exposure to diverse chemicals. Although in vivo gene expression data pointed to cross-talk between nuclear receptor activators that is captured only by in vivo assays, overall we found broad agreement between in vitro and in vivo PXR activation. Thus, the models developed here serve primarily as efficient initial high-throughput in silico screens of in vitro activity.
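The performance measures named in this abstract are standard functions of a binary confusion matrix. The sketch below shows the usual definitions; the counts are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) for a binary classifier.

    tp/fn: activators predicted as activator / non-activator
    tn/fp: non-activators predicted as non-activator / activator
    """
    sensitivity = tp / (tp + fn)                # true-positive rate
    specificity = tn / (tn + fp)                # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    return sensitivity, specificity, accuracy

# Invented counts, for illustration only.
sens, spec, acc = classification_metrics(tp=30, fn=10, tn=50, fp=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

In a k-fold cross-validation such as the 5-fold scheme mentioned above, these quantities would typically be computed on each held-out fold and then averaged.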
Assessment of genetic diversity, population structure, and gene flow of tigers (Panthera tigris tigris) across Nepal's Terai Arc Landscape.

PubMed

Thapa, Kanchan; Manandhar, Sulochana; Bista, Manisha; Shakya, Jivan; Sah, Govind; Dhakal, Maheshwar; Sharma, Netra; Llewellyn, Bronwyn; Wultsch, Claudia; Waits, Lisette P; Kelly, Marcella J; Hero, Jean-Marc; Hughes, Jane; Karmacharya, Dibesh

2018-01-01

With fewer than 200 tigers (Panthera tigris tigris) left in Nepal, that are generally confined to five protected areas across the Terai Arc Landscape, genetic studies are needed to provide crucial information on diversity and connectivity for devising an effective country-wide tiger conservation strategy. As part of the Nepal Tiger Genome Project, we studied landscape change, genetic variation, population structure, and gene flow of tigers across the Terai Arc Landscape by conducting Nepal's first comprehensive and systematic scat-based, non-invasive genetic survey. Of the 770 scat samples collected opportunistically from five protected areas and six presumed corridors, 412 were tiger (57%). Out of ten microsatellite loci, we retain eight markers that were used in identifying 78 individual tigers. We used this dataset to examine population structure, genetic variation, contemporary gene flow, and potential population bottlenecks of tigers in Nepal. We detected three genetic clusters consistent with three demographic sub-populations and found moderate levels of genetic variation (He = 0.61, AR = 3.51) and genetic differentiation (FST = 0.14) across the landscape. We detected 3-7 migrants, confirming the potential for dispersal-mediated gene flow across the landscape. We found evidence of a bottleneck signature likely caused by large-scale land-use change documented in the last two centuries in the Terai forest. Securing tiger habitat including functional forest corridors is essential to enhance gene flow across the landscape and ensure long-term tiger survival. This requires cooperation among multiple stakeholders and careful conservation planning to prevent detrimental effects of anthropogenic activities on tigers.
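The expected heterozygosity (He) reported in this abstract is, in its simplest form, one minus the sum of squared allele frequencies at a locus, averaged over loci. The sketch below uses that basic formula with invented allele frequencies; the abstract does not state which estimator was used (sample-size-corrected versions are common), and the function names are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """He for one locus: 1 - sum(p_i^2), given allele frequencies summing to 1."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)

def mean_expected_heterozygosity(loci):
    """Average He across loci (each locus is a list of allele frequencies)."""
    return sum(expected_heterozygosity(f) for f in loci) / len(loci)

# Invented frequencies for two loci, for illustration only.
loci = [[0.5, 0.3, 0.2], [0.6, 0.4]]
print(round(mean_expected_heterozygosity(loci), 2))  # 0.55
```

Here the first locus gives 1 − 0.38 = 0.62 and the second 1 − 0.52 = 0.48, so the mean is 0.55.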
Assessment of genetic diversity, population structure, and gene flow of tigers (Panthera tigris tigris) across Nepal's Terai Arc Landscape

PubMed Central

Manandhar, Sulochana; Bista, Manisha; Shakya, Jivan; Sah, Govind; Dhakal, Maheshwar; Sharma, Netra; Llewellyn, Bronwyn; Wultsch, Claudia; Waits, Lisette P.; Kelly, Marcella J.; Hero, Jean-Marc; Hughes, Jane

2018-01-01

With fewer than 200 tigers (Panthera tigris tigris) left in Nepal, that are generally confined to five protected areas across the Terai Arc Landscape, genetic studies are needed to provide crucial information on diversity and connectivity for devising an effective country-wide tiger conservation strategy. As part of the Nepal Tiger Genome Project, we studied landscape change, genetic variation, population structure, and gene flow of tigers across the Terai Arc Landscape by conducting Nepal’s first comprehensive and systematic scat-based, non-invasive genetic survey. Of the 770 scat samples collected opportunistically from five protected areas and six presumed corridors, 412 were tiger (57%). Out of ten microsatellite loci, we retain eight markers that were used in identifying 78 individual tigers. We used this dataset to examine population structure, genetic variation, contemporary gene flow, and potential population bottlenecks of tigers in Nepal. We detected three genetic clusters consistent with three demographic sub-populations and found moderate levels of genetic variation (He = 0.61, AR = 3.51) and genetic differentiation (FST = 0.14) across the landscape. We detected 3–7 migrants, confirming the potential for dispersal-mediated gene flow across the landscape. We found evidence of a bottleneck signature likely caused by large-scale land-use change documented in the last two centuries in the Terai forest. Securing tiger habitat including functional forest corridors is essential to enhance gene flow across the landscape and ensure long-term tiger survival. This requires cooperation among multiple stakeholders and careful conservation planning to prevent detrimental effects of anthropogenic activities on tigers. PMID:29561865
Molecular gene organisation and secondary structure of the mitochondrial large subunit ribosomal RNA from the cultivated Basidiomycota Agrocybe aegerita: a 13 kb gene possessing six unusual nucleotide extensions and eight introns.

PubMed

Gonzalez, P; Barroso, G; Labarère, J

1999-04-01

The complete gene sequence and secondary structure of the mitochondrial LSU rRNA from the cultivated Basidiomycota Agrocybe aegerita was derived by chromosome walking. The A. aegerita LSU rRNA gene (13 526 nt) represents, to date, the longest described, due to the highest number of introns (eight) and the occurrence of six long nucleotidic extensions. Seven introns belong to group I, while the intronic sequence i5 constitutes the first typical group II intron reported in a fungal mitochondrial LSU rDNA. As with most fungal LSU rDNA introns reported to date, four introns (i5-i8) are distributed in domain V associated with the peptidyl-transferase activity. One intron (i1) is located in domain I, and three (i2-i4) in domain II. The introns i2-i8 possess homologies with other fungal, algal or protozoan introns located at the same position in LSU rDNAs. One of them (i6) is located at the same insertion site as most Ascomycota or algae LSU introns, suggesting a possible inheritance from a common ancestor. On the contrary, intron i1 is located at a so-far unreported insertion site. Among the six unusual nucleotide extensions, five are located in domain I and one in domain V. This is the first report of a mitochondrial LSU rRNA gene sequence and secondary structure for the whole Basidiomycota division.
Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

PubMed

Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

2017-10-17

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.
Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods

PubMed Central

Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.

2017-01-01

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148
Characterization of a Spontaneous Nonmagnetic Mutant of Magnetospirillum gryphiswaldense Reveals a Large Deletion Comprising a Putative Magnetosome Island

PubMed Central

Schübbe, Sabrina; Kube, Michael; Scheffel, André; Wawer, Cathrin; Heyen, Udo; Meyerdierks, Anke; Madkour, Mohamed H.; Mayer, Frank; Reinhardt, Richard; Schüler, Dirk

2003-01-01

Frequent spontaneous loss of the magnetic phenotype was observed in stationary-phase cultures of the magnetotactic bacterium Magnetospirillum gryphiswaldense MSR-1. A nonmagnetic mutant, designated strain MSR-1B, was isolated and characterized. The mutant lacked any structures resembling magnetosome crystals as well as internal membrane vesicles. The growth of strain MSR-1B was impaired under all growth conditions tested, and the uptake and accumulation of iron were drastically reduced under iron-replete conditions. A large chromosomal deletion of approximately 80 kb was identified in strain MSR-1B, which comprised both the entire mamAB and mamDC clusters as well as further putative operons encoding a number of magnetosome-associated proteins. A bacterial artificial chromosome clone partially covering the deleted region was isolated from the genomic library of wild-type M. gryphiswaldense. Sequence analysis of this fragment revealed that all previously identified mam genes were closely linked with genes encoding other magnetosome-associated proteins within less than 35 kb. In addition, this region was remarkably rich in insertion elements and harbored a considerable number of unknown gene families which appeared to be specific for magnetotactic bacteria. Overall, these findings suggest the existence of a putative large magnetosome island in M. gryphiswaldense and other magnetotactic bacteria. PMID:13129949
Insights into the noncoding RNome of nitrogen-fixing endosymbiotic α-proteobacteria.

PubMed

Jiménez-Zurdo, José I; Valverde, Claudio; Becker, Anke

2013-02-01

Symbiotic chronic infection of legumes by rhizobia involves transition of invading bacteria from a free-living environment in soil to an intracellular state as differentiated nitrogen-fixing bacteroids within the nodules elicited in the host plant. The adaptive flexibility demanded by this complex lifestyle is likely facilitated by the large set of regulatory proteins encoded by rhizobial genomes. However, proteins are not the only relevant players in the regulation of gene expression in bacteria. Large-scale high-throughput analysis of prokaryotic genomes is evidencing the expression of an unexpected plethora of small untranslated transcripts (sRNAs) with housekeeping or regulatory roles. sRNAs mostly act in response to environmental cues as post-transcriptional regulators of gene expression through protein-assisted base-pairing interactions with target mRNAs. Riboregulation contributes to fine-tune a wide range of bacterial processes which, in intracellular animal pathogens, largely compromise virulence traits. Here, we summarize the incipient knowledge about the noncoding RNome structure of nitrogen-fixing endosymbiotic bacteria as inferred from genome-wide searches for sRNA genes in the alfalfa partner Sinorhizobium meliloti and further comparative genomics analysis. The biology of relevant S. meliloti RNA chaperones (e.g., Hfq) is also reviewed as a first global indicator of the impact of riboregulation in the establishment of the symbiotic interaction.

Coordination of genomic structure and transcription by the main bacterial nucleoid-associated protein HU

PubMed Central

Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi

2010-01-01

The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci—spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid. PMID:20010798
Large-scale gene flow in the barnacle Jehlius cirratus and contrasts with other broadly-distributed taxa along the Chilean coast

PubMed Central

Guo, Baoying

2017-01-01

We evaluate the population genetic structure of the intertidal barnacle Jehlius cirratus across a broad portion of its geographic distribution using data from the mitochondrial cytochrome oxidase I (COI) gene region. Despite sampling diversity from over 3,000 km of the linear range of this species, there is only slight regional structure indicated, with overall Φ CT of 0.036 (p < 0.001) yet no support for isolation by distance. While these results suggest greater structure than previous studies of J. cirratus had indicated, the pattern of diversity is still far more subtle than in other similarly-distributed species with similar larval and life history traits. We compare these data and results with recent findings in four other intertidal species that have planktotrophic larvae. There are no clear patterns among these taxa that can be associated with intertidal depth or other known life history traits. PMID:28194316
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yuzawa, Satoshi; Keasling, Jay D.; Katz, Leonard

Complex polyketides comprise a large number of natural products that have broad application in medicine and agriculture. They are produced in bacteria and fungi from large enzyme complexes named type I modular polyketide synthases (PKSs) that are composed of multifunctional polypeptides containing discrete enzymatic domains organized into modules. The modular nature of PKSs has enabled a multitude of efforts to engineer the PKS genes to produce novel polyketides of predicted structure. Finally, we have repurposed PKSs to produce a number of short-chain mono- and di-carboxylic acids and ketones that could have applications as fuels or industrial chemicals.
Identification and expression profiles of the WRKY transcription factor family in Ricinus communis.

PubMed

Li, Hui-Liang; Zhang, Liang-Bo; Guo, Dong; Li, Chang-Zhu; Peng, Shi-Qing

2012-07-25

In plants, WRKY proteins constitute a large family of transcription factors. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. A large number of WRKY transcription factors have been reported from Arabidopsis, rice, and other higher plants. The recent publication of the draft genome sequence of castor bean (Ricinus communis) has allowed a genome-wide search for R. communis WRKY (RcWRKY) transcription factors and the comparison of these positively identified proteins with their homologs in model plants. A total of 47 WRKY genes were identified in the castor bean genome. According to the structural features of the WRKY domain, the RcWRKY are classified into seven main phylogenetic groups. Furthermore, putative orthologs of RcWRKY proteins in Arabidopsis and rice could now be assigned. An analysis of expression profiles of RcWRKY genes indicates that 47 WRKY genes display differential expressions either in their transcript abundance or expression patterns under normal growth conditions. Copyright © 2012 Elsevier B.V. All rights reserved.
Chromosomal DNA Deletions Explain Phenotypic Characteristics of Two Antigenic Variants, Phase II and RSA 514 (Crazy), of the Coxiella burnetii Nine Mile Strain†

PubMed Central

Hoover, T. A.; Culp, D. W.; Vodkin, M. H.; Williams, J. C.; Thompson, H. A.

2002-01-01

After repeated passages through embryonated eggs, the Nine Mile strain of Coxiella burnetii exhibits antigenic variation, a loss of virulence characteristics, and transition to a truncated lipopolysaccharide (LPS) structure. In two independently derived strains, Nine Mile phase II and RSA 514, these phenotypic changes were accompanied by a large chromosomal deletion (M. H. Vodkin and J. C. Williams, J. Gen. Microbiol. 132:2587-2594, 1986). In the work reported here, additional screening of a cosmid bank prepared from the wild-type strain was used to map the deletion termini of both mutant strains and to accumulate all the segments of DNA that comprise the two deletions. The corresponding DNAs were then sequenced and annotated. The Nine Mile phase II deletion was completely nested within the deletion of the RSA 514 strain. Basic alignment and homology studies indicated that a large group of LPS biosynthetic genes, arranged in an apparent O-antigen cluster, was deleted in both variants. Database homologies identified, in particular, mannose pathway genes and genes encoding sugar methylases and nucleotide sugar epimerase-dehydratase proteins. Candidate genes for addition of sugar units to the core oligosaccharide for synthesis of the rare sugar 6-deoxy-3-C-methylgulose (virenose) were identified in the deleted region. Repeats, redundancies, paralogous genes, and two regions with reduced G+C contents were found within the deletions. PMID:12438347
Genome-environment association study suggests local adaptation to climate at the regional scale in Fagus sylvatica.

PubMed

Pluess, Andrea R; Frank, Aline; Heiri, Caroline; Lalagüe, Hadrien; Vendramin, Giovanni G; Oddou-Muratorio, Sylvie

2016-04-01

The evolutionary potential of long-lived species, such as forest trees, is fundamental for their local persistence under climate change (CC). Genome-environment association (GEA) analyses reveal if species in heterogeneous environments at the regional scale are under differential selection resulting in populations with potential preadaptation to CC within this area. In 79 natural Fagus sylvatica populations, neutral genetic patterns were characterized using 12 simple sequence repeat (SSR) markers, and genomic variation (144 single nucleotide polymorphisms (SNPs) out of 52 candidate genes) was related to 87 environmental predictors in the latent factor mixed model, logistic regressions and isolation by distance/environmental (IBD/IBE) tests. SSR diversity revealed relatedness at up to 150 m intertree distance but an absence of large-scale spatial genetic structure and IBE. In the GEA analyses, 16 SNPs in 10 genes responded to one or several environmental predictors and IBE, corrected for IBD, was confirmed. The GEA often reflected the proposed gene functions, including indications for adaptation to water availability and temperature. Genomic divergence and the lack of large-scale neutral genetic patterns suggest that gene flow allows the spread of advantageous alleles in adaptive genes. Thereby, adaptation processes are likely to take place in species occurring in heterogeneous environments, which might reduce their regional extinction risk under CC. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Insights into Bacteriophage T5 Structure from Analysis of Its Morphogenesis Genes and Protein Components

PubMed Central

Zivanovic, Yvan; Confalonieri, Fabrice; Ponchon, Luc; Lurz, Rudi; Chami, Mohamed; Flayhan, Ali; Renouard, Madalena; Huet, Alexis; Decottignies, Paulette; Davidson, Alan R.; Breyton, Cécile

2014-01-01

Bacteriophage T5 represents a large family of lytic Siphoviridae infecting Gram-negative bacteria. The low-resolution structure of T5 showed the T=13 geometry of the capsid and the unusual trimeric organization of the tail tube, and the assembly pathway of the capsid was established. Although major structural proteins of T5 have been identified in these studies, most of the genes encoding the morphogenesis proteins remained to be identified. Here, we combine a proteomic analysis of T5 particles with a bioinformatic study and electron microscopic immunolocalization to assign function to the genes encoding the structural proteins, the packaging proteins, and other nonstructural components required for T5 assembly. A head maturation protease that likely accounts for the cleavage of the different capsid proteins is identified. Two other proteins involved in capsid maturation add originality to the T5 capsid assembly mechanism: the single head-to-tail joining protein, which closes the T5 capsid after DNA packaging, and the nicking endonuclease responsible for the single-strand interruptions in the T5 genome. We localize most of the tail proteins that were hitherto uncharacterized and provide a detailed description of the tail tip composition. Our findings highlight novel variations of viral assembly strategies and of virion particle architecture. They further recommend T5 for exploring phage structure and assembly and for deciphering conformational rearrangements that accompany DNA transfer from the capsid to the host cytoplasm. PMID:24198424
Gene selection for microarray data classification via subspace learning and manifold regularization.

PubMed

Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui

2017-12-19

With the rapid development of DNA microarray technology, a large amount of genomic data has been generated. Classification of these microarray data is a challenging task since gene expression data often comprise thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data, with irrelevant and redundant genes removed. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high-dimensional microarray data into a lower-dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator of different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better when compared with other state-of-the-art methods in terms of microarray data classification.
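The abstract above says the projection matrix can serve as an importance indicator of different genes, but it does not give the scoring rule. The sketch below assumes one common choice in subspace-learning feature selection: score each gene by the Euclidean norm of its row in the projection matrix and keep the top k. The function name, the toy matrix, and the row-norm rule itself are illustrative assumptions, not details taken from the paper.

```python
import math


def rank_genes_by_row_norm(W, k):
    """Return indices of the k genes whose rows of W have the largest L2 norm.

    W is a list of rows, one per gene; each row holds that gene's weights
    across the low-dimensional subspace. Row-norm scoring is an assumption
    of this sketch, not a rule quoted from the abstract.
    """
    scores = [math.sqrt(sum(w * w for w in row)) for row in W]
    order = sorted(range(len(W)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical projection matrix: 4 genes projected into a 2-dimensional subspace.
W_toy = [
    [0.9, 0.1],   # gene 0
    [0.0, 0.05],  # gene 1
    [0.3, 0.4],   # gene 2
    [0.6, 0.8],   # gene 3
]

selected = rank_genes_by_row_norm(W_toy, k=2)
print(selected)  # genes 3 and 0 have the largest row norms
```

In a real pipeline the matrix would come from the iterative update algorithm the abstract mentions; only the final ranking step is sketched here.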
Protists and the Wild, Wild West of Gene Expression: New Frontiers, Lawlessness, and Misfits.

PubMed

Smith, David Roy; Keeling, Patrick J

2016-09-08

The DNA double helix has been called one of life's most elegant structures, largely because of its universality, simplicity, and symmetry. The expression of information encoded within DNA, however, can be far from simple or symmetric and is sometimes surprisingly variable, convoluted, and wantonly inefficient. Although exceptions to the rules exist in certain model systems, the true extent to which life has stretched the limits of gene expression is made clear by nonmodel systems, particularly protists (microbial eukaryotes). The nuclear and organelle genomes of protists are subject to the most tangled forms of gene expression yet identified. The complicated and extravagant picture of the underlying genetics of eukaryotic microbial life changes how we think about the flow of genetic information and the evolutionary processes shaping it. Here, we discuss the origins, diversity, and growing interest in noncanonical protist gene expression and its relationship to genomic architecture.
Genetic control of α-Amylase production in wheat.

PubMed

Gale, M D; Law, C N; Chojecki, A J; Kempton, R A

1983-03-01

An analysis of the α-amylase isozymes in GA-treated endosperm of wheat nullisomic-tetrasomics shows that there is more variation at the α-Amy-1 and α-Amy-2 homoeoallelic loci than was previously thought. Among the 16 isozymes produced by genes on the group 7 chromosomes, most could be definitely established as products of a single homoeoallele. Inter-varietal allelic differences would be expected at such loci and clear variation was found in isozymes produced by chromosomes 6B and 7B. The latter allele, α-Amy-B2b carried by the variety 'Hope', was used to locate the enzyme structural gene within chromosome 7B relative to the centromere and five other gene markers. The nature of the α-Amy-B2b phenotype and the rare non-parental isozyme patterns found among the recombinant lines indicates that the locus is large and compound, probably involving some degree of intra-locus gene duplication.
Epigenetic silencing of a foreign gene in nuclear transformants of Chlamydomonas.

PubMed Central

Cerutti, H; Johnson, A M; Gillham, N W; Boynton, J E

1997-01-01

The unstable expression of introduced genes poses a serious problem for the application of transgenic technology in plants. In transformants of the unicellular green alga Chlamydomonas reinhardtii, expression of a eubacterial aadA gene, conferring spectinomycin resistance, is transcriptionally suppressed by a reversible epigenetic mechanism(s). Variations in the size and frequency of colonies surviving on different concentrations of spectinomycin as well as the levels of transcriptional activity of the introduced transgene(s) suggest the existence of intermediate expression states in genetically identical cells. Gene silencing does not correlate with methylation of the integrated DNA and does not involve large alterations in its chromatin structure, as revealed by digestion with restriction endonucleases and DNase I. Transgene repression is enhanced by lower temperatures, similar to position effect variegation in Drosophila. By analogy to epigenetic phenomena in several eukaryotes, our results suggest a possible role for (hetero)chromatic chromosomal domains in transcriptional inactivation. PMID:9212467
Cloning and heterologous expression of genes from the kinamycin biosynthetic pathway of Streptomyces murayamaensis.

PubMed

Gould, S J; Hong, S T; Carney, J R

1998-01-01

The genes for most of the biosynthesis of the kinamycin antibiotics have been cloned and heterologously expressed. Genomic DNA of Streptomyces murayamaensis was partially digested with MboI and a library of approximately 40 kb fragments in E. coli XL1-BlueMR was prepared using the cosmid vector pOJ446. Hybridization with the actI probe from the actinorhodin polyketide synthase genes identified two clusters of polyketide genes. After transferal of these clusters to S. lividans ZX7, expression of one cluster was established by HPLC with photodiode array detection. Peaks were identified from the kin cluster for dehydrorabelomycin, kinobscurinone, and stealthin C, which are known intermediates in kinamycin biosynthesis. Two shunt metabolites, kinafluorenone and seongomycin were also identified. The structure of the latter was determined from a quantity obtained from large-scale fermentation of one of the clones.
LTR Retrotransposons Show Low Levels of Unequal Recombination and High Rates of Intraelement Gene Conversion in Large Plant Genomes

PubMed Central

Cossu, Rosa Maria; Casola, Claudio; Giacomello, Stefania; Vidalis, Amaryllis

2017-01-01

The accumulation and removal of transposable elements (TEs) is a major driver of genome size evolution in eukaryotes. In plants, long terminal repeat (LTR) retrotransposons (LTR-RTs) represent the majority of TEs and form most of the nuclear DNA in large genomes. Unequal recombination (UR) between LTRs leads to removal of intervening sequence and formation of solo-LTRs. UR is a major mechanism of LTR-RT removal in many angiosperms, but our understanding of LTR-RT-associated recombination within the large, LTR-RT-rich genomes of conifers is quite limited. We employ a novel read-based methodology to estimate the relative rates of LTR-RT-associated UR within the genomes of four conifer and seven angiosperm species. We found the lowest rates of UR in the largest genomes studied, conifers and the angiosperm maize. Recombination may also resolve as gene conversion, which does not remove sequence, so we analyzed LTR-RT-associated gene conversion events (GCEs) in Norway spruce and six angiosperms. Opposite the trend for UR, we found the highest rates of GCEs in Norway spruce and maize. Unlike previous work in angiosperms, we found no evidence that rates of UR correlate with retroelement structural features in the conifers, suggesting that another process is suppressing UR in these species. Recent results from diverse eukaryotes indicate that heterochromatin affects the resolution of recombination, by favoring gene conversion over crossing-over, similar to our observation of opposed rates of UR and GCEs. Control of LTR-RT proliferation via formation of heterochromatin would be a likely step toward large genomes in eukaryotes carrying high LTR-RT content. PMID:29228262
A parallel strategy for predicting the secondary structure of polycistronic microRNAs.

PubMed

Han, Dianwei; Tang, Guiliang; Zhang, Jun

2013-01-01

The biogenesis of a functional microRNA is largely dependent on the secondary structure of the microRNA precursor (pre-miRNA). Recently, it has been shown that microRNAs are present in the genome in the form of polycistronic transcriptional units in plants and animals. It will be important to design efficient computational methods to predict such structures for microRNA discovery and its applications in gene silencing. In this paper, we propose a parallel algorithm based on the master-slave architecture to predict the secondary structure from an input sequence. We conducted experiments to verify the effectiveness of our parallel algorithm. The experimental results show that our algorithm is able to produce the optimal secondary structure of polycistronic microRNAs.
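The abstract above names a master-slave parallel design but gives no algorithmic detail. As a rough illustration of that pattern only, the sketch below has a master hand sequence segments to a pool of workers, each of which scores its segment with the classic Nussinov base-pair maximization. Nussinov scoring, the segment list, the minimum loop length of 3, and all function names are assumptions swapped in for illustration; they are not the authors' method.

```python
from concurrent.futures import ThreadPoolExecutor

# Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Worker task: maximum number of nested base pairs (Nussinov recursion).

    At least min_loop unpaired bases must separate two paired positions.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:  # case: k pairs with j
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def fold_segments(segments, workers=2):
    """Master: distribute segments to workers and collect scores in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(max_pairs, segments))


# Hypothetical segments standing in for hairpins of one polycistronic transcript.
segments = ["GGGAAACCC", "AAAA", "GCAUUUGC"]
print(fold_segments(segments))
```

Threads keep the sketch simple and portable; for CPU-bound folding in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the more realistic choice.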
Implications of the circumpolar genetic structure of polar bears for their conservation in a rapidly warming Arctic.

PubMed

Peacock, Elizabeth; Sonsthagen, Sarah A; Obbard, Martyn E; Boltunov, Andrei; Regehr, Eric V; Ovsyanikov, Nikita; Aars, Jon; Atkinson, Stephen N; Sage, George K; Hope, Andrew G; Zeyl, Eve; Bachmann, Lutz; Ehrich, Dorothee; Scribner, Kim T; Amstrup, Steven C; Belikov, Stanislav; Born, Erik W; Derocher, Andrew E; Stirling, Ian; Taylor, Mitchell K; Wiig, Øystein; Paetkau, David; Talbot, Sandra L

2015-01-01

We provide an expansive analysis of polar bear (Ursus maritimus) circumpolar genetic variation during the last two decades of decline in their sea-ice habitat. We sought to evaluate whether their genetic diversity and structure have changed over this period of habitat decline, how their current genetic patterns compare with past patterns, and how genetic demography changed with ancient fluctuations in climate. Characterizing their circumpolar genetic structure using microsatellite data, we defined four clusters that largely correspond to current ecological and oceanographic factors: Eastern Polar Basin, Western Polar Basin, Canadian Archipelago and Southern Canada. We document evidence for recent (ca. last 1-3 generations) directional gene flow from Southern Canada and the Eastern Polar Basin towards the Canadian Archipelago, an area hypothesized to be a future refugium for polar bears as climate-induced habitat decline continues. Our data provide empirical evidence in support of this hypothesis. The direction of current gene flow differs from earlier patterns of gene flow in the Holocene. From analyses of mitochondrial DNA, the Canadian Archipelago cluster and the Barents Sea subpopulation within the Eastern Polar Basin cluster did not show signals of population expansion, suggesting these areas may have served also as past interglacial refugia. Mismatch analyses of mitochondrial DNA data from polar and the paraphyletic brown bear (U. arctos) uncovered offset signals in timing of population expansion between the two species, that are attributed to differential demographic responses to past climate cycling. Mitogenomic structure of polar bears was shallow and developed recently, in contrast to the multiple clades of brown bears. We found no genetic signatures of recent hybridization between the species in our large, circumpolar sample, suggesting that recently observed hybrids represent localized events. 
Documenting changes in subpopulation connectivity will allow polar nations to proactively adjust conservation actions to continuing decline in sea-ice habitat.
Implications of the Circumpolar Genetic Structure of Polar Bears for Their Conservation in a Rapidly Warming Arctic

PubMed Central

Peacock, Elizabeth; Sonsthagen, Sarah A.; Obbard, Martyn E.; Boltunov, Andrei; Regehr, Eric V.; Ovsyanikov, Nikita; Aars, Jon; Atkinson, Stephen N.; Sage, George K.; Hope, Andrew G.; Zeyl, Eve; Bachmann, Lutz; Ehrich, Dorothee; Scribner, Kim T.; Amstrup, Steven C.; Belikov, Stanislav; Born, Erik W.; Derocher, Andrew E.; Stirling, Ian; Taylor, Mitchell K.; Wiig, Øystein; Paetkau, David; Talbot, Sandra L.

2015-01-01

We provide an expansive analysis of polar bear (Ursus maritimus) circumpolar genetic variation during the last two decades of decline in their sea-ice habitat. We sought to evaluate whether their genetic diversity and structure have changed over this period of habitat decline, how their current genetic patterns compare with past patterns, and how genetic demography changed with ancient fluctuations in climate. Characterizing their circumpolar genetic structure using microsatellite data, we defined four clusters that largely correspond to current ecological and oceanographic factors: Eastern Polar Basin, Western Polar Basin, Canadian Archipelago and Southern Canada. We document evidence for recent (ca. last 1–3 generations) directional gene flow from Southern Canada and the Eastern Polar Basin towards the Canadian Archipelago, an area hypothesized to be a future refugium for polar bears as climate-induced habitat decline continues. Our data provide empirical evidence in support of this hypothesis. The direction of current gene flow differs from earlier patterns of gene flow in the Holocene. From analyses of mitochondrial DNA, the Canadian Archipelago cluster and the Barents Sea subpopulation within the Eastern Polar Basin cluster did not show signals of population expansion, suggesting these areas may have served also as past interglacial refugia. Mismatch analyses of mitochondrial DNA data from polar and the paraphyletic brown bear (U. arctos) uncovered offset signals in timing of population expansion between the two species, that are attributed to differential demographic responses to past climate cycling. Mitogenomic structure of polar bears was shallow and developed recently, in contrast to the multiple clades of brown bears. We found no genetic signatures of recent hybridization between the species in our large, circumpolar sample, suggesting that recently observed hybrids represent localized events. 
Documenting changes in subpopulation connectivity will allow polar nations to proactively adjust conservation actions to continuing decline in sea-ice habitat. PMID:25562525
Implications of the circumpolar genetic structure of polar bears for their conservation in a rapidly warming Arctic

USGS Publications Warehouse

Peacock, Elizabeth; Sonsthagen, Sarah A.; Obbard, Martyn E.; Boltunov, Andrei N.; Regehr, Eric V.; Ovsyanikov, Nikita; Aars, Jon; Atkinson, Stephen N.; Sage, George K.; Hope, Andrew G.; Zeyl, Eve; Bachmann, Lutz; Ehrich, Dorothee; Scribner, Kim T.; Amstrup, Steven C.; Belikov, Stanislav; Born, Erik W.; Derocher, Andrew E.; Stirling, Ian; Taylor, Mitchell K.; Wiig, Øystein; Paetkau, David; Talbot, Sandra L.

2015-01-01

We provide an expansive analysis of polar bear (Ursus maritimus) circumpolar genetic variation during the last two decades of decline in their sea-ice habitat. We sought to evaluate whether their genetic diversity and structure have changed over this period of habitat decline, how their current genetic patterns compare with past patterns, and how genetic demography changed with ancient fluctuations in climate. Characterizing their circumpolar genetic structure using microsatellite data, we defined four clusters that largely correspond to current ecological and oceanographic factors: Eastern Polar Basin, Western Polar Basin, Canadian Archipelago and Southern Canada. We document evidence for recent (ca. last 1–3 generations) directional gene flow from Southern Canada and the Eastern Polar Basin towards the Canadian Archipelago, an area hypothesized to be a future refugium for polar bears as climate-induced habitat decline continues. Our data provide empirical evidence in support of this hypothesis. The direction of current gene flow differs from earlier patterns of gene flow in the Holocene. From analyses of mitochondrial DNA, the Canadian Archipelago cluster and the Barents Sea subpopulation within the Eastern Polar Basin cluster did not show signals of population expansion, suggesting these areas may have served also as past interglacial refugia. Mismatch analyses of mitochondrial DNA data from polar and the paraphyletic brown bear (U. arctos) uncovered offset signals in timing of population expansion between the two species, that are attributed to differential demographic responses to past climate cycling. Mitogenomic structure of polar bears was shallow and developed recently, in contrast to the multiple clades of brown bears. We found no genetic signatures of recent hybridization between the species in our large, circumpolar sample, suggesting that recently observed hybrids represent localized events. 
Documenting changes in subpopulation connectivity will allow polar nations to proactively adjust conservation actions to continuing decline in sea-ice habitat.
Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna

Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.
The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).

PubMed

Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan

2016-01-01

The complete chloroplast (cp) genome sequence of Eustrephus latifolius was determined, the first in subfamily Lomandroideae of family Asparagaceae. It was 159,736 bp long and contained a large single copy region (82,403 bp) and a small single copy region (13,607 bp), which were separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes, and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. Its amino acid sequence was almost identical to that of Phoenix dactylifera in Arecales. Ycf1 of E. latifolius was located completely within the IR. This was similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata, and Wolffia australiana in Alismatales.
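The region lengths and gene counts reported in the abstract above can be cross-checked with a few lines of arithmetic. The sketch assumes the usual quadripartite plastome layout (one large single copy region, one small single copy region, and two inverted repeat copies) and that the reported inverted repeat length is per copy; the variable and function names are illustrative.

```python
# Region lengths (bp) as reported in the abstract.
LSC = 82_403   # large single copy region
SSC = 13_607   # small single copy region
IR = 31_863    # one inverted repeat; assumed to be the per-copy length
REPORTED_TOTAL = 159_736


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + 2 * IR."""
    return lsc + ssc + 2 * ir


# Gene categories as reported in the abstract.
GENE_COUNTS = {"coding": 83, "rRNA": 8, "tRNA": 38, "pseudogene": 3}
REPORTED_GENES = 132

total_bp = plastome_length(LSC, SSC, IR)
total_genes = sum(GENE_COUNTS.values())

print(total_bp, total_bp == REPORTED_TOTAL)          # 159736 True
print(total_genes, total_genes == REPORTED_GENES)    # 132 True
```

Both totals match the reported figures, which supports reading 31,863 bp as the length of each inverted repeat copy.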
Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

DOE PAGES

Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; ...

2015-04-09

Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

Microarray analysis on gene regulation by estrogen, progesterone and tamoxifen in human endometrial stromal cells.

PubMed

Ren, Chun-E; Zhu, Xueqiong; Li, Jinping; Lyle, Christian; Dowdy, Sean; Podratz, Karl C; Byck, David; Chen, Hai-Bin; Jiang, Shi-Wen

2015-03-13

Epithelial stromal cells represent a major cellular component of human uterine endometrium that is subject to tight hormonal regulation. Through cell-cell contacts and/or paracrine mechanisms, stromal cells play a significant role in the malignant transformation of epithelial cells. We isolated stromal cells from normal human endometrium and investigated the morphological and transcriptional changes induced by estrogen, progesterone and tamoxifen. We demonstrated that stromal cells express appreciable levels of estrogen and progesterone receptors and undergo different morphological changes upon hormonal stimulation. Microarray analysis indicated that both estrogen and progesterone induced dramatic alterations in a variety of genes associated with cell structure, transcription, cell cycle, and signaling. However, divergent patterns of changes, and in some genes opposite effects, were observed for the two hormones. A large number of genes are identified as novel targets for hormonal regulation. These hormone-responsive genes may be involved in normal uterine function and the development of endometrial malignancies.
Molecular Networking and Pattern-Based Genome Mining Improves discovery of biosynthetic gene clusters and their products from Salinispora species

PubMed Central

Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S.; Dorrestein, Pieter C.; Jensen, Paul R.

2015-01-01

Summary Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. Here we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. These efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches. PMID:25865308
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress

PubMed Central

Chowdhary, Surabhi; Kainth, Amoldeep S.

2017-01-01

ABSTRACT Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5′-3′ gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene “crumpling”). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. PMID:28970326
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress.

PubMed

Chowdhary, Surabhi; Kainth, Amoldeep S; Gross, David S

2017-12-15

Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5'-3' gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene "crumpling"). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. Copyright © 2017 American Society for Microbiology.
Positive correlation between ADAR expression and its targets suggests a complex regulation mediated by RNA editing in the human brain

PubMed Central

Liscovitch, Noa; Bazak, Lily; Levanon, Erez Y; Chechik, Gal

2014-01-01

A-to-I RNA editing by adenosine deaminases acting on RNA is a post-transcriptional modification that is crucial for normal life and development in vertebrates. RNA editing has been shown to be very abundant in the human transcriptome, specifically at the primate-specific Alu elements. The functional role of this wide-spread effect is still not clear; it is believed that editing of transcripts is a mechanism for their down-regulation via processes such as nuclear retention or RNA degradation. Here we combine 2 neural gene expression datasets with genome-level editing information to examine the relation between the expression of ADAR genes with the expression of their target genes. Specifically, we computed the spatial correlation across structures of post-mortem human brains between ADAR and a large set of targets that were found to be edited in their Alu repeats. Surprisingly, we found that a large fraction of the edited genes are positively correlated with ADAR, opposing the assumption that editing would reduce expression. When considering the correlations between ADAR and its targets over development, 2 gene subsets emerge, positively correlated and negatively correlated with ADAR expression. Specifically, in embryonic time points, ADAR is positively correlated with many genes related to RNA processing and regulation of gene expression. These findings imply that the suggested mechanism of regulation of expression by editing is probably not a global one; ADAR expression does not have a genome wide effect reducing the expression of editing targets. It is possible, however, that RNA editing by ADAR in non-coding regions of the gene might be a part of a more complex expression regulation mechanism. PMID:25692240
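The spatial correlation described in this abstract compares the expression of ADAR with that of a target gene across brain structures. A minimal sketch of that idea follows, assuming a plain Pearson correlation; the expression values are invented toy numbers for illustration only, and the names `pearson`, `adar`, `target_up`, and `target_down` are hypothetical rather than taken from the study.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy expression levels across four brain structures (NOT real data).
adar = [1.0, 2.0, 3.0, 4.0]
target_up = [2.1, 3.9, 6.2, 8.0]     # rises with ADAR
target_down = [8.0, 6.2, 3.9, 2.1]   # falls as ADAR rises

r_up = pearson(adar, target_up)
r_down = pearson(adar, target_down)
print("positively correlated target:", r_up > 0)
print("negatively correlated target:", r_down < 0)
```

Under the down-regulation hypothesis mentioned in the abstract, most edited targets would look like `target_down`; the reported finding is that a large fraction instead behave like `target_up`.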
Positive correlation between ADAR expression and its targets suggests a complex regulation mediated by RNA editing in the human brain.

PubMed

Liscovitch, Noa; Bazak, Lily; Levanon, Erez Y; Chechik, Gal

2014-01-01

A-to-I RNA editing by adenosine deaminases acting on RNA is a post-transcriptional modification that is crucial for normal life and development in vertebrates. RNA editing has been shown to be very abundant in the human transcriptome, specifically at the primate-specific Alu elements. The functional role of this wide-spread effect is still not clear; it is believed that editing of transcripts is a mechanism for their down-regulation via processes such as nuclear retention or RNA degradation. Here we combine 2 neural gene expression datasets with genome-level editing information to examine the relation between the expression of ADAR genes with the expression of their target genes. Specifically, we computed the spatial correlation across structures of post-mortem human brains between ADAR and a large set of targets that were found to be edited in their Alu repeats. Surprisingly, we found that a large fraction of the edited genes are positively correlated with ADAR, opposing the assumption that editing would reduce expression. When considering the correlations between ADAR and its targets over development, 2 gene subsets emerge, positively correlated and negatively correlated with ADAR expression. Specifically, in embryonic time points, ADAR is positively correlated with many genes related to RNA processing and regulation of gene expression. These findings imply that the suggested mechanism of regulation of expression by editing is probably not a global one; ADAR expression does not have a genome wide effect reducing the expression of editing targets. It is possible, however, that RNA editing by ADAR in non-coding regions of the gene might be a part of a more complex expression regulation mechanism.
Molecular cloning, sequence and structural analysis of dehairing Mn(2+) dependent alkaline serine protease (MASPT) of Bacillus pumilus TMS55.

PubMed

Ibrahim, Kalibulla Syed; Muniyandi, Jeyaraj; Pandian, Shunmugiah Karutha

2011-10-01

Leather industries release large amounts of pollution-causing chemicals, making them one of the major sources of industrial pollution. Developing enzyme-based processes as an alternative to these chemicals is one way to address the problem. Proteases are enzymes which have extensive applications in leather processing and in several bioremediation processes due to their high alkaline protease activity and dehairing efficacy. In the present study, we report the cloning and characterization of a Mn2+ dependent alkaline serine protease gene (MASPT) of Bacillus pumilus TMS55. The gene encoding the protease from B. pumilus TMS55 was cloned and its nucleotide sequence was determined. This gene has an open reading frame (ORF) of 1,149 bp that encodes a polypeptide of 383 amino acid residues. Our analysis showed that this polypeptide is composed of a 29-residue N-terminal signal peptide, a propeptide of 79 residues and a mature protein of 275 amino acids. We performed bioinformatics analysis to compare the MASPT enzyme with other proteases. Homology modeling was employed to model the three-dimensional structure of MASPT. Structural analysis showed that the MASPT structure is composed of nine α-helices and nine β-strands. It has 3 catalytic residues and 14 metal binding residues. Docking analysis showed that residues S223, A260, N263, T328 and S329 interact with Mn2+. This study allows initial inferences about the structure of the protease and will allow the rational design of its derivatives for structure-function studies and also for further improvement of the enzyme.
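The lengths quoted in this abstract are internally consistent, which a short calculation makes explicit. The sketch below is an illustrative check only; it assumes the 1,149 bp figure counts sense codons alone (the abstract does not say whether a stop codon is included), and all names are hypothetical.

```python
# Figures as quoted in the abstract.
ORF_BP = 1_149
POLYPEPTIDE_AA = 383
SIGNAL_PEPTIDE_AA = 29
PROPEPTIDE_AA = 79
MATURE_PROTEIN_AA = 275

# Three nucleotides per codon; assumption: no stop codon in the 1,149 bp count.
codons, leftover_bases = divmod(ORF_BP, 3)

# The three processing segments should add up to the full polypeptide.
segments_total = SIGNAL_PEPTIDE_AA + PROPEPTIDE_AA + MATURE_PROTEIN_AA

print("ORF is a whole number of codons:", leftover_bases == 0)
print("codons match polypeptide length:", codons == POLYPEPTIDE_AA)
print("segments sum to polypeptide length:", segments_total == POLYPEPTIDE_AA)
```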
High-resolution phylogenetic microbial community profiling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Singer, Esther; Coleman-Derr, Devin; Bowman, Brett

2014-03-17

The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large percentage of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.
SNP-VISTA: An interactive SNP visualization tool

PubMed Central

Shah, Nameeta; Teplitsky, Michael V; Minovitsky, Simon; Pennacchio, Len A; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L

2005-01-01

Background Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1]. Results We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. Conclusion The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user. PMID:16336665
Activity-dependent neuroprotective protein (ADNP): a case study for highly conserved chordata-specific genes shaping the brain and mutated in cancer.

PubMed

Gozes, Illana; Yeheskel, Adva; Pasmanik-Chor, Metsada

2015-01-01

The recent finding of activity-dependent neuroprotective protein (ADNP) as a protein decreased in serum of patients with Alzheimer's disease (AD) compared to controls, alongside the discovery of ADNP mutations in autism and coupled with the original description of cancer mutations, ignited an interest in a comparative analysis of ADNP with other AD/autism/cancer-associated genes. We strive toward a better understanding of the molecular structure of key players in psychiatric/neurodegenerative diseases including autism, schizophrenia, and AD. This article includes data mining and bioinformatics analysis on the ADNP gene and protein, in addition to other related genes, with emphasis on recent literature. ADNP is discovered here as unique to Chordata, with specific autism mutations different from cancer-associated mutations. Furthermore, ADNP exhibits similarities to other cancer/autism-associated genes. We suggest that key genes, which shape and maintain our brain and are prone to mutations, are by and large unique to Chordata. Furthermore, these brain-controlling genes, like ADNP, are linked to cell growth and differentiation, and under different stress conditions may mutate or exhibit expression changes leading to cancer propagation. Better understanding of these genes could lead to better therapeutics.
Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement.

PubMed

Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K

2016-04-18

Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
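The abstract infers purifying selection from the ratio of non-synonymous to synonymous substitutions (dN/dS). A minimal sketch of the conventional reading of that ratio follows; the threshold `tolerance`, the function name, and the example rates are hypothetical illustrations, not values from the study, and real analyses rely on likelihood-based tests rather than a simple cutoff.

```python
def classify_selection(dn: float, ds: float, tolerance: float = 0.05) -> str:
    """Interpret a dN/dS ratio using the conventional rule of thumb.

    ratio < 1 suggests purifying selection, ratio close to 1 suggests
    neutral evolution, and ratio > 1 suggests positive selection.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    ratio = dn / ds
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical substitution rates, for illustration only.
print(classify_selection(dn=0.02, ds=0.20))  # ratio 0.1
print(classify_selection(dn=0.30, ds=0.15))  # ratio 2.0
```

A gene that is highly divergent at the nucleotide level can still return "purifying" here, which is the pattern the abstract reports for the divergent rpoA sequences.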
Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data.

PubMed

Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger

2017-01-01

Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

PubMed Central

Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

2013-01-01

Background The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. Conclusions/Significance This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request. PMID:23382856
Visualisation of the mechanosensitive channel of large conductance in bacteria using confocal microscopy.

PubMed

Norman, Christel; Liu, Zhen-Wei; Rigby, Paul; Raso, Albert; Petrov, Yevgeniy; Martinac, Boris

2005-07-01

The mechanosensitive channel of large conductance (MscL) plays an important role in the survival of bacterial cells under hypo-osmotic shock. This channel has been extensively studied and its sequence, structure and electrophysiological characteristics are well known. Here we present a method to visualise MscL in living bacteria using confocal microscopy. By creating a gene fusion between mscL and the gene encoding the green fluorescent protein (GFP) we were able to express the fusion protein MscL-GFP in bacteria. We show that MscL-GFP is present in the cytoplasmic membrane and forms functional channels. These channels have the same characteristics as wild-type MscL, except that they require more pressure to open. This method could prove an interesting, non-invasive tool to study the localisation and the regulation of expression of MscL in bacteria.
Comprehensive analysis and discovery of drought-related NAC transcription factors in common bean.

PubMed

Wu, Jing; Wang, Lanfen; Wang, Shumin

2016-09-07

Common bean (Phaseolus vulgaris L.) is an important warm-season food legume. Drought is the most important environmental stress factor affecting large areas of common bean via plant death or reduced global production. The NAM, ATAF1/2 and CUC2 (NAC) domain protein family are classic transcription factors (TFs) involved in a variety of abiotic stresses, particularly drought stress. However, the NAC TFs in common bean have not been characterized. In the present study, 86 putative NAC TF proteins were identified from the common bean genome database and located on 11 common bean chromosomes. The proteins were phylogenetically clustered into 8 distinct subfamilies. The gene structure and motif composition of common bean NACs were similar in each subfamily. These results suggest that NACs in the same subfamily may possess conserved functions. The expression patterns of common bean NAC genes were also characterized. The majority of NACs exhibited specific temporal and spatial expression patterns. We identified 22 drought-related NAC TFs based on transcriptome data for drought-tolerant and drought-sensitive genotypes. Quantitative real-time PCR (qRT-PCR) was performed to confirm the expression patterns of the 20 drought-related NAC genes. Based on the common bean genome sequence, we analyzed the structural characteristics, genome distribution, and expression profiles of NAC gene family members and analyzed drought-responsive NAC genes. Our results provide useful information for the functional characterization of common bean NAC genes and rich resources and opportunities for understanding common bean drought stress tolerance mechanisms.
An integrated approach to infer dynamic protein-gene interactions - A case study of the human P53 protein.

PubMed

Wang, Junbai; Wu, Qianqian; Hu, Xiaohua Tony; Tian, Tianhai

2016-11-01

Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation. Copyright © 2016 Elsevier Inc. All rights reserved.
MC EMiNEM maps the interaction landscape of the Mediator.

PubMed

Niederberger, Theresa; Etzold, Stefanie; Lidschreiber, Michael; Maier, Kerstin C; Martin, Dietmar E; Fröhlich, Holger; Cramer, Patrick; Tresch, Achim

2012-01-01

The Mediator is a highly conserved, large multiprotein complex that is involved essentially in the regulation of eukaryotic mRNA transcription. It acts as a general transcription factor by integrating regulatory signals from gene-specific activators or repressors to the RNA Polymerase II. The internal network of interactions between Mediator subunits that conveys these signals is largely unknown. Here, we introduce MC EMiNEM, a novel method for the retrieval of functional dependencies between proteins that have pleiotropic effects on mRNA transcription. MC EMiNEM is based on Nested Effects Models (NEMs), a class of probabilistic graphical models that extends the idea of hierarchical clustering. It combines mode-hopping Monte Carlo (MC) sampling with an Expectation-Maximization (EM) algorithm for NEMs to increase sensitivity compared to existing methods. A meta-analysis of four Mediator perturbation studies in Saccharomyces cerevisiae, three of which are unpublished, provides new insight into the Mediator signaling network. In addition to the known modular organization of the Mediator subunits, MC EMiNEM reveals a hierarchical ordering of its internal information flow, which is putatively transmitted through structural changes within the complex. We identify the N-terminus of Med7 as a peripheral entity, entailing only local structural changes upon perturbation, while the C-terminus of Med7 and Med19 appear to play a central role. MC EMiNEM associates Mediator subunits to most directly affected genes, which, in conjunction with gene set enrichment analysis, allows us to construct an interaction map of Mediator subunits and transcription factors.
Structural organization of the inactive X chromosome in the mouse

PubMed Central

Giorgetti, Luca; Lajoie, Bryan R.; Carter, Ava C.; Attia, Mikael; Zhan, Ye; Xu, Jin; Chen, Chong Jian; Kaplan, Noam; Chang, Howard Y.; Heard, Edith; Dekker, Job

2017-01-01

X-chromosome inactivation (XCI) involves major reorganization of the X chromosome as it becomes silent and heterochromatic. During female mammalian development, XCI is triggered by upregulation of the non-coding Xist RNA from one of the two X chromosomes. Xist coats the chromosome in cis and induces silencing of almost all genes via its A-repeat region [1,2], although some genes (constitutive escapees) avoid silencing in most cell types, and others (facultative escapees) escape XCI only in specific contexts [3]. A role for Xist in organizing the inactive X (Xi) chromosome has been proposed [4–6]. Recent chromosome conformation capture approaches have revealed global loss of local structure on the Xi chromosome and formation of large mega-domains, separated by a region containing the DXZ4 macrosatellite [7–10]. However, the molecular architecture of the Xi chromosome, in both the silent and expressed regions, remains unclear. Here we investigate the structure, chromatin accessibility and expression status of the mouse Xi chromosome in highly polymorphic clonal neural progenitors (NPCs) and embryonic stem cells. We demonstrate a crucial role for Xist and the DXZ4-containing boundary in shaping Xi chromosome structure using allele-specific genome-wide chromosome conformation capture (Hi-C) analysis, an assay for transposase-accessible chromatin with high throughput sequencing (ATAC–seq) and RNA sequencing. Deletion of the boundary disrupts mega-domain formation, and induction of Xist RNA initiates formation of the boundary and the loss of DNA accessibility. We also show that in NPCs, the Xi chromosome lacks active/inactive compartments and topologically associating domains (TADs), except around genes that escape XCI. Escapee gene clusters display TAD-like structures and retain DNA accessibility at promoter-proximal and CTCF-binding sites. Furthermore, altered patterns of facultative escape genes in different neural progenitor clones are associated with the presence of different TAD-like structures after XCI. These findings suggest a key role for transcription and CTCF in the formation of TADs in the context of the Xi chromosome in neural progenitors. PMID:27437574
The Mitochondrial Genome of the Prasinophyte Prasinoderma coloniale Reveals Two Trans-Spliced Group I Introns in the Large Subunit rRNA Gene

PubMed Central

Pombert, Jean-François; Otis, Christian; Turmel, Monique; Lemieux, Claude

2013-01-01

Organelle genes are often interrupted by group I and/or group II introns. Splicing of these mobile genetic elements occurs at the RNA level via serial transesterification steps catalyzed by the introns' own tertiary structures and, sometimes, with the help of external factors. These catalytic ribozymes can be found in cis or trans configuration, and although trans-arrayed group II introns have been known for decades, trans-spliced group I introns have been reported only recently. In the course of sequencing the complete mitochondrial genome of the prasinophyte picoplanktonic green alga Prasinoderma coloniale CCMP 1220 (Prasinococcales, clade VI), we uncovered two additional cases of trans-spliced group I introns. Here, we describe these introns and compare the 54,546 bp-long mitochondrial genome of Prasinoderma with those of four other prasinophytes (clades II, III and V). This comparison underscores the highly variable mitochondrial genome architecture in these ancient chlorophyte lineages. Both Prasinoderma trans-spliced introns reside within the large subunit rRNA gene (rnl) at positions where cis-spliced relatives, often containing homing endonuclease genes, have been found in other organelles. In contrast, all previously reported trans-spliced group I introns occur in different mitochondrial genes (rns or coxI). Each Prasinoderma intron is fragmented into two pieces, forming at the RNA level a secondary structure that resembles those of its cis-spliced counterparts. As observed for other trans-spliced group I introns, the breakpoint of the first intron maps to the variable loop L8, whereas that of the second is uniquely located downstream of P9.1. The breakpoint in each Prasinoderma intron corresponds to the same region where the open reading frame (ORF) occurs when present in cis-spliced orthologs. 
This correlation between the intron breakpoint and the ORF location in cis-spliced orthologs also holds for other trans-spliced introns; we discuss the possible implications of this interesting observation for trans-splicing of group I introns. PMID:24386369
IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

PubMed

Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

2015-07-14

In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.

Comparative genomics reveals insights into avian genome evolution and adaptation.

PubMed

Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

2014-12-12

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.
Penna model from the perspective of one geneticist

NASA Astrophysics Data System (ADS)

Cebrat, Stanisław

1998-09-01

Penna model of ageing predicts many phenomena in population dynamics. Since the model assumes that all genes in genomes are switched on chronologically and that there are no structural differences between male and female genomes, it cannot explain genetic death before birth and differences in mortality rates of men and women. I suggest adding the set of housekeeping genes, which are switched on during the embryo development, to the “death genes” of Penna model. Taking into account the large fraction of genes located on X chromosome whose deleterious mutations exert dominant effect on the male phenotype and recessive on the female phenotype would make it possible to avoid introducing somatic mutations as a cause of higher mortality of men. The modelling of linkage disequilibrium and its implications on eugenics have also been suggested.
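The chronological switching-on of genes that the abstract refers to can be illustrated with a minimal sketch. It assumes the common bit-string formulation of the Penna model, in which each genome position corresponds to one age step and an individual dies once the number of active deleterious mutations reaches a threshold. The function name `age_at_death` and its parameters are hypothetical and are not taken from the record above.

```python
def age_at_death(genome, threshold):
    """Return the age at which an individual dies in a bit-string ageing model.

    genome:    sequence of 0/1 values; a 1 at index i marks a deleterious
               mutation that becomes active when the individual reaches
               age i + 1 (genes are switched on chronologically).
    threshold: number of active deleterious mutations that causes death
               (an assumed model parameter).
    Returns None if the threshold is never reached within the genome length.
    """
    active = 0
    for age, bit in enumerate(genome, start=1):
        active += bit  # the gene assigned to this age is switched on
        if active >= threshold:
            return age
    return None


# Example: with a threshold of 3, the third active mutation appears at age 5.
print(age_at_death([0, 1, 0, 1, 1, 0], threshold=3))
```

In this sketch no gene acts before age 1, so death before birth cannot occur. That is the limitation the abstract proposes to address by adding housekeeping genes that are switched on during embryo development.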
Dehydration responsive element binding transcription factors and their applications for the engineering of stress tolerance.

PubMed

Agarwal, Pradeep K; Gupta, Kapil; Lopato, Sergiy; Agarwal, Parinita

2017-04-01

Dehydration responsive element binding (DREB) factors or CRT element binding factors (CBFs) are members of the AP2/ERF family, which comprises a large number of stress-responsive regulatory genes. This review traverses almost two decades of research, from the discovery of DREB/CBF factors to their optimization for application in plant biotechnology. In this review, we describe (i) the discovery, classification, structure, and evolution of DREB genes and proteins; (ii) induction of DREB genes by abiotic stresses and involvement of their products in stress responses; (iii) protein structure and DNA binding selectivity of different groups of DREB proteins; (iv) post-transcriptional and post-translational mechanisms of DREB transcription factor (TF) regulation; and (v) physical and/or functional interaction of DREB TFs with other proteins during plant stress responses. We also discuss existing issues in applications of DREB TFs for engineering of enhanced stress tolerance and improved performance under stress of transgenic crop plants. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Genetic structuring of northern myotis (Myotis septentrionalis) at multiple spatial scales

USGS Publications Warehouse

Johnson, Joshua B.; Roberts, James H.; King, Timothy L.; Edwards, John W.; Ford, W. Mark; Ray, David A.

2014-01-01

Although groups of bats may be genetically distinguishable at large spatial scales, the effects of forest disturbances, particularly permanent land use conversions, on fine-scale population structure and gene flow of summer aggregations of philopatric bat species are less clear. We genotyped and analyzed variation at 10 nuclear DNA microsatellite markers in 182 individuals of the forest-dwelling northern myotis (Myotis septentrionalis) at multiple spatial scales, from within first-order watersheds scaling up to larger regional areas in West Virginia and New York. Our results indicate that groups of northern myotis were genetically indistinguishable at any spatial scale we considered, and the collective population maintained high genetic diversity. It is likely that the ability to migrate, exploit small forest patches, and use networks of mating sites located throughout the Appalachian Mountains, Interior Highlands, and elsewhere in the hibernation range have allowed northern myotis to maintain high genetic diversity and gene flow regardless of forest disturbances at local and regional spatial scales. A consequence of maintaining high gene flow might be the potential to minimize genetic founder effects following population declines caused currently by the enzootic White-nose Syndrome.
Functional Analysis of the Brassica napus L. Phytoene Synthase (PSY) Gene Family

PubMed Central

López-Emparán, Ada; Quezada-Martinez, Daniela; Zúñiga-Bustos, Matías; Cifuentes, Víctor; Iñiguez-Luy, Federico; Federico, María Laura

2014-01-01

Phytoene synthase (PSY) has been shown to catalyze the first committed and rate-limiting step of carotenogenesis in several crop species, including Brassica napus L. Due to its pivotal role, PSY has been a prime target for breeding and metabolic engineering the carotenoid content of seeds, tubers, fruits and flowers. In Arabidopsis thaliana, PSY is encoded by a single copy gene but small PSY gene families have been described in monocot and dicotyledonous species. We have recently shown that PSY genes have been retained in a triplicated state in the A- and C-Brassica genomes, with each paralogue mapping to syntenic locations in each of the three “Arabidopsis-like” subgenomes. Most importantly, we have shown that in B. napus all six members are expressed, exhibiting overlapping redundancy and signs of subfunctionalization among photosynthetic and non photosynthetic tissues. The question of whether this large PSY family actually encodes six functional enzymes remained to be answered. Therefore, the objectives of this study were to: (i) isolate, characterize and compare the complete protein coding sequences (CDS) of the six B. napus PSY genes; (ii) model their predicted tridimensional enzyme structures; (iii) test their phytoene synthase activity in a heterologous complementation system and (iv) evaluate their individual expression patterns during seed development. This study further confirmed that the six B. napus PSY genes encode proteins with high sequence identity, which have evolved under functional constraint. Structural modeling demonstrated that they share similar tridimensional protein structures with a putative PSY active site. Significantly, all six B. napus PSY enzymes were found to be functional. 
Taking into account the specific patterns of expression exhibited by these PSY genes during seed development and recent knowledge of PSY suborganellar localization, the selection of transgene candidates for metabolic engineering the carotenoid content of oilseeds is discussed. PMID:25506829
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

PubMed

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-07-20

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Ontology based molecular signatures for immune cell types via gene expression analysis

PubMed Central

2013-01-01

Background: New technologies are focusing on characterizing cell types to better understand their heterogeneity. With large volumes of cellular data being generated, innovative methods are needed to structure the resulting data analyses. Here, we describe an ‘Ontologically BAsed Molecular Signature’ (OBAMS) method that identifies novel cellular biomarkers and infers biological functions as characteristics of particular cell types. This method finds molecular signatures for immune cell types based on mapping biological samples to the Cell Ontology (CL) and navigating the space of all possible pairwise comparisons between cell types to find genes whose expression is core to a particular cell type’s identity. Results: We illustrate this ontological approach by evaluating expression data available from the Immunological Genome project (IGP) to identify unique biomarkers of mature B cell subtypes. We find that using OBAMS, candidate biomarkers can be identified at every stratum of cellular identity, from broad classifications to very granular. Furthermore, we show that Gene Ontology can be used to cluster cell types by shared biological processes in order to find candidate genes responsible for somatic hypermutation in germinal center B cells. Moreover, through in silico experiments based on this approach, we have identified gene sets that represent genes overexpressed in germinal center B cells and identify genes uniquely expressed in these B cells compared to other B cell types. Conclusions: This work demonstrates the utility of incorporating structured ontological knowledge into biological data analysis – providing a new method for defining novel biomarkers and providing an opportunity for new biological insights. PMID:24004649
Examination of Csr regulatory circuitry using epistasis analysis with RNA-seq (Epi-seq) confirms that CsrD affects gene expression via CsrA, CsrB and CsrC.

PubMed

Potts, Anastasia H; Leng, Yuanyuan; Babitzke, Paul; Romeo, Tony

2018-03-29

The Csr global regulatory system coordinates gene expression in response to metabolic status. This system utilizes the RNA binding protein CsrA to regulate gene expression by binding to transcripts of structural and regulatory genes, thus affecting their structure, stability, translation, and/or transcription elongation. CsrA activity is controlled by sRNAs, CsrB and CsrC, which sequester CsrA away from other transcripts. CsrB/C levels are partly determined by their rates of turnover, which requires CsrD to render them susceptible to RNase E cleavage. Previous epistasis analysis suggested that CsrD affects gene expression through the other Csr components, CsrB/C and CsrA. However, those conclusions were based on a limited analysis of reporters. Here, we reassessed the global behavior of the Csr circuitry using epistasis analysis with RNA-seq (Epi-seq). Because CsrD effects on mRNA levels were entirely lost in the csrA mutant and largely eliminated in a csrB/C mutant under our experimental conditions, while the majority of CsrA effects persisted in the absence of csrD, the original model accounts for the global behavior of the Csr system. Our present results also reflect a more nuanced role of CsrA as terminal regulator of the Csr system than has been recognized.
Speciation with gene flow in whiptail lizards from a Neotropical xeric biome.

PubMed

Oliveira, Eliana F; Gehara, Marcelo; São-Pedro, Vinícius A; Chen, Xin; Myers, Edward A; Burbrink, Frank T; Mesquita, Daniel O; Garda, Adrian A; Colli, Guarino R; Rodrigues, Miguel T; Arias, Federico J; Zaher, Hussam; Santos, Rodrigo M L; Costa, Gabriel C

2015-12-01

Two main hypotheses have been proposed to explain the diversification of the Caatinga biota. The riverine barrier hypothesis (RBH) claims that the São Francisco River (SFR) is a major biogeographic barrier to gene flow. The Pleistocene climatic fluctuation hypothesis (PCH) states that gene flow, geographic genetic structure and demographic signatures on endemic Caatinga taxa were influenced by Quaternary climate fluctuation cycles. Herein, we analyse genetic diversity and structure, phylogeographic history, and diversification of a widespread Caatinga lizard (Cnemidophorus ocellifer) based on large geographical sampling for multiple loci to test the predictions derived from the RBH and PCH. We inferred two well-delimited lineages (Northeast and Southwest) that have diverged along the Cerrado-Caatinga border during the Mid-Late Miocene (6-14 Ma) despite the presence of gene flow. We reject both major hypotheses proposed to explain diversification in the Caatinga. Surprisingly, our results revealed a striking complex diversification pattern where the Northeast lineage originated as a founder effect from a few individuals located along the edge of the Southwest lineage that eventually expanded throughout the Caatinga. The Southwest lineage is more diverse, older and associated with the Cerrado-Caatinga boundaries. Finally, we suggest that C. ocellifer from the Caatinga is composed of two distinct species. Our data support speciation in the presence of gene flow and highlight the role of environmental gradients in the diversification process. © 2015 John Wiley & Sons Ltd.
The Epidermis of Grhl3-Null Mice Displays Altered Lipid Processing and Cellular Hyperproliferation

PubMed Central

Ting, Stephen B; Caddy, Jacinta; Wilanowski, Tomasz; Auden, Alana; Cunningham, John M; Elias, Peter M; Holleran, Walter M

2005-01-01

The presence of an impermeable surface barrier is an essential homeostatic mechanism in almost all living organisms. We have recently described a novel gene that is critical for the developmental instruction and repair of the integument in mammals. This gene, Grainy head-like 3 (Grhl3) is a member of a large family of transcription factors that are homologs of the Drosophila developmental gene grainy head (grh). Mice lacking Grhl3 fail to form an adequate skin barrier, and die at birth due to dehydration. These animals are also unable to repair the epidermis, exhibiting failed wound healing in both fetal and adult stages of development. These defects are due, in part, to diminished expression of a Grhl3 target gene, Transglutaminase 1 (TGase 1), which encodes a key enzyme involved in cross-linking of epidermal structural proteins and lipids into the cornified envelope (CE). Remarkably, the Drosophila grh gene plays an analogous role, regulating enzymes involved in the generation of quinones, which are essential for cross-linking structural components of the fly epidermis. In an extension of our initial analyses, we focus this report on additional defects observed in the Grhl3-null epidermis, namely defective extra-cellular lipid processing, altered lamellar lipid architecture and cellular hyperproliferation. These abnormalities suggest that Grhl3 plays diverse mechanistic roles in maintaining homeostasis in the skin. PMID:19521564
The epidermis of grhl3-null mice displays altered lipid processing and cellular hyperproliferation.

PubMed

Ting, Stephen B; Caddy, Jacinta; Wilanowski, Tomasz; Auden, Alana; Cunningham, John M; Elias, Peter M; Holleran, Walter M; Jane, Stephen M

2005-04-01

The presence of an impermeable surface barrier is an essential homeostatic mechanism in almost all living organisms. We have recently described a novel gene that is critical for the developmental instruction and repair of the integument in mammals. This gene, Grainy head-like 3 (Grhl3) is a member of a large family of transcription factors that are homologs of the Drosophila developmental gene grainy head (grh). Mice lacking Grhl3 fail to form an adequate skin barrier, and die at birth due to dehydration. These animals are also unable to repair the epidermis, exhibiting failed wound healing in both fetal and adult stages of development. These defects are due, in part, to diminished expression of a Grhl3 target gene, Transglutaminase 1 (TGase 1), which encodes a key enzyme involved in cross-linking of epidermal structural proteins and lipids into the cornified envelope (CE). Remarkably, the Drosophila grh gene plays an analogous role, regulating enzymes involved in the generation of quinones, which are essential for cross-linking structural components of the fly epidermis. In an extension of our initial analyses, we focus this report on additional defects observed in the Grhl3-null epidermis, namely defective extra-cellular lipid processing, altered lamellar lipid architecture and cellular hyperproliferation. These abnormalities suggest that Grhl3 plays diverse mechanistic roles in maintaining homeostasis in the skin.
The protocadherin 17 gene affects cognition, personality, amygdala structure and function, synapse development and risk of major mood disorders.

PubMed

Chang, H; Hoshina, N; Zhang, C; Ma, Y; Cao, H; Wang, Y; Wu, D-D; Bergen, S E; Landén, M; Hultman, C M; Preisig, M; Kutalik, Z; Castelao, E; Grigoroiu-Serbanescu, M; Forstner, A J; Strohmaier, J; Hecker, J; Schulze, T G; Müller-Myhsok, B; Reif, A; Mitchell, P B; Martin, N G; Schofield, P R; Cichon, S; Nöthen, M M; Walter, H; Erk, S; Heinz, A; Amin, N; van Duijn, C M; Meyer-Lindenberg, A; Tost, H; Xiao, X; Yamamoto, T; Rietschel, M; Li, M

2018-02-01

Major mood disorders, which primarily include bipolar disorder and major depressive disorder, are the leading cause of disability worldwide and pose a major challenge in identifying robust risk genes. Here, we present data from independent large-scale clinical data sets (including 29 557 cases and 32 056 controls) revealing brain expressed protocadherin 17 (PCDH17) as a susceptibility gene for major mood disorders. Single-nucleotide polymorphisms (SNPs) spanning the PCDH17 region are significantly associated with major mood disorders; subjects carrying the risk allele showed impaired cognitive abilities, increased vulnerable personality features, decreased amygdala volume and altered amygdala function as compared with non-carriers. The risk allele predicted higher transcriptional levels of PCDH17 mRNA in postmortem brain samples, which is consistent with increased gene expression in patients with bipolar disorder compared with healthy subjects. Further, overexpression of PCDH17 in primary cortical neurons revealed significantly decreased spine density and abnormal dendritic morphology compared with control groups, which again is consistent with the clinical observations of reduced numbers of dendritic spines in the brains of patients with major mood disorders. Given that synaptic spines are dynamic structures which regulate neuronal plasticity and have crucial roles in myriad brain functions, this study reveals a potential underlying biological mechanism of a novel risk gene for major mood disorders involved in synaptic function and related intermediate phenotypes.
Developmental Transcriptome of Aplysia californica

PubMed Central

Heyland, Andreas; Vue, Zer; Voolstra, Christian R.; Medina, Mónica; Moroz, Leonid L.

2014-01-01

Genome-wide transcriptional changes in development provide important insight into mechanisms underlying growth, differentiation, and patterning. However, such large-scale developmental studies have been limited to a few representatives of Ecdysozoans and Chordates. Here, we characterize transcriptomes of embryonic, larval, and metamorphic development in the marine mollusc Aplysia californica and reveal novel molecular components associated with life history transitions. Specifically, we identify more than 20 signal peptides, putative hormones, and transcription factors in association with early development and metamorphic stages, many of which seem to be evolutionarily conserved elements of signal transduction pathways. We also characterize genes related to biomineralization, a critical process of molluscan development. In summary, our experiment provides the first large-scale survey of gene expression in mollusc development, and complements previous studies on the regulatory mechanisms underlying body plan patterning and the formation of larval and juvenile structures. This study serves as a resource for further functional annotation of transcripts and genes in Aplysia specifically, and in molluscs in general. A comparison of the Aplysia developmental transcriptome with similar studies in the zebrafish Danio rerio, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and other studies on molluscs suggests an overall highly divergent pattern of gene regulatory mechanisms that are likely a consequence of the different developmental modes of these organisms. PMID:21328528
The zebrafish reference genome sequence and its relationship to the human genome.

PubMed

Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

2013-04-25

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
The zebrafish reference genome sequence and its relationship to the human genome

PubMed Central

Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, 
Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

2013-01-01

Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743
Molecular evolution of the plastid genome during diversification of the cotton genus.

PubMed

Chen, Zhiwen; Grover, Corrinne E; Li, Pengbo; Wang, Yumei; Nie, Hushuai; Zhao, Yanpeng; Wang, Meiyan; Liu, Fang; Zhou, Zhongli; Wang, Xingxing; Cai, Xiaoyan; Wang, Kunbo; Wendel, Jonathan F; Hua, Jinping

2017-07-01

Cotton (Gossypium spp.) is commonly grouped into eight diploid genomic groups, designated A-G and K, and one tetraploid genomic group, namely AD. To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome during diversification, chloroplast genomes (cpDNA) from 6 D-genome and 2 G-genome species of Gossypium (G. armourianum D2-1, G. harknessii D2-2, G. davidsonii D3-d, G. klotzschianum D3-k, G. aridum D4, G. trilobum D8, and G. australe G2, G. nelsonii G3) are newly reported here. In combination with the 26 previously released cpDNA sequences, we performed comparative phylogenetic analyses of 34 Gossypium chloroplast genomes that collectively represent most of the diversity in the genus. Gossypium chloroplasts span a small range in size that is mostly attributable to indels that occur in the large single copy (LSC) region of the genome. Phylogenetic analysis using a concatenation of all genes provides robust support for six major Gossypium clades, largely supporting earlier inferences but also revealing new information on intrageneric relationships. Using Theobroma cacao as an outgroup, diversification of the genus was dated, yielding results that are in accord with previous estimates of divergence times, but also offering new perspectives on the basal, early radiation of all major clades within the genus as well as gaps in the record indicative of extinctions. Like most higher-plant chloroplast genomes, all cotton species exhibit a conserved quadripartite structure, i.e., two large inverted repeats (IR) containing most of the ribosomal RNA genes, and two unique regions, LSC (large single copy) and SSC (small single copy). Within Gossypium, the IR-single copy region junctions are both variable and homoplasious among species. Two genes, accD and psaJ, exhibited greater rates of synonymous and non-synonymous substitutions than did other genes. Most genes exhibited Ka/Ks ratios suggestive of neutral evolution, with 8 exceptions distributed among one to several species. This research provides an overview of the molecular evolution of a single, large non-recombining molecule during the diversification of this important genus. Copyright © 2017 Elsevier Inc. All rights reserved.
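
The Ka/Ks reading used in the abstract above can be illustrated with a minimal sketch. Ka is the non-synonymous substitution rate and Ks the synonymous rate; a ratio near 1 is conventionally read as neutral evolution, below 1 as purifying selection, and above 1 as positive selection. The function names, the tolerance band of 0.2 around 1, and the input values are assumptions made for illustration only and are not taken from the study, which would normally rely on a formal statistical test rather than a fixed band.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; Ks must be positive for the ratio to be defined."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def classify_selection(ka: float, ks: float, band: float = 0.2) -> str:
    """Label a gene by its Ka/Ks ratio.

    `band` is a hypothetical tolerance around 1, chosen only for
    illustration; it is not a published threshold.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio > 1 + band:
        return "positive"
    if ratio < 1 - band:
        return "purifying"
    return "neutral"


# Invented example values, not data from the paper.
for gene, ka, ks in [("geneA", 0.02, 0.10), ("geneB", 0.10, 0.10), ("geneC", 0.30, 0.10)]:
    print(gene, round(ka_ks_ratio(ka, ks), 2), classify_selection(ka, ks))
```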
[Association of polymorphisms in toll-like receptor genes with atopic dermatitis in the Republic of Bashkortostan].

PubMed

Gimalova, G F; Karunas, A S; Fedorova, Iu Iu; Gumennaia, É R; Levasheva, S V; Khismatullina, Z R; Prans, E; Koks, S; Étkina, É I; Khusnutdinova, É K

2014-01-01

Atopic dermatitis (AD) is a prevalent chronic inflammatory skin disease that develops through the interaction of genetic predisposition and environmental factors. Polymorphisms in genes encoding pattern-recognition receptors (PRRs), which recognize conserved molecular structures (patterns) characteristic of large groups of pathogens, play a considerable role in the development of allergic diseases. In this study, polymorphic variants of PRR genes, namely Toll-like receptors (TLR1, TLR2, TLR4, TLR5, TLR6, TLR9, TLR10), NOD-like receptors (NOD1, NOD2) and the lipopolysaccharide receptor CD14 gene, as well as the C11orf30 and LRRC32 genes located in the 11q13.5 region, were investigated in AD patients and control subjects from the Republic of Bashkortostan. An association of TLR1 (rs5743571 and rs5743604), TLR6 (rs5743794) and TLR10 (rs11466617) with AD was found. Our results confirm an important role of the innate immune system in the pathogenesis of AD and the significance of polymorphisms within the Toll-like receptor 2 subfamily genes in AD development.
A new approach to enhance the performance of decision tree for classifying gene expression data.

PubMed

Hassan, Md; Kotagiri, Ramamohanarao

2013-12-20

Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision trees are a popular machine learning approach to such classification problems. However, existing decision tree algorithms use a single gene feature at each node to split the data into child nodes, and hence may perform poorly, especially when classifying gene expression datasets. By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristic curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
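
The idea of scoring a composite feature by AUC can be sketched as follows. This is not the authors' algorithm: the weights, the toy expression values and the function names are assumptions chosen for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) definition. The sketch only shows how a linear combination of two genes can separate classes that neither gene separates on its own.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def composite(samples: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Linear combination of gene values for each sample."""
    return [sum(w * x for w, x in zip(weights, row)) for row in samples]


# Toy data (invented): each row holds (gene A, gene B) for one sample.
expression = [(2, 1), (4, 3), (6, 5), (1, 2), (3, 4), (5, 6)]
labels = [1, 1, 1, 0, 0, 0]

gene_a = [row[0] for row in expression]
gene_b = [row[1] for row in expression]
combined = composite(expression, [1.0, -1.0])  # hypothetical weights: A minus B

print("AUC gene A   :", round(auc(gene_a, labels), 3))
print("AUC gene B   :", round(auc(gene_b, labels), 3))
print("AUC composite:", round(auc(combined, labels), 3))
```

In a tree built this way, a node would presumably keep the candidate feature (single or composite) with the best AUC, but the selection and weighting procedure itself is specific to the paper and is not reproduced here.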
Gene selection heuristic algorithm for nutrigenomics studies.

PubMed

Valour, D; Hue, I; Grimard, B; Valour, B

2013-07-15

Large datasets from -omics studies need to be deeply investigated. The aim of this paper is to provide a new method (LEM method) for the search of transcriptome and metabolome connections. The heuristic algorithm described here extends classical canonical correlation analysis (CCA) to a high number of variables (without regularization) and combines good conditioning with fast computation in R. Reduced CCA models are summarized in PageRank matrices, the product of which gives a stochastic matrix that summarizes the self-avoiding walk covered by the algorithm. Then, a homogeneous Markov process applied to this stochastic matrix makes the probabilities of interconnection between genes converge, providing a selection of disjoint subsets of genes. This is an alternative to regularized generalized CCA for the determination of blocks within the structure matrix. Each gene subset is thus linked to the whole metabolic or clinical dataset that represents the biological phenotype of interest. Moreover, this selection process meets the needs of biologists, who often require small sets of genes for further validation or extended phenotyping. The algorithm is shown to work efficiently on three published datasets, resulting in meaningfully broadened gene networks.
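
Two generic facts behind the description above can be checked with a small sketch: the product of row-stochastic matrices is itself row-stochastic, and repeatedly applying such a matrix as a homogeneous Markov chain drives the state probabilities to a fixed distribution. This is not the LEM method; the two 2x2 matrices and all function names are invented for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def is_row_stochastic(m: Matrix, tol: float = 1e-9) -> bool:
    """True if all entries are non-negative and every row sums to 1."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) < tol for row in m)


def stationary(p: Matrix, max_iter: int = 10_000, tol: float = 1e-12) -> List[float]:
    """Iterate dist <- dist * P from a uniform start until it stops changing."""
    n = len(p)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(new, dist)) < tol:
            return new
        dist = new
    return dist


# Two invented row-stochastic matrices standing in for per-model matrices.
p1 = [[0.9, 0.1], [0.5, 0.5]]
p2 = [[0.8, 0.2], [0.2, 0.8]]

product = matmul(p1, p2)
pi = stationary(product)
print("product:", product)
print("still row-stochastic:", is_row_stochastic(product))
print("limiting probabilities:", [round(x, 4) for x in pi])
```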
SoFoCles: feature filtering for microarray classification based on gene ontology.

PubMed

Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

2010-02-01

Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.

The impact of the metabotropic glutamate receptor and other gene family interaction networks on autism

PubMed Central

Hadley, Dexter; Wu, Zhi-liang; Kao, Charlly; Kini, Akshata; Mohamed-Hadley, Alisha; Thomas, Kelly; Vazquez, Lyam; Qiu, Haijun; Mentch, Frank; Pellegrino, Renata; Kim, Cecilia; Connolly, John; Pinto, Dalila; Merikangas, Alison; Klei, Lambertus; Vorstman, Jacob A.S.; Thompson, Ann; Regan, Regina; Pagnamenta, Alistair T.; Oliveira, Bárbara; Magalhaes, Tiago R.; Gilbert, John; Duketis, Eftichia; De Jonge, Maretha V.; Cuccaro, Michael; Correia, Catarina T.; Conroy, Judith; Conceição, Inês C.; Chiocchetti, Andreas G.; Casey, Jillian P.; Bolshakova, Nadia; Bacchelli, Elena; Anney, Richard; Zwaigenbaum, Lonnie; Wittemeyer, Kerstin; Wallace, Simon; Engeland, Herman van; Soorya, Latha; Rogé, Bernadette; Roberts, Wendy; Poustka, Fritz; Mouga, Susana; Minshew, Nancy; McGrew, Susan G.; Lord, Catherine; Leboyer, Marion; Le Couteur, Ann S.; Kolevzon, Alexander; Jacob, Suma; Guter, Stephen; Green, Jonathan; Green, Andrew; Gillberg, Christopher; Fernandez, Bridget A.; Duque, Frederico; Delorme, Richard; Dawson, Geraldine; Café, Cátia; Brennan, Sean; Bourgeron, Thomas; Bolton, Patrick F.; Bölte, Sven; Bernier, Raphael; Baird, Gillian; Bailey, Anthony J.; Anagnostou, Evdokia; Almeida, Joana; Wijsman, Ellen M.; Vieland, Veronica J.; Vicente, Astrid M.; Schellenberg, Gerard D.; Pericak-Vance, Margaret; Paterson, Andrew D.; Parr, Jeremy R.; Oliveira, Guiomar; Almeida, Joana; Café, Cátia; Mouga, Susana; Correia, Catarina; Nurnberger, John I.; Monaco, Anthony P.; Maestrini, Elena; Klauck, Sabine M.; Hakonarson, Hakon; Haines, Jonathan L.; Geschwind, Daniel H.; Freitag, Christine M.; Folstein, Susan E.; Ennis, Sean; Coon, Hilary; Battaglia, Agatino; Szatmari, Peter; Sutcliffe, James S.; Hallmayer, Joachim; Gill, Michael; Cook, Edwin H.; Buxbaum, Joseph D.; Devlin, Bernie; Gallagher, Louise; Betancur, Catalina; Scherer, Stephen W.; Glessner, Joseph; Hakonarson, Hakon

2014-01-01

Although multiple reports show that defective genetic networks underlie the aetiology of autism, few have translated into pharmacotherapeutic opportunities. Since drugs compete with endogenous small molecules for protein binding, many successful drugs target large gene families with multiple drug binding sites. Here we search for defective gene family interaction networks (GFINs) in 6,742 patients with the ASDs relative to 12,544 neurologically normal controls, to find potentially druggable genetic targets. We find significant enrichment of structural defects (P≤2.40E−09, 1.8-fold enrichment) in the metabotropic glutamate receptor (GRM) GFIN, previously observed to impact attention deficit hyperactivity disorder (ADHD) and schizophrenia. Also, the MXD-MYC-MAX network of genes, previously implicated in cancer, is significantly enriched (P≤3.83E−23, 2.5-fold enrichment), as is the calmodulin 1 (CALM1) gene interaction network (P≤4.16E−04, 14.4-fold enrichment), which regulates voltage-independent calcium-activated action potentials at the neuronal synapse. We find that multiple defective gene family interactions underlie autism, presenting new translational opportunities to explore for therapeutic interventions. PMID:24927284
Circumpolar Genetic Structure and Recent Gene Flow of Polar Bears: A Reanalysis.

PubMed

Malenfant, René M; Davis, Corey S; Cullingham, Catherine I; Coltman, David W

2016-01-01

Recently, an extensive study of 2,748 polar bears (Ursus maritimus) from across their circumpolar range was published in PLOS ONE, which used microsatellites and mitochondrial haplotypes to apparently show altered population structure and a dramatic change in directional gene flow towards the Canadian Archipelago, an area believed to be a future refugium for polar bears as their southernmost habitats decline under climate change. Although this study represents a major international collaborative effort and promised to be a baseline for future genetics work, methodological shortcomings and errors of interpretation undermine some of the study's main conclusions. Here, we present a reanalysis of these data in which we address some of these issues, including: (1) highly unbalanced sample sizes and large amounts of systematically missing data; (2) incorrect calculation of FST and of significance levels; (3) misleading estimates of recent gene flow resulting from non-convergence of the program BayesAss. In contrast to the original findings, in our reanalysis we find six genetic clusters of polar bears worldwide: the Hudson Bay Complex, the Western and Eastern Canadian Arctic Archipelago, the Western and Eastern Polar Basin, and, importantly, we reconfirm the presence of a unique and possibly endangered cluster of bears in Norwegian Bay near Canada's expected last sea-ice refugium. Although polar bears' abundance, distribution, and population structure will certainly be negatively affected by the ongoing, and increasingly rapid, loss of Arctic sea ice, these genetic data provide no evidence of strong directional gene flow in response to recent climate change.
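
Since the reanalysis above turns partly on how FST is calculated, a minimal sketch of the textbook definition may help: FST = (HT - HS) / HT, where HS is the mean expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch assumes a single biallelic locus and equally sized subpopulations, which is a simplification; it is not the estimator used in either polar bear study, and the allele frequencies are invented.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(freqs: Sequence[float]) -> float:
    """Wright's FST from per-subpopulation frequencies of one allele.

    Assumes equal subpopulation sizes (a simplifying assumption).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0:
        return 0.0  # allele fixed everywhere: no variation to partition
    return (h_t - h_s) / h_t


# Invented allele frequencies for two subpopulations.
print(round(fst([0.2, 0.8]), 3))  # strongly differentiated
print(round(fst([0.5, 0.5]), 3))  # identical frequencies
```

With unbalanced sample sizes, as the reanalysis notes for the original dataset, an unweighted calculation of this kind can be misleading, which is one reason sample-size-aware estimators are generally preferred.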
Genetic structure of the date palm (Phoenix dactylifera) in the Old World reveals a strong differentiation between eastern and western populations

PubMed Central

Zehdi-Azouzi, Salwa; Cherif, Emira; Moussouni, Souhila; Gros-Balthazard, Muriel; Abbas Naqvi, Summar; Ludeña, Bertha; Castillo, Karina; Chabrillange, Nathalie; Bouguedoura, Nadia; Bennaceur, Malika; Si-Dehbi, Farida; Abdoulkader, Sabira; Daher, Abdourahman; Terral, Jean-Frederic; Santoni, Sylvain; Ballardini, Marco; Mercuri, Antonio; Ben Salah, Mohamed; Kadri, Karim; Othmani, Ahmed; Littardi, Claudio; Salhi-Hannachi, Amel; Pintaud, Jean-Christophe; Aberlenc-Bertossi, Frédérique

2015-01-01

Background and Aims: Date palms (Phoenix dactylifera, Arecaceae) are of great economic and ecological value to the oasis agriculture of arid and semi-arid areas. However, despite the availability of a large date palm germplasm spreading from the Atlantic shores to Southern Asia, improvement of the species is being hampered by a lack of information on global genetic diversity and population structure. In order to contribute to the varietal improvement of date palms and to provide new insights on the influence of geographic origins and human activity on the genetic structure of the date palm, this study analysed the diversity of the species. Methods: Genetic diversity levels and population genetic structure were investigated through the genotyping of a collection of 295 date palm accessions ranging from Mauritania to Pakistan using a set of 18 simple sequence repeat (SSR) markers and a plastid minisatellite. Key Results: Using a Bayesian clustering approach, the date palm genotypes can be structured into two different gene pools: the first, termed the Eastern pool, consists of accessions from Asia and Djibouti, whilst the second, termed the Western pool, consists of accessions from Africa. These results confirm the existence of two ancient gene pools that have contributed to the current date palm diversity. The presence of admixed genotypes is also noted, which points at gene flows between eastern and western origins, mostly from east to west, following a human-mediated diffusion of the species. Conclusions: This study assesses the distribution and level of genetic diversity of accessible date palm resources, provides new insights on the geographic origins and genetic history of the cultivated component of this species, and confirms the existence of at least two domestication origins. Furthermore, the strong genetic structure clearly established here is a prerequisite for any breeding programme exploiting the effective polymorphism related to each gene pool. PMID:26113618
The polyphenol oxidase gene family in land plants: Lineage-specific duplication and expansion

PubMed Central

2012-01-01

Background: Plant polyphenol oxidases (PPOs) are enzymes that typically use molecular oxygen to oxidize ortho-diphenols to ortho-quinones. These commonly cause browning reactions following tissue damage, and may be important in plant defense. Some PPOs function as hydroxylases or in cross-linking reactions, but in most plants their physiological roles are not known. To better understand the importance of PPOs in the plant kingdom, we surveyed PPO gene families in 25 sequenced genomes from chlorophytes, bryophytes, lycophytes, and flowering plants. The PPO genes were then analyzed in silico for gene structure, phylogenetic relationships, and targeting signals. Results: Many previously uncharacterized PPO genes were uncovered. The moss, Physcomitrella patens, contained 13 PPO genes and Selaginella moellendorffii (spike moss) and Glycine max (soybean) each had 11 genes. Populus trichocarpa (poplar) contained a highly diversified gene family with 11 PPO genes, but several flowering plants had only a single PPO gene. By contrast, no PPO-like sequences were identified in several chlorophyte (green algae) genomes or Arabidopsis (A. lyrata and A. thaliana). We found that many PPOs contained one or two introns often near the 3’ terminus. Furthermore, N-terminal amino acid sequence analysis using ChloroP and TargetP 1.1 predicted that several putative PPOs are synthesized via the secretory pathway, a unique finding as most PPOs are predicted to be chloroplast proteins. Phylogenetic reconstruction of these sequences revealed that large PPO gene repertoires in some species are mostly a consequence of independent bursts of gene duplication, while the lineage leading to Arabidopsis must have lost all PPO genes. Conclusion: Our survey identified PPOs in gene families of varying sizes in all land plants except in the genus Arabidopsis. 
While we found variation in intron numbers and positions, overall PPO gene structure is congruent with the phylogenetic relationships based on primary sequence data. The dynamic nature of this gene family differentiates PPO from other oxidative enzymes, and is consistent with a protein important for a diversity of functions relating to environmental adaptation. PMID:22897796
Identification of mutations, genotype-phenotype correlation and prenatal diagnosis of maple syrup urine disease in Indian patients.

PubMed

Gupta, Deepti; Bijarnia-Mahay, Sunita; Saxena, Renu; Kohli, Sudha; Dua-Puri, Ratna; Verma, Jyotsna; Thomas, E; Shigematsu, Yosuke; Yamaguchi, Seiji; Deb, Roumi; Verma, Ishwar Chander

2015-09-01

Maple syrup urine disease (MSUD) is caused by mutations in the genes BCKDHA, BCKDHB and DBT, which encode the E1α, E1β and E2 subunits of the branched-chain alpha-ketoacid dehydrogenase (BCKDH) enzyme complex. BCKDH participates in the catabolism of the branched-chain amino acids (BCAAs) leucine, isoleucine and valine in the energy production pathway. Deficiency of, or a defect in, the enzyme complex causes accumulation of BCAAs and keto-acids, leading to toxicity. Twenty-four patients with MSUD were enrolled in the study for molecular characterization and genotype-phenotype correlation. Molecular studies were carried out by Sanger sequencing of the 3 genes. Bioinformatics tools were employed to classify novel variations as pathogenic or benign. The predicted effects of novel changes on protein structure were elucidated by 3D modeling. Mutations were detected in 22 of 24 patients (11, 7 and 4 in BCKDHB, BCKDHA and DBT genes, respectively). Twenty mutations including 11 novel mutations were identified. Protein modeling of novel mutations showed alteration of the structure and function of these subunits. The mutations c.1065 delT (BCKDHB gene) and c.939G > C (DBT gene) were recurrent, identified in 6 of 22 alleles and 5 of 8 alleles, respectively. Two-thirds of the patients (16 of 24) had the classical neonatal phenotype. BCKDHB gene mutations were present in 10 of these 16 patients. Prenatal diagnoses were performed in 4 families. Consanguinity was noted in 37.5% of families. Although no obvious genotype-phenotype correlation could be found in our study, most cases with a mutation in the BCKDHB gene presented in the neonatal period. The large number of novel mutations underlines the heterogeneity and distinctness of the gene pool in India. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Rudimentary expression of RYamide in Drosophila melanogaster relative to other Drosophila species points to a functional decline of this neuropeptide gene.

PubMed

Veenstra, Jan A; Khammassi, Hela

2017-04-01

RYamides are arthropod neuropeptides with unknown function. In 2011 two RYamides were isolated from D. melanogaster as the ligands for the G-protein coupled receptor CG5811. The D. melanogaster gene encoding these neuropeptides is highly unusual, as there are four RYamide-encoding exons in the current genome assembly, but an exon encoding a signal peptide is absent. Comparing the D. melanogaster gene structure with those from other species, including D. virilis, suggests that the gene is degenerating. RNAseq data from 1634 short sequence read archives at NCBI containing more than 34 billion spots yielded numerous individual spots that correspond to the RYamide-encoding exons, of which a large number include the intron-exon boundary at the start of this exon. Although 72 different sequences have been spliced onto this RYamide-encoding exon, none codes for the signal peptide of this gene. Thus, the RNAseq data for this gene reveal only noise and no signal. The very small quantities of peptide recovered during isolation and the absence of credible RNAseq data indicate that the gene is expressed at very low levels, while the RYamide gene structure in D. melanogaster suggests that it might be evolving into a pseudogene. Yet, the identification of the peptides it encodes clearly shows it is still functional. Using region-specific antisera, we could localize numerous neurons and enteroendocrine cells in D. willistoni, D. virilis and D. pseudoobscura, but only two adult abdominal neurons in D. melanogaster. Those two neurons project to and innervate the rectal papillae, suggesting that RYamides may be involved in the regulation of water homeostasis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Horizontal Transfer of a Subtilisin Gene from Plants into an Ancestor of the Plant Pathogenic Fungal Genus Colletotrichum

PubMed Central

Armijos Jaramillo, Vinicio Danilo; Vargas, Walter Alberto; Sukno, Serenella Ana; Thon, Michael R.

2013-01-01

The genus Colletotrichum contains a large number of phytopathogenic fungi that cause enormous economic losses around the world. The effect of horizontal gene transfer (HGT) has not yet been studied in these organisms. Inter-Kingdom HGT into fungal genomes has been reported in the past but knowledge about the HGT between plants and fungi is particularly limited. We describe a gene in the genome of several species of the genus Colletotrichum with a strong resemblance to subtilisins typically found in plant genomes. Subtilisins are an important group of serine proteases, widely distributed in all of the kingdoms of life. Our hypothesis is that the gene was acquired by Colletotrichum spp. through HGT from plants to a Colletotrichum ancestor. We provide evidence to support this hypothesis in the form of phylogenetic analyses as well as a characterization of the similarity of the subtilisin at the primary, secondary and tertiary structural levels. The remarkable level of structural conservation of Colletotrichum plant-like subtilisin (CPLS) with plant subtilisins and the differences with the rest of Colletotrichum subtilisins suggest the possibility of molecular mimicry. Our phylogenetic analysis indicates that the HGT event would have occurred approximately 150–155 million years ago, after the divergence of the Colletotrichum lineage from other fungi. Gene expression analysis shows that the gene is modulated during the infection of maize by C. graminicola, suggesting that it has a role in plant disease. Furthermore, the upregulation of the CPLS coincides with the downregulation of several plant genes encoding subtilisins. Based on the known roles of subtilisins in plant pathogenic fungi and the gene expression pattern that we observed, we postulate that the CPLSs have an important role in plant infection. PMID:23554975
Horizontal transfer of a subtilisin gene from plants into an ancestor of the plant pathogenic fungal genus Colletotrichum.

PubMed

Armijos Jaramillo, Vinicio Danilo; Vargas, Walter Alberto; Sukno, Serenella Ana; Thon, Michael R

2013-01-01

The genus Colletotrichum contains a large number of phytopathogenic fungi that cause enormous economic losses around the world. The effect of horizontal gene transfer (HGT) has not yet been studied in these organisms. Inter-Kingdom HGT into fungal genomes has been reported in the past but knowledge about the HGT between plants and fungi is particularly limited. We describe a gene in the genome of several species of the genus Colletotrichum with a strong resemblance to subtilisins typically found in plant genomes. Subtilisins are an important group of serine proteases, widely distributed in all of the kingdoms of life. Our hypothesis is that the gene was acquired by Colletotrichum spp. through HGT from plants to a Colletotrichum ancestor. We provide evidence to support this hypothesis in the form of phylogenetic analyses as well as a characterization of the similarity of the subtilisin at the primary, secondary and tertiary structural levels. The remarkable level of structural conservation of Colletotrichum plant-like subtilisin (CPLS) with plant subtilisins and the differences with the rest of Colletotrichum subtilisins suggest the possibility of molecular mimicry. Our phylogenetic analysis indicates that the HGT event would have occurred approximately 150-155 million years ago, after the divergence of the Colletotrichum lineage from other fungi. Gene expression analysis shows that the gene is modulated during the infection of maize by C. graminicola, suggesting that it has a role in plant disease. Furthermore, the upregulation of the CPLS coincides with the downregulation of several plant genes encoding subtilisins. Based on the known roles of subtilisins in plant pathogenic fungi and the gene expression pattern that we observed, we postulate that the CPLSs have an important role in plant infection.
Genome-Wide Identification of the Alba Gene Family in Plants and Stress-Responsive Expression of the Rice Alba Genes

PubMed Central

Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan

2018-01-01

Architectural proteins play key roles in genome construction and regulate the expression of many genes, although the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionarily relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal locations of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure–function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants. PMID:29597290
Design and Evaluation of Illumina MiSeq-Compatible, 18S rRNA Gene-Specific Primers for Improved Characterization of Mixed Phototrophic Communities.

PubMed

Bradley, Ian M; Pinto, Ameet J; Guest, Jeremy S

2016-10-01

The use of high-throughput sequencing technologies with the 16S rRNA gene for characterization of bacterial and archaeal communities has become routine. However, the adoption of sequencing methods for eukaryotes has been slow, despite their significance to natural and engineered systems. There are large variations among the target genes used for amplicon sequencing, and for the 18S rRNA gene, there is no consensus on which hypervariable region provides the most suitable representation of diversity. Additionally, it is unclear how much PCR/sequencing bias affects the depiction of community structure using current primers. The present study amplified the V4 and V8-V9 regions from seven microalgal mock communities as well as eukaryotic communities from freshwater, coastal, and wastewater samples to examine the effect of PCR/sequencing bias on community structure and membership. We found that degeneracies on the 3' end of the current V4-specific primers impact read length and mean relative abundance. Furthermore, the PCR/sequencing error is markedly higher for GC-rich members than for communities with balanced GC content. Importantly, the V4 region failed to reliably capture 2 of the 12 mock community members, and the V8-V9 hypervariable region more accurately represents mean relative abundance and alpha and beta diversity. Overall, the V4 and V8-V9 regions show similar community representations over freshwater, coastal, and wastewater environments, but specific samples show markedly different communities. These results indicate that multiple primer sets may be advantageous for gaining a more complete understanding of community structure and highlight the importance of including mock communities composed of species of interest. The quantification of error associated with community representation by amplicon sequencing is a critical challenge that is often ignored. 
When target genes are amplified using currently available primers, differential amplification efficiencies result in inaccurate estimates of community structure. The extent to which amplification bias affects community representation and the accuracy with which different gene targets represent community structure are not known. As a result, there is no consensus on which region provides the most suitable representation of diversity for eukaryotes. This study determined the accuracy with which commonly used 18S rRNA gene primer sets represent community structure and identified particular biases related to PCR amplification and Illumina MiSeq sequencing in order to more accurately study eukaryotic microbial communities. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Uptake and transfection with polymeric nanoparticles are dependent on polymer end-group structure, but largely independent of nanoparticle physical and chemical properties

PubMed Central

Sunshine, Joel C.; Peng, Daniel Y.; Green, Jordan J.

2012-01-01

Development of non-viral particles for gene delivery requires a greater understanding of the properties that enable gene delivery particles to overcome the numerous barriers to intracellular DNA delivery. Linear poly(beta-amino) esters (PBAE) have shown substantial promise for gene delivery, but the mechanism behind their effectiveness is not well quantified with respect to these barriers. In this study, we synthesized, characterized, and evaluated for gene delivery an array of linear PBAEs that differed by small changes along the backbone, side chain, and end-group of the polymers. We examined particle size and surface charge, polymer molecular weight, polymer degradation rate, buffering capacity, cellular uptake, transfection, and cytotoxicity of nanoparticles formulated with these polymers. Significantly, this is the first study that has quantified how small differential structural changes to polymers of this class modulate buffering capacity and polymer degradation rate and relates these findings to gene delivery efficacy. All polymers formed positively charged (zeta potential 21–29 mV) nanosized particles (~150 nm). The polymers hydrolytically degraded quickly in physiological conditions, with half-lives ranging from 90 minutes to 6 hours depending on polymer structure. The PBAE buffering capacities in the relevant pH range (pH 5.1–7.4) varied from 34% to 95% protonatable amines, and on a per mass basis, PBAEs buffered 1.4–4.6 mmol H+/g. When compared to 25 kDa branched polyethyleneimine (PEI), PBAEs buffer significantly fewer protons/mass, as PEI buffers 6.2 mmol H+/g over the same range. However, due to the relatively low cytotoxicity of PBAEs, higher polymer mass can be used to form particles than with PEI and total buffering capacity of PBAE-based particles significantly exceeds that of PEI. 
Uptake into COS-7 cells ranged from 0% to 95% of cells and transfection ranged from 0% to 93% of cells, depending on the base polymer structure and the end-modifications examined. Five polymers achieved higher uptake and transfection efficacy with less toxicity than branched-PEI control. Surprisingly, acrylate-terminated base polymers were dramatically less efficacious than their end-capped versions, both in terms of uptake (1–3% for acrylate, 75–94% for end-capped) and transfection efficacy (0–1% vs. 20–89%), even though there are minimal differences between acrylate and end-capped polymers in terms of DNA retardation in gel electrophoresis, particle size, zeta potential, and cytotoxicity. These studies further elucidate the role of polymer structure for gene delivery and highlight that small molecule end-group modification of a linear polymer can be critical for cellular uptake in a manner that is largely independent of polymer/DNA binding, particle size, and particle surface charge. PMID:22970908
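
The buffering comparison in this abstract rests on simple arithmetic: total buffering is the per-mass capacity multiplied by the polymer mass used. The minimal sketch below uses only the per-mass values quoted in the abstract to estimate how much more PBAE mass than PEI mass would be needed for equal total buffering. The function name is hypothetical, and the actual polymer masses used in the study are not stated in the abstract.

```python
# Per-mass buffering capacities quoted in the abstract (mmol H+ per gram).
PEI_CAPACITY = 6.2
PBAE_CAPACITY_RANGE = (1.4, 4.6)


def break_even_mass_ratio(pbae_capacity, pei_capacity=PEI_CAPACITY):
    """Grams of PBAE per gram of PEI that give equal total buffering."""
    return pei_capacity / pbae_capacity


ratios = [break_even_mass_ratio(c) for c in PBAE_CAPACITY_RANGE]

for capacity, ratio in zip(PBAE_CAPACITY_RANGE, ratios):
    print(f"{capacity} mmol H+/g -> {ratio:.2f} g PBAE per g PEI")
```

Under these figures, PBAE-based particles would exceed the total buffering of PEI once the PBAE mass is roughly 1.3 to 4.4 times the PEI mass, depending on the polymer.
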
Very large scale wavefunction orthogonalization in Density Functional Theory electronic structure calculations

NASA Astrophysics Data System (ADS)

Bekas, C.; Curioni, A.

2010-06-01

Enforcing the orthogonality of approximate wavefunctions becomes one of the dominant computational kernels in planewave based Density Functional Theory electronic structure calculations that involve thousands of atoms. In this context, algorithms that enjoy both excellent scalability and single processor performance properties are much needed. In this paper we present block versions of the Gram-Schmidt method and we show that they are excellent candidates for our purposes. We compare the new approach with the state of the art practice in planewave based calculations and find that it has much to offer, especially when applied on massively parallel supercomputers such as the IBM Blue Gene/P Supercomputer. The new method achieves excellent sustained performance that surpasses 73 TFLOPS (67% of peak) on 8 Blue Gene/P racks (32 768 compute cores), while it enables more than a two fold decrease in run time when compared with the best competing methodology.
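
To illustrate the general idea of a block Gram-Schmidt method, the sketch below orthonormalizes a set of vectors one block at a time: each block is first projected against all previously accepted vectors, then orthonormalized internally. This is a generic textbook-style variant in pure Python, not the authors' algorithm. The abstract does not specify their formulation, and the two projection passes and the modified Gram-Schmidt step inside each block are assumptions made for this example.

```python
import math


def dot(u, v):
    """Inner product of two real vectors stored as lists."""
    return sum(a * b for a, b in zip(u, v))


def block_gram_schmidt(columns, block_size):
    """Orthonormalize a list of column vectors, block_size columns at a time."""
    basis = []  # orthonormal vectors accepted so far
    for start in range(0, len(columns), block_size):
        block = [list(c) for c in columns[start:start + block_size]]

        # Inter-block step: remove components along earlier basis vectors.
        # In a production code this is a matrix-matrix product per block.
        # Two passes are used here (an assumption) to limit round-off.
        for _ in range(2):
            for v in block:
                coeffs = [dot(q, v) for q in basis]
                for q, c in zip(basis, coeffs):
                    for i in range(len(v)):
                        v[i] -= c * q[i]

        # Intra-block step: modified Gram-Schmidt within the block.
        local = []
        for v in block:
            for q in local:
                c = dot(q, v)
                for i in range(len(v)):
                    v[i] -= c * q[i]
            norm = math.sqrt(dot(v, v))
            if norm < 1e-12:
                raise ValueError("columns are linearly dependent")
            local.append([x / norm for x in v])
        basis.extend(local)
    return basis


# Small demonstration with four linearly independent vectors.
vectors = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 3.0, 4.0],
]
q = block_gram_schmidt(vectors, block_size=2)

# Largest deviation of the Gram matrix from the identity.
max_error = max(
    abs(dot(q[i], q[j]) - (1.0 if i == j else 0.0))
    for i in range(len(q))
    for j in range(len(q))
)
print(max_error < 1e-9)
```

Grouping vectors into blocks lets most of the work be expressed as dense matrix-matrix products, which is the usual motivation for block variants on parallel machines.
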
What Do We Know About NOD-Like Receptors in Plant Immunity?

PubMed

Zhang, Xiaoxiao; Dodds, Peter N; Bernoux, Maud

2017-08-04

The first plant disease resistance (R) genes were identified and cloned more than two decades ago. Since then, many more R genes have been identified and characterized in numerous plant pathosystems. Most of these encode members of the large family of intracellular NLRs (NOD-like receptors), which also includes animal immune receptors. New discoveries in this expanding field of research provide new elements for our understanding of plant NLR function. But what do we know about plant NLR function today? Genetic, structural, and functional analyses have uncovered a number of commonalities and differences in pathogen recognition strategies as well as how NLRs are regulated and activate defense signaling, but many unknowns remain. This review gives an update on the latest discoveries and breakthroughs in this field, with an emphasis on structural findings and some comparison to animal NLRs, which can provide additional insights and paradigms in plant NLR function.
The Popeye Domain Containing Genes and Their Function as cAMP Effector Proteins in Striated Muscle.

PubMed

Brand, Thomas

2018-03-13

The Popeye domain containing (POPDC) genes encode transmembrane proteins, which are abundantly expressed in striated muscle cells. Hallmarks of the POPDC proteins are the presence of three transmembrane domains and the Popeye domain, which makes up a large part of the cytoplasmic portion of the protein and functions as a cAMP-binding domain. Interestingly, despite the predicted structural similarity between the Popeye domain and other cAMP-binding domains, they differ strongly at the protein sequence level, suggesting an independent evolutionary origin of POPDC proteins. Loss-of-function experiments in zebrafish and mouse established an important role of POPDC proteins for cardiac conduction and heart rate adaptation after stress. Loss-of-function mutations in patients have been associated with limb-girdle muscular dystrophy and AV-block. These data suggest an important role of these proteins in the maintenance of structure and function of striated muscle cells.
New Method for Producing Significant Amounts of RNA Labeled at Specific Sites | Center for Cancer Research

Cancer.gov

Among biomacromolecules, RNA is the most versatile, and it plays indispensable roles in almost all aspects of biology. For example, in addition to serving as mRNAs coding for proteins, RNAs regulate gene expression (controlling where, when, and how efficiently a gene is expressed), participate in RNA processing, encode the genetic information of some viruses, serve as scaffolds, and even possess enzymatic activity. To study these RNAs and their biological functions and to make use of those RNA activities for biomedical applications, researchers first need to make various types of RNA. For structural biologists, incorporating modified or labeled nucleotides at specific sites in RNA molecules of interest is critical to gaining structural insight into RNA functions. However, placing labeled or modified residue(s) in desired positions in a large RNA has not been possible until now.
Structure of the two-domain hexameric APS kinase from Thiobacillus denitrificans: structural basis for the absence of ATP sulfurylase activity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gay, Sean C.; Segel, Irwin H.; Fisher, Andrew J., E-mail: fisher@chem.ucdavis.edu

2009-10-01

APS kinase from Thiobacillus denitrificans contains an inactive N-terminal ATP sulfurylase domain. The structure presented unveils the first hexameric assembly for an APS kinase, and reveals that structural changes in the N-terminal domain disrupt the ATP sulfurylase active site thus prohibiting activity. The Tbd-0210 gene of the chemolithotrophic bacterium Thiobacillus denitrificans is annotated to encode a 60.5 kDa bifunctional enzyme with ATP sulfurylase and APS kinase activity. This putative bifunctional enzyme was cloned, expressed and structurally characterized. The 2.95 Å resolution X-ray crystal structure reported here revealed a hexameric assembly with D3 symmetry. Each subunit contains a large N-terminal sulfurylase-like domain and a C-terminal APS kinase domain reminiscent of the two-domain fungal ATP sulfurylases of Penicillium chrysogenum and Saccharomyces cerevisiae, which also exhibit a hexameric assembly. However, the T. denitrificans enzyme exhibits numerous structural and sequence differences in the N-terminal domain that render it inactive with respect to ATP sulfurylase activity. Surprisingly, the C-terminal domain does indeed display APS kinase activity, indicating that this gene product is a true APS kinase. Therefore, these results provide the first structural insights into a unique hexameric APS kinase that contains a nonfunctional ATP sulfurylase-like domain of unknown function.
Deep sequencing and genome-wide analysis reveals the expansion of MicroRNA genes in the gall midge Mayetiola destructor

PubMed Central

2013-01-01

Background: MicroRNAs (miRNAs) are small non-coding RNAs that play critical roles in regulating post-transcriptional gene expression. Gall midges encompass a large group of insects that are of economic importance and also possess fascinating biological traits. The gall midge Mayetiola destructor, commonly known as the Hessian fly, is a destructive pest of wheat and a model organism for studying gall midge biology and insect–host plant interactions. Results: In this study, we systematically analyzed miRNAs from the Hessian fly. Deep-sequencing a Hessian fly larval transcriptome led to the identification of 89 miRNA species that are either identical or very similar to known miRNAs from other insects, and 184 novel miRNAs that have not been reported from other species. A genome-wide search through a draft Hessian fly genome sequence identified a total of 611 putative miRNA-encoding genes based on sequence similarity and the existence of a stem-loop structure for miRNA precursors. Analysis of the 611 putative genes revealed a striking feature: the dramatic expansion of several miRNA gene families. The largest family contained 91 genes that encoded 20 different miRNAs. Microarray analyses revealed that the expression of miRNA genes was strictly regulated during Hessian fly larval development and that the abundance of many miRNA genes was affected by host genotypes. Conclusion: The identification of a large number of miRNAs for the first time from a gall midge provides a foundation for further studies of miRNA functions in gall midge biology and behavior. The dramatic expansion of identical or similar miRNAs provides a unique system to study functional relations among miRNA iso-genes as well as changes in sequence specificity due to small changes in miRNAs and in their mRNA targets. These results may also facilitate the identification of miRNA genes for potential pest control through transgenic approaches. PMID:23496979
Micron-scale coherence in interphase chromatin dynamics

PubMed Central

Zidovska, Alexandra; Weitz, David A.; Mitchison, Timothy J.

2013-01-01

Chromatin structure and dynamics control all aspects of DNA biology yet are poorly understood, especially at large length scales. We developed an approach, displacement correlation spectroscopy based on time-resolved image correlation analysis, to map chromatin dynamics simultaneously across the whole nucleus in cultured human cells. This method revealed that chromatin movement was coherent across large regions (4–5 µm) for several seconds. Regions of coherent motion extended beyond the boundaries of single-chromosome territories, suggesting elastic coupling of motion over length scales much larger than those of genes. These large-scale, coupled motions were ATP dependent and unidirectional for several seconds, perhaps accounting for ATP-dependent directed movement of single genes. Perturbation of major nuclear ATPases such as DNA polymerase, RNA polymerase II, and topoisomerase II eliminated micron-scale coherence, while causing rapid, local movement to increase; i.e., local motions accelerated but became uncoupled from their neighbors. We observe similar trends in chromatin dynamics upon inducing direct DNA damage; thus we hypothesize that this may be due to DNA damage responses that physically relax chromatin and block long-distance communication of forces. PMID:24019504
Gene expression profiling during asexual development of the late blight pathogen Phytophthora infestans reveals a highly dynamic transcriptome.

PubMed

Judelson, Howard S; Ah-Fong, Audrey M V; Aux, George; Avrova, Anna O; Bruce, Catherine; Cakir, Cahid; da Cunha, Luis; Grenville-Briggs, Laura; Latijnhouwers, Maita; Ligterink, Wilco; Meijer, Harold J G; Roberts, Samuel; Thurber, Carrie S; Whisson, Stephen C; Birch, Paul R J; Govers, Francine; Kamoun, Sophien; van West, Pieter; Windass, John

2008-04-01

Much of the pathogenic success of Phytophthora infestans, the potato and tomato late blight agent, relies on its ability to generate from mycelia large amounts of sporangia, which release zoospores that encyst and form infection structures. To better understand these stages, Affymetrix GeneChips based on 15,650 unigenes were designed and used to profile the life cycle. Approximately half of P. infestans genes were found to exhibit significant differential expression between developmental transitions, with approximately one-tenth being stage-specific and most changes occurring during zoosporogenesis. Quantitative reverse-transcription polymerase chain reaction assays confirmed the robustness of the array results and showed that similar patterns of differential expression were obtained regardless of whether hyphae were from laboratory media or infected tomato. Differentially expressed genes encode potential cellular regulators, especially protein kinases; metabolic enzymes such as those involved in glycolysis, gluconeogenesis, or the biosynthesis of amino acids or lipids; regulators of DNA synthesis; structural proteins, including predicted flagellar proteins; and pathogenicity factors, including cell-wall-degrading enzymes, RXLR effector proteins, and enzymes protecting against plant defense responses. Curiously, some stage-specific transcripts do not appear to encode functional proteins. These findings reveal many new aspects of oomycete biology, as well as potential targets for crop protection chemicals.
Physical Factors Correlate to Microbial Community Structure and Nitrogen Cycling Gene Abundance in a Nitrate Fed Eutrophic Lagoon.

PubMed

Highton, Matthew P; Roosa, Stéphanie; Crawshaw, Josie; Schallenberg, Marc; Morales, Sergio E

2016-01-01

Nitrogenous run-off from farmed pastures contributes to the eutrophication of Lake Ellesmere, a large shallow lagoon/lake on the east coast of New Zealand. Tributaries periodically deliver high loads of nitrate to the lake, which likely affect the microbial communities therein. We hypothesized that a nutrient gradient would form from the potential sources (tributaries), creating a disturbance resulting in changes in microbial community structure. To test this, we first determined the existence of such a gradient but found only weak nitrogen (TN) and phosphorus (DRP) gradients. Changes in microbial communities were determined by measuring functional potential (quantification of nitrogen cycling genes via nifH, nirS, nosZI, and nosZII using qPCR), potential activity (via denitrification enzyme activity), as well as using changes in total community (via 16S rRNA gene amplicon sequencing). Our results demonstrated that changes in microbial communities at a phylogenetic (relative abundance) and functional level (proportion of the microbial community carrying nifH and nosZI genes) were most strongly associated with physical gradients (e.g., lake depth, sediment grain size, sediment porosity) and not nutrient concentrations. Low nitrate influx at the time of sampling is proposed as a factor contributing to the observed patterns.

Intron self-complementarity enforces exon inclusion in a yeast pre-mRNA

PubMed Central

Howe, Kenneth James; Ares, Manuel

1997-01-01

Skipping of internal exons during removal of introns from pre-mRNA must be avoided for proper expression of most eukaryotic genes. Despite significant understanding of the mechanics of intron removal, mechanisms that ensure inclusion of internal exons in multi-intron pre-mRNAs remain mysterious. Using a natural two-intron yeast gene, we have identified distinct RNA–RNA complementarities within each intron that prevent exon skipping and ensure inclusion of internal exons. We show that these complementarities are positioned to act as intron identity elements, bringing together only the appropriate 5′ splice sites and branchpoints. Destroying either intron self-complementarity allows exon skipping to occur, and restoring the complementarity using compensatory mutations rescues exon inclusion, indicating that the elements act through formation of RNA secondary structure. Introducing new pairing potential between regions near the 5′ splice site of intron 1 and the branchpoint of intron 2 dramatically enhances exon skipping. Similar elements identified in single intron yeast genes contribute to splicing efficiency. Our results illustrate how intron secondary structure serves to coordinate splice site pairing and enforce exon inclusion. We suggest that similar elements in vertebrate genes could assist in the splicing of very large introns and in the evolution of alternative splicing. PMID:9356473
Spatial Genetic Structure and Mitochondrial DNA Phylogeography of Argentinean Populations of the Grasshopper Dichroplus elongatus

PubMed Central

Rosetti, Natalia; Remis, Maria Isabel

2012-01-01

Many grasshopper species are considered of agronomical importance because they cause damage to pastures and crops. Comprehension of pest population dynamics requires a clear understanding of the genetic diversity and spatial structure of populations. In this study we report on patterns of genetic variation in the South American grasshopper Dichroplus elongatus, which is an agricultural pest of crops and forage grasses of great economic significance in Argentina. We use Direct Amplification of Minisatellite Regions (DAMD) and partial sequences of the cytochrome oxidase 1 (COI) mitochondrial gene to investigate intraspecific structure, demographic history and gene flow patterns in twenty Argentinean populations of this species belonging to different geographic and biogeographic regions. DAMD data suggest that, although genetic drift and migration occur within and between populations, measurable relatedness among neighbouring populations declines with distance and dispersal over distances greater than 200 km is not typical, whereas effective gene flow may occur for populations separated by less than 100 km. Landscape analysis was useful to detect genetic discontinuities associated with environmental heterogeneity reflecting the changing agroecosystem. The COI results indicate the existence of strong genetic differentiation between two groups of populations located at both margins of the Paraná River which became separated during climate oscillations of the Middle Pleistocene, suggesting a significant restriction in effective dispersion mediated by females and large scale geographic differentiation. The number of migrants between populations estimated through mitochondrial and DAMD markers suggests that gene flow is low, prompting a non-homogeneous spatial structure and justifying the variation through space. 
Moreover, the genetic analysis of both markers allows us to conclude that males appear to disperse more than females, reducing the chance of the genetic loss associated with recent anthropogenic fragmentation of the D. elongatus studied range. PMID:22859953
Structure and vascular tissue expression of duplicated TERMINAL EAR1-like paralogues in poplar.

PubMed

Charon, Céline; Vivancos, Julien; Mazubert, Christelle; Paquet, Nicolas; Pilate, Gilles; Dron, Michel

2010-02-01

TERMINAL EAR1-like (TEL) genes encode putative RNA-binding proteins only found in land plants. Previous studies suggested that they may regulate tissue and organ initiation in Poaceae. Two TEL genes were identified in both Populus trichocarpa and the hybrid aspen Populus tremula x P. alba, named, respectively, PoptrTEL1-2 and PtaTEL1-2. The analysis of the organisation around the PoptrTEL genes in the P. trichocarpa genome and the estimation of the synonymous substitution rate for PtaTEL1-2 genes indicate that the paralogous link between these two Populus TEL genes probably results from the Salicoid large-scale gene-duplication event. Phylogenetic analyses confirmed their orthology link with the other TEL genes. The expression pattern of both PtaTEL genes appeared to be restricted to the mother cells of the plant body: leaf founder cells, leaf primordia, axillary buds and root differentiating tissues, as well as to mother cells of vascular tissues. Most interestingly, PtaTEL1-2 transcripts were found in differentiating cells of secondary xylem and phloem, but probably not in the cambium itself. Taken together, these results indicate specific expression of the TEL genes in differentiating cells controlling tissue and organ development in Populus (and other Angiosperm species).
Rapid diversification of five Oryza AA genomes associated with rice adaptation.

PubMed

Zhang, Qun-Jie; Zhu, Ting; Xia, En-Hua; Shi, Chao; Liu, Yun-Long; Zhang, Yun; Liu, Yuan; Jiang, Wen-Kai; Zhao, You-Jie; Mao, Shu-Yan; Zhang, Li-Ping; Huang, Hui; Jiao, Jun-Ying; Xu, Ping-Zhen; Yao, Qiu-Yang; Zeng, Fan-Chun; Yang, Li-Li; Gao, Ju; Tao, Da-Yun; Wang, Yue-Ju; Bennetzen, Jeffrey L; Gao, Li-Zhi

2014-11-18

Comparative genomic analyses among closely related species can greatly enhance our understanding of plant gene and genome evolution. We report de novo-assembled AA-genome sequences for Oryza nivara, Oryza glaberrima, Oryza barthii, Oryza glumaepatula, and Oryza meridionalis. Our analyses reveal massive levels of genomic structural variation, including segmental duplication and rapid gene family turnover, with particularly high instability in defense-related genes. We show, on a genomic scale, how lineage-specific expansion or contraction of gene families has led to their morphological and reproductive diversification, thus enlightening the evolutionary process of speciation and adaptation. Despite strong purifying selective pressures on most Oryza genes, we documented a large number of positively selected genes, especially those genes involved in flower development, reproduction, and resistance-related processes. These diversifying genes are expected to have played key roles in adaptations to their ecological niches in Asia, South America, Africa and Australia. Extensive variation in noncoding RNA gene numbers, function enrichment, and rates of sequence divergence might also help account for the different genetic adaptations of these rice species. Collectively, these resources provide new opportunities for evolutionary genomics, numerous insights into recent speciation, a valuable database of functional variation for crop improvement, and tools for efficient conservation of wild rice germplasm.
Rapid diversification of five Oryza AA genomes associated with rice adaptation

PubMed Central

Zhang, Qun-Jie; Zhu, Ting; Xia, En-Hua; Shi, Chao; Liu, Yun-Long; Zhang, Yun; Liu, Yuan; Jiang, Wen-Kai; Zhao, You-Jie; Mao, Shu-Yan; Zhang, Li-Ping; Huang, Hui; Jiao, Jun-Ying; Xu, Ping-Zhen; Yao, Qiu-Yang; Zeng, Fan-Chun; Yang, Li-Li; Gao, Ju; Tao, Da-Yun; Wang, Yue-Ju; Bennetzen, Jeffrey L.; Gao, Li-Zhi

2014-01-01

Comparative genomic analyses among closely related species can greatly enhance our understanding of plant gene and genome evolution. We report de novo-assembled AA-genome sequences for Oryza nivara, Oryza glaberrima, Oryza barthii, Oryza glumaepatula, and Oryza meridionalis. Our analyses reveal massive levels of genomic structural variation, including segmental duplication and rapid gene family turnover, with particularly high instability in defense-related genes. We show, on a genomic scale, how lineage-specific expansion or contraction of gene families has led to their morphological and reproductive diversification, thus enlightening the evolutionary process of speciation and adaptation. Despite strong purifying selective pressures on most Oryza genes, we documented a large number of positively selected genes, especially those genes involved in flower development, reproduction, and resistance-related processes. These diversifying genes are expected to have played key roles in adaptations to their ecological niches in Asia, South America, Africa and Australia. Extensive variation in noncoding RNA gene numbers, function enrichment, and rates of sequence divergence might also help account for the different genetic adaptations of these rice species. Collectively, these resources provide new opportunities for evolutionary genomics, numerous insights into recent speciation, a valuable database of functional variation for crop improvement, and tools for efficient conservation of wild rice germplasm. PMID:25368197
Large-scale identification of differentially expressed genes during pupa development reveals solute carrier gene is essential for pupal pigmentation in Chilo suppressalis.

PubMed

Sun, Yang; Huang, Shuijin; Wang, Shuping; Guo, Dianhao; Ge, Chang; Xiao, Huamei; Jie, Wencai; Yang, Qiupu; Teng, Xiaolu; Li, Fei

2017-04-01

Insects undergo metamorphosis, involving an abrupt change in body structure through cell growth and differentiation. The rice striped stem borer (SSB), Chilo suppressalis, is one of the most destructive rice pests. However, little is known about the regulatory mechanisms of metamorphic development in this notorious insect pest. Here, we studied the expression of 22,197 SSB genes at seven time points during pupa development with a customized microarray, identifying 622 differentially expressed genes (DEGs) during pupa development. Gene ontology (GO) analysis of these DEGs indicated that the genes related to substance metabolism were highly expressed in the early pupal stages and participate in the physiological processes of larval tissue disintegration at these stages. In comparison, highly expressed genes in the late pupal stages were mainly associated with substance biosynthesis, consistent with adult organ formation at these stages. There were 27 solute carrier (SLC) genes that were highly expressed during pupa development. We knocked down SLC22A3 at the prepupal stage, demonstrating that silencing SLC22A3 induced a deficiency in pupa stiffness and pigmentation. The RNAi-treated individuals formed white, soft pupae, suggesting that this gene has an essential role in pupal development. Copyright © 2016 Elsevier Ltd. All rights reserved.
Genome-Wide Identification and Expression Analysis of the Mitogen-Activated Protein Kinase Gene Family in Cassava

PubMed Central

Yan, Yan; Wang, Lianzhe; Ding, Zehong; Tie, Weiwei; Ding, Xupo; Zeng, Changying; Wei, Yunxie; Zhao, Hongliang; Peng, Ming; Hu, Wei

2016-01-01

Mitogen-activated protein kinases (MAPKs) play central roles in plant developmental processes, hormone signaling transduction, and responses to abiotic stress. However, no data are currently available about the MAPK family in cassava, an important tropical crop. Herein, 21 MeMAPK genes were identified from cassava. Phylogenetic analysis indicated that MeMAPKs could be classified into four subfamilies. Gene structure analysis demonstrated that the number of introns in MeMAPK genes ranged from 1 to 10, suggesting large variation among cassava MAPK genes. Conserved motif analysis indicated that all MeMAPKs had typical protein kinase domains. Transcriptomic analysis suggested that MeMAPK genes showed differential expression patterns in distinct tissues and in response to drought stress between wild subspecies and cultivated varieties. Interaction networks and co-expression analyses revealed that crucial pathways controlled by MeMAPK networks may be involved in the differential response to drought stress in different accessions of cassava. Expression of nine selected MAPK genes showed that these genes could comprehensively respond to osmotic, salt, cold, oxidative stressors, and abscisic acid (ABA) signaling. These findings yield new insights into the transcriptional control of MAPK gene expression, provide an improved understanding of abiotic stress responses and signaling transduction in cassava, and lead to potential applications in the genetic improvement of cassava cultivars. PMID:27625666
OGEE v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines.

PubMed

Chen, Wei-Hua; Lu, Guanting; Chen, Xiao; Zhao, Xing-Ming; Bork, Peer

2017-01-04

OGEE is an Online GEne Essentiality database. To enhance our understanding of the essentiality of genes, in OGEE we collected experimentally tested essential and non-essential genes, as well as associated gene properties known to contribute to gene essentiality. We focus on large-scale experiments, and complement our data with text-mining results. We organized tested genes into data sets according to their sources, and tagged those with variable essentiality statuses across data sets as conditionally essential genes, intending to highlight the complex interplay between gene functions and environments/experimental perturbations. Developments since the last public release include increased numbers of species and gene essentiality data sets, inclusion of non-coding essential sequences and genes with intermediate essentiality statuses. In addition, we included 16 essentiality data sets from cancer cell lines, corresponding to 9 human cancers; with OGEE, users can easily explore the shared and differentially essential genes within and between cancer types. These genes, especially those derived from cell lines that are similar to tumor samples, could reveal the oncogenic drivers, paralogous gene expression pattern and chromosomal structure of the corresponding cancer types, and can be further screened to identify targets for cancer therapy and/or new drug development. OGEE is freely available at http://ogee.medgenius.info. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rivers influence the population genetic structure of bonobos (Pan paniscus).

PubMed

Eriksson, J; Hohmann, G; Boesch, C; Vigilant, L

2004-11-01

Bonobos are large, highly mobile primates living in the relatively undisturbed, contiguous forest south of the Congo River. Accordingly, gene flow among populations is assumed to be extensive, but may be impeded by large, impassable rivers. We examined mitochondrial DNA control region sequence variation in individuals from five distinct localities separated by rivers in order to estimate relative levels of genetic diversity and assess the extent and pattern of population genetic structure in the bonobo. Diversity estimates for the bonobo exceed those for humans, but are less than those found for the chimpanzee. All regions sampled are significantly differentiated from one another, according to genetic distances estimated as pairwise FSTs, with the greatest differentiation existing between region East and each of the two Northern populations (N and NE) and the least differentiation between regions Central and South. The distribution of nucleotide diversity shows a clear signal of population structure, with some 30% of the variance occurring among geographical regions. However, a geographical patterning of the population structure is not obvious. Namely, mitochondrial haplotypes were shared among all regions excepting the most eastern locality and the phylogenetic analysis revealed a tree in which haplotypes were intermixed with little regard to geographical origin, with the notable exception of the close relationships among the haplotypes found in the east. Nonetheless, genetic distances correlated with geographical distances when the intervening distances were measured around rivers presenting effective current-day barriers, but not when straight-line distances were used, suggesting that rivers are indeed a hindrance to gene flow in this species.
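The comparison of genetic and geographical distances described above can be illustrated with a permutation (Mantel-style) test on two distance matrices. The abstract does not name the test the authors used, so this is only a minimal sketch under that assumption; the five-locality matrices `geo_km` and `gen_dist` are hypothetical toy values, not data from the study.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def upper(matrix, order):
    """Flatten the upper triangle, reading rows and columns in `order`."""
    pairs = combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p) for the matrix correlation.

    Localities of the genetic matrix are relabelled at random to build
    the null distribution of r.
    """
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper(geographic, identity)
    observed = pearson(upper(genetic, identity), geo)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper(genetic, order), geo) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: 5 localities, symmetric distance matrices.
geo_km = [
    [0, 1, 2, 3, 4],
    [1, 0, 1, 2, 3],
    [2, 1, 0, 1, 2],
    [3, 2, 1, 0, 1],
    [4, 3, 2, 1, 0],
]
gen_dist = [
    [0.00, 0.10, 0.22, 0.31, 0.45],
    [0.10, 0.00, 0.12, 0.20, 0.33],
    [0.22, 0.12, 0.00, 0.09, 0.21],
    [0.31, 0.20, 0.09, 0.00, 0.11],
    [0.45, 0.33, 0.21, 0.11, 0.00],
]

r, p = mantel(gen_dist, geo_km)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Under this sketch, swapping `geo_km` for a matrix of straight-line distances versus one of distances measured around rivers would mirror the contrast the abstract draws, though the real analysis may have differed in its details.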
Pleistocene and ecological effects on continental-scale genetic differentiation in the bobcat (Lynx rufus).

PubMed

Reding, Dawn M; Bronikowski, Anne M; Johnson, Warren E; Clark, William R

2012-06-01

The potential for widespread, mobile species to exhibit genetic structure without clear geographic barriers is a topic of growing interest. Yet the patterns and mechanisms of structure--particularly over broad spatial scales--remain largely unexplored for these species. Bobcats occur across North America and possess many characteristics expected to promote gene flow. To test whether historical, topographic or ecological factors have influenced genetic differentiation in this species, we analysed 1 kb mtDNA sequence and 15 microsatellite loci from over 1700 samples collected across its range. The primary signature in both marker types involved a longitudinal cline with a sharp transition, or suture zone, occurring along the Great Plains. Thus, the data distinguished bobcats in the eastern USA from those in the western half, with no obvious physical barrier to gene flow. Demographic analyses supported a scenario of expansion from separate Pleistocene refugia, with the Great Plains representing a zone of secondary contact. Substructure within the two main lineages likely reflected founder effects, ecological factors, anthropogenic/topographic effects or a combination of these forces. Two prominent topographic features, the Mississippi River and Rocky Mountains, were not supported as significant genetic barriers. Ecological regions and environmental correlates explained a small but significant proportion of genetic variation. Overall, results implicate historical processes as the primary cause of broad-scale genetic differentiation, but contemporary forces seem to also play a role in promoting and maintaining structure. Despite the bobcat's mobility and broad niche, large-scale landscape changes have contributed to significant and complex patterns of genetic structure. © 2012 Blackwell Publishing Ltd.
The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

PubMed

Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin

2013-10-10

Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during evolution. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have gained an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci, almost all of which are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits, 5 of which resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa; it yields a tree topology identical to that of previous plastid-based trees and provides strong support for the sister relationship between Ranunculaceae and Berberidaceae. Molecular dating analyses suggest that Ranunculaceae and Berberidaceae diverged between 90 and 84 mya, which is congruent with the fossil records and with recent estimates of the divergence time of these two taxa. © 2013.
Gene editing tools: state-of-the-art and the road ahead for the model and non-model fishes.

PubMed

Barman, Hirak Kumar; Rasal, Kiran Dashrath; Chakrapani, Vemulawada; Ninawe, A S; Vengayil, Doyil T; Asrafuzzaman, Syed; Sundaray, Jitendra K; Jayasankar, Pallipuram

2017-10-01

Advancements in DNA sequencing technologies and computational biology have revolutionized genome/transcriptome sequencing of non-model fishes at an affordable cost. This has led to a paradigm shift in our understanding of the structure-function relationships of genes at a global level, from model animals/fishes to non-model large animals/fishes. Whole genome/transcriptome sequencing technologies were supplemented with a series of discoveries in gene editing tools, which are being used to modify genes at pre-determined positions using programmable nucleases to explore their respective in vivo functions. For a long time, targeted gene disruption experiments were mostly restricted to embryonic stem cells; advances in gene editing technologies such as zinc finger nucleases, transcription activator-like effector nucleases and CRISPR (clustered regularly interspaced short palindromic repeats)/CRISPR-associated nucleases have facilitated targeted genetic modifications beyond stem cells to a wide range of somatic cell lines across species, from laboratory animals to farmed animals/fishes. In this review, we discuss the use of different gene editing tools and their strategic implications in fish species for basic and applied biology research.
A flexible and economical barcoding approach for highly multiplexed amplicon sequencing of diverse target genes

PubMed Central

Herbold, Craig W.; Pelikan, Claus; Kuzyk, Orest; Hausmann, Bela; Angel, Roey; Berry, David; Loy, Alexander

2015-01-01

High throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it requires fewer primers overall and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse, and high quality sets of amplicon sequence data for modern studies in microbial ecology. PMID:26236305
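The economy argument in the abstract above (interchangeable barcoding primers need fewer oligonucleotides than gene-specific fusion primers) can be made concrete with a simple count. This is a rough sketch under stated assumptions, not the authors' cost model: it assumes that fusion-primer barcoding needs one barcoded forward primer per target-barcode pair plus one reverse primer per target, and that the interchangeable scheme needs one gene-specific primer pair per target plus one shared barcoding primer per barcode. Both function names are hypothetical.

```python
def fusion_primer_count(n_targets: int, n_barcodes: int) -> int:
    """Primers needed if every barcode is fused to every gene-specific primer.

    Assumption: one barcoded forward primer per (target, barcode) pair
    and one unbarcoded reverse primer per target.
    """
    return n_targets * n_barcodes + n_targets


def interchangeable_primer_count(n_targets: int, n_barcodes: int) -> int:
    """Primers needed if barcoding primers are shared across all targets.

    Assumption: one gene-specific primer pair per target and one general
    barcoding primer per barcode.
    """
    return 2 * n_targets + n_barcodes


# Hypothetical project size: 10 target genes, 96 barcodes.
targets, barcodes = 10, 96
print("fusion primers:         ", fusion_primer_count(targets, barcodes))
print("interchangeable primers:", interchangeable_primer_count(targets, barcodes))
```

Under these assumptions the fusion count grows with the product of targets and barcodes, while the interchangeable count grows with their sum, which is one way to read the abstract's claim that the approach needs fewer primers overall.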
Pattern Genes Suggest Functional Connectivity of Organs

NASA Astrophysics Data System (ADS)

Qin, Yangmei; Pan, Jianbo; Cai, Meichun; Yao, Lixia; Ji, Zhiliang

2016-05-01

The human organ, as a basic structural and functional unit of the human body, is made up of a large community of different cell types that are organically bound together. Each organ usually performs a highly specific physiological function, while several related organs work together to perform complicated body functions. In this study, we present a computational effort to understand the roles of genes in building functional connections between organs. More specifically, we mined multiple transcriptome datasets sampled from 36 human organs and tissues, and quantitatively identified 3,149 genes whose expression showed consensus modular patterns: specific to one organ/tissue, selectively expressed in several functionally related tissues, or ubiquitously expressed. These pattern genes imply intrinsic connections between organs. According to the expression abundance of the 766 selective genes, we consistently cluster the 36 human organs/tissues into seven functional groups: adipose & gland, brain, muscle, immune, metabolism, mucoid and nerve conduction. The organs and tissues in each group either work together to form organ systems or coordinate to perform particular body functions. The particular roles of specific genes and selective genes suggest that they could be used not only to explore organ functions mechanistically, but also as selective biomarkers and therapeutic targets.
The octopus genome and the evolution of cephalopod neural and morphological novelties.

PubMed

Albertin, Caroline B; Simakov, Oleg; Mitros, Therese; Wang, Z Yan; Pungor, Judit R; Edsinger-Gonzales, Eric; Brenner, Sydney; Ragsdale, Clifton W; Rokhsar, Daniel S

2015-08-13

Coleoid cephalopods (octopus, squid and cuttlefish) are active, resourceful predators with a rich behavioural repertoire. They have the largest nervous systems among the invertebrates and present other striking morphological innovations including camera-like eyes, prehensile arms, a highly derived early embryogenesis and a remarkably sophisticated adaptive colouration system. To investigate the molecular bases of cephalopod brain and body innovations, we sequenced the genome and multiple transcriptomes of the California two-spot octopus, Octopus bimaculoides. We found no evidence for hypothesized whole-genome duplications in the octopus lineage. The core developmental and neuronal gene repertoire of the octopus is broadly similar to that found across invertebrate bilaterians, except for massive expansions in two gene families previously thought to be uniquely enlarged in vertebrates: the protocadherins, which regulate neuronal development, and the C2H2 superfamily of zinc-finger transcription factors. Extensive messenger RNA editing generates transcript and protein diversity in genes involved in neural excitability, as previously described, as well as in genes participating in a broad range of other cellular functions. We identified hundreds of cephalopod-specific genes, many of which showed elevated expression levels in such specialized structures as the skin, the suckers and the nervous system. Finally, we found evidence for large-scale genomic rearrangements that are closely associated with transposable element expansions. Our analysis suggests that substantial expansion of a handful of gene families, along with extensive remodelling of genome linkage and repetitive content, played a critical role in the evolution of cephalopod morphological innovations, including their large and complex nervous systems.
Ectopic expression of homeobox gene NKX2-1 in diffuse large B-cell lymphoma is mediated by aberrant chromatin modifications.

PubMed

Nagel, Stefan; Ehrentraut, Stefan; Tomasch, Jürgen; Quentmeier, Hilmar; Meyer, Corinna; Kaufmann, Maren; Drexler, Hans G; MacLeod, Roderick A F

2013-01-01

Homeobox genes encode transcription factors ubiquitously involved in basic developmental processes, deregulation of which promotes cell transformation in multiple cancers including hematopoietic malignancies. In particular, NKL-family homeobox genes TLX1, TLX3 and NKX2-5 are ectopically activated by chromosomal rearrangements in T-cell neoplasias. Here, using transcriptional microarray profiling and RQ-PCR we identified ectopic expression of NKL-family member NKX2-1, in a diffuse large B-cell lymphoma (DLBCL) cell line SU-DHL-5. Moreover, in silico analysis demonstrated NKX2-1 overexpression in 5% of examined DLBCL patient samples. NKX2-1 is physiologically expressed in lung and thyroid tissues where it regulates differentiation. Chromosomal and genomic analyses excluded rearrangements at the NKX2-1 locus in SU-DHL-5, implying alternative activation. Comparative expression profiling implicated several candidate genes in NKX2-1 regulation, variously encoding transcription factors, chromatin modifiers and signaling components. Accordingly, siRNA-mediated knockdown and overexpression studies confirmed involvement of transcription factor HEY1, histone methyltransferase MLL and ubiquitinated histone H2B in NKX2-1 deregulation. Chromosomal aberrations targeting MLL at 11q23 and the histone gene cluster HIST1 at 6p22 which we observed in SU-DHL-5 may, therefore, represent fundamental mutations mediating an aberrant chromatin structure at NKX2-1. Taken together, we identified ectopic expression of NKX2-1 in DLBCL cells, representing the central player in an oncogenic regulative network compromising B-cell differentiation. Thus, our data extend the paradigm of NKL homeobox gene deregulation in lymphoid malignancies.
Systematic analysis of copy number variation associated with congenital diaphragmatic hernia.

PubMed

Zhu, Qihui; High, Frances A; Zhang, Chengsheng; Cerveira, Eliza; Russell, Meaghan K; Longoni, Mauro; Joy, Maliackal P; Ryan, Mallory; Mil-Homens, Adam; Bellfy, Lauren; Coletti, Caroline M; Bhayani, Pooja; Hila, Regis; Wilson, Jay M; Donahoe, Patricia K; Lee, Charles

2018-05-15

Congenital diaphragmatic hernia (CDH), characterized by malformation of the diaphragm and hypoplasia of the lungs, is one of the most common and severe birth defects, and is associated with high morbidity and mortality rates. There is growing evidence demonstrating that genetic factors contribute to CDH, although the pathogenesis remains largely elusive. Single-nucleotide polymorphisms have been studied in recent whole-exome sequencing efforts, but larger copy number variants (CNVs) have not yet been studied on a large scale in a case-control study. To capture CNVs within CDH candidate regions, we developed and tested a targeted array comparative genomic hybridization platform to identify CNVs within 140 regions in 196 patients and 987 healthy controls, and identified six significant CNVs that were either unique to patients or enriched in patients compared with controls. These CDH-associated CNVs reveal high-priority candidate genes including HLX, LHX1, and HNF1B. We also discuss CNVs that are present in only one patient in the cohort but have additional evidence of pathogenicity, including extremely rare large and/or de novo CNVs. The candidate genes within these predicted disease-causing CNVs form functional networks with other known CDH genes and play putative roles in DNA binding/transcription regulation and embryonic development. These data substantiate the importance of CNVs in the etiology of CDH, identify CDH candidate genes and pathways, and highlight the importance of ongoing analysis of CNVs in the study of CDH and other structural birth defects. Copyright © 2018 the Author(s). Published by PNAS.
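Whether a CNV is "enriched in patients compared with controls" is commonly assessed with a one-sided Fisher exact test on carrier counts. The abstract does not say which test was used, so the sketch below is an assumption; the cohort sizes (196 patients, 987 controls) come from the abstract, while the carrier counts in the example are hypothetical.

```python
from math import comb


def fisher_enrichment_p(cases_with: int, cases_total: int,
                        controls_with: int, controls_total: int) -> float:
    """One-sided Fisher exact p-value for enrichment of carriers among cases.

    Sums the hypergeometric probabilities of seeing at least `cases_with`
    carriers among the cases, given the overall number of carriers.
    """
    total = cases_total + controls_total
    carriers = cases_with + controls_with
    denominator = comb(total, carriers)
    most_extreme = min(carriers, cases_total)
    numerator = sum(
        comb(cases_total, k) * comb(controls_total, carriers - k)
        for k in range(cases_with, most_extreme + 1)
    )
    return numerator / denominator


# Cohort sizes from the abstract; carrier counts are hypothetical.
p = fisher_enrichment_p(cases_with=5, cases_total=196,
                        controls_with=0, controls_total=987)
print(f"one-sided p = {p:.2e}")
```

In a real analysis the p-values for the 140 targeted regions would also need a multiple-testing correction, which this sketch leaves out.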
Insights into animal and plant lectins with antimicrobial activities.

PubMed

Dias, Renata de Oliveira; Machado, Leandro Dos Santos; Migliolo, Ludovico; Franco, Octavio Luiz

2015-01-05

Lectins are multivalent proteins with the ability to recognize and bind diverse carbohydrate structures. The glyco-binding and diverse molecular structures observed in these protein classes make them a large and heterogeneous group with a wide range of biological activities in microorganisms, animals and plants. Lectins from plants and animals are commonly used in direct defense against pathogens and in immune regulation. This review focuses on sources of animal and plant lectins, describing their functional classification and tridimensional structures, relating these properties with biotechnological purposes, including antimicrobial activities. In summary, this work focuses on structural-functional elucidation of diverse lectin groups, shedding some light on host-pathogen interactions; it also examines their emergence as biotechnological tools through gene manipulation and development of new drugs.
Imaging genes, chromosomes, and nuclear structures using laser-scanning confocal microscopy

NASA Astrophysics Data System (ADS)

Ballard, Stephen G.

1990-08-01

For 350 years, the optical microscope has had a powerful symbiotic relationship with biology. Until this century, optical microscopy was the only means of examining cellular structure; in return, biologists have contributed greatly to the evolution of microscope design and technique. Recent advances in the detection and processing of optical images, together with methods for labelling specific biological molecules, have brought about a resurgence in the application of optical microscopy to the biological sciences. One of the areas in which optical microscopy is breaking new ground is in elucidating the large scale organization of chromatin in chromosomes and cell nuclei. Nevertheless, imaging the contents of the cell nucleus is a difficult challenge for light microscopy, for two principal reasons. First, the dimensions of all but the largest nuclear structures (nucleoli, vacuoles) are close to or below the resolving power of far field optics. Second, the native optical contrast properties of many important chromatin structures (e.g., chromosome domains, centromere regions) are very weak, or essentially zero. As an extreme example, individual genes probably have nothing to distinguish them other than their sequence of DNA bases, which cannot be directly visualized with any current form of microscopy. Similarly, the interphase nucleus shows no direct visible evidence of focal chromatin domains. Thus, imaging of such entities depends heavily on contrast enhancement methods. The most promising of these is labelling DNA in situ using sequence-specific probes that may be visualized using fluorescent dyes. We have applied this method to detecting individual genes in metaphase chromosomes and interphase nuclei, and to imaging a number of DNA-containing structures including chromosome domains, metaphase chromosomes and centromere regions. We have also demonstrated the applicability of in situ fluorescent labelling to detecting numerical and structural abnormalities both in condensed metaphase chromosomes and in interphase nuclei. The ability to image the loci of fluorescent-labelled gene probes hybridized to chromosomes and to interphase nuclei will play a major role in the mapping of the human genome. This presentation is an overview of our laboratory's efforts to use confocal imaging to address fundamental questions about the structure and organization of genes, chromosomes and cell nuclei, and to develop applications useful in clinical diagnosis of inherited diseases.
Characterization of 17 chaperone-usher fimbriae encoded by Proteus mirabilis reveals strong conservation

PubMed Central

Kuan, Lisa; Schaffer, Jessica N.; Zouzias, Christos D.

2014-01-01

Proteus mirabilis is a Gram-negative enteric bacterium that causes complicated urinary tract infections, particularly in patients with indwelling catheters. Sequencing of clinical isolate P. mirabilis HI4320 revealed the presence of 17 predicted chaperone-usher fimbrial operons. We classified these fimbriae into three groups by their genetic relationship to other chaperone-usher fimbriae. Sixteen of these fimbriae are encoded by all seven currently sequenced P. mirabilis genomes. The predicted protein sequence of the major structural subunit for 14 of these fimbriae was highly conserved (≥95 % identity), whereas three other structural subunits (Fim3A, UcaA and Fim6A) were variable. Further examination of 58 clinical isolates showed that 14 of the 17 predicted major structural subunit genes of the fimbriae were present in most strains (>85 %). Transcription of the predicted major structural subunit genes for all 17 fimbriae was measured under different culture conditions designed to mimic conditions in the urinary tract. The majority of the fimbrial genes were induced during stationary phase, static culture or colony growth when compared to exponential-phase aerated culture. Major structural subunit proteins for six of these fimbriae were detected using MS of proteins sheared from the surface of broth-cultured P. mirabilis, demonstrating that this organism may produce multiple fimbriae within a single culture. The high degree of conservation of P. mirabilis fimbriae stands in contrast to uropathogenic Escherichia coli and Salmonella enterica, which exhibit greater variability in their fimbrial repertoires. These findings suggest there may be evolutionary pressure for P. mirabilis to maintain a large fimbrial arsenal. PMID:24809384