Lan, DaoLiang; Xiong, XianRong; Wei, YanLi; Xu, Tong; Zhong, JinCheng; Zhi, XiangDong; Wang, Yong; Li, Jian
2014-09-01
RNA-Seq, a high-throughput (HT) sequencing technique, has been used effectively in large-scale transcriptomic studies, and is particularly useful for improving gene structure information and mining of new genes. In this study, RNA-Seq HT technology was employed to analyze the transcriptome of yak ovary. After Illumina-Solexa deep sequencing, 26826516 clean reads with a total of 4828772880 bp were obtained from the ovary library. Alignment analysis showed that 16992 yak genes mapped to the yak genome and 3734 of these genes were involved in alternative splicing. Gene structure refinement analysis showed that 7340 genes that were annotated in the yak genome could be extended at the 5' or 3' ends based on the alignments between the transcripts and the genome sequence. Novel transcript prediction analysis identified 6321 new transcripts with lengths ranging from 180 to 14884 bp, and 2267 of them were predicted to code for proteins. BLAST analysis of the new transcripts showed that 1200–4933 mapped to the non-redundant (nr), nucleotide (nt) and/or SwissProt sequence databases. Comparative statistical analysis of the new mapped transcripts showed that the majority of them were similar to genes in Bos taurus (41.4%), Bos grunniens mutus (33.0%), Ovis aries (6.3%), Homo sapiens (2.8%), Mus musculus (1.6%) and other species. Functional analysis showed that these expressed genes were involved in various Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes pathways. GO analysis of the new transcripts found that the largest proportion of them was associated with reproduction. The results of this study will provide a basis for describing the normal transcriptome map of yak ovary and for future studies on yak breeding performance. Moreover, the results confirmed that RNA-Seq HT technology is highly advantageous in improving gene structure information and mining of new genes, as well as in providing valuable data to expand the yak genome information.
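As a quick consistency check on the sequencing figures quoted in this abstract, the total base count can be divided by the number of clean reads. The sketch below uses only the two numbers given above; the variable names are illustrative. The abstract does not state the read layout, so any interpretation of the resulting length (for example, as paired-end reads) would be an assumption.

```python
# Figures quoted in the abstract above.
clean_reads = 26_826_516
total_bases = 4_828_772_880

# Mean bases per clean read; divmod also reports whether the division is exact.
mean_length, remainder = divmod(total_bases, clean_reads)
print(mean_length, remainder)  # prints "180 0"
```

The division is exact, so the two reported totals are mutually consistent with a uniform 180 bases per clean read.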
Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P
2013-03-21
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group, we provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
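The abstract describes GFlasso only in words. A rough sketch of the kind of objective a graph-guided fused lasso minimizes may help make "leveraging the gene-interaction network" concrete. Everything below is an illustrative assumption rather than the authors' implementation: the function name, the pure-Python data layout, and the choice of |r| as the edge weight are all hypothetical. The objective combines a squared-error term, a lasso penalty on each SNP-to-trait coefficient, and a fusion penalty that pulls the coefficients of network-linked traits toward each other (with the sign of their correlation).

```python
def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate a graph-guided fused lasso objective (illustrative sketch).

    X: n x J genotype matrix, Y: n x K expression matrix,
    B: J x K coefficient matrix, all as nested lists.
    edges: list of (m, l, r) with trait indices m, l and correlation r.
    lam, gamma: weights of the lasso and fusion penalties.
    """
    n, J, K = len(X), len(B), len(B[0])

    # Squared-error loss summed over samples and traits.
    loss = 0.0
    for i in range(n):
        for k in range(K):
            pred = sum(X[i][j] * B[j][k] for j in range(J))
            loss += (Y[i][k] - pred) ** 2

    # Lasso penalty: encourages few non-zero SNP-to-trait effects.
    lasso = sum(abs(B[j][k]) for j in range(J) for k in range(K))

    # Fusion penalty: for each network edge, penalize differences between
    # the effects of every SNP on the two linked traits.
    # Assumption: edge weight |r|; other weightings are possible.
    fusion = 0.0
    for m, l, r in edges:
        sign = 1.0 if r >= 0 else -1.0
        fusion += abs(r) * sum(abs(B[j][m] - sign * B[j][l]) for j in range(J))

    return loss + lam * lasso + gamma * fusion


# Toy data: SNP 0 drives both (positively correlated) traits.
X = [[1, 0], [0, 1]]
Y = [[1, 1], [0, 0]]
edges = [(0, 1, 0.9)]
shared = [[1, 1], [0, 0]]      # SNP 0 affects both traits
single = [[1, 0], [0, 0]]      # SNP 0 affects trait 0 only
obj_shared = gflasso_objective(X, Y, shared, edges, lam=0.5, gamma=2.0)
obj_single = gflasso_objective(X, Y, single, edges, lam=0.5, gamma=2.0)
```

In this toy case the coefficient matrix in which one locus affects both linked traits scores lower than the one in which it affects a single trait, which is the pleiotropy-favoring behavior the abstract attributes to the method.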
Genome-wide characterization of the Pectate Lyase-like (PLL) genes in Brassica rapa.
Jiang, Jingjing; Yao, Lina; Miao, Ying; Cao, Jiashu
2013-11-01
Pectate lyases (PL) depolymerize demethylated pectin (pectate, EC 4.2.2.2) by catalyzing the eliminative cleavage of α-1,4-glycosidic linked galacturonan. Pectate Lyase-like (PLL) genes are one of the largest and most complex families in plants. However, studies on the phylogeny, gene structure, and expression of PLL genes are limited. To understand the potential functions of PLL genes in plants, we characterized their intron-exon structure, phylogenetic relationships, and protein structures, and measured their expression patterns in various tissues, specifically the reproductive tissues in Brassica rapa. Sequence alignments revealed two characteristic motifs in PLL genes. The chromosome location analysis indicated that 18 of the 46 PLL genes were located in the least fractionated sub-genome (LF) of B. rapa, while 16 were located in the medium fractionated sub-genome (MF1) and 12 in the more fractionated sub-genome (MF2). Quantitative RT-PCR analysis showed that BrPLL genes were expressed in various tissues, with most of them being expressed in flowers. Detailed qRT-PCR analysis identified 11 pollen specific PLL genes and several other genes with unique spatial expression patterns. In addition, some duplicated genes showed similar expression patterns. The phylogenetic analysis identified three PLL gene subfamilies in plants, among which subfamily II might have evolved from gene neofunctionalization or subfunctionalization. Therefore, this study opens the possibility for exploring the roles of PLL genes during plant development.
Papaleo, Maria Cristiana; Russo, Edda; Fondi, Marco; Emiliani, Giovanni; Frandi, Antonio; Brilli, Matteo; Pastorelli, Roberta; Fani, Renato
2009-12-01
In this work a detailed analysis of the structure, the expression and the organization of his genes belonging to the core of histidine biosynthesis (hisBHAF) in 40 newly determined and 13 available sequences of Burkholderia strains was carried out. Data obtained revealed a strong conservation of the structure and organization of these genes through the entire genus. The phylogenetic analysis showed the monophyletic origin of this gene cluster and indicated that it did not undergo horizontal gene transfer events. The analysis of the intergenic regions, based on the substitution rate, entropy plot and bendability suggested the existence of a putative transcription promoter upstream of hisB, that was supported by the genetic analysis that showed that this cluster was able to complement Escherichia coli hisA, hisB, and hisF mutations. Moreover, a preliminary transcriptional analysis and the analysis of microarray data revealed that the expression of the his core was constitutive. These findings are in agreement with the fact that the entire Burkholderia his operon is heterogeneous, in that it contains "alien" genes apparently not involved in histidine biosynthesis. Besides, they also support the idea that the proteobacterial his operon was piece-wisely assembled, i.e. through accretion of smaller units containing only some of the genes (eventually together with their own promoters) involved in this biosynthetic route. The correlation existing between the structure, organization and regulation of his "core" genes and the function(s) they perform in cellular metabolism is discussed.
In silico identification and analysis of phytoene synthase genes in plants.
Han, Y; Zheng, Q S; Wei, Y P; Chen, J; Liu, R; Wan, H J
2015-08-14
In this study, we examined phytoene synthetase (PSY), the first key limiting enzyme in the synthesis of carotenoids and catalyzing the formation of geranylgeranyl pyrophosphate in terpenoid biosynthesis. We used known amino acid sequences of the PSY gene in tomato plants to conduct a genome-wide search and identify putative candidates in 34 sequenced plants. A total of 101 homologous genes were identified. Phylogenetic analysis revealed that PSY evolved independently in algae as well as monocotyledonous and dicotyledonous plants. Our results showed that the amino acid structures exhibited 5 motifs (motifs 1 to 5) in algae and those in higher plants were highly conserved. The PSY gene structures showed that the number of introns in algae varied widely, while the number of introns in higher plants was 4 to 5. Identification of PSY genes in plants and the analysis of the gene structure may provide a theoretical basis for studying evolutionary relationships in future analyses.
Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L
2014-04-08
In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne
2015-02-10
Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki
2016-01-01
Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414
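The abstract does not spell out the test statistic used in the three dimensional permutation test, so the following is only a minimal sketch of the general idea under stated assumptions: the statistic is taken to be the mean pairwise Euclidean distance between mutated residues, and the null distribution is built by drawing the same number of distinct residues at random from the structure. The function names, the coordinate dictionary, and the toy structure are all hypothetical.

```python
import math
import random


def mean_pairwise_distance(coords, positions):
    """Mean Euclidean distance over all pairs of the given residue positions."""
    pairs = [(a, b) for i, a in enumerate(positions) for b in positions[i + 1:]]
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def clustering_p_value(coords, mutated, n_perm=2000, seed=0):
    """One-sided permutation p-value for spatial clustering of mutations.

    coords: dict mapping residue index -> (x, y, z).
    mutated: list of at least two mutated residue indices.
    Assumption: random placements use distinct residues, which simplifies
    the case where several mutations hit the same residue.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, mutated)
    residues = list(coords)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(residues, len(mutated))
        # Count random placements at least as tightly clustered as observed.
        if mean_pairwise_distance(coords, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy structure: 30 residues spaced 4 units apart along a line.
toy_coords = {i: (4 * i, 0, 0) for i in range(30)}
toy_mutated = [10, 11, 12]  # three adjacent residues
p_value = clustering_p_value(toy_coords, toy_mutated)
```

A small p-value indicates that the mutated residues sit closer together in space than random placements typically do, which is the kind of skew in mutation distribution the abstract reports for 106 genes.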
Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus
Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.
2001-01-01
Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. 
These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505
Safo, Sandra E; Li, Shuzhao; Long, Qi
2018-03-01
Integrative analysis of high dimensional omics data is becoming increasingly popular. At the same time, incorporating known functional relationships among variables in analysis of omics data has been shown to help elucidate underlying mechanisms for complex diseases. In this article, our goal is to assess association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study that includes healthy adults at a high risk of developing cardiovascular diseases. Adopting a strategy that is both data-driven and knowledge-based, we develop statistical methods for sparse canonical correlation analysis (CCA) with incorporation of known biological information. Our proposed methods use prior network structural information among genes and among metabolites to guide selection of relevant genes and metabolites in sparse CCA, providing insight on the molecular underpinning of cardiovascular disease. Our simulations demonstrate that the structured sparse CCA methods outperform several existing sparse CCA methods in selecting relevant genes and metabolites when structural information is informative and are robust to mis-specified structural information. Our analysis of the PHI study reveals that a number of gene and metabolic pathways including some known to be associated with cardiovascular diseases are enriched in the set of genes and metabolites selected by our proposed approach. © 2017, The International Biometric Society.
Heendeniya, Ravindra G; Yu, Peiqiang
2017-03-20
Alfalfa (Medicago sativa L.) genotypes transformed with Lc-bHLH and Lc transcription genes were developed with the intention of stimulating proanthocyanidin synthesis in the aerial parts of the plant. To our knowledge, there are no studies on the effect of single-gene and two-gene transformation on chemical functional groups and molecular structure changes in these plants. The objective of this study was to use advanced molecular spectroscopy with multivariate chemometrics to determine chemical functional group intensity and molecular structure changes in alfalfa plants when co-expressing Lc-bHLH and C1-MYB transcriptive flavonoid regulatory genes in comparison with non-transgenic (NT) and AC Grazeland (ACGL) genotypes. The results showed that compared to NT genotype, the presence of double genes (Lc and C1) increased ratios of both the area and peak height of protein structural Amide I/II and the height ratio of α-helix to β-sheet. In carbohydrate-related spectral analysis, the double gene-transformed alfalfa genotypes exhibited lower peak heights at 1370, 1240, 1153, and 1020 cm⁻¹ compared to the NT genotype. Furthermore, the effect of double gene transformation on carbohydrate molecular structure was clearly revealed in the principal component analysis of the spectra. In conclusion, single or double transformation of Lc and C1 genes resulted in changing functional groups and molecular structure related to proteins and carbohydrates compared to the NT alfalfa genotype. The current study provided molecular structural information on the transgenic alfalfa plants and provided an insight into the impact of transgenes on protein and carbohydrate properties and their molecular structure changes.
Genome-Wide Analysis of the NADK Gene Family in Plants
Li, Wen-Yan; Wang, Xiang; Li, Ri; Li, Wen-Qiang; Chen, Kun-Ming
2014-01-01
Background NAD(H) kinase (NADK) is the key enzyme that catalyzes de novo synthesis of NADP(H) from NAD(H) for NADP(H)-based metabolic pathways. In plants, NADKs form functional subfamilies. Studies of these families in Arabidopsis thaliana indicate that they have undergone considerable evolutionary selection; however, the detailed evolutionary history and functions of the various NADKs in plants are not clearly understood. Principal Findings We performed a comparative genomic analysis that identified 74 NADK gene homologs from 24 species representing the eight major plant lineages within the supergroup Plantae: glaucophytes, rhodophytes, chlorophytes, bryophytes, lycophytes, gymnosperms, monocots and eudicots. Phylogenetic and structural analysis classified these NADK genes into four well-conserved subfamilies with considerable variety in the domain organization and gene structure among subfamily members. In addition to the typical NAD_kinase domain, additional domains, such as adenylate kinase, dual-specificity phosphatase, and protein tyrosine phosphatase catalytic domains, were found in subfamily II. Interestingly, NADKs in subfamily III exhibited low sequence similarity (∼30%) in the kinase domain within the subfamily and with the other subfamilies. These observations suggest that gene fusion and exon shuffling may have occurred after gene duplication, leading to specific domain organization seen in subfamilies II and III, respectively. Further analysis of the exon/intron structures showed that single intron loss and gain had occurred, yielding the diversified gene structures, during the process of structural evolution of NADK family genes. 
Finally, both available global microarray data analysis and qRT-PCR experiments revealed that the NADK genes in Arabidopsis and Oryza sativa show different expression patterns in different developmental stages and under several different abiotic/biotic stresses and hormone treatments, underscoring the functional diversity and functional divergence of the NADK family in plants. Conclusions These findings will facilitate further studies of the NADK family and provide valuable information for functional validation of this family in plants. PMID:24968225
Anderson, Olin D; Coleman-Derr, Devin; Gu, Yong Q; Heath, Sekou
2010-06-16
Among the dietary essential amino acids, the most severely limiting in the cereals is lysine. Since cereals make up half of the human diet, lysine limitation has quality/nutritional consequences. The breakdown of lysine is controlled mainly by the catabolic bifunctional enzyme lysine ketoglutarate reductase - saccharopine dehydrogenase (LKR/SDH). The LKR/SDH gene has been reported to produce transcripts for the bifunctional enzyme and separate monofunctional transcripts. In addition to lysine metabolism, this gene has been implicated in a number of metabolic and developmental pathways, which along with its production of multiple transcript types and complex exon/intron structure suggest an important node in plant metabolism. Understanding more about the LKR/SDH gene is thus interesting both from applied standpoint and for basic plant metabolism. The current report describes a wheat genomic fragment containing an LKR/SDH gene and adjacent genes. The wheat LKR/SDH genomic segment was found to originate from the A-genome of wheat, and EST analysis indicates all three LKR/SDH genes in hexaploid wheat are transcriptionally active. A comparison of a set of plant LKR/SDH genes suggests regions of greater sequence conservation likely related to critical enzymatic functions and metabolic controls. Although most plants contain only a single LKR/SDH gene per genome, poplar contains at least two functional bifunctional genes in addition to a monofunctional LKR gene. Analysis of ESTs finds evidence for monofunctional LKR transcripts in switchgrass, and monofunctional SDH transcripts in wheat, Brachypodium, and poplar. The analysis of a wheat LKR/SDH gene and comparative structural and functional analyses among available plant genes provides new information on this important gene. 
Both the structure of the LKR/SDH gene and the immediately adjacent genes show lineage-specific differences between monocots and dicots, and findings suggest variation in activity of LKR/SDH genes among plants. Although most plant genomes seem to contain a single conserved LKR/SDH gene per genome, poplar possesses multiple contiguous genes. A preponderance of SDH transcripts suggests the LKR region may be more rate-limiting. Only switchgrass has EST evidence for LKR monofunctional transcripts. Evidence for monofunctional SDH transcripts shows a novel intron in wheat, Brachypodium, and poplar.
Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni
2005-01-01
Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the propensity of cysteines to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. 
Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helices and signal peptide prediction, provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747
Pittman, Jon K; Hirschi, Kendal D
2016-12-01
The Ca(2+)/Cation Antiporter (CaCA) superfamily is an ancient and widespread family of ion-coupled cation transporters found in nearly all kingdoms of life. In animals, K(+)-dependent and K(+)-independent Na(+)/Ca(2+) exchangers (NCKX and NCX) are important CaCA members. Recently it was proposed that all rice and Arabidopsis CaCA proteins should be classified as NCX proteins. Here we performed phylogenetic analysis of CaCA genes and protein structure homology modelling to further characterise members of this transporter superfamily. Phylogenetic analysis of rice and Arabidopsis CaCAs in comparison with selected CaCA members from non-plant species demonstrated that these genes form clearly distinct families, with the H(+)/Cation exchanger (CAX) and cation/Ca(2+) exchanger (CCX) families dominant in higher plants but the NCKX and NCX families absent. NCX-related Mg(2+)/H(+) exchanger (MHX) and CAX-related Na(+)/Ca(2+) exchanger-like (NCL) proteins are instead present. Analysis of genomes of ten closely-related rice species and four Arabidopsis-related species found that CaCA gene family structures are highly conserved within related plants, apart from minor variation. Protein structures were modelled for OsCAX1a and OsMHX1. Despite exhibiting broad structural conservation, there are clear structural differences observed between the different CaCA types. Members of the CaCA superfamily form clearly distinct families with different phylogenetic, structural and functional characteristics, and therefore should not be simply classified as NCX proteins, which should remain as a separate gene family.
Evidence for cryptic northern refugia in the last glacial period in Cryptomeria japonica
Kimura, Megumi K.; Uchiyama, Kentaro; Nakao, Katsuhiro; Moriguchi, Yoshinari; San Jose-Maldia, Lerma; Tsumura, Yoshihiko
2014-01-01
Background and Aims Distribution shifts and natural selection during past climatic changes are important factors in determining the genetic structure of forest species. In particular, climatic fluctuations during the Quaternary appear to have caused changes in the distribution ranges of plants, and thus strongly affected their genetic structure. This study was undertaken to identify the responses of the conifer Cryptomeria japonica, endemic to the Japanese Archipelago, to past climatic changes using a combination of phylogeography and species distribution modelling (SDM) methods. Specifically, this study focused on the locations of refugia during the last glacial maximum (LGM). Methods Genetic diversity and structure were examined using 20 microsatellite markers in 37 populations of C. japonica. The locations of glacial refugia were assessed using STRUCTURE analysis, and potential habitats under current and past climate conditions were predicted using SDM. The process of genetic divergence was also examined using the approximate Bayesian computation procedure (ABC) in DIY ABC to test the divergence time between the gene pools detected by the STRUCTURE analysis. Key Results STRUCTURE analysis identified four gene pools: northern Tohoku district; from Chubu to Chugoku district; from Tohoku to Shikoku district on the Pacific Ocean side of the Archipelago; and Yakushima Island. DIY ABC analysis indicated that the four gene pools diverged at the same time before the LGM. SDM also indicated potential northern cryptic refugia. Conclusions The combined evidence from microsatellites and SDM clearly indicates that climatic changes have shaped the genetic structure of C. japonica. The gene pool detected in northern Tohoku district is likely to have been established by cryptic northern refugia on the coast of the Japan Sea to the west of the Archipelago. 
The gene pool in Yakushima Island can probably be explained simply by long-term isolation from the other gene pools since the LGM. These results are supported by those of SDM and the predicted divergence time determined using ABC analysis. PMID:25355521
NASA Astrophysics Data System (ADS)
Mittal, Shikha; Banduni, Pooja; Mallikarjuna, Mallana G.; Rao, Atmakuri R.; Jain, Prashant A.; Dash, Prasanta K.; Thirunavukkarasu, Nepolean
2018-05-01
Drought is one of the major threats to maize production. In order to improve the production and to breed tolerant hybrids, understanding the genes and regulatory mechanisms during drought stress is important. Transcription factors (TFs) play a major role in gene regulation and many TFs have been identified in response to drought stress. In our experiment, a set of 15 major TF families comprising 1436 genes was structurally and functionally characterized using in-silico tools and a gene expression assay. All 1436 genes were mapped on the 10 chromosomes of maize. The functional annotation indicated the involvement of these genes in ABA signaling, ROS scavenging, photosynthesis, stomatal regulation, and sucrose metabolism. Duplication was identified as the primary force in divergence and expansion of TF families. Phylogenetic relationships were developed individually for each TF family as well as for the combined TF families. Phylogenetic analysis grouped the TF family genes into TF-specific and mixed groups. Phylogenetic analysis of genes belonging to various TF families suggested that the origin of TFs occurred in the lineage of maize evolution. Gene structure analysis revealed that more genes were intron-rich than intronless. Drought-responsive CREs such as ABREA, ABREB, DRE1 and DRECRTCOREAT have been identified. Expression and interaction analyses identified a leaf-specific bZIP TF, GRMZM2G140355, as a potential contributor toward drought tolerance in maize. We also analyzed the protein-protein interaction network of 269 drought-responsive genes belonging to different drought-related TFs. The information generated on structural and functional characteristics, expression and interaction of the drought-related TF families will be useful to decipher the drought tolerance mechanisms and to derive drought-tolerant genotypes in maize.
Gene and domain duplication in the chordate Otx gene family: insights from amphioxus Otx.
Williams, N A; Holland, P W
1998-05-01
We report the genomic organization and deduced protein sequence of a cephalochordate member of the Otx homeobox gene family (AmphiOtx) and show its probable single-copy state in the genome. We also present molecular phylogenetic analysis indicating that there was a single ancestral Otx gene in the first chordates which was duplicated in the vertebrate lineage after it had split from the lineage leading to the cephalochordates. Duplication of a C-terminal protein domain has occurred specifically in the vertebrate lineage, strengthening the case for a single Otx gene in an ancestral chordate whose gene structure has been retained in an extant cephalochordate. Comparative analysis of protein sequences and published gene expression patterns suggests that the ancestral chordate Otx gene had roles in patterning the anterior mesendoderm and central nervous system. These roles were elaborated following Otx gene duplication in vertebrates, accompanied by regulatory and structural divergence, particularly of Otx1 descendant genes.
Zhu, Xudong; Wang, Mengqi; Li, Xiaopeng; Jiu, Songtao; Wang, Chen; Fang, Jinggui
2017-01-01
Sucrose synthase (SS) is widely considered the key enzyme in plant sugar metabolism, which is critical to plant growth and development, especially fruit quality. The members of the SS gene family have been identified and characterized in multiple plant genomes. However, detailed information about this gene family is lacking in grapevine (Vitis vinifera L.). In this study, we performed a systematic analysis of the grape (V. vinifera) genome and report that there are five SS genes (VvSS1–5) in the grape genome. Comparison of the structures of grape SS genes showed high structural conservation, resulting from selection pressures during the evolutionary process. The segmental duplication of grape SS genes contributed to the expansion of this gene family. The syntenic analyses between grape and soybean (Glycine max) demonstrated that these genes located in corresponding syntenic blocks arose before the divergence of grape and soybean. Phylogenetic analysis revealed distinct evolutionary paths for the grape SS genes. VvSS1/VvSS5, VvSS2/VvSS3 and VvSS4 originated from three ancient SS genes, which were generated by duplication events before the split of monocots and eudicots. Bioinformatics analysis of publicly available microarray data, which was validated by quantitative real-time reverse transcription PCR (qRT-PCR), revealed distinct temporal and spatial expression patterns of VvSS genes in various tissues, organs and developmental stages, as well as in response to biotic and abiotic stresses. Taken together, our results will be beneficial for further investigations into the functions of SS genes in the processes of grape resistance to environmental stresses. PMID:28350372
Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T
2003-08-14
The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.
Fan, Sheng; Zhang, Dong; Xing, Libo; Qi, Siyan; Du, Lisha; Wu, Haiqin; Shao, Hongxia; Li, Youmei; Ma, Juanjuan; Han, Mingyu
2017-08-01
Although INDETERMINATE DOMAIN (IDD) genes encoding specific plant transcription factors have important roles in plant growth and development, little is known about apple IDD (MdIDD) genes and their potential functions in flower induction. In this study, we identified 20 putative IDD genes in apple and named them according to their chromosomal locations. All identified MdIDD genes shared a conserved IDD domain. A phylogenetic analysis separated MdIDDs and other plant IDD genes into four groups. Bioinformatic analysis of chemical characteristics, gene structure, and prediction of protein-protein interactions demonstrated the functional and structural diversity of MdIDD genes. To further uncover their potential functions, we performed analysis of tandem, synteny, and gene duplications, which indicated several paired homologs of IDD genes between apple and Arabidopsis. Additionally, genome duplications also promoted the expansion and evolution of the MdIDD genes. Quantitative real-time PCR revealed that all the MdIDD genes showed distinct expression levels in five different tissues (stems, leaves, buds, flowers, and fruits). Furthermore, the expression levels of candidate MdIDD genes were also investigated in response to various circumstances, including GA treatment (decreased the flowering rate), sugar treatment (increased the flowering rate), alternate-bearing conditions, and two varieties with different flowering intensities. Some of them were affected by exogenous treatments and showed different expression patterns. Additionally, changes in response to alternate-bearing and different-flowering varieties of apple trees indicated that they were also responsive to flower induction. Taken together, our comprehensive analysis provided valuable information for further analysis of IDD genes aiming at flower induction.
Wan, B; Yarbrough, J W; Schultz, T W
2008-01-01
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. One-methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes being modulated by tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Zhang, Shi-tao; Zuo, Chao; Li, Wan-nan; Fu, Xue-qi; Xing, Shu; Zhang, Xiao-ping
2016-02-01
This study aimed to identify key genes related to the effect of estrogen on ovarian cancer. Microarray data (GSE22600) were downloaded from Gene Expression Omnibus. Eight estrogen and seven placebo treatment samples were obtained using a 2 × 2 factorial design, which contained 2 cell lines (PEO4 and 2008) and 2 treatments (estrogen and placebo). Differentially expressed genes (DEGs) were identified by Bayesian methods, with P < 0.05 and |log2FC (fold change)| ≥ 0.5 as the cut-off criteria. Differentially co-expressed genes (DCGs) and differentially regulated genes (DRGs) were identified by the DCe function and the DRsort function, respectively, in the DCGL package. Topological structure analysis was performed on the important transcriptional factors (TFs) and genes in the transcriptional regulatory network using tYNA. Functional enrichment analysis was performed for the DEGs and the important genes using the Gene Ontology and KEGG databases. In total, 465 DEGs were identified. Functional enrichment analysis of DEGs indicated that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. A total of 2285 DCG pairs and 357 DRGs were identified. Topological structure analysis identified 52 important TFs and 65 important genes. Functional enrichment analysis of the important genes showed that TP53 and MLH1 participated in the DNA damage response and that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. TP53, MLH1, ACVR2B, LTBP1 and BMP7 might participate in the pathogenesis of ovarian cancer.
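The cut-off described in this abstract (P < 0.05 together with |log2FC| ≥ 0.5) can be illustrated with a minimal sketch. The record type, function name, gene labels and numeric values below are hypothetical and are not taken from GSE22600 or from the authors' pipeline; only the two thresholds come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    p_value: float
    log2_fc: float


def select_degs(results, p_cutoff=0.05, lfc_cutoff=0.5):
    """Keep genes with P < p_cutoff and |log2FC| >= lfc_cutoff."""
    return [
        r for r in results
        if r.p_value < p_cutoff and abs(r.log2_fc) >= lfc_cutoff
    ]


# Made-up values for illustration only.
example = [
    GeneResult("geneA", 0.010, 0.80),   # passes both thresholds
    GeneResult("geneB", 0.200, 1.50),   # fails the P-value threshold
    GeneResult("geneC", 0.030, -0.60),  # passes: down-regulated, |log2FC| >= 0.5
    GeneResult("geneD", 0.040, 0.30),   # fails the fold-change threshold
]

degs = select_degs(example)
print([r.gene for r in degs])
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes, which is what the |log2FC| notation in the abstract implies.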
The top skin-associated genes: a comparative analysis of human and mouse skin transcriptomes.
Gerber, Peter Arne; Buhren, Bettina Alexandra; Schrumpf, Holger; Homey, Bernhard; Zlotnik, Albert; Hevezi, Peter
2014-06-01
The mouse represents a key model system for the study of the physiology and biochemistry of skin. Comparison of skin between mouse and human is critical for interpretation and application of data from mouse experiments to human disease. Here, we review the current knowledge on structure and immunology of mouse and human skin. Moreover, we present a systematic comparison of human and mouse skin transcriptomes. To this end, we have recently used a genome-wide database of human gene expression to identify genes highly expressed in skin, with no, or limited expression elsewhere - human skin-associated genes (hSAGs). Analysis of our set of hSAGs allowed us to generate a comprehensive molecular characterization of healthy human skin. Here, we used a similar database to generate a list of mouse skin-associated genes (mSAGs). A comparative analysis between the top human (n=666) and mouse (n=873) skin-associated genes (SAGs) revealed a total of only 30.2% identity between the two lists. The majority of shared genes encode proteins that participate in structural and barrier functions. Analysis of the top functional annotation terms revealed an overlap for morphogenesis, cell adhesion, structure, and signal transduction. The results of this analysis, discussed in the context of published data, illustrate the divergence between the molecular makeup of the skin of the two species and offer a probable explanation for why results generated in murine in vivo models often fail to translate to humans.
Wang, Jiang; Yu, Yi; Tang, Kexuan; Liu, Wen; He, Xinyi; Huang, Xi; Deng, Zixin
2010-01-01
Thiopeptide antibiotics are an important class of natural products resulting from posttranslational modifications of ribosomally synthesized peptides. Cyclothiazomycin is a typical thiopeptide antibiotic that has a unique bridged macrocyclic structure derived from an 18-amino-acid structural peptide. Here we report cloning, sequencing, and heterologous expression of the cyclothiazomycin biosynthetic gene cluster from Streptomyces hygroscopicus 10-22. Remarkably, successful heterologous expression of a 22.7-kb gene cluster in Streptomyces lividans 1326 suggested that there is a minimum set of 15 open reading frames that includes all of the functional genes required for cyclothiazomycin production. Six of these genes, cltBCDEFG, flanking the structural gene cltA, were predicted to encode the enzymes required for the main framework of cyclothiazomycin, and two enzymes encoded by a putative operon, cltMN, were hypothesized to participate in the tailoring step to generate the tertiary thioether, leading to the final cyclization of the bridged macrocyclic structure. This rigorous bioinformatics analysis based on heterologous expression of cyclothiazomycin resulted in an ideal biosynthetic model for us to understand the biosynthesis of thiopeptides. PMID:20154110
Chang, Yan-Li; Li, Wen-Yan; Miao, Hai; Yang, Shuai-Qi; Li, Ri; Wang, Xiang; Li, Wen-Qiang; Chen, Kun-Ming
2016-02-23
Plasma membrane NADPH oxidases (NOXs) are key producers of reactive oxygen species under both normal and stress conditions in plants and they form functional subfamilies. Studies of these subfamilies indicated that they show considerable evolutionary selection. We performed a comparative genomic analysis that identified 50 ferric reduction oxidase (FRO) and 77 NOX gene homologs from 20 species representing the eight major plant lineages within the supergroup Plantae: glaucophytes, rhodophytes, chlorophytes, bryophytes, lycophytes, gymnosperms, monocots, and eudicots. Phylogenetic and structural analysis classified these FRO and NOX genes into four well-conserved groups represented as NOX, FRO I, FRO II, and FRO III. Further phylogenetic and exon/intron structure analysis of NOXs showed that single intron loss and gain had occurred, yielding diversified gene structures during the evolution of the NOX family genes, which were classified into four conserved subfamilies represented as Sub.I, Sub.II, Sub.III, and Sub.IV. Additionally, both available global microarray data analysis and quantitative real-time PCR experiments revealed that the NOX genes in Arabidopsis and rice (Oryza sativa) have different expression patterns in different developmental stages and under various abiotic stresses and hormone treatments. Finally, coexpression network analysis of NOX genes in Arabidopsis and rice revealed that NOXs have significantly correlated expression profiles with genes which are involved in plant metabolic and resistance processes. All these results underscore the functional diversity and divergence of the NOX family in plants. This finding will facilitate further studies of the NOX family and provide valuable information for functional validation of this family in plants. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
MAGMA: Generalized Gene-Set Analysis of GWAS Data
de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle
2015-01-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
Ganie, Showkat Ahmad; Pani, Dipti Ranjan; Mondal, Tapan Kumar
2017-01-01
DUF221 domain-containing genes (DDP genes) play important roles in developmental biology, hormone signalling transduction, and responses to abiotic stress. Therefore, to understand their structural and evolutionary relationships, we performed a genome-wide analysis of this important gene family in rice. Further, through comparative genomics, DDP genes from Oryza sativa subsp. (indica), nine different wild species of rice and Arabidopsis were also identified. We also found an expansion of the DDP gene families in rice and Arabidopsis which is due to segmental duplication events in some of the gene family members. In general, strong purifying selection was found acting on all the deduced paralogous and orthologous DDP gene pairs. The data from microarray and subsequent qRT-PCR analysis revealed that although several OsDDPs were differentially regulated under salinity stress, OsDDP6 was upregulated at all the developmental stages in the salt-tolerant rice genotype FL478. Interestingly, OsDDP6 was found to be involved in the proline metabolism pathway as indicated by protein network analysis. The diverse gene structures, varied transmembrane topologies and the differential expression patterns implied the functional diversity in DDP genes. Therefore, the comprehensive evolutionary analysis of DDP genes from different Oryza species and Arabidopsis performed in this study will provide the basis for further functional validation studies vis-à-vis DDP genes of rice and other plant species. PMID:28846681
2011-01-01
Background Coleoid cephalopods (squids and octopuses) have evolved a camera eye, the structure of which is very similar to that found in vertebrates and which is considered a classic example of convergent evolution. Other molluscs, however, possess mirror, pin-hole, or compound eyes, all of which differ from the camera eye in the degree of complexity of the eye structures and neurons participating in the visual circuit. Therefore, genes expressed in the cephalopod eye after divergence from the common molluscan ancestor could be involved in eye evolution through association with the acquisition of new structural components. To clarify the genetic mechanisms that contributed to the evolution of the cephalopod camera eye, we applied comprehensive transcriptomic analysis and conducted developmental validation of candidate genes involved in coleoid cephalopod eye evolution. Results We compared gene expression in the eyes of 6 molluscan (3 cephalopod and 3 non-cephalopod) species and selected 5,707 genes as cephalopod camera eye-specific candidate genes on the basis of homology searches against 3 molluscan species without camera eyes. First, we confirmed the expression of these 5,707 genes in the cephalopod camera eye formation processes by developmental array analysis. Second, using molecular evolutionary (dN/dS) analysis to detect positive selection in the cephalopod lineage, we identified 156 of these genes in which functions appeared to have changed after the divergence of cephalopods from the molluscan ancestor and which contributed to structural and functional diversification. Third, we selected 1,571 genes, expressed in the camera eyes of both cephalopods and vertebrates, which could have independently acquired a function related to eye development at the expression level. 
Finally, as experimental validation, we identified three functionally novel cephalopod camera eye genes related to optic lobe formation in cephalopods by in situ hybridization analysis of embryonic pygmy squid. Conclusion We identified 156 genes positively selected in the cephalopod lineage and 1,571 genes commonly found in the cephalopod and vertebrate camera eyes from the analysis of cephalopod camera eye specificity at the expression level. Experimental validation showed that the cephalopod camera eye-specific candidate genes include those expressed in the outer part of the optic lobes, which are unique to coleoid cephalopods. The results of this study suggest that changes in gene expression and in the primary structure of proteins (through positive selection) from those in the common molluscan ancestor could have contributed, at least in part, to cephalopod camera eye acquisition. PMID:21702923
Sabarinathan, Radhakrishnan; Wenzel, Anne; Novotny, Peter; Tang, Xiaojia; Kalari, Krishna R.; Gorodkin, Jan
2014-01-01
Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. 
These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways. PMID:24416147
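The enrichment comparison reported in this abstract (188 of 803 genes with disruptive SNVs versus 1347 of 6462 genes in the initial data set) can be sketched as a one-sided hypergeometric tail probability, which is the calculation behind a one-sided Fisher's exact test. The abstract does not spell out the contingency table, so treating the 803 genes as a draw from the 6462 is an assumption, and the resulting p-value need not match the reported 0.032 exactly. The function and variable names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` items are sampled without replacement from a
    population of size `population` containing `successes` marked items."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Counts quoted in the abstract.
genes_total = 6462        # genes in the initial data set
cancer_total = 1347       # known cancer-associated genes among them
genes_disruptive = 803    # genes carrying disruptive SNVs
cancer_disruptive = 188   # cancer-associated genes among those

share_disruptive = 100 * cancer_disruptive / genes_disruptive
share_background = 100 * cancer_total / genes_total

# Assumed table layout: the 803 genes are treated as a sample of the 6462.
p_enrich = hypergeom_upper_tail(
    cancer_disruptive, genes_total, cancer_total, genes_disruptive
)

print(f"{share_disruptive:.1f}% vs {share_background:.1f}%")
print(f"one-sided p (assumed table layout): {p_enrich:.4f}")
```

The exact integer arithmetic of `math.comb` avoids floating-point overflow for counts of this size; a statistics library would normally be used instead in practice.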
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. 
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.
Selvin, Joseph; Sathiyanarayanan, Ganesan; Lipton, Anuj N.; Al-Dhabi, Naif Abdullah; Valan Arasu, Mariadhas; Kiran, George S.
2016-01-01
Marine actinobacteria that produce important biological macromolecules, such as lipopeptide and glycolipid biosurfactants, were analyzed, and their potential linkage to type II polyketide synthase (PKS) genes was explored. A unique feature of type II PKS genes is their high amino acid (AA) sequence homology and conserved gene organization. These enzymes mediate the biosynthesis of polyketide natural products of enormous structural complexity and chemical diversity by combinatorial use of various domains. Deciphering how the order of the AA sequence encoded by PKS domains tailors the chemical structure of polyketide analogs therefore remains a great challenge. The present work deals with an in vitro and in silico analysis of PKS type II genes from five actinobacterial species to correlate KS domain architecture and structural features. Our analysis reveals the unique protein domain organization of the iterative type II PKS and KS domain of marine actinobacteria. The findings of this study would have implications for metabolic pathway reconstruction and the design of semi-synthetic genomes to achieve rational design of novel natural products. PMID:26903957
Abdullah, Muhammad; Cao, Yungpeng; Cheng, Xi; Meng, Dandan; Chen, Yu; Shakoor, Awais; Gao, Junshan; Cai, Yongping
2018-05-11
Sucrose synthase (SS) is a key enzyme involved in sucrose metabolism that is critical in plant growth and development, and particularly in fruit quality. Sucrose synthase gene families have been identified and characterized in various plants such as tobacco, grape, rice, and Arabidopsis. However, detailed information about sucrose synthase genes in pear is still lacking. In the present study, we performed a systematic analysis of the pear (Pyrus bretschneideri Rehd.) genome and reported 30 sucrose synthase genes. Subsequently, analyses of gene structure, phylogenetic relationships, chromosomal localization, gene duplications, promoter regions, collinearity, RNA-Seq data and qRT-PCR were conducted on these sucrose synthase genes. The transcript analysis revealed that 10 PbSS genes (30%) were especially expressed in pear fruit development. Additionally, qRT-PCR analysis verified the RNA-Seq data and showed that PbSS30, PbSS24, and PbSS15 have a potential role in the pear fruit development stages. This study provides important insights into the evolution of the sucrose synthase gene family in pear and will assist further investigation of sucrose synthase gene functions in fruit development, fruit quality and resistance to environmental stresses.
Liu, Chaoyang; Wang, Xia; Xu, Yuantao; Deng, Xiuxin; Xu, Qiang
2014-10-01
MYB transcription factors represent one of the largest gene families in plant genomes. Sweet orange (Citrus sinensis) is one of the most important fruit crops worldwide, and its genome has recently been sequenced. This provides an opportunity to investigate the organization and evolutionary characteristics of sweet orange MYB genes from a whole-genome view. In the present study, we identified 100 R2R3-MYB genes in the sweet orange genome. A comprehensive analysis of this gene family was performed, including phylogeny, gene structure, chromosomal localization and expression pattern analyses. The 100 genes were divided into 29 subfamilies based on sequence similarity and phylogeny, and the classification was also well supported by the highly conserved exon/intron structures and motif composition. The phylogenomic comparison of the MYB gene family among sweet orange and the related plant species Arabidopsis, cacao and papaya suggested the existence of functional divergence during evolution. Expression profiling indicated that sweet orange R2R3-MYB genes exhibited distinct temporal and spatial expression patterns. Our analysis suggested that the sweet orange MYB genes may play important roles in different plant biological processes, some of which may be involved in citrus fruit quality. These results will be useful for future functional analysis of the MYB gene family in sweet orange.
Cloning and characterization of a Candida albicans maltase gene involved in sucrose utilization.
Geber, A; Williamson, P R; Rex, J H; Sweeney, E C; Bennett, J E
1992-01-01
In order to isolate the structural gene involved in sucrose utilization, we screened a sucrose-induced Candida albicans cDNA library for clones expressing alpha-glucosidase activity. The C. albicans maltase structural gene (CAMAL2) was isolated. No other clones expressing alpha-glucosidase activity were detected. A genomic CAMAL2 clone was obtained by screening a size-selected genomic library with the cDNA clone. DNA sequence analysis reveals that CAMAL2 encodes a 570-amino-acid protein which shares 50% identity with the maltase structural gene (MAL62) of Saccharomyces carlsbergensis. The substrate specificity of the recombinant protein purified from Escherichia coli identifies the enzyme as a maltase. Northern (RNA) analysis reveals that transcription of CAMAL2 is induced by maltose and sucrose and repressed by glucose. These results suggest that assimilation of sucrose in C. albicans relies on an inducible maltase enzyme. The family of genes controlling sucrose utilization in C. albicans shares similarities with the MAL gene family of Saccharomyces cerevisiae and provides a model system for studying gene regulation in this pathogenic yeast. PMID:1400249
2014-01-01
Background: Non-small cell lung cancer (NSCLC) remains lethal despite the development of numerous drug therapy technologies. About 85% to 90% of lung cancers are NSCLC and the 5-year survival rate is at best still below 50%. Thus, it is important to find druggable target genes in order to develop an effective therapy for NSCLC. Results: Integrated analysis of publicly available gene expression and promoter methylation patterns of two highly aggressive NSCLC cell lines generated by in vivo selection was performed. We selected eleven critical genes that may mediate metastasis using recently proposed principal component analysis based unsupervised feature extraction. The eleven selected genes were significantly related to cancer diagnosis. The tertiary protein structure of the selected genes was inferred by Full Automatic Modeling System, a profile-based protein structure inference software, to determine protein functions and to specify genes that could be potential drug targets. Conclusions: We identified eleven potentially critical genes that may mediate NSCLC metastasis using bioinformatic analysis of publicly available data sets. These genes are potential target genes for the therapy of NSCLC. Among the eleven genes, TINAGL1 and B3GALNT1 are possible candidates for drug compounds that inhibit their gene expression. PMID:25521548
Rajesh, P S; Rai, V Ravishankar
2014-01-03
The aiiA homologous gene is known to encode the AHL-lactonase enzyme, which hydrolyzes the N-acylhomoserine lactone (AHL) quorum sensing signaling molecules produced by Gram-negative bacteria. In this study, the degradation of AHL molecules by cell-free lysate of endophytic Enterobacter species was determined. The percentage of quorum quenching was confirmed and quantified by HPLC (p<0.0001). Amplification and sequence BLAST analysis showed the presence of an aiiA homologous gene in endophytic Enterobacter asburiae VT65, Enterobacter aerogenes VT66 and Enterobacter ludwigii VT70 strains. Sequence alignment analysis revealed the presence of two zinc binding sites, the "HXHXDH" motif, as well as a tyrosine residue at position 194. Based on a known template available at Swiss-Model, a putative tertiary structure of AHL-lactonase was constructed. The results showed that novel endophytic strains of the genus Enterobacter encode a novel aiiA homologous gene and indicated its structural importance for future study.
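The motif search mentioned in the abstract above can be illustrated with a short sketch. It assumes that "X" in "HXHXDH" stands for any single residue; the function name and the toy sequence are hypothetical and do not come from the study.

```python
import re

# "HXHXDH" as quoted in the abstract; X is read here as "any one residue"
# (an assumption). The lookahead wrapper reports overlapping hits as well.
ZINC_MOTIF = re.compile(r"(?=H.H.DH)")


def find_zinc_motif(protein):
    """Return 1-based start positions of HXHXDH in a protein sequence."""
    return [m.start() + 1 for m in ZINC_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence, not a real AiiA protein.
toy_sequence = "MKLAHAHGDHTGG"
print(find_zinc_motif(toy_sequence))  # [5]
```

A pattern match of this kind only flags candidate sites; confirming zinc binding would still need the alignment and structural evidence the abstract describes.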
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2015-01-01
Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons or in neighbouring exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of the existence of evolutionary limitations on exon shuffling.
These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
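The notion of intron phase used in the abstract above can be made concrete with a small sketch. It assumes the usual convention that an intron's phase is the number of coding nucleotides preceding it, modulo 3, so that phase 0 corresponds to an intron lying between two codons. The function name and the exon lengths are hypothetical.

```python
def intron_phases(cds_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron between consecutive coding exons.

    cds_exon_lengths: lengths in nucleotides of the coding part of each exon,
    in transcript order. Phase 0 means the intron falls between two codons.
    """
    phases = []
    coding_so_far = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical three-exon gene: the first intron sits between codons (phase 0),
# the second interrupts a codon after its first base (phase 1).
print(intron_phases([99, 100, 50]))  # [0, 1]
```

Note that this is intron phase as used in exon-shuffling studies; annotation formats may define a per-feature "phase" field differently.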
NASA Astrophysics Data System (ADS)
Arce, DP; Krsticevic, FJ; Ezpeleta, J.; Ponce, SD; Pratta, GR; Tapia, E.
2016-04-01
The small heat shock proteins (sHSPs) have been found to play a critical role under physiological stress conditions by protecting proteins from irreversible aggregation. To characterize the gene expression profile of four sHsps with a tandem gene structure arrangement in the domesticated Solanum lycopersicum (Heinz 1706) genome and its close wild relative Solanum pimpinellifolium (LA1589), differential gene expression analysis using RNA-Seq was conducted at three ripening stages in the fruits of both cultivars. Gene promoter analysis was performed to explain the heterogeneous pattern of gene expression found for these tandemly duplicated sHsps. The in silico results help to refocus wet-lab experiments on tomato sHsp family proteins.
Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu
2017-10-01
The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, which is implicated in DNA binding and protein-protein interactions and is known as the TCP domain. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) has been conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. Moreover, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in developmental stages under plant growing conditions. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
Identification, Classification, and Expression Analysis of GRAS Gene Family in Malus domestica
Fan, Sheng; Zhang, Dong; Gao, Cai; Zhao, Ming; Wu, Haiqin; Li, Youmei; Shen, Yawen; Han, Mingyu
2017-01-01
GRAS genes encode plant-specific transcription factors that play important roles in plant growth and development. However, little is known about the GRAS gene family in apple. In this study, 127 GRAS genes were identified in the apple (Malus domestica Borkh.) genome and named MdGRAS1 to MdGRAS127 according to their chromosomal locations. The chemical characteristics, gene structures and evolutionary relationships of the MdGRAS genes were investigated. The 127 MdGRAS genes could be grouped into eight subfamilies based on their structural features and phylogenetic relationships. Further analysis of gene structures, segmental and tandem duplication, gene phylogeny and tissue-specific expression with the ArrayExpress database indicated their diversification in quantity, structure and function. We further examined the expression pattern of MdGRAS genes during apple flower induction with transcriptome sequencing. Eight MdGRAS genes with higher expression (MdGRAS6, 26, 28, 44, 53, 64, 107, and 122) emerged. Further quantitative reverse transcription PCR indicated that these eight candidate genes showed distinct expression patterns among different tissues (leaves, stems, flowers, buds, and fruits). The transcription levels of the eight genes were also investigated under various flowering-related treatments (GA3, 6-BA, and sucrose) and in different flowering varieties (Yanfu No. 6 and Nagafu No. 2). All were affected by flowering-related conditions and showed different expression levels. Changes in response to these hormone- or sugar-related treatments indicated their potential involvement in apple flower induction. Taken together, our results provide rich resources for studying GRAS genes and clues to their potential use in the genetic improvement of apple flowering, enriching the biological understanding of GRAS genes in apple and their involvement in flower induction of fruit trees. PMID:28503152
Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar
2014-12-01
Lysophosphatidyl acyltransferase (LPAT) is one of the major triacylglycerol synthesis enzymes, controlling the metabolic flow of lysophosphatidic acid to phosphatidic acid. Experimental studies in Arabidopsis have shown that LPAT activity is exhibited primarily by three distinct isoforms, namely the plastid-located LPAT1, the endoplasmic reticulum-located LPAT2, and the soluble isoform of LPAT (solLPAT). In this study, 24 putative genes representing all LPAT isoforms were identified from the analysis of 11 complete genomes including green algae, red algae, diatoms and higher plants. We observed LPAT1 and solLPAT genes to be ubiquitously present in nearly all genomes examined, whereas LPAT2 genes to have evolved more recently in the plant lineage. Phylogenetic analysis indicated that LPAT1, LPAT2 and solLPAT have convergently evolved through separate evolutionary paths and belong to three different gene families, which was further evidenced by their wide divergence at gene structure and sequence level. The genome distribution supports the hypothesis that each gene encoding a LPAT is not duplicated. Mapping of exon-intron structure of LPAT genes to the domain structure of proteins across different algal and plant species indicates that exon shuffling plays no role in the evolution of LPAT genes. Besides the previously defined motifs, several conserved consensus sequences were discovered which could be useful to distinguish different LPAT isoforms. Taken together, this study will enable the generation of experimental approximations to better understand the functional role of algal LPAT in lipid accumulation.
Mu, Min; Lu, Xu-Ke; Wang, Jun-Juan; Wang, De-Long; Yin, Zu-Jun; Wang, Shuai; Fan, Wei-Li; Ye, Wu-Wei
2016-03-18
Trehalose (α-D-glucopyranosyl α-D-glucopyranoside) is a nonreducing disaccharide that is widely distributed in bacteria, fungi, algae, plants and invertebrates. In this study, stress-related trehalose-6-phosphate synthase (TPS) genes were identified in cotton, and their genetic structure and molecular evolution were analyzed with bioinformatics methods, which could lay a foundation for further research on TPS functions in cotton. The genome information of Gossypium raimondii (group D), G. arboreum L. (group A), and G. hirsutum L. (group AD) was used in the study. Fifty-three TPSs were identified, comprising 15 genes in group D, 14 in group A, and 24 in group AD. Real-time PCR analysis was performed to investigate the expression patterns of gene family members. All TPS family members in cotton can be divided into two subfamilies: Class I and Class II. The similarity of the TPS sequences is high within the same species and close among related family members. The genetic structures of the two TPS subfamilies differ, with more introns and a more complicated gene structure in Class I. There is a TPS domain (Glyco_transf_20) at the N-terminus in all TPS family members and a TPP domain (Trehalose_PPase) at the C-terminus in all except GrTPS6, GhTPS4, and GhTPS9. All Class II members contain a UDP-forming domain. The responses to environmental stresses showed that stresses could induce the expression of TPSs but that the expression patterns vary with different stresses. The distribution of TPSs varies with species but is relatively uniform on chromosomes. Genetic structure varies among gene members, and expression levels vary with different stresses and exhibit tissue specificity. The upregulated genes in upland cotton TM-1 are significantly more numerous than those in G. raimondii and G. arboreum L. Shixiya 1.
GeneBuilder: interactive in silico prediction of gene structure.
Milanesi, L; D'Angelo, D; Rogozin, I B
1999-01-01
Prediction of gene structure in newly sequenced DNA has become very important in large genome sequencing projects. This problem is complicated by the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) with the statistical properties of coding sequences (coding potential) and information about homologous proteins, ESTs and repeated elements. We have developed the GeneBuilder system, which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in protein and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and the performances are: 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. The GeneBuilder system is implemented as a part of WebGene at the URL: http://www.itba.mi.cnr.it/webgene and the TRADAT (TRAnscription Database and Analysis Tools) launcher URL: http://www.itba.mi.cnr.it/tradat.
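The nucleotide-level accuracy figures quoted above can be illustrated with a minimal sketch. It assumes the convention common in gene-prediction benchmarks, where sensitivity is TP/(TP+FN), "specificity" is TP/(TP+FP), and the correlation coefficient is computed from the four confusion counts; the abstract does not spell out its formulas, so this reading is an assumption. The function name and the toy labels are hypothetical and unrelated to GeneBuilder's actual test data.

```python
import math


def nucleotide_accuracy(true_coding, predicted_coding):
    """Compare per-nucleotide coding labels (1 = coding, 0 = non-coding).

    Returns (sensitivity, specificity, correlation_coefficient), with
    specificity taken as TP / (TP + FP), an assumed benchmark convention.
    """
    if len(true_coding) != len(predicted_coding):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_coding, predicted_coding):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    denominator = (tp + fn) * (tn + fp) * (tp + fp) * (tn + fn)
    if denominator == 0:
        raise ValueError("metrics are undefined when a row or column total is zero")
    sensitivity = tp / (tp + fn)
    specificity = tp / (tp + fp)
    correlation = (tp * tn - fn * fp) / math.sqrt(denominator)
    return sensitivity, specificity, correlation


# Hypothetical 8-nucleotide example: 3 TP, 1 FN, 1 FP, 3 TN.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
prediction = [1, 1, 1, 0, 1, 0, 0, 0]
print(nucleotide_accuracy(truth, prediction))  # (0.75, 0.75, 0.5)
```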
Cheng, Xi; Wang, Yanan; Abdullah, Muhammad; Li, Manli; Li, Dahui; Gao, Junshan
2017-01-01
Plant type III polyketide synthase (PKS) can catalyse the formation of a series of secondary metabolites with different structures and different biological functions; the enzyme plays an important role in plant growth, development and resistance to stress. At present, the PKS gene has been identified and studied in a variety of plants. Here, we identified 11 PKS genes from upland cotton (Gossypium hirsutum) and compared them with 41 PKS genes in Populus tremula, Vitis vinifera, Malus domestica and Arabidopsis thaliana. According to the phylogenetic tree, a total of 52 PKS genes can be divided into four subfamilies (I–IV). The analysis of gene structures and conserved motifs revealed that most of the PKS genes were composed of two exons and one intron and there are two characteristic conserved domains (Chal_sti_synt_N and Chal_sti_synt_C) of the PKS gene family. In our study of the five species, gene duplication was found in addition to Arabidopsis thaliana and we determined that purifying selection has been of great significance in maintaining the function of PKS gene family. From qRT-PCR analysis and a combination of the role of the accumulation of proanthocyanidins (PAs) in brown cotton fibers, we concluded that five PKS genes are candidate genes involved in brown cotton fiber pigment synthesis. These results are important for the further study of brown cotton PKS genes. It not only reveals the relationship between PKS gene family and pigment in brown cotton, but also creates conditions for improving the quality of brown cotton fiber. PMID:29104824
Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.
Huang, Xin; Li, Hao-ming
2009-08-05
Lovastatin is an effective drug for the treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. Based on the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed with internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA, and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process, lovE is a regulatory gene and the LovE protein is a GAL4-like transcription factor.
NASA Astrophysics Data System (ADS)
Mittal, Shikha; Mallikarjuna, Mallana Gowdra; Rao, Atmakuri R.; Jain, Prashant A.; Dash, Prasanta K.; Thirunavukkarasu, Nepolean
2017-12-01
Calcium-dependent protein kinases (CDPKs) play a major role in the regulation of plant growth and development in response to various stresses, including drought. A set of 32 CDPK genes identified in maize was used to search for orthologs in the model plant Arabidopsis (72) and in major food crops such as rice (78) and sorghum (91). We comprehensively investigated the phylogenetic relationships, annotations, gene duplications, gene structure, divergence time, 3-D protein structures and tissue-specific drought-induced expression of CDPK genes in all four species. Variation in intron frequency among these species likely contributed to the functional diversity of CDPK genes in various stress responses. Protein kinase and protein kinase C phosphorylation site domains were the most conserved motifs identified in all species. Four groups were identified from the sequence-based phylogenetic analysis, in which maize CDPKs were clustered in group III. The time of divergence (Ka/Ks) analysis revealed that the CDPKs evolved through stabilizing selection. Expression data showed that the CDPK genes were highly expressed in the leaf of maize, rice, and sorghum, whereas in Arabidopsis the maximum expression was observed in the root. 3-D protein structures were predicted for the nine genes (Arabidopsis: 2, maize: 2, rice: 3 and sorghum: 2) showing differential expression in at least three species. The predicted 3-D structures were further evaluated and validated by Ramachandran plot, ANOLEA, ProSA and Verify-3D. The superimposed 3-D structures of drought-related orthologous proteins retained a similar folding pattern owing to their conserved nature. Functional annotation revealed the involvement of CDPK genes in various pathways such as osmotic homeostasis, cell protection and root growth. The interactions of CDPK genes in various pathways play a crucial role in imparting drought tolerance through different ABA and MAPK signalling cascades.
Our studies suggest that these selected candidate genes could be targeted in development of drought tolerant cultivars in maize, rice and sorghum through appropriate breeding approaches. Our comparative experiments of CDPK genes could also be extended in the drought stress breeding programmes of the related species.
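The Ka/Ks reasoning mentioned in the abstract above can be sketched as follows. The sketch assumes the standard reading in which a Ka/Ks ratio below 1 indicates purifying (stabilizing) selection and a ratio above 1 indicates positive selection, and the widely used approximation T = Ks / (2λ) for divergence time, where λ is a synonymous substitution rate per site per year. The function names, input values and the rate are illustrative assumptions and are not taken from the study.

```python
def selection_mode(ka, ks):
    """Classify selection from nonsynonymous (Ka) and synonymous (Ks) rates."""
    if ks <= 0:
        raise ValueError("Ka/Ks is undefined when Ks is not positive")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def divergence_time(ks, rate):
    """Approximate divergence time in years as Ks / (2 * rate).

    rate: assumed synonymous substitutions per site per year; it must be
    chosen for the lineage under study and is not given in the abstract.
    """
    if rate <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2 * rate)


# Hypothetical values for one gene pair.
print(selection_mode(0.05, 0.50))    # purifying
print(divergence_time(0.13, 6.5e-9))  # roughly ten million years
```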
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conserved module. PMID:8497043
The WRKY Transcription Factor Genes in Lotus japonicus.
Song, Hui; Wang, Pengfei; Nan, Zhibiao; Wang, Xingjun
2014-01-01
WRKY transcription factor genes play critical roles in plant growth and development, as well as stress responses. WRKY genes have been examined in various higher plants, but they have not been characterized in Lotus japonicus. The recent release of the L. japonicus whole genome sequence provides an opportunity for a genome wide analysis of WRKY genes in this species. In this study, we identified 61 WRKY genes in the L. japonicus genome. Based on the WRKY protein structure, L. japonicus WRKY (LjWRKY) genes can be classified into three groups (I-III). Investigations of gene copy number and gene clusters indicate that only one gene duplication event occurred on chromosome 4 and no clustered genes were detected on chromosomes 3 or 6. Researchers previously believed that group II and III WRKY domains were derived from the C-terminal WRKY domain of group I. Our results suggest that some WRKY genes in group II originated from the N-terminal domain of group I WRKY genes. Additional evidence to support this hypothesis was obtained by Medicago truncatula WRKY (MtWRKY) protein motif analysis. We found that LjWRKY and MtWRKY group III genes are under purifying selection, suggesting that WRKY genes will become increasingly structured and functionally conserved.
Genome-wide analysis of TCP family in tobacco.
Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H
2016-05-23
The TCP family is a transcription factor family whose members are extensively involved in plant growth and development as well as in signal transduction in response to many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed for predicting and analyzing the gene structure, gene expression, phylogeny, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP genes also demonstrated that the majority of these genes play important roles in all tissues, while some genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.
Structure and variation of the mitochondrial genome of fishes.
Satoh, Takashi P; Miya, Masaki; Mabuchi, Kohji; Nishida, Mutsumi
2016-09-07
The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets. An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) illustrated that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, with larger numbers than proposed previously for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as two tRNA gene cluster regions (IQM and WANCY regions). Although many of such gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species. Through a large-scale comparative analysis of 250 fish species mt genomes, we elucidated various structural aspects of the fish mt genome and the encoded genes. 
The present results will be important for understanding functions of the mt genome and developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.
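As an illustration of what counting "invariable sites" across many species involves, the following is a minimal sketch, not the authors' pipeline. It assumes the sequences are already aligned to equal length and that a column counts as invariable only when every sequence carries the same non-gap character; the function name and the toy alignment are hypothetical.

```python
def invariable_sites(aligned_seqs):
    """Return 0-based column indices where all aligned sequences agree.

    Assumption: columns consisting of the gap character '-' are not counted.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    sites = []
    for i in range(length):
        column = {seq[i] for seq in aligned_seqs}
        # One distinct residue in the column, and it is not a gap.
        if len(column) == 1 and "-" not in column:
            sites.append(i)
    return sites


# Hypothetical three-sequence toy alignment (not real rRNA data).
toy_alignment = ["ACGU-A", "ACGA-A", "ACGC-A"]
invariant = invariable_sites(toy_alignment)
print(len(invariant), invariant)
```

On real data the alignment step itself, which this sketch takes as given, largely determines which sites appear invariable.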
Zhang, Bin; Liu, Xia; Zhao, Guangyao; Mao, Xinguo; Li, Ang; Jing, Ruilian
2014-06-01
Wheat (Triticum aestivum L.) is one of the most important crops in the world. Squamosa-promoter binding protein (SBP)-box genes play a critical role in regulating flower and fruit development. In this study, 10 novel SBP-box genes (TaSPL genes) were isolated from wheat (Triticum aestivum L. cultivar Yanzhan 4110). Phylogenetic analysis classified the TaSPL genes into five groups (G1-G5). The motif combinations and expression patterns of the TaSPL genes varied among the five groups, with each having its own distinctive characteristics: TaSPL20/21 in G1 and TaSPL17 in G2 were mainly expressed in the shoot apical meristem and the young ear, and their expression levels responded to development of the ear; TaSPL6/15 belonging to G3 were upregulated and TaSPL1/23 in G4 were downregulated during grain development; the gene in G5 (TaSPL3) was expressed constitutively. Thus, the consistency of the phylogenetic analysis, motif compositions, and expression patterns of the TaSPL genes revealed specific gene structures and functions. On the other hand, the diverse gene structures and different expression patterns suggested that wheat SBP-box genes have a wide range of functions. The results also suggest a potential role for wheat SBP-box genes in ear development. This study provides a significant beginning of functional analysis of SBP-box genes in wheat. © 2014 The Authors. Journal of Integrative Plant Biology Published by Wiley Publishing Asia Pty Ltd on behalf of Institute of Botany, Chinese Academy of Sciences.
Singh, Vikash K.; Jain, Mukesh; Garg, Rohini
2014-01-01
The growth hormone auxin regulates various cellular processes by altering the expression of diverse genes in plants. Among various auxin-responsive genes, GH3 genes maintain endogenous auxin homeostasis by conjugating excess auxin with amino acids. GH3 genes have been characterized in many plant species, but not in legumes. In the present work, we identified members of the GH3 gene family and analyzed their chromosomal distribution, gene structure, gene duplication and phylogeny in different legumes, including chickpea, soybean, Medicago, and Lotus. A comprehensive expression analysis in different vegetative and reproductive tissues/stages revealed that many of the GH3 genes were expressed in a tissue-specific manner. Notably, chickpea CaGH3-3, soybean GmGH3-8 and -25, and Lotus LjGH3-4, -5, -9 and -18 genes were up-regulated in root, indicating their putative role in root development. In addition, chickpea CaGH3-1 and -7, and Medicago MtGH3-7, -8, and -9 were found to be highly induced under drought and/or salt stresses, suggesting their role in abiotic stress responses. We also observed examples of differential expression patterns of duplicated GH3 genes in soybean, indicating their functional diversification. Furthermore, analyses of three-dimensional structures, active site residues and ligand preferences provided molecular insights into the function of GH3 genes in legumes. The analysis presented here would help in investigating the precise function of GH3 genes in legumes during development and stress conditions. PMID:25642236
Qi, Fengxia; Chen, Ping; Caufield, Page W.
2000-01-01
Previously, we reported isolation and characterization of mutacin III and genetic analysis of mutacin III biosynthesis genes from the group III strain of Streptococcus mutans, UA787 (F. Qi, P. Chen, and P. W. Caufield, Appl. Environ. Microbiol. 65:3880–3887, 1999). During the same process of isolating the mutacin III structural gene, we also cloned the structural gene for mutacin I. In this report, we present purification and biochemical characterization of mutacin I from the group I strain CH43 and compare mutacin I and mutacin III biosynthesis genes. The mutacin I biosynthesis gene locus consists of 14 genes in the order mutR, -A, -A′, -B, -C, -D, -P, -T, -F, -E, -G, orfX, orfY, orfZ. mutA is the structural gene for mutacin I, while mutA′ is not required for mutacin I activity. DNA and protein sequence analysis revealed that mutacins I and III are homologous to each other, possibly arising from a common ancestor. The mature mutacin I is 24 amino acids in size and has a molecular mass of 2,364 Da. Ethanethiol modification and peptide sequencing of mutacin I revealed that it contains six dehydrated serines, four of which are probably involved with thioether bridge formation. Comparison of the primary sequence of mutacin I with that of mutacin III and epidermin suggests that mutacin I likely has the same bridging pattern as epidermin. PMID:10919773
Qi, F; Chen, P; Caufield, P W
2000-08-01
Previously, we reported isolation and characterization of mutacin III and genetic analysis of mutacin III biosynthesis genes from the group III strain of Streptococcus mutans, UA787 (F. Qi, P. Chen, and P. W. Caufield, Appl. Environ. Microbiol. 65:3880-3887, 1999). During the same process of isolating the mutacin III structural gene, we also cloned the structural gene for mutacin I. In this report, we present purification and biochemical characterization of mutacin I from the group I strain CH43 and compare mutacin I and mutacin III biosynthesis genes. The mutacin I biosynthesis gene locus consists of 14 genes in the order mutR, -A, -A', -B, -C, -D, -P, -T, -F, -E, -G, orfX, orfY, orfZ. mutA is the structural gene for mutacin I, while mutA' is not required for mutacin I activity. DNA and protein sequence analysis revealed that mutacins I and III are homologous to each other, possibly arising from a common ancestor. The mature mutacin I is 24 amino acids in size and has a molecular mass of 2,364 Da. Ethanethiol modification and peptide sequencing of mutacin I revealed that it contains six dehydrated serines, four of which are probably involved with thioether bridge formation. Comparison of the primary sequence of mutacin I with that of mutacin III and epidermin suggests that mutacin I likely has the same bridging pattern as epidermin.
Zeng, Lingfeng; Deng, Rong; Guo, Ziping; Yang, Shushen; Deng, Xiping
2016-03-16
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a central enzyme in glycolysis. We performed genome-wide identification of GAPDH genes in wheat and analyzed their structural characteristics and expression patterns under abiotic stress. A total of 22 GAPDH genes were identified in wheat cv. Chinese Spring; the phylogenetic and structure analysis showed that these GAPDH genes could be divided into four distinct subfamilies. The expression profiles of GAPDH genes showed tissue specificity across plant developmental stages. The qRT-PCR results revealed that wheat GAPDHs were involved in several abiotic stress responses. Wheat carried 22 GAPDH genes, representing four types of plant GAPDHs (gapA/B, gapC, gapCp and gapN). Whole genome duplication and segmental duplication might account for the expansion of wheat GAPDHs. Expression analysis implied that GAPDHs play roles in plant abiotic stress tolerance.
Xiang, Bo; Yu, Minglan; Liang, Xuemei; Lei, Wei; Huang, Chaohua; Chen, Jing; He, Wenying; Zhang, Tao; Li, Tao; Liu, Kezhi
2017-12-10
To explore common biological pathways for attention deficit hyperactivity disorder (ADHD) and low birth weight (LBW). The i-Gsea4GwasV2 software was used to analyze the result of genome-wide association analysis (GWAS) for LBW (pathways were derived from Reactome), and nominally significant (P < 0.05, FDR < 0.25) pathways were tested for replication in ADHD. Significant pathways were analyzed with DAPPLE and Reactome FI software to identify genes involved in such pathways, with each cluster enriched with the gene ontology (GO). The Centiscape 2.0 software was used to calculate the degree of genetic networks and the betweenness value to explore the core node (gene). Weighted gene co-expression network analysis (WGCNA) was then used to explore the co-expression of genes in these pathways. With gene expression data derived from BrainSpan, GO enrichment was carried out for each gene module. Eleven significant biological pathways were identified in association with LBW, among which two (Selenoamino acid metabolism and Diseases associated with glycosaminoglycan metabolism) were replicated during subsequent ADHD analysis. Network analysis of 130 genes in these pathways revealed that some of the sub-networks are related to the morphology of the cerebellum, development of the hippocampus, and plasticity of synaptic structure. Upon co-expression network analysis, 120 genes passed the quality control and were found to be expressed in 3 gene modules. These modules are mainly related to the regulation of synaptic structure and activity. ADHD and LBW share some biological regulation processes. Anomalies of such processes may predispose to ADHD.
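The abstract above uses node degree (alongside betweenness) to pick out core genes in a network. A minimal sketch of the degree part follows, assuming an undirected network given as a list of gene pairs; it does not reproduce Centiscape, and the gene labels G1 to G4 are hypothetical placeholders.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected network.

    edges: iterable of (node_a, node_b) pairs; self-loops are ignored.
    Each node's score is its neighbour count divided by (n - 1).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical toy network; G1..G4 are placeholder gene names.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
scores = degree_centrality(toy_edges)
hub = max(scores, key=scores.get)
print(hub, scores[hub])
```

Betweenness, the second measure named in the abstract, requires shortest-path counting and is left out of this sketch.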
Differential accumulation of nif structural gene mRNA in Azotobacter vinelandii.
Hamilton, Trinity L; Jacobson, Marty; Ludwig, Marcus; Boyd, Eric S; Bryant, Donald A; Dean, Dennis R; Peters, John W
2011-09-01
Northern analysis was employed to investigate mRNA produced by mutant strains of Azotobacter vinelandii with defined deletions in the nif structural genes and in the intergenic noncoding regions. The results indicate that intergenic RNA secondary structures effect the differential accumulation of transcripts, supporting the high Fe protein-to-MoFe protein ratio required for optimal diazotrophic growth.
The hOGG1 Ser326Cys Gene Polymorphism and Breast Cancer Risk in Saudi Population.
Alanazi, Mohammed; Pathan, Akbar Ali Khan; Shaik, Jilani P; Alhadheq, Abdullah; Khan, Zahid; Khan, Wajahatullah; Al Naeem, Abdulrahman; Parine, Narasimha Reddy
2017-07-01
The purpose of this study was to test the association between human 8-oxoguanine glycosylase 1 (hOGG1) gene polymorphisms and susceptibility to breast cancer in the Saudi population. We also aimed to screen the effect of the hOGG1 Ser326Cys polymorphism on structural and functional properties of the hOGG1 protein using in silico tools. We analyzed four SNPs of the hOGG1 gene among Saudi breast cancer patients along with healthy controls. Genotypes were screened using the TaqMan SNP genotype analysis method. Experimental data were analyzed using Chi-square, t test and logistic regression analysis using SPSS software (v.16). In silico analysis was conducted using Discovery Studio and the HOPE program. Genotypic analysis showed that hOGG1 rs1052133 (Ser326Cys) is significantly associated with breast cancer samples in the Saudi population; however, rs293795 (T>C), rs2072668 (C>G) and rs2075747 (G>A) did not show any association with breast cancer. The hOGG1 SNP rs1052133 (Ser326Cys) minor allele T showed a significant association with breast cancer samples (OR = 1.78, χ2 = 7.86, p = 0.02024). In silico structural analysis was carried out to compare the wild type (Ser326) and mutant (Cys326) protein structures. The structural prediction studies revealed that the Ser326Cys variant may destabilize the protein structure and may disturb hOGG1 function. Taken together, this is the first in silico study report to confirm the Ser326Cys variant effect on structural and functional properties of the hOGG1 gene and the Ser326Cys role in breast cancer susceptibility in the Saudi population.
High-resolution DNA melting analysis in plant research
USDA-ARS's Scientific Manuscript database
Genetic and genomic studies provide valuable insight into the inheritance, structure, organization, and function of genes. The knowledge gained from the analysis of plant genes is beneficial to all aspects of plant research, including crop improvement. New methods and tools are continually developed...
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics have facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. 
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252
2012-01-01
Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may possibly signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. 
The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. With combination of the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create supported basis for the functional prediction of many members in the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Roncaglia, Paola; Howe, Douglas G.; Laulederkind, Stanley J.F.; Khodiyar, Varsha K.; Berardini, Tanya Z.; Tweedie, Susan; Foulger, Rebecca E.; Osumi-Sutherland, David; Campbell, Nancy H.; Huntley, Rachael P.; Talmud, Philippa J.; Blake, Judith A.; Breckenridge, Ross; Riley, Paul R.; Lambiase, Pier D.; Elliott, Perry M.; Clapp, Lucie; Tinker, Andrew; Hill, David P.
2018-01-01
Background: A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. Methods and Results: In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. Conclusions: We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. PMID:29440116
Lovering, Ruth C; Roncaglia, Paola; Howe, Douglas G; Laulederkind, Stanley J F; Khodiyar, Varsha K; Berardini, Tanya Z; Tweedie, Susan; Foulger, Rebecca E; Osumi-Sutherland, David; Campbell, Nancy H; Huntley, Rachael P; Talmud, Philippa J; Blake, Judith A; Breckenridge, Ross; Riley, Paul R; Lambiase, Pier D; Elliott, Perry M; Clapp, Lucie; Tinker, Andrew; Hill, David P
2018-02-01
A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. © 2018 The Authors.
Functional understanding of the diverse exon-intron structures of human GPCR genes.
Hammond, Dorothy A; Olman, Victor; Xu, Ying
2014-02-01
The GPCR genes have a variety of exon-intron structures even though their proteins are all structurally homologous. We have examined all human GPCR genes with at least two functional protein isoforms, totaling 199, aiming to gain an understanding of what may have contributed to the large diversity of the exon-intron structures of the GPCR genes. The 199 genes have a total of 808 known protein splicing isoforms with experimentally verified functions. Our analysis reveals that 1,301 (80.6%) adjacent exon-exon pairs out of the total of 1,613 in the 199 genes have either exactly one exon skipped or the intron in-between retained in at least one of the 808 protein splicing isoforms. This observation has a statistical significance p-value of 2.051762 × 10^-9, assuming that the observed splicing isoforms are independent of the exon-intron structures. Our interpretation of this observation is that the exon boundaries of the GPCR genes are not randomly determined; instead they may be selected to facilitate specific alternative splicing for functional purposes.
Analysis of co-evolving genes in Campylobacter jejuni and C. coli
USDA-ARS's Scientific Manuscript database
Background: The population structure of Campylobacter has been frequently studied by MLST, for which fragments of housekeeping genes are compared. We wished to determine if the used MLST genes are representative of the complete genome. Methods: A set of 1029 core gene families (CGF) was identifie...
Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A
2017-04-01
Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a mosaic gene networks modeling method based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of graphs of generated mosaic gene regulatory networks has revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on expression of genes corresponding to the vertices with high properties of centrality.
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa
2015-01-01
Background Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of the functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. Results One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered as the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. 
Conclusions These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
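The "phase" of an exon boundary mentioned above can be made concrete with a short sketch. It assumes the standard convention that an intron's phase is the number of coding bases preceding it modulo 3, so phase 0 falls between two codons (the "intercodon gap" in the abstract). The function name and the exon lengths are hypothetical, not data from the study.

```python
def intron_phases(cds_exon_lengths):
    """Phase of each intron, given coding-exon lengths in transcript order.

    Phase 0: intron lies between two codons.
    Phase 1 or 2: intron splits a codon after its first or second base.
    """
    phases = []
    coding_bases_so_far = 0
    # The last exon is followed by no intron, so it is excluded.
    for length in cds_exon_lengths[:-1]:
        coding_bases_so_far += length
        phases.append(coding_bases_so_far % 3)
    return phases


# Hypothetical gene with four coding exons (300 coding bases in total).
example_phases = intron_phases([90, 61, 50, 99])
print(example_phases)  # [0, 1, 0]
```

An exon flanked by introns of the same phase can be inserted or removed without shifting the reading frame, which is why phase distributions are often discussed in connection with exon shuffling.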
Comparative Reannotation of 21 Aspergillus Genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salamov, Asaf; Riley, Robert; Kuo, Alan
2013-03-08
We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of different kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse 2 or more real genes, or, alternatively, models that split a real gene into 2 or more pieces. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
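The iterate-until-stable selection described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: each gene model is reduced to a set of intron positions in a shared coordinate system, similarity is Jaccard overlap, and the names `select_models`, `jaccard` and the sample data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_models(candidates, max_rounds=50):
    """candidates[family][genome] is a list of competing models (frozensets).

    Start from the first model at each locus, then repeatedly pick, per locus,
    the candidate most similar to the models currently chosen in the other
    genomes, until a full pass changes nothing (or max_rounds is reached).
    """
    chosen = {fam: {g: models[0] for g, models in by_genome.items()}
              for fam, by_genome in candidates.items()}
    for _ in range(max_rounds):
        changed = False
        for fam, by_genome in candidates.items():
            for genome, models in by_genome.items():
                others = [m for g, m in chosen[fam].items() if g != genome]
                best = max(models,
                           key=lambda m: sum(jaccard(m, o) for o in others))
                if best != chosen[fam][genome]:
                    chosen[fam][genome] = best
                    changed = True
        if not changed:
            break
    return chosen


# Hypothetical orthologous family with competing models in three genomes.
toy = {"famA": {
    "g1": [frozenset({100, 250}), frozenset({100, 300})],
    "g2": [frozenset({100, 300})],
    "g3": [frozenset({100, 300}), frozenset({90})],
}}
result = select_models(toy)
print(result["famA"]["g1"] == frozenset({100, 300}))
```

The `max_rounds` cap is a safeguard added for the sketch, since a greedy per-locus update is not guaranteed to settle on arbitrary inputs.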
Phylogenetics and evolution of Trx SET genes in fully sequenced land plants.
Zhu, Xinyu; Chen, Caoyi; Wang, Baohua
2012-04-01
Plant Trx SET proteins are involved in H3K4 methylation and play a key role in plant floral development. Genes encoding Trx SET proteins constitute a multigene family in which the copy number varies among plant species and functional divergence appears to have occurred repeatedly. To investigate the evolutionary history of the Trx SET gene family, we performed a comprehensive evolutionary analysis of this gene family from 13 major representatives of green plants. A novel cluster (here named the cpTrx clade), which included the previously resolved III-1, III-2, and III-4 orthologous groups, was identified. Our analysis showed that plant Trx proteins possessed a variety of domain organizations and gene structures among paralogs. Additional domains such as PHD, PWWP, and FYR were integrated early into the primordial SET-PostSET domain organization of the cpTrx clade. We suggest that the PostSET domain was lost in some members of the III-4 orthologous group during the evolution of land plants. At least four classes of gene structures had been formed at the early evolutionary stage of land plants. Three intronless orphan Trx SET genes from Physcomitrella patens (moss) were identified, and supposedly, their parental genes have been eliminated from the genome. The structural differences among evolutionary groups of plant Trx SET genes with different functions were described, contributing to the design of further experimental studies.
Bioinformatics analysis of the predicted polyprenol reductase genes in higher plants
NASA Astrophysics Data System (ADS)
Basyuni, M.; Wati, R.
2018-03-01
The present study used bioinformatics methods to analyze twenty-four predicted polyprenol reductase genes from higher plants in GenBank and to predict their structure, composition, similarity, subcellular localization, and phylogeny. The physicochemical properties of plant polyprenol reductases showed diversity among the observed genes. The proportions of secondary structure elements in plant polyprenol reductase genes followed the order α helix > random coil > extended chain structure. The predicted values for chloroplast transit peptides, but not signal peptides, were too low, indicating that few plant polyprenol reductase genes carry a chloroplast transit peptide. The likelihood of a potential transit peptide varied among the plant polyprenol reductases, suggesting the importance of understanding the variety of peptide components of plant polyprenol reductase genes. To clarify this finding, a phylogenetic tree was drawn. The tree shows several branches, suggesting that plant polyprenol reductase genes group into divergent clusters.
JPRS Report, Science and Technology USSR: Life Sciences.
1990-07-16
4 1 VETERINARY MEDICINE Primary Structure of RNA Polymerase Gene of Foot-and-Mouth Disease Virus ( FMDV ...neering were used to obtain cDNA corresponding to the Primary Structure of RNA Polymerase Gene of RNA polymerase gene to FMDV A 2 2 , with a map of the...Foot-and-Mouth Disease Virus ( FMDV ) A22 primary nucleotide sequence of the cDNA provided. 18400538F Moscow BIOORGANICHESKA YA Analysis of the data
Li, Ping; Jiang, Zhou; Wang, Yanhong; Deng, Ye; Van Nostrand, Joy D; Yuan, Tong; Liu, Han; Wei, Dazhun; Zhou, Jizhong
2017-10-15
Microbial functional potential in high arsenic (As) groundwater ecosystems remains largely unknown. In this study, the microbial community functional composition of nineteen groundwater samples was investigated using a functional gene array (GeoChip 5.0). Samples were divided into low and high As groups based on the clustering analysis of geochemical parameters and microbial functional structures. The results showed that As related genes (arsC, arrA), sulfate related genes (dsrA and dsrB), nitrogen cycling related genes (ureC, amoA, and hzo) and methanogen genes (mcrA, hdrB) in groundwater samples were correlated with As, SO₄²⁻, NH₄⁺ or CH₄ concentrations, respectively. Canonical correspondence analysis (CCA) results indicated that some geochemical parameters including As, total organic content, SO₄²⁻, NH₄⁺, oxidation-reduction potential (ORP) and pH were important factors shaping the functional microbial community structures. Alkaline and reducing conditions with relatively low SO₄²⁻, ORP, and high NH₄⁺, as well as SO₄²⁻ and Fe reduction and ammonification involved in microbially-mediated geochemical processes could be associated with As enrichment in groundwater. This study provides an overall picture of functional microbial communities in high As groundwater aquifers, and also provides insights into the critical role of microorganisms in As biogeochemical cycling. Copyright © 2017 Elsevier Ltd. All rights reserved.
The WRKY Transcription Factor Genes in Lotus japonicus
Wang, Pengfei; Wang, Xingjun
2014-01-01
WRKY transcription factor genes play critical roles in plant growth and development, as well as stress responses. WRKY genes have been examined in various higher plants, but they have not been characterized in Lotus japonicus. The recent release of the L. japonicus whole genome sequence provides an opportunity for a genome wide analysis of WRKY genes in this species. In this study, we identified 61 WRKY genes in the L. japonicus genome. Based on the WRKY protein structure, L. japonicus WRKY (LjWRKY) genes can be classified into three groups (I–III). Investigations of gene copy number and gene clusters indicate that only one gene duplication event occurred on chromosome 4 and no clustered genes were detected on chromosomes 3 or 6. Researchers previously believed that group II and III WRKY domains were derived from the C-terminal WRKY domain of group I. Our results suggest that some WRKY genes in group II originated from the N-terminal domain of group I WRKY genes. Additional evidence to support this hypothesis was obtained by Medicago truncatula WRKY (MtWRKY) protein motif analysis. We found that LjWRKY and MtWRKY group III genes are under purifying selection, suggesting that WRKY genes will become increasingly structured and functionally conserved. PMID:24745006
Intron-loss evolution of hatching enzyme genes in Teleostei
2010-01-01
Background Hatching enzyme, belonging to the astacin metallo-protease family, digests egg envelope at embryo hatching. Orthologous genes of the enzyme are found in all vertebrate genomes. Recently, we found that exon-intron structures of the genes were conserved among tetrapods, while the genes of teleosts frequently lost their introns. Occurrence of such intron losses in teleostean hatching enzyme genes is an uncommon evolutionary event, as most eukaryotic genes are generally known to be interrupted by introns and the intron insertion sites are conserved from species to species. Here, we report on extensive studies of the exon-intron structures of teleostean hatching enzyme genes for insight into how and why introns were lost during evolution. Results We investigated the evolutionary pathway of intron-losses in hatching enzyme genes of 27 species of Teleostei. Hatching enzyme genes of basal teleosts are of only one type, which conserves the 9-exon-8-intron structure of an assumed ancestor. On the other hand, otocephalans and euteleosts possess two types of hatching enzyme genes, suggesting a gene duplication event in the common ancestor of otocephalans and euteleosts. The duplicated genes were classified into two clades, clades I and II, based on phylogenetic analysis. In otocephalans and euteleosts, clade I genes developed a phylogeny-specific structure, such as an 8-exon-7-intron, 5-exon-4-intron, 4-exon-3-intron or intron-less structure. In contrast to the clade I genes, the structures of clade II genes were relatively stable in their configuration, and were similar to that of the ancestral genes. Expression analyses revealed that hatching enzyme genes were highly expressed compared with housekeeping genes. When expression levels were compared between clade I and II genes, clade I genes tended to be expressed more highly than clade II genes.
Conclusions Hatching enzyme genes evolved to lose their introns, and the intron-loss events occurred at the specific points of teleostean phylogeny. We propose that the high-expression hatching enzyme genes frequently lost their introns during the evolution of teleosts, while the low-expression genes maintained the exon-intron structure of the ancestral gene. PMID:20796321
Diao, Weiping; Snyder, John C.; Liu, Jinbing; Pan, Baogui; Guo, Guangjun; Ge, Wei; Dawood, Mohammad Hasan Salman Ali
2018-01-01
The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or fewer introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns.
Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with different types of stress. Our results also showed that CaNAC36 plays an important role in the interaction network, interacting with 48 genes. Most of these genes are in the mitogen-activated protein kinase (MAPK) family. Taken together, our results provide a platform for further studies to identify the biological functions of CaNAC genes. PMID:29596349
The limitations of simple gene set enrichment analysis assuming gene independence.
Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P
2016-02-01
Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. 2009 as a serious contender. The argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with those of Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, owing to the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. © The Author(s) 2012.
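The variance inflation discussed in this abstract can be illustrated with a standard result: for n gene scores with unit variance and a common pairwise correlation rho, the variance of the set's mean score is larger than in the independent case by a factor of 1 + (n - 1) * rho. The sketch below only illustrates that equal-correlation case; the function name is hypothetical and this is not the benchmark procedure used by the authors.

```python
def variance_inflation(n_genes: int, rho: float) -> float:
    """Inflation factor of Var(mean of n_genes scores) under equal pairwise correlation rho.

    Assumes unit-variance gene scores; with rho = 0 the factor is 1 (independence).
    """
    if n_genes < 1:
        raise ValueError("n_genes must be at least 1")
    return 1.0 + (n_genes - 1) * rho


if __name__ == "__main__":
    # Even weak correlations inflate the variance noticeably for a 50-gene set.
    for rho in (0.0, 0.05, 0.2):
        print(rho, variance_inflation(50, rho))
```

A test that assumes independence underestimates the standard error of the set score by the square root of this factor, which is why small positive correlations can produce many spurious enrichments.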
Le Bail, Aude; Scholz, Sebastian; Kost, Benedikt
2013-01-01
The use of the moss Physcomitrella patens as a model system to study plant development and physiology is rapidly expanding. The strategic position of P. patens within the green lineage between algae and vascular plants, the high efficiency with which transgenes are incorporated by homologous recombination, advantages associated with the haploid gametophyte representing the dominant phase of the P. patens life cycle, the simple structure of protonemata, leafy shoots and rhizoids that constitute the haploid gametophyte, as well as a readily accessible high-quality genome sequence make this moss a very attractive experimental system. The investigation of the genetic and hormonal control of P. patens development heavily depends on the analysis of gene expression patterns by real time quantitative PCR (RT qPCR). This technique requires well characterized sets of reference genes, which display minimal expression level variations under all analyzed conditions, for data normalization. Sets of suitable reference genes have been described for most widely used model systems including e.g. Arabidopsis thaliana, but not for P. patens. Here, we present a RT qPCR based comparison of transcript levels of 12 selected candidate reference genes in a range of gametophytic P. patens structures at different developmental stages, and in P. patens protonemata treated with hormones or hormone transport inhibitors. Analysis of these RT qPCR data using GeNorm and NormFinder software resulted in the identification of sets of P. patens reference genes suitable for gene expression analysis under all tested conditions, and suggested that the two best reference genes are sufficient for effective data normalization under each of these conditions. PMID:23951063
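As an illustration of how reference-gene stability can be scored, the sketch below computes a geNorm-style M value: for each gene, the mean over all other genes of the standard deviation (across samples) of the pairwise log2 expression ratio. A lower M indicates more stable expression. The gene names and expression values are made up, and this is a simplified sketch rather than the GeNorm software itself.

```python
import math
import statistics


def stability_m(expression: dict[str, list[float]]) -> dict[str, float]:
    """Return a geNorm-style M value per gene from relative expression quantities."""
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical relative quantities for three candidate genes in three samples.
EXAMPLE = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],
    "refC": [1.0, 1.0, 8.0],
}
```

In this toy input refA and refB stay in a constant ratio across samples, so they score as more stable than refC.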
Matus, José Tomás; Aquea, Felipe; Arce-Johnson, Patricio
2008-01-01
Background The MYB superfamily constitutes the most abundant group of transcription factors described in plants. Members control processes such as epidermal cell differentiation, stomatal aperture, flavonoid synthesis, cold and drought tolerance and pathogen resistance. No genome-wide characterization of this family has been conducted in a woody species such as grapevine. In addition, previous analysis of the recently released grape genome sequence suggested expansion events of several gene families involved in wine quality. Results We describe and classify 108 members of the grape R2R3 MYB gene subfamily in terms of their genomic gene structures and similarity to their putative Arabidopsis thaliana orthologues. Seven gene models were derived and analyzed in terms of gene expression and their DNA binding domain structures. Despite low overall sequence homology in the C-terminus of all proteins, even in those with similar functions across Arabidopsis and Vitis, highly conserved motif sequences and exon lengths were found. The grape epidermal cell fate clade is expanded when compared with the Arabidopsis and rice MYB subfamilies. Two anthocyanin MYBA related clusters were identified in chromosomes 2 and 14, one of which includes the previously described grape colour locus. Tannin related loci were also detected with eight candidate homologues in chromosomes 4, 9 and 11. Conclusion This genome wide transcription factor analysis in Vitis suggests that clade-specific grape R2R3 MYB genes are expanded while other MYB genes could be well conserved compared to Arabidopsis. MYB gene abundance, homology and orientation within particular loci also suggests that expanded MYB clades conferring quality attributes of grapes and wines, such as colour and astringency, could possess redundant, overlapping and cooperative functions. PMID:18647406
Setoh, Yin Xiang; Prow, Natalie A; Rawle, Daniel J; Tan, Cindy Si En; Edmonds, Judith H; Hall, Roy A; Khromykh, Alexander A
2015-06-01
A variant Australian West Nile virus (WNV) strain, WNVNSW2011, emerged in 2011 causing an unprecedented outbreak of encephalitis in horses in south-eastern Australia. However, no human cases associated with this strain have yet been reported. Studies using mouse models for WNV pathogenesis showed that WNVNSW2011 was less virulent than the human-pathogenic American strain of WNV, New York 99 (WNVNY99). To identify viral genes and mutations responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, we constructed chimeric viruses with substitution of large genomic regions coding for the structural genes, non-structural genes and untranslated regions, as well as seven individual non-structural gene chimeras, using a modified circular polymerase extension cloning method. Our results showed that the complete non-structural region of WNVNSW2011, when substituted with that of WNVNY99, significantly enhanced viral replication and the ability to suppress type I IFN response in cells, resulting in higher virulence in mice. Analysis of the individual non-structural gene chimeras showed a predominant contribution of WNVNY99 NS3 to increased virus replication and evasion of IFN response in cells, and to virulence in mice. Other WNVNY99 non-structural proteins (NS2A, NS4B and NS5) were shown to contribute to the modulation of IFN response. Thus a combination of non-structural proteins, likely NS2A, NS3, NS4B and NS5, is primarily responsible for the difference in virulence between WNVNSW2011 and WNVNY99 strains, and accumulative mutations within these proteins would likely be required for the Australian WNVNSW2011 strain to become significantly more virulent. © 2015 The Authors.
Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava
Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian
2016-01-01
The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava. PMID:26904033
Zhou, Mi; Yan, Jun; Ma, Zhaowu; Zhou, Yang; Abbood, Nibras Najm; Liu, Jianfeng; Su, Li; Jia, Haibo; Guo, An-Yuan
2012-01-01
Background HES/HEY genes encode a family of basic helix-loop-helix (bHLH) transcription factors with both bHLH and Orange domains. HES/HEY proteins are direct targets of the Notch signaling pathway and play an essential role in developmental decisions, such as nervous system development, somitogenesis, and blood vessel and heart development. Despite their important functions, the origin and evolution of the HES/HEY gene family have yet to be elucidated. Methods and Findings In this study, we identified genes of the HES/HEY family in representative species and performed evolutionary analysis to elucidate their origin and evolutionary process. Our results showed that the HES/HEY genes exist only in metazoans and may have originated in the common ancestor of metazoans. We identified HES/HEY genes in more than 10 species representing the main lineages. Combining the bHLH and Orange domain sequences, we constructed phylogenetic trees by different methods (Bayesian, ML, NJ and ME) and classified the HES/HEY gene family into four groups. Our results indicated that this gene family had undergone three expansions, which coincided with the origins of Eumetazoa, vertebrates, and teleosts. Gene structure analysis revealed that the HES/HEY genes underwent exon and/or intron loss in different species lineages. Genes of this family were duplicated in bony fishes, doubling their number relative to other vertebrates. Furthermore, we studied the teleost-specific duplications in zebrafish and investigated the expression pattern of duplicated genes in different tissues by RT-PCR. Finally, we proposed a model to show the evolution of this gene family with processes of expansion, exon/intron loss, and motif loss. Conclusions Our study revealed the evolution of the HES/HEY gene family and the expression and functional divergence of duplicated genes, which also provides clues for research on Notch function in development. This study shows a model of gene family analysis with gene structure evolution and duplication. PMID:22808219
Deng, Peng; Tan, Xiaoqing; Wu, Ying; Bai, Qunhua; Jia, Yan; Xiao, Hong
2015-03-01
The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted to GenBank under the accession number KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumoniae and Raoultella ornithinolytica, which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have an 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function. PMID:25667630
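The reported ORF size and protein length are consistent with each other, as a quick check shows: an ORF of 567 bp contains 189 codons, one of which is the stop codon. The helper below is a hypothetical illustration of that arithmetic, not part of ORF Finder.

```python
def protein_length_from_orf(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0 or orf_bp < 6:
        raise ValueError("ORF length must be a multiple of 3 covering start and stop codons")
    return orf_bp // 3 - 1  # subtract the stop codon


print(protein_length_from_orf(567))  # 188
```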
Supervised group Lasso with applications to microarray data analysis
Ma, Shuangge; Song, Xiao; Huang, Jian
2007-01-01
Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
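The cluster-level selection in the second step depends on the group Lasso penalty, which shrinks all coefficients of a cluster together and can set a whole cluster to zero. One standard building block for fitting such a penalty is block soft-thresholding (the proximal operator of the group penalty), sketched below. The abstract does not say which solver the authors used, so this is only an illustration; the function name and the numbers are hypothetical.

```python
import math


def group_soft_threshold(coefs: list[float], lam: float) -> list[float]:
    """Shrink a group's coefficient vector toward zero by lam in Euclidean norm.

    Returns an all-zero vector when the group's norm is at most lam,
    which is how the group Lasso drops an entire cluster.
    """
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= lam:
        return [0.0] * len(coefs)
    scale = 1.0 - lam / norm
    return [scale * c for c in coefs]


weak_cluster = group_soft_threshold([0.3, -0.4], lam=1.0)    # norm 0.5, dropped
strong_cluster = group_soft_threshold([3.0, -4.0], lam=1.0)  # norm 5.0, shrunk by 20%
```

Unlike the ordinary Lasso, which thresholds each coefficient separately, the decision here is made per cluster, so a gene with a small coefficient survives if its cluster as a whole is influential.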
Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.
Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun
2017-12-21
Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
Finally, we found biological evidence showing that gene pairs which are not molecularly interacting but are dynamically influential can be considered as novel gene-gene relationships. Taken together, construction and analysis of the GDI network can be a useful approach to identify novel gene-gene relationships in terms of the dynamical influence.
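The idea of a dynamical influence without a direct interaction can be sketched on a toy Boolean network. The abstract does not give the exact GDI formula, so the measure below is an assumption made for illustration: the fraction of time steps at which the target gene's state differs between the wild-type and the source-knockout trajectory, averaged over all initial states. The three-gene network, the synchronous update scheme, and all names are hypothetical.

```python
from itertools import product


def simulate(rules, state, steps, knockout=None):
    """Synchronously update a Boolean network; a knocked-out gene is held at 0."""
    state = dict(state)
    if knockout is not None:
        state[knockout] = 0
    trajectory = [dict(state)]
    for _ in range(steps):
        state = {gene: int(rule(state)) for gene, rule in rules.items()}
        if knockout is not None:
            state[knockout] = 0
        trajectory.append(dict(state))
    return trajectory


def influence(rules, source, target, steps=10):
    """Mean fraction of time steps where the target differs after knocking out the source."""
    genes = sorted(rules)
    total = 0.0
    n_states = 0
    for values in product([0, 1], repeat=len(genes)):
        init = dict(zip(genes, values))
        wild = simulate(rules, init, steps)
        mutant = simulate(rules, init, steps, knockout=source)
        differing = sum(w[target] != m[target] for w, m in zip(wild, mutant))
        total += differing / (steps + 1)
        n_states += 1
    return total / n_states


# Toy chain: A sustains itself, A activates B, B activates C.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
```

Here A and C share no direct edge, yet knocking out A changes the trajectory of C through B, while knocking out C leaves A untouched. This mirrors the observation that the influence network is denser than the interaction network.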
Filiz, Ertugrul; Ozyigit, Ibrahim Ilker; Vatansever, Recep
2015-10-01
GolS genes stand as potential candidate genes for molecular breeding and/or engineering programs for improving abiotic stress tolerance in plant species. In this study, a total of six galactinol synthase (GolS) genes/proteins were retrieved for Solanum lycopersicum and Brachypodium distachyon. GolS protein sequences were found to include the glyco_transf_8 (PF01501) domain structure, and to have similar molecular weights (36.40-39.59 kDa) and amino acid lengths (318-347 aa) with a slightly acidic pI (5.35-6.40). The sub-cellular location was mainly predicted as cytoplasmic. S. lycopersicum genes were located on chr 1 and 2, and included one segmental duplication, while genes of B. distachyon were only on chr 1 with one tandem duplication. GolS sequences were found to have well conserved motif structures. Cis-acting analysis was performed for three abiotic stress responsive elements, including ABA responsive element (ABRE), dehydration and cold responsive elements (DRE/CRT) and low-temperature responsive element (LTRE). ABRE elements were found in all GolS genes, except for SlGolS4; DRE/CRT was not detected in any GolS genes and LTRE elements were found in the SlGolS1 and BdGolS1 genes. AU analysis in UTR and ORF regions indicated that SlGolS and BdGolS mRNAs may have a short half-life. SlGolS3 and SlGolS4 genes may generate more stable transcripts since they included the AATTAAA motif for polyadenylation signal POLASIG2. Secondary structures of SlGolS proteins were better conserved than those of BdGolS. Some structural divergences were detected in 3D structures and predicted binding sites exhibited various patterns in GolS proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
D'Onofrio, Giuseppe; Ghosh, Tapash Chandra
2005-01-17
Fluctuations and increments of both C(3) and G(3) levels along the human coding sequences were investigated comparing two sets of Xenopus/human orthologous genes. The first set of genes shows minor differences of the GC(3) levels, the second shows considerable increments of the GC(3) levels in the human genes. In both data sets, the fluctuations of C(3) and G(3) levels along the coding sequences correlated with the secondary structures of the encoded proteins. The human genes that underwent the compositional transition showed a different increment of the C(3) and G(3) levels within and among the structural units of the proteins. The relative synonymous codon usage (RSCU) of several amino acids were also affected during the compositional transition, showing that there exists a correlation between RSCU and protein secondary structures in human genes. The importance of natural selection for the formation of isochore organization of the human genome has been discussed on the basis of these results.
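The two quantities this abstract relies on can be computed directly from a coding sequence. The sketch below assumes the usual definitions: GC3 is the fraction of codons with G or C at the third position, and RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function names and the example counts are hypothetical, and input sequences are assumed to be uppercase and in frame.

```python
def gc3(cds: str) -> float:
    """Fraction of codons whose third position is G or C."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


def rscu(codon_counts: dict[str, int]) -> dict[str, float]:
    """RSCU for the synonymous codons of one amino acid.

    Each observed count is divided by the count expected under equal usage.
    The counts must not all be zero.
    """
    expected = sum(codon_counts.values()) / len(codon_counts)
    return {codon: count / expected for codon, count in codon_counts.items()}


# Hypothetical alanine codon counts from one gene.
alanine = rscu({"GCT": 2, "GCC": 6, "GCA": 0, "GCG": 0})
```

An RSCU above 1 marks a codon used more often than expected under equal usage; in the example, GCC has an RSCU of 3.0 and GCT of 1.0.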
Genome Structure of the Legume, Lotus japonicus
Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi
2008-01-01
The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
Zega, Alessandra; D'Ovidio, Renato
2016-11-01
Pectin methyl esterase (PME) genes code for enzymes that are involved in structural modifications of the plant cell wall during plant growth and development. They are also involved in plant-pathogen interaction. PME genes belong to a multigene family, and in this study we report the first comprehensive analysis of the PME gene family in bread wheat (Triticum aestivum L.). As in other species, the members of the TaPME family are dispersed throughout the genome and their encoded products retain the typical structural features of PMEs. qRT-PCR analysis showed variation in the expression pattern of TaPME genes in different tissues and revealed that these genes are mainly expressed in flowering spikes. In our attempt to identify putative TaPME genes involved in wheat defense, we found strong variation in the expression of the TaPME genes following infection with Fusarium graminearum, the causal agent of Fusarium head blight (FHB). Particularly interesting was the finding that the expression profile of some PME genes was markedly different between the FHB-resistant wheat cultivar Sumai3 and the FHB-susceptible cultivar Bobwhite, suggesting a possible involvement of these PME genes in FHB resistance. Moreover, the expression analysis of the TaPME genes during F. graminearum progression within the spike revealed those genes that responded more promptly to pathogen invasion. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.
Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui
2016-06-01
In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed across the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on their domains, these genes are divided into two groups and are distributed across all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.
[Genome-wide identification and analysis of WRKY transcription factors in Medicago truncatula].
Song, Hui; Nan, Zhibiao
2014-02-01
The WRKY gene family plays important roles in plants through its involvement in transcriptional regulation during various physiological processes such as development, metabolism and responses to biotic and abiotic stresses. WRKY genes have been identified in various plants. However, only a few WRKY genes in Medicago truncatula have been identified through systematic analysis and comparison. In this study, we identified 93 WRKY genes through analysis of the M. truncatula genome. These genes include 19 type-I genes, 49 type-II genes, 13 type-III genes and 12 non-regular type genes. All of these genes were characterized through analyses of gene duplication, chromosomal locations, structural diversity, conserved protein motifs and phylogenetic relations. The results showed that 11 gene duplication events, involving 24 genes, occurred in the WRKY gene family. WRKY genes, which form 6 gene clusters, are unevenly distributed across chromosomes 1 to 6, and the group III WRKY genes are under purifying selection pressure.
Blake, Judith A; Harris, Midori A
2008-09-01
Scientists wishing to utilize genomic data have quickly come to realize the benefit of standardizing descriptions of experimental procedures and results for computer-driven information retrieval systems. The focus of the Gene Ontology project is three-fold. First, the project goal is to compile the Gene Ontologies: structured vocabularies describing domains of molecular biology. Second, the project supports the use of these structured vocabularies in the annotation of gene products. Third, the gene product-to-GO annotation sets are provided by participating groups to the public through open access to the GO database and Web resource. This unit describes the current ontologies and what is beyond the scope of the Gene Ontology project. It addresses the issue of how GO vocabularies are constructed and related to genes and gene products. It concludes with a discussion of how researchers can access, browse, and utilize the GO project in the course of their own research. Copyright 2008 by John Wiley & Sons, Inc.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2013-01-01
A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698
Fragmentation of the large subunit ribosomal RNA gene in oyster mitochondrial genomes.
Milbury, Coren A; Lee, Jung C; Cannone, Jamie J; Gaffney, Patrick M; Gutell, Robin R
2010-09-02
Discontinuous genes have been observed in bacteria, archaea, and eukaryotic nuclei, mitochondria and chloroplasts. Gene discontinuity occurs in multiple forms: the two most frequent forms result from introns that are spliced out of the RNA and the resulting exons are spliced together to form a single transcript, and fragmented gene transcripts that are not covalently attached post-transcriptionally. Within the past few years, fragmented ribosomal RNA (rRNA) genes have been discovered in bilateral metazoan mitochondria, all within a group of related oysters. In this study, we have characterized this fragmentation with comparative analysis and experimentation. We present secondary structures, modeled using comparative sequence analysis of the discontinuous mitochondrial large subunit rRNA genes of the cupped oysters C. virginica, C. gigas, and C. hongkongensis. Comparative structure models for the large subunit rRNA in each of the three oyster species are generally similar to those for other bilateral metazoans. We also used RT-PCR and analyzed ESTs to determine if the two fragmented LSU rRNAs are spliced together. The two segments are transcribed separately, and not spliced together although they still form functional rRNAs and ribosomes. Although many examples of discontinuous ribosomal genes have been documented in bacteria and archaea, as well as the nuclei, chloroplasts, and mitochondria of eukaryotes, oysters are some of the first characterized examples of fragmented bilateral animal mitochondrial rRNA genes. The secondary structures of the oyster LSU rRNA fragments have been predicted on the basis of previous comparative metazoan mitochondrial LSU rRNA structure models.
Henne, Karsten; Li, Jing; Stoneking, Mark; Kessler, Olga; Schilling, Hildegard; Sonanini, Anne; Conrads, Georg; Horz, Hans-Peter
2014-08-22
The genetic diversity of the human microbiome holds great potential for shedding light on the history of our ancestors. Helicobacter pylori is the most prominent example as its analysis allowed a fine-scale resolution of past migration patterns including some that could not be distinguished using human genetic markers. However studies of H. pylori require stomach biopsies, which severely limits the number of samples that can be analysed. By focussing on the house-keeping gene gdh (coding for the glucose-6-phosphate dehydrogenase), on the virulence gene gtf (coding for the glucosyltransferase) of mitis-streptococci and on the 16S-23S rRNA internal transcribed spacer (ITS) region of the Fusobacterium nucleatum/periodonticum-group we here tested the hypothesis that bacterial genes from human saliva have the potential for distinguishing human populations. Analysis of 10 individuals from each of seven geographic regions, encompassing Africa, Asia and Europe, revealed that the genes gdh and ITS exhibited the highest number of polymorphic sites (59% and 79%, respectively) and most OTUs (defined at 99% identity) were unique to a given country. In contrast, the gene gtf had the lowest number of polymorphic sites (21%), and most OTUs were shared among countries. Most of the variation in the gdh and ITS genes was explained by the high clonal diversity within individuals (around 80%) followed by inter-individual variation of around 20%, leaving the geographic region as providing virtually no source of sequence variation. Conversely, for gtf the variation within individuals accounted for 32%, between individuals for 57% and among geographic regions for 11%. This geographic signature persisted upon extension of the analysis to four additional locations from the American continent. 
Pearson correlation analysis, pairwise Fst-cluster analysis as well as UniFrac analyses consistently supported a tree structure in which the European countries clustered tightly together and branched with American countries and South Africa, to the exclusion of Asian countries and the Congo. This study shows that saliva harbours protein-coding bacterial genes that are geographically structured, and which could potentially be used for addressing previously unresolved human migration events.
Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi
2017-01-01
We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replicates, and identified 3,621 periodic genes through our wavelet analysis. The expression data are suitable for inferring sparse networks based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks have a typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Du, Jiancan; Hu, Simin; Yu, Qin; Wang, Chongde; Yang, Yunqiang; Sun, Hang; Yang, Yongping; Sun, Xudong
2017-01-01
The teosinte branched1/cycloidea/proliferating cell factor (TCP) gene family is a plant-specific transcription factor family that participates in the control of plant development by regulating cell proliferation. However, no report is currently available about this gene family in turnips (Brassica rapa ssp. rapa). In this study, a genome-wide analysis of TCP genes was performed in turnips. Thirty-nine TCP genes were identified in the turnip genome and were distributed on 10 chromosomes. Phylogenetic analysis clearly showed that the family was classified into two clades: class I and class II. Gene structure and conserved motif analysis showed that genes in the same clade have similar gene structures and conserved motifs. The expression profiles of the 39 TCP genes were determined through quantitative real-time PCR. Most CIN-type BrrTCP genes were highly expressed in leaf. The members of the CYC/TB1 subclade were highly expressed in flower bud and weakly expressed in root. By contrast, the class I clade showed more widespread but less tissue-specific expression patterns. Yeast two-hybrid data show that BrrTCP proteins preferentially formed heterodimers. The function of BrrTCP2 was confirmed through ectopic expression of BrrTCP2 in wild-type Arabidopsis and in a loss-of-function ortholog mutant of Arabidopsis. Overexpression of BrrTCP2 in wild-type Arabidopsis resulted in diminished leaf size. Overexpression of BrrTCP2 in the tcp2/4/10 triple mutant restored the leaf phenotype of tcp2/4/10 to that of the wild type. The comprehensive analysis of the turnip TCP gene family provides a foundation for further study of the roles of TCP genes in turnips.
Jian, Hongju; Lu, Kun; Yang, Bo; Wang, Tengyue; Zhang, Li; Zhang, Aoxiang; Wang, Jia; Liu, Liezhao; Qu, Cunmin; Li, Jiana
2016-01-01
Sucrose is the principal transported product of photosynthesis from source leaves to sink organs. SUTs/SUCs (sucrose transporters or sucrose carriers) and SWEETs (Sugars Will Eventually be Exported Transporters) play central roles in phloem loading and unloading. SUTs/SUCs and SWEETs are key players in sucrose translocation and are associated with crop yields. The SUT/SUC and SWEET genes have been characterized in several plant species, but a comprehensive analysis of these two gene families in oilseed rape has not yet been reported. In our study, 22 and 68 members of the SUT/SUC and SWEET gene families, respectively, were identified in the oilseed rape (Brassica napus) genome through homology searches. The chromosomal distribution, phylogenetic relationships, gene structures, motifs and cis-acting regulatory elements in the promoters of the BnSUC and BnSWEET genes were analyzed. Furthermore, we examined the expression of 18 BnSUC and 16 BnSWEET genes in different tissues of “ZS11” and the expression of 9 BnSUC and 7 BnSWEET genes in “ZS11” under various conditions, including biotic stress (Sclerotinia sclerotiorum), abiotic stresses (drought, salt and heat), and hormone treatments (abscisic acid, auxin, cytokinin, brassinolide, gibberellin, and salicylic acid). In conclusion, our study provides the first comprehensive analysis of the oilseed rape SUC and SWEET gene families. Information regarding the phylogenetic relationships, gene structure and expression profiles of the SUC and SWEET genes in the different tissues of oilseed rape helps to identify candidates with potential roles in specific developmental processes. Our study advances our understanding of the important roles of sucrose transport in oilseed rape. PMID:27733861
Curk, Franck; Ancillo, Gema; Garcia-Lor, Andres; Luro, François; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Navarro, Luis; Ollitrault, Patrick
2014-12-29
The most economically important Citrus species originated by natural interspecific hybridization between four ancestral taxa (Citrus reticulata, Citrus maxima, Citrus medica, and Citrus micrantha) and from limited subsequent interspecific recombination as a result of apomixis and vegetative propagation. Such reticulate evolution coupled with vegetative propagation results in mosaic genomes with large chromosome fragments from the basic taxa in frequent interspecific heterozygosity. Modern breeding of these species is hampered by their complex heterozygous genomic structures that determine species phenotype and are broken by sexual hybridisation. Nevertheless, a large amount of diversity is present in the citrus gene pool, and breeding to allow inclusion of desirable traits is of paramount importance. However, the efficient mobilization of citrus biodiversity in innovative breeding schemes requires previous understanding of Citrus origins and genomic structures. Haplotyping of multiple gene fragments along the whole genome is a powerful approach to reveal the admixture genomic structure of current species and to resolve the evolutionary history of the gene pools. In this study, the efficiency of parallel sequencing with 454 methodology to decipher the hybrid structure of modern citrus species was assessed by analysis of 16 gene fragments on chromosome 2. 454 amplicon libraries were established using the Fluidigm array system for 48 genotypes and 16 gene fragments from chromosome 2. Haplotypes were established from the reads of each accession and phylogenetic analyses were performed using the haplotypic data for each gene fragment. The length of 454 reads and the level of differentiation between the ancestral taxa of modern citrus allowed efficient haplotype phylogenetic assignations for 12 of the 16 gene fragments. The analysis of the mixed genomic structure of modern species and cultivars (i) revealed C. maxima introgressions in modern mandarins, (ii) was consistent with previous hypotheses regarding the origin of secondary species, and (iii) provided a new picture of the evolution of chromosome 2. 454 sequencing was an efficient strategy to establish haplotypes with significant phylogenetic assignations in Citrus, providing a new picture of the mixed structure on chromosome 2 in 48 citrus genotypes.
Victoria L. Sork; Peter E. Smouse; Victoria J. Apsit; Rodney J. Dyer; Robert D. Westfall
2005-01-01
Anthropogenic landscape change can disrupt gene flow. As part of the Missouri Ozark Forest Ecosystem Project, this study examined whether silvicultural practices influence pollen-mediated gene movement in the insect-pollinated species, Cornus florida L., by comparing pollen pool structure (ΦST) among clear-cutting,...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Waldron, P.J.; Wu, L.; Van Nostrand, J.D.
2009-06-15
To understand how contaminants affect microbial community diversity, heterogeneity, and functional structure, six groundwater monitoring wells from the Field Research Center of the U.S. Department of Energy Environmental Remediation Science Program (ERSP; Oak Ridge, TN), with a wide range of pH, nitrate, and heavy metal contamination were investigated. DNA from the groundwater community was analyzed with a functional gene array containing 2006 probes to detect genes involved in metal resistance, sulfate reduction, organic contaminant degradation, and carbon and nitrogen cycling. Microbial diversity decreased in relation to the contamination levels of the wells. Highly contaminated wells had lower gene diversity but greater signal intensity than the pristine well. The microbial composition was heterogeneous, with 17-70% overlap between different wells. Metal-resistant and metal-reducing microorganisms were detected in both contaminated and pristine wells, suggesting the potential for successful bioremediation of metal-contaminated groundwaters. In addition, results of Mantel tests and canonical correspondence analysis indicate that nitrate, sulfate, pH, uranium, and technetium have a significant (p < 0.05) effect on microbial community structure. This study provides an overall picture of microbial community structure in contaminated environments with functional gene arrays by showing that diversity and heterogeneity can vary greatly in relation to contamination.
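The Mantel test mentioned in this abstract can be sketched as follows. This is a generic, minimal illustration and not the authors' analysis: it assumes a Pearson statistic on the upper triangles of two symmetric distance matrices, a one-sided permutation p-value, and 999 permutations. The function names, the default permutation count, and the seed are all hypothetical choices.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test between two distance matrices.

    Rows and columns of dist_a are permuted together, which keeps it a valid
    distance matrix while breaking its pairing with dist_b.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the observed ordering counts.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value
```

In a setting like the one described, dist_a could hold pairwise dissimilarities between the wells' gene profiles and dist_b pairwise differences in one geochemical variable such as pH. With only six wells there are few distinct permutations, so p-values from such a test are coarse.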
Wang, Lishi; Huang, Yue; Jiao, Yan; Chen, Hong; Cao, Yanhong; Bennett, Beth; Wang, Yongjun; Gu, Weikuan
2013-01-01
The purpose of this study is to investigate whether expression profiles of alcoholism-relevant genes in different parts of the brain are correlated differently with those in the liver. Four experiments were conducted. First, we used gene expression profiles from five parts of the brain (striatum, prefrontal cortex, nucleus accumbens, hippocampus, and cerebellum) and from liver in a population of recombinant inbred mouse strains to examine the expression association of 10 alcoholism-relevant genes. Second, we conducted the same association analysis between brain structures and the lung. Third, using five randomly selected, nonalcoholism-relevant genes, we conducted the association analysis between brain and liver. Finally, we compared the expression of 10 alcoholism-relevant genes in hippocampus and cerebellum between an alcohol preference strain and a wild-type control. We observed a difference in correlation patterns in expression levels of 10 alcoholism-relevant genes between different parts of the brain with those of liver. We then examined the association of gene expression between alcohol dehydrogenases (Adh1, Adh2, Adh5, and Adh7) and different parts of the brain. The results were similar to those of the 10 genes. Then, we found that the association of those genes between brain structures and lung was different from that of liver. Next, we found that the association patterns of five alcoholism-nonrelevant genes were different from those of 10 alcoholism-relevant genes. Finally, we found that the expression level of 10 alcohol-relevant genes is influenced more in hippocampus than in cerebellum in the alcohol preference strain. Our results show that the expression of alcoholism-relevant genes in liver is differently associated with the expression of genes in different parts of the brain. 
Because different structural changes in different parts of the brain in alcoholism have been reported, it is important to investigate whether those structural differences in the brains of those with alcoholism are due to the difference in the associations of gene expression between genes in liver and in different parts of the brain.
Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou
2016-01-01
The Sox transcription factor family is characterized with the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts. PMID:26907269
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones
Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. 
Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio
2004-01-01
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
Holland, M J; Holland, J P; Thill, G P; Jackson, K A
1981-02-10
Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. 
Finally, there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5' noncoding portions of these glycolytic genes.
Visualizing conserved gene location across microbe genomes
NASA Astrophysics Data System (ADS)
Shaw, Chris D.
2009-01-01
This paper introduces an analysis-based zoomable visualization technique for displaying the location of genes across many related species of microbes. The purpose of this visualization is to enable a biologist to examine the layout of genes in the organism of interest with respect to the gene organization of related organisms. During the genomic annotation process, the ability to observe gene organization in common with previously annotated genomes can help a biologist better confirm the structure and function of newly analyzed microbe DNA sequences. We have developed a visualization and analysis tool that enables the biologist to observe and examine gene organization among genomes, in the context of the primary sequence of interest. This paper describes the visualization and analysis steps, and presents a case study using a number of Rickettsia genomes.
Rodríguez-García, María Juliana; García-Reina, Andrés; Machado, Vilmar; Galián, José
2016-09-01
In this study, a defensin gene (Clit-Def) has been characterised in the tiger beetle Calomera littoralis for the first time. Bioinformatic analysis showed that the gene has an open reading frame of 246 bp that contains a 46 amino acid mature peptide. The phylogenetic analysis showed a high variability in the coleopteran defensins analysed. The Clit-Def mature peptide has the features expected for antimicrobial function: a predicted cationic isoelectric point of 8.94, six cysteine residues that form three disulfide bonds, and the typical cysteine-stabilized α-helix β-sheet (CSαβ) structural fold. Real time quantitative PCR analysis showed that Clit-Def was upregulated in the different body parts analysed after infection with lipopolysaccharides of Escherichia coli, and also indicated that it has an expression peak at 12 h post infection. The expression patterns of Clit-Def suggest that this gene plays important roles in the humoral system in the adephagan beetle Calomera littoralis. Copyright © 2016 Elsevier B.V. All rights reserved.
Hou, Xiao-Jin; Li, Si-Bei; Liu, Sheng-Rui; Hu, Chun-Gen; Zhang, Jin-Zhi
2014-01-01
MYB family genes are widely distributed in plants and comprise one of the largest transcription factor families involved in various developmental processes and defense responses of plants. To date, few MYB genes and little expression profiling have been reported for citrus. Here, we describe and classify 177 members of the sweet orange MYB gene (CsMYB) family in terms of their genomic gene structures and similarity to their putative Arabidopsis orthologs. According to these analyses, these CsMYBs were categorized into four groups (4R-MYB, 3R-MYB, 2R-MYB and 1R-MYB). Gene structure analysis revealed that 1R-MYB genes possess relatively more introns as compared with 2R-MYB genes. Investigation of their chromosomal localizations revealed that these CsMYBs are distributed across nine chromosomes. Sweet orange includes a relatively small number of MYB genes compared with the 198 members in Arabidopsis, presumably due to a paralog reduction related to repetitive sequence insertion into promoter and non-coding transcribed region of the genes. Comparative studies of CsMYBs and Arabidopsis showed that CsMYBs had fewer gene duplication events. Expression analysis revealed that the MYB gene family has a wide expression profile in sweet orange development and plays important roles in development and stress responses. In addition, 337 new putative microsatellites with flanking sequences sufficient for primer design were also identified from the 177 CsMYBs. These results provide a useful reference for the selection of candidate MYB genes for cloning and further functional analysis for citrus. PMID:25375352
Nguyen, Doan H.; Toshida, Hiroshi; Schurr, Jill; Beuerman, Roger W.
2010-01-01
Previous studies showed that loss of muscarinic parasympathetic input to the lacrimal gland (LG) leads to a dramatic reduction in tear secretion and profound changes to LG structure. In this study, we used DNA microarrays to examine the regulation of expression of the genes for secretory function and organization of the LG. Long-Evans rats anesthetized with a mixture of ketamine/xylazine (80:10 mg/kg) underwent unilateral sectioning of the greater superficial petrosal nerve, the input to the pterygopalatine ganglion. After 7 days, tear secretion was measured, the animals were killed, and structural changes in the LG were examined by light microscopy. Total RNA from control and experimental LGs (n = 5) was used for DNA microarray analysis employing the U34A GeneChip. Three statistical algorithms (detection, change call, and signal log ratio) were used to determine differential gene expression using the Microarray Suite (5.0) and Data Mining Tools (3.0). Tear secretion was significantly reduced and corneal ulcers developed in all experimental eyes. Light microscopy showed breakdown of the acinar structure of the LG. DNA microarray analysis showed downregulation of genes associated with the endoplasmic reticulum and Golgi, including genes involved in protein folding and processing. Conversely, transcripts for cytoskeleton and extracellular matrix components, inflammation, and apoptosis were upregulated. The number of significantly upregulated genes (116) was substantially greater than the number of downregulated genes (49). Removal of the main secretory input to the rat LG resulted in clinical symptoms associated with severe dry eye. Components of the secretory pathway were negatively affected, and the increase in cell proliferation and inflammation may lead to loss of organization in the parasympathectomized lacrimal gland. PMID:15084711
Sizova, Olga V; Shashkov, Alexander S; Kondakova, Anna N; Knirel, Yuriy A; Shaikhutdinova, Rima Z; Ivanov, Sergei A; Kislichkina, Angelina A; Kadnikova, Lidia A; Bogun, Aleksandr G; Dentovskaya, Svetlana V
2018-05-02
Lipopolysaccharide was isolated from bacteria Yersinia intermedia H9-36/83 (O:17) and degraded with mild acid to give an O-specific polysaccharide, which was isolated by GPC on Sephadex G-50 and studied by sugar analysis and 1D and 2D NMR spectroscopy. The polysaccharide was found to contain 3-deoxy-3-[(R)-3-hydroxybutanoylamino]-d-fucose (d-Fuc3NR3Hb) and the following structure of the heptasaccharide repeating unit was established: The structure established is consistent with the gene content of the O-antigen gene cluster. The O-polysaccharide structure and gene cluster of Y. intermedia are related to those of Hafnia alvei 1211 and Escherichia coli O:103. Copyright © 2018 Elsevier Ltd. All rights reserved.
Ibrahim, Kalibulla Syed; Muniyandi, Jeyaraj; Pandian, Shunmugiah Karutha
2011-10-01
Leather industries release large amounts of polluting chemicals and are one of the major sources of industrial pollution. The development of enzyme-based processes as an alternative to these chemicals is a useful way to overcome this issue. Proteases are enzymes which have extensive applications in leather processing and in several bioremediation processes due to their high alkaline protease activity and dehairing efficacy. In the present study, we report the cloning and characterization of a Mn2+ dependent alkaline serine protease gene (MASPT) of Bacillus pumilus TMS55. The gene encoding the protease from B. pumilus TMS55 was cloned and its nucleotide sequence was determined. This gene has an open reading frame (ORF) of 1,149 bp that encodes a polypeptide of 383 amino acid residues. Our analysis showed that this polypeptide is composed of a 29-residue N-terminal signal peptide, a propeptide of 79 residues and a mature protein of 275 amino acids. We performed bioinformatics analysis to compare the MASPT enzyme with other proteases. Homology modeling was employed to model the three-dimensional structure of MASPT. Structural analysis showed that the MASPT structure is composed of nine α-helices and nine β-strands. It has 3 catalytic residues and 14 metal binding residues. Docking analysis showed that residues S223, A260, N263, T328 and S329 interact with Mn2+. This study allows initial inferences about the structure of the protease and will allow the rational design of its derivatives for structure-function studies and also for further improvement of the enzyme.
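The lengths quoted in the abstract above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, and it assumes (the abstract does not say) that the 1,149 bp figure counts sense codons only, since 1,149 / 3 = 383 matches the stated 383 residues only if no stop codon is included.

```python
# Figures quoted in the abstract for the MASPT protease of B. pumilus TMS55.
ORF_LENGTH_BP = 1149
SIGNAL_PEPTIDE = 29   # residues
PROPEPTIDE = 79       # residues
MATURE_PROTEIN = 275  # residues

# Number of codons, assuming the quoted ORF length excludes the stop codon.
codons = ORF_LENGTH_BP // 3

# The three segments should add up to the full precursor polypeptide.
precursor_length = SIGNAL_PEPTIDE + PROPEPTIDE + MATURE_PROTEIN

print(f"ORF divisible by 3: {ORF_LENGTH_BP % 3 == 0}")
print(f"codons in ORF: {codons}")
print(f"signal + propeptide + mature: {precursor_length}")
```

Both values come to 383, so the segment lengths are internally consistent with the stated polypeptide length.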
NASA Astrophysics Data System (ADS)
Song, Xiaoming; Duan, Weike; Huang, Zhinan; Liu, Gaofeng; Wu, Peng; Liu, Tongkun; Li, Ying; Hou, Xilin
2015-09-01
In plants, flowering is the most important transition from vegetative to reproductive growth. The flowering patterns of monocots and eudicots are distinctly different, but few studies have described the evolutionary patterns of the flowering genes in them. In this study, we analysed the evolutionary pattern, duplication and expression level of these genes. The main results were as follows: (i) characterization of flowering genes in monocots and eudicots, including the identification of family-specific, orthologous and collinear genes; (ii) full characterization of CONSTANS-like genes in Brassica rapa (BraCOL genes), the key flowering genes; (iii) exploration of the evolution of COL genes in plant kingdom and construction of the evolutionary pattern of COL genes; (iv) comparative analysis of CO and FT genes between Brassicaceae and Grass, which identified several family-specific amino acids, and revealed that CO and FT protein structures were similar in B. rapa and Arabidopsis but different in rice; and (v) expression analysis of photoperiod pathway-related genes in B. rapa under different photoperiod treatments by RT-qPCR. This analysis will provide resources for understanding the flowering mechanisms and evolutionary pattern of COL genes. In addition, this genome-wide comparative study of COL genes may also provide clues for evolution of other flowering genes.
The structure of the human interferon alpha/beta receptor gene.
Lutfalla, G; Gardiner, K; Proudhon, D; Vielh, E; Uzé, G
1992-02-05
Using the cDNA coding for the human interferon alpha/beta receptor (IFNAR), the IFNAR gene has been physically mapped relative to the other loci of the chromosome 21q22.1 region. 32,906 base pairs covering the IFNAR gene have been cloned and sequenced. Primer extension and solution hybridization-ribonuclease protection have been used to determine that the transcription of the gene is initiated in a broad region of 20 base pairs. Some aspects of the polymorphism of the gene, including noncoding sequences, have been analyzed; some are allelic differences in the coding sequence that induce amino acid variations in the resulting protein. The exon structure of the IFNAR gene and of that of the available genes for the receptors of the cytokine/growth hormone/prolactin/interferon receptor family have been compared with the predictions for the secondary structure of those receptors. From this analysis, we postulate a common origin and propose an hypothesis for the divergence from the immunoglobulin superfamily.
Schink, Anne-Kathrin; Kadlec, Kristina; Schwarz, Stefan
2011-01-01
In this study, 417 Escherichia coli isolates from defined disease conditions of companion and farm animals collected in the BfT-GermVet study were investigated for the presence of extended-spectrum β-lactamase (ESBL) genes. Three ESBL-producing E. coli isolates were identified among the 100 ampicillin-resistant isolates. The E. coli isolates 168 and 246, of canine and porcine origins, respectively, harbored blaCTX-M-1, and the canine isolate 913 harbored blaCTX-M-15, as confirmed by PCR and sequence analysis. The isolates 168 and 246 belonged to the novel multilocus sequence typing (MLST) types ST1576 and ST1153, respectively, while isolate 913 had the MLST type ST410. The ESBL genes were located on structurally related IncN plasmids in isolates 168 and 246 and on an IncF plasmid in isolate 913. The blaCTX-M-1 upstream regions of plasmids pCTX168 and pCTX246 were similar, whereas the downstream regions showed structural differences. The genetic environment of the blaCTX-M-15 gene on plasmid pCTX913 differed distinctly from that of both blaCTX-M-1 genes. Detailed sequence analysis showed that the integration of insertion sequences, as well as interplasmid recombination events, accounted for the structural variability in the blaCTX-M gene regions. PMID:21685166
Weekes, Jennifer; Yüksel, Gülhan Ü.
2004-01-01
Two lactate dehydrogenase (ldh) genes from Lactobacillus sp. strain MONT4 were cloned by complementation in Escherichia coli DC1368 (ldh pfl) and were sequenced. The sequence analysis revealed a novel genomic organization of the ldh genes. Subcloning of the individual ldh genes and their Northern blot analyses indicated that the genes are monocistronic. PMID:15466577
Singh, Amarjeet; Kanwar, Poonam; Pandey, Amita; Tyagi, Akhilesh K.; Sopory, Sudhir K.; Kapoor, Sanjay; Pandey, Girdhar K.
2013-01-01
Background Phospholipase C (PLC) is one of the major lipid hydrolysing enzymes, implicated in lipid mediated signaling. PLCs have been found to play a significant role in abiotic stress triggered signaling and developmental processes in various plant species. Genome wide identification and expression analysis have been carried out for this gene family in Arabidopsis, yet not much has been accomplished in crop plant rice. Methodology/Principal Findings An exhaustive in-silico exploration of rice genome using various online databases and tools resulted in the identification of nine PLC encoding genes. Based on sequence, motif and phylogenetic analysis rice PLC gene family could be divided into phosphatidylinositol-specific PLCs (PI-PLCs) and phosphatidylcholine- PLCs (PC-PLC or NPC) classes with four and five members, respectively. A comparative analysis revealed that PLCs are conserved in Arabidopsis (dicots) and rice (monocot) at gene structure and protein level but they might have evolved through a separate evolutionary path. Transcript profiling using gene chip microarray and quantitative RT-PCR showed that most of the PLC members expressed significantly and differentially under abiotic stresses (salt, cold and drought) and during various developmental stages with condition/stage specific and overlapping expression. This finding suggested an important role of different rice PLC members in abiotic stress triggered signaling and plant development, which was also supported by the presence of relevant cis-regulatory elements in their promoters. Sub-cellular localization of few selected PLC members in Nicotiana benthamiana and onion epidermal cells has provided a clue about their site of action and functional behaviour. Conclusion/Significance The genome wide identification, structural and expression analysis and knowledge of sub-cellular localization of PLC gene family envisage the functional characterization of these genes in crop plants in near future. PMID:23638098
Mondal, Sunil Kanti; Kundu, Sudip; Das, Rabindranath; Roy, Sujit
2016-08-01
Bacteria and archaea have evolved with the ability to fix atmospheric dinitrogen in the form of ammonia, catalyzed by the nitrogenase enzyme complex which comprises three structural genes nifK, nifD and nifH. The nifK and nifD genes encode the beta and alpha subunits, respectively, of component 1, while nifH encodes component 2 of nitrogenase. Phylogeny based on nifDHK has indicated that Cyanobacteria are closer to Proteobacteria alpha and gamma, but this is not supported by the tree based on 16S rRNA. The evolutionary ancestor for the different trees was also different. The GC1 and GC2% analysis showed more consistency than GC3%, which appeared to be low for Firmicutes, Cyanobacteria and Euryarchaeota while highest in Proteobacteria beta, and clearly showed the proportional effect on the codon usage with a few exceptions. A few genes from Firmicutes, Euryarchaeota, Proteobacteria alpha and delta were found to be under mutational pressure. These nif genes with low and high GC3% from different classes of organisms showed similar expected number of codons. Distribution of the genes and codons, based on codon usage, demonstrated opposite patterns for different orientations of the mirror plane when compared with each other. Overall our results provide a comprehensive analysis of the evolutionary relationship of the three structural nif genes, nifK, nifD and nifH, in the context of codon usage bias, GC content relationship and amino acid composition of the encoded proteins, and an exploration of a crucial statistical method for the analysis of positive data with non-constant variance to identify the shape factors of the codon adaptation index.
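The GC1, GC2 and GC3 percentages mentioned in the abstract above are, by the usual definition, the G+C content at the first, second and third codon positions of a coding sequence. The following Python sketch illustrates that definition only; the function name and the toy sequence are hypothetical and are not taken from the study, whose actual software is not named in the abstract.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return the GC percentage at codon positions 1, 2 and 3.

    The sequence is assumed to be in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # length covered by complete codons
    result = {}
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base starting at this position
        gc = sum(1 for base in bases if base in "GC")
        result[f"GC{pos + 1}"] = 100.0 * gc / len(bases) if bases else 0.0
    return result


# Hypothetical four-codon sequence (ATG GCC GCG TTA), not a real nif gene.
toy_cds = "ATGGCCGCGTTA"
toy_result = gc_by_codon_position(toy_cds)
print(toy_result)
```

Because third codon positions are often synonymous, GC3 tends to vary more between lineages than GC1 and GC2, which is in line with the lower consistency of GC3% reported in the abstract.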
Genome-wide Identification and Expression Analysis of the CDPK Gene Family in Grape, Vitis spp.
Zhang, Kai; Han, Yong-Tao; Zhao, Feng-Li; Hu, Yang; Gao, Yu-Rong; Ma, Yan-Fei; Zheng, Yi; Wang, Yue-Jin; Wen, Ying-Qiang
2015-06-30
Calcium-dependent protein kinases (CDPKs) play vital roles in plant growth and development, biotic and abiotic stress responses, and hormone signaling. Little is known about the CDPK gene family in grapevine. In this study, we performed a genome-wide analysis of the 12X grape genome (Vitis vinifera) and identified nineteen CDPK genes. Comparison of the structures of grape CDPK genes allowed us to examine their functional conservation and differentiation. Segmentally duplicated grape CDPK genes showed high structural conservation and contributed to gene family expansion. Additional comparisons between grape and Arabidopsis thaliana demonstrated that several grape CDPK genes occurred in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grapevine and Arabidopsis. Phylogenetic analysis divided the grape CDPK genes into four groups. Furthermore, we examined the expression of the corresponding nineteen homologous CDPK genes in the Chinese wild grape (Vitis pseudoreticulata) under various conditions, including biotic stress, abiotic stress, and hormone treatments. The expression profiles derived from reverse transcription and quantitative PCR suggested that a large number of VpCDPKs responded to various stimuli on the transcriptional level, indicating their versatile roles in the responses to biotic and abiotic stresses. Moreover, we examined the subcellular localization of VpCDPKs by transiently expressing six VpCDPK-GFP fusion proteins in Arabidopsis mesophyll protoplasts; this revealed high variability consistent with potential functional differences. Taken as a whole, our data provide significant insights into the evolution and function of grape CDPKs and a framework for future investigation of grape CDPK genes.
Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan
2014-01-01
Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Trivedi, G.; Sreedasyam, A.
2010-07-06
Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. 
The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
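As a quick check of the proportions reported in the abstract above, the counts can be converted to percentages directly. This is only an arithmetic sketch in Python using the numbers quoted in the abstract; the variable names are illustrative.

```python
# Counts quoted in the abstract for the Laccaria bicolor gene models.
total_models = 1501
corrected_models = 1439

# Models that could not be resolved against the transcriptomic data.
unresolved_models = total_models - corrected_models

corrected_pct = 100.0 * corrected_models / total_models
unresolved_pct = 100.0 * unresolved_models / total_models

print(f"unresolved models: {unresolved_models}")
print(f"corrected: {corrected_pct:.1f}%")
print(f"unresolved: {unresolved_pct:.1f}%")
```

The difference is 62 models, and the two shares round to the 96% and 4% given in the text.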
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-02-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Ashbrook, David G; Williams, Robert W; Lu, Lu; Stein, Jason L; Hibar, Derrek P; Nichols, Thomas E; Medland, Sarah E; Thompson, Paul M; Hager, Reinmar
2014-10-03
Variation in hippocampal volume has been linked to significant differences in memory, behavior, and cognition among individuals. To identify genetic variants underlying such differences and associated disease phenotypes, multinational consortia such as ENIGMA have used large magnetic resonance imaging (MRI) data sets in human GWAS studies. In addition, mapping studies in mouse model systems have identified genetic variants for brain structure variation with great power. A key challenge is to understand how genetically based differences in brain structure lead to the propensity to develop specific neurological disorders. We combine the largest human GWAS of brain structure with the largest mammalian model system, the BXD recombinant inbred mouse population, to identify novel genetic targets influencing brain structure variation that are linked to increased risk for neurological disorders. We first use a novel cross-species, comparative analysis using mouse and human genetic data to identify a candidate gene, MGST3, associated with adult hippocampus size in both systems. We then establish the coregulation and function of this gene in a comprehensive systems-analysis. We find that MGST3 is associated with hippocampus size and is linked to a group of neurodegenerative disorders, such as Alzheimer's.
NASA Astrophysics Data System (ADS)
Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid
2017-02-01
Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and the RACE method. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. The DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE web server. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure predicted from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and a hydrophobicity surface model showed amino acid hydrophobicity. A Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
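The GRAVY score mentioned in this abstract is the mean Kyte-Doolittle hydropathy value over all residues of a protein; negative values indicate a more hydrophilic sequence and positive values a more hydrophobic one. The following is a minimal sketch of that calculation, not the tool the authors used. The function name `gravy` and the toy peptides are illustrative assumptions, and the LfVP1 sequence itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy (GRAVY) of a protein sequence.

    Characters that are not standard amino acids (gaps, X, *) are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Hypothetical toy peptides, chosen only to show the sign convention.
print(gravy("AILVAILV"))  # hydrophobic residues give a positive score
print(gravy("KDERKDER"))  # charged residues give a negative score
```
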
Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.
Andersen, Ethan J; Nepal, Madhav P
2017-08-01
We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.
[Genome-wide identification and expression analysis of the WRKY gene family in peach].
Gu, Yan-bing; Ji, Zhi-rui; Chi, Fu-mei; Qiao, Zhuang; Xu, Cheng-nan; Zhang, Jun-xiang; Zhou, Zong-shan; Dong, Qing-long
2016-03-01
The WRKY transcription factors are one of the largest families of transcriptional regulators and play diverse regulatory roles in biotic and abiotic stresses, plant growth and development processes. In this study, the WRKY DNA-binding domain (Pfam Database number: PF03106) downloaded from the Pfam protein families database was used to identify WRKY genes from the peach (Prunus persica 'Lovell') genome using HMMER 3.0. The obtained amino acid sequences were analyzed with the DNAMAN 5.0, WebLogo 3, MEGA 5.1, MapInspect and MEME bioinformatics software packages. In total, 61 WRKY genes were found in the peach genome. Our phylogenetic analysis revealed that peach WRKY genes were classified into three groups: Ⅰ, Ⅱ and Ⅲ. The WRKY N-terminal and C-terminal domains of Group Ⅰ (group I-N and group I-C) were monophyletic. Group Ⅱ was subdivided into five distinct clades (group Ⅱ-a, Ⅱ-b, Ⅱ-c, Ⅱ-d and Ⅱ-e). Our domain analysis indicated that the WRKY regions contained a highly conserved heptapeptide stretch, WRKYGQK, at the N-terminus followed by a zinc-finger motif. The chromosome mapping analysis showed that peach WRKY genes were distributed with different densities over 8 chromosomes. The intron-exon structure analysis revealed that structures of the WRKY genes were highly conserved in the peach. The conserved motif analysis showed that conserved motifs 1, 2 and 3, which specify the WRKY domain, were observed in all peach WRKY proteins; motif 5, an unknown domain, was observed in group Ⅱ-d; and two WRKY domains were assigned to Group Ⅰ. SqRT-PCR and qRT-PCR results indicated that 16 PpWRKY genes were expressed in roots, stems, leaves, flowers and fruits at various expression levels. Our analysis thus identified the PpWRKY gene family, and future functional studies are needed to reveal its specific roles.
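The abstract reports a conserved WRKYGQK heptapeptide and two WRKY domains in Group Ⅰ proteins. The sketch below counts exact occurrences of that heptapeptide in a protein sequence as a rough first-pass screen. It is a deliberately simpler stand-in for the profile-HMM search (HMMER 3.0 with PF03106) that the study describes, and it would miss variant heptapeptides. The function names and the toy sequences are hypothetical.

```python
import re

# Exact match for the conserved heptapeptide named in the abstract.
WRKY_CORE = re.compile(r"WRKYGQK")


def wrky_core_positions(protein: str) -> list:
    """Return 1-based start positions of each WRKYGQK occurrence."""
    return [match.start() + 1 for match in WRKY_CORE.finditer(protein.upper())]


def looks_like_group_one(protein: str) -> bool:
    """Flag proteins with two or more heptapeptide hits (candidate Group I)."""
    return len(wrky_core_positions(protein)) >= 2


# Hypothetical toy sequences, not real peach proteins.
single_domain = "MSSDDWRKYGQKPIKGSPYPRSYYKC"
double_domain = "MAAWRKYGQKTTTWRKYGQKAA"

print(wrky_core_positions(single_domain))
print(wrky_core_positions(double_domain))
print(looks_like_group_one(double_domain))
```

A hit count is only a heuristic; group assignment in the study rests on phylogenetic and domain analysis, which this sketch does not attempt.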
Kück, Ulrich; Choquet, Yves; Schneider, Michel; Dron, Michel; Bennoun, Pierre
1987-01-01
The two homologous genes for the P700 chlorophyll a-apoproteins (ps1A1 and ps1A2) are encoded by the plastom in the green alga Chlamydomonas reinhardii. The structure and organization of the two genes were determined by comparison with the homologous genes from maize using data from heterologous hybridizations as well as from DNA and RNA sequencing. While the ps1A2 (736 codons) gene shows a continuous gene organization, the ps1A1 (754 codons) gene possesses some unusual features. The discontinuous gene is split into three separate exons which are scattered around the circular chloroplast genome. Exon 1 (86 bp) is separated by ∼50 kb from exon 2 (198 bp), which is located ∼ 90 kb apart from exon 3 (1984 bp). All exons are flanked by intronic sequences of group II. Transcription analysis reveals that the ps1A2 gene hybridizes with a 2.8-kb transcript, while all exon regions of the ps1A1 gene are homologous to a mature mRNA of 2.7 kb. From our data we conclude that the three distantly separated exonic sequences of the ps1A1 gene constitute a functional gene which probably operates by a trans-splicing mechanism. PMID:16453785
Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A; Rathbun, Stephen L; Whitman, William B
2016-01-01
K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.
Basic Helix-Loop-Helix Transcription Factor Gene Family Phylogenetics and Nomenclature
Skinner, Michael K.; Rawls, Alan; Wilson-Rawls, Jeanne; Roalson, Eric H.
2010-01-01
A phylogenetic analysis of the basic helix-loop-helix (bHLH) gene superfamily was performed using seven different species (human, mouse, rat, worm, fly, yeast, and plant Arabidopsis) and involving over 600 bHLH genes [1]. All bHLH genes were identified in the genomes of the various species, including expressed sequence tags, and the entire coding sequence was used in the analysis. Nearly 15% of the gene family has been updated or added since the original publication. A super-tree involving six clades and all structural relationships was established and is now presented for four of the species. The wealth of functional data available for members of the bHLH gene superfamily provides us with the opportunity to use this exhaustive phylogenetic tree to predict potential functions of uncharacterized members of the family. This phylogenetic and genomic analysis of the bHLH gene family has revealed unique elements of the evolution and functional relationships of the different genes in the bHLH gene family. PMID:20219281
Ma, Jie; Deng, Ye; Yuan, Tong; Zhou, Jizhong; Alvarez, Pedro J J
2015-03-01
GeoChip, a comprehensive gene microarray, was used to examine changes in microbial functional gene structure throughout the 4-year life cycle of a pilot-scale ethanol blend plume, including a 2-year continuous release followed by plume disappearance after source removal. Canonical correlation analysis (CCA) and Mantel tests showed that dissolved O2 (which was depleted within 5 days of initiating the release and rebounded 194 days after source removal) was the most influential environmental factor on community structure. Initially, the abundance of anaerobic BTEX degradation genes increased significantly while that of aerobic BTEX degradation genes decreased. Gene abundance for N fixation, nitrification, P utilization, sulfate reduction and S oxidation also increased, potentially changing associated biogeochemical cycle dynamics. After plume disappearance, most genes returned to pre-release abundance levels, but the final functional structure significantly differed from pre-release conditions. Overall, observed successions of functional structure reflected adaptive responses that were conducive to biodegradation of ethanol-blend releases. Copyright © 2015. Published by Elsevier Ltd.
Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach
Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo
2013-01-01
The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, was analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142
NASA Astrophysics Data System (ADS)
Illing, Gerd; Saenger, Wolfram; Heinemann, Udo
2000-06-01
The Protein Structure Factory will be established to characterize proteins encoded by human genes or cDNAs, which will be selected by criteria of potential structural novelty or medical or biotechnological usefulness. It represents an integrative approach to structure analysis combining bioinformatics techniques, automated gene expression and purification of gene products, generation of a biophysical fingerprint of the proteins and the determination of their three-dimensional structures either by NMR spectroscopy or by X-ray diffraction. The use of synchrotron radiation will be crucial to the Protein Structure Factory: high brilliance and tunable wavelengths are prerequisites for fast data collection, the use of small crystals and multiwavelength anomalous diffraction (MAD) phasing. With the opening of BESSY II, direct access to a third-generation XUV storage ring source with excellent conditions is available nearby. An insertion device with two MAD beamlines and one constant energy station will be set up by 2001.
Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan
2018-03-28
Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure-function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants.
Lu, Zhenmei; He, Zhili; Parisi, Victoria A.; Kang, Sanghoon; Deng, Ye; Van Nostrand, Joy D.; Masoner, Jason R.; Cozzarelli, Isabelle M.; Suflita, Joseph M.; Zhou, Jizhong
2012-01-01
The functional gene diversity and structure of microbial communities in a shallow landfill leachate-contaminated aquifer were assessed using a comprehensive functional gene array (GeoChip 3.0). Water samples were obtained from eight wells at the same aquifer depth immediately below a municipal landfill or along the predominant downgradient groundwater flowpath. Functional gene richness and diversity immediately below the landfill and the closest well were considerably lower than those in downgradient wells. Mantel tests and canonical correspondence analysis (CCA) suggested that various geochemical parameters had a significant impact on the subsurface microbial community structure. That is, leachate from the unlined landfill impacted the diversity, composition, structure, and functional potential of groundwater microbial communities as a function of groundwater pH, and concentrations of sulfate, ammonia, and dissolved organic carbon (DOC). Historical geochemical records indicate that all sampled wells chronically received leachate, and the increase in microbial diversity as a function of distance from the landfill is consistent with mitigation of the impact of leachate on the groundwater system by natural attenuation mechanisms.
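The Mantel test named in this abstract asks whether two distance matrices over the same samples are correlated, judging significance by permuting sample labels. The sketch below is a minimal pure-Python version under stated assumptions: Pearson correlation on the upper triangles, a one-sided test, and 999 permutations by default. The function names and the four-well toy matrices are hypothetical, and nothing here reproduces the study's data or software.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a distance matrix has no variation")
    return cov / sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    identity = list(range(len(dist_a)))
    flat_a = upper_triangle(dist_a, identity)
    observed = pearson(flat_a, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the samples of the second matrix
        if pearson(flat_a, upper_triangle(dist_b, order)) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Hypothetical toy data: community and geochemical distances among four wells.
community = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
geochem = [[0.0, 0.5, 1.1, 1.4], [0.5, 0.0, 0.6, 0.9],
           [1.1, 0.6, 0.0, 0.3], [1.4, 0.9, 0.3, 0.0]]
r, p = mantel(community, geochem)
print(r, p)
```

With only four samples there are just 24 possible relabelings, so the p-value cannot be small; real analyses need more samples.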
Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung
2017-08-08
We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
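The region lengths quoted in this abstract can be cross-checked against the stated genome sizes, since a quadripartite chloroplast genome is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below does that arithmetic. It assumes that the first value in each quoted range belongs to A. halleri ssp. gemmifera and the second to A. lyrata ssp. petraea, which the abstract does not state explicitly; the function name is illustrative.

```python
def quadripartite_length(lsc: int, ssc: int, inverted_repeat: int) -> int:
    """Total cp genome length: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * inverted_repeat


# Assumed pairing of the quoted values with the two subspecies.
halleri = quadripartite_length(lsc=84_197, ssc=17_738, inverted_repeat=26_264)
lyrata = quadripartite_length(lsc=84_158, ssc=17_813, inverted_repeat=26_259)

print(halleri)  # 154463
print(lyrata)   # 154489

# Gene counts quoted in the abstract: protein-coding + rRNA + tRNA.
print(85 + 8 + 37)  # 130
```

Under this pairing both totals fall inside the stated 154.4 to 154.5 kbp range, and the three gene classes sum to the reported 130 genes.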
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lewis, P.M.; Crosier, K.E.; Crosier, P.S.
The receptor tyrosine kinase Dtk/Tyro 3/Sky/rse/brt/tif is a member of a new subfamily of receptors that also includes Axl/Ufo/Ark and Eyk/Mer. These receptors are characterized by the presence of two immunoglobulin-like loops and two fibronectin type III repeats in their extracellular domains. The structure of the murine Dtk gene has been determined. The gene consists of 21 exons that are distributed over 21 kb of genomic DNA. An isoform of Dtk is generated by differential splicing of exons from the 5' region of the gene. The overall genomic structure of Dtk is virtually identical to that determined for the human UFO gene. This particular genomic organization is likely to have been duplicated and closely maintained throughout evolution. 38 refs., 3 figs., 1 tab.
Mutational Analysis of Cell Types in TSC
2008-01-01
disability, and autism. TSC1/TSC2 gene mutations lead to developmental alterations in brain structure known as tubers in over 80% of TSC patients. Loss of...that is associated with epilepsy, cognitive disability, and autism. TSC1/TSC2 gene mutations lead to developmental alterations in brain structure...2000). Comorbid neuropsychological disorders such as autism, mental retardation (MR), pervasive developmental disorder, attention deficit disorder (ADD
Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui
2016-01-01
WRKY proteins comprise one of the largest transcription factor families in plants and form key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in euphorbiaceous plants.
Genome-wide characterization of the SiDof gene family in foxtail millet (Setaria italica).
Zhang, Li; Liu, Baoling; Zheng, Gewen; Zhang, Aiying; Li, Runzhi
2017-01-01
Dof (DNA binding with one finger) proteins, which constitute a class of transcription factors found exclusively in plants, are involved in numerous physiological and biochemical reactions affecting growth and development. A genome-wide analysis of SiDof genes was performed in this study. Thirty-five SiDof genes were identified, and those genes were unevenly distributed across nine chromosomes in the Setaria italica genome. Protein lengths, molecular weights, and theoretical isoelectric points of SiDofs all vary greatly. Gene structure analysis demonstrated that most SiDof genes lack introns. Phylogenetic analysis of SiDof proteins and Dof proteins from Arabidopsis thaliana, rice, sorghum, and Setaria viridis revealed six major groups. Analysis of RNA-Seq data indicated that SiDof gene expression levels varied across roots, stems, leaves, and spike. In addition, expression profiling of SiDof genes in response to stress suggested that SiDof 7 and SiDof 15 are involved in drought stress signalling. Overall, this study could provide novel information on SiDofs for further investigation in foxtail millet. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Reverse genetics: Its origins and prospects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berg, P.
1991-04-01
The nucleotide sequence of a gene and its flanking segments alone will not tell us how its expression is regulated during development and differentiation, or in response to environmental changes. To comprehend the physiological significance of the molecular details requires biological analysis. Recombinant DNA techniques provide a powerful experimental approach. A strategy termed 'reverse genetics' utilizes the analysis of the activities of mutant and normal genes and experimentally constructed mutants to explore the relationship between gene structure and function, thereby helping elucidate the relationship between genotype and phenotype.
Genome-Wide Analysis of the NAC Gene Family in Physic Nut (Jatropha curcas L.)
Wu, Zhenying; Xu, Xueqin; Xiong, Wangdan; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Wu, Guojiang; Jiang, Huawu
2015-01-01
The NAC proteins (NAM, ATAF1/2 and CUC2) are plant-specific transcriptional regulators that have a conserved NAM domain in the N-terminus. They are involved in various biological processes, including both biotic and abiotic stress responses. In the present study, a total of 100 NAC genes (JcNAC) were identified in physic nut (Jatropha curcas L.). Based on phylogenetic analysis and gene structures, 83 JcNAC genes were classified as members of, or proposed to be diverged from, 39 previously predicted orthologous groups (OGs) of NAC sequences. Physic nut has a single intron-containing NAC gene subfamily that has been lost in many plants. The JcNAC genes are non-randomly distributed across the 11 linkage groups of the physic nut genome, and appear to be preferentially retained duplicates that arose from both ancient and recent duplication events. Digital gene expression analysis indicates that some of the JcNAC genes have tissue-specific expression profiles (e.g. in leaves, roots, stem cortex or seeds), and 29 genes differentially respond to abiotic stresses (drought, salinity, phosphorus deficiency and nitrogen deficiency). Our results will be helpful for further functional analysis of the NAC genes in physic nut. PMID:26125188
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress
Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming
2017-01-01
The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, the knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family, and their expression analysis in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. There are 76 pairs of duplicate genes identified, including genome-wide segmental and tandem duplication events, which lead to the expansion of the number of F-box genes. The in silico expression analysis showed that these genes would be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance. PMID:28417911
GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.
Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H
2010-04-01
A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
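The standard ORA that this abstract contrasts with GO-Bayes scores each GO term with a hypergeometric test: given a gene universe, the number of genes annotated with the term, and the size of the selected gene group, it computes the probability of seeing at least the observed number of annotated genes in the group. The sketch below shows only that per-term baseline, not the GO-Bayes model, which additionally borrows evidence from related terms. The function and parameter names and the example counts are illustrative assumptions.

```python
from math import comb


def hypergeom_ora_pvalue(population: int, annotated: int,
                         selected: int, hits: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population: genes in the universe
    annotated:  genes in the universe carrying the GO term
    selected:   genes in the group of interest (e.g. differentially expressed)
    hits:       selected genes that carry the GO term
    """
    if not 0 <= hits <= min(annotated, selected):
        raise ValueError("hits must lie between 0 and min(annotated, selected)")
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, min(annotated, selected) + 1)
    )
    return tail / total


# Hypothetical counts: 10,000 genes, 200 annotated, 100 selected, 8 overlapping.
print(hypergeom_ora_pvalue(10_000, 200, 100, 8))
```

Because each term is tested in isolation, results for a parent term and its children are not linked, which is the limitation the abstract says GO-Bayes is designed to address.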
Homez, a homeobox leucine zipper gene specific to the vertebrate lineage.
Bayarsaihan, Dashzeveg; Enkhmandakh, Badam; Makeyev, Aleksandr; Greally, John M; Leckman, James F; Ruddle, Frank H
2003-09-02
This work describes a vertebrate homeobox gene, designated Homez (homeodomain leucine zipper-encoding gene), that encodes a protein with an unusual structural organization. There are several regions within Homez, including three atypical homeodomains, two leucine zipper-like motifs, and an acidic domain. The gene is ubiquitously expressed in human and murine tissues, although the expression pattern is more restricted during mouse development. Genomic analysis revealed that human and mouse genes are located at 14q11.2 and 14C, respectively, and are composed of two exons. The zebrafish and pufferfish homologs share high similarity to mammalian sequences, particularly within the homeodomain sequences. Based on homology of homeodomains and on the similarity in overall protein structure, we delineate Homez and members of ZHX family of zinc finger homeodomain factors as a subset within the superfamily of homeobox-containing proteins. The type and composition of homeodomains in the Homez subfamily are vertebrate-specific. Phylogenetic analysis indicates that Homez lineage was separated from related genes >400 million years ago before separation of ray- and lobe-finned fishes. We apply a duplication-degeneration-complementation model to explain how this family of genes has evolved.
Scieglinska, D; Widłak, W; Konopka, W; Poutanen, M; Rahman, N; Huhtaniemi, I; Krawczyk, Z
2001-01-01
The rat Hst70 gene and its mouse counterpart Hsp70.2 belong to the family of Hsp70 heat shock genes and are specifically expressed in male germ cells. Previous studies regarding the structure of the 5' region of the transcription unit of these genes as well as localization of the 'cis' elements conferring their testis-specific expression gave contradictory results [Widlak, Markkula, Krawczyk, Kananen and Huhtaniemi (1995) Biochim. Biophys. Acta 1264, 191-200; Dix, Rosario-Herrle, Gotoh, Mori, Goulding, Barret and Eddy (1996) Dev. Biol. 174, 310-321]. In the present paper we resolve these controversies and show that the 5' untranslated region (UTR) of the Hst70 gene contains an intron located similarly to that of the mouse Hsp70.2 gene. Reverse transcriptase-mediated PCR, Northern blotting and RNase protection analysis revealed that transcription of both genes initiates at two main distant sites, one of which is located within the intron. As a result, two populations of Hst70 gene transcripts with similar sizes but different 5' UTR structures can be detected in total testicular RNA. Functional analysis of the Hst70 gene promoter in transgenic mice and transient transfection assays showed that the DNA fragment of approx. 360 bp located upstream of the ATG translation start codon is the minimal promoter required for testis-specific expression of the HST70/chloramphenicol acetyltransferase transgene. These experiments also suggest that the expression of the gene may depend on 'cis' regulatory elements located within exon 1 and the intron sequences. PMID:11563976
2012-01-01
Background During sexual development, filamentous ascomycetes form complex, three-dimensional fruiting bodies for the protection and dispersal of sexual spores. Fruiting bodies contain a number of cell types not found in vegetative mycelium, and these morphological differences are thought to be mediated by changes in gene expression. However, little is known about the spatial distribution of gene expression in fungal development. Here, we used laser microdissection (LM) and RNA-seq to determine gene expression patterns in young fruiting bodies (protoperithecia) and non-reproductive mycelia of the ascomycete Sordaria macrospora. Results Quantitative analysis showed major differences in the gene expression patterns between protoperithecia and total mycelium. Among the genes strongly up-regulated in protoperithecia were the pheromone precursor genes ppg1 and ppg2. The up-regulation was confirmed by fluorescence microscopy of egfp expression under the control of ppg1 regulatory sequences. RNA-seq analysis of protoperithecia from the sterile mutant pro1 showed that many genes that are differentially regulated in these structures are under the genetic control of transcription factor PRO1. Conclusions We have generated transcriptional profiles of young fungal sexual structures using a combination of LM and RNA-seq. This allowed a high spatial resolution and sensitivity, and yielded a detailed picture of gene expression during development. Our data revealed significant differences in gene expression between protoperithecia and non-reproductive mycelia, and showed that the transcription factor PRO1 is involved in the regulation of many genes expressed specifically in sexual structures. The LM/RNA-seq approach will also be relevant to other eukaryotic systems in which multicellular development is investigated. PMID:23016559
Schmitz, Judith; Lor, Stephanie; Klose, Rena; Güntürkün, Onur; Ocklenburg, Sebastian
2017-01-01
Handedness and language lateralization are partially determined by genetic influences. It has been estimated that at least 40 (and potentially more) possibly interacting genes may influence the ontogenesis of hemispheric asymmetries. Recently, it has been suggested that analyzing the genetics of hemispheric asymmetries on the level of gene ontology sets, rather than at the level of individual genes, might be more informative for understanding the underlying functional cascades. Here, we performed gene ontology, pathway and disease association analyses on genes that have previously been associated with handedness and language lateralization. Significant gene ontology sets for handedness were anatomical structure development, pattern specification (especially asymmetry formation) and biological regulation. Pathway analysis highlighted the importance of the TGF-beta signaling pathway for handedness ontogenesis. Significant gene ontology sets for language lateralization were responses to different stimuli, nervous system development, transport, signaling, and biological regulation. Despite the fact that some authors assume that handedness and language lateralization share a common ontogenetic basis, gene ontology sets barely overlap between phenotypes. Compared to genes involved in handedness, which mostly contribute to structural development, genes involved in language lateralization rather contribute to activity-dependent cognitive processes. Disease association analysis revealed associations of genes involved in handedness with diseases affecting the whole body, while genes involved in language lateralization were specifically engaged in mental and neurological diseases. These findings further support the idea that handedness and language lateralization are ontogenetically independent, complex phenotypes. PMID:28729848
Evolution of the Structure and Chromosomal Distribution of Histidine Biosynthetic Genes
NASA Astrophysics Data System (ADS)
Fani, Renato; Mori, Elena; Tamburini, Elena; Lazcano, Antonio
1998-10-01
A database of more than 100 histidine biosynthetic genes from different organisms belonging to the three primary domains has been analyzed, including those found in the now completely sequenced genomes of Haemophilus influenzae, Mycoplasma genitalium, Synechocystis sp., Methanococcus jannaschii, and Saccharomyces cerevisiae. The ubiquity of his genes suggests that it is a highly conserved pathway that was probably already present in the last common ancestor of all extant life. The chromosomal distribution of the his genes shows that the enterobacterial histidine operon structure is not the only possible organization, and that there is a diversity of gene arrays for the his pathway. Analysis of the available sequences shows that gene fusions (like those involved in the origin of the Escherichia coli and Salmonella typhimurium hisIE and hisB gene structures) are not universal. In contrast, the elongation event that led to the extant hisA gene from two homologous ancestral modules, as well as the subsequent paralogous duplication that originated hisF, appear to be irreversible and are conserved in all known organisms. The available evidence supports the hypothesis that histidine biosynthesis was assembled by a gene recruitment process.
Caffeine exposure alters cardiac gene expression in embryonic cardiomyocytes
Fang, Xiefan; Mei, Wenbin; Barbazuk, William B.; Rivkees, Scott A.
2014-01-01
Previous studies demonstrated that in utero caffeine treatment at embryonic day (E) 8.5 alters DNA methylation patterns, gene expression, and cardiac function in adult mice. To provide insight into the mechanisms, we examined cardiac gene and microRNA (miRNA) expression in cardiomyocytes shortly after exposure to physiologically relevant doses of caffeine. In HL-1 and primary embryonic cardiomyocytes, caffeine treatment for 48 h significantly altered the expression of cardiac structural genes (Myh6, Myh7, Myh7b, Tnni3), hormonal genes (Anp and BnP), cardiac transcription factors (Gata4, Mef2c, Mef2d, Nfatc1), and microRNAs (miRNAs; miR208a, miR208b, miR499). In addition, expressions of these genes were significantly altered in embryonic hearts exposed to in utero caffeine. For in utero experiments, pregnant CD-1 dams were treated with 20–60 mg/kg of caffeine, which resulted in maternal circulation levels of 37.3–65.3 μM 2 h after treatment. RNA sequencing was performed on embryonic ventricles treated with vehicle or 20 mg/kg of caffeine daily from E6.5-9.5. Differential expression (DE) analysis revealed that 124 genes and 849 transcripts were significantly altered, and differential exon usage (DEU) analysis identified 597 exons that were changed in response to prenatal caffeine exposure. Among the DE genes identified by RNA sequencing were several cardiac structural genes and genes that control DNA methylation and histone modification. Pathway analysis revealed that pathways related to cardiovascular development and diseases were significantly affected by caffeine. In addition, global cardiac DNA methylation was reduced in caffeine-treated cardiomyocytes. Collectively, these data demonstrate that caffeine exposure alters gene expression and DNA methylation in embryonic cardiomyocytes. PMID:25354728
Xia, Jun Hong; Li, Hong Lian; Li, Bi Jun; Gu, Xiao Hui; Lin, Hao Ran
2018-01-10
Hypoxia is one of the critical environmental stressors for fish in aquatic environments. Although accumulating evidence indicates that gene expression in fish is regulated by hypoxia stress, how genes undergo differential expression and/or alternative splicing (AS) in the heart in response to hypoxia stress is not well understood. Using RNA-seq, we detected 289 differentially expressed genes (DEGs) and 103 genes that undergo differential usage of exons and splice junctions (DUES) in the heart of a hypoxia-tolerant fish, Nile tilapia (Oreochromis niloticus), following 12 h of hypoxic treatment. Spatio-temporal expression analysis validated the significant differential exon usage in two randomly selected DUES genes (fam162a and ndrg2) in five tissues (heart, liver, brain, gill and spleen) sampled at three time points (6 h, 12 h, and 24 h) under acute hypoxia treatment. Functional analysis significantly associated the differentially expressed genes with categories related to energy conservation, protein synthesis and immune response. The DEG and DUES datasets were enriched for different categories. Isomerase activity, Oxidoreductase activity, Glycolysis and Oxidative stress process were significantly enriched in the DEG dataset, whereas Structural constituent of ribosome, Structural molecule activity, Ribosomal protein and RNA binding protein were significantly enriched only in the DUES genes. Our comparative transcriptomic analysis reveals abundant stress-responsive genes and their differential regulation in the heart tissue of Nile tilapia under acute hypoxia stress. Our findings will facilitate future investigation of transcriptome complexity and AS regulation during hypoxia stress in fish. Copyright © 2017 Elsevier B.V. All rights reserved.
The Drosophila transcriptional network is structured by microbiota.
Dobson, Adam J; Chaston, John M; Douglas, Angela E
2016-11-25
Resident microorganisms (microbiota) have far-reaching effects on the biology of their animal hosts, with major consequences for the host's health and fitness. A full understanding of microbiota-dependent gene regulation requires analysis of the overall architecture of the host transcriptome, by identifying suites of genes that are expressed synchronously. In this study, we investigated the impact of the microbiota on gene coexpression in Drosophila. Our transcriptomic analysis, of 17 lines representative of the global genetic diversity of Drosophila, yielded a total of 11 transcriptional modules of co-expressed genes. For seven of these modules, the strength of the transcriptional network (defined as gene-gene coexpression) differed significantly between flies bearing a defined gut microbiota (gnotobiotic flies) and flies reared under microbiologically sterile conditions (axenic flies). Furthermore, gene coexpression was uniformly stronger in these microbiota-dependent modules than in both the microbiota-independent modules in gnotobiotic flies and all modules in axenic flies, indicating that the presence of the microbiota directs gene regulation in a subset of the transcriptome. The genes constituting the microbiota-dependent transcriptional modules include regulators of growth, metabolism and neurophysiology, previously implicated in mediating phenotypic effects of microbiota on Drosophila phenotype. Together these results provide the first evidence that the microbiota enhances the coexpression of specific and functionally-related genes relative to the animal's intrinsic baseline level of coexpression. Our system-wide analysis demonstrates that the presence of microbiota enhances gene coexpression, thereby structuring the transcriptional network in the animal host. 
This finding has potentially major implications for understanding of the mechanisms by which microbiota affect host health and fitness, and the ways in which hosts and their resident microbiota coevolve.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2017-04-01
Functional sites define the diversity of protein functions and are central objects in research on the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution include duplication, shuffling, insertion and deletion of exons in genes. Studying the correlation between site structure and exon structure provides a basis for an in-depth understanding of site organization. In this regard, the development of software resources that allow mutual projection of the exon structure of genes and the primary and tertiary structures of the encoded proteins remains a relevant problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, in SitEx 2.0 we added the projection of site positions onto the exon structures of orthologs. We implemented a database search using site conservation variability and site discontinuity across the exon structure. Including information on orthologs expands the possibilities of using SitEx to solve problems in the analysis of the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/ .
Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes
Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise
2009-01-01
Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high levels of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis of the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics. 
Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885
Braberg, Hannes; Moehle, Erica A.; Shales, Michael; Guthrie, Christine; Krogan, Nevan J.
2014-01-01
We have achieved a residue-level resolution of genetic interaction mapping – a technique that measures how the function of one gene is affected by the alteration of a second gene – by analyzing point mutations. Here, we describe how to interpret point mutant genetic interactions, and outline key applications for the approach, including interrogation of protein interaction interfaces and active sites, and examination of post-translational modifications. Genetic interaction analysis has proven effective for characterizing cellular processes; however, to date, systematic high-throughput genetic interaction screens have relied on gene deletions or knockdowns, which limits the resolution of gene function analysis and poses problems for multifunctional genes. Our point mutant approach addresses these issues, and further provides a tool for in vivo structure-function analysis that complements traditional biophysical methods. We also discuss the potential for genetic interaction mapping of point mutations in human cells and its application to personalized medicine. PMID:24842270
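The idea of measuring how one gene's function is affected by altering a second gene is often quantified as the deviation of a double-mutant phenotype from the value expected if the two alterations were independent. The sketch below assumes a multiplicative fitness model, a common convention that is not stated in the abstract above. The function name and the fitness values are hypothetical and are not the scoring scheme used by the authors.

```python
def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Genetic interaction score under an assumed multiplicative model.

    w_a, w_b: fitness of each single mutant relative to wild type (wild type = 1.0)
    w_ab:     measured fitness of the double mutant
    Returns observed minus expected double-mutant fitness:
    negative = aggravating interaction, positive = alleviating interaction.
    """
    expected = w_a * w_b  # independence assumption: effects multiply
    return w_ab - expected


if __name__ == "__main__":
    # Hypothetical values: two point mutants and their combination.
    print(interaction_score(0.8, 0.5, 0.1))  # below expectation: aggravating
    print(interaction_score(0.8, 0.5, 0.7))  # above expectation: alleviating
```

With point mutations, the same score can be computed for each mutated residue separately, which is what gives the residue-level resolution the abstract describes.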
Ma, Jun; Liu, Fang; Wang, Qinglian; Wang, Kunbo; Jones, Don C.; Zhang, Baohong
2016-01-01
TCP proteins are plant-specific transcription factors implicated in a variety of physiological functions during plant growth and development. In the current study, we performed the first comprehensive analysis of the TCP gene family in a diploid cotton species, Gossypium arboreum, including phylogenetic analysis, chromosome location, gene duplication status, gene structure and conserved motif analysis, as well as expression profiles in fiber at different developmental stages. Our results showed that G. arboreum contains 36 TCP genes, distributed across all thirteen chromosomes. GaTCPs within the same subclade of the phylogenetic tree shared similar exon/intron organization and motif composition. In addition, both segmental duplication and whole-genome duplication contributed significantly to the expansion of GaTCPs. Many of these TCP transcription factor genes are specifically expressed in cotton fiber during different developmental stages, including cotton fiber initiation and early development. This suggests that TCP genes may play important roles in cotton fiber development. PMID:26857372
Genomic analysis of Staphylococcus phage Stau2 isolated from medical specimen.
Hsieh, Sue-Er; Tseng, Yi-Hsiung; Lo, Hsueh-Hsia; Chen, Shui-Tu; Wu, Cheng-Nan
2016-02-01
Stau2 is a lytic myophage of Staphylococcus aureus isolated from a medical specimen. Exhibiting a broad host range against S. aureus clinical isolates, Stau2 is potentially useful for topical phage therapy or as an additive in food preservation. In this study, Stau2 was shown for the first time to possess a circularly permuted linear genome of 133,798 bp, with low G + C content, containing 146 open reading frames but encoding no tRNA. The genome is organized into several modules containing genes for packaging, structural proteins, replication/transcription and host-cell lysis, with the structural protein and DNA polymerase modules organized similarly to those in Twort-like phages of Staphylococcus. With the encoded DNA replication genes, Stau2 can possibly use its own system for replication. In addition, in silico analysis found several introns in seven genes, including those involved in DNA metabolism, packaging, and structure, and one of these genes (the helicase gene) was experimentally confirmed to undergo splicing. Furthermore, phylogenetic analysis suggested that Stau2 is most closely related to Staphylococcus phages SA11 and Remus, members of the Twort-like phages. Sodium dodecyl sulfate polyacrylamide gel electrophoresis showed 14 structural proteins of Stau2, and N-terminal sequencing identified three of them. Importantly, this phage does not encode any proteins that are known or suspected to be involved in toxicity, pathogenicity, or antibiotic resistance. Therefore, further investigation of the feasibility of therapeutic application of Stau2 is warranted.
Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay
2004-01-01
Background Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175
Hu, Wei; Yang, Hubiao; Yan, Yan; Wei, Yunxie; Tie, Weiwei; Ding, Zehong; Zuo, Jiao; Peng, Ming; Li, Kaimian
2016-03-07
The basic leucine zipper (bZIP) transcription factor family plays crucial roles in various aspects of biological processes. Currently, no information is available regarding the bZIP family in the important tropical crop cassava. Herein, 77 bZIP genes were identified from cassava. Evolutionary analysis indicated that MebZIPs could be divided into 10 subfamilies, which was further supported by conserved motif and gene structure analyses. Global expression analysis suggested that MebZIPs showed similar or distinct expression patterns in different tissues between cultivated variety and wild subspecies. Transcriptome analysis of three cassava genotypes revealed that many MebZIP genes were activated by drought in the root of W14 subspecies, indicating the involvement of these genes in the strong resistance of cassava to drought. Expression analysis of selected MebZIP genes in response to osmotic, salt, cold, ABA, and H2O2 suggested that they might participate in distinct signaling pathways. Our systematic analysis of MebZIPs reveals constitutive, tissue-specific and abiotic stress-responsive candidate MebZIP genes for further functional characterization in planta, yields new insights into transcriptional regulation of MebZIP genes, and lays a foundation for understanding of bZIP-mediated abiotic stress response.
Bie, Luyao; Wu, Hao; Wang, Xin-Hua; Wang, Mingyu; Xu, Hai
2017-08-01
Integrative and conjugative elements (ICEs) are self-transmissible chromosomal mobile elements that play significant roles in the dissemination of antimicrobial resistance genes. Identification of the structures and functions of ICEs, particularly those in pathogens, improves understanding of the dissemination of antimicrobial resistance. This study identified new members of the sulfamethoxazole-trimethoprim (SXT)/R391 family of ICEs that could confer multi-drug resistance in the opportunistic pathogen Proteus mirabilis, characterized their genetic structures, and explored their evolutionary connection with other members of this family of ICEs. Three new members of the SXT/R391 family of ICEs were detected in six of 77 P. mirabilis strains isolated in China: ICEPmiChn2 (one strain), ICEPmiChn3 (one strain) and ICEPmiChn4 (three strains). All three new ICEs harbour antimicrobial resistance genes from diverse origins, suggesting their capability in acquiring foreign genes and serving as important carriers for antimicrobial resistance genes. Structural analysis showed that ICEPmiChn3 is a particularly interesting and unique ICE that has lost core genes involved in conjugation, and could not transfer to other cells via conjugation. This finding confirmed the key roles of these missing genes in conjugation. Further phylogenetic analysis suggested that ICEs in geographically close strains are also connected evolutionarily, and ICEPmiChn3 lost its conjugation cassette from a former mobile ICE. The identification and characterization of the three new members of the SXT/R391 family of ICEs in this work leads to suggestions of core ICE genes essential for conjugation, and extends understanding on the structures of ICEs, evolutionary relationships between ICEs, and the antimicrobial resistance mechanisms of P. mirabilis. Copyright © 2017 Elsevier B.V. and International Society of Chemotherapy. All rights reserved.
Fine-Scale Analysis Reveals Cryptic Landscape Genetic Structure in Desert Tortoises
Latch, Emily K.; Boarman, William I.; Walde, Andrew; Fleischer, Robert C.
2011-01-01
Characterizing the effects of landscape features on genetic variation is essential for understanding how landscapes shape patterns of gene flow and spatial genetic structure of populations. Most landscape genetics studies have focused on patterns of gene flow at a regional scale. However, the genetic structure of populations at a local scale may be influenced by a unique suite of landscape variables that have little bearing on connectivity patterns observed at broader spatial scales. We investigated fine-scale spatial patterns of genetic variation and gene flow in relation to features of the landscape in desert tortoise (Gopherus agassizii), using 859 tortoises genotyped at 16 microsatellite loci with associated data on geographic location, sex, elevation, slope, and soil type, and spatial relationship to putative barriers (power lines, roads). We used spatially explicit and non-explicit Bayesian clustering algorithms to partition the sample into discrete clusters, and characterize the relationships between genetic distance and ecological variables to identify factors with the greatest influence on gene flow at a local scale. Desert tortoises exhibit weak genetic structure at a local scale, and we identified two subpopulations across the study area. Although genetic differentiation between the subpopulations was low, our landscape genetic analysis identified both natural (slope) and anthropogenic (roads) landscape variables that have significantly influenced gene flow within this local population. We show that desert tortoise movements at a local scale are influenced by features of the landscape, and that these features are different than those that influence gene flow at larger scales. Our findings are important for desert tortoise conservation and management, particularly in light of recent translocation efforts in the region. 
More generally, our results indicate that recent landscape changes can affect gene flow at a local scale and that their effects can be detected almost immediately. PMID:22132143
Mapping the polysaccharide degradation potential of Aspergillus niger
2012-01-01
Background: The degradation of plant materials by enzymes is an industry of increasing importance. For sustainable production of second generation biofuels and other products of industrial biotechnology, efficient degradation of non-edible plant polysaccharides such as hemicellulose is required. For each type of hemicellulose, a complex mixture of enzymes is required for complete conversion to fermentable monosaccharides. In plant-biomass degrading fungi, these enzymes are regulated and released by complex regulatory structures. In this study, we present a methodology for evaluating the potential of a given fungus for polysaccharide degradation. Results: Through the compilation of information from 203 articles, we have systematized knowledge on the structure and degradation of 16 major types of plant polysaccharides to form a graphical overview. As a case example, we have combined this with a list of 188 genes coding for carbohydrate-active enzymes from Aspergillus niger, thus forming an analysis framework, which can be queried. Combination of this information network with gene expression analysis on mono- and polysaccharide substrates has allowed elucidation of concerted gene expression from this organism. One such example is the identification of a full set of extracellular polysaccharide-acting genes for the degradation of oat spelt xylan. Conclusions: The mapping of plant polysaccharide structures along with the corresponding enzymatic activities is a powerful framework for expression analysis of carbohydrate-active enzymes. Applying this network-based approach, we provide the first genome-scale characterization of all genes coding for carbohydrate-active enzymes identified in A. niger. PMID:22799883
Mapping the polysaccharide degradation potential of Aspergillus niger.
Andersen, Mikael R; Giese, Malene; de Vries, Ronald P; Nielsen, Jens
2012-07-16
The degradation of plant materials by enzymes is an industry of increasing importance. For sustainable production of second generation biofuels and other products of industrial biotechnology, efficient degradation of non-edible plant polysaccharides such as hemicellulose is required. For each type of hemicellulose, a complex mixture of enzymes is required for complete conversion to fermentable monosaccharides. In plant-biomass degrading fungi, these enzymes are regulated and released by complex regulatory structures. In this study, we present a methodology for evaluating the potential of a given fungus for polysaccharide degradation. Through the compilation of information from 203 articles, we have systematized knowledge on the structure and degradation of 16 major types of plant polysaccharides to form a graphical overview. As a case example, we have combined this with a list of 188 genes coding for carbohydrate-active enzymes from Aspergillus niger, thus forming an analysis framework, which can be queried. Combination of this information network with gene expression analysis on mono- and polysaccharide substrates has allowed elucidation of concerted gene expression from this organism. One such example is the identification of a full set of extracellular polysaccharide-acting genes for the degradation of oat spelt xylan. The mapping of plant polysaccharide structures along with the corresponding enzymatic activities is a powerful framework for expression analysis of carbohydrate-active enzymes. Applying this network-based approach, we provide the first genome-scale characterization of all genes coding for carbohydrate-active enzymes identified in A. niger.
Zhu, Xinyu; Ma, Hong; Chen, Zhiduan
2011-03-09
Plants contain numerous Su(var)3-9 homologues (SUVH) and related (SUVR) genes, some of which await functional characterization. Although there have been studies on the evolution of plant Su(var)3-9 SET genes, a systematic evolutionary study including major land plant groups has not been reported. Large-scale phylogenetic and evolutionary analyses can help to elucidate the underlying molecular mechanisms and contribute to improve genome annotation. Putative orthologs of plant Su(var)3-9 SET protein sequences were retrieved from major representatives of land plants. A novel clustering that included most members analyzed, henceforth referred to as core Su(var)3-9 homologues and related (cSUVHR) gene clade, was identified as well as all orthologous groups previously identified. Our analysis showed that plant Su(var)3-9 SET proteins possessed a variety of domain organizations, and can be classified into five types and ten subtypes. Plant Su(var)3-9 SET genes also exhibit a wide range of gene structures among different paralogs within a family, even in the regions encoding conserved PreSET and SET domains. We also found that the majority of SUVH members were intronless and formed three subclades within the SUVH clade. A detailed phylogenetic analysis of the plant Su(var)3-9 SET genes was performed. A novel deep phylogenetic relationship including most plant Su(var)3-9 SET genes was identified. Additional domains such as SAR, ZnF_C2H2 and WIYLD were early integrated into primordial PreSET/SET/PostSET domain organization. At least three classes of gene structures had been formed before the divergence of Physcomitrella patens (moss) from other land plants. One or multiple retroposition events might have occurred among SUVH genes with the donor genes leading to the V-2 orthologous group. The structural differences among evolutionary groups of plant Su(var)3-9 SET genes with different functions were described, contributing to the design of further experimental studies.
USDA-ARS's Scientific Manuscript database
Cycles of whole genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied...
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-03-01
Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.
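The nucleotide and amino acid identity ranges quoted in the abstract above (81% to 98% and 93-99%) are pairwise percent identities. The abstract does not state which software or gap convention was used, so the following is only a minimal sketch under one common assumption: the two sequences are already aligned to equal length, and columns containing a gap (`-`) in either sequence are skipped. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the source): alignment columns with a gap in
    either sequence are ignored rather than counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions (for example, counting gaps as mismatches) give lower values for the same alignment, so identities from different tools are not always directly comparable.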
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-01-01
Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199
Liu, S; Liu, L; Tang, Y; Xiong, S; Long, J; Liu, Z; Tian, N
2017-07-01
The regulatory mechanism of flavonoids, which synergise anti-malarial and anti-cancer compounds in Artemisia annua, is still unclear. In this study, an anthocyanidin-accumulating mutant callus was induced from A. annua and comparative transcriptomic analysis of wild-type and mutant calli performed, based on the next-generation Illumina/Solexa sequencing platform and de novo assembly. A total of 82,393 unigenes were obtained and 34,764 unigenes were annotated in the public database. Among these, 87 unigenes were assigned to 14 structural genes involved in the flavonoid biosynthetic pathway and 37 unigenes were assigned to 17 structural genes related to metabolism of flavonoids. More than 30 unigenes were assigned to regulatory genes, including R2R3-MYB, bHLH and WD40, which might regulate flavonoid biosynthesis. A further 29 unigenes encoding flavonoid biosynthetic enzymes or transcription factors were up-regulated in the mutant, while 19 unigenes were down-regulated, compared with the wild type. Expression levels of nine genes involved in the flavonoid pathway were compared using semi-quantitative RT-PCR, and results were consistent with comparative transcriptomic analysis. Finally, a putative flavonol synthase gene (AaFLS1) was identified from enzyme assay in vitro and in vivo through heterogeneous expression, and confirmed comparative transcriptomic analysis of wild-type and mutant callus. The present work has provided important target genes for the regulation of flavonoid biosynthesis in A. annua. © 2017 German Botanical Society and The Royal Botanical Society of the Netherlands.
2014-01-01
Background: The oriental fruit fly, Bactrocera dorsalis s.s., is one of the most important quarantine pests in many countries, including China. Although the oriental fruit fly has been investigated extensively, its origins and genetic structure remain disputed. In this study, the NADH dehydrogenase subunit 1 (ND1) gene was used as a genetic marker to examine the genetic diversity, population structure, and gene flow of B. dorsalis s.s. throughout its range in China and southeast Asia. Results: Haplotype networks and phylogenetic analysis indicated two distinguishable lineages of the fly population but provided no strong support for geographical subdivision in B. philippinensis. Demographic analysis revealed rapid expansion of B. dorsalis s.s. populations in China and Southeast Asia in recent years. The greatest amount of genetic diversity was observed in Manila, Pattaya, and Bangkok, and asymmetric migration patterns were observed in different parts of China. The data collected here further show that B. dorsalis s.s. in Yunnan, Guangdong, and Fujian Provinces, and in Taiwan might have different origins within southeast Asia. Conclusions: Using the mitochondrial ND1 gene, the results of the present study showed B. dorsalis s.s. from different parts of China to have different genetic structures and origins. B. dorsalis s.s. in China and southeast Asia was found to have experienced rapid expansion in recent years. Data further support the existence of two distinguishable lineages of B. dorsalis s.s. in China and indicate genetic diversity and gene flow from multiple origins. The sequences in this paper have been deposited in GenBank/NCBI under accession numbers KC413034–KC413367. PMID:24655832
Evidence-based gene models for structural and functional annotations of the oil palm genome.
Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie
2017-09-08
Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. 
We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database (http://palmxplore.mpob.gov.my), will provide important resources for studies on the genomes of oil palm and related crops. This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.
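The abstract above defines GC3 as the fraction of cytosine and guanine in the third position of a codon and classifies genes at a 0.75286 threshold. A minimal sketch of that calculation follows. The function names `gc3` and `is_gc3_rich` are hypothetical, and the code assumes an in-frame coding sequence starting at the first codon position; it is not the authors' pipeline.

```python
GC3_RICH_THRESHOLD = 0.75286  # threshold quoted in the abstract


def gc3(cds: str) -> float:
    """Fraction of G or C bases at third codon positions.

    Assumes `cds` is an in-frame coding sequence; trailing bases that
    do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3       # drop any incomplete final codon
    third_positions = cds[2:usable:3]      # bases at indices 2, 5, 8, ...
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


def is_gc3_rich(cds: str) -> bool:
    """True if the sequence meets the GC3-rich threshold from the abstract."""
    return gc3(cds) >= GC3_RICH_THRESHOLD
```

For example, `gc3("ATGGCCAAG")` looks at the third bases G, C and G of the codons ATG, GCC and AAG.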
NASA Astrophysics Data System (ADS)
Xue, Zhuang; Li, Hui; Liu, Yang; Zhou, Wei; Sun, Jing; Wang, Xiuli
2017-12-01
As a 'living fossil' of species origin and a 'rich treasure' for food and nutrition development, the sea cucumber has received a lot of attention from researchers. cDNA library construction and EST sequencing of blood had been conducted previously in our lab. The bioinformatic analysis provided a gene fragment highly homologous to genes of the lectin family, named AjL (Apostichopus japonicus lectin). To characterize and determine the phylogeny of AjL genes in early evolution, we isolated a full-length cDNA of a lectin gene from the body wall of A. japonicus. The open reading frame of this gene contained 489 bp and encoded a 163-amino-acid secretory protein homologous to lectins of mammals and aquatic organisms. The deduced protein included a lectin-like domain. SDS-PAGE analysis showed that AjL migrated as a specific band (about 36.09 kDa under reducing conditions) and agglutinated rabbit red blood cells. AjL was similar to chain A of CEL-IV in spatial structure. We predicted that AjL may play the same role as CEL-IV. Our results suggested that more than one lectin gene functioned in sea cucumber and most other species, which was fused by uncertain sequences during the evolution and encoded different proteins with diverse functions. Our findings provide insights into the function and characteristics of lectin genes in invertebrates. The results will also be helpful for the identification and structural, functional, and evolutionary analyses of lectin genes.
Structural and functional annotation of the porcine immunome
2013-01-01
Background: The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems. Results: The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. 
A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome. Conclusions: This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response. PMID:23676093
The genomic organization of the Fanconi anemia group A (FAA) gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ianzano, L.; Centra, M.; Savino, M.
1997-05-01
Fanconi anemia (FA) is a genetically heterogeneous disease involving at least five genes on the basis of complementation analysis (FAA to FAE). The FAA gene has been recently isolated by two independent approaches, positional and functional cloning. In the present study we describe the genomic structure of the FAA gene. The gene contains 43 exons spanning approximately 80 kb as determined by the alignment of four cosmids and the fine localization of the first and the last exons in restriction fragments of these clones. Exons range from 34 to 188 bp. All but three of the splice sites were consistent with the ag-gt rule. We also describe three alternative splicing events in cDNA clones that result in the loss of exon 37, a 23-bp deletion at the 5′ end of exon 41. Sequence analysis of the 5′ region upstream of the putative transcription start site showed no obvious TATA and CAAT boxes, but did show a GC-rich region, typical of housekeeping genes. Knowledge of the structure of the FAA gene will provide an invaluable resource for the discovery of mutations in the gene that accounts for about 60-66% of FA patients. 24 refs., 3 figs., 1 tab.
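The ag-gt rule mentioned in the abstract above refers to the canonical dinucleotides at intron boundaries: an intron normally begins with GT at its 5′ (donor) end and finishes with AG at its 3′ (acceptor) end. A minimal sketch of such a check is shown below. The function name `follows_gt_ag_rule` is hypothetical, and the input is assumed to be the intron sequence on the coding strand, without flanking exon bases.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """True if an intron starts with GT and ends with AG.

    Assumes `intron` is the DNA sequence of the intron alone, read
    5' to 3' on the coding strand (an assumption of this sketch).
    """
    seq = intron.upper()
    # Require at least 4 bases so the donor and acceptor do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")
```

A check of this kind would flag the few non-canonical splice sites the abstract reports, although it says nothing about whether a canonical site is actually used.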
Li, Si-Bei; OuYang, Wei-Zhi; Hou, Xiao-Jin; Xie, Liang-Liang; Hu, Chun-Gen; Zhang, Jin-Zhi
2015-01-01
Auxin response factors (ARFs) are an important family of proteins in auxin-mediated response, with key roles in various physiological and biochemical processes. To date, a genome-wide overview of the ARF gene family in citrus has not been available. A systematic analysis of this gene family in citrus was begun by carrying out a genome-wide search for the homologs of ARFs. A total of 19 nonredundant ARF genes (CiARF) were found and validated from the sweet orange. A comprehensive overview of the CiARFs was undertaken, including the gene structures, phylogenetic analysis, chromosome locations, conserved motifs of proteins, and cis-elements in promoters of CiARF. Furthermore, expression profiling using real-time PCR revealed expression of many CiARF genes, albeit with different patterns depending on types of tissues and/or developmental stages. Comprehensive expression analysis of these genes was also performed under two hormone treatments using real-time PCR. Indole-3-acetic acid (IAA) and N-1-naphthylphthalamic acid (NPA) treatment experiments revealed differential up-regulation and down-regulation, respectively, of the 19 citrus ARF genes in the callus of sweet orange. Our comprehensive analysis of ARF genes further elucidates the roles of CiARF family members during citrus growth and development process. PMID:25870601
Yan, Yan; Wang, Lianzhe; Ding, Zehong; Tie, Weiwei; Ding, Xupo; Zeng, Changying; Wei, Yunxie; Zhao, Hongliang; Peng, Ming; Hu, Wei
2016-01-01
Mitogen-activated protein kinases (MAPKs) play central roles in plant developmental processes, hormone signaling transduction, and responses to abiotic stress. However, no data are currently available about the MAPK family in cassava, an important tropical crop. Herein, 21 MeMAPK genes were identified from cassava. Phylogenetic analysis indicated that MeMAPKs could be classified into four subfamilies. Gene structure analysis demonstrated that the number of introns in MeMAPK genes ranged from 1 to 10, suggesting large variation among cassava MAPK genes. Conserved motif analysis indicated that all MeMAPKs had typical protein kinase domains. Transcriptomic analysis suggested that MeMAPK genes showed differential expression patterns in distinct tissues and in response to drought stress between wild subspecies and cultivated varieties. Interaction networks and co-expression analyses revealed that crucial pathways controlled by MeMAPK networks may be involved in the differential response to drought stress in different accessions of cassava. Expression of nine selected MAPK genes showed that these genes could comprehensively respond to osmotic, salt, cold, oxidative stressors, and abscisic acid (ABA) signaling. These findings yield new insights into the transcriptional control of MAPK gene expression, provide an improved understanding of abiotic stress responses and signaling transduction in cassava, and lead to potential applications in the genetic improvement of cassava cultivars. PMID:27625666
Cong, Jing; Liu, Xueduan; Lu, Hui; Xu, Han; Li, Yide; Deng, Ye; Li, Diqiang; Zhang, Yuguang
2015-08-20
Tropical rainforests harbor over 50% of all known plant and animal species and provide a variety of key resources and ecosystem services to humans, largely mediated by metabolic activities of soil microbial communities. A deep analysis of soil microbial communities and their roles in ecological processes would improve our understanding of biogeochemical elemental cycles. However, soil microbial functional gene diversity in tropical rainforests and its causative factors remain unclear. GeoChip, which contains almost all of the key functional genes related to biogeochemical cycles, can be used as a specific and sensitive tool for studying microbial gene diversity and metabolic potential. In this study, soil microbial functional gene diversity in tropical rainforest was analyzed using GeoChip technology. Gene categories detected in the tropical rainforest soils were related to different biogeochemical processes, such as carbon (C), nitrogen (N) and phosphorus (P) cycling. The detected genes related to C and P cycling were mostly derived from cultured bacteria. C degradation gene categories for substrates ranging from labile C to recalcitrant C were all detected, and gene abundances in many recalcitrant C degradation gene categories were significantly (P < 0.05) different among three sampling sites. The relative abundance of detected genes related to N cycling was significantly (P < 0.05) different, and these genes were mostly derived from uncultured bacteria. The gene categories related to ammonification had a high relative abundance. Both canonical correspondence analysis and multivariate regression tree analysis showed that soil available N was the factor most correlated with soil microbial functional gene structure. Overall, high microbial functional gene diversity and different soil microbial metabolic potential for different biogeochemical processes were considered to exist in tropical rainforest. 
Soil available N could be the key factor in shaping the soil microbial functional gene structure and metabolic potential.
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.
Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao
2016-04-01
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.
The functional landscape bound to the transcription factors of Escherichia coli K-12.
Pérez-Rueda, Ernesto; Tenorio-Salgado, Silvia; Huerta-Saquero, Alejandro; Balderas-Martínez, Yalbi I; Moreno-Hagelsieb, Gabriel
2015-10-01
Motivated by the experimental evidence accumulated in the last ten years and based on information deposited in RegulonDB, literature searches, and sequence analysis, we analyze the repertoire of 304 DNA-binding transcription factors (TFs) in Escherichia coli K-12. These regulators were grouped into 78 evolutionary families and regulate almost half of the total genes in this bacterium. In structural terms, 60% of TFs are composed of two domains, 30% are monodomain, and 10% have three or four structural domains. As previously noticed, the most abundant DNA-binding domain corresponds to the winged helix-turn-helix, with few alternative DNA-binding structures, resembling the hypothesis of successful protein structures with the emergence of new ones at low scales. In summary, we identified and described the characteristics associated with the DNA-binding TFs in E. coli K-12. We also identified twelve functional modules based on a co-regulated gene matrix. Finally, diverse regulons were predicted based on direct associations between the TFs and potential regulated genes. This analysis should increase our knowledge about gene regulation in the bacterium E. coli K-12, and provide additional clues for comprehensive modelling of transcriptional regulatory networks in other bacteria. Copyright © 2015 Elsevier Ltd. All rights reserved.
Structural and Biochemical Characterization of a Novel Aminopeptidase from Human Intestine
Tykvart, Jan; Bařinka, Cyril; Svoboda, Michal; ...
2015-03-09
N-acetylated α-linked acidic dipeptidase-like protein (NAALADase L), encoded by the NAALADL1 gene, is a close homolog of glutamate carboxypeptidase II, a metallopeptidase that has been intensively studied as a target for imaging and therapy of solid malignancies and neuropathologies. However, neither the physiological functions nor structural features of NAALADase L are known at present. In this paper, we report a thorough characterization of the protein product of the human NAALADL1 gene, including heterologous overexpression and purification, structural and biochemical characterization, and analysis of its expression profile. By solving the NAALADase L x-ray structure, we provide the first experimental evidence that it is a zinc-dependent metallopeptidase with a catalytic mechanism similar to that of glutamate carboxypeptidase II yet distinct substrate specificity. A proteome-based assay revealed that the NAALADL1 gene product possesses previously unrecognized aminopeptidase activity but no carboxy- or endopeptidase activity. These findings were corroborated by site-directed mutagenesis and identification of bestatin as a potent inhibitor of the enzyme. Analysis of NAALADL1 gene expression at both the mRNA and protein levels revealed the small intestine as the major site of protein expression and points toward extensive alternative splicing of the NAALADL1 gene transcript. Taken together, our data imply that the NAALADL1 gene product's primary physiological function is associated with the final stages of protein/peptide digestion and absorption in the human digestive system. Finally, based on these results, we suggest a new name for this enzyme: human ileal aminopeptidase (HILAP).
Nagle, D L; Martin-DeLeon, P; Hough, R B; Bućan, M
1994-01-01
We are studying the chromosomal structure of three developmental mutations, dominant spotting (W), patch (Ph), and rump white (Rw) on mouse chromosome 5. These mutations are clustered in a region containing three genes encoding tyrosine kinase receptors (Kit, Pdgfra, and Flk1). Using probes for these genes and for a closely linked locus, D5Mn125, we established a high-resolution physical map covering approximately 2.8 Mb. The entire chromosomal segment mapped in this study is deleted in the W19H mutation. The map indicates the position of the Ph deletion, which encompasses not more than 400 kb around and including the Pdgfra gene. The map also places the distal breakpoint of the Rw inversion to a limited chromosomal segment between Kit and Pdgfra. In light of the structure of the Ph-W-Rw region, we interpret the previously published complementation analyses as indicating that the pigmentation defect in Rw/+ heterozygotes could be due to the disruption of Kit and/or Pdgfra regulatory sequences, whereas the gene(s) responsible for the recessive lethality of Rw/Rw embryos is not closely linked to the Ph and W loci and maps proximally to the W19H deletion. The structural analysis of chromosomal rearrangements associated with W19H, Ph, and Rw combined with the high-resolution physical mapping points the way toward the definition of these mutations in molecular terms and isolation of homologous genes on human chromosome 4. PMID:8041773
Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan
2009-01-01
We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
Nowrousian, Minou
2009-04-01
During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.
Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A.; Rathbun, Stephen L.; Whitman, William B.
2016-01-01
K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley’s K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing. PMID:27911946
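The Monte Carlo testing idea mentioned in the abstract above can be illustrated with a generic label-shuffling test. This is only a minimal sketch and not the K-shuff implementation: the statistic used here (mean cross-library Hamming distance minus the average within-library distance), the function names, and the toy sequences are all assumptions made for illustration, and the actual IKF and CKF definitions are not reproduced.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(lib):
    """Mean pairwise distance inside one library (needs at least 2 sequences)."""
    pairs = list(combinations(lib, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_cross(lib_a, lib_b):
    """Mean distance over all pairs drawn from two different libraries."""
    total = sum(hamming(a, b) for a in lib_a for b in lib_b)
    return total / (len(lib_a) * len(lib_b))


def statistic(lib_a, lib_b):
    """Hypothetical test statistic: cross distance minus average within distance."""
    return mean_cross(lib_a, lib_b) - 0.5 * (mean_within(lib_a) + mean_within(lib_b))


def shuffle_test(lib_a, lib_b, n_perm=999, seed=0):
    """Shuffle library labels n_perm times and return (observed, p_value)."""
    rng = random.Random(seed)
    observed = statistic(lib_a, lib_b)
    pooled = list(lib_a) + list(lib_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:len(lib_a)], pooled[len(lib_a):]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy libraries (invented data, not from the study).
lib_a = ["AAAA", "AAAT", "AAAA"]
lib_b = ["GGGG", "GGGC", "GGGG"]
observed, p_value = shuffle_test(lib_a, lib_b)
print(observed, p_value)
```

With libraries this small the p-value cannot become very low, because only a handful of distinct label assignments exist; this is in line with the abstract's point that a minimum number of sequences per library is needed for reproducible estimates.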
MAISTAS: a tool for automatic structural evaluation of alternative splicing products.
Floris, Matteo; Raimondo, Domenico; Leoni, Guido; Orsini, Massimiliano; Marcatili, Paolo; Tramontano, Anna
2011-06-15
Analysis of the human genome revealed that the amount of transcribed sequence is an order of magnitude greater than the number of predicted and well-characterized genes. A sizeable fraction of these transcripts is related to alternatively spliced forms of known protein coding genes. Inspection of the alternatively spliced transcripts identified in the pilot phase of the ENCODE project has clearly shown that often their structure might substantially differ from that of other isoforms of the same gene, and therefore that they might perform unrelated functions, or that they might even not correspond to a functional protein. Identifying these cases is obviously relevant for the functional assignment of gene products and for the interpretation of the effect of variations in the corresponding proteins. Here we describe a publicly available tool that, given a gene or a protein, retrieves and analyses all its annotated isoforms, provides users with three-dimensional models of the isoform(s) of his/her interest whenever possible and automatically assesses whether homology derived structural models correspond to plausible structures. This information is clearly relevant. When the homology model of some isoforms of a gene does not seem structurally plausible, the implications are that either they assume a structure unrelated to that of the other isoforms of the same gene with presumably significant functional differences, or do not correspond to functional products. We provide indications that the second hypothesis is likely to be true for a substantial fraction of the cases. http://maistas.bioinformatica.crs4.it/.
Huang, Jianyan; Zhao, Xiaobo; Weng, Xiaoyu; Wang, Lei; Xie, Weibo
2012-01-01
Background The B-box (BBX)-containing proteins are a class of zinc finger proteins that contain one or two B-box domains and play important roles in plant growth and development. The Arabidopsis BBX gene family has recently been re-identified and renamed. However, there has not been a genome-wide survey of the rice BBX (OsBBX) gene family until now. Methodology/Principal Findings In this study, we identified 30 rice BBX genes through a comprehensive bioinformatics analysis. Each gene was assigned a uniform nomenclature. We described the chromosome localizations, gene structures, protein domains, phylogenetic relationship, whole life-cycle expression profile and diurnal expression patterns of the OsBBX family members. Based on the phylogeny and domain constitution, the OsBBX gene family was classified into five subfamilies. The gene duplication analysis revealed that only chromosomal segmental duplication contributed to the expansion of the OsBBX gene family. The expression profile of the OsBBX genes was analyzed by Affymetrix GeneChip microarrays throughout the entire life-cycle of rice cultivar Zhenshan 97 (ZS97). In addition, microarray analysis was performed to obtain the expression patterns of these genes under light/dark conditions and after three phytohormone treatments. This analysis revealed that the expression patterns of the OsBBX genes could be classified into eight groups. Eight genes were regulated under the light/dark treatments, and eleven genes showed differential expression under at least one phytohormone treatment. Moreover, we verified the diurnal expression of the OsBBX genes using the data obtained from the Diurnal Project and qPCR analysis, and the results indicated that many of these genes had a diurnal expression pattern. Conclusions/Significance The combination of the genome-wide identification and the expression and diurnal analysis of the OsBBX gene family should facilitate additional functional studies of the OsBBX genes. PMID:23118960
Ji, Shuiwang
2013-07-11
The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship.
Nucleosome Positioning and NDR Structure at RNA Polymerase III Promoters
NASA Astrophysics Data System (ADS)
Helbo, Alexandra Søgaard; Lay, Fides D.; Jones, Peter A.; Liang, Gangning; Grønbæk, Kirsten
2017-02-01
Chromatin is structurally involved in the transcriptional regulation of all genes. While the nucleosome positioning at RNA polymerase II (pol II) promoters has been extensively studied, less is known about the chromatin structure at pol III promoters in human cells. We use a high-resolution analysis to show substantial differences in chromatin structure of pol II and pol III promoters, and between subtypes of pol III genes. Notably, the nucleosome depleted region at the transcription start site of pol III genes extends past the termination sequences, resulting in nucleosome free gene bodies. The +1 nucleosome is located further downstream than at pol II genes and furthermore displays weak positioning. The variable position of the +1 location is seen not only within individual cell populations and between cell types, but also between different pol III promoter subtypes, suggesting that the +1 nucleosome may be involved in the transcriptional regulation of pol III genes. We find that expression and DNA methylation patterns correlate with distinct accessibility patterns, where DNA methylation associates with the silencing and inaccessibility at promoters. Taken together, this study provides the first high-resolution map of nucleosome positioning and occupancy at human pol III promoters at specific loci and genome wide.
Potts, Anastasia H; Leng, Yuanyuan; Babitzke, Paul; Romeo, Tony
2018-03-29
The Csr global regulatory system coordinates gene expression in response to metabolic status. This system utilizes the RNA binding protein CsrA to regulate gene expression by binding to transcripts of structural and regulatory genes, thus affecting their structure, stability, translation, and/or transcription elongation. CsrA activity is controlled by sRNAs, CsrB and CsrC, which sequester CsrA away from other transcripts. CsrB/C levels are partly determined by their rates of turnover, which requires CsrD to render them susceptible to RNase E cleavage. Previous epistasis analysis suggested that CsrD affects gene expression through the other Csr components, CsrB/C and CsrA. However, those conclusions were based on a limited analysis of reporters. Here, we reassessed the global behavior of the Csr circuitry using epistasis analysis with RNA seq (Epi-seq). Because CsrD effects on mRNA levels were entirely lost in the csrA mutant and largely eliminated in a csrB/C mutant under our experimental conditions, while the majority of CsrA effects persisted in the absence of csrD, the original model accounts for the global behavior of the Csr system. Our present results also reflect a more nuanced role of CsrA as terminal regulator of the Csr system than has been recognized.
Zhou, Yong; Hu, Lifang; Jiang, Lunwei; Liu, Shiqiang
2018-06-01
YTH domain-containing RNA-binding proteins are involved in post-transcriptional regulation and play important roles in the growth and development as well as abiotic stress responses of plants. However, YTH genes have not been previously studied in cucumber (Cucumis sativus). In this study, a total of five YTH genes (CsYTH1-CsYTH5) were identified in cucumber, which could be mapped on three out of the seven cucumber chromosomes. All CsYTH proteins had highly conserved C-terminal YTH domains, and two of them (CsYTH1 and CsYTH4) harbored extra CCCH and P/Q/N-rich domains. The phylogenesis, conserved motifs and exon-intron structure of YTH genes from cucumber, Arabidopsis and rice were also analyzed. The phylogenetically closely clustered YTHs shared similar gene structures and conserved motifs. An analysis of the cis-acting regulatory elements in the upstream region of these genes resulted in the identification of many cis-elements related to stress, hormone and development. Expression analysis based on the transcriptome data showed that some CsYTHs had development- or tissue-specific expression. In addition, their expression levels were altered under various stresses such as salt, drought, cold, and abscisic acid (ABA) treatments. These findings lay the foundation for the functional analysis of CsYTHs in the future.
Li, Donghua; Liu, Pan; Yu, Jingyin; Wang, Linhai; Dossa, Komivi; Zhang, Yanxin; Zhou, Rong; Wei, Xin; Zhang, Xiurong
2017-09-11
Sesame (Sesamum indicum L.) is one of the world's most important oil crops. However, it is susceptible to abiotic stresses in general, and to waterlogging and drought stresses in particular. The molecular mechanisms of abiotic stress tolerance in sesame have not yet been elucidated. The WRKY domain transcription factors play significant roles in plant growth, development, and responses to stresses. However, little is known about the number, location, structure, molecular phylogenetics, and expression of the WRKY genes in sesame. We performed a comprehensive study of the WRKY gene family in sesame and identified 71 SiWRKYs. In total, 65 of these genes were mapped to 15 linkage groups within the sesame genome. A phylogenetic analysis was performed using a related species (Arabidopsis thaliana) to investigate the evolution of the sesame WRKY genes. Tissue expression profiles of the WRKY genes demonstrated that six SiWRKY genes were highly expressed in all organs, suggesting that these genes may be important for plant growth and organ development in sesame. Analysis of the SiWRKY gene expression patterns revealed that 33 and 26 SiWRKYs respond strongly to waterlogging and drought stresses, respectively. Changes in the expression of 12 SiWRKY genes were observed at different times after the waterlogging and drought treatments had begun, demonstrating that sesame gene expression patterns vary in response to abiotic stresses. In this study, we analyzed the WRKY family of transcription factors encoded by the sesame genome. Insight was gained into the classification, evolution, and function of the SiWRKY genes, revealing their putative roles in a variety of tissues. Responses to abiotic stresses in different sesame cultivars were also investigated. The results of our study provide a better understanding of the structures and functions of sesame WRKY genes and suggest that manipulating these WRKYs could enhance resistance to waterlogging and drought.
Transport genes and chemotaxis in Laribacter hongkongensis: a genome-wide analysis
2011-01-01
Background Laribacter hongkongensis is a Gram-negative, seagull-shaped rod associated with community-acquired gastroenteritis. The bacterium has been found in diverse freshwater environments including fish, frogs and drinking water reservoirs. Using the complete genome sequence data of L. hongkongensis, we performed a comprehensive analysis of putative transport-related genes and genes related to chemotaxis, motility and quorum sensing, which may help the bacterium adapt to changing environments and combat harmful substances. Results A genome-wide analysis using the Transport Classification Database (TCDB), similarity and keyword searches revealed the presence of a large diversity of transporters (n = 457) and genes related to chemotaxis (n = 52) and flagellar biosynthesis (n = 40) in the L. hongkongensis genome. The transporters included those from all seven major transporter categories, which may allow the uptake of essential nutrients or ions, and extrusion of metabolic end products and hazardous substances. L. hongkongensis is unique among closely related members of the Neisseriaceae family in possessing a higher number of proteins related to transport of ammonium, urea and dicarboxylate, which may reflect the importance of nitrogen and dicarboxylate metabolism in this asaccharolytic bacterium. Structural modeling of two C4-dicarboxylate transporters showed that they possessed similar structures to the determined structures of other DctP-TRAP transporters, with one having an unusual disulfide bond. Diverse mechanisms for iron transport, including hemin transporters for iron acquisition from host proteins, were also identified. In addition to the chemotaxis and flagella-related genes, the L. hongkongensis genome also contained two copies of qseB/qseC homologues of the AI-3 quorum sensing system.
Conclusions The large number of diverse transporters and genes involved in chemotaxis, motility and quorum sensing suggested that the bacterium may utilize a complex system to adapt to different environments. Structural modeling will provide useful insights on the transporters in L. hongkongensis. PMID:21849034
Dong, Chen; Hu, Huigang; Xie, Jianghui
2016-12-01
DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.
Raethong, Nachon; Wong-ekkabut, Jirasak; Laoteng, Kobkul; Vongsangnak, Wanwipa
2016-01-01
Aspergillus oryzae is widely used for the industrial production of enzymes. In A. oryzae metabolism, transporters appear to play crucial roles in controlling the flux of molecules for energy generation, nutrients delivery, and waste elimination in the cell. While the A. oryzae genome sequence is available, transporter annotation remains limited and thus the connectivity of metabolic networks is incomplete. In this study, we developed a metabolic annotation strategy to understand the relationship between the sequence, structure, and function for annotation of A. oryzae metabolic transporters. Sequence-based analysis with manual curation showed that 58 genes of 12,096 total genes in the A. oryzae genome encoded metabolic transporters. Under consensus integrative databases, 55 unambiguous metabolic transporter genes were distributed into channels and pores (7 genes), electrochemical potential-driven transporters (33 genes), and primary active transporters (15 genes). To reveal the transporter functional role, a combination of homology modeling and molecular dynamics simulation was implemented to assess the relationship between sequence to structure and structure to function. As in the energy metabolism of A. oryzae, the H+-ATPase encoded by the AO090005000842 gene was selected as a representative case study of multilevel linkage annotation. Our developed strategy can be used for enhancing metabolic network reconstruction. PMID:27274991
Welt, Rachel S; Litt, Amy; Franks, Steven J
2015-03-27
The impact of environmental change on population structure is not well understood. This study aimed to examine the effect of a climate change event on gene flow over space and time in two populations of Brassica rapa that evolved more synchronous flowering times over 5 years of drought in southern California. Using plants grown from seeds collected before and after the drought, we estimated genetic parameters within and between populations and across generations. We expected that with greater temporal opportunity to cross-pollinate, due to reduced phenological isolation, these populations would exhibit an increase in gene flow following the drought. We found low but significant FST, but no change in FST or Nm across the drought, in contrast to predictions. Bayesian analysis of these data indicates minor differentiation between the two populations but no noticeable change in structure before and after the shift in flowering times. However, we found high and significant levels of FIS, indicating that inbreeding likely occurred in these populations despite self-incompatibility in B. rapa. In this system, we did not find an impact of climate change on gene flow or population structuring. The contribution of gene flow to adaptive evolution may vary by system, however, and is thus an important parameter to consider in further studies of natural responses to environmental change. Published by Oxford University Press on behalf of the Annals of Botany Company.
Chai, Wenbo; Si, Weina; Ji, Wei; Qin, Qianqian; Zhao, Manli; Jiang, Haiyang
2018-01-01
HD-Zip proteins represent the major transcription factors in higher plants, playing essential roles in plant development and stress responses. Foxtail millet is a crop used to investigate the systems biology of millet and biofuel grasses, yet the HD-Zip gene family has not been studied in foxtail millet. To investigate the expression profile of the HD-Zip gene family in foxtail millet, a comprehensive genome-wide expression analysis was conducted in this study. We found 47 protein-encoding genes in foxtail millet using BLAST search tools; the putative proteins were classified into four subfamilies, namely, subfamilies I, II, III, and IV. Gene structure and motif analysis indicate that the genes within a subfamily were conserved. Promoter analysis showed that HD-Zip genes were involved in abiotic stress. Duplication analysis revealed that 8 (~17%) hdz genes were tandemly duplicated and 28 (58%) were segmentally duplicated; purifying duplication plays important roles in gene expansion. Microsynteny analysis revealed the maximum relationship in foxtail millet-sorghum and foxtail millet-rice. Expression profiling upon the abiotic stresses of drought and high salinity and the biotic stress of ABA revealed that some genes regulated responses to drought and salinity stresses via an ABA-dependent process, especially sihdz29 and sihdz45. Our study provides new insight into evolutionary and functional analyses of HD-Zip genes involved in environmental stress responses in foxtail millet.
Characterization of defensin gene from abalone Haliotis discus hannai and its deduced protein
NASA Astrophysics Data System (ADS)
Hong, Xuguang; Sun, Xiuqin; Zheng, Minggang; Qu, Lingyun; Zan, Jindong; Zhang, Jinxing
2008-11-01
Defensin is one of the ancient, conserved host-defense molecules formed during biological evolution. As a regulator and effector molecule, it is very important in animals’ acquired immune system. This paper reports the defensin gene from the mixed liver and kidney cDNA library of abalone Haliotis discus hannai Ino. Sequence analysis shows that the full-length cDNA encodes a mature peptide of 42 residues (including six Cys), with a molecular weight of 4 323 Da and a pI of 8.02. Amino acid sequence homology analysis shows that the peptide is highly similar (70% in common) to other insect defensins. Because the mature peptide has a typical insect-defensin structural character in its secondary structure, the polypeptide, named Haliotis discus defensin (hd-def), is a novel antimicrobial peptide belonging to the insect defensin subfamily. The RT-PCR result for Haliotis discus defensin shows that the gene is expressed only in the hepatopancreas upon stimulation by Gram-negative and Gram-positive bacteria, which is ascribed to inducible expression. Therefore, Haliotis discus defensin gene expression appears to be related to defense against bacterial infection in Haliotis discus hannai Ino.
Structural genes for thiamine biosynthetic enzymes (thiCEFGH) in Escherichia coli K-12.
Vander Horn, P B; Backstrom, A D; Stewart, V; Begley, T P
1993-01-01
Escherichia coli K-12 synthesizes thiamine pyrophosphate (vitamin B1) de novo. Two precursors [4-methyl-5-(beta-hydroxyethyl)thiazole monophosphate and 4-amino-5-hydroxymethyl-2-methylpyrimidine pyrophosphate] are coupled to form thiamine monophosphate, which is then phosphorylated to make thiamine pyrophosphate. Previous studies have identified two classes of thi mutations, clustered at 90 min on the genetic map, which result in requirements for the thiazole or the hydroxymethylpyrimidine. We report here our initial molecular genetic analysis of the thi cluster. We cloned the thi cluster genes and examined their organization, structure, and function by a combination of phenotypic testing, complementation analysis, polypeptide expression, and DNA sequencing. We found five tightly linked genes, designated thiCEFGH. The thiC gene product is required for the synthesis of the hydroxymethylpyrimidine. The thiE, thiF, thiG, and thiH gene products are required for synthesis of the thiazole. These mutants did not respond to 1-deoxy-D-threo-2-pentulose, indicating that they are blocked in the conversion of this precursor compound to the thiazole itself. PMID:8432721
Zhou, Yong; Hu, Lifang; Wu, Hao; Jiang, Lunwei
2017-01-01
Superoxide dismutase (SOD) proteins are widely present in the plant kingdom and play important roles in different biological processes. However, little is known about the SOD genes in cucumber. In this study, nine SOD genes were identified from cucumber (Cucumis sativus) using bioinformatics-based methods, including 5 Cu/ZnSODs, 3 FeSODs, and 1 MnSOD. Gene structure and motif analysis indicated that most of the SOD genes have relatively conserved exon/intron arrangement and motif composition. Phylogenetic analyses with SODs from cucumber and several other species revealed that these SOD proteins can be traced back to two ancestral SODs before the divergence of monocot and dicot plants. Many cis-elements related to stress responses and plant hormones were found in the promoter sequence of each CsSOD gene. Gene expression analysis revealed that most of the CsSOD genes are expressed in almost all the tested tissues. qRT-PCR analysis of 8 selected CsSOD genes showed that these genes could respond to heat, cold, osmotic, and salt stresses. Our results provide a basis for further functional research on the SOD gene family in cucumber and facilitate their potential applications in the genetic improvement of cucumber. PMID:28808654
Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada
2015-01-01
Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). The BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted using Protein 3D structural analysis showed high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of the HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
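The ORF length and residue count reported in this abstract can be checked against each other with simple codon arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported ORF length includes the terminating stop codon, which the abstract does not state explicitly.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumption: when includes_stop is True, the final codon is a stop
    codon and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1932 bp ORF reported for camel HSPA6: 644 codons, one of them the stop.
print(residues_from_orf(1932))  # 643
```

Under that assumption the 1932 bp ORF gives 643 residues, which matches the figure in the abstract.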
Conservation of the structure and organization of lupin mitochondrial nad3 and rps12 genes.
Rurek, M; Oczkowski, M; Augustyniak, H
1998-01-01
A high level of nucleotide sequence conservation of the mitochondrial nad3 and rps12 genes was found in four lupin species. The only differences concern three nucleotides in the Lupinus albus rps12 gene and a three-nucleotide insertion in the L. mutabilis spacer. Northern blot analysis as well as RT-PCR confirmed cotranscription of the L. luteus genes because the transcripts detected were long enough.
NASA Technical Reports Server (NTRS)
Kopczynski, E. D.; Bateson, M. M.; Ward, D. M.
1994-01-01
When PCR was used to recover small-subunit (SSU) rRNA genes from a hot spring cyanobacterial mat community, chimeric SSU rRNA sequences which exhibited little or no secondary structural abnormality were recovered. They were revealed as chimeras of SSU rRNA genes of uncultivated species through separate phylogenetic analysis of short sequence domains.
Genome structure of Rosa multiflora, a wild ancestor of cultivated roses
Nakamura, Noriko; Hirakawa, Hideki; Sato, Shusei; Otagaki, Shungo; Matsumoto, Shogo; Tabata, Satoshi; Tanaka, Yoshikazu
2018-01-01
The draft genome sequence of a wild rose (Rosa multiflora Thunb.) was determined using Illumina MiSeq and HiSeq platforms. The total length of the scaffolds was 739,637,845 bp, consisting of 83,189 scaffolds, which was close to the 711 Mbp length estimated by k-mer analysis. N50 length of the scaffolds was 90,830 bp, and extent of the longest was 1,133,259 bp. The average GC content of the scaffolds was 38.9%. After gene prediction, 67,380 candidates exhibiting sequence homology to known genes and domains were extracted, which included complete and partial gene structures. This large number of genes for a diploid plant may reflect heterogeneity of the genome originating from self-incompatibility in R. multiflora. According to CEGMA analysis, 91.9% and 98.0% of the core eukaryotic genes were completely and partially conserved in the scaffolds, respectively. Genes presumably involved in flower color, scent and flowering are assigned. The results of this study will serve as a valuable resource for fundamental and applied research in the rose, including breeding and phylogenetic study of cultivated roses. PMID:29045613
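The N50 statistic quoted above is the scaffold length at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows. The function name and the toy lengths are hypothetical and are not taken from the R. multiflora assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly: total 300 bp, half is 150 bp.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80
```

In the toy case the two longest scaffolds (100 + 80 bp) are the first to cover half of the 300 bp total, so the N50 is 80 bp.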
Izquierdo, Esther; Cai, Yimin; Marchioni, Eric; Ennahar, Saïd
2009-05-01
Enterococcus faecium IT62, a strain isolated from ryegrass in Japan, produces three bacteriocins (enterocins L50A, L50B, and IT) that have been previously purified and the primary structures of which have been determined by amino acid sequencing (E. Izquierdo, A. Bednarczyk, C. Schaeffer, Y. Cai, E. Marchioni, A. Van Dorsselaer, and S. Ennahar, Antimicrob. Agents Chemother., 52:1917-1923, 2008). Genetic analysis showed that the bacteriocins of E. faecium IT62 are plasmid encoded, but with the structural genes specifying enterocin L50A and enterocin L50B being carried by a plasmid (pTAB1) that is separate from the one (pTIT1) carrying the structural gene of enterocin IT. Sequencing analysis of a 1,475-bp region from pTAB1 identified two consecutive open reading frames corresponding, with the exception of 2 bp, to the genes entL50A and entL50B, encoding EntL50A and EntL50B, respectively. Both bacteriocins are synthesized without N-terminal leader sequences. Genetic analysis of a sequenced 1,380-bp pTIT1 fragment showed that the genes entIT and entIM, encoding enterocin IT and its immunity protein, respectively, were both found in E. faecium VRE200 for bacteriocin 32. Enterocin IT, a 6,390-Da peptide made up of 54 amino acids, has been previously shown to be identical to the C-terminal part of bacteriocin 32, a 7,998-Da bacteriocin produced by E. faecium VRE200 whose structure was deduced from its structural gene (T. Inoue, H. Tomita, and Y. Ike, Antimicrob. Agents Chemother., 50:1202-1212, 2006). By combining the biochemical and genetic data on enterocin IT, it was concluded that bacteriocin 32 is in fact identical to enterocin IT, both being encoded by the same plasmid-borne gene, and that the N-terminal leader peptide for this bacteriocin is 35 amino acids long and not 19 amino acids long as previously reported.
Wei, Tao; Sun, Yuena; Shi, Ge; Wang, Rixin; Xu, Tianjun
2012-09-01
Heat shock proteins (HSPs) play crucial roles in the immune response of vertebrates. In order to study the immune defense mechanism of the heat shock protein gene in miiuy croaker (Miichthys miiuy), a cDNA encoding heat shock protein 70 (designated Mimi-HSP70) was cloned from miiuy croaker. The cDNA was 2195 bp in length, consisting of an open reading frame (ORF) of 1917 bp encoding a polypeptide of 638 amino acids with estimated molecular mass of 70.3 kDa and theoretical isoelectric point of 5.55. Genomic DNA structure analysis revealed that the Mimi-HSP70 gene contains no introns in the coding region, and four SNPs (373 C/T, 789 G/A, 1005 C/T, and 1185 G/A) were detected by direct sequencing of 20 samples from six different populations. BLAST analysis, structure comparison and phylogenetic analysis indicated that Mimi-HSP70 should be an inducible cytosolic member of the HSP70 family. The deduced amino acid sequence of Mimi-HSP70 had 82.4%-92.2% identity with those of other vertebrates. Real-time quantitative RT-PCR demonstrated that the HSP70 gene was ubiquitously expressed in ten normal tissues. Under different temperature shock stress, the expression of the Mimi-HSP70 gene in miiuy croaker increased at first and then decreased with the rise of temperature, finally, reached a maximum level in liver, spleen and kidney tissues. Infection of miiuy croaker with Vibrio anguillarum resulted in significant changes in expression of the Mimi-HSP70 gene in the immune-related tissues. These results indicated that expression analysis of the Mimi-HSP70 gene provides a theoretical basis for further study of the mechanism of anti-adverseness in the miiuy croaker. Copyright © 2012 Elsevier Ltd. All rights reserved.
Chen, Xue; Chen, Zhu; Zhao, Hualin; Zhao, Yang; Cheng, Beijiu; Xiang, Yan
2014-01-01
Homeodomain-leucine zipper (HD-Zip) proteins, a group of homeobox transcription factors, participate in various aspects of normal plant growth and developmental processes as well as environmental responses. To date, no overall analysis or expression profiling of the HD-Zip gene family in soybean (Glycine max) has been reported. An investigation of the soybean genome revealed 88 putative HD-Zip genes. These genes were classified into four subfamilies, I to IV, based on phylogenetic analysis. In each subfamily, the constituent parts of gene structure and motif were relatively conserved. A total of 87 out of 88 genes were distributed unequally on 20 chromosomes with 36 segmental duplication events, indicating that segmental duplication is important for the expansion of the HD-Zip family. Analysis of the Ka/Ks ratios showed that the duplicated genes of the HD-Zip family basically underwent purifying selection with restrictive functional divergence after the duplication events. Analysis of expression profiles showed that 80 genes are differentially expressed across 14 tissues, and 59 HD-Zip genes are differentially expressed under salinity and drought stress, with 20 paralogous pairs showing nearly identical expression patterns and three paralogous pairs diversifying significantly under drought stress. Quantitative real-time RT-PCR (qRT-PCR) analysis of six paralogous pairs of 12 selected soybean HD-Zip genes under both drought and salinity stress confirmed their stress-inducible expression patterns. This study presents a thorough overview of the soybean HD-Zip gene family and provides a new perspective on the evolution of this gene family. The results indicate that HD-Zip family genes may be involved in many plant responses to stress conditions. Additionally, this study provides a solid foundation for uncovering the biological roles of HD-Zip genes in soybean growth and development.
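The Ka/Ks reasoning in this abstract follows the conventional reading of the ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates: below 1 suggests purifying selection, about 1 suggests neutral evolution, and above 1 suggests positive selection. The sketch below only illustrates that convention. The function name and the example values are hypothetical and do not come from the soybean data.

```python
def classify_selection(ka, ks):
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    Uses the conventional thresholds only; real analyses usually add a
    statistical test before drawing conclusions.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1:
        return ratio, "purifying selection"
    if ratio > 1:
        return ratio, "positive selection"
    return ratio, "neutral evolution"


# Hypothetical gene pair with few nonsynonymous changes.
ratio, label = classify_selection(ka=0.05, ks=0.25)
print(round(ratio, 2), label)  # 0.2 purifying selection
```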
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence
Nepal, Madhav P; Benson, Benjamin V
2015-01-01
Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease-resistant genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for soybean crop improvement in the future. PMID:25922568
John, Anulekha Mary; C, George Priya Doss; Ebenazer, Andrew; Seshadri, Mandalam Subramaniam; Nair, Aravindan; Rajaratnam, Simon; Pai, Rekha
2013-01-01
Various missense mutations in the VHL gene have been reported among patients with familial bilateral pheochromocytoma. However, the p.Arg82Leu mutation in the VHL gene described here among patients with familial bilateral pheochromocytoma has never been reported previously in a germline configuration. Interestingly, long-term follow-up of these patients indicated that the mutation might have had little impact on the normal function of the VHL gene, since all of them have remained asymptomatic. We further attempted to correlate this information with the results obtained by in silico analysis of this mutation using SIFT, PhD-SNP SVM profile, MutPred, PolyPhen2, and SNPs&GO prediction tools. To gain new mechanistic insight into the structural effect, we mapped the mutation onto the 3D structure (PDB ID 1LM8). Further, we analyzed structural changes over time in the native and mutant protein complexes using a 12 ns molecular dynamics simulation. Though these methods predict the mutation to have a pathogenic potential, it remains to be seen if these patients will eventually develop symptomatic disease. PMID:23626751
Soil-borne microbial functional structure across different land uses.
Kuramae, Eiko E; Zhou, Jizhong Z; Kowalchuk, George A; van Veen, Johannes A
2014-01-01
Land use change alters the structure and composition of microbial communities. However, the links between environmental factors and microbial functions are not well understood. Here we interrogated the functional structure of soil microbial communities across different land uses. In a multivariate regression tree analysis of soil physicochemical properties and genes detected by functional microarrays, the main factor that explained the different microbial community functional structures was C : N ratio. C : N ratio showed a significant positive correlation with clay and soil pH. Fields with low C : N ratio had an overrepresentation of genes for carbon degradation, carbon fixation, metal reductase, and organic remediation categories, while fields with high C : N ratio had an overrepresentation of genes encoding dissimilatory sulfate reductase, methane oxidation, nitrification, and nitrogen fixation. The most abundant genes related to carbon degradation comprised bacterial and fungal cellulases; bacterial and fungal chitinases; fungal laccases; and bacterial, fungal, and oomycete polygalacturonases. The high number of genes related to organic remediation was probably driven by high phosphate content, while the high number of genes for nitrification was probably explained by high total nitrogen content. The functional gene diversity found in different soils did not group the sites according to land management. Rather, the soil factors, C : N ratio, phosphate, and total N, were the main factors driving the differences in functional genes across the fields examined.
Jia, Mingrui; Shi, Ranran; Zhao, Xuli; Fu, Zhijian; Bai, Zhijing; Sun, Tao; Zhao, Xuejun; Wang, Wenbo; Xu, Chao; Yan, Fang
2017-01-01
Mutation analysis, as the gold standard, is particularly important in the diagnosis of osteogenesis imperfecta (OI), and it may be preventable upon early diagnosis. In this study, we aimed to analyze the clinical and genetic materials of an OI pedigree as well as to confirm the deleterious property of the mutation. A pedigree with OI was identified. All family members received careful clinical examinations and blood was drawn for genetic analyses. Genes implicated in OI were screened for mutation. The function and structure of the mutant protein were predicted using bioinformatics analysis. The proband showed abnormal sonographic images as a 9-month fetus. A disproportionately short, triangular face with blue sclera was noticed at birth. She could barely walk and had suffered multiple fractures by 2 years of age. Her mother presented with small stature, frequent fractures, blue sclera, and deformity of the extremities. A heterozygous missense mutation c.1009G>T (p.G337C) in the COL1A2 gene was identified in both the proband and her mother. Bioinformatics analysis showed that p.G337 was well conserved among multiple species and that the mutation probably changed the structure and damaged the function of collagen. We suggest that the mutation p.G337C in the COL1A2 gene is pathogenic for OI by affecting the protein structure and the function of collagen. PMID:28953610
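The coding-level and protein-level names of the variant (c.1009G>T and p.G337C) can be related through codon arithmetic and the standard genetic code. The sketch below is illustrative only. The helper names are hypothetical, the actual codon in COL1A2 is not given in the abstract, and the code assumes that c. numbering starts at the first base of the start codon.

```python
def codon_of(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


# Standard genetic code, restricted to what this example needs.
GLYCINE_CODONS = ("GGT", "GGC", "GGA", "GGG")
TGN_CODONS = {"TGT": "Cys", "TGC": "Cys", "TGA": "Stop", "TGG": "Trp"}


def glycine_codons_giving_cysteine(position_in_codon):
    """Glycine codons that become a cysteine codon after G>T at the given position."""
    hits = []
    for codon in GLYCINE_CODONS:
        bases = list(codon)
        if bases[position_in_codon - 1] != "G":
            continue
        bases[position_in_codon - 1] = "T"
        # Codons outside the TGN table (e.g. GTN, valine) are simply not hits.
        if TGN_CODONS.get("".join(bases)) == "Cys":
            hits.append(codon)
    return hits


codon_number, position = codon_of(1009)
print(codon_number, position)                    # 337 1
print(glycine_codons_giving_cysteine(position))  # ['GGT', 'GGC']
```

Position 1009 is the first base of codon 337, and a first-position G>T change turns a glycine codon into a cysteine codon only for GGT or GGC, which is consistent with the reported p.G337C.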
Li, Jun; Hou, Hongmin; Li, Xiaoqin; Xiang, Jiang; Yin, Xiangjing; Gao, Hua; Zheng, Yi; Bassett, Carole L; Wang, Xiping
2013-09-01
SQUAMOSA promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors and play many crucial roles in plant development. In this study, 27 SBP-box gene family members were identified in the apple (Malus × domestica Borkh.) genome, 15 of which were suggested to be putative targets of MdmiR156. Plant SBPs were classified into eight groups according to the phylogenetic analysis of SBP-domain proteins. Gene structure, gene chromosomal location and synteny analyses of MdSBP genes within the apple genome demonstrated that tandem and segmental duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of the SBP-box gene family in apple. Additionally, synteny analysis between apple and Arabidopsis indicated that several paired homologs of MdSBP and AtSPL genes were located in syntenic genomic regions. Tissue-specific expression analysis of MdSBP genes in apple demonstrated their diversified spatiotemporal expression patterns. Most MdmiR156-targeted MdSBP genes, which had relatively high transcript levels in stems, leaves, apical buds and some floral organs, exhibited a more differential expression pattern than most MdmiR156-nontargeted MdSBP genes. Finally, expression analysis of MdSBP genes in leaves upon various plant hormone treatments showed that many MdSBP genes were responsive to different plant hormones, indicating that MdSBP genes may be involved in responses to hormone signaling during stress or in apple development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Verdugo, Ricardo A.; Zeller, Tanja; Rotival, Maxime; Wild, Philipp S.; Münzel, Thomas; Lackner, Karl J.; Weidmann, Henri; Ninio, Ewa; Trégouët, David-Alexandre; Cambien, François; Blankenberg, Stefan; Tiret, Laurence
2013-01-01
Smoking is a risk factor for atherosclerosis with reported widespread effects on gene expression in circulating blood cells. We hypothesized that a molecular signature mediating the relation between smoking and atherosclerosis may be found in the transcriptome of circulating monocytes. Genome-wide expression profiles and counts of atherosclerotic plaques in carotid arteries were collected in 248 smokers and 688 non-smokers from the general population. Patterns of co-expressed genes were identified by Independent Component Analysis (ICA) and network structure of the pattern-specific gene modules was inferred by the PC-algorithm. A likelihood-based causality test was implemented to select patterns that fit models containing a path “smoking→gene expression→plaques”. Robustness of the causal inference was assessed by bootstrapping. At a FDR ≤0.10, 3,368 genes were associated to smoking or plaques, of which 93% were associated to smoking only. SASH1 showed the strongest association to smoking and PPARG the strongest association to plaques. Twenty-nine gene patterns were identified by ICA. Modules containing SASH1 and PPARG did not show evidence for the “smoking→gene expression→plaques” causality model. Conversely, three modules had good support for causal effects and exhibited a network topology consistent with gene expression mediating the relation between smoking and plaques. The network with the strongest support for causal effects was connected to plaques through SLC39A8, a gene with known association to HDL-cholesterol and cellular uptake of cadmium from tobacco, while smoking was directly connected to GAS6, a gene reported to have anti-inflammatory effects in atherosclerosis and to be up-regulated in the placenta of women smoking during pregnancy. Our analysis of the transcriptome of monocytes recovered genes relevant for association to smoking and atherosclerosis, and connected genes that before, were only studied in separate contexts. 
Inspection of correlation structure revealed candidates that would be missed by expression-phenotype association analysis alone. PMID:23372645
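The abstract reports gene selection at an FDR threshold of 0.10 without naming the procedure. One widely used way to control the false discovery rate is the Benjamini-Hochberg step-up adjustment, sketched below as an assumption rather than as the authors' method. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values, not taken from the study.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
selected = [i for i, q in enumerate(qvals) if q <= 0.10]
print(selected)  # [0, 1, 2]
```

With these toy values the first three genes pass the 0.10 threshold and the fourth does not.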
Wang, Jingxue; Singh, Sanjay K; Du, Chunfang; Li, Chen; Fan, Jianchun; Pattanaik, Sitakanta; Yuan, Ling
2016-01-01
Rapeseed (Brassica napus) is an important oil seed crop, providing more than 13% of the world's supply of edible oils. An in-depth knowledge of the gene network involved in biosynthesis and accumulation of seed oil is critical for the improvement of B. napus. Using available genomic and transcriptomic resources, we identified 1,750 acyl-lipid metabolism (ALM) genes that are distributed over 19 chromosomes in the B. napus genome. B. rapa and B. oleracea, two diploid progenitors of B. napus, contributed almost equally to the ALM genes. Genome collinearity analysis demonstrated that the majority of the ALM genes have arisen due to genome duplication or segmental duplication events. In addition, we profiled the expression patterns of the ALM genes in four different developmental stages. Furthermore, we developed two B. napus near isogenic lines (NILs). The high oil NIL, YC13-559, accumulates significantly higher (∼10%) seed oil compared to the other, YC13-554. Comparative gene expression analysis revealed upregulation of lipid biosynthesis-related regulatory genes in YC13-559, including SHOOTMERISTEMLESS, LEAFY COTYLEDON 1 (LEC1), LEC2, FUSCA3, ABSCISIC ACID INSENSITIVE 3 (ABI3), ABI4, ABI5, and WRINKLED1, as well as structural genes, such as ACETYL-CoA CARBOXYLASE, ACYL-CoA DIACYLGLYCEROL ACYLTRANSFERASE, and LONG-CHAIN ACYL-CoA SYNTHETASES. We observed that several genes related to the phytohormones, gibberellins, jasmonate, and indole acetic acid, were differentially expressed in the NILs. Our findings provide a broad account of the numbers, distribution, and expression profiles of acyl-lipid metabolism genes, as well as gene networks that potentially control oil accumulation in B. napus seeds. The upregulation of key regulatory and structural genes related to lipid biosynthesis likely plays a major role for the increased seed oil in YC13-559.
Eklöf, Jens M.; Shojania, Shaheen; Okon, Mark; McIntosh, Lawrence P.; Brumer, Harry
2013-01-01
The large xyloglucan endotransglycosylase/hydrolase (XTH) gene family continues to be the focus of much attention in studies of plant cell wall morphogenesis due to the unique catalytic functions of the enzymes it encodes. The XTH gene products compose a subfamily of glycoside hydrolase family 16 (GH16), which also comprises a broad range of microbial endoglucanases and endogalactanases, as well as yeast cell wall chitin/β-glucan transglycosylases. Previous whole-family phylogenetic analyses have suggested that the closest relatives to the XTH gene products are the bacterial licheninases (EC 3.2.1.73), which specifically hydrolyze linear mixed linkage β(1→3)/β(1→4)-glucans. In addition to their specificity for the highly branched xyloglucan polysaccharide, XTH gene products are distinguished from the licheninases and other GH16 enzyme subfamilies by significant active site loop alterations and a large C-terminal extension. Given these differences, the molecular evolution of the XTH gene products in GH16 has remained enigmatic. Here, we present the biochemical and structural analysis of a unique, mixed function endoglucanase from black cottonwood (Populus trichocarpa), which reveals a small, newly recognized subfamily of GH16 members intermediate between the bacterial licheninases and plant XTH gene products. We postulate that this clade comprises an important link in the evolution of the large plant XTH gene families from a putative microbial ancestor. As such, this analysis provides new insights into the diversification of GH16 and further unites the apparently disparate members of this important family of proteins. PMID:23572521
Bajpai, Prabodh K; Warghat, Ashish R; Sharma, Ram Kumar; Yadav, Ashish; Thakur, Anil K; Srivastava, Ravi B; Stobdan, Tsering
2014-04-01
Sequence-related amplified polymorphism markers were used to assess the genetic structure in three natural populations of Morus alba from trans-Himalaya. Multilocation sampling was conducted across 14 collection sites. The overall genetic diversity estimates were high: percentage polymorphic loci 89.66%, Nei's gene diversity 0.2286, and Shannon's information index 0.2175. At a regional level, partitioning of variability assessed using analysis of molecular variance (AMOVA), revealed 80% variation within and 20% among collection sites. Pattern appeared in STRUCTURE, BARRIER, and AMOVA, clearly demonstrating gene flow between the Indus and Suru populations and a geographic barrier between the Indus-Suru and Nubra populations, which effectively hinders gene flow. The results showed significant genetic differentiation, population structure, high to restricted gene flow, and high genetic diversity. The assumption that samples collected from the three valleys represent three different populations does not hold true. The fragmentation present in trans-Himalaya was more natural and less anthropogenic.
Xu, Ruirui; Liu, Caiyun; Li, Ning; Zhang, Shizhong
2016-12-01
Argonaute (AGO) proteins, which are found in yeast, animals, and plants, are the core molecules of the RNA-induced silencing complex. These proteins play important roles in plant growth, development, and responses to biotic stresses. The complete analysis and classification of the AGO gene family have been recently reported in different plants. Nevertheless, systematic analysis and expression profiling of these genes have not been performed in apple (Malus domestica). Approximately 15 AGO genes were identified in the apple genome. The phylogenetic tree, chromosome location, conserved protein motifs, gene structure, and expression of the AGO gene family in apple were analyzed for gene prediction. All AGO genes were phylogenetically clustered into four groups (i.e., AGO1, AGO4, MEL1/AGO5, and ZIPPY/AGO7) with the AGO genes of Arabidopsis. These groups of the AGO gene family were statistically analyzed and compared among 31 plant species. The predicted apple AGO genes are distributed across nine chromosomes at different densities and include three segment duplications. Expression studies indicated that 15 AGO genes exhibit different expression patterns in at least one of the tissues tested. Additionally, analysis of gene expression levels indicated that the genes are mostly involved in responses to NaCl, PEG, heat, and low-temperature stresses. Hence, several candidate AGO genes are involved in different aspects of physiological and developmental processes and may play an important role in abiotic stress responses in apple. To the best of our knowledge, this study is the first to report a comprehensive analysis of the apple AGO gene family. Our results provide useful information to understand the classification and putative functions of these proteins, especially for gene members that may play important roles in abiotic stress responses in M. hupehensis.
Vouille, V; Amiche, M; Nicolas, P
1997-09-01
We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, wherein exon 1 encoded a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contained the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibian.
Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong
2018-01-01
Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations.
The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
Analysis of the vp2 gene sequence of a new mutated mink enteritis parvovirus strain in PR China
2010-01-01
Background: Mink enteritis virus (MEV) causes a highly contagious viral disease of mink with a worldwide distribution. MEV has a linear, single-stranded, negative-sense DNA with a genome length of approximately 5,000 bp. The VP2 protein is the major structural protein of the parvovirus encoded by the vp2 gene. VP2 is highly antigenic and plays important roles in determining viral host ranges and tissue tropisms. This study describes the bionomics and vp2 gene analysis of a mutated strain, MEV-DL, which was isolated recently in China and outlines its homologous relationships with other selected strains registered in Genbank. Results: The MEV-DL strain can infect F81 cells with cytopathic effects. Pig erythrocytes were agglutinated by the MEV-DL strain. The generation of MEV-DL in F81 cells could infect mink within three months and cause a disease that was similar to that caused by wild-type MEV. A comparative analysis of the vp2 gene nucleotide (nt) sequence of MEV-DL showed that this was more than 99% homologous with other mink enteritis parvoviruses in Genbank. However, the nucleotide residues at positions 1,065 and 1,238 of the vp2 gene in the MEV-DL strain differed from those of all the other MEV strains described previously. It is noteworthy that the mutation at nucleotide position 1,238 led to an Asp/Gly replacement. This may lead to structural changes. A phylogenetic tree and sequence distance table were obtained, which showed that the MEV-DL and ZYL-1 strains had the closest inheritance distance. Conclusions: A new variation of the vp2 gene exists in the MEV-DL strain, which may lead to structural changes of the VP2 protein. Phylogenetic analysis showed that MEV-DL may originate from the ZYL-1 strain in DaLian. PMID:20540765
Phage phenomics: Physiological approaches to characterize novel viral proteins
Sanchez, Savannah E. [San Diego State Univ., San Diego, CA (United States); Cuevas, Daniel A. [San Diego State Univ., San Diego, CA (United States); Rostron, Jason E. [San Diego State Univ., San Diego, CA (United States); Liang, Tiffany Y. [San Diego State Univ., San Diego, CA (United States); Pivaroff, Cullen G. [San Diego State Univ., San Diego, CA (United States); Haynes, Matthew R. [San Diego State Univ., San Diego, CA (United States); Nulton, Jim [San Diego State Univ., San Diego, CA (United States); Felts, Ben [San Diego State Univ., San Diego, CA (United States); Bailey, Barbara A. [San Diego State Univ., San Diego, CA (United States); Salamon, Peter [San Diego State Univ., San Diego, CA (United States); Edwards, Robert A. [San Diego State Univ., San Diego, CA (United States); Argonne National Lab. (ANL), Argonne, IL (United States); Burgin, Alex B. [Broad Institute, Cambridge, MA (United States); Segall, Anca M. [San Diego State Univ., San Diego, CA (United States); Rohwer, Forest [San Diego State Univ., San Diego, CA (United States)
2018-06-21
Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60-95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides by-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Thus, representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments
Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic
2001-01-01
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Transcriptome analysis by strand-specific sequencing of complementary DNA
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-01-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang
2015-11-23
With the rapidly increasing number of available genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.
Houtman, Corine J; Sterk, Saskia S; van de Heijning, Monique P M; Brouwer, Abraham; Stephany, Rainer W; van der Burg, Bart; Sonneveld, Edwin
2009-04-01
Anabolic androgenic steroids (AAS) are a class of steroid hormones related to the male hormone testosterone. They are frequently detected as drugs in sport doping control. Being similar to or derived from natural male hormones, AAS share the activation of the androgen receptor (AR) as common mechanism of action. The mammalian androgen responsive reporter gene assay (AR CALUX bioassay), measuring compounds interacting with the AR can be used for the analysis of AAS without the necessity of knowing their chemical structure beforehand, whereas current chemical-analytical approaches may have difficulty in detecting compounds with unknown structures, such as designer steroids. This study demonstrated that AAS prohibited in sports and potential designer AAS can be detected with this AR reporter gene assay, but that also additional steroid activities of AAS could be found using additional mammalian bioassays for other types of steroid hormones. Mixtures of AAS were found to behave additively in the AR reporter gene assay showing that it is possible to use this method for complex mixtures as are found in doping control samples, including mixtures that are a result of multi drug use. To test if mammalian reporter gene assays could be used for the detection of AAS in urine samples, background steroidal activities were measured. AAS-spiked urine samples, mimicking doping positive samples, showed significantly higher androgenic activities than unspiked samples. GC-MS analysis of endogenous androgens and AR reporter gene assay analysis of urine samples showed how a combined chemical-analytical and bioassay approach can be used to identify samples containing AAS. The results indicate that the AR reporter gene assay, in addition to chemical-analytical methods, can be a valuable tool for the analysis of AAS for doping control purposes.
Porcine Tissue-Specific Regulatory Networks Derived from Meta-Analysis of the Transcriptome
Pérez-Montarelo, Dafne; Hudson, Nicholas J.; Fernández, Ana I.; Ramayo-Caldas, Yuliaxis; Dalrymple, Brian P.; Reverter, Antonio
2012-01-01
The processes that drive tissue identity and differentiation remain unclear for most tissue types. So are the gene networks and transcription factors (TF) responsible for the differential structure and function of each particular tissue, and this is particularly true for non-model species with incomplete genomic resources. To better understand the regulation of genes responsible for tissue identity in pigs, we have inferred regulatory networks from a meta-analysis of 20 gene expression studies spanning 480 Porcine Affymetrix chips for 134 experimental conditions on 27 distinct tissues. We developed a mixed-model normalization approach with a covariance structure that accommodated the disparity in the origin of the individual studies, and obtained the normalized expression of 12,320 genes across the 27 tissues. Using this resource, we constructed a network, based on the co-expression patterns of 1,072 TF and 1,232 tissue specific genes. The resulting network is consistent with the known biology of tissue development. Within the network, genes clustered by tissue and tissues clustered by site of embryonic origin. These clusters were significantly enriched for genes annotated in key relevant biological processes and confirm gene functions and interactions from the literature. We implemented a Regulatory Impact Factor (RIF) metric to identify the key regulators in skeletal muscle and tissues from the central nervous system. The normalization of the meta-analysis, the inference of the gene co-expression network and the RIF metric, operated synergistically towards a successful search for tissue-specific regulators. Novel among these findings is evidence suggesting a key role of ERCC3 as a muscle regulator. Together, our results recapitulate the known biology behind tissue specificity and provide new valuable insights in a less studied but valuable model species. PMID:23049964
Li, Xiong; Wu, Yuansheng; Li, Boqun; He, Wenqi; Yang, Yonghong; Yang, Yongping
2018-01-01
The cation diffusion facilitator (CDF) family is one of the gene families involved in metal ion uptake and transport in plants, but the understanding of the definite roles and mechanisms of most CDF genes remains limited. In the present study, we identified 18 candidate CDF genes from the turnip genome and named them BrrMTP1.1-BrrMTP12. Then, we performed a comparative genomic analysis on the phylogenetic relationships, gene structures and chromosome distributions, conserved domains, and motifs of turnip CDFs. The constructed phylogenetic tree indicated that the BrrMTPs were divided into seven groups (groups 1, 5, 6, 7, 8, 9, and 12) and formed three major clusters (Zn-CDFs, Fe/Zn-CDFs, and Mn-CDFs). Moreover, the structural characteristics of the BrrMTP members in the same group were similar but varied among groups. To investigate the potential roles of BrrMTPs in turnip, we conducted an expression analysis on all BrrMTP genes under Mg, Zn, Cu, Mn, Fe, Co, Na, and Cd stresses. Results showed that the expression levels of all BrrMTP members were induced by at least one metal ion, indicating that these genes may be related to the tolerance or transport of those metal ions. Based on the roles of different metal ions for plants, we hypothesized that BrrMTP genes are possibly involved in heavy metal accumulation and tolerance to salt stress apart from their roles in the maintenance of mineral nutrient homeostasis in turnip. These findings are helpful to understand the roles of MTPs in plants and provide preliminary information for the study of the functions of BrrMTP genes.
Iverson, Eric A.; Goodman, David A.; Gorchels, Madeline E.
2017-01-01
Viruses infecting the Archaea harbor a tremendous amount of genetic diversity. This is especially true for the spindle-shaped viruses of the family Fuselloviridae, where >90% of the viral genes do not have detectable homologs in public databases. This significantly limits our ability to elucidate the role of viral proteins in the infection cycle. To address this, we have developed genetic techniques to study the well-characterized fusellovirus Sulfolobus spindle-shaped virus 1 (SSV1), which infects Sulfolobus solfataricus in volcanic hot springs at 80°C and pH 3. Here, we present a new comparative genome analysis and a thorough genetic analysis of SSV1 using both specific and random mutagenesis and thereby generate mutations in all open reading frames. We demonstrate that almost half of the SSV1 genes are not essential for infectivity, and the requirement for a particular gene correlates well with its degree of conservation within the Fuselloviridae. The major capsid gene vp1 is essential for SSV1 infectivity. However, the universally conserved minor capsid gene vp3 could be deleted without a loss in infectivity and results in virions with abnormal morphology. IMPORTANCE Most of the putative genes in the spindle-shaped archaeal hyperthermophile fuselloviruses have no sequences that are clearly similar to characterized genes. In order to determine which of these SSV genes are important for function, we disrupted all of the putative genes in the prototypical fusellovirus, SSV1. Surprisingly, about half of the genes could be disrupted without destroying virus function. Even deletions of one of the known structural protein genes that is present in all known fuselloviruses, vp3, allows the production of infectious viruses. However, viruses lacking vp3 have abnormal shapes, indicating that the vp3 gene is important for virus structure.
Identification of essential genes will allow focused research on minimal SSV genomes and further understanding of the structure of these unique, ubiquitous, and extremely stable archaeal viruses. PMID:28148789
Iverson, Eric A; Goodman, David A; Gorchels, Madeline E; Stedman, Kenneth M
2017-05-15
Viruses infecting the Archaea harbor a tremendous amount of genetic diversity. This is especially true for the spindle-shaped viruses of the family Fuselloviridae, where >90% of the viral genes do not have detectable homologs in public databases. This significantly limits our ability to elucidate the role of viral proteins in the infection cycle. To address this, we have developed genetic techniques to study the well-characterized fusellovirus Sulfolobus spindle-shaped virus 1 (SSV1), which infects Sulfolobus solfataricus in volcanic hot springs at 80°C and pH 3. Here, we present a new comparative genome analysis and a thorough genetic analysis of SSV1 using both specific and random mutagenesis and thereby generate mutations in all open reading frames. We demonstrate that almost half of the SSV1 genes are not essential for infectivity, and the requirement for a particular gene correlates well with its degree of conservation within the Fuselloviridae. The major capsid gene vp1 is essential for SSV1 infectivity. However, the universally conserved minor capsid gene vp3 could be deleted without a loss in infectivity and results in virions with abnormal morphology. IMPORTANCE Most of the putative genes in the spindle-shaped archaeal hyperthermophile fuselloviruses have no sequences that are clearly similar to characterized genes. In order to determine which of these SSV genes are important for function, we disrupted all of the putative genes in the prototypical fusellovirus, SSV1. Surprisingly, about half of the genes could be disrupted without destroying virus function. Even deletions of one of the known structural protein genes that is present in all known fuselloviruses, vp3, allows the production of infectious viruses. However, viruses lacking vp3 have abnormal shapes, indicating that the vp3 gene is important for virus structure.
Identification of essential genes will allow focused research on minimal SSV genomes and further understanding of the structure of these unique, ubiquitous, and extremely stable archaeal viruses. Copyright © 2017 American Society for Microbiology.
Deletion Analysis of the Tumorous-Head (tuh–3) Gene in DROSOPHILA MELANOGASTER
Kuhn, David T.; Woods, Daniel F.; Andrew, Deborah J.
1981-01-01
In the presence of the naturally occurring maternal-effect alleles tuh-1h or tuh-1g, the tuh-3 mutant gene can cause the tumorous-head trait or the sac-testis trait. The tuh-3 gene functions as a semidominant in the presence of the tuh-1h maternal effect. Eye-antennal structures are replaced by posterior abdominal tergites and genital structures. If tuh-1h is replaced by its naturally occurring allele tuh-1g, tuh-3 functions as a recessive hypomorph and the defect switches from anterior to posterior structures, with a male genital-disc defect appearing with variable penetrance. Function and regulation of tuh-3+ may better be understood in light of the cytological localization of tuh-3 either adjacent to or as part of the bithorax complex. The tuh-3+ gene product appears to be essential for normal development, at least in the posterior end of the embryo. PMID:6804305
The Choice between MapMan and Gene Ontology for Automated Gene Function Prediction in Plant Science
Klie, Sebastian; Nikoloski, Zoran
2012-01-01
Since the introduction of the Gene Ontology (GO), the analysis of high-throughput data has become tightly coupled with the use of ontologies to establish associations between knowledge and data in an automated fashion. Ontologies provide a systematic description of knowledge by a controlled vocabulary of defined structure in which ontological concepts are connected by pre-defined relationships. In plant science, MapMan and GO offer two alternatives for ontology-driven analyses. Unlike GO, initially developed to characterize microbial systems, MapMan was specifically designed to cover plant-specific pathways and processes. While the dependencies between concepts in MapMan are modeled as a tree, in GO these are captured in a directed acyclic graph. Therefore, the difference in ontologies may cause discrepancies in data reduction, visualization, and hypothesis generation. Here we provide the first systematic comparative analysis of GO and MapMan for the case of the model plant species Arabidopsis thaliana (Arabidopsis) with respect to their structural properties and difference in distributions of information content. In addition, we investigate the effect of the two ontologies on the specificity and sensitivity of automated gene function prediction via the coupling of co-expression networks and the guilt-by-association principle. Automated gene function prediction is particularly needed for the model plant Arabidopsis in which only half of genes have been functionally annotated based on sequence similarity to known genes. The results highlight the need for structured representation of species-specific biological knowledge, and warrant caution in the design principles employed in future ontologies. PMID:22754563
Cloning and analysis of the positively acting regulatory gene amdR from Aspergillus nidulans.
Andrianopoulos, A; Hynes, M J
1988-01-01
The positively acting regulatory gene amdR of Aspergillus nidulans coordinately regulates the expression of four unlinked structural genes involved in acetamide (amdS), omega amino acid (gatA and gabA), and lactam (lamA) catabolism. By the use of DNA-mediated transformation of A. nidulans, the amdR regulatory gene was cloned from a genomic cosmid library. Southern blot analysis of DNA from various loss-of-function amdR mutants revealed the presence of four detectable DNA rearrangements, including a deletion, an insertion, and a translocation. No detectable DNA rearrangements were found in several constitutive amdRc mutants. Analysis of the fate of amdR-bearing plasmids in transformants showed that 10 to 20% of the transformation events were homologous integrations or gene conversions, and this phenomenon was exploited in developing a strategy by which amdRc and amdR- alleles can be readily cloned and analyzed. Examination of the transcription of amdR by Northern blot (RNA blot) analysis revealed the presence of two mRNAs (2.7 and 1.8 kilobases) which were constitutively synthesized at a very low level. In addition, amdR transcription did not appear to depend on the presence of a functional amdR product nor was it altered in amdRc mutants. The dosage effects of multiple copies of amdR in transformants were examined, and it was shown that such transformants exhibited stronger growth than did the wild type on acetamide and pyrrolidinone media, indicating increased expression of the amdS and lamA genes, respectively. These results were used to formulate a model for amdR-mediated regulation of gene expression in which the low constitutive level of amdR product sets the upper limits of basal and induced transcription of the structural genes. Multiple copies of 5' sequences from the amdS gene can result in reduced growth on substrates whose utilization is dependent on amdR-controlled genes. This has been attributed to titration of limiting amdR gene product. 
Strong support for this proposal was obtained by showing that multiple copies of the amdR gene can reverse this phenomenon (antititration). PMID:3062382
Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Lu, Kun; Xu, Xinfu; Wang, Rui; Li, Jiana; Qu, Cunmin
2017-10-24
The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus.
Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Xu, Xinfu; Wang, Rui; Li, Jiana
2017-01-01
The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus. PMID:29064393
Garbuz, D G; Evgen’ev, M B
2017-01-01
Heat shock genes are the most evolutionarily ancient among the systems responsible for adaptation of organisms to a harsh environment. The encoded proteins (heat shock proteins, Hsps) represent the most important factors of adaptation to adverse environmental conditions. They serve as molecular chaperones, providing protein folding and preventing aggregation of damaged cellular proteins. Structural analysis of the heat shock genes in individuals from both phylogenetically close and very distant taxa made it possible to reveal the basic trends of the heat shock gene organization in the context of adaptation to extreme conditions. Using different model objects and nonmodel species from natural populations, it was demonstrated that modulation of Hsp expression during adaptation to different environmental conditions could be achieved by changing the number and structural organization of heat shock genes in the genome, as well as the structure of their promoters. It was demonstrated that thermotolerant species were usually characterized by elevated levels of Hsps under normal temperature or by an increase in the synthesis of these proteins in response to heat shock. Analysis of the heat shock genes in phylogenetically distant organisms is of great interest because, on one hand, it contributes to the understanding of the molecular mechanisms of evolution of adaptogenes and, on the other hand, sheds light on the role of different Hsp families in the development of thermotolerance and the resistance to other stress factors.
Zhang, Yu; Xie, Jianping; Liu, Miaomiao; Tian, Zhe; He, Zhili; van Nostrand, Joy D; Ren, Liren; Zhou, Jizhong; Yang, Min
2013-10-15
It is widely demonstrated that antibiotics in the environment affect microbial community structure. However, direct evidence regarding the impacts of antibiotics on microbial functional structures in wastewater treatment systems is limited. Herein, a high-throughput functional gene array (GeoChip 3.0) in combination with quantitative PCR and clone libraries was used to evaluate the microbial functional structures in two biological wastewater treatment systems, which treat antibiotic production wastewater mainly containing oxytetracycline. Despite the bacteriostatic effects of antibiotics, the GeoChip detected almost all key functional gene categories, including carbon cycling, nitrogen cycling, etc., suggesting that these microbial communities were functionally diverse. In total, 749 carbon-degrading genes belonging to 40 groups (24 from bacteria and 16 from fungi) were detected. The abundance of several fungal carbon-degrading genes (e.g., glyoxal oxidase (glx), lignin peroxidase or ligninase (lip), manganese peroxidase (mnp), endochitinase, and exoglucanase genes) was significantly correlated with antibiotic concentrations (Mantel test; P < 0.05), showing that the fungal functional genes have been enhanced by the presence of antibiotics. However, from the fact that the majority of carbon-degrading genes were derived from bacteria and diverse antibiotic resistance genes were detected in bacteria, it was assumed that many bacteria could survive in the environment by acquiring antibiotic resistance and may have maintained the position as a main player in nutrient removal. Variance partitioning analysis showed that antibiotics could explain 24.4% of variations in microbial functional structure of the treatment systems.
This study provides insights into the impacts of antibiotics on microbial functional structure of a unique system receiving antibiotic production wastewater, and reveals the potential importance of the cooperation between fungi and bacteria with antibiotic resistance in maintaining the stability and performance of the systems. Copyright © 2013 Elsevier Ltd. All rights reserved.
Digital transcriptome analysis of putative sex-determination genes in papaya (Carica papaya).
Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo
2012-01-01
Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Y(h)) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Y(h) chromosomes, implying a loss of many genes on the Y(h) chromosome. Nevertheless, candidate Y(h) chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya.
Hu, Wei; Hou, Xiaowan; Huang, Chao; Yan, Yan; Tie, Weiwei; Ding, Zehong; Wei, Yunxie; Liu, Juhua; Miao, Hongxia; Lu, Zhiwei; Li, Meiying; Xu, Biyu; Jin, Zhiqiang
2015-01-01
Aquaporins (AQPs) function to selectively control the flow of water and other small molecules through biological membranes, playing crucial roles in various biological processes. However, little information is available on the AQP gene family in bananas. In this study, we identified 47 banana AQP genes based on the banana genome sequence. Evolutionary analysis of AQPs from banana, Arabidopsis, poplar, and rice indicated that banana AQPs (MaAQPs) were clustered into four subfamilies. Conserved motif analysis showed that all banana AQPs contained the typical AQP-like or major intrinsic protein (MIP) domain. Gene structure analysis suggested the majority of MaAQPs had two to four introns with a highly specific number and length for each subfamily. Expression analysis of MaAQP genes during fruit development and postharvest ripening showed that some MaAQP genes exhibited high expression levels during these stages, indicating the involvement of MaAQP genes in banana fruit development and ripening. Additionally, some MaAQP genes showed strong induction after stress treatment and therefore, may represent potential candidates for improving banana resistance to abiotic stress. Taken together, this study identified some excellent tissue-specific, fruit development- and ripening-dependent, and abiotic stress-responsive candidate MaAQP genes, which could lay a solid foundation for genetic improvement of banana cultivars. PMID:26307965
Digital Transcriptome Analysis of Putative Sex-Determination Genes in Papaya (Carica papaya)
Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo
2012-01-01
Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Yh) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Yh chromosomes, implying a loss of many genes on the Yh chromosome. Nevertheless, candidate Yh chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya. PMID:22815863
In Silico Analysis of Single Nucleotide Polymorphism (SNPs) in Human β-Globin Gene
Alanazi, Mohammed; Abduljaleel, Zainularifeen; Khan, Wajahatullah; Warsy, Arjumand S.; Elrobh, Mohamed; Khan, Zahid; Amri, Abdullah Al; Bazzi, Mohammad D.
2011-01-01
Single amino acid substitutions in the globin chain are the most common forms of genetic variation that produce hemoglobinopathies, the most widespread inherited disorders worldwide. Several hemoglobinopathies result from homozygosity or compound heterozygosity for beta-globin (HBB) gene mutations, such as those producing sickle cell hemoglobin (HbS), HbC, HbD and HbE. Several of these mutations are deleterious and result in moderate to severe hemolytic anemia, with associated complications, requiring lifelong care and management. Even though many hemoglobinopathies result from single amino acid changes producing similar structural abnormalities, there are functional differences in the generated variants. Using in silico methods, we examined the genetic variations that can alter the expression and function of the HBB gene. Using a sequence homology-based Sorting Intolerant from Tolerant (SIFT) server we searched for SNPs, which showed that 200 (80%) non-synonymous polymorphisms were found to be deleterious. The structure-based method via the PolyPhen server indicated that 135 (40%) non-synonymous polymorphisms may modify protein function and structure. The Pupa Suite software showed that the SNPs will have a phenotypic consequence on the structure and function of the altered protein. Structure analysis was performed on the key mutations that occur in the native protein coded by the HBB gene and that cause hemoglobinopathies such as: HbC (E→K), HbD (E→Q), HbE (E→K) and HbS (E→V). Atomic Non-Local Environment Assessment (ANOLEA), Yet Another Scientific Artificial Reality Application (YASARA), CHARMM-GUI webserver for macromolecular dynamics and mechanics, and Normal Mode Analysis, Deformation and Refinement (NOMAD-Ref) of Gromacs server were used to perform molecular dynamics simulations and energy minimization calculations on β-Chain residue of the HBB gene before and after mutation. Furthermore, in the native and altered protein models, amino acid residues were determined and secondary structures were observed for solvent accessibility to confirm the protein stability. The functional study in this investigation may be a good model for additional future studies. PMID:22028795
Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan
2016-04-12
The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.
Segmental duplications: evolution and impact among the current Lepidoptera genomes.
Zhao, Qian; Ma, Dongna; Vasseur, Liette; You, Minsheng
2017-07-06
Structural variation among genomes is now viewed to be as important as single nucleotide polymorphisms in influencing the phenotype and evolution of a species. Segmental duplication (SD) is defined as segments of DNA with homologous sequence. Here, we performed a systematic analysis of segmental duplications (SDs) among five lepidopteran reference genomes (Plutella xylostella, Danaus plexippus, Bombyx mori, Manduca sexta and Heliconius melpomene) to understand their potential impact on the evolution of these species. We find that the SD content differed substantially among species, ranging from 1.2% of the genome in B. mori to 15.2% in H. melpomene. Most SDs formed very high identity (similarity higher than 90%) blocks but had very few large blocks. Comparative analysis showed that most of the SDs arose after the divergence of each lineage and we found that P. xylostella and H. melpomene showed more duplications than other species, suggesting they might be able to tolerate extensive levels of variation in their genomes. Conserved ancestral and species-specific SD events were assessed, revealing multiple examples of the gain, loss or maintenance of SDs over time. SD content analysis showed that most of the genes embedded in SD regions belonged to species-specific SDs ("Unique" SDs). Functional analysis of these genes suggested their potential roles in lineage-specific evolution. SDs and flanking regions often contained transposable elements (TEs) and this association suggested some involvement in SD formation. Further comparison of gene expression levels between SDs and non-SDs showed that the expression level of genes embedded in SDs was significantly lower, suggesting that structural changes in the genomes are involved in gene expression differences in species. The results showed that most of the SDs were "unique SDs", which originated after species formation. Functional analysis suggested that SDs might play different roles in different species. Our results provide a valuable resource beyond the genetic mutation to explore the genome structure for future Lepidoptera research.
Genome-wide identification of the SWEET gene family in wheat.
Gao, Yue; Wang, Zi Yuan; Kumar, Vikranth; Xu, Xiao Feng; Yuan, De Peng; Zhu, Xiao Feng; Li, Tian Ya; Jia, Baolei; Xuan, Yuan Hu
2018-02-05
The SWEET (sugars will eventually be exported transporter) family is a newly characterized group of sugar transporters. In plants, the key roles of SWEETs in phloem transport, nectar secretion, pollen nutrition, stress tolerance, and plant-pathogen interactions have been identified. SWEET family genes have been characterized in many plant species, but a comprehensive analysis of SWEET members has not yet been performed in wheat. Here, 59 wheat SWEETs (hereafter TaSWEETs) were identified through homology searches. Analyses of phylogenetic relationships, numbers of transmembrane helices (TMHs), gene structures, and motifs showed that TaSWEETs carrying 3-7 TMHs could be classified into four clades with 10 different types of motifs. Examination of the expression patterns of 18 SWEET genes revealed that a few are tissue-specific while most are ubiquitously expressed. In addition, the stem rust-mediated expression patterns of SWEET genes were monitored using a stem rust-susceptible cultivar, 'Little Club' (LC). The resulting data showed that the expression of five out of the 18 SWEETs tested was induced following inoculation. In conclusion, we provide the first comprehensive analysis of the wheat SWEET gene family. Information regarding the phylogenetic relationships, gene structures, and expression profiles of SWEET genes in different tissues and following stem rust disease inoculation will be useful in identifying the potential roles of SWEETs in specific developmental and pathogenic processes. Copyright © 2017 Elsevier B.V. All rights reserved.
A genome-wide analysis of the expansin genes in Malus × Domestica.
Zhang, Shizhong; Xu, Ruirui; Gao, Zheng; Chen, Changtian; Jiang, Zesheng; Shu, Huairui
2014-04-01
Expansins were first identified as cell wall-loosening proteins; they are involved in regulating cell expansion, fruit softening and many other physiological processes. However, our knowledge about the expansin family members and their evolutionary relationships in fruit trees, such as apple, is limited. In this study, we identified 41 members of the expansin gene family in the genome of apple (Malus × Domestica L. Borkh). Phylogenetic analysis revealed that expansin genes in apple could be divided into four subfamilies according to their gene structures and protein motifs. By phylogenetic analysis of the expansins in five plants (Arabidopsis, rice, poplar, grape and apple), the expansins were divided into 17 subgroups. Our gene duplication analysis revealed that whole-genome and chromosomal-segment duplications contributed to the expansion of Mdexpansins. The microarray and expressed sequence tag (EST) data showed that 34 Mdexpansin genes could be divided into five groups by the EST analysis; they may also play different roles during fruit development. An expression model for MdEXPA16 and MdEXPA20 showed their potential role in developing fruit. Overall, our study provides useful data and novel insights into the functions and regulatory mechanisms of the expansin genes in apple, as well as their evolution and divergence. As the first step towards genome-wide analysis of the expansin genes in apple, our results have established a solid foundation for future studies on the function of the expansin genes in fruit development.
2012-01-01
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems-related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data, provide links to software implementations and tools, and also address the general problem of multiple hypothesis testing. Further, we provide recommendations for the selection of such analysis methods. Reviewers: This article was reviewed by Arcady Mushegian, Byung-Soo Kim and Joel Bader. PMID:23227854
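The multiple hypothesis testing problem mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which correction the review recommends; the Benjamini-Hochberg false discovery rate procedure is used here only as one commonly used option, and the function name and the default `alpha` are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure;
    not taken from the reviewed paper.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical per-gene p-values, e.g. from single-gene tests.
    print(benjamini_hochberg([0.039, 0.001, 0.6, 0.008]))
```

In practice one p-value per gene (or per gene set) would be passed in, and the returned flags mark the genes kept at the chosen false discovery rate.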
Recognition of Computer Viruses by Detecting Their Gene of Self Replication
2006-03-01
...etection Approach...1.4.1 The syntactic analysis m...Therefore a group of instructions acting together in the right order have to be identified for the gene of self-replication to be obvious in a...its first system call NtCreateFile, while the outputs of NtWriteFile become its output arguments. These four blocks form the final structure - The Gene
Immunology Research in Israel.
1985-11-14
...pursued by Israeli scientists include investigation of immunoglobulin genes, structure-function analysis of antibodies and regulation of antibody...investigation of immunoglobulin genes, structure-function analysis of...that had been first formulated by M. Sela and R. Arnon. D. Givol pioneered in...
Liu, Li; Sabo, Aniko; Neale, Benjamin M.; Nagaswamy, Uma; Stevens, Christine; Lim, Elaine; Bodea, Corneliu A.; Muzny, Donna; Reid, Jeffrey G.; Banks, Eric; Coon, Hillary; DePristo, Mark; Dinh, Huyen; Fennel, Tim; Flannick, Jason; Gabriel, Stacey; Garimella, Kiran; Gross, Shannon; Hawes, Alicia; Lewis, Lora; Makarov, Vladimir; Maguire, Jared; Newsham, Irene; Poplin, Ryan; Ripke, Stephan; Shakir, Khalid; Samocha, Kaitlin E.; Wu, Yuanqing; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H.; Devlin, Bernie; Schellenberg, Gerard D.; Sutcliffe, James S.; Daly, Mark J.; Gibbs, Richard A.; Roeder, Kathryn
2013-01-01
We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD. PMID:23593035
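The variant filtering step described in the abstract above (fraction of missing data, read depth, and balance of alternative to reference reads) might look roughly like the following sketch. The abstract gives no thresholds, so every cutoff, field name, and the `VariantCall` record itself are hypothetical and chosen only for illustration; applying the balance check to every call is also a simplification.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical summary record for one called variant."""
    variant_id: str
    missing_fraction: float  # fraction of samples without a genotype call
    read_depth: float        # mean read depth at the site
    alt_reads: int           # reads supporting the alternative allele
    ref_reads: int           # reads supporting the reference allele


def passes_filters(call, max_missing=0.10, min_depth=10.0,
                   balance_range=(0.3, 0.7)):
    """Return True if the call passes all three illustrative filters.

    The default thresholds are assumptions, not values from the study.
    """
    total = call.alt_reads + call.ref_reads
    if total == 0:
        return False  # no supporting reads, so balance is undefined
    balance = call.alt_reads / total
    low, high = balance_range
    return (call.missing_fraction <= max_missing
            and call.read_depth >= min_depth
            and low <= balance <= high)


if __name__ == "__main__":
    calls = [
        VariantCall("v1", 0.02, 35.0, 18, 17),
        VariantCall("v2", 0.25, 40.0, 20, 20),  # too much missing data
        VariantCall("v3", 0.01, 40.0, 2, 38),   # skewed allele balance
    ]
    print([c.variant_id for c in calls if passes_filters(c)])
```

Applying the same filter function to calls from both sequencing centers is one simple way to make the resulting rare-variant distributions more comparable before they are combined.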
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: collecting statistics from biological data, building a computational model, solving a computational modeling problem, and testing and evaluating a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data are usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
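The statement above that biological networks are usually modeled as graphs can be made concrete with a small sketch. It builds an undirected graph from a list of pairwise interactions and finds its connected components by breadth-first search. The protein names and both function names are hypothetical, and this is only one of many possible graph analyses, not a method taken from the chapter.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return a list of node sets, one per connected component (BFS)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


if __name__ == "__main__":
    # Hypothetical protein-protein interactions.
    interactions = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
    graph = build_graph(interactions)
    for component in connected_components(graph):
        print(sorted(component))
```

Each component groups proteins that are linked directly or indirectly, which is a common first step before more specific network analyses.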
Naville, Magali; Gautheret, Daniel
2010-01-01
Bacterial transcription attenuation occurs through a variety of cis-regulatory elements that control gene expression in response to a wide range of signals. The signal-sensing structures in attenuators are so diverse and rapidly evolving that only a small fraction have been properly annotated and characterized to date. Here we apply a broad-spectrum detection tool in order to achieve a more complete view of the transcriptional attenuation complement of key bacterial species. Our protocol seeks gene families with an unusual frequency of 5' terminators found across multiple species. Many of the detected attenuators are part of annotated elements, such as riboswitches or T-boxes, which often operate through transcriptional attenuation. However, a significant fraction of candidates were not previously characterized in spite of their unmistakable footprint. We further characterized some of these new elements using sequence and secondary structure analysis. We also present elements that may control the expression of several non-homologous genes, suggesting co-transcription and response to common signals. An important class of such elements, which we called mobile attenuators, is provided by 3' terminators of insertion sequences or prophages that may be exapted as 5' regulators when inserted directly upstream of a cellular gene. We show here that attenuators involve a complex landscape of signal-detection structures spanning the entire bacterial domain. We discuss possible scenarios through which these diverse 5' regulatory structures may arise or evolve.
Chen, Min; Tan, Qiuping; Sun, Mingyue; Li, Dongmei; Fu, Xiling; Chen, Xiude; Xiao, Wei; Li, Ling; Gao, Dongsheng
2016-06-01
Bud dormancy in deciduous fruit trees is an important adaptive mechanism for their survival in cold climates. The WRKY genes participate in several developmental and physiological processes, including dormancy. However, the dormancy mechanisms of WRKY genes have not been studied in detail. We conducted a genome-wide analysis and identified 58 WRKY genes in peach. These putative genes were located on all eight chromosomes. In bioinformatics analyses, we compared the sequences of WRKY genes from peach, rice, and Arabidopsis. In a cluster analysis, the gene sequences formed three groups, of which group II was further divided into five subgroups. Gene structure was highly conserved within each group, especially in groups IId and III. Gene expression analyses by qRT-PCR showed that WRKY genes showed different expression patterns in peach buds during dormancy. The mean expression levels of six WRKY genes (Prupe.6G286000, Prupe.1G393000, Prupe.1G114800, Prupe.1G071400, Prupe.2G185100, and Prupe.2G307400) increased during endodormancy and decreased during ecodormancy, indicating that these six WRKY genes may play a role in dormancy in a perennial fruit tree. This information will be useful for selecting fruit trees with desirable dormancy characteristics or for manipulating dormancy in genetic engineering programs.
Genome-wide analysis of WRKY gene family in Cucumis sativus
2011-01-01
Background: WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results: We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions: Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985
Genome-wide analysis of WRKY gene family in Cucumis sativus.
Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan
2011-09-28
WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong
2014-10-16
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors perform versatile functions in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence.
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C.; Zhang, Baohong
2014-01-01
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors perform versatile functions in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
Systematic exploration of essential yeast gene function with temperature-sensitive mutants
Li, Zhijian; Vizeacoumar, Franco J; Bahr, Sondra; Li, Jingjing; Warringer, Jonas; Vizeacoumar, Frederick S; Min, Renqiang; VanderSluis, Benjamin; Bellay, Jeremy; DeVit, Michael; Fleming, James A; Stephens, Andrew; Haase, Julian; Lin, Zhen-Yuan; Baryshnikova, Anastasia; Lu, Hong; Yan, Zhun; Jin, Ke; Barker, Sarah; Datti, Alessandro; Giaever, Guri; Nislow, Corey; Bulawa, Chris; Myers, Chad L; Costanzo, Michael; Gingras, Anne-Claude; Zhang, Zhaolei; Blomberg, Anders; Bloom, Kerry; Andrews, Brenda; Boone, Charles
2012-01-01
Conditional temperature-sensitive (ts) mutations are valuable reagents for studying essential genes in the yeast Saccharomyces cerevisiae. We constructed 787 ts strains, covering 497 (~45%) of the 1,101 essential yeast genes, with ~30% of the genes represented by multiple alleles. All of the alleles are integrated into their native genomic locus in the S288C common reference strain and are linked to a kanMX selectable marker, allowing further genetic manipulation by synthetic genetic array (SGA)–based, high-throughput methods. We show two such manipulations: barcoding of 440 strains, which enables chemical-genetic suppression analysis, and the construction of arrays of strains carrying different fluorescent markers of subcellular structure, which enables quantitative analysis of phenotypes using high-content screening. Quantitative analysis of a GFP-tubulin marker identified roles for cohesin and condensin genes in spindle disassembly. This mutant collection should facilitate a wide range of systematic studies aimed at understanding the functions of essential genes. PMID:21441928
The sieve element occlusion gene family in dicotyledonous plants
Ernst, Antonia M; Rüping, Boris; Jekat, Stephan B; Nordzieke, Steffen; Reineke, Anna R; Müller, Boje; Bornberg-Bauer, Erich; Prüfer, Dirk; Noll, Gundula A
2011-01-01
Sieve element occlusion (SEO) genes encoding forisome subunits have been identified in Medicago truncatula and other legumes. Forisomes are structural phloem proteins uniquely found in Fabaceae sieve elements. They undergo a reversible conformational change after wounding, from a condensed to a dispersed state, thereby blocking sieve tube translocation and preventing the loss of photoassimilates. Recently, we identified SEO genes in several non-Fabaceae plants (lacking forisomes) and concluded that they most probably encode conventional non-forisome P-proteins. Molecular and phylogenetic analysis of the SEO gene family has identified domains that are characteristic for SEO proteins. Here, we extended our phylogenetic analysis by including additional SEO genes from several diverse species based on recently published genomic data. Our results strengthen the original assumption that SEO genes seem to be widespread in dicotyledonous angiosperms, and further underline the divergent evolution of SEO genes within the Fabaceae. PMID:21422825
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.
Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R
1999-12-16
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Mishra, Manoj K.; Chaturvedi, Pankaj; Singh, Ruchi; Singh, Gaurav; Sharma, Lokendra K.; Pandey, Vibha; Kumari, Nishi; Misra, Pratibha
2013-01-01
Background Sterol glycosyltransferases (SGTs) are enzymes that glycosylate sterols, which play an important role in plant adaptation to stress and are medicinally important in plants like Withania somnifera. The present study aims to find the role of WsSGTL1, a sterol glycosyltransferase from W. somnifera, in the plant's adaptation to abiotic stress. Methodology The WsSGTL1 gene was transformed into Arabidopsis thaliana through Agrobacterium-mediated transformation, using the binary vector pBI121, by the floral dip method. Phenotypic and physiological parameters such as germination, root length, shoot weight, relative electrolyte conductivity, MDA content, SOD levels, relative electrolyte leakage and chlorophyll measurements were compared between transgenic and wild-type Arabidopsis plants under different abiotic stresses: salt, heat and cold. Biochemical analysis was done by HPLC-TLC and radiolabelled enzyme assay. The promoter of the WsSGTL1 gene was cloned using the Genome Walker kit (Clontech, USA) and the 3D structures were predicted using Discovery Studio Ver. 2.5. Results The WsSGTL1 transgenic plants were confirmed to be single copy by Southern analysis and homozygous by segregation analysis. Compared with WT, the transgenic plants showed better germination and better salt, heat and cold tolerance. The level of the transgene WsSGTL1 was elevated under heat, cold and salt stress, along with other marker genes such as HSP70, HSP90, RD29, SOS3 and LEA4-5. Biochemical analysis showed the formation of sterol glycosides and an increase in enzyme activity. When the promoter of the WsSGTL1 gene was cloned from W. somnifera and sequenced, it contained stress-responsive elements. Bioinformatics analysis of the 3D structure of the WsSGTL1 protein showed functional similarity with the sterol glycosyltransferase AtSGT of A. thaliana. Conclusions Transformation of the WsSGTL1 gene into A. thaliana conferred abiotic stress tolerance. The promoter of the gene in W. somnifera was found to have stress-responsive elements. The 3D structure showed functional similarity with sterol glycosyltransferases. PMID:23646175
Brauburger, Kristina; Boehmann, Yannik; Tsuda, Yoshimi; Hoenen, Thomas; Olejnik, Judith; Schümann, Michael; Ebihara, Hideki; Mühlberger, Elke
2014-11-01
Ebola virus (EBOV) belongs to the group of nonsegmented negative-sense RNA viruses. The seven EBOV genes are separated by variable gene borders, including short (4- or 5-nucleotide) intergenic regions (IRs), a single long (144-nucleotide) IR, and gene overlaps, where the neighboring gene end and start signals share five conserved nucleotides. The unique structure of the gene overlaps and the presence of a single long IR are conserved among all filoviruses. Here, we sought to determine the impact of the EBOV gene borders during viral transcription. We show that readthrough mRNA synthesis occurs in EBOV-infected cells irrespective of the structure of the gene border, indicating that the gene overlaps do not promote recognition of the gene end signal. However, two consecutive gene end signals at the VP24 gene might improve termination at the VP24-L gene border, ensuring efficient L gene expression. We further demonstrate that the long IR is not essential for but regulates transcription reinitiation in a length-dependent but sequence-independent manner. Mutational analysis of bicistronic minigenomes and recombinant EBOVs showed no direct correlation between IR length and reinitiation rates but demonstrated that specific IR lengths not found naturally in filoviruses profoundly inhibit downstream gene expression. Intriguingly, although truncation of the 144-nucleotide-long IR to 5 nucleotides did not substantially affect EBOV transcription, it led to a significant reduction of viral growth. Our current understanding of EBOV transcription regulation is limited due to the requirement for high-containment conditions to study this highly pathogenic virus. EBOV is thought to share many mechanistic features with well-analyzed prototype nonsegmented negative-sense RNA viruses. A single polymerase entry site at the 3' end of the genome determines that transcription of the genes is mainly controlled by gene order and cis-acting signals found at the gene borders. Here, we examined the regulatory role of the structurally unique EBOV gene borders during viral transcription. Our data suggest that transcriptional regulation in EBOV is highly complex and differs from that in prototype viruses and further the understanding of this most fundamental process in the filovirus replication cycle. Moreover, our results with recombinant EBOVs suggest a novel role of the long IR found in all filovirus genomes during the viral replication cycle. Copyright © 2014, American Society for Microbiology. All Rights Reserved. PMID:25142600
Genome-wide identification and characterisation of F-box family in maize.
Jia, Fengjuan; Wu, Bingjiang; Li, Hui; Huang, Jinguang; Zheng, Chengchao
2013-11-01
F-box-containing proteins, as key components of the protein degradation machinery, are widely distributed in higher plants and are considered one of the largest known families of regulatory proteins. The F-box protein family plays a crucial role in plant growth and development and in response to biotic and abiotic stresses. However, a systematic analysis of the F-box family in maize (Zea mays) has not yet been reported. In this paper, we identified and characterised the maize F-box genes on a genome-wide scale, including phylogenetic analysis, chromosome distribution, gene structure, promoter analysis and gene expression profiles. A total of 359 F-box genes were identified and divided into 15 subgroups by phylogenetic analysis. The F-box domain was relatively conserved, whereas additional motifs outside the F-box domain may indicate the functional diversification of maize F-box genes. These genes were unevenly distributed across the ten maize chromosomes, suggesting that they expanded in the maize genome because of tandem and segmental duplication events. The expression profiles suggested that the maize F-box genes had temporal and spatial expression patterns. Putative cis-acting regulatory DNA elements involved in abiotic stresses were observed in maize F-box gene promoters. The gene expression profiles under abiotic stresses also suggested that some genes participated in stress-responsive pathways. Furthermore, ten genes were chosen for quantitative real-time PCR analysis under drought stress and the results were consistent with the microarray data. This study has produced a comparative genomics analysis of the maize ZmFBX gene family that can be used in further studies to uncover their roles in maize growth and development.
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
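The duplicate-retention percentage quoted in this abstract can be reproduced with simple arithmetic. The sketch below assumes that each of the 38 paralogous pairs contributes two distinct genes (so no gene is counted in more than one pair); the variable names are illustrative only and do not come from the paper.

```python
# Worked check of the duplicate-gene percentage quoted in the abstract.
# Assumption: every gene belongs to at most one of the 38 pairs.
total_dof_genes = 78      # putative GmDof genes reported
duplicate_pairs = 38      # paralogous pairs in duplicated regions

genes_in_pairs = duplicate_pairs * 2
percent = 100 * genes_in_pairs / total_dof_genes

print(f"{genes_in_pairs} of {total_dof_genes} genes = {percent:.1f}%")
```

Under that assumption the result agrees with the 97.4% stated in the abstract.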
Molecular Profile of Peripheral Blood Mononuclear Cells from Patients with Rheumatoid Arthritis
Edwards, Christopher J; Feldman, Jeffrey L; Beech, Jonathan; Shields, Kathleen M; Stover, Jennifer A; Trepicchio, William L; Larsen, Glenn; Foxwell, Brian MJ; Brennan, Fionula M; Feldmann, Marc; Pittman, Debra D
2007-01-01
Rheumatoid arthritis (RA) is a chronic inflammatory arthritis. Currently, diagnosis of RA may take several weeks, and factors used to predict a poor prognosis are not always reliable. Gene expression in RA may consist of a unique signature. Gene expression analysis has been applied to synovial tissue to define molecularly distinct forms of RA; however, expression analysis of tissue taken from a synovial joint is invasive and clinically impractical. Recent studies have demonstrated that unique gene expression changes can be identified in peripheral blood mononuclear cells (PBMCs) from patients with cancer, multiple sclerosis, and lupus. To identify RA disease-related genes, we performed a global gene expression analysis. RNA from PBMCs of 9 RA patients and 13 normal volunteers was analyzed on an oligonucleotide array. Compared with normal PBMCs, 330 transcripts were differentially expressed in RA. The differentially regulated genes belong to diverse functional classes and include genes involved in calcium binding, chaperones, cytokines, transcription, translation, signal transduction, extracellular matrix, integral to plasma membrane, integral to intracellular membrane, mitochondrial, ribosomal, structural, enzymes, and proteases. A k-nearest neighbor analysis identified 29 transcripts that were preferentially expressed in RA. Ten genes with increased expression in RA PBMCs compared with controls mapped to a RA susceptibility locus, 6p21.3. These results suggest that analysis of RA PBMCs at the molecular level may provide a set of candidate genes that could yield an easily accessible gene signature to aid in early diagnosis and treatment. PMID:17515956
Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E
1998-06-01
Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.
Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui
2012-11-07
RNA-protein interactions play an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infections by RNA viruses. In this study, using the Gene Ontology Annotation (GOA) and Structural Classification of Proteins (SCOP) databases, an automatic procedure was designed to capture structurally solved RNA-binding protein domains in different subclasses. Subsequently, we applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) to analyze and classify RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three state-of-the-art methods. TMCSVM outperformed the other methods, suggesting its potential as a useful tool for the multi-class prediction of RNA-binding protein domains. MCRLR, by elucidating the contribution of individual features to the predictive accuracy for RNA-binding protein domain subclasses, provides some biological insights into the roles of sequences and structures in protein-RNA interactions.
Wu, Yan-Hua; Guo, Bin; Lou, Hui-Ling; Cui, Yu-Liang; Gu, Hui-Juan; Qiao, Shou-Yi
2012-02-01
Experimental gene engineering is a laboratory course focusing on the molecular structure, expression pattern and biological function of genes. Providing our students with a solid knowledge base and sound research practice is very important for high-quality education in genetic engineering. Inspired by recent progress in this field, we improved the experimental gene engineering course by adding more up-to-date knowledge and technologies and by emphasizing the combination of teaching and research, with the aim of offering our students a good start in their scientific careers.
2013-01-01
Background The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. Results In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Conclusions Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship. PMID:23845024
Ehlers, Claudia; Veit, Katharina; Gottschalk, Gerhard; Schmitz, Ruth A.
2002-01-01
The mesophilic methanogenic archaeon Methanosarcina mazei strain Gö1 is able to utilize molecular nitrogen (N2) as its sole nitrogen source. We have identified and characterized a single nitrogen fixation (nif) gene cluster in M. mazei Gö1 with an approximate length of 9 kbp. Sequence analysis revealed seven genes with sequence similarities to nifH, nifI1, nifI2, nifD, nifK, nifE and nifN, similar to other diazotrophic methanogens and certain bacteria such as Clostridium acetobutylicum, with the two glnB-like genes (nifI1 and nifI2) located between nifH and nifD. Phylogenetic analysis of deduced amino acid sequences for the nitrogenase structural genes of M. mazei Gö1 showed that they are most closely related to Methanosarcina barkeri nif2 genes, and also closely resemble those for the corresponding nif products of the gram-positive bacterium C. acetobutylicum. Northern blot analysis and reverse transcription PCR analysis demonstrated that the M. mazei nif genes constitute an operon transcribed only under nitrogen starvation as a single 8 kb transcript. Sequence analysis revealed a palindromic sequence at the transcriptional start site in front of the M. mazei nifH gene, which may have a function in transcriptional regulation of the nif operon. PMID:15803652
Genomewide analysis of TCP transcription factor gene family in Malus domestica.
Xu, Ruirui; Sun, Peng; Jia, Fengjuan; Lu, Longtao; Li, Yuanyuan; Zhang, Shizhong; Huang, Jinguang
2014-12-01
Teosinte branched 1/cycloidea/proliferating cell factor 1 (TCP) proteins are a large family of transcriptional regulators in angiosperms. They are involved in various biological processes, including development and plant metabolism pathways. In this study, a total of 52 TCP genes were identified in the apple (Malus domestica) genome. Bioinformatic methods were employed to predict and analyse the gene classification, gene structure, chromosome location, sequence alignment and conserved domains of MdTCP proteins. Expression analysis from microarray data showed that the expression levels of 28 and 51 MdTCP genes changed during the ripening and rootstock-scion interaction processes, respectively. The expression patterns of 12 selected MdTCP genes were analysed in different tissues and in response to abiotic stresses. All of the selected genes were detected in at least one of the tissues tested, and most of them were modulated by adverse treatments, indicating that the MdTCPs were involved in various developmental and physiological processes. To the best of our knowledge, this is the first genomewide analysis of the apple TCP gene family. These results provide valuable information for studies on the functions of the TCP transcription factor genes in apple.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria
Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song
2012-11-16
Background Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results Overall, 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. The phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially in subfamilies such as PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt catalytic mechanisms similar to those of eukaryotes. All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
Zhu, Chaoying; Chen, Peng; Han, Yuqing; Ruan, Luzhang
2018-05-12
The Ruddy-breasted Crake (Porzana fusca) is an extremely poorly known species. Although it is not listed as globally endangered, in recent years its habitat has been rapidly disappearing and its populations have been shrinking under the pressure of climate change and human activities. The Ruddy-breasted Crake shows two different life history traits in China: a non-migratory population in the south and a migratory population in the north. In this study, mitochondrial control sequences and microsatellite datasets of 88 individuals sampled from 8 sites were used to analyze genetic diversity, genetic differentiation, and genetic structure. Our results indicated that low genetic diversity and low genetic differentiation exist in most populations. The neutrality test gave a significantly negative Fu's Fs value, which, in combination with the mismatch distribution, indicated that population expansion occurred in the interglacial period approximately 98,000 years ago; the time of the most recent common ancestor (TMRCA) was estimated at about 202,705 years ago. Gene flow analysis implied that gene flow was low overall, but gene exchange was frequent among adjacent populations. Both phylogenetic and STRUCTURE analyses implied weak genetic structure. In general, the genetic diversity, gene flow, and genetic structure of the Ruddy-breasted Crake were low.
Ray, Sumanta; Maulik, Ujjwal
2016-12-20
Detecting perturbation in modular structure during HIV-1 disease progression is an important step toward understanding the stage-specific infection pattern of HIV-1 in human cells. In this article, we propose a novel methodology that integrates multiple sources of biological information to identify such disruption in human gene modules during different stages of HIV-1 infection. We integrate three kinds of biological information (gene expression, protein-protein interaction (PPI), and gene ontology (GO)) into single gene meta-modules through non-negative matrix factorization (NMF). Because the identified meta-modules inherit this information, detecting their perturbation reflects changes in expression pattern, in PPI structure and in functional similarity of genes as the infection progresses. NMF-based clustering is used to integrate modules from the different data sources into strong meta-modules. Perturbation in meta-modular structure is identified by investigating topological and intramodular properties and ranking the meta-modules with a rank aggregation algorithm. We also analyzed the preservation structure of significant GO terms in which the human proteins of the meta-modules participate. Moreover, we performed an analysis to show the change in coregulation pattern of identified transcription factors (TFs) over the HIV progression stages.
Jühling, Frank; Pütz, Joern; Bernt, Matthias; Donath, Alexander; Middendorf, Martin; Florentz, Catherine; Stadler, Peter F.
2012-01-01
Transfer RNAs (tRNAs) are present in all types of cells as well as in organelles. tRNAs of animal mitochondria show a low level of primary sequence conservation and exhibit ‘bizarre’ secondary structures, lacking complete domains of the common cloverleaf. Such sequences are hard to detect and hence frequently missed in computational analyses and mitochondrial genome annotation. Here, we introduce an automatic annotation procedure for mitochondrial tRNA genes in Metazoa based on sequence and structural information in manually curated covariance models. The method, applied to re-annotate 1876 available metazoan mitochondrial RefSeq genomes, makes it possible to distinguish between remaining functional genes and degrading ‘pseudogenes’, even at early stages of divergence. The subsequent analysis of a comprehensive set of mitochondrial tRNA genes gives new insights into the evolution of structures of mitochondrial tRNA sequences as well as into the mechanisms of genome rearrangements. We find frequent losses of tRNA genes concentrated in basal Metazoa, frequent independent losses of individual parts of tRNA genes, particularly in Arthropoda, and widespread conserved overlaps of tRNAs in opposite reading direction. Direct evidence for several recent Tandem Duplication-Random Loss events is gained, demonstrating that this mechanism has an impact on the appearance of new mitochondrial gene orders. PMID:22139921
Analysis of evolutionary patterns of genes in Campylobacter jejuni and C. coli
USDA-ARS's Scientific Manuscript database
Background: In order to investigate the population genetics structure of thermophilic Campylobacter spp., we extracted a set of 1029 core gene families (CGF) from 25 sequenced genomes of C. jejuni, C. coli and C. lari. Based on these CGFs we employed different approaches to reveal the evolutionary ...
Temperature-responsive in vitro RNA structurome of Yersinia pseudotuberculosis.
Righetti, Francesco; Nuss, Aaron M; Twittenhoff, Christian; Beele, Sascha; Urban, Kristina; Will, Sebastian; Bernhart, Stephan H; Stadler, Peter F; Dersch, Petra; Narberhaus, Franz
2016-06-28
RNA structures are fundamentally important for RNA function. Dynamic, condition-dependent structural changes are able to modulate gene expression as shown for riboswitches and RNA thermometers. By parallel analysis of RNA structures, we mapped the RNA structurome of Yersinia pseudotuberculosis at three different temperatures. This human pathogen is exquisitely responsive to host body temperature (37 °C), which induces a major metabolic transition. Our analysis profiles the structure of more than 1,750 RNAs at 25 °C, 37 °C, and 42 °C. Average mRNAs tend to be unstructured around the ribosome binding site. We searched for 5'-UTRs that are folded at low temperature and identified novel thermoresponsive RNA structures from diverse gene categories. The regulatory potential of 16 candidates was validated. In summary, we present a dynamic bacterial RNA structurome and find that the expression of virulence-relevant functions in Y. pseudotuberculosis and reprogramming of its metabolism in response to temperature is associated with a restructuring of numerous mRNAs.
xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud
Merchant, Nirav
2016-01-01
Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today’s pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant’s Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. PMID:27020957
xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud.
Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P
2016-04-01
Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.
Xu, Zongda; Sun, Lidan; Zhou, Yuzhen; Yang, Weiru; Cheng, Tangren; Wang, Jia; Zhang, Qixiang
2015-10-01
SQUAMOSA promoter-binding protein (SBP)-box family genes encode plant-specific transcription factors that play crucial roles in plant development, especially flower and fruit development. However, little information on this gene family is available for Prunus mume, an ornamental and fruit tree widely cultivated in East Asia. To explore the evolution of SBP-box genes in Prunus and explore their functions in flower and fruit development, we performed a genome-wide analysis of the SBP-box gene family in P. mume. Fifteen SBP-box genes were identified, and 11 of them contained an miR156 target site. Phylogenetic and comprehensive bioinformatics analyses revealed that different groups of SBP-box genes have undergone different evolutionary processes and varied in their length, structure, and motif composition. Purifying selection has been the main selective constraint on both paralogous and orthologous SBP-box genes. In addition, the sequences of orthologous SBP-box genes did not diverge widely after the split of P. mume and Prunus persica. Expression analysis of P. mume SBP-box genes revealed their diverse spatiotemporal expression patterns. Three duplicated SBP-box genes may have undergone subfunctionalization in Prunus. Most of the SBP-box genes showed high transcript levels in flower buds and young fruit. The four miR156-nontargeted genes were upregulated during fruit ripening. Together, these results provide information about the evolution of SBP-box genes in Prunus. The expression analysis lays the foundation for further research on the functions of SBP-box genes in P. mume and other Prunus species, especially during flower and fruit development.
Long-Range Chromosome Interactions Mediated by Cohesin Shape Circadian Gene Expression
Xu, Yichi; Guo, Weimin; Li, Ping; Zhang, Yan; Zhao, Meng; Fan, Zenghua; Zhao, Zhihu; Yan, Jun
2016-01-01
Mammalian circadian rhythm is established by the negative feedback loops consisting of a set of clock genes, which lead to the circadian expression of thousands of downstream genes in vivo. As genome-wide transcription is organized under the high-order chromosome structure, it is largely uncharted how circadian gene expression is influenced by chromosome architecture. We focus on the function of chromatin structure proteins cohesin as well as CTCF (CCCTC-binding factor) in circadian rhythm. Using circular chromosome conformation capture sequencing, we systematically examined the interacting loci of a Bmal1-bound super-enhancer upstream of a clock gene Nr1d1 in mouse liver. These interactions are largely stable in the circadian cycle and cohesin binding sites are enriched in the interactome. Global analysis showed that cohesin-CTCF co-binding sites tend to insulate the phases of circadian oscillating genes while cohesin-non-CTCF sites are associated with high circadian rhythmicity of transcription. A model integrating the effects of cohesin and CTCF markedly improved the mechanistic understanding of circadian gene expression. Further experiments in cohesin knockout cells demonstrated that cohesin is required at least in part for driving the circadian gene expression by facilitating the enhancer-promoter looping. This study provided a novel insight into the relationship between circadian transcriptome and the high-order chromosome structure. PMID:27135601
Menzerath-Altmann law in mammalian exons reflects the dynamics of gene structure evolution.
Nikolaou, Christoforos
2014-12-01
Genomic sequences exhibit self-organization properties at various hierarchical levels. One such is the gene structure of higher eukaryotes with its complex exon/intron arrangement. Exon sizes and exon numbers in genes have been shown to conform to a law derived from statistical linguistics and formulated by Menzerath and Altmann, according to which the mean size of the constituents of an entity is inversely related to the number of these constituents. We herein perform a detailed analysis of this property in the complete exon set of the mouse genome in correlation to the sequence conservation of each exon and the transcriptional complexity of each gene locus. We show that extensive linear fits, representative of accordance to Menzerath-Altmann law are restricted to a particular subset of genes that are formed by exons under low or intermediate sequence constraints and have a small number of alternative transcripts. Based on this observation we propose a hypothesis for the law of Menzerath-Altmann in mammalian genes being predominantly due to genes that are more versatile in function and thus, more prone to undergo changes in their structure. To this end we demonstrate one test case where gene categories of different functionality also show differences in the extent of conformity to Menzerath-Altmann law. Copyright © 2014 Elsevier Ltd. All rights reserved.
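The inverse relation stated in the abstract above can be illustrated with a minimal sketch. It assumes a simplified power-law form, mean exon size = a * (exon count)^b, fitted by ordinary least squares on log-transformed values; the abstract does not give the exact functional form or fitting procedure used in the study, and the function name and toy data below are hypothetical.

```python
import math

def fit_power_law(exon_counts, mean_exon_sizes):
    """Fit log(y) = log(a) + b*log(x) by least squares; return (a, b).

    Assumption: a simplified power-law form of the Menzerath-Altmann law.
    A negative b means mean exon size falls as exon count rises.
    """
    xs = [math.log(x) for x in exon_counts]
    ys = [math.log(y) for y in mean_exon_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Toy data generated from y = 1000 * x**-0.5 (illustrative only, not from the study).
toy_counts = [2, 4, 8, 16, 32]
toy_sizes = [1000 * c ** -0.5 for c in toy_counts]
a, b = fit_power_law(toy_counts, toy_sizes)
```

On real exon data the quality of the fit, not just the sign of b, would be the quantity of interest, since the abstract reports that good fits are restricted to a subset of genes.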
NASA Astrophysics Data System (ADS)
Basyuni, M.; Wati, R.
2017-01-01
This study describes bioinformatics methods used to analyze seven oxidosqualene cyclase (OSC) genes from mangrove plants deposited in DDBJ/EMBL/GenBank and to predict their structure, composition, similarity, subcellular localization and phylogeny. The physical and chemical properties of the seven mangrove OSCs varied among the genes. The proportions of secondary structure elements in the seven mangrove OSCs followed the order α-helix > random coil > extended chain. The predicted values for chloroplast transit peptide and signal peptide were too low, indicating that the mangrove OSCs have no chloroplast transit peptide or secretory pathway signal peptide. The mitochondrial target peptide value varied from 0.163 to 0.430, indicating that such a peptide may exist. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove OSC genes. To clarify the relationships among the mangrove OSC genes, a phylogenetic tree was constructed. The tree shows three clusters: Kandelia KcMS groups with Bruguiera BgLUS, Rhizophora RsM1 is close to Bruguiera BgbAS, and Rhizophora RcCAS groups with Kandelia KcCAS. The present study therefore supports previous results that plant OSC genes form distinct clusters in the tree.
Characterization of the Bm61 of the Bombyx mori nucleopolyhedrovirus.
Shen, Hongxing; Chen, Keping; Yao, Qin; Zhou, Yang
2009-07-01
orf61 (bm61) of Bombyx mori nucleopolyhedrovirus (BmNPV) is a highly conserved baculovirus gene, suggesting that it plays an important role in the virus life cycle, but its function is unknown. In this study, we describe the characterization of bm61. Quantitative polymerase chain reaction (qPCR) and western blot analysis demonstrated that bm61 was expressed as a late gene. Immunofluorescence analysis by confocal microscopy showed that BM61 protein was localized on the nuclear membrane and in the intranuclear ring zone of infected cells. Western blot analysis of BV and ODV demonstrated that BM61 is a structural protein of both BV and ODV. In addition, our data indicated that BM61 is a late structural protein localized in the nucleus.
Web application for automatic prediction of gene translation elongation efficiency.
Sokolov, Vladimir; Zuraev, Bulat; Lashin, Sergei; Matushkin, Yury
2015-09-03
Expression efficiency is one of the major characteristics describing genes in various modern investigations. Expression efficiency of genes is regulated at various stages: transcription, translation, posttranslational protein modification and others. In this study, the EloE (Elongation Efficiency) web application is described. EloE sorts an organism's genes in descending order of their theoretical rate of translation elongation, based on analysis of their nucleotide sequences. The theoretical data obtained correlate significantly with available experimental gene expression data in various organisms. In addition, the program identifies preferential codons in the organism's genes and determines the energy distribution of potential secondary structures in the 5´ and 3´ regions of mRNA. EloE can be useful for preliminary estimation of translation elongation efficiency for genes for which experimental data are not yet available. Some results can be used, for instance, in other programs that model artificial genetic structures in genetic engineering experiments.
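The abstract above does not describe how EloE identifies preferential codons, so the following is only a rough sketch of the most basic ingredient such an analysis would need: tallying in-frame codons of a coding sequence. The function name and the toy sequence are hypothetical and are not taken from EloE.

```python
from collections import Counter

def codon_counts(cds):
    """Count in-frame codons of a coding sequence.

    Reading starts at position 0; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

# Toy coding sequence (illustrative only): ATG GCT GCT GCC TAA
toy_cds = "ATGGCTGCTGCCTAA"
counts = codon_counts(toy_cds)
most_used = counts.most_common(1)[0]  # the single most frequent codon and its count
```

A fuller treatment would group codons by the amino acid they encode and compare frequencies within each synonymous group, which this sketch does not attempt.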
Chen, Y M; Zhu, Y; Lin, E C
1987-12-01
In Escherichia coli the six known genes specifying the utilization of L-fucose as carbon and energy source cluster at 60.2 min and constitute a regulon. These genes include fucP (encoding L-fucose permease), fucI (encoding L-fucose isomerase), fucK (encoding L-fuculose kinase), fucA (encoding L-fuculose 1-phosphate aldolase), fucO (encoding L-1,2-propanediol oxidoreductase), and fucR (encoding the regulatory protein). In this study the fuc genes were cloned and their positions on the chromosome were established by restriction endonuclease and complementation analyses. Clockwise, the gene order is: fucO-fucA-fucP-fucI-fucK-fucR. The operons comprising the structural genes and the direction of transcription were determined by complementation analysis and Southern blot hybridization. The fucPIK and fucA operons are transcribed clockwise. The fucO operon is transcribed counterclockwise. The fucR gene product activates the three structural operons in trans.
Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada
2015-07-20
Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Niu, Xin; Guan, Yuxiang; Chen, Shoukun; Li, Haifeng
2017-08-15
As a superfamily of transcription factors (TFs), the basic helix-loop-helix (bHLH) proteins have been characterized functionally in many plants and play a vital role in the regulation of diverse biological processes, including growth, development, and response to various stresses. However, no systematic analysis of the bHLH TFs has been reported in Brachypodium distachyon, an emerging model plant in Poaceae. A total of 146 bHLH TFs were identified in the Brachypodium distachyon genome and classified into 24 subfamilies. BdbHLHs in the same subfamily share similar protein motifs and gene structures. Gene duplication events showed a close relationship to rice, maize and sorghum, and segmental duplications might play a key role in the expansion of this gene family. The amino acid sequences of the bHLH domains were highly conserved, especially Leu-27 and Leu-54. Based on the predicted binding activities, the BdbHLHs were divided into DNA-binding and non-DNA-binding types. According to the gene ontology (GO) analysis, BdbHLHs were predicted to function as homodimers or heterodimers. By integrating the available high-throughput data in public databases with the results of quantitative RT-PCR, we found that the expression profiles of BdbHLHs differed, implying differentiated functions. One hundred forty-six BdbHLHs were identified and their conserved domains, sequence features, phylogenetic relationships, chromosomal distribution, GO annotations, gene structures, gene duplication and expression profiles were investigated. Our findings lay a foundation for further evolutionary and functional elucidation of BdbHLH genes.
1985-01-01
We have determined the DNA sequence of a gene encoding a thymus leukemia (TL) antigen in the BALB/c mouse, and have more definitively mapped the cloned BALB/c Tla-region class I gene clusters. Analysis of the sequence shows that the Tla gene is less closely related to the H-2 genes than H-2 genes are to one another or to Qa-2,3-region genes. The Tla gene, 17.3A, contains an apparent gene conversion. Comparison of the BALB/c Tla genes with those from C57BL shows that BALB/c has more Tla-region class I genes, and that one of the genes absent in C57BL is gene 17.3A. PMID:3894562
Lee, Soon Goo; Krishnan, Hari B; Jez, Joseph M
2014-04-29
The symbiosis between rhizobial microbes and host plants involves the coordinated expression of multiple genes, which leads to nodule formation and nitrogen fixation. As part of the transcriptional machinery for nodulation and symbiosis across a range of Rhizobium, NolR serves as a global regulatory protein. Here, we present the X-ray crystal structures of NolR in the unliganded form and complexed with two different 22-base pair (bp) double-stranded operator sequences (oligos AT and AA). Structural and biochemical analysis of NolR reveals protein-DNA interactions with an asymmetric operator site and defines a mechanism for conformational switching of a key residue (Gln56) to accommodate variation in target DNA sequences from diverse rhizobial genes for nodulation and symbiosis. This conformational switching alters the energetic contributions to DNA binding without changes in affinity for the target sequence. Two possible models for the role of NolR in the regulation of different nodulation and symbiosis genes are proposed. To our knowledge, these studies provide the first structural insight on the regulation of genes involved in the agriculturally and ecologically important symbiosis of microbes and plants that leads to nodule formation and nitrogen fixation.
[Hsp70 Genes of the Megaphragma amalphitanum (Hymenoptera: Trichogrammatidae) Parasitic Wasp].
Chuvakova, L N; Sharko, F S; Nedoluzhko, A V; Polilov, A A; Prokhorchuk, E B; Skryabin, K G; Evgen'ev, M B
2017-01-01
Miniaturization is an evolutionary process that is widely represented in both invertebrates and vertebrates. Miniaturization frequently affects not only the size of the organism and its constituent cells, but also changes the genome structure and functioning. The structure of the main heat shock genes (hsp70 and hsp83) was studied in one of the smallest insects, the Megaphragma amalphitanum (Hymenoptera: Trichogrammatidae) parasitic wasp, which is comparable in size with unicellular organisms. An analysis of the sequenced genome has detected six genes that relate to the hsp70 family, some of which are apparently induced upon heat shock. Both induced and constitutively expressed hsp70 genes contain a large number of introns, which is not typical for the genes of this family. Moreover, none of the found genes form clusters, and they are all very heterogeneous (individual copies are only 75-85% identical), which indicates the absence of gene conversion, which provides the identity of genes of this family in Drosophila and other organisms. Two hsp83 genes, one of which contains an intron, have also been found in the M. amalphitanum genome.
Genome sequence, comparative analysis and haplotype structure of the domestic dog.
Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S
2005-12-08
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Lima, Luanne Helena Augusto; Pinheiro, Cristiano Guimarães do Amaral; de Moraes, Lídia Maria Pepe; de Freitas, Sonia Maria; Torres, Fernando Araripe Gonçalves
2006-12-01
Yeasts can metabolize xylose by the action of two key enzymes: xylose reductase and xylitol dehydrogenase. In this work, we present data concerning the cloning of the XYL2 gene encoding xylitol dehydrogenase from the yeast Candida tropicalis. The gene is present as a single copy in the genome and is controlled at the transcriptional level by the presence of the inducer xylose. XYL2 was functionally tested by heterologous expression in Saccharomyces cerevisiae to develop a yeast strain capable of producing ethanol from xylose. Structural analysis of C. tropicalis xylitol dehydrogenase, Xyl2, suggests that it is a member of the medium-chain dehydrogenase (MDR) family. This is supported by the presence of the amino acid signature [GHE]xx[G]xxxxx[G]xx[V] in its primary sequence and a typical alcohol dehydrogenase Rossmann fold pattern composed of NAD(+) and zinc ion binding domains.
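A signature such as [GHE]xx[G]xxxxx[G]xx[V] can be searched for with a regular expression. The sketch below assumes the usual reading of this notation, in which square brackets list the allowed residues at a position and x stands for any residue; the function name and the toy sequences are hypothetical and are not the Xyl2 sequence.

```python
import re

# [GHE]xx[G]xxxxx[G]xx[V] written as a regular expression over one-letter
# amino acid codes (assumption: x means any single residue).
SIGNATURE = re.compile(r"[GHE][A-Z]{2}G[A-Z]{5}G[A-Z]{2}V")

def find_signature(protein_seq):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in SIGNATURE.finditer(protein_seq.upper())]

# Toy sequences (illustrative only).
hits = find_signature("MKAHAAGAAAAAGAAVTT")
no_hits = find_signature("MKTAYIAKQR")
```

Because `finditer` reports non-overlapping matches only, overlapping occurrences of the signature would need a lookahead-based pattern instead.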
Roosendaal, E; Jacobs, A A; Rathman, P; Sondermeyer, C; Stegehuis, F; Oudega, B; de Graaf, F K
1987-09-01
Analysis of the nucleotide sequence of the distal part of the fan gene cluster encoding the proteins involved in the biosynthesis of the fibrillar adhesin, K99, revealed the presence of two structural genes, fanG and fanH. The amino acid sequence of the gene products (FanG and FanH) showed significant homology to the amino acid sequence of the fibrillar subunit protein (FanC). Introduction of a site-specific frameshift mutation in fanG or fanH resulted in a simultaneous decrease in fibrillae production and adhesive capacity. Analysis of subcellular fractions showed that, in contrast to the K99 fibrillar subunit (FanC), both the FanH and the FanG protein were loosely associated with the outer membrane, possibly on the periplasmic side, but were not components of the fimbriae themselves.
Ślipiko, Monika; Buczkowska-Chmielewska, Katarzyna; Bączkiewicz, Alina; Szczecińska, Monika; Sawicki, Jakub
2017-01-01
Liverwort mitogenomes are considered to be evolutionarily stable. A comparative analysis of four Calypogeia species revealed differences compared to previously sequenced liverwort mitogenomes. Such differences involve unexpected structural changes in the two genes, cox1 and atp1, which have lost three and two introns, respectively. The group I introns in the cox1 gene are proposed to have been lost by two-step localized retroprocessing, whereas one-step retroprocessing could be responsible for the disappearance of the group II introns in the atp1 gene. These cases represent the first identified losses of introns in mitogenomes of leafy liverworts (Jungermanniopsida) contrasting the stability of mitochondrial gene order with certain changes in the gene content and intron set in liverworts. PMID:29257096
Bioinformatic Analysis of Strawberry GSTF12 Gene
NASA Astrophysics Data System (ADS)
Wang, Xiran; Jiang, Leiyu; Tang, Haoru
2018-01-01
GSTF12 is known as a key factor in proanthocyanin accumulation in the plant testa. Bioinformatic analysis of the nucleotide and encoded protein sequences of GSTF12 can aid the study of genes related to the anthocyanin biosynthesis and accumulation pathway. We therefore chose the GSTF12 genes of 11 species, downloaded their nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene by bioinformatic analysis, and constructed a phylogenetic tree. We also analysed the physical and chemical properties of the strawberry GSTF12 gene and its protein structure. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction indicated that the protein has one signal peptide and no obvious transmembrane regions.
The complete mitochondrial genome sequence of the maned wolf (Chrysocyon brachyurus).
Zhao, Chao; Yang, Xiufeng; Zhang, Honghai; Zhang, Jin; Chen, Lei; Sha, Weilai; Liu, Guangshuai
2016-01-01
In this study, the complete mitochondrial genome of the maned wolf (Chrysocyon brachyurus), the only species in the genus Chrysocyon, was sequenced and reported for the first time using blood samples obtained from a female individual in Shanghai Zoo, China. Sequence analysis showed that the genome structure was in accordance with that of other Canidae species and that it contained a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region.
Cao, Yunpeng; Han, Yahui; Li, Dahui; Lin, Yi; Cai, Yongping
2016-01-01
In plants, 4-coumarate:coenzyme A ligases (4CLs), comprising some of the adenylate-forming enzymes, are key enzymes involved in regulating lignin metabolism and the biosynthesis of flavonoids and other secondary metabolites. Although several 4CL-related proteins were shown to play roles in secondary metabolism, no comprehensive study on 4CL-related genes in the pear and other Rosaceae species has been reported. In this study, we identified 4CL-related genes in the apple, peach, yangmei, and pear genomes using DNATOOLS software and inferred their evolutionary relationships using phylogenetic analysis, collinearity analysis, conserved motif analysis, and structure analysis. A total of 149 4CL-related genes in four Rosaceous species (pear, apple, peach, and yangmei) were identified, with 30 members in the pear. We explored the functions of several 4CL and acyl-coenzyme A synthetase (ACS) genes during the development of pear fruit by quantitative real-time PCR (qRT-PCR). We found that duplication events had occurred in the 30 4CL-related genes in the pear. These duplicated 4CL-related genes are distributed unevenly across all pear chromosomes except chromosomes 4, 8, 11, and 12. The results of this study provide a basis for further investigation of both the functions and evolutionary history of 4CL-related genes. PMID:27775579
Cao, Yunpeng; Han, Yahui; Li, Dahui; Lin, Yi; Cai, Yongping
2016-10-19
In plants, 4-coumarate:coenzyme A ligases (4CLs), comprising some of the adenylate-forming enzymes, are key enzymes involved in regulating lignin metabolism and the biosynthesis of flavonoids and other secondary metabolites. Although several 4CL-related proteins were shown to play roles in secondary metabolism, no comprehensive study on 4CL-related genes in the pear and other Rosaceae species has been reported. In this study, we identified 4CL-related genes in the apple, peach, yangmei, and pear genomes using DNATOOLS software and inferred their evolutionary relationships using phylogenetic analysis, collinearity analysis, conserved motif analysis, and structure analysis. A total of 149 4CL-related genes in four Rosaceous species (pear, apple, peach, and yangmei) were identified, with 30 members in the pear. We explored the functions of several 4CL and acyl-coenzyme A synthetase (ACS) genes during the development of pear fruit by quantitative real-time PCR (qRT-PCR). We found that duplication events had occurred in the 30 4CL-related genes in the pear. These duplicated 4CL-related genes are distributed unevenly across all pear chromosomes except chromosomes 4, 8, 11, and 12. The results of this study provide a basis for further investigation of both the functions and evolutionary history of 4CL-related genes.
Liu, Bin; Liu, Xingwang; Liu, Ying; Xue, Shudan; Cai, Yanling; Yang, Sen; Dong, Mingming; Zhang, Yaqi; Liu, Huiling; Zhao, Binyu; Qi, Changhong; Zhu, Ning; Ren, Huazhong
2016-01-01
Cucumber (Cucumis sativus L.) is threatened by substantial yield losses due to the southern root-knot nematode (Meloidogyne incognita). However, understanding of the molecular mechanisms underlying the process of nematode infection is still limited. In this study, we found that M. incognita infection affected the structure of cells in cucumber roots and that treatment with the cytoskeleton inhibitor cytochalasin D reduced root-knot nematode (RKN) parasitism. It is known that Actin-Depolymerizing Factor (ADF) affects cell structure, as well as the organization of the cytoskeleton. To address the hypothesis that nematode-induced abnormal cell structures and cytoskeletal rearrangements might be mediated by the ADF genes, we identified and characterized eight cucumber ADF (CsADF) genes. Phylogenetic analysis showed that the cucumber ADF gene family is grouped into four ancient subclasses. Expression analysis revealed that CsADF1, CsADF2-1, CsADF2-2, CsADF2-3 (Subclass I), and CsADF6 (Subclass III) have higher transcript levels than CsADF7-1, CsADF7-2 (Subclass II genes), and CsADF5 (Subclass IV) in roots. Members of subclass I genes (CsADF1, CsADF2-1, CsADF2-2, and CsADF2-3), with the exception of CsADF2-1, exhibited an induction of expression in roots 14 days after inoculation (DAI) with nematodes. However, the expression of subclass II genes (CsADF7-1 and CsADF7-2) showed no significant change after inoculation. The transcript levels of CsADF6 (Subclass III) showed a specific induction at 21 DAI, while CsADF5 (Subclass IV) was weakly expressed in roots, but was strongly up-regulated as early as 7 DAI. In addition, treatment of roots with cytochalasin D caused an approximately 2-fold down-regulation of the CsADF genes in the treated plants. These results suggest that CsADF gene mediated actin dynamics are associated with structural changes in roots as a consequence of M. incognita infection. PMID:27695469
Yang, Z Q; Chen, H; Tan, J H; Xu, H L; Jia, J; Feng, Y H
2016-12-23
Pinus massoniana Lamb. is an important timber and turpentine-producing tree species in China. Dendrolimus punctatus and Dasychira axutha are leaf-eating pests that have harmful effects on P. massoniana production. Few studies have focused on the molecular mechanisms underlying pest resistance in P. massoniana. Based on sequencing analysis of the transcriptomes of insect-resistant P. massoniana, three key genes involved in the flavonoid metabolic pathway were identified in the present study (PmF3H, PmF3'5'H, and PmC4H). Structural domain analysis showed that the PmF3H gene contains typical binding sites for the 2OG-Fe (II) oxygenase superfamily, while PmF3'5'H and PmC4H both contain the cytochrome P450 structural domain, which is specific for P450 enzymes. Phylogenetic analysis showed that each of the three P. massoniana genes, and the homologous genes in gymnosperms, clustered into a group. Expression of these three genes was highest in the stems, and was higher in the insect-resistant P. massoniana varieties than in the controls. The extent of the increased expression in the insect-resistant P. massoniana varieties indicated that these three genes are involved in defense mechanisms against pests in this species. In the insect-resistant varieties, rapid induction of PmF3H increased the levels of PmF3'5'H and PmC4H expression. The enhanced anti-pest capability of the insect-resistant varieties could be related to temperature and humidity. In addition, these results suggest that these three genes may contribute to the change in flower color during female cone development.
Wang, Wenzhao; Zhou, Yihui; Wu, Yingling; Dai, Xinlong; Liu, Yajun; Qian, Yumei; Li, Mingzhuo; Jiang, Xiaolan; Wang, Yunsheng; Gao, Liping; Xia, Tao
2018-04-25
Tea is an important economic crop with a 3.02 Gb genome. It accumulates various bioactive compounds, especially catechins, which are closely associated with tea flavor and quality. Catechins are biosynthesized through the phenylpropanoid and flavonoid pathways, with 12 structural genes being involved in their synthesis. However, the basic profile of catechin biosynthesis in Camellia sinensis is still unclear. The gene structure, locus, transcript number, transcriptional variation, and function of multigene families have not yet been clarified. Our previous studies demonstrated that the accumulation of flavonoids in tea is species, tissue, and induction specific, which indicates that gene coexpression patterns may be involved in tea catechin and flavonoid biosynthesis. In this paper, we screened candidate genes of multigene families involved in the phenylpropanoid and flavonoid pathways based on an analysis of genome and transcriptome sequence data. The authenticity of candidate genes was verified by PCR cloning, and their function was validated by reverse genetic methods. In the present study, 36 genes from 12 gene families were identified and were accessed in the NCBI database. During this process, some intron retention events of the CsCHI and CsDFR genes were found. Furthermore, the transcriptome sequencing of various tea tissues and subcellular location assays revealed coexpression and colocalization patterns. The correlation analysis showed that CsCHIc, CsF3'H, and CsANRb expression levels are significantly associated with the concentration of soluble PA, and the expression levels of CsPALc and CsPALf with the concentration of insoluble PA. This work provides insights into catechin metabolism in tea and provides a foundation for future studies.
Genome-Wide Identification of the Invertase Gene Family in Populus.
Chen, Zhong; Gao, Kai; Su, Xiaoxing; Rao, Pian; An, Xinmin
2015-01-01
Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to neutral/alkaline sub-family expansion. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials.
Genome-Wide Identification of the Invertase Gene Family in Populus
Su, Xiaoxing; Rao, Pian; An, Xinmin
2015-01-01
Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to neutral/alkaline sub-family expansion. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials. PMID:26393355
Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D
2015-02-01
The DNA binding with One Finger (Dof) protein is a plant-specific transcription factor involved in the regulation of a wide range of processes. Analysis of the whole genome sequence of pigeonpea identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 out of 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family, including gene structure, chromosome location, protein motifs, phylogeny, gene duplication and functional divergence, has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree showed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with a predominance of light-responsive and stress-responsive elements, indicating that these CcDof genes may be associated with photoperiodic control and with biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the phylogenetic tree constructed. Our study provides useful information for functional characterization of CcDof genes.
Solov'ev, V V; Kel', A E; Kolchanov, N A
1989-01-01
The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. An interesting property of the genetic code was revealed in the analysis of symmetrical repeats: pairs of symmetrical codons corresponded to pairs of amino acids with mostly similar physical-chemical parameters. This property may explain the presence of symmetrical repeats and palindromes only in genes coding for beta-structural proteins-polypeptides, where amino acids with similar physical-chemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences was used for the analysis of inverted repeats. The modelling demonstrated that a constraint on the sequences (uneven frequencies of used codons) alone is enough for nonrandom inverted repeats to arise in genes.
Structural and expressional analysis of the B-hordein genes in Tibetan hull-less barley
USDA-ARS's Scientific Manuscript database
The B-hordein gene family was analyzed from two Tibetan hull-less barley cultivars, Z09 and Z26 (Hordeum vulgare subsp. vulgare). Fourteen B-hordein genes, designated BZ09-2 to BZ09-5 (from Z09) and BZ26-1 to BZ26-10 (from Z26), were sequenced. Seven of them similar to a previously reported BZ09-1...
Mutational Analysis of Cell Types in Tuberous Sclerosis Complex (TSC)
2007-01-01
disorder resulting from mutations in the TSC1 or TSC2 genes that is associated with epilepsy, cognitive disability, and autism. TSC1/TSC2 gene mutations...cognitive disability, and autism. TSC1/TSC2 gene mutations lead to developmental alterations in brain structure known as tubers in over 80% of TSC...TSC (Sparagana and Roach, 2000). Comorbid neuropsychological disorders such as autism, mental retardation (MR), pervasive developmental disorder
Weier, Heinz -Ulrich G
2015-08-04
Herein are described multicolor FISH probe sets termed "genetic barcodes" targeting several cancer or disease-related loci to assess gene rearrangements and copy number changes in tumor cells. Two, three or more different fluorophores are used to detect the genetic barcode sections thus permitting unique labeling and multilocus analysis in individual cell nuclei. Gene specific barcodes can be generated and combined to provide both numerical and structural genetic information for these and other pertinent disease associated genes.
Ragusa, Maria Antonietta; Nicosia, Aldo; Costa, Salvatore; Cuttitta, Angela; Gianguzza, Fabrizio
2017-01-01
Metallothioneins (MT) are small and cysteine-rich proteins that bind metal ions such as zinc, copper, cadmium, and nickel. In order to shed some light on MT gene structure and evolution, we cloned seven Paracentrotus lividus MT genes, comparing them to Echinodermata and Chordata genes. Moreover, we performed a phylogenetic analysis of 32 MTs from different classes of echinoderms and 13 MTs from the most ancient chordates, highlighting the relationships between them. Since MTs have multiple roles in the cells, we performed RT-qPCR and in situ hybridization experiments to understand better MT functions in sea urchin embryos. Results showed that the expression of MTs is regulated throughout development in a cell type-specific manner and in response to various metals. The MT7 transcript is expressed in all tissues, especially in the stomach and in the intestine of the larva, but it is less metal-responsive. In contrast, MT8 is ectodermic and rises only at relatively high metal doses. MT5 and MT6 expression is highly stimulated by metals in the mesenchyme cells. Our results suggest that the P. lividus MT family originated after the speciation events by gene duplications, evolving developmental and environmental sub-functionalization. PMID:28417916
Parker, Craig T.; Gilbert, Michel; Yuki, Nobuhiro; Endtz, Hubert P.; Mandrell, Robert E.
2008-01-01
The lipooligosaccharide (LOS) biosynthesis region is one of the more variable genomic regions between strains of Campylobacter jejuni. Indeed, eight classes of LOS biosynthesis loci have been established previously based on gene content and organization. In this study, we characterize additional classes of LOS biosynthesis loci and analyze various mechanisms that result in changes to LOS structures. To gain further insights into the genomic diversity of C. jejuni LOS biosynthesis region, we sequenced the LOS biosynthesis loci of 15 strains that possessed gene content that was distinct from the eight classes. This analysis identified 11 new classes of LOS loci that exhibited examples of deletions and insertions of genes and cassettes of genes found in other LOS classes or capsular biosynthesis loci leading to mosaic LOS loci. The sequence analysis also revealed both missense mutations leading to “allelic” glycosyltransferases and phase-variable and non-phase-variable gene inactivation by the deletion or insertion of bases. Specifically, we demonstrated that gene inactivation is an important mechanism for altering the LOS structures of strains possessing the same class of LOS biosynthesis locus. Together, these observations suggest that LOS biosynthesis region is a hotspot for genetic exchange and variability, often leading to changes in the LOS produced. PMID:18556784
Pydiura, Nikolay; Pirko, Yaroslav; Galinousky, Dmitry; Postovoitova, Anastasiia; Yemets, Alla; Kilchevsky, Aleksandr; Blume, Yaroslav
2018-06-08
Flax (Linum usitatissimum L.) is a valuable food and fiber crop cultivated for its quality fiber and seed oil. α-, β-, γ-tubulins and actins are the main structural proteins of the cytoskeleton. α- and γ-tubulin and actin genes have not been characterized yet in the flax genome. In this study, we have identified 6 α-tubulin genes, 13 β-tubulin genes, 2 γ-tubulin genes, and 15 actin genes in the flax genome and analysed the phylogenetic relationships between flax and A. thaliana tubulin and actin genes. Six α-tubulin genes are represented by 3 paralogous pairs; among 13 β-tubulin genes 7 different isotypes can be distinguished, 6 of which are encoded by two paralogous genes each. γ-tubulin is represented by a paralogous pair of genes, one of which may not be functional. Fifteen actin genes represent 7 paralogous pairs - 7 actin isotypes and a sequentially duplicated copy of one of the genes of one of the isotypes. Exon-intron structure analysis has shown intron length polymorphism within the β-tubulin genes and intron number variation among the α-tubulin genes: 3 or 4 introns are found in two or four genes, respectively. Intron positioning occurs at conserved sites, as observed in numerous other plant species. Flax actin genes show both intron length polymorphisms and variation in the number of introns, which may be 2 or 3. These data will be useful to support further studies on the specificity, functioning, regulation and evolution of the flax cytoskeleton proteins.
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrices for understanding correlation structure, inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
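To make "large-scale simultaneous tests" concrete, the sketch below applies the Benjamini-Hochberg step-up procedure, one widely used way to control the false discovery rate when many genes are tested at once. The abstract does not say which procedures the paper reviews, so the choice of Benjamini-Hochberg, the function name, and the p-values are assumptions made purely for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, and reject the k smallest p-values.
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Largest rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k_max = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four genes (illustrative only).
example_p = [0.039, 0.001, 0.212, 0.008]
example_calls = benjamini_hochberg(example_p, q=0.05)
```

With these made-up inputs, the second and fourth genes are called significant, because only the two smallest p-values fall under their rank-specific thresholds.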
Rossmassler, Karen; Dietrich, Carsten; Thompson, Claire; Mikaelyan, Aram; Nonoh, James O; Scheffrahn, Rudolf H; Sillam-Dussès, David; Brune, Andreas
2015-11-26
Termites are important contributors to carbon and nitrogen cycling in tropical ecosystems. Higher termites digest lignocellulose in various stages of humification with the help of an entirely prokaryotic microbiota housed in their compartmented intestinal tract. Previous studies revealed fundamental differences in community structure between compartments, but the functional roles of individual lineages in symbiotic digestion are mostly unknown. Here, we conducted a highly resolved analysis of the gut microbiota in six species of higher termites that feed on plant material at different levels of humification. Combining amplicon sequencing and metagenomics, we assessed similarities in community structure and functional potential between the major hindgut compartments (P1, P3, and P4). Cluster analysis of the relative abundances of orthologous gene clusters (COGs) revealed high similarities among wood- and litter-feeding termites and strong differences to humivorous species. However, abundance estimates of bacterial phyla based on 16S rRNA genes greatly differed from those based on protein-coding genes. Community structure and functional potential of the microbiota in individual gut compartments are clearly driven by the digestive strategy of the host. The metagenomics libraries obtained in this study provide the basis for future studies that elucidate the fundamental differences in the symbiont-mediated breakdown of lignocellulose and humus by termites of different feeding groups. The high proportion of uncultured bacterial lineages in all samples calls for a reference-independent approach for the correct taxonomic assignment of protein-coding genes.
Milhausen, M; Gill, P R; Parker, G; Agabian, N
1982-01-01
Immunoprecipitation of Caulobacter crescentus polyribosomes with antiflagellin antibody provided RNA for the synthesis of cDNA probes that were used to identify three specific EcoRI restriction fragments (6.8, 10, and 22 kilobases) in genomic digests of Caulobacter DNA. The RNA was present only in polyribosomes isolated from a time interval in the Caulobacter cell cycle that was coincident with flagellin polypeptide synthesis. The structural gene for Mr 27,500 flagellin polypeptide was assigned to a region of the 10-kilobase EcoRI restriction fragment by DNA sequence analysis. Analysis of mutants defective in motility further established a correlation between the Mr 27,500 flagellin gene and the flaE gene locus [Johnson, R. C. & Ely, B. (1979) J. Bacteriol. 137, 627-634]. The other EcoRI fragments that hybridize with the immunoprecipitated polyribosome-derived cDNA probe are also temporally regulated and have features that suggest they encode other polypeptides associated with the flagellum. Modifications were required to adapt the procedure of immunoprecipitation of polyribosomes for use with Caulobacter and should be applicable to the production of specific structural gene probes from other prokaryotic systems. PMID:6294658
Mesnage, Robin; Arno, Matthew; Costanzo, Manuela; Malatesta, Manuela; Séralini, Gilles-Eric; Antoniou, Michael N
2015-08-25
Glyphosate-based herbicides (GBH) are the major pesticides used worldwide. Converging evidence suggests that GBH, such as Roundup, pose a particular health risk to liver and kidneys although low environmentally relevant doses have not been examined. To address this issue, a 2-year study in rats administering 0.1 ppb Roundup (50 ng/L glyphosate equivalent) via drinking water (giving a daily intake of 4 ng/kg bw/day of glyphosate) was conducted. A markedly increased incidence of anatomorphological and blood/urine biochemical changes was indicative of liver and kidney structure and functional pathology. In order to confirm these findings we have conducted a transcriptome microarray analysis of the liver and kidneys from these same animals. The expression of 4224 and 4447 transcript clusters (a group of probes corresponding to a known or putative gene) was found to be altered in liver and kidney, respectively (p < 0.01, q < 0.08). Changes in gene expression varied from -3.5 to 3.7 fold in liver and from -4.3 to 5.3 in kidneys. Among the 1319 transcript clusters whose expression was altered in both tissues, ontological enrichment in 3 functional categories among 868 genes was found. First, genes involved in mRNA splicing and small nucleolar RNA were mostly upregulated, suggesting disruption of normal spliceosome activity. Electron microscopic analysis of hepatocytes confirmed nucleolar structural disruption. Second, genes controlling chromatin structure (especially histone-lysine N-methyltransferases) were mostly upregulated. Third, genes related to respiratory chain complex I and the tricarboxylic acid cycle were mostly downregulated. Pathway analysis suggests a modulation of the mTOR and phosphatidylinositol signalling pathways.
Gene disturbances associated with the chronic administration of ultra-low dose Roundup reflect a liver and kidney lipotoxic condition and increased cellular growth that may be linked with regeneration in response to toxic effects causing damage to tissues. Observed alterations in gene expression were consistent with fibrosis, necrosis, phospholipidosis, mitochondrial membrane dysfunction and ischemia, which correlate with and thus confirm observations of pathology made at an anatomical, histological and biochemical level. Our results suggest that chronic exposure to a GBH in an established laboratory animal toxicity model system at an ultra-low, environmental dose can result in liver and kidney damage with potential significant health implications for animal and human populations.
Zhou, Ying; Zhou, Yu; Yang, Jie
2016-01-01
The GRAS gene family is one of the most important plant-specific gene families, which encodes transcriptional regulators and plays an essential role in plant development and physiological processes. The GRAS gene family has been well characterized in many higher plants such as Arabidopsis, rice, Chinese cabbage, tomato and tobacco. In this study, we identified 38 GRAS genes in sacred lotus (Nelumbo nucifera), analyzed their physical and chemical characteristics and performed phylogenetic analysis using the GRAS genes from eight representative plant species to show the evolution of GRAS genes in plants. In addition, the gene structures and motifs of the sacred lotus GRAS proteins were characterized in detail. Comparative analysis identified 42 orthologous and 9 co-orthologous gene pairs between sacred lotus and Arabidopsis, and 35 orthologous and 22 co-orthologous gene pairs between sacred lotus and rice. Based on publicly available RNA-seq data generated from leaf, petiole, rhizome and root, we found that most of the sacred lotus GRAS genes exhibited a tissue-specific expression pattern. Eight of the ten PAT1-clade GRAS genes, particularly NnuGRAS-05, NnuGRAS-10 and NnuGRAS-25, were preferentially expressed in rhizome and root. In summary, this is the first in silico analysis of the GRAS gene family in sacred lotus, which will provide valuable information for further molecular and biological analyses of this important gene family. PMID:27635351
Akhunov, Eduard D.; Sehgal, Sunish; Liang, Hanquan; Wang, Shichen; Akhunova, Alina R.; Kaur, Gaganpreet; Li, Wanlong; Forrest, Kerrie L.; See, Deven; Šimková, Hana; Ma, Yaqin; Hayden, Matthew J.; Luo, Mingcheng; Faris, Justin D.; Doležel, Jaroslav; Gill, Bikram S.
2013-01-01
Cycles of whole-genome duplication (WGD) and diploidization are hallmarks of eukaryotic genome evolution and speciation. Polyploid wheat (Triticum aestivum) has had a massive increase in genome size largely due to recent WGDs. How these processes may impact the dynamics of gene evolution was studied by comparing the patterns of gene structure changes, alternative splicing (AS), and codon substitution rates among wheat and model grass genomes. In orthologous gene sets, significantly more acquired and lost exonic sequences were detected in wheat than in model grasses. In wheat, 35% of these gene structure rearrangements resulted in frame-shift mutations and premature termination codons. An increased codon mutation rate in the wheat lineage compared with Brachypodium distachyon was found for 17% of orthologs. The discovery of premature termination codons in 38% of expressed genes was consistent with ongoing pseudogenization of the wheat genome. The rates of AS within the individual wheat subgenomes (21%–25%) were similar to diploid plants. However, we uncovered a high level of AS pattern divergence between the duplicated homeologous copies of genes. Our results are consistent with the accelerated accumulation of AS isoforms, nonsynonymous mutations, and gene structure rearrangements in the wheat lineage, likely due to genetic redundancy created by WGDs. Whereas these processes mostly contribute to the degeneration of a duplicated genome and its diploidization, they have the potential to facilitate the origin of new functional variations, which, upon selection in the evolutionary lineage, may play an important role in the origin of novel traits. PMID:23124323
Kathiravan, P; Goyal, S; Kataria, R S; Mishra, B P; Jayakumar, S; Joshi, B K
2011-01-01
The present study was undertaken to characterize the structure of S100A8 gene and its promoter in water buffalo and yak. Sequence data of 2.067 kb, 2.071 kb, and 2.052 kb with respect to complete S100A8 gene including 5' flanking region was generated in river buffalo, swamp buffalo, and yak, respectively. BLAST analysis of coding DNA sequences (CDS) of S100A8 gene revealed 95% homology of buffalo sequence with cattle, 85% with pig and horse, 83% with dog, 72-73% with murines, and around 79% with primates and humans. Phylogenetic analysis of predicted CDS revealed distinct clustering of murines, primates, and domestic animals with bovines and bubalines forming a subcluster among farm animals. In silico translation of predicted CDS revealed a sequence of 89 amino acids with 7 amino acid changes between cattle and buffalo and 2 changes between cattle and yak. The search for Pfam family revealed the N-terminal calcium binding domain and the noncanonical EF hand domain in the carboxy terminus, with more variations being observed in the N-terminal domain among different species. Two amino acid changes observed in carboxy terminal EF hand domain resulted in altered secondary structure of yak S100A8 protein. Analysis of S100A8 gene promoter revealed 14 putative motifs for transcriptional factor binding sites. Two putative motifs viz. C/EBP and v-Myb were found to be absent in swamp buffalo as compared to river buffalo and cattle. Differences in the structure of S100A8 protein and the transcriptional factor binding sites identified in the present study need to be analyzed further for their functional significance in yak and swamp buffalo respectively. Copyright © Taylor & Francis Group, LLC
Khattak, Naureen Aslam; Mir, Asif
2014-01-01
Mental retardation (MR)/intellectual disability (ID) is a neuro-developmental disorder characterized by a low intellectual quotient (IQ) and deficits in adaptive behavior related to everyday life tasks such as delayed language acquisition, social skills or self-help skills, with onset before age 18. To date, a few genes (PRSS12, CRBN, CC2D1A, GRIK2, TUSC3, TRAPPC9, TECR, ST3GAL3, MED23, MAN1B1, NSUN1) for autosomal-recessive forms of non-syndromic MR (NS-ARMR) have been identified and established in various families with ID. The recently reported candidate gene TRAPPC9 was selected for computational analysis to explore its potentially important role in pathology, as it is the only gene for ID reported in more than five different familial cases worldwide. YASARA (12.4.1) was utilized to generate three-dimensional structures of the candidate gene TRAPPC9. Hybrid structure prediction was employed. The crystal structure of a conserved metalloprotein from Bacillus cereus (3D19-C) was selected as the most suitable template using position-specific iterated BLAST (PSI-BLAST). Template (3D19-C) parameters were based on E-value, Z-score, resolution and quality score of 0.32, -1.152, 2.30 Å and 0.684, respectively. Model reliability showed 93.1% of residues placed in the most favored region with a quality factor of 96.684 and an overall G-factor of 0.20 (dihedrals 0.06 and covalent 0.39, respectively). Protein-protein docking analysis demonstrated that TRAPPC9 showed strong interactions of the amino acid residues S(253), S(251), Y(256), G(243), D(131) with R(105), Q(425), W(226), N(255), S(233) of its functional partner IKBKB. Protein-protein interacting residues could facilitate the exploration of structural and functional outcomes of wild type and mutated TRAPPC9 protein. Actively involved residues can be used to elucidate the binding properties of the protein, and to develop drug therapy for NS-ARMR patients.
Wang, Xiaojuan; Pan, Hongjia; Gu, Jie; Qian, Xun; Gao, Hua; Qin, Qingjun
2016-12-01
In this study, the effects of different concentrations of oxytetracycline (OTC) on biogas production, archaeal community structure, and the levels of tetracycline resistance genes (TRGs) were investigated in the anaerobic co-digestion products of pig manure and wheat straw. PCR denaturing gradient gel electrophoresis analysis and real-time quantitative polymerase chain reaction (RT-qPCR) were used to detect the archaeal community structure and the levels of four TRGs: tet(M), tet(Q), tet(W), and tet(C). The results showed that anaerobic co-digestion with OTC at concentrations of 60, 100, and 140 mg/kg (dry weight of pig manure) reduced the cumulative biogas production levels by 9.9%, 10.4%, and 14.1%, respectively, compared with that produced by the control, which lacked the antibiotic. The addition of OTC substantially modified the structure of the archaeal community. Two orders were identified by phylogenetic analysis, that is, Pseudomonadales and Methanomicrobiales, and the methanogen present during anaerobic co-digestion with OTC may have been resistant to OTC. The abundances of tet(Q) and tet(W) genes increased as the OTC concentration increased, whereas the abundances of tet(M) and tet(C) genes decreased as the OTC concentration increased.
Analysis of informational redundancy in the protein-assembling machinery
NASA Astrophysics Data System (ADS)
Berkovich, Simon
2004-03-01
Entropy analysis of the DNA structure does not reveal a significant departure from randomness, indicating a lack of informational redundancy. This signifies the absence of a hidden meaning in the genome text and supports the 'barcode' interpretation of DNA given in [1]. Lack of informational redundancy is a characteristic property of an identification label rather than of a message of instructions. Yet randomness of DNA has to induce non-random structures of the proteins. Protein synthesis is a two-step process: transcription into RNA with gene splicing and formation of a structure of amino acids. Entropy estimations, performed by A. Djebbari, show typical values of redundancy of the biomolecules along these pathways: DNA gene 4proteins 15-40in gene expression, the RNA copy carries the same information as the original DNA template. Randomness is essentially eliminated only at the step of the protein creation by a degenerate code. According to [1], the significance of the substitution of U for T with a subsequent gene splicing is that these transformations result in a different pattern of RNA oscillations, so the vital DNA communications are protected against extraneous noise coming from the protein making activities. 1. S. Berkovich, "On the 'barcode' functionality of DNA, or the Phenomenon of Life in the Physical Universe", Dorrance Publishing Co., Pittsburgh, 2003
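The kind of entropy estimate referred to above can be illustrated with a minimal Python sketch. It computes the Shannon entropy of overlapping k-mers and a redundancy value defined as 1 - H/H_max. This definition, the function names and the toy sequences are assumptions made here for illustration only; the abstract does not state which estimator was actually used.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (bits per k-mer) of the overlapping k-mers in seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq: str, alphabet_size: int = 4, k: int = 1) -> float:
    """Redundancy 1 - H/H_max, where H_max = k * log2(alphabet_size).

    0 means the k-mer distribution looks uniformly random;
    1 means it is fully predictable.
    """
    h_max = k * math.log2(alphabet_size)
    return 1.0 - shannon_entropy(seq, k) / h_max


if __name__ == "__main__":
    # Hypothetical toy sequences, not real genomic data.
    print(redundancy("ACGT" * 10))   # uniform base usage
    print(redundancy("AAAAAAAAAA"))  # a single repeated base
```

For proteins, the same functions could be called with `alphabet_size=20`. Estimates from short sequences are biased, so a sketch like this is only meaningful on long inputs.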
Chakrabarti, Kausik; Pearson, Michael; Grate, Leslie; Sterne-Weiler, Timothy; Deans, Jonathan; Donohue, John Paul; Ares, Manuel
2007-01-01
As the genomes of more eukaryotic pathogens are sequenced, understanding how molecular differences between parasite and host might be exploited to provide new therapies has become a major focus. Central to cell function are RNA-containing complexes involved in gene expression, such as the ribosome, the spliceosome, snoRNAs, RNase P, and telomerase, among others. In this article we identify by comparative genomics and validate by RNA analysis numerous previously unknown structural RNAs encoded by the Plasmodium falciparum genome, including the telomerase RNA, U3, 31 snoRNAs, as well as previously predicted spliceosomal snRNAs, SRP RNA, MRP RNA, and RNAse P RNA. Furthermore, we identify six new RNA coding genes of unknown function. To investigate the relationships of the RNA coding genes to other genomic features in related parasites, we developed a genome browser for P. falciparum (http://areslab.ucsc.edu/cgi-bin/hgGateway). Additional experiments provide evidence supporting the prediction that snoRNAs guide methylation of a specific position on U4 snRNA, as well as predicting an snRNA promoter element particular to Plasmodium sp. These findings should allow detailed structural comparisons between the RNA components of the gene expression machinery of the parasite and its vertebrate hosts. PMID:17901154
NASA Astrophysics Data System (ADS)
Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé
2006-08-01
The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families, make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
Cui, Zhouqi; Jin, Guoqiang; Li, Bin; Kakar, Kaleem Ullah; Ojaghian, Mohammad Reza; Wang, Yangli; Xie, Guanlin; Sun, Guochang
2015-01-01
Valine glycine repeat G (VgrG) proteins are regarded as one of two effectors of the Type VI secretion system (T6SS), which is a complex multi-component secretion system. In this study, potential biological roles of T6SS structural and VgrG genes in a rice bacterial pathogen, Acidovorax avenae subsp. avenae (Aaa) RS-1, were evaluated under seven stress conditions using principal component analysis of gene expression. The results showed that growth of the pathogen was reduced by H2O2 and paraquat-induced oxidative stress, high salt, low temperature, and vgrG mutation, compared to the control. However, pathogen growth was unaffected by co-culture with a rice rhizobacterium, Burkholderia seminalis R456. In addition, expression of 14 T6SS structural and eight vgrG genes was significantly changed under the seven conditions. Among the different stress conditions, high salt and low temperature had a greater effect on the expression of T6SS genes than host infection and other environmental conditions. As a first report, this study revealed an association of T6SS gene expression of the pathogen with host infection, gene mutation, and some common environmental stresses. The results of this research can increase understanding of the biological function of T6SS in this economically important pathogen of rice. PMID:26378528
Shang, Shuai; Zhong, Huaming; Wu, Xiaoyang; Wei, Qinguo; Zhang, Huanxin; Chen, Jun; Chen, Yao; Tang, Xuexi; Zhang, Honghai
2018-04-01
Toll-like receptors (TLRs) encoded by the TLR multigene family play an important role in initial pathogen recognition in vertebrates. Among the TLRs, TLR2 and TLR4 may be of particular importance to reptiles. In order to study the evolutionary patterns and structural characteristics of TLRs, we explored the available genomes of several representative reptiles. In total, 25 TLR2 genes and 19 TLR4 genes from reptiles were obtained in this study. Phylogenetic results showed that TLR2 gene duplication occurred in several species. Evolutionary analysis by at least two methods identified 30 and 13 common positively selected codons in TLR2 and TLR4, respectively. Most positively selected sites of TLR2 and TLR4 were located in the leucine-rich repeats (LRRs). Branch model analysis showed that TLR2 genes were under different evolutionary forces in reptiles, while the TLR4 genes showed no significant selection pressure. The different evolutionary adaptation of TLR2 and TLR4 among the reptiles might be due to their different functions in recognizing bacteria. Overall, we explored the structure and evolution of TLR2 and TLR4 genes in reptiles for the first time. Our study revealed valuable information regarding TLR2 and TLR4 in reptiles, and provided novel insights into the conservation concern of natural populations. Copyright © 2017 Elsevier B.V. All rights reserved.
Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang
2014-01-01
Background Salt stress interferes with plant growth and production. Plants have evolved a series of molecular and morphological adaptations to cope with this abiotic stress, and overexpression of salt response genes reportedly enhances the productivity of various crops. However, little is known about the salt-responsive genes in the energy plant physic nut (Jatropha curcas L.). Thus, identifying salt-responsive genes in this plant is informative for uncovering the molecular mechanisms of the salt response in physic nut. Methodology/Principal Findings We applied next-generation Illumina sequencing technology to analyze global gene expression profiles of physic nut plants (roots and leaves) 2 hours, 2 days and 7 days after the onset of salt stress. A total of 1,504 and 1,115 genes were significantly up- and down-regulated in roots and leaves, respectively, under salt stress conditions. Gene ontology (GO) analysis of physiological process revealed that, in the physic nut, many “biological processes” were affected by salt stress, particularly those categories belonging to “metabolic process”, such as “primary metabolism process”, “cellular metabolism process” and “macromolecule metabolism process”. The gene expression profiles indicated that the associated genes were responsible for ABA and ethylene signaling, osmotic regulation, the reactive oxygen species scavenging system and the cell structure in physic nut. Conclusions/Significance The major regulated genes detected in these transcriptomic data were related to trehalose synthesis and cell wall structure modification in roots, and to raffinose synthesis and reactive oxygen scavengers in leaves. The current study provides a comprehensive gene expression profile of physic nut under salt stress. The differentially expressed genes detected in this study allow a better understanding of the salt-responsive mechanism in physic nut, with the aim of improving its salt resistance in the future. PMID:24837971
Zhang, Lin; Zhang, Chao; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang
2014-01-01
Salt stress interferes with plant growth and production. Plants have evolved a series of molecular and morphological adaptations to cope with this abiotic stress, and overexpression of salt response genes reportedly enhances the productivity of various crops. However, little is known about the salt-responsive genes in the energy plant physic nut (Jatropha curcas L.). Thus, identifying salt-responsive genes in this plant is informative for uncovering the molecular mechanisms of the salt response in physic nut. We applied next-generation Illumina sequencing technology to analyze global gene expression profiles of physic nut plants (roots and leaves) 2 hours, 2 days and 7 days after the onset of salt stress. A total of 1,504 and 1,115 genes were significantly up- and down-regulated in roots and leaves, respectively, under salt stress conditions. Gene ontology (GO) analysis of physiological process revealed that, in the physic nut, many "biological processes" were affected by salt stress, particularly those categories belonging to "metabolic process", such as "primary metabolism process", "cellular metabolism process" and "macromolecule metabolism process". The gene expression profiles indicated that the associated genes were responsible for ABA and ethylene signaling, osmotic regulation, the reactive oxygen species scavenging system and the cell structure in physic nut. The major regulated genes detected in these transcriptomic data were related to trehalose synthesis and cell wall structure modification in roots, and to raffinose synthesis and reactive oxygen scavengers in leaves. The current study provides a comprehensive gene expression profile of physic nut under salt stress. The differentially expressed genes detected in this study allow a better understanding of the salt-responsive mechanism in physic nut, with the aim of improving its salt resistance in the future.
Bravo-Alonso, Irene; Navarrete, Rosa; Arribas-Carreira, Laura; Perona, Almudena; Abia, David; Couce, María Luz; García-Cazorla, Angels; Morais, Ana; Domingo, Rosario; Ramos, María Antonia; Swanson, Michael A; Van Hove, Johan L K; Ugarte, Magdalena; Pérez, Belén; Pérez-Cerdá, Celia; Rodríguez-Pombo, Pilar
2017-06-01
The rapid analysis of genomic data is providing effective mutational confirmation in patients with clinical and biochemical hallmarks of a specific disease. This is the case for nonketotic hyperglycinemia (NKH), a Mendelian disorder causing seizures in neonates and early-infants, primarily due to mutations in the GLDC gene. However, understanding the impact of missense variants identified in this gene is a major challenge for the application of genomics into clinical practice. Herein, a comprehensive functional and structural analysis of 19 GLDC missense variants identified in a cohort of 26 NKH patients was performed. Mutant cDNA constructs were expressed in COS7 cells followed by enzymatic assays and Western blot analysis of the GCS P-protein to assess the residual activity and mutant protein stability. Structural analysis, based on molecular modeling of the 3D structure of GCS P-protein, was also performed. We identify hypomorphic variants that produce attenuated phenotypes with improved prognosis of the disease. Structural analysis allows us to interpret the effects of mutations on protein stability and catalytic activity, providing molecular evidence for clinical outcome and disease severity. Moreover, we identify an important number of mutants whose loss-of-functionality is associated with instability and, thus, are potential targets for rescue using folding therapeutic approaches. © 2017 Wiley Periodicals, Inc.
Sinha, Siddharth; Verma, Sharad; Singh, Aditi; Somvanshi, Pallavi; Grover, Abhinav
2018-01-01
Spinocerebellar degeneration, termed ataxia, is a neurological disorder of the central nervous system, characterized by limb incoordination and a progressive gait. The patient also demonstrates specific symptoms of muscle weakness, slurring of speech, and decreased vibration senses. Expansion of the polyglutamine trinucleotide (CAG) repeat within the ATXN2 gene, with 35 or more repeats, results in spinocerebellar ataxia type-2. The protein ataxin-2, coded by the ATXN2 gene, has been reported to have a crucial role in translation of the genetic information through sequestering the histone acetyl transferases (HAT), resulting in a state of hypo-acetylation. In the present study, we have evaluated the outcome for 122 non-synonymous single nucleotide polymorphisms (nsSNPs) reported within the ATXN2 gene through computational tools such as SIFT, PolyPhen 2.0, PANTHER, I-mutant 2.0, Phd-SNP, Pmut, and MutPred. The apo and mutant (L305V and Q339L) forms of the ataxin-2 protein structure were modeled to gain insights into the 3D spatial arrangement. Further, molecular dynamics simulations and structural analysis were performed to observe the impact of disease-associated nsSNPs on the strength and secondary properties of the ataxin-2 protein structure. Our results showed that L305V is a highly deleterious and disease-causing point substitution. Analysis based on RMSD, RMSF, Rg, SASA, number of hydrogen bonds (NH bonds), covariance matrix trace, and eigenvector projection analysis demonstrated significant instability and conformational change, along with a rise in mutant flexibility values, in comparison to the apo form of the ataxin-2 protein. The study provides a blueprint of computational methodologies to examine the ataxin-blend SNPs. J. Cell. Biochem. 119: 499-510, 2018. © 2017 Wiley Periodicals, Inc.
Soybean kinome: functional classification and gene expression patterns
Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek
2015-01-01
The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
Reyes-Guzmán, Edwin Alfredo; Poutou-Piñales, Raúl A.; Reyes-Montaño, Edgar Antonio; Pedroza-Rodríguez, Aura Marina; Rodríguez-Vázquez, Refugio; Cardozo-Bernal, Ángela M.
2015-01-01
Laccases are multicopper oxidases that can catalyze the oxidation of aromatic and non-aromatic compounds concomitantly with reduction of molecular oxygen to water. Fungal laccases have generated a growing interest due to their potential biotechnological applications, such as lignocellulosic material delignification, biopulping and biobleaching, wastewater treatment, and transformation of toxic organic pollutants. In this work we selected fungal genes encoding the laccase enzymes GlLCC1 in Ganoderma lucidum and POXA 1B in Pleurotus ostreatus. These genes were optimized for codon use, GC content, and regions generating secondary structures. Computational models of the laccases were proposed, and their interaction with the ABTS [2, 2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid)] substrate was evaluated by molecular docking. Synthetic genes were cloned under the control of the Pichia pastoris glyceraldehyde-3-phosphate dehydrogenase (GAP) constitutive promoter. P. pastoris X-33 was transformed with pGAPZαA-LaccGluc-Stop and pGAPZαA-LaccPost-Stop constructs. Optimization reduced GC content by 47 and 49% for LaccGluc-Stop and LaccPost-Stop genes, respectively. A codon adaptation index of 0.84 was obtained for both genes. 3D structure analysis using SuperPose revealed that LaccGluc-Stop is similar to the laccase crystallographic structure 1GYC of Trametes versicolor. Interaction analysis of the 3D models, validated with ABTS, demonstrated higher substrate affinity for LaccPost-Stop, in agreement with our experimental results, with enzymatic activities of 451.08 ± 6.46 U L⁻¹ compared to activities of 0.13 ± 0.028 U L⁻¹ for LaccGluc-Stop. This study demonstrated that G. lucidum GlLCC1 and P. ostreatus POXA 1B gene optimization resulted in constitutive gene expression under the GAP promoter and α-factor leader in P. pastoris. These are important findings in light of the utility of recombinant enzyme expression systems for environmentally friendly designed expression systems, because of the wide range of substrates that laccases can transform. This contributes to a wide range of products in diverse settings: industry, clinical and chemical use, and environmental applications. PMID:25611746
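Two of the quantities reported above, GC content and the codon adaptation index (CAI), can be sketched in a few lines of Python. The CAI is computed as the geometric mean of per-codon relative adaptiveness weights. The weight table and the short sequence below are hypothetical placeholders, not values for P. pastoris or for the genes in this study, and the function names are illustrative.

```python
import math


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_adaptation_index(cds: str, weights: dict) -> float:
    """Geometric mean of relative adaptiveness weights over the codons of cds.

    Codons missing from `weights` are skipped (an assumption of this sketch).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon of the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Hypothetical weights: 1.0 marks the preferred codon of each amino acid.
    toy_weights = {"GCT": 1.0, "GCC": 0.5, "AAG": 1.0, "AAA": 0.25}
    toy_cds = "GCTGCCAAGAAA"
    print(gc_content(toy_cds))
    print(codon_adaptation_index(toy_cds, toy_weights))
```

In practice the weights would be derived from a reference set of highly expressed genes of the expression host; that step is outside this sketch.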
Rivera-Hoyos, Claudia M; Morales-Álvarez, Edwin David; Poveda-Cuevas, Sergio Alejandro; Reyes-Guzmán, Edwin Alfredo; Poutou-Piñales, Raúl A; Reyes-Montaño, Edgar Antonio; Pedroza-Rodríguez, Aura Marina; Rodríguez-Vázquez, Refugio; Cardozo-Bernal, Ángela M
2015-01-01
Laccases are multicopper oxidases that can catalyze the oxidation of aromatic and non-aromatic compounds concomitantly with reduction of molecular oxygen to water. Fungal laccases have generated a growing interest due to their potential biotechnological applications, such as lignocellulosic material delignification, biopulping and biobleaching, wastewater treatment, and transformation of toxic organic pollutants. In this work we selected fungal genes encoding the laccase enzymes GlLCC1 in Ganoderma lucidum and POXA 1B in Pleurotus ostreatus. These genes were optimized for codon use, GC content, and regions generating secondary structures. Computational models of the laccases were proposed, and their interaction with the ABTS [2, 2'-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid)] substrate was evaluated by molecular docking. Synthetic genes were cloned under the control of the Pichia pastoris glyceraldehyde-3-phosphate dehydrogenase (GAP) constitutive promoter. P. pastoris X-33 was transformed with pGAPZαA-LaccGluc-Stop and pGAPZαA-LaccPost-Stop constructs. Optimization reduced GC content by 47 and 49% for LaccGluc-Stop and LaccPost-Stop genes, respectively. A codon adaptation index of 0.84 was obtained for both genes. 3D structure analysis using SuperPose revealed that LaccGluc-Stop is similar to the laccase crystallographic structure 1GYC of Trametes versicolor. Interaction analysis of the 3D models, validated with ABTS, demonstrated higher substrate affinity for LaccPost-Stop, in agreement with our experimental results, with enzymatic activities of 451.08 ± 6.46 U L⁻¹ compared to activities of 0.13 ± 0.028 U L⁻¹ for LaccGluc-Stop. This study demonstrated that G. lucidum GlLCC1 and P. ostreatus POXA 1B gene optimization resulted in constitutive gene expression under the GAP promoter and α-factor leader in P. pastoris. These are important findings in light of the utility of recombinant enzyme expression systems for environmentally friendly designed expression systems, because of the wide range of substrates that laccases can transform. This contributes to a wide range of products in diverse settings: industry, clinical and chemical use, and environmental applications.
Joint mapping of genes and conditions via multidimensional unfolding analysis
Van Deun, Katrijn; Marchal, Kathleen; Heiser, Willem J; Engelen, Kristof; Van Mechelen, Iven
2007-01-01
Background Microarray compendia profile the expression of genes in a number of experimental conditions. Such data compendia are useful not only to group genes and conditions based on their similarity in overall expression over profiles but also to gain information on more subtle relations between genes and conditions. Getting a clear visual overview of all these patterns in a single easy-to-grasp representation is a useful preliminary analysis step: We propose to use for this purpose an advanced exploratory method, called multidimensional unfolding. Results We present a novel algorithm for multidimensional unfolding that overcomes both general problems and problems that are specific for the analysis of gene expression data sets. Applying the algorithm to two publicly available microarray compendia illustrates its power as a tool for exploratory data analysis: The unfolding analysis of a first data set resulted in a two-dimensional representation which clearly reveals temporal regulation patterns for the genes and a meaningful structure for the time points, while the analysis of a second data set showed the algorithm's ability to go beyond a mere identification of those genes that discriminate between different patient or tissue types. Conclusion Multidimensional unfolding offers a useful tool for preliminary explorations of microarray data: By relying on an easy-to-grasp low-dimensional geometric framework, relations among genes, among conditions and between genes and conditions are simultaneously represented in an accessible way which may reveal interesting patterns in the data. An additional advantage of the method is that it can be applied to the raw data without necessitating the choice of suitable genewise transformations of the data. PMID:17550582
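To make the idea of unfolding concrete, here is a minimal Python sketch that places row objects (genes) and column objects (conditions) in one low-dimensional space so that gene-to-condition distances approximate a given dissimilarity matrix. It uses plain gradient descent on raw stress, which is a deliberately simple stand-in and not the novel algorithm of the abstract; raw-stress unfolding is known to be prone to degenerate solutions. The function names, step size and toy matrix are assumptions.

```python
import math
import random


def stress(delta, X, Y):
    """Raw stress: squared error between dissimilarities and distances."""
    return sum((delta[i][j] - math.dist(X[i], Y[j])) ** 2
               for i in range(len(X)) for j in range(len(Y)))


def unfold(delta, dim=2, steps=500, lr=0.01, seed=0):
    """Gradient descent on raw stress for a rows-by-columns matrix delta."""
    rng = random.Random(seed)
    n, m = len(delta), len(delta[0])
    X = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]  # genes
    Y = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(m)]  # conditions
    for _ in range(steps):
        gX = [[0.0] * dim for _ in range(n)]
        gY = [[0.0] * dim for _ in range(m)]
        for i in range(n):
            for j in range(m):
                d = math.dist(X[i], Y[j]) or 1e-12  # avoid division by zero
                coef = -2.0 * (delta[i][j] - d) / d
                for k in range(dim):
                    diff = X[i][k] - Y[j][k]
                    gX[i][k] += coef * diff
                    gY[j][k] -= coef * diff
        for i in range(n):
            for k in range(dim):
                X[i][k] -= lr * gX[i][k]
        for j in range(m):
            for k in range(dim):
                Y[j][k] -= lr * gY[j][k]
    return X, Y


if __name__ == "__main__":
    # Hypothetical 4 genes x 3 conditions dissimilarity matrix.
    toy = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]]
    X, Y = unfold(toy)
    print(stress(toy, X, Y))
```

Plotting the rows of `X` and `Y` in the same scatter plot would give the kind of joint map of genes and conditions that the abstract describes.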
Han, Yahui; Ding, Ting; Su, Bo; Jiang, Haiyang
2016-01-01
Members of the chalcone synthase (CHS) family participate in the synthesis of a series of secondary metabolites in plants, fungi and bacteria. The metabolites play important roles in protecting land plants against various environmental stresses during the evolutionary process. We conducted a comprehensive investigation of CHS genes in maize (Zea mays L.), including their phylogenetic relationships, gene structures, chromosomal locations and expression analysis. Fourteen CHS genes (ZmCHS01–14) were identified in the genome of maize, representing one of the largest numbers of CHS family members identified in one organism to date. The gene family was classified into four major classes (classes I–IV) based on their phylogenetic relationships. Most of them contained two exons and one intron. The 14 genes were unevenly located on six chromosomes. Two segmental duplication events were identified, which might contribute to the expansion of the maize CHS gene family to some extent. In addition, quantitative real-time PCR and microarray data analyses suggested that ZmCHS genes exhibited various expression patterns, indicating functional diversification of the ZmCHS genes. Our results will contribute to future studies of the complexity of the CHS gene family in maize and provide valuable information for the systematic analysis of the functions of the CHS gene family. PMID:26828478
Yue, Hong; Wang, Meng; Liu, Siyan; Du, Xianghong; Song, Weining; Nie, Xiaojun
2016-05-10
WRKY genes, among the most pivotal transcription factors in plants, play indispensable roles in regulating various physiological processes, including plant growth and development as well as responses to stress. Broomcorn millet is one of the most important crops in drought areas worldwide. However, the WRKY gene family in broomcorn millet remains unknown. A total of 32 PmWRKY genes were identified in this study using a computational prediction method. Structural analysis found that PmWRKY proteins contained the highly conserved motif WRKYGQK and two common variant motifs, namely WRKYGKK and WRKYGEK. Phylogenetic analysis of the PmWRKYs together with homologous genes from representative species classified them into three groups containing 1, 15, and 16 members, respectively. Finally, the transcriptional profiles of these 32 PmWRKY genes in various tissues or under different abiotic stresses were systematically investigated using qRT-PCR analysis. Results showed that the expression level of 22 PmWRKY genes varied significantly under one or more abiotic stress treatments, so these genes could be defined as abiotic stress-responsive genes. This is the first study to identify the organization and transcriptional profiles of PmWRKY genes; it not only facilitates the functional analysis of the PmWRKY genes but also lays the foundation for revealing the molecular mechanism of stress tolerance in this important crop.
Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun
2014-11-25
The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. 
It is important to know whether there is a set of genes that is not only conserved in position among isolates but also functionally essential for a given species, and to further evaluate the stability or flexibility of such genome structures across lineages. Based on a large body of multi-isolate pangenomic data, our analysis reveals that a subset of core genes is organized into a core-gene-defined genome organizational framework, or cGOF. Furthermore, the lineage-associated cGOFs among Gram-positive and Gram-negative bacteria behave differently: the former, composed of 2 to 4 segments, have their fragments symmetrically rearranged around the origin-terminus axis, whereas the latter show more complex segmentation and are partitioned asymmetrically into chromosomal structures. The definition of cGOFs provides new insights into prokaryotic genome organization and efficient guidance for genome assembly and analysis. Copyright © 2014 Kang et al.
MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity
Wang, Yupeng; Tang, Haibao; DeBarry, Jeremy D.; Tan, Xu; Li, Jingping; Wang, Xiyin; Lee, Tae-ho; Jin, Huizhe; Marler, Barry; Guo, Hui; Kissinger, Jessica C.; Paterson, Andrew H.
2012-01-01
MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/. PMID:22217600
Chen, Xue; Chen, Zhu; Zhao, Hualin; Zhao, Yang; Cheng, Beijiu; Xiang, Yan
2014-01-01
Background Homeodomain-leucine zipper (HD-Zip) proteins, a group of homeobox transcription factors, participate in various aspects of normal plant growth and developmental processes as well as environmental responses. To date, no overall analysis or expression profiling of the HD-Zip gene family in soybean (Glycine max) has been reported. Methods and Findings An investigation of the soybean genome revealed 88 putative HD-Zip genes. These genes were classified into four subfamilies, I to IV, based on phylogenetic analysis. In each subfamily, the constituent parts of gene structure and motif were relatively conserved. A total of 87 out of 88 genes were distributed unequally on 20 chromosomes with 36 segmental duplication events, indicating that segmental duplication is important for the expansion of the HD-Zip family. Analysis of the Ka/Ks ratios showed that the duplicated genes of the HD-Zip family mainly underwent purifying selection with restrictive functional divergence after the duplication events. Analysis of expression profiles showed that 80 genes were differentially expressed across 14 tissues, and 59 HD-Zip genes were differentially expressed under salinity and drought stress, with 20 paralogous pairs showing nearly identical expression patterns and three paralogous pairs diversifying significantly under drought stress. Quantitative real-time RT-PCR (qRT-PCR) analysis of six paralogous pairs of 12 selected soybean HD-Zip genes under both drought and salinity stress confirmed their stress-inducible expression patterns. Conclusions This study presents a thorough overview of the soybean HD-Zip gene family and provides a new perspective on the evolution of this gene family. The results indicate that HD-Zip family genes may be involved in many plant responses to stress conditions. Additionally, this study provides a solid foundation for uncovering the biological roles of HD-Zip genes in soybean growth and development. PMID:24498296
González-Calabozo, Jose M; Valverde-Albacete, Francisco J; Peláez-Moreno, Carmen
2016-09-15
Gene Expression Data (GED) analysis poses a great challenge to the scientific community that can be framed into the Knowledge Discovery in Databases (KDD) and Data Mining (DM) paradigm. Biclustering has emerged as the machine learning method of choice to solve this task, but its unsupervised nature makes result assessment problematic. This is often addressed by means of Gene Set Enrichment Analysis (GSEA). We put forward a framework in which GED analysis is understood as an Exploratory Data Analysis (EDA) process where we provide support for continuous human interaction with data aiming at improving the step of hypothesis abduction and assessment. We focus on the adaptation to human cognition of data interpretation and visualization of the output of EDA. First, we give a proper theoretical background to bi-clustering using Lattice Theory and provide a set of analysis tools revolving around [Formula: see text]-Formal Concept Analysis ([Formula: see text]-FCA), a lattice-theoretic unsupervised learning technique for real-valued matrices. By using different kinds of cost structures to quantify expression we obtain different sequences of hierarchical bi-clusterings for gene under- and over-expression using thresholds. Consequently, we provide a method with interleaved analysis steps and visualization devices so that the sequences of lattices for a particular experiment summarize the researcher's vision of the data. This also allows us to define measures of persistence and robustness of biclusters to assess them. Second, the resulting biclusters are used to index external omics databases, for instance Gene Ontology (GO), thus offering a new way of accessing publicly available resources. This provides different flavors of gene set enrichment against which to assess the biclusters, by obtaining their p-values according to the terminology of those resources. We illustrate the exploration procedure on a real data example confirming results previously published.
The GED analysis problem gets transformed into the exploration of a sequence of lattices enabling the visualization of the hierarchical structure of the biclusters with a certain degree of granularity. The ability of FCA-based bi-clustering methods to index external databases such as GO allows us to obtain a quality measure of the biclusters, to observe the evolution of a gene throughout the different biclusters it appears in, and to look for relevant biclusters, by observing their genes and their persistence, in order to infer, for instance, hypotheses on their function.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trindade, Inês B.; Fonseca, Bruno M.; Matias, Pedro M.
The gene encoding a putative siderophore-interacting protein from the marine bacterium S. frigidimarina was successfully cloned, followed by expression and purification of the gene product. Optimized crystals diffracted to 1.35 Å resolution and preliminary crystallographic analysis is promising with respect to structure determination and increased insight into the poorly understood molecular mechanisms underlying iron acquisition. Siderophore-binding proteins (SIPs) perform a key role in iron acquisition in multiple organisms. In the genome of the marine bacterium Shewanella frigidimarina NCIMB 400, the gene tagged as SFRI-RS12295 encodes a protein from this family. Here, the cloning, expression, purification and crystallization of this protein are reported, together with its preliminary X-ray crystallographic analysis to 1.35 Å resolution. The SIP crystals belonged to the monoclinic space group P2₁, with unit-cell parameters a = 48.04, b = 78.31, c = 67.71 Å, α = 90, β = 99.94, γ = 90°, and are predicted to contain two molecules per asymmetric unit. Structure determination by molecular replacement and the use of previously determined ∼2 Å resolution SIP structures with ∼30% sequence identity as templates are ongoing.
System Biology Approach: Gene Network Analysis for Muscular Dystrophy.
Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro
2018-01-01
Phenotypic changes at different organization levels, from cell to entire organism, are associated with changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed light on unexpected behaviour of the gene regulation system while maintaining a more naturalistic view of the studied system. In this chapter we applied an unsupervised method to discriminate DMD patients and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
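The correlation-network step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the PCA-based gene selection is not shown, and the gene names, expression values, the helper names `pearson` and `correlation_network`, and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def correlation_network(expr, threshold=0.8):
    """Return the set of gene pairs whose |r| meets the (assumed) threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


# Hypothetical expression profiles (one value per sample) for each group.
patients = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
controls = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [3.0, 1.0, 4.0, 2.0],
    "geneC": [2.1, 3.9, 6.0, 8.1],
}

patient_edges = correlation_network(patients)
control_edges = correlation_network(controls)
```

Building one network per group and comparing the edge sets is one simple way to see that the same genes can be wired differently in the two conditions.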
2014-01-01
Background The Maternally expressed gene (Meg) family is a locally-duplicated gene family of maize which encodes cysteine-rich proteins (CRPs). The founding member of the family, Meg1, is required for normal development of the basal endosperm transfer cell layer (BETL) and is involved in the allocation of maternal nutrients to growing seeds. Despite the important roles of Meg1 in maize seed development, the evolutionary history of the Meg cluster and the activities of the duplicate genes are not understood. Results In maize, the Meg gene cluster resides in a 2.3 Mb-long genomic region that exhibits many features of non-centromeric heterochromatin. Using phylogenetic reconstruction and syntenic alignments, we identified the pedigree of the Meg family, in which 11 of its 13 members arose in maize after allotetraploidization ~4.8 mya. Phylogenetic and population-genetic analyses identified possible signatures suggesting recent positive selection in Meg homologs. Structural analyses of the Meg proteins indicated potentially adaptive changes in secondary structure from α-helix to β-strand during the expansion. Transcriptomic analysis of the maize endosperm indicated that 6 Meg genes are selectively activated in the BETL, and younger Meg genes are more active than older ones. In endosperms from B73 by Mo17 reciprocal crosses, most Meg genes did not display parent-specific expression patterns. Conclusions Recently-duplicated Meg genes have different protein secondary structures, and their expressions in the BETL dominate over those of older members. Together with the signs of positive selections in the young Meg genes, these results suggest that the expansion of the Meg family involves potentially adaptive transitions in which new members with novel functions prevailed over older members. PMID:25084677
Earlier population genetic spatial analysis of European corn borer, Ostrinia nubilalis (Hubner), indicated no genetic differentiation even between locations separated by 720 km. This result suggests either high dispersal resulting in high gene flow, or that populations are not in...
USDA-ARS's Scientific Manuscript database
Nitrogen uptake and the efficient absorption and metabolism of nitrogen are essential elements in attempts to breed improved cereal cultivars for grain or silage production. One of the enzymes related to nitrogen metabolism is glutamine-2-oxoglutarate amidotransferase (GOGAT). Together with glutami...
Method of identifying hairpin DNA probes by partial fold analysis
Miller, Benjamin L [Penfield, NY; Strohsahl, Christopher M [Saugerties, NY
2009-10-06
Method of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.
Method of identifying hairpin DNA probes by partial fold analysis
Miller, Benjamin L.; Strohsahl, Christopher M.
2008-10-28
Methods of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.
Gao, Feng; Song, Weibo; Katz, Laura A.
2014-01-01
In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that: 1) alternative processing is extensive among gene families; and 2) such gene families are likely to be C. uncinata-specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family -- a protein kinase domain containing protein (PKc) -- from two C. uncinata strains. Analysis of the PKc sequences reveals: 1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and 2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. PMID:24749903
The banana E2 gene family: Genomic identification, characterization, expression profiling analysis.
Dong, Chen; Hu, Huigang; Jue, Dengwei; Zhao, Qiufang; Chen, Hongliang; Xie, Jianghui; Jia, Liqiang
2016-04-01
The E2 is at the center of a cascade of Ub1 transfers, and it links activation of the Ub1 by E1 to its eventual E3-catalyzed attachment to substrate. Although genome-wide analysis of this family has been performed in some species, little is known about the E2 genes in banana. In this study, 74 E2 genes of banana were identified and phylogenetically clustered into thirteen subgroups. The predicted banana E2 genes were distributed across all 11 chromosomes at different densities. Additionally, the E2 domain, gene structure and motif compositions were analyzed. The expression of all of the banana E2 genes was analyzed in the root, stem, leaf, flower organs, five stages of fruit development and under abiotic stresses. All of the banana E2 genes, with the exception of a few genes in each group, were expressed in at least one of the organs and fruit developmental stages, which indicated that the E2 genes might be involved in various aspects of the physiological and developmental processes of the banana. Quantitative RT-PCR (qRT-PCR) analysis identified 45 E2s induced under drought and 33 E2s induced under salt stress. To the best of our knowledge, this report describes the first genome-wide analysis of the banana E2 gene family, and the results should provide valuable information for understanding the classification, cloning and putative functions of this family. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Li, Xiaoqin; Guo, Rongrong; Li, Jun; Singer, Stacy D; Zhang, Yucheng; Yin, Xiangjing; Zheng, Yi; Fan, Chonghui; Wang, Xiping
2013-10-01
Aldehyde dehydrogenases (ALDHs) represent a protein superfamily encoding NAD(P)(+)-dependent enzymes that oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes. In plants, they are involved in many biological processes and play a role in the response to environmental stress. In this study, a total of 39 ALDH genes from ten families were identified in the apple (Malus × domestica Borkh.) genome. Synteny analysis of the apple ALDH (MdALDH) genes indicated that segmental and tandem duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of these gene families in apple. Moreover, synteny analysis between apple and Arabidopsis demonstrated that several MdALDH genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes appeared before the divergence of lineages that led to apple and Arabidopsis. In addition, phylogenetic analysis, as well as comparisons of exon-intron and protein structures, provided further insight into both their evolutionary relationships and their putative functions. Tissue-specific expression analysis of the MdALDH genes demonstrated diverse spatiotemporal expression patterns, while their expression profiles under abiotic stress and various hormone treatments indicated that many MdALDH genes were responsive to high salinity and drought, as well as different plant hormones. This genome-wide identification, as well as characterization of evolutionary relationships and expression profiles, of the apple MdALDH genes will not only be useful for the further analysis of ALDH genes and their roles in stress response, but may also aid in the future improvement of apple stress tolerance. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Zhili; Deng, Ye; Nostrand, Joy Van
2010-05-17
Microarray-based genomic technology has been widely used for microbial community analysis, and it is expected that microarray-based genomic technologies will revolutionize the analysis of microbial community structure, function and dynamics. A new generation of functional gene arrays (GeoChip 3.0) has been developed, with 27,812 probes covering 56,990 gene variants from 292 functional gene families involved in carbon, nitrogen, phosphorus and sulfur cycles, energy metabolism, antibiotic resistance, metal resistance, and organic contaminant degradation. Those probes were derived from 2,744, 140, and 262 species for bacteria, archaea, and fungi, respectively. GeoChip 3.0 has several other distinct features, such as a common oligo reference standard (CORS) for data normalization and comparison, a software package for data management and future updating, and the gyrB gene for phylogenetic analysis. Our computational evaluation of probe specificity indicated that all designed probes had a high specificity to their corresponding targets. Also, experimental analysis with synthesized oligonucleotides and genomic DNAs showed that only 0.0036%-0.025% false positive rates were observed, suggesting that the designed probes are highly specific under the experimental conditions examined. In addition, GeoChip 3.0 was applied to analyze soil microbial communities in a multifactor grassland ecosystem in Minnesota, USA, which demonstrated that the structure, composition, and potential activity of soil microbial communities significantly changed with the plant species diversity. All results indicate that GeoChip 3.0 is a powerful high-throughput tool for studying microbial community functional structure, and linking microbial communities to ecosystem processes and functioning.
To our knowledge, GeoChip 3.0 is the most comprehensive microarray currently available for studying microbial communities associated with geobiochemical cycling, global climate change, bioenergy, agriculture, land use, ecosystem management, environmental cleanup and restoration, bioreactor systems, and human health.
Analysis of flavonoids and the flavonoid structural genes in brown fiber of upland cotton.
Feng, Hongjie; Tian, Xinhui; Liu, Yongchang; Li, Yanjun; Zhang, Xinyu; Jones, Brian Joseph; Sun, Yuqiang; Sun, Jie
2013-01-01
As a result of changing consumer preferences, cotton (Gossypium hirsutum L.) from varieties with naturally colored fibers is becoming increasingly sought after in the textile industry. The molecular mechanisms leading to colored fiber development are still largely unknown, although it is expected that the color is derived from flavonoids. First, four key genes of the flavonoid biosynthetic pathway in cotton (GhC4H, GhCHS, GhF3'H, and GhF3'5'H) were cloned, and their expression profiles during the development of brown and white cotton fibers were studied by qRT-PCR. Then, the concentrations of four components of the flavonoid biosynthetic pathway (naringenin, quercetin, kaempferol and myricetin) in brown and white fibers were analyzed at different developmental stages by HPLC. The predicted proteins of the four flavonoid structural genes exhibit strong sequence similarity to their counterparts in various plant species. Transcript levels for all four genes were considerably higher in developing brown fibers than in white fibers from a near isogenic line (NIL). The contents of the four flavonoids (naringenin, quercetin, kaempferol and myricetin) were significantly higher in brown than in white fibers, in line with the biosynthetic gene expression levels. Flavonoid structural gene expression and flavonoid metabolism are important in the development of pigmentation in brown cotton fibers.
Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott
2010-04-01
An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. 
The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
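The Z score normalization mentioned for the heat maps can be sketched as follows. This is a generic illustration, not GeneMesh code: the function name `zscore_rows` is hypothetical, each row is assumed to be one gene across samples, and the use of the population standard deviation is an assumption (the abstract does not say which estimator is used).

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z score-normalize each row (gene) of an expression matrix.

    Assumption: population standard deviation; constant rows map to zeros
    so that they render as neutral cells in a heat map.
    """
    normalized = []
    for row in matrix:
        mu = mean(row)
        sigma = pstdev(row)
        if sigma == 0:
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(v - mu) / sigma for v in row])
    return normalized


# Hypothetical intensities: two genes measured in three samples.
z = zscore_rows([[2.0, 4.0, 6.0], [10.0, 10.0, 10.0]])
```

After this step every non-constant gene has mean 0 and unit variance, so genes with very different absolute intensities share one color scale.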
Using parentage analysis to examine gene flow and spatial genetic structure.
Kane, Nolan C; King, Matthew G
2009-04-01
Numerous approaches have been developed to examine recent and historical gene flow between populations, but few studies have used empirical data sets to compare different approaches. Some methods are expected to perform better under particular scenarios, such as high or low gene flow, but this, too, has rarely been tested. In this issue of Molecular Ecology, Saenz-Agudelo et al. (2009) apply assignment tests and parentage analysis to microsatellite data from five geographically proximal (2-6 km) and one much more distant (1500 km) panda clownfish populations, showing that parentage analysis performed better in situations of high gene flow, while their assignment tests did better with low gene flow. This unusually complete data set is comprised of multiple exhaustively sampled populations, including nearly all adults and large numbers of juveniles, enabling the authors to ask questions that in many systems would be impossible to answer. Their results emphasize the importance of selecting the right analysis to use, based on the underlying model and how well its assumptions are met by the populations to be analysed.
Zhang, Yanzhao; Xu, Shuzhen; Cheng, Yanwei; Peng, Zhengfeng; Han, Jianming
2018-01-01
Red leaf lettuce (Lactuca sativa L.) is popular due to its high anthocyanin content, but poor leaf coloring often occurs under low light intensity. In order to reveal the mechanisms by which light intensity affects anthocyanins, we compared the transcriptome of L. sativa L. var. capitata under light intensities of 40 and 100 μmol m⁻² s⁻¹. A total of 62,111 unigenes were de novo assembled with an N50 of 1,681 bp, and 48,435 unigenes were functionally annotated in public databases. A total of 3,899 differentially expressed genes (DEGs) were detected, of which 1,377 unigenes were up-regulated and 2,552 unigenes were down-regulated in the high light samples. By Kyoto Encyclopedia of Genes and Genomes enrichment analysis, the DEGs were significantly enriched in 14 pathways. Using gene annotation and phylogenetic analysis, we identified seven anthocyanin structural genes, including CHS, CHI, F3H, F3'H, DFR, ANS, and 3GT, and two anthocyanin transport genes, GST and MATE. In terms of anthocyanin regulatory genes, five MYBs and one bHLH gene were identified. An HY5 gene was discovered, which may respond to light signaling and regulate anthocyanin structural genes. These genes showed a log2FC of 2.7-9.0 under high irradiance, and were validated using quantitative real-time PCR. In conclusion, our results revealed transcriptome variance in red leaf lettuce under low and high light intensity, and showed an anthocyanin biosynthesis and regulation pattern. The data should further help to unravel the molecular mechanisms by which light intensity influences anthocyanins.
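The log2FC values quoted in this abstract are log2 fold changes between the two light conditions. A minimal sketch of the arithmetic is given below; the function names, the pseudocount of 1, the cutoff of 1.0, and the example expression values are assumptions for illustration and are not taken from the study.

```python
from math import log2


def log2_fold_change(high_light, low_light, pseudocount=1.0):
    """log2 ratio of expression under high vs. low light.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no detected expression in one condition.
    """
    return log2((high_light + pseudocount) / (low_light + pseudocount))


def classify(log2fc, cutoff=1.0):
    """Label a gene using a symmetric, assumed cutoff on |log2FC|."""
    if log2fc >= cutoff:
        return "up"
    if log2fc <= -cutoff:
        return "down"
    return "unchanged"


# Hypothetical normalized expression values for one gene.
fc = log2_fold_change(high_light=127.0, low_light=7.0)
label = classify(fc)
```

A log2FC of 2.7 corresponds to roughly a 6.5-fold increase and 9.0 to a 512-fold increase, which gives a sense of the range reported for the anthocyanin genes.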
2013-01-01
Background Xanthophylls, oxygenated derivatives of carotenes, play critical roles in the photosynthetic apparatus of cyanobacteria, algae, and higher plants. Although the xanthophyll biosynthetic pathway of algae is largely unknown, it is of particular interest because algae have a very complicated evolutionary history. Carotenoid hydroxylase (CHY) is an important protein that plays essential roles in xanthophyll biosynthesis. With the availability of 18 sequenced algal genomes, we performed a comprehensive comparative analysis of chy genes and explored their distribution, structure, evolution, origins, and expression. Results Overall 60 putative chy genes were identified and classified into two major subfamilies (bch and cyp97) according to their domain structures. Genes in the bch subfamily were found in 10 green algae and 1 red alga, but were absent in other algae. In the phylogenetic tree, bch genes of green algae and higher plants share a common ancestor and are of non-cyanobacterial origin, whereas that of red algae is of cyanobacterial origin. The homologs of cyp97a/c genes were widespread only in green algae, while cyp97b paralogs were seen in most algae. Phylogenetic analysis of cyp97 genes supported the hypothesis that cyp97b is an ancient gene that originated before the formation of extant algal groups. The cyp97a gene is more closely related to cyp97c in evolution than to cyp97b. The two cyp97 genes were isolated from the green alga Haematococcus pluvialis, and transcriptional expression profiles of chy genes were observed under high light stress of different wavelengths. Conclusions Green algae received a β-xanthophylls biosynthetic pathway from host organisms. Although red algae inherited the pathway from cyanobacteria during primary endosymbiosis, it remains unclear in Chromalveolates. The α-xanthophylls biosynthetic pathway is a common feature in green algae and higher plants.
The origination of cyp97a/c is most likely due to gene duplication before the divergence of green algae and higher plants. Protein domain structures and expression analyses in the green alga H. pluvialis indicate that various chy genes respond to light in different manners. Knowledge of the evolution of chy genes in photosynthetic eukaryotes provides information for future gene cloning and functional investigation of chy genes in algae. PMID:23834441
The Reconstruction and Analysis of Gene Regulatory Networks.
Zheng, Guangyong; Huang, Tao
2018-01-01
In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review inference methods for GRN reconstruction and approaches for analyzing network structure. Because the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.
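The vertex-and-edge description of a GRN can be made concrete with a small directed graph. The Python sketch below is only an illustration under stated assumptions: the node names and edges are hypothetical, and the review itself does not prescribe this representation.

```python
from collections import defaultdict

# Hypothetical regulator -> target edges, for illustration only.
edges = [("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene2"), ("gene2", "gene3")]


def build_network(edge_list):
    """Index a directed edge list by regulator (outgoing) and by target (incoming)."""
    targets = defaultdict(set)
    regulators = defaultdict(set)
    for src, dst in edge_list:
        targets[src].add(dst)
        regulators[dst].add(src)
    return targets, regulators


def degrees(edge_list):
    """Return {node: (in_degree, out_degree)} for every node in the network."""
    targets, regulators = build_network(edge_list)
    nodes = set(targets) | set(regulators)
    return {
        node: (len(regulators.get(node, ())), len(targets.get(node, ())))
        for node in sorted(nodes)
    }


for node, (n_in, n_out) in degrees(edges).items():
    print(node, "regulated by", n_in, "regulates", n_out)
```

In-degree and out-degree are among the simplest structural summaries of such a network; nodes with a high out-degree are candidate hub regulators.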
A short treatise concerning a musical approach for the interpretation of gene expression data
Staege, Martin S.
2015-01-01
Recent technical developments allow the genome-wide and near-complete analysis of gene expression in a given sample, e.g. by usage of high-density DNA microarrays or next generation sequencing. The generated data structure is usually multi-dimensional and requires extensive processing not only for analysis but also for presentation of the results. Today, such data are usually presented graphically, e.g. in the form of heat maps. In the present paper, we propose an alternative form of analysis and presentation which is based on the transformation of gene expression data into sounds that are characterized by their frequency (pitch) and tone duration. Using DNA microarray data from a panel of neuroblastoma and Ewing sarcoma cell lines as well as from Hodgkin’s lymphoma cell lines and normal B cells, we demonstrate that this Gene Expression Music Algorithm (GEMusicA) can be used for discrimination between samples with different biology and for the characterization of differentially expressed genes. PMID:26472273
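The idea of turning expression values into pitches can be sketched in a few lines of Python. This is not the published GEMusicA procedure, whose details are not given in the abstract; the linear mapping onto a MIDI note range, the chosen range, and the function names are all assumptions made for illustration. Only the conversion from MIDI note number to frequency (A4 = note 69 = 440 Hz, twelve equal-tempered semitones per octave) is a standard convention. Tone duration is left out of the sketch.

```python
def expression_to_midi(value, lo, hi, low_note=48, high_note=84):
    """Map an expression value linearly onto a MIDI note range (assumed mapping)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)  # keep out-of-range values inside [lo, hi]
    fraction = (clipped - lo) / (hi - lo)
    return round(low_note + fraction * (high_note - low_note))


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in Hz (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical log-scale expression values for one sample.
sample = [0.0, 5.0, 10.0]
notes = [expression_to_midi(v, lo=0.0, hi=10.0) for v in sample]
print(notes, [round(midi_to_hz(n), 1) for n in notes])
```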
High-Content Analysis of CRISPR-Cas9 Gene-Edited Human Embryonic Stem Cells.
Carlson-Stevermer, Jared; Goedland, Madelyn; Steyer, Benjamin; Movaghar, Arezoo; Lou, Meng; Kohlenberg, Lucille; Prestil, Ryan; Saha, Krishanu
2016-01-12
CRISPR-Cas9 gene editing of human cells and tissues holds much promise to advance medicine and biology, but standard editing methods require weeks to months of reagent preparation and selection where much or all of the initial edited samples are destroyed during analysis. ArrayEdit, a simple approach utilizing surface-modified multiwell plates containing one-pot transcribed single-guide RNAs, separates thousands of edited cell populations for automated, live, high-content imaging and analysis. The approach lowers the time and cost of gene editing and produces edited human embryonic stem cells at high efficiencies. Edited genes can be expressed in both pluripotent stem cells and differentiated cells. This preclinical platform adds important capabilities to observe editing and selection in situ within complex structures generated by human cells, ultimately enabling optical and other molecular perturbations in the editing workflow that could refine the specificity and versatility of gene editing. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Phadtare, Sangita; Severinov, Konstantin
2009-11-01
In Escherichia coli, temperature downshift elicits cold shock response, which is characterized by induction of cold shock proteins. CspA, the major cold shock protein of E. coli, helps cells to acclimatize to low temperature by melting the secondary structures in nucleic acids and acting as a transcription antiterminator. CspA and its homologues contain the cold shock domain and belong to the oligomer binding protein family, which also includes S1 domain proteins such as IF1. Structural similarity between IF1 and CspA homologues suggested a functional overlap between these proteins. Indeed IF1 can melt secondary structures in RNA and acts as transcription antiterminator in vivo and in vitro. Here, we show that in spite of having these critical activities, IF1 does not complement cold-sensitivity of a csp quadruple deletion strain. DNA microarray analysis shows that overproduction of IF1 and Csp leads to changes in expression of different sets of genes. Importantly, several genes which were previously shown to require Csp proteins for their expression at low temperature did not respond to IF1. Moreover, in vitro, we show that a transcription terminator responsive to Csp does not respond to IF1. Our results suggest that Csp proteins and IF1 have different sets of target genes as they may be suppressing the function of different types of transcription termination elements in specific genes.
Wang, Shan-Ning; Peng, Yong; Lu, Zi-Yun; Dhiloo, Khalid Hussain; Zheng, Yao; Shan, Shuang; Li, Rui-Jun; Zhang, Yong-Jun; Guo, Yu-Yuan
2016-07-01
Ionotropic receptors (IRs) mainly detect the acids and amines having great importance in many insect species, representing an ancient olfactory receptor family in insects. In the present work, we performed RNAseq of Microplitis mediator antennae and identified seventeen IRs. Full-length MmedIRs were cloned and sequenced. Phylogenetic analysis of the Hymenoptera IRs revealed that ten MmedIR genes encoded "antennal IRs" and seven encoded "divergent IRs". Among the IR25a orthologous groups, two genes, MmedIR25a.1 and MmedIR25a.2, were found in M. mediator. Gene structure analysis of MmedIR25a revealed a tandem duplication of IR25a in M. mediator. The tissue distribution and development specific expression of the MmedIR genes suggested that these genes showed a broad expression profile. Quantitative gene expression analysis showed that most of the genes are highly enriched in adult antennae, indicating the candidate chemosensory function of this family in parasitic wasps. Using immunocytochemistry, we confirmed that one co-receptor, MmedIR8a, was expressed in the olfactory sensory neurons. Our data will supply fundamental information for functional analysis of the IRs in parasitoid wasp chemoreception. Copyright © 2016 Elsevier Ltd. All rights reserved.
Chen, Hongfei; Zuo, Xiya; Shao, Hongxia; Fan, Sheng; Ma, Juanjuan; Zhang, Dong; Zhao, Caiping; Yan, Xiangyan; Liu, Xiaojie; Han, Mingyu
2018-02-01
Carotenoid cleavage oxygenases (CCOs) are able to cleave carotenoids to produce apocarotenoids and their derivatives, which are important for plant growth and development. In this study, 21 apple CCO genes were identified and divided into six groups based on their phylogenetic relationships. We further characterized the apple CCO genes in terms of chromosomal distribution, structure and the presence of cis-elements in the promoter. We also predicted the cellular localization of the encoded proteins. An analysis of the synteny within the apple genome revealed that tandem, segmental, and whole-genome duplication events likely contributed to the expansion of the apple carotenoid oxygenase gene family. An additional integrated synteny analysis identified orthologous carotenoid oxygenase genes between apple and Arabidopsis thaliana, which served as references for the functional analysis of the apple CCO genes. The net photosynthetic rate, transpiration rate, and stomatal conductance of leaves decreased, while leaf stomatal density increased under drought and saline conditions. Tissue-specific gene expression analyses revealed diverse spatiotemporal expression patterns. Finally, hormone and abiotic stress treatments indicated that many apple CCO genes are responsive to various phytohormones as well as drought and salinity stresses. The genome-wide identification of apple CCO genes and the analyses of their expression patterns described herein may provide a solid foundation for future studies examining the regulation and functions of this gene family. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Class I KNOX genes are associated with organogenesis during bulbil formation in Agave tequilana.
Abraham-Juárez, María Jazmín; Martínez-Hernández, Aída; Leyva-González, Marco Antonio; Herrera-Estrella, Luis; Simpson, June
2010-09-01
Bulbil formation in Agave tequilana was analysed with the objective of understanding this phenomenon at the molecular and cellular levels. Bulbils formed 14-45 d after induction and were associated with rearrangements in tissue structure and accelerated cell multiplication. Changes at the cellular level during bulbil development were documented by histological analysis. In addition, several cDNA libraries produced from different stages of bulbil development were generated and partially sequenced. Sequence analysis led to the identification of candidate genes potentially involved in the initiation and development of bulbils in Agave, including two putative class I KNOX genes. Real-time reverse transcription-PCR and in situ hybridization revealed that expression of the putative Agave KNOXI genes occurs at bulbil initiation and specifically in tissue where meristems will develop. Functional analysis of Agave KNOXI genes in Arabidopsis thaliana showed the characteristic lobed phenotype of KNOXI ectopic expression in leaves, although a slightly different phenotype was observed for each of the two Agave genes. An Arabidopsis KNOXI (knat1) mutant line (CS30) was successfully complemented with one of the Agave KNOX genes and partially complemented by the other. Analysis of the expression of the endogenous Arabidopsis genes KNAT1, KNAT6, and AS1 in the transformed lines ectopically expressing or complemented by the Agave KNOX genes again showed different regulatory patterns for each Agave gene. These results show that Agave KNOX genes are functionally similar to class I KNOX genes and suggest that spatial and temporal control of their expression is essential during bulbil formation in A. tequilana.
A comparative molecular analysis of water-filled limestone sinkholes in north-eastern Mexico.
Sahl, Jason W; Gary, Marcus O; Harris, J Kirk; Spear, John R
2011-01-01
Sistema Zacatón in north-eastern Mexico is host to several deep, water-filled, anoxic, karstic sinkholes (cenotes). These cenotes were explored, mapped, and geochemically and microbiologically sampled by the autonomous underwater vehicle deep phreatic thermal explorer (DEPTHX). The community structure of the filterable fraction of the water column and extensive microbial mats that coat the cenote walls was investigated by comparative analysis of small-subunit (SSU) 16S rRNA gene sequences. Full-length Sanger gene sequence analysis revealed novel microbial diversity that included three putative bacterial candidate phyla and three additional groups that showed high intra-clade distance with poorly characterized bacterial candidate phyla. Limited functional gene sequence analysis in these anoxic environments identified genes associated with methanogenesis, sulfate reduction and anaerobic ammonium oxidation. A directed, barcoded amplicon, multiplex pyrosequencing approach was employed to compare ∼100,000 bacterial SSU gene sequences from water column and wall microbial mat samples from five cenotes in Sistema Zacatón. A new, high-resolution sequence distribution profile (SDP) method identified changes in specific phylogenetic types (phylotypes) in microbial mats at varied depths; Mantel tests showed a correlation of the genetic distances between mat communities in two cenotes and the geographic location of each cenote. Community structure profiles from the water column of three neighbouring cenotes showed distinct variation; statistically significant differences in the concentration of geochemical constituents suggest that the variation observed in microbial communities between neighbouring cenotes is due to geochemical variation. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
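A Mantel test of the kind mentioned above correlates the entries of two distance matrices and judges significance by permuting the rows and columns of one matrix together. The Python sketch below is a minimal one-sided version under stated assumptions: the toy matrices, the permutation count, and the function names are illustrative and are not taken from the study.

```python
import random
from statistics import mean


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a square matrix after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    flat_a = upper_triangle(dist_a, identity)
    observed = pearson(flat_a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_b jointly
        if pearson(flat_a, upper_triangle(dist_b, order)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy symmetric distance matrices (hypothetical): genetic vs. geographic distance.
genetic = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
geographic = [[10 * v for v in row] for row in genetic]
r, p = mantel(genetic, geographic)
print(round(r, 3), p)
```

Permuting whole rows and columns, rather than individual entries, keeps the dependence structure of a distance matrix intact, which is why the test is preferred over a plain correlation test here.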
Contemporary gene flow and mating system of Arabis alpina in a Central European alpine landscape
Buehler, D.; Graf, R.; Holderegger, R.; Gugerli, F.
2012-01-01
Background and Aims Gene flow is important in counteracting the divergence of populations but also in spreading genes among populations. However, contemporary gene flow is not well understood across alpine landscapes. The aim of this study was to estimate contemporary gene flow through pollen and to examine the realized mating system in the alpine perennial plant, Arabis alpina (Brassicaceae). Methods An entire sub-alpine to alpine landscape of 2 km2 was exhaustively sampled in the Swiss Alps. Eighteen nuclear microsatellite loci were used to genotype 595 individuals and 499 offspring from 49 maternal plants. Contemporary gene flow by pollen was estimated from paternity analysis, matching the genotypes of maternal plants and offspring to the pool of likely father plants. Realized mating patterns and genetic structure were also estimated. Key Results Paternity analysis revealed several long-distance gene flow events (≤1 km). However, most outcrossing pollen was dispersed close to the mother plants, and 84 % of all offspring were selfed. Individuals that were spatially close were more related than by chance and were also more likely to be connected by pollen dispersal. Conclusions In the alpine landscape studied, genetic structure occurred on small spatial scales as expected for alpine plants. However, gene flow also covered large distances. This makes it plausible for alpine plants to spread beneficial alleles at least via pollen across landscapes at a short time scale. Thus, gene flow potentially facilitates rapid adaptation in A. alpina likely to be required under ongoing climate change. PMID:22492332
Robust Gaussian Graphical Modeling via l1 Penalization
Sun, Hokeun; Li, Hongzhe
2012-01-01
Summary Gaussian graphical models have been widely used as an effective method for studying the conditional independence structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose an l1-penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation of an observation is measured based on its own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified likelihood, where the nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has a better biological interpretation than that obtained from the graphical Lasso. PMID:23020775
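The link between the concentration matrix and the graph can be illustrated directly: an edge joins genes i and j exactly when the (i, j) entry of the concentration matrix is nonzero, and the corresponding partial correlation is the negated entry scaled by the two diagonal entries. The Python sketch below shows only this reading-off step, not the robust penalized estimator itself; the toy matrix and the function name are assumptions for illustration.

```python
def partial_correlation_edges(omega):
    """Return {(i, j): partial correlation} for nonzero off-diagonal entries of omega.

    For a concentration (inverse covariance) matrix omega, the partial correlation
    between variables i and j given all others is -omega[i][j] / sqrt(omega[i][i] * omega[j][j]).
    """
    p = len(omega)
    edges = {}
    for i in range(p):
        for j in range(i + 1, p):
            if omega[i][j] != 0:
                edges[(i, j)] = -omega[i][j] / (omega[i][i] * omega[j][j]) ** 0.5
    return edges


# Toy positive definite concentration matrix for three genes (hypothetical values).
omega = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
print(partial_correlation_edges(omega))
```

In the toy matrix, genes 0 and 2 have a zero entry and are therefore conditionally independent given gene 1, so no edge is reported between them.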
Comparative Mitogenomic Analysis of Species Representing Six Subfamilies in the Family Tenebrionidae
Zhang, Hong-Li; Liu, Bing-Bing; Wang, Xiao-Yang; Han, Zhi-Ping; Zhang, Dong-Xu; Su, Cai-Na
2016-01-01
To better understand the architecture and evolution of the mitochondrial genome (mitogenome), mitogenomes of ten specimens representing six subfamilies in Tenebrionidae were selected, and comparative analysis of these mitogenomes was carried out in this study. Ten mitogenomes in this family share a similar gene composition, gene order, nucleotide composition, and codon usage. In addition, our results show that nucleotide bias was strongly influenced by the preference of codon usage for A/T rich codons which significantly correlated with the G + C content of protein coding genes (PCGs). Evolutionary rate analyses reveal that all PCGs have been subjected to a purifying selection, whereas 13 PCGs displayed different evolution rates, among which ATPase subunit 8 (ATP8) showed the highest evolutionary rate. We inferred the secondary structure for all RNA genes of Tenebrio molitor (Te2) and used this as the basis for comparison with the same genes from other Tenebrionidae mitogenomes. Some conserved helices (stems) and loops of RNA structures were found in different domains of ribosomal RNAs (rRNAs) and the cloverleaf structure of transfer RNAs (tRNAs). With regard to the AT-rich region, we analyzed tandem repeat sequences located in this region and identified some essential elements including T stretches, the consensus motif at the flanking regions of T stretch, and the secondary structure formed by the motif at the 3′ end of T stretch in major strand, which are highly conserved in these species. Furthermore, phylogenetic analyses using mitogenomic data strongly support the relationships among six subfamilies: ((Tenebrionidae incertae sedis + (Diaperinae + Tenebrioninae)) + (Pimeliinae + Lagriinae)), which is consistent with phylogenetic results based on morphological traits. PMID:27258256
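Nucleotide composition statistics of the kind compared above are simple counts. The Python sketch below computes the G + C fraction of a sequence and the A/T fraction at third codon positions of a coding sequence; the toy sequence and the function names are illustrative assumptions and do not reproduce the study's pipeline.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def at_third_position(cds):
    """Fraction of A or T at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    thirds = cds[2:usable:3]
    return sum(base in "AT" for base in thirds) / len(thirds)


# Toy coding sequence (hypothetical): codons ATG, TTA, GCA.
toy_cds = "ATGTTAGCA"
print(round(gc_content(toy_cds), 3), round(at_third_position(toy_cds), 3))
```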
Metagenomic analysis of microbial communities yields insight into impacts of nanoparticle design
NASA Astrophysics Data System (ADS)
Metch, Jacob W.; Burrows, Nathan D.; Murphy, Catherine J.; Pruden, Amy; Vikesland, Peter J.
2018-01-01
Next-generation DNA sequencing and metagenomic analysis provide powerful tools for the environmentally friendly design of nanoparticles. Herein we demonstrate this approach using a model community of environmental microbes (that is, wastewater-activated sludge) dosed with gold nanoparticles of varying surface coatings and morphologies. Metagenomic analysis was highly sensitive in detecting the microbial community response to gold nanospheres and nanorods with either cetyltrimethylammonium bromide or polyacrylic acid surface coatings. We observed that the gold-nanoparticle morphology imposes a stronger force in shaping the microbial community structure than does the surface coating. Trends were consistent in terms of the compositions of both taxonomic and functional genes, which include antibiotic resistance genes, metal resistance genes and gene-transfer elements associated with cell stress that are relevant to public health. Given that nanoparticle morphology remained constant, the potential influence of gold dissolution was minimal. Surface coating governed the nanoparticle partitioning between the bioparticulate and aqueous phases.
Plaga, W; Lottspeich, F; Oesterhelt, D
1992-04-01
An improved purification procedure, including nickel chelate affinity chromatography, is reported which resulted in a crystallizable pyruvate:ferredoxin oxidoreductase preparation from Halobacterium halobium. Crystals of the enzyme were obtained using potassium citrate as the precipitant. The genes coding for pyruvate:ferredoxin oxidoreductase were cloned and their nucleotide sequences determined. The genes of both subunits were adjacent to one another on the halobacterial genome. The derived amino acid sequences were confirmed by partial primary structure analysis of the purified protein. The structural motif of thiamin-diphosphate-binding enzymes was unequivocally located in the deduced amino acid sequence of the small subunit.
Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou
2011-11-01
Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.
Yu, Youjian; Liang, Ying; Lv, Meiling; Wu, Jian; Lu, Gang; Cao, Jiashu
2014-01-01
Polygalacturonase (PG, EC 3.2.1.15), one of the hydrolytic enzymes associated with modification of the pectin network in the plant cell wall, has an important role in various cell-separation processes that are essential for plant development. PGs are encoded by a large gene family in plants. However, information on this gene family in plant development remains limited. In the present study, 53 and 62 putative members of the PG gene family in the cucumber and watermelon genomes, respectively, were identified by genome-wide search to explore the composition, structure, and evolution of the PG family in Cucurbitaceae crops. The results showed that tandem duplication could be an important factor contributing to the expansion of the PG genes in the two crops. The phylogenetic and evolutionary analyses suggested that PGs could be classified into seven clades, and that the exon/intron structures and intron phases were conserved within but divergent between clades. At least 24 ancestral PGs were detected in the common ancestor of Arabidopsis and Cucumis sativus. Expression profile analysis by quantitative real-time polymerase chain reaction demonstrated that most CsPGs exhibit a specific or high expression pattern in one of the organs/tissues. The 16 CsPGs associated with fruit development could be divided into three subsets based on their specific expression patterns and on the fruit-specific, endosperm/seed-specific, and ethylene-responsive cis-elements present in their promoter regions. Our comparative analysis provided some basic information on the PG gene family, which would be valuable for further functional analysis of the PG genes during plant development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Genetic analysis of PAX3 for diagnosis of Waardenburg syndrome type I.
Matsunaga, Tatsuo; Mutai, Hideki; Namba, Kazunori; Morita, Noriko; Masuda, Sawako
2013-04-01
PAX3 genetic analysis increased the diagnostic accuracy for Waardenburg syndrome type I (WS1). Analysis of the three-dimensional (3D) structure of PAX3 helped verify the pathogenicity of a missense mutation, and multiple ligation-dependent probe amplification (MLPA) analysis of PAX3 increased the sensitivity of genetic diagnosis in patients with WS1. Clinical diagnosis of WS1 is often difficult in individual patients with isolated, mild, or non-specific symptoms. The objective of the present study was to facilitate the accurate diagnosis of WS1 through genetic analysis of PAX3 and to expand the spectrum of known PAX3 mutations. In two Japanese families with WS1, we conducted a clinical evaluation of symptoms and genetic analysis, which involved direct sequencing, MLPA analysis, quantitative PCR of PAX3, and analysis of the predicted 3D structure of PAX3. The normal-hearing control group comprised 92 subjects who had normal hearing according to pure tone audiometry. In one family, direct sequencing of PAX3 identified a heterozygous mutation, p.I59F. Analysis of PAX3 3D structures indicated that this mutation distorted the DNA-binding site of PAX3. In the other family, MLPA analysis and subsequent quantitative PCR detected a large, heterozygous deletion spanning 1759-2554 kb that eliminated 12-18 genes including a whole PAX3 gene.
Gross, C H; Russell, R L; Rohrmann, G F
1994-05-01
To investigate the regulation of p10 and polyhedron envelope protein (PEP) gene expression and their role in polyhedron development, Orgyia pseudotsugata multinucleocapsid nuclear polyhedrosis viruses lacking these genes were constructed. Recombinant viruses were produced, in which the p10 gene, the PEP gene or both genes were disrupted with the beta-glucuronidase (GUS) or beta-galactosidase (lacZ) genes. GUS activity under the control of the PEP protein promoter was observed later in infection and its maximal expression was less than 10% the level for p10 promoter-GUS constructs. Tissues from O. pseudotsugata larvae infected with these recombinants were examined by electron microscopy. Cells from insects infected with the p10- viruses lacked p10-associated fibrillar structures, but fragments of polyhedron envelope-like structures were observed on the surface of some polyhedra. Immunogold labelling of cells infected with the p10-GUS+ virus with an antibody directed against PEP showed that the PEP was concentrated at the surface of polyhedra. Although polyhedra produced by p10 and PEP gene deletion mutants demonstrated what appeared to be a polyhedron envelope by transmission electron microscopy, scanning electron microscopy showed that they had irregular, pitted surfaces that were different from wild-type polyhedra. These data suggested that both p10 and PEP are important for the proper formation of the periphery of polyhedra.
Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L
1980-01-01
The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes are identical to those in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs: a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA, and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in a modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clones h19 and h22 are compared, areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone gene. PMID:7443547
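The three explicitly spelled-out motifs can be located in a sequence with regular expressions, reading Py as a pyrimidine (C or T) and Pu as a purine (A or G). The Python sketch below is a minimal illustration: the toy sequence and the function name are assumptions, and exact matching will miss the modified versions of the motifs that the abstract mentions.

```python
import re

# Motifs quoted in the abstract; Py = [CT], Pu = [AG].
MOTIFS = {
    "GATCC": "GATCC",
    "Hogness box": "GTATAAATAG",
    "PyCATTCPu": "[CT]CATTC[AG]",
}


def find_motifs(sequence):
    """Return {motif name: [0-based start positions]} for exact motif matches."""
    sequence = sequence.upper()
    return {
        name: [m.start() for m in re.finditer(pattern, sequence)]
        for name, pattern in MOTIFS.items()
    }


# Toy 5' flanking sequence (hypothetical, built to contain each motif once).
toy_flank = "TTGATCCAAGTATAAATAGCCTCATTCGAA"
print(find_motifs(toy_flank))
```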
Lei, Yaogeng; Hannoufa, Abdelali; Yu, Peiqiang
2017-01-29
Alfalfa is one of the most important legume forage crops in the world. In spite of its agronomic and nutritive advantages, alfalfa has some limitations in the usage of pasture forage and hay supplement. High rapid degradation of protein in alfalfa poses a risk of rumen bloat to ruminants which could cause huge economic losses for farmers. Coupled with the relatively high lignin content, which impedes the degradation of carbohydrate in rumen, alfalfa has unbalanced and asynchronous degradation ratio of nitrogen to carbohydrate (N/CHO) in rumen. Genetic engineering approaches have been used to manipulate the expression of genes involved in important metabolic pathways for the purpose of improving the nutritive value, forage yield, and the ability to resist abiotic stress. Such gene modification could bring molecular structural changes in alfalfa that are detectable by advanced structural analytical techniques. These structural analyses have been employed in assessing alfalfa forage characteristics, allowing for rapid, convenient and cost-effective analysis of alfalfa forage quality. In this article, we review two major obstacles facing alfalfa utilization, namely poor protein utilization and relatively high lignin content, and highlight genetic studies that were performed to overcome these drawbacks, as well as to introduce other improvements to alfalfa quality. We also review the use of advanced molecular structural analysis in the assessment of alfalfa forage for its potential usage in quality selection in alfalfa breeding. PMID:28146083
Bracco, Mariana; Cascales, Jimena; Hernández, Julián Cámara; Poggio, Lidia; Gottlieb, Alexandra M; Lia, Verónica V
2016-08-26
Maize landraces from South America have traditionally been assigned to two main categories: Andean and Tropical Lowland germplasm. However, the genetic structure and affiliations of the lowland gene pools have been difficult to assess due to limited sampling and the lack of comparative analysis. Here, we examined SSR and Adh2 sequence variation in a diverse sample of maize landraces from lowland middle South America, and performed a comprehensive integrative analysis of population structure and diversity including already published data of archaeological and extant specimens from the Americas. Geographic distribution models were used to explore the relationship between environmental factors and the observed genetic structure. Bayesian and multivariate analyses of population structure showed the existence of two previously overlooked lowland gene pools associated with Guaraní indigenous communities of middle South America. The singularity of this germplasm was also evidenced by the frequency distribution of microsatellite repeat motifs of the Adh2 locus and the distinct spatial pattern inferred from geographic distribution models. Our results challenge the prevailing view that lowland middle South America is just a contact zone between Andean and Tropical Lowland germplasm and highlight the occurrence of a unique, locally adapted gene pool. This information is relevant for the conservation and utilization of maize genetic resources, as well as for a better understanding of environment-genotype associations.
Chen, Zhong-Yuan; Gao, Xiao-Chan; Zhang, Qi-Ya
2015-08-03
Aquareoviruses are serious pathogens of aquatic animals. Here, genome characterization and functional gene analysis of a novel aquareovirus, largemouth bass Micropterus salmoides reovirus (MsReV), are described. It comprises 11 dsRNA segments (S1-S11) covering 24,024 bp, and encodes 12 putative proteins including the inclusion forming-related protein NS87 and the fusion-associated small transmembrane (FAST) protein NS22. The function of NS22 was confirmed by expression in fish cells. Subsequently, MsReV was compared with two representative aquareoviruses, saltwater fish turbot Scophthalmus maximus reovirus (SMReV) and freshwater fish grass carp reovirus strain 109 (GCReV-109). MsReV NS87 and NS22 genes have the same structure and function as those of SMReV, whereas GCReV-109 is missing either the coiled-coil region in NS79 or the gene encoding NS22. Significant similarities are also revealed among equivalent genome segments between MsReV and SMReV, but a difference is found between MsReV and GCReV-109. Furthermore, phylogenetic analysis showed that 13 aquareoviruses could be divided into freshwater and saline environment subgroups, and MsReV was closely related to SMReV in saline environments. Consequently, these viruses from hosts in saline environments have more genomic structural similarities than the viruses from hosts in freshwater. This is the first study of the relationships between aquareovirus genomic structure and their host environments.
Alu Elements as Novel Regulators of Gene Expression in Type 1 Diabetes Susceptibility Genes?
Kaur, Simranjeet; Pociot, Flemming
2015-07-13
Despite numerous studies implicating Alu repeat elements in various diseases, there is sparse information available with respect to the potential functional and biological roles of the repeat elements in Type 1 diabetes (T1D). Therefore, we performed a genome-wide sequence analysis of T1D candidate genes to identify embedded Alu elements within these genes. We observed significant enrichment of Alu elements within the T1D genes (p-value < 10e-16), which highlights their importance in T1D. Functional annotation of T1D genes harboring Alus revealed significant enrichment for immune-mediated processes (p-value < 10e-6). We also identified eight T1D genes harboring inverted Alus (IRAlus) within their 3' untranslated regions (UTRs) that are known to regulate the expression of host mRNAs by generating double stranded RNA duplexes. Our in silico analysis predicted the formation of duplex structures by IRAlus within the 3'UTRs of T1D genes. We propose that IRAlus might be involved in regulating the expression levels of the host T1D genes.
Cheng, Hongtao; Hao, Mengyu; Wang, Wenxiang; Mei, Desheng; Tong, Chaobo; Wang, Hui; Liu, Jia; Fu, Li; Hu, Qiong
2016-09-08
SBP-box genes belong to one of the largest families of transcription factors. Though members of this family have been characterized to be important regulators of diverse biological processes, information of SBP-box genes in the third most important oilseed crop Brassica napus is largely undefined. In the present study, by whole genome bioinformatics analysis and transcriptional profiling, 58 putative members of SBP-box gene family in oilseed rape (Brassica napus L.) were identified and their expression pattern in different tissues as well as possible interaction with miRNAs were analyzed. In addition, B. napus lines with contrasting branch angle were used for investigating the involvement of SBP-box genes in plant architecture regulation. Detailed gene information, including genomic organization, structural feature, conserved domain and phylogenetic relationship of the genes were systematically characterized. By phylogenetic analysis, BnaSBP proteins were classified into eight distinct groups representing the clear orthologous relationships to their family members in Arabidopsis and rice. Expression analysis in twelve tissues including vegetative and reproductive organs showed different expression patterns among the SBP-box genes and a number of the genes exhibit tissue specific expression, indicating their diverse functions involved in the developmental process. Forty-four SBP-box genes were ascertained to contain the putative miR156 binding site, with 30 and 14 of the genes targeted by miR156 at the coding and 3'UTR region, respectively. Relative expression level of miR156 is varied across tissues. Different expression pattern of some BnaSBP genes and the negative correlation of transcription levels between miR156 and its target BnaSBP gene were observed in lines with different branch angle. Taken together, this study represents the first systematic analysis of the SBP-box gene family in Brassica napus. 
The data presented here provide a foundation for understanding the crucial roles of BnaSBP genes in plant development and other biological processes.
Active bacterial community structure along vertical redox gradients in Baltic Sea sediment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jansson, Janet; Edlund, Anna; Hardeman, Fredrik
Community structures of active bacterial populations were investigated along a vertical redox profile in coastal Baltic Sea sediments by terminal-restriction fragment length polymorphism (T-RFLP) and clone library analysis. According to correspondence analysis of T-RFLP results and sequencing of cloned 16S rRNA genes, the microbial community structures at three redox depths (179 mV, -64 mV and -337 mV) differed significantly. The bacterial communities in the community DNA differed from those in bromodeoxyuridine (BrdU)-labeled DNA, indicating that the growing members of the community that incorporated BrdU were not necessarily the most dominant members. The structures of the actively growing bacterial communities were most strongly correlated to organic carbon followed by total nitrogen and redox potentials. Bacterial identification by sequencing of 16S rRNA genes from clones of BrdU-labeled DNA and DNA from reverse transcription PCR (rt-PCR) showed that bacterial taxa involved in nitrogen and sulfur cycling were metabolically active along the redox profiles. Several sequences had low similarities to previously detected sequences indicating that novel lineages of bacteria are present in Baltic Sea sediments. Also, a high number of different 16S rRNA gene sequences representing different phyla were detected at all sampling depths.
Igawa, T; Oumi, S; Katsuren, S; Sumida, M
2013-01-01
Isolation by distance and landscape connectivity are fundamental factors underlying speciation and evolution. To understand how landscapes affect gene flow and shape population structures, island species provide intrinsic study objects. We investigated the effects of landscapes on the population structure of the endangered frog species, Odorrana ishikawae and O. splendida, which each inhabit an island in southwest Japan. This was done by examining population structure, gene flow and demographic history of each species by analyzing 12 microsatellite loci and exploring causal environmental factors through ecological niche modeling (ENM) and the cost-distance approach. Our results revealed that the limited gene flow and multiple-population structure in O. splendida and the single-population structure in O. ishikawae were maintained after divergence of the species through ancient vicariance between islands. We found that genetic distance correlated with geographic distance between populations of both species. Our landscape genetic analysis revealed that the connectivity of suitable habitats influences gene flow and leads to the formation of specific population structures. In particular, different degrees of topographical complexity between islands are the major determining factor for shaping contrasting population structures of two species. In conclusion, our results illustrate the diversification mechanism of organisms through the interaction with space and environment. Our results also present an ENM approach for identifying the key factors affecting demographic history and population structures of target species, especially endangered species. PMID:22990312
Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut
2014-01-01
Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. 
Among those, several transcriptional targets and upstream regulators showed differential expression between the contrasting morphotypes. Interestingly, although selected network genes showed overlapping expression patterns in situ and no morph differences, Timp2 expression patterns differed between morphs. Our comparative study of transcriptional dynamics in divergent craniofacial morphologies of Arctic charr revealed a conserved network of coexpressed genes sharing functional roles in structural morphogenesis. We also implicate transcriptional regulators of the network as targets for future functional studies.
Analysis of the aac(3)-VIa gene encoding a novel 3-N-acetyltransferase.
Rather, P N; Mann, P A; Mierzwa, R; Hare, R S; Miller, G H; Shaw, K J
1993-01-01
Biochemical analysis (G. A. Papanicolaou, R. S. Hare, R. Mierzwa, and G. H. Miller, abstr. 152, Program Abstr. 29th Intersci. Conf. Antimicrob. Agents Chemother., 1989) demonstrated the presence of a novel 3-N-acetyltransferase in Enterobacter cloacae 88020217. This organism was resistant to gentamicin, and the MIC of 2'-N-ethylnetilmicin for it was fourfold lower than that of 6'-N-ethylnetilmicin, a resistance pattern which suggested 2'-acetylating activity. However, high-pressure liquid chromatography analysis demonstrated that the enzyme acetylated sisomicin in the 3 position. We have cloned the structural gene for this enzyme from a large (> 70-kb) conjugative plasmid present in E. cloacae. Subcloning experiments have localized the aac(3)-VIa gene to a 2.1-kb Sau3A fragment. The deduced AAC(3)-VIa protein showed 48% amino acid identity to the AAC(3)-IIa protein and 39% identity to the AAC(3)-VII protein. Examination of the 5'-flanking sequences demonstrated that the aac(3)-VIa gene was located 167 bp downstream of the aadA1 gene and was present in an integron. In addition, the aac(3)-VIa gene is also downstream of a 59-base element often seen in an integron environment. Primer extension analysis has identified a promoter for the aac(3)-VIa gene downstream of both the aadA1 gene and a 59-base element. PMID:8257126
Chao, Yuanqing; Ma, Liping; Yang, Ying; Ju, Feng; Zhang, Xu-Xiang; Wu, Wei-Min; Zhang, Tong
2013-12-19
The metagenomic approach was applied to characterize variations of microbial structure and functions in raw (RW) and treated water (TW) in a drinking water treatment plant (DWTP) at Pearl River Delta, China. Microbial structure was significantly influenced by the treatment processes, shifting from Gammaproteobacteria and Betaproteobacteria in RW to Alphaproteobacteria in TW. Further functional analysis indicated the basic metabolic functions of microorganisms in TW did not vary considerably. However, protective functions, i.e. glutathione synthesis genes in 'oxidative stress' and 'detoxification' subsystems, significantly increased, revealing the surviving bacteria may have higher chlorine resistance. Similar results were also found in glutathione metabolism pathway, which identified the major reaction for glutathione synthesis and supported more genes for glutathione metabolism existed in TW. This metagenomic study largely enhanced our knowledge about the influences of treatment processes, especially chlorination, on bacterial community structure and protective functions (e.g. glutathione metabolism) in ecosystems of DWTPs.
Genome-wide network analysis of Wnt signaling in three pediatric cancers
NASA Astrophysics Data System (ADS)
Bao, Ju; Lee, Ho-Jin; Zheng, Jie J.
2013-10-01
Genomic structural alteration is common in pediatric cancers, and analysis of data generated by the Pediatric Cancer Genome Project reveals such tumor-related alterations in many Wnt signaling-associated genes. Most pediatric cancers are thought to arise within developing tissues that undergo substantial expansion during early organ formation, growth and maturation, and Wnt signaling plays an important role in this development. We examined three pediatric tumors--medulloblastoma, early T-cell precursor acute lymphoblastic leukemia, and retinoblastoma--that show multiple genomic structural variations within Wnt signaling pathways. We mathematically modeled this pathway to investigate the effects of cancer-related structural variations on Wnt signaling. Surprisingly, we found that an outcome measure of canonical Wnt signaling was consistently similar in matched cancer cells and normal cells, even in the context of different cancers, different mutations, and different Wnt-related genes. Our results suggest that the cancer cells maintain a normal level of Wnt signaling by developing multiple mutations.
Diversification of Root Hair Development Genes in Vascular Plants.
Huang, Ling; Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui; Schiefelbein, John
2017-07-01
The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. © 2017 American Society of Plant Biologists. All Rights Reserved.
Diversification of Root Hair Development Genes in Vascular Plants
Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui
2017-01-01
The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. PMID:28487476
Shang, Haihong; Li, Wei; Zou, Changsong; Yuan, Youlu
2013-07-01
NAC domain proteins are plant-specific transcription factors known to play diverse roles in various plant developmental processes. In the present study, we performed the first comprehensive study of the NAC gene family in Gossypium raimondii Ulbr., incorporating phylogenetic, chromosomal location, gene structure, conserved motif, and expression profiling analyses. We identified 145 NAC transcription factor (NAC-TF) genes that were phylogenetically clustered into 18 distinct subfamilies. Of these, 127 NAC-TF genes were distributed across the 13 chromosomes, 80 (55%) were preferentially retained duplicates located in both duplicated regions and six were located in triplicated chromosomal regions. The majority of NAC-TF genes showed temporal-, spatial-, and tissue-specific expression patterns based on transcriptomic and qRT-PCR analyses. However, the expression patterns of several duplicate genes were partially redundant, suggesting the occurrence of sub-functionalization during their evolution. Based on their genomic organization, we concluded that genomic duplications contributed significantly to the expansion of the NAC-TF gene family in G. raimondii. Comprehensive analysis of their expression profiles could provide novel insights into the functional divergence among members of the NAC gene family in G. raimondii. © 2013 Institute of Botany, Chinese Academy of Sciences.
Novel mechanism of conjoined gene formation in the human genome.
Kim, Ryong Nam; Kim, Aeri; Choi, Sang-Haeng; Kim, Dae-Soo; Nam, Seong-Hyeuk; Kim, Dae-Won; Kim, Dong-Wook; Kang, Aram; Kim, Min-Young; Park, Kun-Hyang; Yoon, Byoung-Ha; Lee, Kang Seon; Park, Hong-Seog
2012-03-01
Recently, conjoined genes (CGs) have emerged as important genetic factors necessary for understanding the human genome. However, their formation mechanism and precise structures have remained mysterious. Based on a detailed structural analysis of 57 human CG transcript variants (CGTVs, discovered in this study) and all (833) known CGs in the human genome, we discovered that the poly(A) signal site from the upstream parent gene region is completely removed via the skipping or truncation of the final exon; consequently, CG transcription is terminated at the poly(A) signal site of the downstream parent gene. This result led us to propose a novel mechanism of CG formation: the complete removal of the poly(A) signal site from the upstream parent gene is a prerequisite for the CG transcriptional machinery to continue transcribing uninterrupted into the intergenic region and downstream parent gene. The removal of the poly(A) signal sequence from the upstream gene region appears to be caused by a deletion or truncation mutation in the human genome rather than post-transcriptional trans-splicing events. With respect to the characteristics of CG sequence structures, we found that intergenic regions are hot spots for novel exon creation during CGTV formation and that exons farther from the intergenic regions are more highly conserved in the CGTVs. Interestingly, many novel exons newly created within the intergenic and intragenic regions originated from transposable element sequences. Additionally, the CGTVs showed tumor tissue-biased expression. In conclusion, our study provides novel insights into the CG formation mechanism and expands the present concepts of the genetic structural landscape, gene regulation, and gene formation mechanisms in the human genome.
Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun
2013-01-01
The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets generated by RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in terms of gene expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes that are specific to the cancer group but not the normal group) are more highly and more variably expressed in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand which cell type-specific patterns of gene expression differ among different cell types, and particularly for cancer. PMID:23382867
The structure of a gene co-expression network reveals biological functions underlying eQTLs.
Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali
2013-01-01
What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and for inferring knowledge for yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
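As a rough illustration of why a co-expression network might be built on partial rather than plain correlation, the following minimal sketch computes a first-order partial correlation between two genes while controlling for a third. The expression vectors and the names `gene_x`, `gene_y` and `regulator` are invented for illustration; the study itself presumably conditions on many genes at once, which this three-variable formula does not attempt.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical expression levels across eight samples (illustrative only).
# Both genes follow the regulator plus an independent perturbation.
regulator = [1, 2, 3, 4, 5, 6, 7, 8]
gene_x = [2, 1, 4, 3, 6, 5, 8, 7]
gene_y = [2, 3, 2, 3, 6, 7, 6, 7]

print(round(pearson(gene_x, gene_y), 3))
print(round(partial_correlation(gene_x, gene_y, regulator), 3))
```

In this toy data the two genes look strongly co-expressed under plain correlation, but the association largely disappears once the shared regulator is controlled for, so an edge between them would be dropped in a partial-correlation network.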
The drug target genes show higher evolutionary conservation than non-target genes.
Lv, Wenhua; Xu, Yongdeng; Guo, Yiying; Yu, Ziqi; Feng, Guanglong; Liu, Panpan; Luan, Meiwei; Zhu, Hongjie; Liu, Guiyou; Zhang, Mingming; Lv, Hongchao; Duan, Lian; Shang, Zhenwei; Li, Jin; Jiang, Yongshuai; Zhang, Ruijie
2016-01-26
Although evidence indicates that drug target genes share some common evolutionary features, there have been few studies analyzing evolutionary features of drug targets at an overall level. Therefore, we conducted an analysis which aimed to investigate the evolutionary characteristics of drug target genes. We compared the evolutionary conservation between human drug target genes and non-target genes by combining both the evolutionary features and network topological properties in the human protein-protein interaction network. The evolution rate, conservation score and the percentage of orthologous genes of 21 species were included in our study. Meanwhile, four topological features including the average shortest path length, betweenness centrality, clustering coefficient and degree were considered for comparison analysis. We obtained four results: compared with non-drug target genes, 1) drug target genes had lower evolutionary rates; 2) drug target genes had higher conservation scores; 3) drug target genes had higher percentages of orthologous genes and 4) drug target genes had a tighter network structure including higher degrees, betweenness centrality, clustering coefficients and lower average shortest path lengths. These results demonstrate that drug target genes are more evolutionarily conserved than non-drug target genes. We hope that our study will provide valuable information for other researchers who are interested in evolutionary conservation of drug targets.
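Three of the four topological features named above can be sketched on a toy undirected graph as follows. This is only an illustrative sketch: the five-node graph `toy` and all function names are hypothetical, the graph is assumed to be connected and unweighted, and betweenness centrality is omitted to keep the example short.

```python
from collections import deque

def degree(graph, node):
    """Number of direct interaction partners of a node."""
    return len(graph[node])

def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    neighbours = list(graph[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbours[j] in graph[neighbours[i]]
    )
    return 2.0 * links / (k * (k - 1))

def average_shortest_path_length(graph, node):
    """Mean breadth-first-search distance from a node to every reachable node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others) if others else 0.0

# Hypothetical interaction network stored as an adjacency map.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

for name in ("A", "E"):
    print(name, degree(toy, name),
          round(clustering_coefficient(toy, name), 3),
          average_shortest_path_length(toy, name))
```

In this toy graph, the hub-like node A has a higher degree, a higher clustering coefficient and a shorter average path to the other nodes than the peripheral node E, which is the direction of difference the abstract reports for drug target genes.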
Yang, Zefeng; Gu, Shiliang; Wang, Xuefeng; Li, Wenjuan; Tang, Zaixiang; Xu, Chenwu
2008-09-01
CPP-like genes are members of a small family which features the existence of two similar Cys-rich domains termed CXC domains in their protein products and are distributed widely in plants and animals but do not exist in yeast. The members of this family in plants play an important role in development of reproductive tissue and control of cell division. To gain insights into how CPP-like genes evolved in plants, we conducted a comparative phylogenetic and molecular evolutionary analysis of the CPP-like gene family in Arabidopsis and rice. The results of phylogeny revealed that both gene loss and species-specific expansion contributed to the evolution of this family in Arabidopsis and rice. Both intron gain and intron loss were observed through intron/exon structure analysis for duplicated genes. Our results also suggested that positive selection was a major force during the evolution of CPP-like genes in plants, and most amino acid residues under positive selection were disproportionately located in the region outside the CXC domains. Further analysis revealed that two CXC domains and sequences connecting them might have coevolved during the long evolutionary period.
Ye, Jianqiu; Yang, Hai; Shi, Haitao; Wei, Yunxie; Tie, Weiwei; Ding, Zehong; Yan, Yan; Luo, Ying; Xia, Zhiqiang; Wang, Wenquan; Peng, Ming; Li, Kaimian; Zhang, He; Hu, Wei
2017-11-02
Mitogen-activated protein kinase kinase kinases (MAPKKKs), an important unit of the MAPK cascade, play crucial roles in plant development and response to various stresses. However, little is known concerning the MAPKKK family in the important subtropical and tropical crop cassava. In this study, 62 MAPKKK genes were identified in the cassava genome, and were classified into 3 subfamilies based on phylogenetic analysis. Most MAPKKKs in the same subfamily shared similar gene structures and conserved motifs. The comprehensive transcriptome analysis showed that MAPKKK genes participated in tissue development and response to drought stress. Comparative expression profiles revealed that many MAPKKK genes were activated in cultivated varieties SC124 and Arg7 and the function of MeMAPKKKs in drought resistance may be different between SC124/Arg7 and W14. Expression analyses of the 7 selected MeMAPKKK genes showed that most of them were significantly upregulated by osmotic, salt and ABA treatments, whereas they were only slightly induced by H2O2 and cold stresses. Taken together, this study identified candidate MeMAPKKK genes for genetic improvement of abiotic stress resistance and provided new insights into MAPKKK-mediated cassava resistance to drought stress.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopal, B.; Madan, Lalima L.; Betz, Stephen F.
2010-11-10
Common structural motifs, such as the cupin domains, are found in enzymes performing different biochemical functions while retaining a similar active site configuration and structural scaffold. The soil bacterium Bacillus subtilis has 20 cupin genes (0.5% of the total genome) with up to 14% of its genes in the form of doublets, thus making it an attractive system for studying the effects of gene duplication. There are four bicupins in B. subtilis encoded by the genes yvrK, yoaN, yxaG, and ywfC. The gene products of yvrK and yoaN function as oxalate decarboxylases with a manganese ion at the active site(s), whereas YwfC is a bacitracin synthetase. Here we present the crystal structure of YxaG, a novel iron-containing quercetin 2,3-dioxygenase with one active site in each cupin domain. YxaG is a dimer, both in solution and in the crystal. The crystal structure shows that the coordination geometry of the Fe ion is different in the two active sites of YxaG. Replacement of the iron at the active site with other metal ions suggests modulation of enzymatic activity in accordance with the Irving-Williams observation on the stability of metal ion complexes. This observation, along with a comparison with the crystal structure of YvrK determined recently, has allowed for a detailed structure-function analysis of the active site, providing clues to the diversification of function in the bicupin family of proteins.
Genome-wide identification and analysis of the MADS-box gene family in apple.
Tian, Yi; Dong, Qinglong; Ji, Zhirui; Chi, Fumei; Cong, Peihua; Zhou, Zongshan
2015-01-25
The MADS-box gene family is one of the most widely studied families in plants and has diverse developmental roles in flower pattern formation, gametophyte cell division and fruit differentiation. Although the genome-wide analysis of this family has been performed in some species, little is known regarding MADS-box genes in apple (Malus domestica). In this study, 146 MADS-box genes were identified in the apple genome and were phylogenetically clustered into six subgroups (MIKC(c), MIKC*, Mα, Mβ, Mγ and Mδ) with the MADS-box genes from Arabidopsis and rice. The predicted apple MADS-box genes were distributed across all 17 chromosomes at different densities. Additionally, the MADS-box domain, exon length, gene structure and motif compositions of the apple MADS-box genes were analysed. Moreover, the expression of all of the apple MADS-box genes was analysed in the root, stem, leaf, flower tissues and five stages of fruit development. All of the apple MADS-box genes, with the exception of some genes in each group, were expressed in at least one of the tissues tested, which indicates that the MADS-box genes are involved in various aspects of the physiological and developmental processes of the apple. To the best of our knowledge, this report describes the first genome-wide analysis of the apple MADS-box gene family, and the results should provide valuable information for understanding the classification, cloning and putative functions of this family.
Sato, Mitsuharu; Miyazaki, Kentaro
2017-01-01
Horizontal gene transfer (HGT) is a ubiquitous genetic event in bacterial evolution, but it seldom occurs for genes involved in highly complex supramolecules (or biosystems), which consist of many gene products. The ribosome is one such supramolecule, but several bacteria harbor dissimilar and/or chimeric 16S rRNAs in their genomes, suggesting the occurrence of HGT of this gene. However, we know little about whether the genes actually experience HGT and, if so, the frequency of such a transfer. This is primarily because the methods currently employed for phylogenetic analysis (e.g., neighbor-joining, maximum likelihood, and maximum parsimony) of 16S rRNA genes assume point mutation-driven tree-shape evolution as an evolutionary model, which is intrinsically inappropriate to decipher the evolutionary history for genes driven by recombination. To address this issue, we applied a phylogenetic network analysis, which has been used previously for detection of genetic recombination in homologous alleles, to the 16S rRNA gene. We focused on the genus Enterobacter, whose phylogenetic relationships inferred by multi-locus sequence alignment analysis and 16S rRNA sequences are incompatible. All 10 complete genomic sequences were retrieved from the NCBI database, in which 71 16S rRNA genes were included. Neighbor-joining analysis demonstrated that the genes residing in the same genomes clustered, indicating the occurrence of intragenomic recombination. However, as suggested by the low bootstrap values, evolutionary relationships between the clusters were uncertain. We then applied phylogenetic network analysis to representative sequences from each cluster. We found three ancestral 16S rRNA groups; the others were likely created through recursive recombination between the ancestors and chimeric descendants. Despite the large sequence changes caused by the recombination events, the RNA secondary structures were conserved. 
Successive intergenomic and intragenomic recombination thus shaped the evolution of 16S rRNA genes in the genus Enterobacter. PMID:29180992
Zhou, Wei; Song, Xiang-gang; Chen, Chao; Wang, Shu-mei; Liang, Sheng-wang
2015-08-01
This paper discusses the mechanism of action and material basis of compound Danshen dripping pills in the treatment of carotid atherosclerosis, based on gene expression profiles and molecular fingerprints. First, gene expression profiles of human atherosclerotic carotid artery tissues and histologically normal tissues were collected and screened using significance analysis of microarrays (SAM) to identify differentially expressed genes. The differential genes were then analyzed by Gene Ontology (GO) analysis and KEGG pathway analysis. To avoid missing genes whose differential expression is modest but which are biologically important, Gene Set Enrichment Analysis (GSEA) was performed, and 7 chemical ingredients with higher negative enrichment scores were obtained by the Cmap method, implying that they could reversely regulate the gene expression profiles of pathological tissues. Finally, based on the hypothesis that similar structures have similar activities, 336 ingredients of compound Danshen dripping pills were compared with the 7 drug molecules using a 2D molecular fingerprint method. The results showed that 147 differential genes, including 60 up-regulated genes and 87 down-regulated genes, were screened out by SAM. In GO analysis, Biological Process (BP) terms were mainly concerned with biological adhesion, response to wounding and inflammatory response; Cellular Component (CC) terms were mainly concerned with extracellular region, extracellular space and plasma membrane; and Molecular Function (MF) terms were mainly concerned with antigen binding, metalloendopeptidase activity and peptide binding. KEGG pathway analysis mainly highlighted the JAK-STAT, RIG-I-like receptor and PPAR signaling pathways. There were 10 compounds, such as hexadecane, with Tanimoto coefficients greater than 0.85, which implied that they may be the active ingredients (AIs) of compound Danshen dripping pills in the treatment of carotid atherosclerosis (CAs). 
The present method can be applied to the research on material base and molecular action mechanism of TCM.
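The fingerprint comparison step described in this abstract can be illustrated with a minimal sketch. It assumes each 2D fingerprint is represented as a set of "on" bit indices; the function names, the toy fingerprints and the helper `similar_ingredients` are hypothetical and are not taken from the paper, which does not specify its software. Only the 0.85 threshold comes from the abstract.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similar_ingredients(ingredients: dict, references: dict, threshold: float = 0.85):
    """Return (name, best score) for ingredients whose best match exceeds the threshold."""
    hits = []
    for name, fp in ingredients.items():
        best = max(tanimoto(fp, ref_fp) for ref_fp in references.values())
        if best > threshold:
            hits.append((name, best))
    # Highest-scoring candidates first
    return sorted(hits, key=lambda item: -item[1])


# Toy data (hypothetical fingerprints, for illustration only)
references = {"drug_1": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}
ingredients = {
    "ingredient_a": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},  # identical to drug_1
    "ingredient_b": {1, 2, 3, 20, 21},                # little overlap
}
candidates = similar_ingredients(ingredients, references)
```

With these toy inputs only `ingredient_a` passes the threshold; a real analysis would compute fingerprints with a cheminformatics toolkit rather than hand-written sets.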
Two-component signal transduction systems of Xanthomonas spp.: a lesson from genomics.
Qian, Wei; Han, Zhong-Ji; He, Chaozu
2008-02-01
The two-component signal transduction systems (TCSTSs), consisting of a histidine kinase sensor (HK) and a response regulator (RR), are the dominant molecular mechanisms by which prokaryotes sense and respond to environmental stimuli. Genomes of Xanthomonas generally contain a large repertoire of TCSTS genes (approximately 92 to 121 for each genome), which encode diverse structural groups of HKs and RRs. Among them, although a core set of 70 TCSTS genes (about two-thirds in total) which accumulates point mutations with a slow rate are shared by these genomes, the other genes, especially hybrid HKs, experienced extensive genetic recombination, including genomic rearrangement, gene duplication, addition or deletion, and fusion or fission. The recombinations potentially promote the efficiency and complexity of TCSTSs in regulating gene expression. In addition, our analysis suggests that a co-evolutionary model, rather than a selfish operon model, is the major mechanism for the maintenance and microevolution of TCSTS genes in the genomes of Xanthomonas. Genomic annotation, secondary protein structure prediction, and comparative genomic analyses of TCSTS genes reviewed here provide insights into our understanding of signal networks in these important phytopathogenic bacteria.
Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki
2010-01-01
A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total 612 kb of the genome comprise group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057
A web application for automatic prediction of gene translation elongation efficiency.
Sokolov, Vladimir S; Zuraev, Bulat S; Lashin, Sergei A; Matushkin, Yury G
2015-03-01
Expression efficiency is one of the major characteristics describing genes in various modern investigations. Expression efficiency of genes is regulated at various stages: transcription, translation, posttranslational protein modification and others. In this study, a dedicated web application, EloE (Elongation Efficiency), is described. EloE sorts an organism's genes in descending order of their theoretical rate of translation elongation, based on analysis of their nucleotide sequences. The theoretical data obtained correlate significantly with available experimental data on gene expression in various organisms. In addition, the program identifies preferential codons in the organism's genes and determines the distribution of potential secondary structure energy in the 5´ and 3´ regions of mRNA. EloE can be useful for preliminary estimation of translation elongation efficiency for genes for which experimental data are not yet available. Some results can be used, for instance, in other programs that model artificial genetic structures in genetic engineering experiments. The EloE web application is available at http://www-bionet.sscc.ru:7780/EloE.
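As a rough illustration of what identifying preferential codons involves, the sketch below counts codons in a coding sequence and computes the relative usage of synonymous codons. This is not EloE's algorithm, which the abstract does not describe; the function names and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, ignoring any trailing partial codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def relative_usage(counts: Counter, synonymous: list) -> dict:
    """Fraction of each codon among a set of synonymous codons (0.0 if none occur)."""
    total = sum(counts[codon] for codon in synonymous)
    if total == 0:
        return {codon: 0.0 for codon in synonymous}
    return {codon: counts[codon] / total for codon in synonymous}


# Toy coding sequence: Met-Ala-Ala-Ala (hypothetical, for illustration only)
counts = codon_counts("ATGGCTGCTGCC")
alanine = relative_usage(counts, ["GCT", "GCC", "GCA", "GCG"])
preferred = max(alanine, key=alanine.get)  # most frequently used alanine codon
```

In this toy case GCT accounts for two of the three alanine codons and is reported as preferred. A genome-wide analysis would pool counts over many genes before ranking codons.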
Chen, Wen; Si, Guo-Yang; Zhao, Gang; Abdullah, Muhammad; Guo, Ning; Li, Da-Hui; Sun, Xu; Cai, Yong-Ping; Lin, Yi; Gao, Jun-Shan
2018-05-05
Plant P-type H⁺-ATPase (P-ATPase) is a plasma membrane protein that plays an important role in transmembrane transport in plant cells. To understand the variety and quantity of P-ATPase proteins in different cotton species, we combined four databases from two diploid cotton species (Gossypium raimondii and G. arboreum) and two tetraploid cotton species (G. hirsutum and G. barbadense) to screen the P-ATPase gene family and resolved the evolutionary relationships among these cotton species. We identified 53, 51, 99 and 98 P-ATPase genes from G. arboreum, G. raimondii, G. barbadense and G. hirsutum, respectively. The structural and phylogenetic analyses revealed that the gene structure was consistent between P-ATPase genes, with a close evolutionary relationship. The expression analysis of P-ATPase genes showed that many P-ATPase genes were highly expressed in various tissues and at different fiber developmental stages in G. hirsutum, suggesting that they have potential functions during growth and fiber development in cotton.
Ma, Jinxing; Wang, Zhiwei; Li, Huan; Park, Hee-Deung; Wu, Zhichao
2016-06-01
Metagenomic sequencing was used to investigate the microbial structures, functional potentials, and biofouling-related genes in a membrane bioreactor (MBR). The results showed that the microbial community in the MBR was highly diverse. Notably, function analysis of the dominant genera indicated that common genes from different phylotypes were identified for important functional potentials with the observation of variation of abundances of genes in a certain taxon (e.g., Dechloromonas). Despite maintaining similar metabolic functional potentials with a parallel full-scale conventional activated sludge (CAS) system due to treating the identical wastewater, the MBR had more abundant nitrification-related bacteria and coding genes of ammonia monooxygenase, which could well explain its excellent ammonia removal in the low-temperature period. Furthermore, according to quantification of the genes involved in exopolysaccharide and extracellular polymeric substance (EPS) protein metabolism, the MBR did not show a much different potential in producing EPS compared to the CAS system, and bacteria from the membrane biofilm had lower abundances of genes associated with EPS biosynthesis and transport compared to the activated sludge in the MBR.
Differential gene expression patterns in the autogamous plant Hordeum euclaston (Poaceae).
Georg-Kraemer, J E; Ferreira, C A S; Cavalli, S S
2011-02-22
Sib-seedlings of 95 strains of the strictly autogamous grass Hordeum euclaston were analyzed by horizontal polyacrylamide gel electrophoresis for four isoenzyme systems at a specific ontogenetic stage. We found differences in the activity of some genes among individuals of this species. Hence, an ontogenetic analysis was carried out to investigate 12 strains at five ontogenetic stages, to determine the patterns of expression of these genes during development. The differences in the presence versus absence of certain isoenzyme bands may be due to differential regulatory activation in response to environmental differences, as all plants showed the same structural genes, although these genes were active in different tissues and/or times of development. These results indicate the importance of differential gene activation in the metabolic phenotype variability of this strictly autogamous, highly homozygous species. The same structural alleles for isoenzymes showed the active form of the enzymes (phenotypic expression) to be present in different tissues and/or stages of development. Differential isoenzyme gene activation was shown to be directly responsible for the enzymatic variability (metabolic phenotype) presented by the plants, which seem to possess almost no heterozygosis.
Identification and characterization of NF-YB family genes in tung tree.
Yang, Susu; Wang, Yangdong; Yin, Hengfu; Guo, Haobo; Gao, Ming; Zhu, Huiping; Chen, Yicun
2015-12-01
The NF-YB transcription factor gene family encodes a subunit of the CCAAT box-binding factor (CBF), a highly conserved trimeric activator that strongly binds to the CCAAT box promoter element. Studies on model plants have shown that NF-YB proteins participate in important developmental and physiological processes, but little is known about NF-YB proteins in trees. Here, we identified seven NF-YB transcription factor-encoding genes in Vernicia fordii, an important oilseed tree in China. A phylogenetic analysis separated the genes into two groups; non-LEC1 type (VfNF-YB1, 5, 7, 9, 11, 13) and LEC1-type (VfNF-YB 14). A gene structure analysis showed that VfNF-YB 5 has three introns and the other genes have no introns. The seven VfNF-YB sequences contain highly conserved domains, a disordered region at the N terminus, and two long helix structures at the C terminus. Phylogenetic analyses showed that VfNF-YB family genes are highly homologous to GmNF-YB genes, and many of them are closely related to functionally characterized NF-YBs. In expression analyses of various tissues (root, stem, leaf, and kernel) and the root during pathogen infection, VfNF-YB1, 5, and 11 were dominantly expressed in kernels, and VfNF-YB7 and 9 were expressed only in the root. Different VfNF-YB family genes showed different responses to pathogen infection, suggesting that they play different roles in the pathogen response. Together, these findings represent the first extensive evaluation of the NF-YB family in tung tree and provide a foundation for dissecting the functions of VfNF-YB genes in seed development, stress adaption, fatty acid synthesis, and pathogen response.
Xu, Zongda; Zhang, Qixiang; Sun, Lidan; Du, Dongliang; Cheng, Tangren; Pan, Huitang; Yang, Weiru; Wang, Jia
2014-10-01
MADS-box genes encode transcription factors that play crucial roles in plant development, especially in flower and fruit development. To gain insight into this gene family in Prunus mume, an important ornamental and fruit plant in East Asia, and to elucidate their roles in flower organ determination and fruit development, we performed a genome-wide identification, characterisation and expression analysis of MADS-box genes in this Rosaceae tree. In this study, 80 MADS-box genes were identified in P. mume and categorised into MIKC, Mα, Mβ, Mγ and Mδ groups based on gene structures and phylogenetic relationships. The MIKC group could be further classified into 12 subfamilies. The FLC subfamily was absent in P. mume and the six tandemly arranged DAM genes might experience a species-specific evolution process in P. mume. The MADS-box gene family might experience an evolution process from MIKC genes to Mδ genes to Mα, Mβ and Mγ genes. The expression analysis suggests that P. mume MADS-box genes have diverse functions in P. mume development and the functions of duplicated genes diverged after the duplication events. In addition to its involvement in the development of female gametophytes, type I genes also play roles in male gametophytes development. In conclusion, this study adds to our understanding of the roles that the MADS-box genes played in flower and fruit development and lays a foundation for selecting candidate genes for functional studies in P. mume and other species. Furthermore, this study also provides a basis to study the evolution of the MADS-box family.
Pandey, Ashutosh; Misra, Prashant; Alok, Anshu; Kaur, Navneet; Sharma, Shivani; Lakhwani, Deepika; Asif, Mehar H.; Tiwari, Siddharth; Trivedi, Prabodh K.
2016-01-01
The homeodomain zipper family (HD-ZIP) of transcription factors is present only in plants and plays an important role in the regulation of plant-specific processes. Subfamily IV of the HDZ transcription factors (HD-ZIP IV) has primarily been implicated in the regulation of epidermal structure development. Though this gene family is present in all lineages of land plants, its members have not been identified in banana, which is one of the major staple fruit crops. In the present work, we identified 21 HDZIV-encoding genes in banana by computational analysis of the banana genome resource. Our analysis suggested that these genes putatively encode proteins having all the characteristic domains of HDZIV transcription factors. The phylogenetic analysis of the banana HDZIV family genes further suggested that, after separation from a common ancestor, the banana and Poales lineages might have followed distinct evolutionary paths. Further, we conclude that segmental duplication played a major role in the evolution of banana HDZIV-encoding genes. All the identified banana HDZIV genes are expressed in different banana tissues, although at varying levels. Transcripts of some of the banana HDZIV genes were also detected in banana fruit pulp, suggesting a putative role in fruit attributes. A large number of genes of this family showed modulated expression under drought and salinity stress. Taken together, the present work lays a foundation for elucidating the functional aspects of the banana HDZIV-encoding genes and for their possible use in banana improvement programs. PMID:26870050
Genomic survey, expression profile and co-expression network analysis of OsWD40 family in rice
2012-01-01
Background: WD40 proteins represent a large family in eukaryotes and have been implicated in a broad spectrum of crucial functions. Systematic characterization and co-expression analysis of OsWD40 genes enable us to understand the networks of the WD40 proteins and their biological processes and gene functions in rice. Results: In this study, we identify and analyze 200 potential OsWD40 genes in rice, describing the gene structure, genome localization, and evolutionary relationships of each member. Expression profiles covering the whole life cycle of rice revealed that OsWD40 transcripts accumulated differentially during vegetative and reproductive development and were preferentially up- or down-regulated in different tissues. In rice seedlings, 25 OsWD40 genes were differentially expressed under treatment with one or more of the phytohormones NAA, KT, or GA3. We also used a combined analysis of expression correlation and Gene Ontology annotation to infer the biological roles of the OsWD40 genes in rice. The results suggested that OsWD40 genes may perform their diverse functions through a complex network, and were thus predictive for understanding their biological pathways. The analysis also revealed that OsWD40 genes might interact with each other to take part in metabolic pathways, suggesting a more complex feedback network. Conclusions: All of these analyses suggest that the functions of OsWD40 genes are diversified, which provides useful references for selecting candidate genes for further functional studies. PMID:22429805
Hu, Anyi; Jiao, Nianzhi; Zhang, Chuanlun L
2011-10-01
Marine Crenarchaeota represent a widespread and abundant microbial group in marine ecosystems. Here, we investigated the abundance, diversity, and distribution of planktonic Crenarchaeota in the epi-, meso-, and bathypelagic zones at three stations in the South China Sea (SCS) by analysis of crenarchaeal 16S rRNA gene, ammonia monooxygenase gene amoA involved in ammonia oxidation, and biotin carboxylase gene accA putatively involved in archaeal CO(2) fixation. Quantitative PCR analyses indicated that crenarchaeal amoA and accA gene abundances varied similarly with archaeal and crenarchaeal 16S rRNA gene abundances at all stations, except that crenarchaeal accA genes were almost absent in the epipelagic zone. Ratios of the crenarchaeal amoA gene to 16S rRNA gene abundances decreased ~2.6 times from the epi- to bathypelagic zones, whereas the ratios of crenarchaeal accA gene to marine group I crenarchaeal 16S rRNA gene or to crenarchaeal amoA gene abundances increased with depth, suggesting that the metabolism of Crenarchaeota may change from the epi- to meso- or bathypelagic zones. Denaturing gradient gel electrophoresis profiling of the 16S rRNA genes revealed depth partitioning in archaeal community structures. Clone libraries of crenarchaeal amoA and accA genes showed two clusters: the "shallow" cluster was exclusively derived from epipelagic water and the "deep" cluster was from meso- and/or bathypelagic waters, suggesting that niche partitioning may take place between the shallow and deep marine Crenarchaeota. Overall, our results show strong depth partitioning of crenarchaeal populations in the SCS and suggest a shift in their community structure and ecological function with increasing depth.
PRGdb: a bioinformatics platform for plant resistance gene analysis
Sanseverino, Walter; Roma, Guglielmo; De Simone, Marco; Faino, Luigi; Melito, Sara; Stupka, Elia; Frusciante, Luigi; Ercolano, Maria Raffaella
2010-01-01
PRGdb is a web accessible open-source (http://www.prgdb.org) database that represents the first bioinformatic resource providing a comprehensive overview of resistance genes (R-genes) in plants. PRGdb holds more than 16 000 known and putative R-genes belonging to 192 plant species challenged by 115 different pathogens and linked with useful biological information. The complete database includes a set of 73 manually curated reference R-genes, 6308 putative R-genes collected from NCBI and 10463 computationally predicted putative R-genes. Thanks to a user-friendly interface, data can be examined using different query tools. A home-made prediction pipeline called Disease Resistance Analysis and Gene Orthology (DRAGO), based on reference R-gene sequence data, was developed to search for plant resistance genes in public datasets such as Unigene and Genbank. New putative R-gene classes containing unknown domain combinations were discovered and characterized. The development of the PRG platform represents an important starting point to conduct various experimental tasks. The inferred cross-link between genomic and phenotypic information allows access to a large body of information to find answers to several biological questions. The database structure also permits easy integration with other data types and opens up prospects for future implementations. PMID:19906694
Bakera, Beata; Makowska, Bogna; Groszyk, Jolanta; Niziołek, Michał; Orczyk, Wacław; Bolibok-Brągoszewska, Hanna; Hromada-Judycka, Aneta; Rakoczy-Trojanowska, Monika
2015-08-01
Benzoxazinoids (BX) are major secondary metabolites of gramineous plants that play an important role in disease resistance and allelopathy. They also have many other unique properties including anti-bacterial and anti-fungal activity, and the ability to reduce alpha-amylase activity. The biosynthesis and modification of BX are controlled by the genes Bx1 to Bx10, GT and glu, and the majority of these Bx genes have been mapped in maize, wheat and rye. However, the genetic basis of BX biosynthesis remains largely uncharacterized apart from some data from maize and wheat. The aim of this study was to isolate, sequence and characterize five genes (ScBx1, ScBx2, ScBx3, ScBx4 and ScBx5) encoding enzymes involved in the synthesis of DIBOA, an important defense compound of rye. Using a modified 3D procedure of BAC library screening, seven BAC clones containing all of the ScBx genes were isolated and sequenced. Bioinformatic analyses of the resulting contigs were used to examine the structure and other features of these genes, including their promoters, introns and 3'UTRs. Comparative analysis showed that the ScBx genes are similar to those of other Poaceae species, especially to the TaBx genes. The polymorphisms present in both the coding sequences and the non-coding regions of ScBx relative to other Bx genes are predicted to have an impact on the expression, structure and properties of the encoded proteins.
Genetic structure in the Sherpa and neighboring Nepalese populations.
Cole, Amy M; Cox, Sean; Jeong, Choongwon; Petousi, Nayia; Aryal, Dhana R; Droma, Yunden; Hanaoka, Masayuki; Ota, Masao; Kobayashi, Nobumitsu; Gasparini, Paolo; Montgomery, Hugh; Robbins, Peter; Di Rienzo, Anna; Cavalleri, Gianpiero L
2017-01-19
We set out to describe the fine-scale population structure across the Eastern region of Nepal. To date there is relatively little known about the genetic structure of the Sherpa residing in Nepal and their genetic relationship with the Nepalese. We assembled dense genotype data from a total of 1245 individuals representing Nepal and a variety of different populations resident across the greater Himalayan region including Tibet, China, India, Pakistan, Kazakhstan, Uzbekistan, Tajikistan and Kirghizstan. We performed analysis of principal components, admixture and homozygosity. We identified clear substructure across populations resident in the Himalayan arc, with genetic structure broadly mirroring geographical features of the region. Ethnic subgroups within Nepal show distinct genetic structure, on both admixture and principal component analysis. We detected differential proportions of ancestry from northern Himalayan populations across Nepalese subgroups, with the Nepalese Rai, Magar and Tamang carrying the greatest proportions of Tibetan ancestry. We show that populations dwelling on the Himalayan plateau have had a clear impact on the Northern Indian gene pool. We illustrate how the Sherpa are a remarkably isolated population, with little gene flow from surrounding Nepalese populations.
Structure-Based Annotation of a Novel Sugar Isomerase from the Pathogenic E. coli O157:H7
DOE Office of Scientific and Technical Information (OSTI.GOV)
van Staalduinen, L.; Park, C; Yeom, S
2010-01-01
Prokaryotes can use a variety of sugars as carbon sources in order to provide a selective survival advantage. The gene z5688 found in the pathogenic Escherichia coli O157:H7 encodes a 'hypothetical' protein of unknown function. Sequence analysis identified the gene product as a putative member of the cupin superfamily of proteins, but no other functional information was known. We have determined the crystal structure of the Z5688 protein at 1.6 Å resolution and identified the protein as a novel E. coli sugar isomerase (EcSI) through overall fold analysis and secondary-structure matching. Extensive substrate screening revealed that EcSI is capable of acting on D-lyxose and D-mannose. The complex structure of EcSI with fructose allowed the identification of key active-site residues, and mutagenesis confirmed their importance. The structure of EcSI also suggested a novel mechanism for substrate binding and product release in a cupin sugar isomerase. Supplementation of a nonpathogenic E. coli strain with EcSI enabled cell growth on the rare pentose D-lyxose.
Batra, Ritu; Saripalli, Gautam; Mohan, Amita; Gupta, Saurabh; Gill, Kulvinder S.; Varadwaj, Pritish K.; Balyan, Harindra S.; Gupta, Pushpendra K.
2017-01-01
ADP-glucose pyrophosphorylase (AGPase) is a heterotetrameric enzyme with two large subunits (LS) and two small subunits (SS). It plays a critical role in starch biosynthesis. We are reporting here detailed structure, function and evolution of the genes encoding the LS and the SS among monocots and dicots. “True” orthologs of maize Sh2 (AGPase LS) and Bt2 (AGPase SS) were identified in seven other monocots and three dicots; structure of the enzyme at protein level was also studied. Novel findings of the current study include the following: (i) at the DNA level, the genes controlling the SS are more conserved than those controlling the LS; the variation in both is mainly due to intron number, intron length and intron phase distribution; (ii) at protein level, the SS genes are more conserved relative to those for LS; (iii) “QTCL” motif present in SS showed evolutionary differences in AGPase belonging to wheat 7BS, T. urartu, rice and sorghum, while “LGGG” motif in LS was present in all species except T. urartu and chickpea; SS provides thermostability to AGPase, while LS is involved in regulation of AGPase activity; (iv) heterotetrameric structure of AGPase was predicted and analyzed in real time environment through molecular dynamics simulation for all the species; (v) several cis-acting regulatory elements were identified in the AGPase promoters with their possible role in regulating spatial and temporal expression (endosperm and leaf tissue) and also the expression, in response to abiotic stresses; and (vi) expression analysis revealed downregulation of both subunits under conditions of heat and drought stress. The results of the present study have allowed better understanding of structure and evolution of the genes and the encoded proteins and provided clues for exploitation of variability in these genes for engineering thermostable AGPase. PMID:28174576
Rossmassler, Karen; Dietrich, Carsten; Thompson, Claire; ...
2015-11-26
Termites are important contributors to carbon and nitrogen cycling in tropical ecosystems. Higher termites digest lignocellulose in various stages of humification with the help of an entirely prokaryotic microbiota housed in their compartmented intestinal tract. Previous studies revealed fundamental differences in community structure between compartments, but the functional roles of individual lineages in symbiotic digestion are mostly unknown. Furthermore, we conducted a highly resolved analysis of the gut microbiota in six species of higher termites that feed on plant material at different levels of humification. Combining amplicon sequencing and metagenomics, we assessed similarities in community structure and functional potential between the major hindgut compartments (P1, P3, and P4). Cluster analysis of the relative abundances of orthologous gene clusters (COGs) revealed high similarities among wood- and litter-feeding termites and strong differences to humivorous species. However, abundance estimates of bacterial phyla based on 16S rRNA genes greatly differed from those based on protein-coding genes. In conclusion, the community structure and functional potential of the microbiota in individual gut compartments are clearly driven by the digestive strategy of the host. The metagenomics libraries obtained in this study provide the basis for future studies that elucidate the fundamental differences in the symbiont-mediated breakdown of lignocellulose and humus by termites of different feeding groups. The high proportion of uncultured bacterial lineages in all samples calls for a reference-independent approach for the correct taxonomic assignment of protein-coding genes.
Clostridium perfringens: insight into virulence evolution and population structure.
Sawires, Youhanna S; Songer, J Glenn
2006-02-01
Clostridium perfringens is an important pathogen in veterinary and medical fields. Diseases caused by this organism are in many cases life threatening or fatal. At the same time, it is part of the ecological community of the intestinal tract of man and animals. Virulence in this species is not fully understood and it does seem that there is erratic distribution of the toxin/enzyme genes within C. perfringens population. We used the recently developed multiple-locus variable-number tandem repeat analysis (MLVA) scheme to investigate the evolution of virulence and population structure of this species. Analysis of the phylogenetic signal indicates that acquisition of the major toxin genes as well as other plasmid-borne toxin genes is a recent evolutionary event and their maintenance is essentially a function of the selective advantage they confer in certain niches under different conditions. In addition, it indicates the ability of virulent strains to cause disease in different host species. More interestingly, there is evidence that certain normal flora strains are virulent when they gain access to a different host species. Analysis of the population structure indicates that recombination events are the major tool that shapes the population and this panmixia is interrupted by frequent clonal expansion that mostly corresponds to disease processes. The signature of positive selection was detected in alpha toxin gene, suggesting the possibility of adaptive alleles on the other chromosomally encoded determinants. Finally, C. perfringens proved to have a dynamic population and availability of more genome sequences and use of comparative proteomics and animal modeling would provide more insight into the virulence of this organism.
Zhang, Fengli; Zhao, Xiaoxue; Li, Qingbo; Liu, Jia; Ding, Jizhe; Wu, Huiying; Zhao, Zongsheng; Ba, Yue; Cheng, Xuemin; Cui, Liuxin; Li, Hongping; Zhu, Jingyuan
2018-04-01
Soil contamination with heavy metals is a worldwide problem, especially in China. The interrelation of soil bacterial community structure, antibiotic resistance genes (ARGs), and heavy metal contamination in soil is still unclear. Here, seven agricultural areas (G1-G7) with heavy metal contamination were sampled at different distances (741 to 2556 m) from the factory. Denaturing gradient gel electrophoresis (DGGE) and the Shannon index were used to analyze bacterial community diversity. Real-time fluorescence quantitative PCR was used to detect the relative abundance of the ARGs sul1, sul2, tetA, tetM, and tetW and of one mobile genetic element (MGE), inti1. Results showed that all samples were polluted by cadmium (Cd), and some of them were polluted by lead (Pb), mercury (Hg), arsenic (As), copper (Cu), and zinc (Zn). DGGE showed that the most abundant bacterial species were found in G7, which had the lightest heavy metal contamination. The results of the principal component analysis and clustering analysis both showed that G7 could not be classified with other samples. The relative abundance of sul1 was correlated with Cu and Zn concentrations. Gene sul2 was positively related with total phosphorus, and tetM was associated with organic matter. Total gene abundances and relative abundance of inti1 both correlated with organic matter. Redundancy analysis showed that Zn and sul2 were significantly related with bacterial community structure. Together, our results indicate a complex linkage between soil heavy metal concentration, bacterial community composition, and the abundance of some globally disseminated ARGs.
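The abstract above names the Shannon index as its diversity measure but does not say how it was computed. The following is a minimal sketch of the standard definition, H' = -sum(p_i ln p_i), assuming band intensities (or any per-taxon abundances) are available as a list of non-negative numbers. The function name `shannon_index` and the sample values are hypothetical and are not taken from the study.

```python
import math


def shannon_index(abundances):
    """Return the Shannon diversity H' = -sum(p_i * ln p_i).

    `abundances` holds one non-negative value per taxon (or per DGGE band);
    zero entries are skipped because p * ln(p) tends to 0 as p tends to 0.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


# Hypothetical abundance profiles for illustration only (not study data).
even_sample = [25, 25, 25, 25]   # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]    # one dominant taxon

print("even:  ", shannon_index(even_sample))
print("skewed:", shannon_index(skewed_sample))
```

With equal abundances the index reaches its maximum, ln(S) for S taxa, and it falls as one taxon comes to dominate. That is why a lightly contaminated site with many comparably abundant bands would score higher than a site dominated by a few.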
Genome-wide survey and expression analysis of F-box genes in chickpea.
Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata
2015-02-13
The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea which were classified based on their C-terminal domain structures into 10 subfamilies. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes and duplication events were investigated which revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. Also, maximum syntenic relationship was observed with soybean followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues as well as under abiotic stress conditions utilizing the available chickpea transcriptome data revealed differential expression patterns with several F-box genes specifically expressing in each tissue, few of which were validated by using quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.
Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S
2010-10-07
The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants for development, senescence, defence, and others. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. In order to better guide the gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion. However, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms.
The highly diverse biological function indicated that more research needs to be carried out to dissect the PHB gene function. The conserved gene evolution indicated that the study in the model species can be translated to human and mammalian studies.
Reif, David M.; Israel, Mark A.; Moore, Jason H.
2007-01-01
The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666
Joshi, Dev Raj; Zhang, Yu; Zhang, Hong; Gao, Yingxin; Yang, Min
2018-01-01
Nitrogenous heterocyclic compounds are key pollutants in coking wastewater; however, the functional potential of microbial communities for biodegradation of such contaminants during biological treatment is still elusive. Herein, a high-throughput functional gene array (GeoChip 5.0) in combination with Illumina HiSeq2500 sequencing was used to compare and characterize the microbial community functional structure in a long-running (500 days) bench-scale bioreactor treating coking wastewater, with a control system treating synthetic wastewater. Despite the inhibitory toxic pollutants, GeoChip 5.0 detected almost all key functional gene (average 61,940 genes) categories in the coking wastewater sludge. With higher abundance, aromatic ring cleavage dioxygenase genes including multi ring1,2diox; one ring2,3diox; catechol represented significant functional potential for degradation of aromatic pollutants, which was further confirmed by Illumina HiSeq2500 analysis results. Response ratio analysis revealed that three nitrogenous compound degrading genes, nbzA (nitro-aromatics), tdnB (aniline), and scnABC (thiocyanate), were unique to coking wastewater treatment, which might be a strong cause of the increased ammonia level during the aerobic process. Additionally, HiSeq2500 elucidated carbazole and isoquinoline degradation genes in the system. These findings expanded our understanding of the functional potential of microbial communities to remove organic nitrogenous pollutants; hence they will be useful in optimization strategies for biological treatment of coking wastewater. Copyright © 2017. Published by Elsevier B.V.
Iwanaga, Masashi; Kurihara, Masaaki; Kobayashi, Masahiko; Kang, WonKyung
2002-05-25
All lepidopteran baculovirus genomes sequenced to date encode a homolog of the Bombyx mori nucleopolyhedrovirus (BmNPV) orf68 gene, suggesting that it performs an important role in the virus life cycle. In this article we describe the characterization of BmNPV orf68 gene. Northern and Western analyses demonstrated that orf68 gene was expressed as a late gene and encoded a structural protein of budded virus (BV). Immunohistochemical analysis by confocal microscopy showed that ORF68 protein was localized mainly in the nucleus of infected cells. To examine the function of orf68 gene, we constructed orf68 deletion mutant (BmD68) and characterized it in BmN cells and larvae of B. mori. BV production was delayed in BmD68-infected cells. The larval bioassays also demonstrated that deletion of orf68 did not reduce the infectivity, but mutant virus took 70 h longer to kill the host than wild-type BmNPV. In addition, dot-blot analysis showed viral DNA accumulated more slowly in mutant infected cells. Further examination suggested that BmD68 was less efficient in entry and budding from cells, although it seemed to possess normal attachment ability. These results suggest that ORF68 is a BV-associated protein involved in secondary infection from cell-to-cell. (c) 2002 Elsevier Science (USA).
Assessment of Bacterial bph Gene in Amazonian Dark Earth and Their Adjacent Soils
Brossi, Maria Julia de Lima; Mendes, Lucas William; Germano, Mariana Gomes; Lima, Amanda Barbosa; Tsai, Siu Mui
2014-01-01
Amazonian Anthrosols are known to harbour distinct and highly diverse microbial communities. As most of the current assessments of these communities are based on taxonomic profiles, the functional gene structure of these communities, such as those responsible for key steps in the carbon cycle, mostly remain elusive. To gain insights into the diversity of catabolic genes involved in the degradation of hydrocarbons in anthropogenic horizons, we analysed the bacterial bph gene community structure, composition and abundance using T-RFLP, 454-pyrosequencing and quantitative PCR assays, respectively. Soil samples were collected in two Brazilian Amazon Dark Earth (ADE) sites and at their corresponding non-anthropogenic adjacent soils (ADJ), under two different land use systems, secondary forest (SF) and manioc cultivation (M). Redundancy analysis of T-RFLP data revealed differences in bph gene structure according to both soil type and land use. Chemical properties of ADE soils, such as high organic carbon and organic matter, as well as effective cation exchange capacity and pH, were significantly correlated with the structure of bph communities. Also, the taxonomic affiliation of bph gene sequences revealed the segregation of community composition according to the soil type. Sequences at ADE sites were mostly affiliated to aromatic hydrocarbon degraders belonging to the genera Streptomyces, Sphingomonas, Rhodococcus, Mycobacterium, Conexibacter and Burkholderia. In both land use sites, Shannon's diversity indices based on the bph gene data were higher in ADE than ADJ soils. Collectively, our findings provide evidence that specific properties in ADE soils shape the structure and composition of bph communities. These results provide a basis for further investigations focusing on the bio-exploration of novel enzymes with potential use in the biotechnology/biodegradation industry. PMID:24927167
Comparative study on gene set and pathway topology-based enrichment methods.
Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim
2015-10-22
Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
Both types of methods for enrichment analysis require further improvements in order to deal with the problem of pathway overlaps.
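The abstract above contrasts gene set methods, which treat a pathway as a flat gene list, with topology-based methods, but it does not name the individual methods compared. As an illustration of the flat-list idea only, the sketch below implements a one-sided hypergeometric over-representation test, a widely used form of gene set enrichment. The function name, argument names and example counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_de, n_overlap):
    """One-sided over-representation p-value P(X >= n_overlap).

    X counts pathway genes among `n_de` differentially expressed genes drawn
    without replacement from `n_universe` genes, of which `n_pathway` belong
    to the pathway. The pathway is treated as a plain gene list; interactions
    between genes are ignored.
    """
    if not (0 <= n_pathway <= n_universe and 0 <= n_de <= n_universe):
        raise ValueError("set sizes must lie within the gene universe")
    max_overlap = min(n_pathway, n_de)
    total = comb(n_universe, n_de)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(max(n_overlap, 0), max_overlap + 1)
    )
    return tail / total


# Hypothetical counts for illustration: 20,000 genes, a 100-gene pathway,
# 500 differentially expressed genes, 10 of which fall in the pathway.
print(hypergeom_enrichment_p(20000, 100, 500, 10))
```

Because the test depends only on set sizes and overlap, two pathways that share many genes will tend to be flagged together. That is one way to read the abstract's closing remark that pathway overlaps remain a problem for both families of methods.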
Single cell Hi-C reveals cell-to-cell variability in chromosome structure
Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter
2013-01-01
Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610
Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
2014-11-19
Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Seon, A A; Pierre, T N; Redeker, V; Lacombe, C; Delfour, A; Nicolas, P; Amiche, M
2000-02-25
Calcitonin gene-related peptide has been extracted from the skin exudate of a single living specimen of the frog Phyllomedusa bicolor and purified to homogeneity by a two-step protocol. A total volume of 250 microl of exudate yielded 380 microg of purified peptide. Mass spectrometric analysis and gas phase sequencing of the purified peptide as well as chemical synthesis and cDNA analysis were consistent with the structure SCDTSTCATQRLADFLSRSGGIGSPDFVPTDVSANSF amide and the presence of a disulfide bridge linking Cys(2) and Cys(7). The skin peptide, named skin calcitonin gene-related peptide, differs significantly from all other members of the calcitonin gene-related peptide family of peptides at nine positions but binds with high affinity to calcitonin gene-related peptide receptors in the rat brain and acts as an agonist in the rat vas deferens bioassay with potencies equal to those of human CGRP. Reverse transcriptase-polymerase chain reaction coupled with cDNA cloning and sequencing demonstrated that skin calcitonin gene-related peptide isolated in the skin is identical to that present in the frog's central and enteric nervous systems. These data, which indicate for the first time the existence of calcitonin gene-related peptide in the frog skin, add further support to the brain-skin-gut triangle hypothesis as a useful tool in the identification and/or isolation of mammalian peptides that are present in the brain and other tissues in only minute quantities.
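The residue numbering in the abstract above can be checked directly against the reported sequence. The short sketch below uses only the sequence string from the abstract; the variable names are illustrative.

```python
# Sequence as reported in the abstract (C-terminal amide not represented).
SKIN_CGRP = "SCDTSTCATQRLADFLSRSGGIGSPDFVPTDVSANSF"

# 1-based positions of cysteine residues, matching the abstract's numbering.
cys_positions = [i for i, aa in enumerate(SKIN_CGRP, start=1) if aa == "C"]

print("length:", len(SKIN_CGRP))
print("cysteine positions:", cys_positions)
```

The sequence has 37 residues with cysteines only at positions 2 and 7, consistent with the single Cys(2)-Cys(7) disulfide bridge described.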
Wei, Wei; Hu, Yang; Cui, Meng-Yuan; Han, Yong-Tao; Gao, Kuan; Feng, Jia-Yue
2016-01-01
Plant-specific TEOSINTE BRANCHED 1, CYCLOIDEA, and PROLIFERATING CELL FACTORS (TCP) transcription factors play versatile functions in multiple processes of plant growth and development. However, no systematic study has been performed in strawberry. In this study, 19 FvTCP genes were identified in the diploid woodland strawberry (Fragaria vesca) accession Heilongjiang-3. Phylogenetic analysis suggested that the FvTCP genes were classified into two main classes, with the second class further divided into two subclasses, which was supported by the exon-intron organizations and the conserved motif structures. Promoter analysis revealed various cis-acting elements related to growth and development, hormone and/or stress responses. We analyzed FvTCP gene transcript accumulation patterns in different tissues and fruit developmental stages. Among them, 12 FvTCP genes exhibited distinct tissue-specific transcript accumulation patterns. Eleven FvTCP genes were down-regulated in different fruit developmental stages, while five FvTCP genes were up-regulated. Transcripts of FvTCP genes also varied with different subcultural propagation periods and were induced by hormone treatments and biotic and abiotic stresses. Subcellular localization analysis showed that six FvTCP-GFP fusion proteins showed distinct localizations in Arabidopsis mesophyll protoplasts. Notably, transient over-expression of FvTCP9 in strawberry fruits dramatically affected the expression of a series of genes implicated in fruit development and ripening. Taken together, the present study may provide the basis for functional studies to reveal the role of this gene family in strawberry growth and development. PMID:28066489
Computerized image analysis for quantitative neuronal phenotyping in zebrafish.
Liu, Tianming; Lu, Jianfeng; Wang, Ye; Campbell, William A; Huang, Ling; Zhu, Jinmin; Xia, Weiming; Wong, Stephen T C
2006-06-15
An integrated microscope image analysis pipeline is developed for automatic analysis and quantification of phenotypes in zebrafish with altered expression of Alzheimer's disease (AD)-linked genes. We hypothesize that a slight impairment of neuronal integrity in a large number of zebrafish carrying the mutant genotype can be detected through the computerized image analysis method. Key functionalities of our zebrafish image processing pipeline include quantification of neuron loss in zebrafish embryos due to knockdown of AD-linked genes, automatic detection of defective somites, and quantitative measurement of gene expression levels in zebrafish with altered expression of AD-linked genes or treatment with a chemical compound. These quantitative measurements enable the archival of analyzed results and relevant meta-data. The structured database is organized for statistical analysis and data modeling to better understand neuronal integrity and phenotypic changes of zebrafish under different perturbations. Our results show that the computerized analysis is comparable to manual counting with equivalent accuracy and improved efficacy and consistency. Development of such an automated data analysis pipeline represents a significant step forward to achieve accurate and reproducible quantification of neuronal phenotypes in large scale or high-throughput zebrafish imaging studies.
Genome-wide identification and expression profiling of the SnRK2 gene family in Malus prunifolia.
Shao, Yun; Qin, Yuan; Zou, Yangjun; Ma, Fengwang
2014-11-15
Sucrose non-fermenting-1-related protein kinase 2 (SnRK2) constitutes a small plant-specific serine/threonine kinase family with essential roles in the abscisic acid (ABA) signal pathway and in responses to osmotic stress. Although a genome-wide analysis of this family has been conducted in some species, little is known about SnRK2 genes in apple (Malus domestica). We identified 14 putative sequences encoding 12 deduced SnRK2 proteins within the apple genome. Gene chromosomal location and synteny analysis of the apple SnRK2 genes indicated that tandem and segmental duplications have likely contributed to the expansion and evolution of these genes. All 12 full-length coding sequences were confirmed by cloning from Malus prunifolia. The gene structure and motif compositions of the apple SnRK2 genes were analyzed. Phylogenetic analysis showed that MpSnRK2s could be classified into four groups. Profiling of these genes presented differential patterns of expression in various tissues. Under stress conditions, transcript levels for some family members were up-regulated in the leaves in response to drought, salinity, or ABA treatments. This suggested their possible roles in plant response to abiotic stress. Our findings provide essential information about SnRK2 genes in apple and will contribute to further functional dissection of this gene family. Copyright © 2014 Elsevier B.V. All rights reserved.
Analysis of genes involved in biosynthesis of the lantibiotic subtilin.
Klein, C; Kaletta, C; Schnell, N; Entian, K D
1992-01-01
Lantibiotics are peptide-derived antibiotics with high antimicrobial activity against pathogenic gram-positive bacteria. They are ribosomally synthesized and posttranslationally modified (N. Schnell, K.-D. Entian, U. Schneider, F. Götz, H. Zähner, R. Kellner, and G. Jung, Nature [London] 333:276-278, 1988). The most important lantibiotics are subtilin and the food preservative nisin, which both have a very similar structure. By using a hybridization probe specific for the structural gene of subtilin, spaS, the DNA region adjacent to spaS was isolated from Bacillus subtilis. Sequence analysis of a 4.9-kb fragment revealed several open reading frames with the same orientation as spaS. Downstream of spaS, no reading frames were present on the isolated XbaI fragment. Upstream of spaS, three reading frames, spaB, spaC, and spaT, were identified which showed strong homology to genes identified near the structural gene of the lantibiotic epidermin. The SpaT protein derived from the spaT sequence was homologous to hemolysin B of Escherichia coli, which indicated its possible function in subtilin transport. Gene deletions within spaB and spaC revealed subtilin-negative mutants, whereas spaT gene disruption mutants still produced subtilin. Remarkably, the spaT mutant colonies revealed a clumpy surface morphology on solid media. After growth on liquid media, spaT mutant cells agglutinated in the mid-logarithmic growth phase, forming longitudinal 3- to 10-fold-enlarged cells which aggregated. Aggregate formation preceded subtilin production and cells lost their viability, possibly as a result of intracellular subtilin accumulation. Our results clearly proved that reading frames spaB and spaC are essential for subtilin biosynthesis whereas spaT mutants are probably deficient in subtilin transport. PMID:1539969
Recurrent Rearrangements of Human Amylase Genes Create Multiple Independent CNV Series.
Shwan, Nzar A A; Louzada, Sandra; Yang, Fengtang; Armour, John A L
2017-05-01
The human amylase gene cluster includes the human salivary (AMY1) and pancreatic amylase genes (AMY2A and AMY2B), and is a highly variable and dynamic region of the genome. Copy number variation (CNV) of AMY1 has been implicated in human dietary adaptation, and in population association with obesity, but neither of these findings has been independently replicated. Despite these functional implications, the structural genomic basis of CNV has only been defined in detail very recently. In this work, we use high-resolution analysis of copy number, and analysis of segregation in trios, to define new, independent allelic series of amylase CNVs in sub-Saharan Africans, including a series of higher-order expansions of a unit consisting of one copy each of AMY1, AMY2A, and AMY2B. We use fiber-FISH (fluorescence in situ hybridization) to define unexpected complexity in the accompanying rearrangements. These findings demonstrate recurrent involvement of the amylase gene region in genomic instability, involving at least five independent rearrangements of the pancreatic amylase genes (AMY2A and AMY2B). Structural features shared by fundamentally distinct lineages strongly suggest that the common ancestral state for the human amylase cluster contained more than one, and probably three, copies of AMY1. © 2017 WILEY PERIODICALS, INC.
Joshi, Dev Raj; Zhang, Yu; Gao, Yinxin; Liu, Yuan; Yang, Min
2017-09-15
Although coking wastewater is generally considered to contain high concentrations of nitrogen- and sulfur-containing pollutants, the biotransformation processes of these compounds have not been well understood. Herein, a high-throughput functional gene array (GeoChip 5.0) in combination with Illumina MiSeq sequencing of the 16S rRNA gene was used to identify microbial functional traits and their role in biotransformation of nitrogen- and sulfur-containing compounds in a bench-scale aerobic coking wastewater treatment system operated for 488 days. Biotransformation of nitrogen- and sulfur-containing pollutants deteriorated when the pH of the bioreactor was increased to >8.0, and the microbial community functional structure was significantly associated with pH (Mantel test, P < 0.05). The release of ammonia nitrogen and sulfate was correlated with both the taxonomic and functional microbial community structure (P < 0.05). Considering the abundance and correlation with the release of ammonia nitrogen and sulfate, aromatic dioxygenases (e.g. xylXY, nagG), nitrilases (e.g. nhh, nitrilase), dibenzothiophene oxidase (DbtAc), and thiocyanate hydrolase (scnABC) were important functional genes for biotransformation of nitrogen- and sulfur-containing pollutants. Functional characterization of taxa and network analysis suggested that Burkholderiales, Actinomycetales, Rhizobiales, Pseudomonadales, and Hydrogenophilales (Thiobacillus) were key functional taxa. Variance partitioning analysis showed that pH and influent ammonia nitrogen jointly explained 25.9% and 35.5% of variation in organic pollutant degrading genes and microbial community structure, respectively. This study revealed a linkage between microbial community functional structure and the likely biotransformation of nitrogen- and sulfur-containing pollutants, along with a suitable range of pH (7.0-7.5) for stability of the biological system treating coking wastewater. Copyright © 2017 Elsevier Ltd. All rights reserved.
Tang, Kai; Dong, Chun-Juan; Liu, Jin-Yuan
2016-01-01
In this study, 40 phospholipase D (PLD) genes were identified from allotetraploid cotton Gossypium hirsutum, and 20 PLD genes were examined in diploid cotton Gossypium raimondii. Combined with 19 previously identified Gossypium arboreum PLD genes, a comparative analysis of the PLD gene families was performed among the allotetraploid and two diploid cottons. Based on the orthologous relationships, we found that almost every G. hirsutum PLD had a corresponding homolog in the G. arboreum and G. raimondii genomes, except for GhPLDβ3A, whose homolog GaPLDβ3 may have been lost during the evolution of G. arboreum after the interspecific hybridization. Phylogenetic analysis showed that all of the cotton PLDs were unevenly classified into six numbered subgroups: α, β/γ, δ, ε, ζ and φ. An N-terminal C2 domain was found in the α, β/γ, δ and ε subgroups, while phox homology (PX) and pleckstrin homology (PH) domains were identified in the ζ subgroup. The subgroup φ possessed a single peptide instead of a functional domain. In each phylogenetic subgroup, the PLDs showed high conservation in gene structure and amino acid sequences in functional domains. The expansion of the GhPLD and GrPLD gene families was mainly attributed to segmental duplication and partly attributed to tandem duplication. Furthermore, purifying selection played a critical role in the evolution of PLD genes in cotton. Quantitative RT-PCR documented that allotetraploid cotton PLD genes were broadly expressed and each had a unique spatial and developmental expression pattern, indicating their functional diversification in cotton growth and development. Further analysis of cis-regulatory elements elucidated transcriptional regulations and potential functions. Our comparative analysis provided valuable information for understanding the putative functions of the PLD genes in cotton fiber. PMID:27213891
Prevalence of pfhrp2 and pfhrp3 gene deletions in Puerto Lempira, Honduras.
Abdallah, Joseph F; Okoth, Sheila Akinyi; Fontecha, Gustavo A; Torres, Rosa Elena Mejia; Banegas, Engels I; Matute, María Luisa; Bucheli, Sandra Tamara Mancero; Goldman, Ira F; de Oliveira, Alexandre Macedo; Barnwell, John W; Udhayakumar, Venkatachalam
2015-01-21
Recent studies have demonstrated the deletion of the histidine-rich protein 2 (PfHRP2) gene (pfhrp2) in field isolates of Plasmodium falciparum, which could result in false negative test results when PfHRP2-based rapid diagnostic tests (RDTs) are used for malaria diagnosis. Although primary diagnosis of malaria in Honduras is determined based on microscopy, RDTs may be useful in remote areas. In this study, it was investigated whether there are deletions of the pfhrp2, pfhrp3 and their respective flanking genes in 68 P. falciparum parasite isolates collected from the city of Puerto Lempira, Honduras. In addition, further investigation considered the possible correlation between parasite population structure and the distribution of these gene deletions by genotyping seven neutral microsatellites. Sixty-eight samples used in this study, which were obtained from a previous chloroquine efficacy study, were utilized in the analysis. All samples were genotyped for pfhrp2, pfhrp3 and flanking genes by PCR. The samples were then genotyped for seven neutral microsatellites in order to determine the parasite population structure in Puerto Lempira at the time of sample collection. It was found that all samples were positive for pfhrp2 and its flanking genes on chromosome 8. However, only 50% of the samples were positive for pfhrp3 and its neighboring genes while the rest were either pfhrp3-negative only or had deleted a combination of pfhrp3 and its neighbouring genes on chromosome 13. Population structure analysis predicted that there are at least two distinct parasite population clusters in this sample population. It was also determined that a greater proportion of parasites with pfhrp3-(and flanking gene) deletions belonged to one cluster compared to the other. The findings indicate that the P. falciparum parasite population in the municipality of Puerto Lempira maintains the pfhrp2 gene and that PfHRP2-based RDTs could be considered for use in this region; however continued monitoring of parasite population will be useful to detect any parasites with deletions of pfhrp2.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Z.; Deng, Y.; Van Nostrand, J.D.
A new generation of functional gene arrays (FGAs; GeoChip 3.0) has been developed, with ~28,000 probes covering approximately 57,000 gene variants from 292 functional gene families involved in carbon, nitrogen, phosphorus and sulfur cycles, energy metabolism, antibiotic resistance, metal resistance and organic contaminant degradation. GeoChip 3.0 also has several other distinct features, such as a common oligo reference standard (CORS) for data normalization and comparison, a software package for data management and future updating and the gyrB gene for phylogenetic analysis. Computational evaluation of probe specificity indicated that all designed probes would have a high specificity to their corresponding targets. Experimental analysis with synthesized oligonucleotides and genomic DNAs showed that only 0.0036-0.025% false-positive rates were observed, suggesting that the designed probes are highly specific under the experimental conditions examined. In addition, GeoChip 3.0 was applied to analyze soil microbial communities in a multifactor grassland ecosystem in Minnesota, USA, which showed that the structure, composition and potential activity of soil microbial communities significantly changed with the plant species diversity. As expected, GeoChip 3.0 is a high-throughput powerful tool for studying microbial community functional structure, and linking microbial communities to ecosystem processes and functioning.
Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng
2015-01-01
In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466
Bioinformatics analyses of Shigella CRISPR structure and spacer classification.
Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan
2016-03-01
Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through bioinformatics approach. Among 107 Shigella, 434 CRISPR structure loci were identified with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. The cas gene deletions were discovered in 88 strains. However, there is one strain that does not contain cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Of them, Type I spacer referred to those linked with one gene segment, Type II spacer linked with two or more different gene segments, and Type III spacer undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.
Uronic polysaccharide degrading enzymes.
Garron, Marie-Line; Cygler, Miroslaw
2014-10-01
In the past several years progress has been made in the field of structure and function of polysaccharide lyases (PLs). The number of classified polysaccharide lyase families has increased to 23 and more detailed analysis has allowed the identification of more closely related subfamilies, leading to stronger correlation between each subfamily and a unique substrate. The number of as yet unclassified polysaccharide lyases has also increased and we expect that sequencing projects will allow many of these unclassified sequences to emerge as new families. The progress in structural analysis of PLs has led to having at least one representative structure for each of the families and for two unclassified enzymes. The newly determined structures have folds observed previously in other PL families and their catalytic mechanisms follow either metal-assisted or Tyr/His mechanisms characteristic for other PL enzymes. Comparison of PLs with glycoside hydrolases (GHs) shows several folds common to both classes but only for the β-helix fold is there strong indication of divergent evolution from a common ancestor. Analysis of bacterial genomes identified gene clusters containing multiple polysaccharide cleaving enzymes, the Polysaccharides Utilization Loci (PULs), and their gene complement suggests that they are organized to process completely a specific polysaccharide. Copyright © 2014 Elsevier Ltd. All rights reserved.
Rudolf, Jeffrey D.; Yan, Xiaohui; Shen, Ben
2015-01-01
The enediynes are one of the most fascinating families of bacterial natural products given their unprecedented molecular architecture and extraordinary cytotoxicity. Enediynes are rare with only 11 structurally characterized members and four additional members isolated in their cycloaromatized form. Recent advances in DNA sequencing have resulted in an explosion of microbial genomes. A virtual survey of the GenBank and JGI genome databases revealed 87 enediyne biosynthetic gene clusters from 78 bacteria strains, implying enediynes are more common than previously thought. Here we report the construction and analysis of an enediyne genome neighborhood network (GNN) as a high-throughput approach to analyze secondary metabolite gene clusters. Analysis of the enediyne GNN facilitated rapid gene cluster annotation, revealed genetic trends in enediyne biosynthetic gene clusters resulting in a simple prediction scheme to determine 9- vs 10-membered enediyne gene clusters, and supported a genomic-based strain prioritization method for enediyne discovery. PMID:26318027
Functional and evolution characterization of SWEET sugar transporters in Ananas comosus.
Guo, Chengying; Li, Huayang; Xia, Xinyao; Liu, Xiuyuan; Yang, Long
2018-02-05
Sugars will eventually be exported transporters (SWEETs) are a group of recently identified sugar transporters in plants that play important roles in diverse physiological processes. However, currently, limited information about this gene family is available in pineapple (Ananas comosus). The availability of the recently released pineapple genome sequence provides the opportunity to identify SWEET genes in a Bromeliaceae family member at the genome level. In this study, 39 pineapple SWEET genes were identified in two pineapple cultivars (18 AnfSWEET and 21 AnmSWEET) and further phylogenetically classified into five clades. A phylogenetic analysis revealed distinct evolutionary paths for the SWEET genes of the two pineapple cultivars. The MD2 cultivar might have experienced a different expansion than the F153 cultivar because two additional duplications exist, which separately gave rise to clades III and IV. A gene exon/intron structure analysis showed that the pineapple SWEET genes contained highly conserved exon/intron numbers. An analysis of public RNA-seq data and expression profiling showed that SWEET genes may be involved in fruit development and ripening processes. AnmSWEET5 and AnmSWEET11 were highly expressed in the early stages of pineapple fruit development and then decreased. The study increases the understanding of the roles of SWEET genes in pineapple. Copyright © 2018 Elsevier Inc. All rights reserved.
Merlino, Giuseppe; Marzorati, Massimo; Rizzi, Aurora; Lavazza, Davide; de Ferra, Francesca; Carpani, Giovanna
2015-01-01
The achievement of successful biostimulation of active microbiomes for the cleanup of a polluted site is strictly dependent on the knowledge of the key microorganisms equipped with the relevant catabolic genes responsible for the degradation process. In this work, we present the characterization of the bacterial community developed in anaerobic microcosms after biostimulation with the electron donor lactate of groundwater polluted with 1,2-dichloroethane (1,2-DCA). Through a multilevel analysis, we have assessed (i) the structural analysis of the bacterial community; (ii) the identification of putative dehalorespiring bacteria; (iii) the characterization of functional genes encoding for putative 1,2-DCA reductive dehalogenases (RDs). Following the biostimulation treatment, the structure of the bacterial community underwent a notable change of the main phylotypes, with the enrichment of representatives of the order Clostridiales. Through PCR targeting conserved regions within known RD genes, four novel variants of RDs previously associated with the reductive dechlorination of 1,2-DCA were identified in the metagenome of the Clostridiales-dominated bacterial community. PMID:26273600
Cotton, Allison M.; Chen, Chih-Yu; Lam, Lucia L.; Wasserman, Wyeth W.; Kobor, Michael S.; Brown, Carolyn J.
2014-01-01
X-chromosome inactivation results in dosage equivalence between the X chromosome in males and females; however, over 15% of human X-linked genes escape silencing and these genes are enriched on the evolutionarily younger short arm of the X chromosome. The spread of inactivation onto translocated autosomal material allows the study of inactivation without the confounding evolutionary history of the X chromosome. The heterogeneity and reduced extent of silencing on autosomes are evidence for the importance of DNA elements underlying the spread of silencing. We have assessed DNA methylation in six unbalanced X-autosome translocations using the Illumina Infinium HumanMethylation450 array. Two to 42% of translocated autosomal genes showed this mark of silencing, with the highest degree of inactivation observed for trisomic autosomal regions. Generally, the extent of silencing was greatest close to the translocation breakpoint; however, silencing was detected well over 100 kb into the autosomal DNA. Alu elements were found to be enriched at autosomal genes that escaped from inactivation while L1s were enriched at subject genes. In cells without the translocation, there was enrichment of heterochromatic features such as EZH2 and H3K27me3 for those genes that become silenced when translocated, suggesting that underlying chromatin structure predisposes genes towards silencing. Additionally, the analysis of topological domains indicated physical clustering of autosomal genes of common inactivation status. Overall, our analysis indicated a complex interaction between DNA sequence, chromatin features and the three-dimensional structure of the chromosome. PMID:24158853
DOE Office of Scientific and Technical Information (OSTI.GOV)
Biery, B.J.; Stein, D.E.; Goodman, S.I.
The structure of the human glutaryl coenzyme A dehydrogenase (GCD) gene was determined to contain 11 exons and to span ~7 kb. Fibroblast DNA from 64 unrelated glutaric acidemia type I (GA1) patients was screened for mutations by PCR amplification and analysis of SSCP. Fragments with altered electrophoretic mobility were subcloned and sequenced to detect mutations that caused GA1. This report describes the structure of the GCD gene, as well as point mutations and polymorphisms found in 7 of its 11 exons. Several mutations were found in more than one patient, but no one prevalent mutation was detected in the general population. As expected from pedigree analysis, a single mutant allele causes GA1 in the Old Order Amish of Lancaster County, Pennsylvania. Several mutations have been expressed in Escherichia coli, and all produce diminished enzyme activity. Reduced activity in GCD encoded by the A421V mutation in the Amish may be due to impaired association of enzyme subunits. 13 refs., 5 figs., 3 tabs.
Wang, Zhao; Yang, Yuyin; Sun, Weimin; Dai, Yu; Xie, Shuguang
2015-02-01
Nonylphenol (NP) can accumulate in river sediment. Bioaugmentation is an attractive option to dissipate heavy NP pollution in river sediment. In this study, two NP degraders were isolated from crude oil-polluted soil and river sediment. Microcosms were constructed to test their ability to degrade NP in river sediment. The shift in the proportion of NP-degrading genes and bacterial community structure in sediment microcosms were characterized using quantitative PCR assay and terminal restriction fragment length polymorphism analysis, respectively. Phylogenetic analysis indicated that the soil isolate belonged to genus Stenotrophomonas, while the sediment isolate was a Sphingobium species. Both of them could almost completely clean up a high level of NP in river sediment (150 mg/kg NP) in 10 or 14 days after inoculation. An increase in the proportion of alkB and sMO genes was observed in sediment microcosms inoculated with Stenotrophomonas strain Y1 and Sphingobium strain Y2, respectively. Moreover, bioaugmentation using Sphingobium strain Y2 could have a strong impact on sediment bacterial community structure, while inoculation of Stenotrophomonas strain Y1 illustrated a weak impact. This study can provide some new insights towards NP biodegradation and bioremediation.
Lymphocyte signaling: beyond knockouts.
Saveliev, Alexander; Tybulewicz, Victor L J
2009-04-01
The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Although this gene 'knockout' approach is often informative, in many cases, the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to 'knock in' subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully based on structural and biophysical data.
Liang, Yuting; Van Nostrand, Joy D.; N′Guessan, Lucie A.; Peacock, Aaron D.; Deng, Ye; Long, Philip E.; Resch, C. Tom; Wu, Liyou; He, Zhili; Li, Guanghe; Hazen, Terry C.; Lovley, Derek R.
2012-01-01
To better understand the microbial functional diversity changes with subsurface redox conditions during in situ uranium bioremediation, key functional genes were studied with GeoChip, a comprehensive functional gene microarray, in field experiments at a uranium mill tailings remedial action (UMTRA) site (Rifle, CO). The results indicated that functional microbial communities altered with a shift in the dominant metabolic process, as documented by hierarchical cluster and ordination analyses of all detected functional genes. The abundance of dsrAB genes (dissimilatory sulfite reductase genes) and methane generation-related mcr genes (methyl coenzyme M reductase coding genes) increased when redox conditions shifted from Fe-reducing to sulfate-reducing conditions. The cytochrome genes detected were primarily from Geobacter sp. and decreased with lower subsurface redox conditions. Statistical analysis of environmental parameters and functional genes indicated that acetate, U(VI), and redox potential (Eh) were the most significant geochemical variables linked to microbial functional gene structures, and changes in microbial functional diversity were strongly related to the dominant terminal electron-accepting process following acetate addition. The study indicates that the microbial functional genes clearly reflect the in situ redox conditions and the dominant microbial processes, which in turn influence uranium bioreduction. Microbial functional genes thus could be very useful for tracking microbial community structure and dynamics during bioremediation. PMID:22327592
Zdorovenko, E L; Wang, Y; Shashkov, A S; Chen, T; Ovchinnikova, O G; Liu, B; Golomidova, A K; Babenko, V V; Letarov, A V; Knirel, Y A
2018-05-01
Glycerophosphate-containing O-specific polysaccharides (OPSs) were obtained by mild acidic degradation of lipopolysaccharides isolated from Escherichia coli type strain O81 and E. coli strain HS3-104 from horse feces. The structures of both OPSs and of the oligosaccharide derived from the strain O81 OPS by treatment with 48% HF were studied by monosaccharide analysis and one- and two-dimensional 1H- and 13C-NMR spectroscopy. Both OPSs had similar structures and differed only in the presence of a side-chain glucose residue in the strain HS3-104 OPS. The genes and the organization of the O-antigen biosynthesis gene cluster in both strains are almost identical with the exception of the gtr gene cluster responsible for glucosylations in the strain HS3-104, which is located elsewhere in the genome.
Wang, X; Zhao, L; Zhang, L; Wu, Y; Chou, M; Wei, G
2018-07-01
Rhizobial symbiotic plasmids play vital roles in mutualistic symbiosis with legume plants by executing the functions of nodulation and nitrogen fixation. To explore the gene composition and genetic constitution of rhizobial symbiotic plasmids, comparison analyses of 24 rhizobial symbiotic plasmids derived from four rhizobial genera was carried out. Results illustrated that rhizobial symbiotic plasmids had higher proportion of functional genes participating in amino acid transport and metabolism, replication; recombination and repair; carbohydrate transport and metabolism; energy production and conversion and transcription. Mesorhizobium amorphae CCNWGS0123 symbiotic plasmid - pM0123d had similar gene composition with pR899b and pSNGR234a. All symbiotic plasmids shared 13 orthologous genes, including five nod and eight nif/fix genes which participate in the rhizobia-legume symbiosis process. These plasmids contained nod genes from four ancestors and fix genes from six ancestors. The ancestral type of pM0123d nod genes was similar with that of Rhizobium etli plasmids, while the ancestral type of pM0123d fix genes was same as that of pM7653Rb. The phylogenetic trees constructed based on nodCIJ and fixABC displayed different topological structures mainly due to nodCIJ and fixABC ancestral type discordance. The study presents valuable insights into mosaic structures and the evolution of rhizobial symbiotic plasmids. This study compared 24 rhizobial symbiotic plasmids that included four genera and 11 species, illuminating the functional gene composition and symbiosis gene ancestor types of symbiotic plasmids from higher taxonomy. It provides valuable insights into mosaic structures and the evolution of symbiotic plasmids. © 2018 The Society for Applied Microbiology.
Millot, Benjamin; Montoliu, Lluís; Fontaine, Marie-Louise; Mata, Teresa; Devinoy, Eve
2003-01-01
The upstream regulatory regions of the mouse and rabbit whey acidic protein (WAP) genes have been used extensively to target the efficient expression of foreign genes into the mammary gland of transgenic animals. Therefore both regions have been studied to elucidate fully the mechanisms controlling WAP gene expression. Three DNase I-hypersensitive sites (HSS0, HSS1 and HSS2) have been described upstream of the rabbit WAP gene in the lactating mammary gland and correspond to important regulatory regions. These sites are surrounded by variable chromatin structures during mammary-gland development. In the present study, we describe the upstream sequence of the mouse WAP gene. Analysis of genomic sequences shows that the mouse WAP gene is situated between two widely expressed genes (Cpr2 and Ramp3). We show that the hypersensitive sites found upstream of the rabbit WAP gene are also detected in the mouse WAP gene. Further, they encompass functional signal transducer and activator of transcription 5-binding sites, as has been observed in the rabbit. A new hypersensitive site (HSS3), not specific to the mammary gland, was mapped 8 kb upstream of the rabbit WAP gene. Unlike the three HSSs described above, HSS3 is also detected in the liver, but similar to HSS1, it does not depend on lactogenic hormone treatments during cell culture. The region surrounding HSS3 encompasses a potential matrix attachment region, which is also conserved upstream of the mouse WAP gene and contains a functional transcription factor Ets-1 (E26 transformation-specific-1)-binding site. Finally, we demonstrate for the first time that variations in the chromatin structure are dependent on prolactin alone. PMID:12580766
Xie, Jianbo; Tian, Jiaxing; Du, Qingzhang; Chen, Jinhui; Li, Ying; Yang, Xiaohui; Li, Bailian; Zhang, Deqiang
2016-05-01
Gibberellins (GAs) regulate a wide range of important processes in plant growth and development, including photosynthesis. However, the mechanism by which GAs regulate photosynthesis remains to be understood. Here, we used multi-gene association to investigate the effect of genes in the GA-responsive pathway, as constructed by RNA sequencing, on photosynthesis, growth, and wood property traits, in a population of 435 Populus tomentosa. By analyzing changes in the transcriptome following GA treatment, we identified many key photosynthetic genes, in agreement with the observed increase in measurements of photosynthesis. Regulatory motif enrichment analysis revealed that 37 differentially expressed genes related to photosynthesis shared two essential GA-related cis-regulatory elements, the GA response element and the pyrimidine box. Thus, we constructed a GA-responsive pathway consisting of 47 genes involved in regulating photosynthesis, including GID1, RGA, GID2, MYBGa, and 37 photosynthetic differentially expressed genes. Single nucleotide polymorphism (SNP)-based association analysis showed that 142 SNPs, representing 40 candidate genes in this pathway, were significantly associated with photosynthesis, growth, and wood property traits. Epistasis analysis uncovered interactions between 310 SNP-SNP pairs from 37 genes in this pathway, revealing possible genetic interactions. Moreover, a structural gene-gene matrix based on a time-course of transcript abundances provided a better understanding of the multi-gene pathway affecting photosynthesis. The results imply a functional role for these genes in mediating photosynthesis, growth, and wood properties, demonstrating the potential of combining transcriptome-based regulatory pathway construction and genetic association approaches to detect the complex genetic networks underlying quantitative traits. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
NASA Astrophysics Data System (ADS)
Bank, Arthur; Mears, J. Gregory; Ramirez, Francesco
1980-02-01
Studies of the human hemoglobin system have provided new insights into the regulation of expression of a group of linked human genes, the γ-δ-β globin gene complex. In particular, the thalassemia syndromes and related disorders are inherited anemias that provide mutations for the study of the regulation of globin gene expression. New methods, including restriction enzyme analysis and cloning of cellular DNA, have made it feasible to define more precisely the structure and organization of the globin genes in cellular DNA. Deletions of specific globin gene fragments have already been found in certain of these disorders and have been applied in prenatal diagnosis.
A mutation in the gamma actin 1 (ACTG1) gene causes autosomal dominant hearing loss (DFNA20/26)
van Wijk, E; Krieger, E; Kemperman, M; De Leenheer, E M R; Huygen, P; Cremers, C; Cremers, F; Kremer, H
2003-01-01
Linkage analysis in a multigenerational family with autosomal dominant hearing loss yielded a chromosomal localisation of the underlying genetic defect in the DFNA20/26 locus at 17q25-qter. The 6-cM critical region harboured the γ-1-actin (ACTG1) gene, which was considered an attractive candidate gene because actins are important structural elements of the inner ear hair cells. In this study, a Thr278Ile mutation was identified in helix 9 of the modelled protein structure. The alteration of residue Thr278 is predicted to have a small but significant effect on the γ-1-actin structure owing to its close proximity to a methionine residue at position 313 in helix 11. Met313 has no space in the structure to move away. Moreover, the Thr278 residue is highly conserved throughout eukaryotic evolution. Using a known actin structure, the mutation could be predicted to impair actin polymerisation. These findings strongly suggest that the Thr278Ile mutation in ACTG1 represents the first disease-causing germline mutation in a cytoplasmic actin isoform. PMID:14684684
Cao, Yunpeng; Han, Yahui; Meng, Dandan; Li, Dahui; Jiao, Chunyan; Jin, Qing; Lin, Yi; Cai, Yongping
2017-09-19
The B-BOX (BBX) proteins have important functions in regulating plant growth and development. The BBX gene family has been identified in several plants, such as rice, Arabidopsis and tomato. However, a genome-wide survey of BBX genes in pear is still lacking. In the present study, a total of 25 BBX genes were identified in pear (Pyrus bretschneideri Rehd.). Subsequently, analyses of phylogenetic relationships, gene structure, gene duplication, transcriptome data and qRT-PCR were conducted on these BBX gene members. The transcript analysis revealed that twelve PbBBX genes (48%) were specifically expressed in pear pollen tubes. Furthermore, qRT-PCR analysis indicated that both PbBBX4 and PbBBX13 have a potential role in pear fruit development, while PbBBX5 is likely involved in the senescence of pear pollen tubes. This study provides a genome-wide survey of the BBX gene family in pear and highlights its roles in both pear fruits and pollen tubes. The results will be useful in improving our understanding of the complexity of the BBX gene family and the functional characteristics of its members in future studies.
Mao, Yizhou; Jiang, Biao; Peng, Qingwu; Liu, Wenrui; Lin, Yue; Xie, Dasen; He, Xiaoming; Li, Shaoshan
2017-05-01
The WRKY transcription factors play an important role in plant resistance to biotic and abiotic stresses. In the present study, we cloned 10 WRKY gene homologs (CqWRKY) in Chieh-qua (Benincasa hispida Cogn. var. Chieh-qua) using rapid amplification of cDNA ends (RACE) or homology-based cloning methods. We characterized the structure of these CqWRKY genes. Phylogenetic analysis of these sequences with cucumber homologs suggested possible structural conservation of these genes among cucurbit crops. We examined the expression levels of these genes in response to fusaric acid (FA) treatment in resistant and susceptible Chieh-qua lines with quantitative real-time PCR. All genes could be upregulated upon FA treatment, but four CqWRKY genes exhibited differential expression between resistant and susceptible lines before and after FA application. CqWRKY31 seemed to be a positive regulator, while CqWRKY1, CqWRKY23 and CqWRKY53 were negative regulators of fusaric acid resistance. This is the first report of the characterization of WRKY family genes in Chieh-qua. The results may also be useful in breeding Chieh-qua for Fusarium wilt resistance.
Analysis of hairpin RNA transgene-induced gene silencing in Fusarium oxysporum
2013-01-01
Background Hairpin RNA (hpRNA) transgenes can be effective at inducing RNA silencing and have been exploited as a powerful tool for gene function analysis in many organisms. However, in fungi, expression of hairpin RNA transcripts can induce post-transcriptional gene silencing, but in some species can also lead to transcriptional gene silencing, suggesting a more complex interplay of the two pathways at least in some fungi. Because many fungal species are important pathogens, RNA silencing is a powerful technique to understand gene function, particularly when gene knockouts are difficult to obtain. We investigated whether the plant pathogenic fungus Fusarium oxysporum possesses a functional gene silencing machinery and whether hairpin RNA transcripts can be employed to effectively induce gene silencing. Results Here we show that, in the phytopathogenic fungus F. oxysporum, hpRNA transgenes targeting either a β-glucuronidase (Gus) reporter transgene (hpGus) or the endogenous gene Frp1 (hpFrp) did not induce significant silencing of the target genes. Expression analysis suggested that the hpRNA transgenes are prone to transcriptional inactivation, resulting in low levels of hpRNA and siRNA production. However, the hpGus RNA can be efficiently transcribed by promoters acquired either by recombination with a pre-existing, actively transcribed Gus transgene or by fortuitous integration near an endogenous gene promoter allowing siRNA production. These siRNAs effectively induced silencing of a target Gus transgene, which in turn appeared to also induce secondary siRNA production. Furthermore, our results suggested that hpRNA transcripts without poly(A) tails are efficiently processed into siRNAs to induce gene silencing. A convergent promoter transgene, designed to express poly(A)-minus sense and antisense Gus RNAs, without an inverted-repeat DNA structure, induced consistent Gus silencing in F. oxysporum. Conclusions These results indicate that F. oxysporum possesses functional RNA silencing machineries for siRNA production and target mRNA cleavage, but hpRNA transgenes may induce transcriptional self-silencing due to their inverted-repeat structure. Our results suggest that F. oxysporum possesses a similar gene silencing pathway to other fungi like fission yeast, and indicate a need for developing more effective RNA silencing technology for gene function studies in this fungal pathogen. PMID:23819794
Adeno-associated virus inverted terminal repeats stimulate gene editing.
Hirsch, M L
2015-02-01
Advancements in genome editing have relied on technologies to specifically damage DNA which, in turn, stimulates DNA repair including homologous recombination (HR). As off-target concerns complicate the therapeutic translation of site-specific DNA endonucleases, an alternative strategy to stimulate gene editing based on fragile DNA was investigated. To do this, an episomal gene-editing reporter was generated by a disruptive insertion of the adeno-associated virus (AAV) inverted terminal repeat (ITR) into the egfp gene. Compared with a non-structured DNA control sequence, the ITR induced DNA damage as evidenced by increased gamma-H2AX and Mre11 foci formation. As local DNA damage stimulates HR, ITR-mediated gene editing was investigated using DNA oligonucleotides as repair substrates. The AAV ITR stimulated gene editing >1000-fold in a replication-independent manner and was not biased by the polarity of the repair oligonucleotide. Analysis of additional human DNA sequences demonstrated stimulation of gene editing to varying degrees. In particular, inverted yet not direct, Alu repeats induced gene editing, suggesting a role for DNA structure in the repair event. Collectively, the results demonstrate that inverted DNA repeats stimulate gene editing via double-strand break repair in an episomal context and allude to efficient gene editing of the human chromosome using fragile DNA sequences.
Clustering Algorithms: Their Application to Gene Expression Data
Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel
2016-01-01
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a powerful means of strengthening the understanding of functional genomics. The complexity of biological networks and the number of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges and is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has proven useful in revealing the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867
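As an illustration of the kind of clustering the review above surveys, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to expression profiles. It is not taken from the review: the toy matrix, the gene labels in the comments, the function name `kmeans`, and the choice to seed centroids with the first k profiles are all assumptions made only to keep the example deterministic.

```python
import math


def kmeans(profiles, k, iterations=20):
    """Plain Lloyd's k-means over expression profiles (lists of floats)."""
    # Assumption: seed centroids with the first k profiles so the run is deterministic.
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: rows are genes, columns are conditions.
expression = [
    [1.0, 1.2, 0.9],  # gene_a, low across conditions
    [5.0, 5.1, 4.8],  # gene_b, high across conditions
    [1.1, 0.8, 1.0],  # gene_c, low across conditions
    [4.9, 5.3, 5.0],  # gene_d, high across conditions
]
labels, centroids = kmeans(expression, k=2)
```

In practice the result depends on the initial centroids and on the distance measure, which is one reason a review comparing the stability of alternative algorithms is useful.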
Gao, Feng; Song, Weibo; Katz, Laura A
2014-08-01
In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that (1) alternative processing is extensive among gene families; and (2) such gene families are likely to be C. uncinata specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family-a protein kinase domain containing protein (PKc)-from two C. uncinata strains. Analysis of the PKc sequences reveals that (1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and (2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
JOINT AND INDIVIDUAL VARIATION EXPLAINED (JIVE) FOR INTEGRATED ANALYSIS OF MULTIPLE DATA TYPES.
Lock, Eric F; Hoadley, Katherine A; Marron, J S; Nobel, Andrew B
2013-03-01
Research in several fields now requires the analysis of datasets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such datasets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data, and provides new directions for the visual exploration of joint and individual structure. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.
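The three-term decomposition described in the abstract above can be written out as follows. This is a sketch under stated assumptions: the symbols $X_i$, $J_i$, $A_i$, $\varepsilon_i$, $r$ and $r_i$ are notation chosen here for illustration and are not given in the abstract itself.

```latex
% Assumed notation: X_i is the data matrix for data type i (i = 1, ..., k),
% with rows as variables and columns as the shared set of objects.
\[
  X_i = J_i + A_i + \varepsilon_i , \qquad i = 1, \dots, k ,
\]
% J_i: the part of the joint structure belonging to data type i.
% The stacked joint matrix is low rank:
\[
  \operatorname{rank}\!\begin{pmatrix} J_1 \\ \vdots \\ J_k \end{pmatrix} = r ,
  \qquad
  \operatorname{rank}(A_i) = r_i ,
\]
% A_i: low-rank structured variation individual to data type i.
% \varepsilon_i: residual noise.
```

Read this way, the joint term is shared across all data types through a common low-rank structure, while each individual term is estimated separately per data type.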
Structural and functional analyses of genes encoding VQ proteins in apple.
Dong, Qinglong; Zhao, Shuang; Duan, Dingyue; Tian, Yi; Wang, Yanpeng; Mao, Ke; Zhou, Zongshan; Ma, Fengwang
2018-07-01
Recent studies with Arabidopsis and soybean have shown that a class of valine-glutamine (VQ) motif-containing proteins interacts with some WRKY transcription factors. However, little is known about the evolution, structures, and functions of those proteins in apple. Here, we examined their features and identified 49 apple VQ genes. Our evolutionary analysis revealed that the proteins could be clustered into nine groups together with their homologues in 33 species. Historically, the main characteristics of proteins in Groups I, V, VI, VII, IX, and X were thought to have been generated before the monocot-dicot split, whereas those in Groups II, III + IV, and VIII were generated after that split. In the structural analysis, apple MdVQ proteins appeared to bind only with Group I and IIc MdWRKY proteins. Meanwhile, MdVQ1, MdVQ10, MdVQ15, and MdVQ36 interacted with multiple MdVQ proteins to form heterodimers, but MdVQ15 formed a homodimer. The functional analysis indicated that overexpression of some apple MdVQs in Arabidopsis and tobacco plants affected their vegetative and reproductive growth. These results provide important information about the characteristics of apple MdVQ genes and can serve as a solid foundation for further studies about the role of WRKY-VQ interactions in regulating apple developmental and defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Chassain, Benoît; Lemée, Ludovic; Didi, Jennifer; Thiberge, Jean-Michel; Brisse, Sylvain; Pons, Jean-Louis
2012-01-01
Staphylococcus lugdunensis is recognized as one of the major pathogenic species within the genus Staphylococcus, even though it belongs to the coagulase-negative group. A multilocus sequence typing (MLST) scheme was developed to study the genetic relationships and population structure of 87 S. lugdunensis isolates from various clinical and geographic sources by DNA sequence analysis of seven housekeeping genes (aroE, dat, ddl, gmk, ldh, recA, and yqiL). The number of alleles ranged from four (gmk and ldh) to nine (yqiL). Allelic profiles allowed the definition of 20 different sequence types (STs) and five clonal complexes. The 20 STs lacked correlation with geographic source. Isolates recovered from hematogenic infections (blood or osteoarticular isolates) or from skin and soft tissue infections did not cluster in separate lineages. Penicillin-resistant isolates clustered mainly in one clonal complex, unlike glycopeptide-tolerant isolates, which did not constitute a distinct subpopulation within S. lugdunensis. Phylogenies from the sequences of the seven individual housekeeping genes were congruent, indicating a predominantly mutational evolution of these genes. Quantitative analysis of the linkages between alleles from the seven loci revealed a significant linkage disequilibrium, thus confirming a clonal population structure for S. lugdunensis. This first MLST scheme for S. lugdunensis provides a new tool for investigating the macroepidemiology and phylogeny of this unusually virulent coagulase-negative Staphylococcus. PMID:22785196
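The step from allelic profiles to sequence types in the abstract above can be sketched as follows. The seven locus names come from the abstract; everything else (the isolate names, the allele numbers, and the function `assign_sequence_types`) is hypothetical, and the ST numbers it produces are arbitrary labels, not those of the published scheme.

```python
# Locus names as listed in the abstract.
LOCI = ("aroE", "dat", "ddl", "gmk", "ldh", "recA", "yqiL")


def assign_sequence_types(isolates):
    """Give each distinct seven-locus allelic profile its own ST number."""
    st_by_profile = {}
    assignments = {}
    for name, alleles in isolates.items():
        # The allelic profile is the ordered tuple of allele numbers at the loci.
        profile = tuple(alleles[locus] for locus in LOCI)
        if profile not in st_by_profile:
            # First time this profile is seen: allocate the next ST number.
            st_by_profile[profile] = len(st_by_profile) + 1
        assignments[name] = st_by_profile[profile]
    return assignments


# Hypothetical isolates; allele numbers are invented for illustration.
isolates = {
    "iso1": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),
    "iso2": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 2))),  # differs from iso1 at yqiL only
    "iso3": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),  # same profile as iso1
}
sts = assign_sequence_types(isolates)
```

Two isolates share an ST only when all seven allele numbers match; grouping closely related STs into clonal complexes is a further step that this sketch does not attempt.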
Analysis of Craniocardiac Malformations in Xenopus using Optical Coherence Tomography
Deniz, Engin; Jonas, Stephan; Hooper, Michael; N. Griffin, John; Choma, Michael A.; Khokha, Mustafa K.
2017-01-01
Birth defects affect 3% of children in the United States. Among the birth defects, congenital heart disease and craniofacial malformations are major causes of mortality and morbidity. Unfortunately, the genetic mechanisms underlying craniocardiac malformations remain largely uncharacterized. To address this, human genomic studies are identifying sequence variations in patients, resulting in numerous candidate genes. However, the molecular mechanisms of pathogenesis for most candidate genes are unknown. Therefore, there is a need for functional analyses in rapid and efficient animal models of human disease. Here, we coupled the frog Xenopus tropicalis with Optical Coherence Tomography (OCT) to create a fast and efficient system for testing craniocardiac candidate genes. OCT can image cross-sections of microscopic structures in vivo at resolutions approaching histology. Here, we identify optimal OCT imaging planes to visualize and quantitate Xenopus heart and facial structures establishing normative data. Next we evaluate known human congenital heart diseases: cardiomyopathy and heterotaxy. Finally, we examine craniofacial defects by a known human teratogen, cyclopamine. We recapitulate human phenotypes readily and quantify the functional and structural defects. Using this approach, we can quickly test human craniocardiac candidate genes for phenocopy as a critical first step towards understanding disease mechanisms of the candidate genes. PMID:28195132
2013-01-01
Background In recent years, various types of cellular networks have entered biology and are now widely used for studying eukaryotic and prokaryotic organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484
Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.
2014-01-01
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272
Homology-dependent Gene Silencing in Paramecium
Ruiz, Françoise; Vayssié, Laurence; Klotz, Catherine; Sperling, Linda; Madeddu, Luisa
1998-01-01
Microinjection at high copy number of plasmids containing only the coding region of a gene into the Paramecium somatic macronucleus led to a marked reduction in the expression of the corresponding endogenous gene(s). The silencing effect, which is stably maintained throughout vegetative growth, has been observed for all Paramecium genes examined so far: a single-copy gene (ND7), as well as members of multigene families (centrin genes and trichocyst matrix protein genes) in which all closely related paralogous genes appeared to be affected. This phenomenon may be related to posttranscriptional gene silencing in transgenic plants and quelling in Neurospora and allows the efficient creation of specific mutant phenotypes thus providing a potentially powerful tool to study gene function in Paramecium. For the two multigene families that encode proteins that coassemble to build up complex subcellular structures the analysis presented herein provides the first experimental evidence that the members of these gene families are not functionally redundant. PMID:9529389
Distribution of mutations in the PEX gene in families with X-linked hypophosphataemic rickets (HYP).
Rowe, P S; Oudet, C L; Francis, F; Sinding, C; Pannetier, S; Econs, M J; Strom, T M; Meitinger, T; Garabedian, M; David, A; Macher, M A; Questiaux, E; Popowska, E; Pronicka, E; Read, A P; Mokrzycki, A; Glorieux, F H; Drezner, M K; Hanauer, A; Lehrach, H; Goulding, J N; O'Riordan, J L
1997-04-01
Mutations in the PEX gene at Xp22.1 (phosphate-regulating gene with homologies to endopeptidases, on the X-chromosome) are responsible for X-linked hypophosphataemic rickets (HYP). Homology of PEX to the M13 family of Zn2+ metallopeptidases, which includes neprilysin (NEP) as prototype, has raised important questions regarding PEX function at the molecular level. The aim of this study was to analyse 99 HYP families for PEX gene mutations, and to correlate predicted changes in the protein structure with Zn2+ metallopeptidase gene function. Primers flanking 22 characterised exons were used to amplify DNA by PCR, and SSCP was then used to screen for mutations. Deletions, insertions, nonsense mutations, stop codons and splice mutations occurred in 83% of families screened in all 22 exons, and 51% of a separate set of families screened in 17 PEX gene exons. Missense mutations in four regions of the gene were informative regarding function, with one mutation in the Zn2+-binding site predicted to alter substrate-enzyme interaction and catalysis. Computer analysis of the remaining mutations predicted changes in secondary structure, N-glycosylation, protein phosphorylation and catalytic site molecular structure. The wide range of mutations that align with regions required for protease activity in NEP suggests that PEX also functions as a protease, and may act by processing factor(s) involved in bone mineral metabolism.
Analysis of Flavonoids and the Flavonoid Structural Genes in Brown Fiber of Upland Cotton
Liu, Yongchang; Li, Yanjun; Zhang, Xinyu; Jones, Brian Joseph; Sun, Yuqiang; Sun, Jie
2013-01-01
Background As a result of changing consumer preferences, cotton (Gossypium hirsutum L.) from varieties with naturally colored fibers is becoming increasingly sought after in the textile industry. The molecular mechanisms leading to colored fiber development are still largely unknown, although it is expected that the color is derived from flavonoids. Experimental Design First, four key genes of the flavonoid biosynthetic pathway in cotton (GhC4H, GhCHS, GhF3′H, and GhF3′5′H) were cloned, and their expression profiles during the development of brown and white cotton fibers were studied by qRT-PCR. Then, the concentrations of four components of the flavonoid biosynthetic pathway (naringenin, quercetin, kaempferol and myricetin) in brown and white fibers were analyzed at different developmental stages by HPLC. Results The predicted proteins encoded by the four flavonoid structural genes exhibit strong sequence similarity to their counterparts in various plant species. Transcript levels for all four genes were considerably higher in developing brown fibers than in white fibers from a near isogenic line (NIL). The contents of the four flavonoids (naringenin, quercetin, kaempferol and myricetin) were significantly higher in brown than in white fibers, corresponding to the biosynthetic gene expression levels. Conclusions Flavonoid structural gene expression and flavonoid metabolism are important in the development of pigmentation in brown cotton fibers. PMID:23527031
Shitsukawa, Naoki; Tahira, Chikako; Kassai, Ken-Ichiro; Hirabayashi, Chizuru; Shimizu, Tomoaki; Takumi, Shigeo; Mochida, Keiichi; Kawaura, Kanako; Ogihara, Yasunari; Murai, Koji
2007-06-01
Bread wheat (Triticum aestivum) is a hexaploid species with A, B, and D ancestral genomes. Most bread wheat genes are present in the genome as triplicated homoeologous genes (homoeologs) derived from the ancestral species. Here, we report that both genetic and epigenetic alterations have occurred in the homoeologs of a wheat class E MADS box gene. Two class E genes are identified in wheat, wheat SEPALLATA (WSEP) and wheat LEAFY HULL STERILE1 (WLHS1), which are homologs of Os MADS45 and Os MADS1 in rice (Oryza sativa), respectively. The three wheat homoeologs of WSEP showed similar genomic structures and expression profiles. By contrast, the three homoeologs of WLHS1 showed genetic and epigenetic alterations. The A genome WLHS1 homoeolog (WLHS1-A) had a structural alteration that contained a large novel sequence in place of the K domain sequence. A yeast two-hybrid analysis and a transgenic experiment indicated that the WLHS1-A protein had no apparent function. The B and D genome homoeologs, WLHS1-B and WLHS1-D, respectively, had an intact MADS box gene structure, but WLHS1-B was predominantly silenced by cytosine methylation. Consequently, of the three WLHS1 homoeologs, only WLHS1-D functions in hexaploid wheat. This is a situation where three homoeologs are differentially regulated by genetic and epigenetic mechanisms.
Limited family structure and BRCA gene mutation status in single cases of breast cancer.
Weitzel, Jeffrey N; Lagos, Veronica I; Cullinane, Carey A; Gambol, Patricia J; Culver, Julie O; Blazer, Kathleen R; Palomares, Melanie R; Lowstuter, Katrina J; MacDonald, Deborah J
2007-06-20
An autosomal dominant pattern of hereditary breast cancer may be masked by small family size or transmission through males given sex-limited expression. The objective was to determine whether BRCA gene mutations are more prevalent among single cases of early onset breast cancer in families with limited vs adequate family structure than would be predicted by currently available probability models. A total of 1543 women seen at US high-risk clinics for genetic cancer risk assessment and BRCA gene testing were enrolled in a prospective registry study between April 1997 and February 2007. Three hundred six of these women had breast cancer before age 50 years and no first- or second-degree relatives with breast or ovarian cancers. The main outcome measure was whether family structure, assessed from multigenerational pedigrees, predicts BRCA gene mutation status. Limited family structure was defined as fewer than 2 first- or second-degree female relatives surviving beyond age 45 years in either lineage. Family structure effect and mutation probability by the Couch, Myriad, and BRCAPRO models were assessed with stepwise multiple logistic regression. Model sensitivity and specificity were determined and receiver operating characteristic curves were generated. Family structure was limited in 153 cases (50%). BRCA gene mutations were detected in 13.7% of participants with limited vs 5.2% with adequate family structure. Family structure was a significant predictor of mutation status (odds ratio, 2.8; 95% confidence interval, 1.19-6.73; P = .02). Although none of the models performed well, receiver operating characteristic analysis indicated that modification of BRCAPRO output by a corrective probability index accounting for family structure was the most accurate BRCA gene mutation status predictor (area under the curve, 0.72; 95% confidence interval, 0.63-0.81; P<.001) for single cases of breast cancer. Family structure can affect the accuracy of mutation probability models. Genetic testing guidelines may need to be more inclusive for single cases of breast cancer when the family structure is limited, and probability models need to be recreated using limited family history as an actual variable.
Hassan, Md. Imtaiyaz; Waheed, Abdul; Grubb, Jeffery H.; Klei, Herbert E.; Korolev, Sergey; Sly, William S.
2013-01-01
Human β-glucuronidase (GUS) cleaves β-D-glucuronic acid residues from the non-reducing termini of glycosaminoglycans, and its deficiency leads to mucopolysaccharidosis type VII (MPSVII). Here we report a crystal structure of human GUS at 1.7 Å resolution and present an extensive analysis of the structural features, unifying recent findings in the field of lysosome targeting and glycosyl hydrolases. The structure revealed several new details including a new glycan chain at Asn272, in addition to that previously observed at Asn173, and coordination of the glycan chain at Asn173 with Lys197 of the lysosomal targeting motif, which is essential for phosphotransferase recognition. Analysis of the high resolution structure not only provided new insights into the structural basis for lysosomal targeting but also showed significant differences between human GUS, which is medically important in its own right, and E. coli GUS, which can be selectively inhibited in the human gut to prevent prodrug activation and is also widely used as a reporter gene by plant biologists. Despite these differences, both human and E. coli GUS share high structural homology in all three domains with most of the glycosyl hydrolases, suggesting that they all evolved from a common ancestral gene. PMID:24260279
New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison
Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.
2011-01-01
Background: Helicobacter pylori has a reduced genome and persists long-term in a harsh environment. It has evolved particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand the genomic adaptation of this bacterium. Principal Findings: We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor in shaping the genome structure. Illegitimate recombination led not only to genomic inversion but also to inverted fragment duplication, both of which contributed to the creation of new genes and gene families; in addition, homologous recombination contributed to inversion events. Based on the information on genomic rearrangement, the first genome scaffold structure of the H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes could be particularly adapted to the human stomach niche. H. pylori contains a high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through changes in DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments to be frequently transferred within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation in the genomes, which corresponds to a special feature of Helicobacter species: an extremely high HPN composition of the genome. Conclusion: Our research demonstrated that both the genome content and the genome structure of H. pylori have been highly adapted to its particular lifestyle. PMID:21387011
Ernst, Antonia M; Jekat, Stephan B; Zielonka, Sascia; Müller, Boje; Neumann, Ulla; Rüping, Boris; Twyman, Richard M; Krzyzanek, Vladislav; Prüfer, Dirk; Noll, Gundula A
2012-07-10
The sieve element occlusion (SEO) gene family originally was delimited to genes encoding structural components of forisomes, which are specialized crystalloid phloem proteins found solely in the Fabaceae. More recently, SEO genes discovered in various non-Fabaceae plants were proposed to encode the common phloem proteins (P-proteins) that plug sieve plates after wounding. We carried out a comprehensive characterization of two tobacco (Nicotiana tabacum) SEO genes (NtSEO). Reporter genes controlled by the NtSEO promoters were expressed specifically in immature sieve elements, and GFP-SEO fusion proteins formed parietal agglomerates in intact sieve elements as well as sieve plate plugs after wounding. NtSEO proteins with and without fluorescent protein tags formed agglomerates similar in structure to native P-protein bodies when transiently coexpressed in Nicotiana benthamiana, and the analysis of these protein complexes by electron microscopy revealed ultrastructural features resembling those of native P-proteins. NtSEO-RNA interference lines were essentially devoid of P-protein structures and lost photoassimilates more rapidly after injury than control plants, thus confirming the role of P-proteins in sieve tube sealing. We therefore provide direct evidence that SEO genes in tobacco encode P-protein subunits that affect translocation. We also found that peptides recently identified in fascicular phloem P-protein plugs from squash (Cucurbita maxima) represent cucurbit members of the SEO family. Our results therefore suggest a common evolutionary origin for P-proteins found in the sieve elements of all dicotyledonous plants and demonstrate the exceptional status of extrafascicular P-proteins in cucurbits.
Schilf, Paul; Peter, Annette; Hurek, Thomas; Stick, Reimer
2014-07-01
Lamin proteins are found in all metazoans. Most non-vertebrate genomes, including those of the closest relatives of vertebrates, the cephalochordates and tunicates, encode only a single lamin. In teleosts and tetrapods the number of lamin genes has quadrupled. They can be divided into four sub-types, lmnb1, lmnb2, LIII, and lmna, each characterized by particular features and functional differentiations. Little is known about when during vertebrate evolution these features emerged. Lampreys belong to the Agnatha, the sister group of the Gnathostomata. They split off first within the vertebrate lineage. Analysis of the sea lamprey (Petromyzon marinus) lamin complement presented here identified three functional lamin genes, one encoding a lamin LIII, indicating that the characteristic gene structure of this subtype had been established prior to the agnathan/gnathostome split. Two other genes encode lamins for which orthology to gnathostome lamins cannot be designated. A search for lamin gene sequences in all vertebrate taxa for which sufficient sequence data are available reveals the evolutionary time frame in which specific features of the vertebrate lamins were established. Structural features characteristic of A-type lamins are not found in the lamprey genome. In contrast, lmna genes are present in all gnathostome lineages, suggesting that this gene evolved with the emergence of the gnathostomes. The analysis of lamin gene neighborhoods reveals noticeable similarities between the different vertebrate lamin genes, supporting the hypothesis that they emerged due to two rounds of whole genome duplication, and makes clear that an orthologous relationship between a particular vertebrate paralog and lamins outside the vertebrate lineage cannot be established. Copyright © 2014 Elsevier GmbH. All rights reserved.
Ventura, Marco; Kenny, John G; Zhang, Ziding; Fitzgerald, Gerald F; van Sinderen, Douwe
2005-09-01
The so-called clp genes, which encode components of the Clp proteolytic complex, are widespread among bacteria. The Bifidobacterium breve UCC 2003 genome contains a clpB gene with significant homology to predicted clpB genes from other members of the Actinobacteridae group. The heat- and osmotic-inducibility of the B. breve UCC 2003 clpB homologue was verified by slot-blot analysis, while Northern blot and primer extension analyses showed that the clpB gene is transcribed as a monocistronic unit with a single promoter. The role of an hspR homologue, known to regulate clpB and dnaK gene expression in other high G+C content bacteria, was investigated by gel mobility shift assays. Moreover, the predicted 3D structure of HspR provides further insight into the binding mode of this protein to the clpB promoter region, and highlights the key amino acid residues believed to be involved in the protein-DNA interaction.
Yan, Zaisheng; He, Yuhong; Cai, Haiyuan; Van Nostrand, Joy D; He, Zhili; Zhou, Jizhong; Krumholz, Lee R; Jiang, He-Long
2017-08-01
Sediment microbial fuel cells (SMFCs) can stimulate the degradation of polycyclic aromatic hydrocarbons in sediments, but the mechanism of this process is poorly understood at the microbial functional gene level. Here, the use of SMFC resulted in 92% benzo[a]pyrene (BaP) removal over 970 days relative to 54% in the controls. Sediment functions, microbial community structure, and network interactions were dramatically altered by SMFC operation. Functional gene analysis showed that c-type cytochrome genes for electron transfer, aromatic degradation genes, and extracellular ligninolytic enzymes involved in lignin degradation were significantly enriched in bulk sediments during SMFC operation. Correspondingly, chemical analysis of the system showed that these genetic changes resulted in increases in the levels of easily oxidizable organic carbon and humic acids, which may have resulted in increased BaP bioavailability and increased degradation rates. Tracking microbial functional genes and corresponding organic matter responses should aid mechanistic understanding of enhanced BaP biodegradation by microbial electrochemistry and the development of sustainable bioremediation strategies.