Lampreys, the jawless vertebrates, contain only two ParaHox gene clusters.
Zhang, Huixian; Ravi, Vydianathan; Tay, Boon-Hui; Tohari, Sumanty; Pillai, Nisha E; Prasad, Aravind; Lin, Qiang; Brenner, Sydney; Venkatesh, Byrappa
2017-08-22
ParaHox genes (Gsx, Pdx, and Cdx) are an ancient family of developmental genes closely related to the Hox genes. They play critical roles in the patterning of brain and gut. The basal chordate, amphioxus, contains a single ParaHox cluster comprising one member of each family, whereas nonteleost jawed vertebrates contain four ParaHox genomic loci with six or seven ParaHox genes. Teleosts, which have experienced an additional whole-genome duplication, contain six ParaHox genomic loci with six ParaHox genes. Jawless vertebrates, represented by lampreys and hagfish, are the most ancient group of vertebrates and are crucial for understanding the origin and evolution of vertebrate gene families. We have previously shown that lampreys contain six Hox gene loci. Here we report that lampreys contain only two ParaHox gene clusters (designated as α- and β-clusters) bearing five ParaHox genes (Gsxα, Pdxα, Cdxα, Gsxβ, and Cdxβ). The order and orientation of the three genes in the α-cluster are identical to that of the single cluster in amphioxus. However, the orientation of Gsxβ in the β-cluster is inverted. Interestingly, Gsxβ is expressed in the eye, unlike its homologs in jawed vertebrates, which are expressed mainly in the brain. The lamprey Pdxα is expressed in the pancreas similar to jawed vertebrate Pdx genes, indicating that the pancreatic expression of Pdx was acquired before the divergence of jawless and jawed vertebrate lineages. It is likely that the lamprey Pdxα plays a crucial role in pancreas specification and insulin production similar to the Pdx of jawed vertebrates.
Wang, Guang-Zhong; Lercher, Martin J.; Hurst, Laurence D.
2011-01-01
How is noise in gene expression modulated? Do mechanisms of noise control impact genome organization? In yeast, the expression of one gene can affect that of a very close neighbor. As the effect is highly regionalized, we hypothesize that genes in different orientations will have differing degrees of coupled expression and, in turn, different noise levels. Divergently organized gene pairs, in particular those with bidirectional promoters, have close promoters, maximizing the likelihood that expression of one gene affects the neighbor. With more distant promoters, the same is less likely to hold for gene pairs in nondivergent orientation. Stochastic models suggest that coupled chromatin dynamics will typically result in low abundance-corrected noise (ACN). We thus hypothesize that transcription of noncoding RNA (ncRNA) from a bidirectional promoter is a noise-reduction, expression-priming mechanism. The hypothesis correctly predicts that protein-coding genes with a bidirectional promoter, including those with a ncRNA partner, have lower ACN than other genes and that divergent gene pairs uniquely have correlated ACN. Moreover, as predicted, ACN increases with the distance between promoters. The model also correctly predicts that ncRNA transcripts are often divergently transcribed from genes that a priori would be under selection for low noise (essential genes, protein complex genes) and that the latter genes should commonly reside in divergent orientation. Likewise, genes with bidirectional promoters are, as expected, rare subtelomerically, clustered together, and enriched in essential gene clusters. We conclude that gene orientation and transcription of ncRNAs are candidate modulators of noise. PMID:21402863
USDA-ARS's Scientific Manuscript database
Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...
Luis F. Larrondo; Bernardo Gonzalez; Dan Cullen; Rafael Vicuna
2004-01-01
A cluster of multicopper oxidase genes (mco1, mco2, mco3, mco4) from the lignin-degrading basidiomycete Phanerochaete chrysosporium is described. The four genes share the same transcriptional orientation within a 25 kb region. mco1, mco2 and mco3 are tightly grouped, with intergenic regions of 2.3 and 0.8 kb, respectively, whereas mco4 is located 11 kb upstream of mco1...
Das, G; Henning, D; Wright, D; Reddy, R
1988-01-01
Whereas the genes coding for trimethyl guanosine-capped snRNAs are transcribed by RNA polymerase II, the U6 RNA genes are transcribed by RNA polymerase III. In this study, we have analyzed the cis-regulatory elements involved in the transcription of a mouse U6 snRNA gene in vitro and in frog oocytes. Transcriptional analysis of mutant U6 gene constructs showed that, unlike most known cases of polymerase III transcription, intragenic sequences except the initiation nucleotide are dispensable for efficient and accurate transcription of the U6 gene in vitro. Transcription of 5' deletion mutants in vitro and in frog oocytes showed that the upstream region, within 79 bp from the initiation nucleotide, contains elements necessary for U6 gene transcription. Transcription studies were carried out in frog oocytes with U6 genes containing 5' distal sequence; these studies revealed that the distal element acts as an orientation-dependent enhancer when present upstream of the gene, whereas it is an orientation-independent but distance-dependent enhancer when placed downstream of the U6 gene. Analysis of 3' deletion mutants showed that the transcription termination of U6 RNA is dependent on a T cluster present on the 3' end of the gene, thus providing further support to other lines of evidence that U6 genes are transcribed by RNA polymerase III. These observations suggest the involvement of a composite of components of RNA polymerase II and III transcription machineries in the transcription of U6 genes by RNA polymerase III. PMID:3366121
Siegel, Nicol; Hoegg, Simone; Salzburger, Walter; Braasch, Ingo; Meyer, Axel
2007-01-01
Background: The evolutionary lineage leading to the teleost fish underwent a whole genome duplication termed FSGD or 3R in addition to two prior genome duplications that took place earlier during vertebrate evolution (termed 1R and 2R). Resulting from the FSGD, additional copies of genes are present in fish, compared to tetrapods whose lineage did not experience the 3R genome duplication. Interestingly, we find that, despite their additional genome duplication, ParaHox genes in extant teleost fishes do not differ in number from the genomic situation in mammals, but they are distributed over twice as many paralogous regions in fish genomes. Results: We determined the DNA sequence of the entire ParaHox C1 paralogon in the East African cichlid fish Astatotilapia burtoni, and compared it to orthologous regions in other vertebrate genomes as well as to the paralogous vertebrate ParaHox D paralogons. Evolutionary relationships among genes from these four chromosomal regions were studied with several phylogenetic algorithms. We provide evidence that the genes of the ParaHox C paralogous cluster are duplicated in teleosts, just as had been shown previously for the D paralogon genes. Overall, however, synteny and cluster integrity seem to be less conserved in ParaHox gene clusters than in Hox gene clusters. Comparative analyses of non-coding sequences uncovered conserved, possibly co-regulatory elements, which are likely to contain promoter motifs of the genes belonging to the ParaHox paralogons. Conclusion: There seems to be strong stabilizing selection for gene order as well as gene orientation in the ParaHox C paralogon, since with a few exceptions, only the lengths of the introns and intergenic regions differ between the distantly related species examined.
The high degree of evolutionary conservation of this gene cluster's architecture in particular – but possibly clusters of genes more generally – might be linked to the presence of promoter, enhancer or inhibitor motifs that serve to regulate more than just one gene. Therefore, deletions, inversions or relocations of individual genes could destroy the regulation of the clustered genes in this region. The existence of such a regulation network might explain the evolutionary conservation of gene order and orientation over the course of hundreds of millions of years of vertebrate evolution. Another possible explanation for the highly conserved gene order might be the existence of a regulator not located immediately next to its corresponding gene but further away since a relocation or inversion would possibly interrupt this interaction. Different ParaHox clusters were found to have experienced differential gene loss in teleosts. Yet the complete set of these homeobox genes was maintained, albeit distributed over almost twice the number of chromosomes. Selection due to dosage effects and/or stoichiometric disturbance might act more strongly to maintain a modal number of homeobox genes (and possibly transcription factors more generally) per genome, yet permit the accumulation of other (non regulatory) genes associated with these homeobox gene clusters. PMID:17822543
Heyers, Dominik; Manns, Martina; Luksch, Harald; Güntürkün, Onur; Mouritsen, Henrik
2007-09-26
The magnetic compass of migratory birds has been suggested to be light-dependent. Retinal cryptochrome-expressing neurons and a forebrain region, "Cluster N", show high neuronal activity when night-migratory songbirds perform magnetic compass orientation. By combining neuronal tracing with behavioral experiments leading to sensory-driven gene expression of the neuronal activity marker ZENK during magnetic compass orientation, we demonstrate a functional neuronal connection between the retinal neurons and Cluster N via the visual thalamus. Thus, the two areas of the central nervous system being most active during magnetic compass orientation are part of an ascending visual processing stream, the thalamofugal pathway. Furthermore, Cluster N seems to be a specialized part of the visual wulst. These findings strongly support the hypothesis that migratory birds use their visual system to perceive the reference compass direction of the geomagnetic field and that migratory birds "see" the reference compass direction provided by the geomagnetic field.
Hong, Sun Woo; Yoo, Jae-Wook; Bose, Shambhunath; Park, Jung-Hyun; Han, Kyungsun; Kim, Soyoun; Lim, Chi-Yeon; Kim, Hojun; Lee, Dong-Ki
2015-10-19
Human constitution, the fundamental basis of oriental medicine, is categorized into different patterns for a particular disease according to the physical, physiological, and clinical characteristics of the individuals. Obesity, a condition of metabolic disorder, is classified according to six patterns in oriental medicine, as follows: spleen deficiency syndrome, phlegm fluid syndrome, yang deficiency syndrome (YDS), food accumulation syndrome (FAS), liver depression syndrome (LDS), and blood stasis syndrome. In oriental medicine, identification of the disease pattern for individual obese patients is performed on the basis of differentiation in obesity syndrome index and, accordingly, personalized treatment is provided to the patients. The aim of the current study was to understand the obesity patterns in oriental medicine from the genomic point of view via determining the gene expression signature of obese patients using peripheral blood mononuclear cells as the samples. The study was conducted in 23 South Korean obese subjects (19 female and four male) with BMI ≥25 kg/m². Identification of oriental obesity pattern was based on the software-guided evaluation of the responses of the subjects to a questionnaire developed by the Korean Institute of Oriental Medicine. The expression profiles of genes were determined using DNA microarray and the level of transcription of genes of interest was further evaluated using quantitative real-time PCR (qRT-PCR). Gene clustering analysis of the microarray data from the FAS, LDS, and YDS subjects exhibited disease pattern-specific upregulation of expression of several genes in a particular cluster.
Further analysis of transcription of selected genes using qRT-PCR led to identification of specific genes, including prostaglandin endoperoxide synthase 2, G0/G1 switch 2, carcinoembryonic antigen-related cell adhesion molecule 3, cysteine-serine-rich nuclear protein 1, and interleukin 8 receptor, alpha, which were highly expressed in the LDS obesity constitution. Our current study can be considered a valuable contribution to understanding a possible explanation for obesity pattern differentiation in oriental medicine. Further studies can address the novel possibility that the genomic and oriental empirical approaches can be combined and implemented in the systematic and synergistic development of personalized medicine. This clinical trial was registered in the Clinical Research Information Service of the Korea National Institute of Health (https://cris.nih.go.kr/cris/index.jsp). KCT0000387.
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.
Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun
2012-01-01
Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, which are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database-LCGbase (a comprehensive database for lineage-based co-regulated genes)-hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes that flanking the cluster across species. 
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
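The gene-pair definitions in the abstract above (co-directional, convergent, and divergent adjacent genes, plus the distance between their transcription start sites) can be illustrated with a minimal Python sketch. The `Gene` record, its coordinate convention, and the function names are hypothetical and are not taken from LCGbase itself.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are on the forward strand."""
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # '+' or '-'


def tss(gene: Gene) -> int:
    """Transcription start site: left end for '+' genes, right end for '-' genes."""
    return gene.start if gene.strand == "+" else gene.end


def classify_pair(a: Gene, b: Gene) -> str:
    """Classify two adjacent genes by their relative orientation."""
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        return "co-directional"   # -> ->  or  <- <-
    if left.strand == "-" and right.strand == "+":
        return "divergent"        # <- ->  (transcribed away from each other)
    return "convergent"           # -> <-  (transcribed toward each other)


def tss_distance(a: Gene, b: Gene) -> int:
    """Distance between the transcription start sites of two genes."""
    return abs(tss(a) - tss(b))
```

Under these assumptions, a divergent pair has its two start sites facing each other across the intergenic region, so `tss_distance` is smallest for that orientation when gene lengths and spacing are comparable.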
Identification of lethal cluster of genes in the yeast transcription network
NASA Astrophysics Data System (ADS)
Rho, K.; Jeong, H.; Kahng, B.
2006-05-01
Identification of essential or lethal genes would be one of the ultimate goals in drug designs. Here we introduce an in silico method to select the cluster with a high population of lethal genes, called lethal cluster, through microarray assay. We construct a gene transcription network based on the microarray expression level. Links are added one by one in the descending order of the Pearson correlation coefficients between two genes. As the link density p increases, two meaningful link densities pm and ps are observed. At pm, which is smaller than the percolation threshold, the number of disconnected clusters is maximum, and the lethal genes are highly concentrated in a certain cluster that needs to be identified. Thus the deletion of all genes in that cluster could efficiently lead to a lethal inviable mutant. This lethal cluster can be identified by an in silico method. As p increases further beyond the percolation threshold, the power law behavior in the degree distribution of a giant cluster appears at ps. We measure the degree of each gene at ps. With the information pertaining to the degrees of each gene at ps, we return to the point pm and calculate the mean degree of genes of each cluster. We find that the lethal cluster has the largest mean degree.
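The link-addition procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the link density p is the fraction of all possible gene pairs linked so far, and that a "cluster" is a connected component with at least two genes. Neither definition is stated in the abstract, and the function names are hypothetical.

```python
import numpy as np


def cluster_counts(expr):
    """Add links in descending order of Pearson correlation and record,
    after each link, (link density p, number of clusters).

    expr: array of shape (genes, conditions).
    Assumptions: p = links added / possible links; a cluster is a
    connected component containing at least two genes.
    """
    n = expr.shape[0]
    corr = np.corrcoef(expr)
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(-corr[rows, cols], kind="stable")

    parent = list(range(n))  # union-find forest
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    n_clusters = 0
    history = []
    for k, idx in enumerate(order, start=1):
        ra, rb = find(int(rows[idx])), find(int(cols[idx]))
        if ra != rb:
            if size[ra] == 1 and size[rb] == 1:
                n_clusters += 1   # two isolated genes form a new cluster
            elif size[ra] > 1 and size[rb] > 1:
                n_clusters -= 1   # two existing clusters merge
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
        history.append((k / len(order), n_clusters))
    return history


def find_pm(history):
    """Link density at which the number of clusters is first maximal."""
    return max(history, key=lambda item: item[1])[0]
```

Identifying the lethal cluster itself would additionally require the degree of each gene at the second density (ps in the abstract), which this sketch does not compute.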
Cheng, Yi-Qiang; Yang, Min; Matter, Andrea M
2007-06-01
A gene cluster responsible for the biosynthesis of anticancer agent FK228 has been identified, cloned, and partially characterized in Chromobacterium violaceum no. 968. First, a genome-scanning approach was applied to identify three distinctive C. violaceum no. 968 genomic DNA clones that code for portions of nonribosomal peptide synthetase and polyketide synthase. Next, a gene replacement system developed originally for Pseudomonas aeruginosa was adapted to inactivate the genomic DNA-associated candidate natural product biosynthetic genes in vivo with high efficiency. Inactivation of a nonribosomal peptide synthetase-encoding gene completely abolished FK228 production in mutant strains. Subsequently, the entire FK228 biosynthetic gene cluster was cloned and sequenced. This gene cluster is predicted to encompass a 36.4-kb DNA region that includes 14 genes. The products of nine biosynthetic genes are proposed to constitute an unusual hybrid nonribosomal peptide synthetase-polyketide synthase-nonribosomal peptide synthetase assembly line including accessory activities for the biosynthesis of FK228. In particular, a putative flavin adenine dinucleotide-dependent pyridine nucleotide-disulfide oxidoreductase is proposed to catalyze disulfide bond formation between two sulfhydryl groups of cysteine residues as the final step in FK228 biosynthesis. Acquisition of the FK228 biosynthetic gene cluster and acclimation of an efficient genetic system should enable genetic engineering of the FK228 biosynthetic pathway in C. violaceum no. 968 for the generation of structural analogs as anticancer drug candidates.
Lukashin, A V; Fuchs, R
2001-05-01
Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm guarantees to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search of the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. The employment of this statistically rigorous test has shown that our algorithm places greater than 90% genes into correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
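A simulated annealing clustering of expression profiles along the lines described above might look like the following sketch. The cost function (within-cluster sum of squared distances to the cluster mean), the single-gene move, and the geometric cooling schedule are assumptions chosen for illustration; the abstract does not specify them, and the iterative scheme for choosing the number of clusters is not included.

```python
import numpy as np


def energy(profiles, labels, k):
    """Within-cluster sum of squared distances to each cluster mean
    (an assumed cost function, not necessarily the one used in the paper)."""
    total = 0.0
    for c in range(k):
        members = profiles[labels == c]
        if len(members):
            total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)


def anneal(profiles, k, labels=None, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Assign each profile to one of k clusters by simulated annealing.
    Returns the best assignment seen and its energy."""
    rng = np.random.default_rng(seed)
    n = len(profiles)
    labels = rng.integers(k, size=n) if labels is None else np.array(labels)
    e = energy(profiles, labels, k)
    best_labels, best_e = labels.copy(), e
    t = t0
    for _ in range(steps):
        i = rng.integers(n)
        old = labels[i]
        labels[i] = rng.integers(k)        # propose moving one gene
        e_new = energy(profiles, labels, k)
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if e_new <= e or rng.random() < np.exp((e - e_new) / t):
            e = e_new
            if e < best_e:
                best_labels, best_e = labels.copy(), e
        else:
            labels[i] = old                # reject: undo the move
        t *= cooling                       # geometric cooling (assumed schedule)
    return best_labels, best_e
```

Recomputing the full energy at every step keeps the sketch short; a practical version would update only the two clusters affected by a move.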
Srujan, Marepally; Chandrashekhar, Voshavar; Reddy, Rakesh C; Prabhakar, Rairala; Sreedhar, Bojja; Chaudhuri, Arabinda
2011-08-01
Understanding the structural parameters of cationic amphiphiles which can influence gene transfer efficiencies of cationic amphiphiles continues to remain important for designing efficient liposomal gene delivery reagents. Previously we demonstrated the influence of structural orientation of the ester linker (widely used in covalently tethering the polar head and the non-polar tails) in modulating in vitro gene transfer efficiencies of cationic amphiphiles. However, our previously described cationic amphiphiles with ester linkers failed to deliver genes under in vivo conditions. Herein we report on the development of a highly serum compatible cationic amphiphile with circulation stable amide linker which shows remarkable selectivity in transfecting mouse lung. We also demonstrate that reversing structural orientation of the amide linker adversely affects both serum compatibility and the lung selective gene transfer property. Dynamic laser light scattering and atomic force microscopic studies revealed smaller average hydrodynamic sizes of the liposomes of transfection efficient lipid than those for the liposomes of transfection incompetent analog (148 ± 1 nm vs 214 ± 4 nm). Average surface potential of the liposomes of transfection competent amphiphiles were found to be significantly higher than that for the liposomes of transfection incompetent analog (10.7 ± 5.4 mV vs 2.8 ± 1.3 mV, respectively). Findings in fluorescence resonance energy transfer and dye entrapment experiments support lower rigidity and higher biomembrane fusogenicity of the liposomes of the transfection efficient amphiphiles. Importantly, cationic lipoplexes of the novel amide-linker based amphiphile exhibited higher mouse lung selective gene transfer properties than DOTAP, one of the widely used commercially available liposomal lung transfection kits. 
In summary, the present findings demonstrate for the first time that amide linker structural orientation profoundly influences the serum compatibility and lung transfection efficiencies of cationic amphiphiles. Copyright © 2011 Elsevier Ltd. All rights reserved.
SeMPI: a genome-based secondary metabolite prediction and identification web server.
Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan
2017-07-03
The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by modular type I polyketide synthases. In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Cadherin genes and evolutionary novelties in the octopus.
Wang, Z Yan; Ragsdale, Clifton W
2017-09-01
All animals with large brains must have molecular mechanisms to regulate neuronal process outgrowth and prevent neurite self-entanglement. In vertebrates, two major gene families implicated in these mechanisms are the clustered protocadherins and the atypical cadherins. However, the molecular mechanisms utilized in complex invertebrate brains, such as those of the cephalopods, remain largely unknown. Recently, we identified protocadherins and atypical cadherins in the octopus. The octopus protocadherin expansion shares features with the mammalian clustered protocadherins, including enrichment in neural tissues, clustered head-to-tail orientations in the genome, and a large first exon encoding all cadherin domains. Other octopus cadherins, including a newly-identified cadherin with 77 extracellular cadherin domains, are elevated in the suckers, a striking cephalopod novelty. Future study of these octopus genes may yield insights into the general functions of protocadherins in neural wiring and cadherin-related proteins in complex morphogenesis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bao, Yun-Juan; Liang, Zhong; Mayfield, Jeffrey A.; McShan, William M.; Lee, Shaun W.; Ploplis, Victoria A.; Castellino, Francis J.
2016-01-01
Symmetric genomic rearrangements around replication axes in genomes are commonly observed in prokaryotic genomes, including Group A Streptococcus (GAS). However, asymmetric rearrangements are rare. Our previous studies showed that the hypervirulent invasive GAS strain, M23ND, containing an inactivated transcriptional regulator system, covRS, exhibits unique extensive asymmetric rearrangements, which reconstructed a genomic structure distinct from other GAS genomes. In the current investigation, we identified the rearrangement events and examined the genetic consequences and evolutionary implications underlying the rearrangements. By comparison with a close phylogenetic relative, M18-MGAS8232, we propose a molecular model wherein a series of asymmetric rearrangements have occurred in M23ND, involving translocations, inversions and integrations mediated by multiple factors, viz., rRNA-comX (factor for late competence), transposons and phage-encoded gene segments. Assessments of the cumulative gene orientations and GC skews reveal that the asymmetric genomic rearrangements did not affect the general genomic integrity of the organism. However, functional distributions reveal re-clustering of a broad set of CovRS-regulated actively transcribed genes, including virulence factors and metabolic genes, to the same leading strand, with high confidence (p-value ~10−10). The re-clustering of the genes suggests a potential selection advantage for the spatial proximity to the transcription complexes, which may contain the global transcriptional regulator, CovRS, and other RNA polymerases. Their proximities allow for efficient transcription of the genes required for growth, virulence and persistence. A new paradigm of survival strategies of GAS strains is provided through multiple genomic rearrangements, while, at the same time, maintaining genomic integrity. PMID:27329479
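The two genome-wide measures mentioned above, GC skew and cumulative gene orientation, can be computed as in the following sketch. It uses the standard definition of GC skew, (G − C)/(G + C), over non-overlapping windows; the window size, the +1/−1 strand scoring, and the function names are assumptions for illustration and may differ from the authors' pipeline.

```python
def gc_skew(seq, window):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows.
    Windows with no G or C are assigned a skew of 0.0."""
    seq = seq.upper()
    skews = []
    for i in range(0, len(seq) - window + 1, window):
        chunk = seq[i:i + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum of a list of values."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def cumulative_orientation(strands):
    """Running sum of gene orientations along the chromosome,
    scoring +1 for '+' strand genes and -1 for '-' strand genes (assumed scoring)."""
    return cumulative([1 if s == "+" else -1 for s in strands])
```

In a typical bacterial chromosome both cumulative curves change direction near the replication origin and terminus, which is why they are commonly used to check whether rearrangements have disturbed the overall replication-related organization.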
Garbuz, David G; Yushenova, Irina A; Zatsepina, Olga G; Przhiboro, Andrey A; Bettencourt, Brian R; Evgen'ev, Michael B
2011-03-22
Previously, we described the heat shock response in dipteran species belonging to the family Stratiomyidae that develop in thermally and chemically contrasting habitats including highly aggressive ones. Although all species studied exhibit high constitutive levels of Hsp70 accompanied by exceptionally high thermotolerance, we also detected characteristic interspecies differences in heat shock protein (Hsp) expression and survival after severe heat shock. Here, we analyzed genomic libraries from two Stratiomyidae species from thermally and chemically contrasting habitats and determined the structure and organization of their hsp70 clusters. Although the genomes of both species contain similar numbers of hsp70 genes, the spatial distribution of hsp70 copies differs characteristically. In a population of the eurytopic species Stratiomys singularior, which exists in thermally variable and chemically aggressive (hypersaline) conditions, the hsp70 copies form a tight cluster with approximately equal intergenic distances. In contrast, in a population of the stenotopic Oxycera pardalina that dwells in a stable cold spring, we did not find hsp70 copies in tandem orientation. In this species, the distance between individual hsp70 copies in the genome is very large, if they are linked at all. In O. pardalina we detected the hsp68 gene located next to a hsp70 copy in tandem orientation. Although the hsp70 coding sequences of S. singularior are highly homogenized via conversion, the structure and general arrangement of the hsp70 clusters are highly polymorphic, including gross aberrations, various deletions in intergenic regions, and insertion of incomplete Mariner transposons in close vicinity to the 3'-UTRs. The hsp70 gene families in S. singularior and O. pardalina evolved quite differently from one another. We demonstrated clear evidence of homogenizing gene conversion in the S. singularior hsp70 genes, which form tight clusters in this species. 
In the case of the other species, O. pardalina, we found no clear trace of concerted evolution for the dispersed hsp70 genes. Furthermore, in the latter species we detected hsp70 pseudogenes, representing a hallmark of the birth-and-death process.
PMID:21426536
Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.
Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi
2018-01-01
Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genome editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genome engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.
Li, Yaqian; Du, Xilin; Lu, Zhi John; Wu, Daqiang; Zhao, Yilei; Ren, Bin; Huang, Jiaofang; Huang, Xianqing; Xu, Yuhong; Xu, Yuquan
2011-01-01
Background Phenazines are important compounds produced by pseudomonads and other bacteria. Two phz gene clusters, phzA1-G1 and phzA2-G2, were found in the genome of Pseudomonas sp. M18, an effective biocontrol agent that is highly homologous to the opportunistic human pathogen P. aeruginosa PAO1. However, little is known about the correlation between the expression of the two phz gene clusters. Methodology/Principal Findings Chromosomal insertion-inactivated mutants were constructed for each of the two gene clusters, and the correlation between the expression of the two phz gene clusters was investigated in strain M18. Phenazine-1-carboxylic acid (PCA) molecules produced from the phzA2-G2 gene cluster are able to auto-regulate the cluster's own expression and activate the expression of the phzA1-G1 gene cluster in a circulated amplification pattern. However, the post-transcriptional expression of the phzA1-G1 transcript was blocked principally through its 5′-untranslated region (UTR). In contrast, the phzA2-G2 gene cluster was transcribed to a lesser extent and translated efficiently and was negatively regulated by the GacA signal transduction pathway, mainly at a post-transcriptional level. Conclusions/Significance A single molecule, PCA, produced in different quantities by the two phz gene clusters acted as the functional mediator, and the two phz gene clusters developed a specific regulatory mechanism that acts through the 5′-UTR to transfer a single but complex bacterial signaling event in Pseudomonas sp. strain M18. PMID:21559370
Conditions for the Evolution of Gene Clusters in Bacterial Genomes
Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.
2010-01-01
Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
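The Moran process with selection for clustering and rearrangement by inversion, as described in the abstract above, could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the linear (rather than circular) genome, the fitness advantage `s`, the inversion probability `mu`, and all function names are assumptions introduced here.

```python
import random

def is_clustered(genome, pathway):
    """True if all pathway genes occupy one contiguous block (linear genome assumed)."""
    positions = [i for i, gene in enumerate(genome) if gene in pathway]
    return max(positions) - min(positions) + 1 == len(pathway)

def invert(genome, rng):
    """Return a copy of the genome with one randomly chosen segment reversed."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    return genome[:i] + genome[i:j][::-1] + genome[j:]

def moran_step(population, pathway, s, mu, rng):
    """One birth-death event: fitness-weighted birth, uniformly random death."""
    # Hypothetical fitness scheme: clustered genomes get relative fitness 1 + s.
    weights = [1.0 + s if is_clustered(g, pathway) else 1.0 for g in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    # With probability mu the offspring carries a new inversion.
    child = invert(parent, rng) if rng.random() < mu else list(parent)
    population[rng.randrange(len(population))] = child

def simulate(n_genes=12, pathway_size=3, pop_size=50, s=0.1, mu=0.05,
             steps=5000, seed=1):
    """Return the final fraction of genomes in which the pathway is clustered."""
    rng = random.Random(seed)
    pathway = set(range(pathway_size))
    founder = list(range(n_genes))
    rng.shuffle(founder)
    population = [list(founder) for _ in range(pop_size)]
    for _ in range(steps):
        moran_step(population, pathway, s, mu, rng)
    return sum(is_clustered(g, pathway) for g in population) / pop_size
```

Calling `simulate()` with different `pathway_size` or `pop_size` values gives a rough feel for how the number of pathway genes and the population size affect whether clusters appear, although the parameter values here are purely illustrative.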
Functional Organization of hsp70 Cluster in Camel (Camelus dromedarius) and Other Mammals
Garbuz, David G.; Astakhova, Lubov N.; Zatsepina, Olga G.; Arkhipova, Irina R.; Nudler, Eugene; Evgen'ev, Michael B.
2011-01-01
Heat shock protein 70 (Hsp70) is a molecular chaperone providing tolerance to heat and other challenges at the cellular and organismal levels. We sequenced a genomic cluster containing three hsp70 family genes linked with major histocompatibility complex (MHC) class III region from an extremely heat tolerant animal, camel (Camelus dromedarius). Two hsp70 family genes comprising the cluster contain heat shock elements (HSEs), while the third gene lacks HSEs and should not be induced by heat shock. Comparison of the camel hsp70 cluster with the corresponding regions from several mammalian species revealed similar organization of genes forming the cluster. Specifically, the two heat inducible hsp70 genes are arranged in tandem, while the third constitutively expressed hsp70 family member is present in inverted orientation. Comparison of regulatory regions of hsp70 genes from camel and other mammals demonstrates that transcription factor matches with highest significance are located in the highly conserved 250-bp upstream region and correspond to HSEs followed by NF-Y and Sp1 binding sites. The high degree of sequence conservation leaves little room for putative camel-specific regulatory elements. Surprisingly, RT-PCR and 5′/3′-RACE analysis demonstrated that all three hsp70 genes are expressed in camel's muscle and blood cells not only after heat shock, but under normal physiological conditions as well, and may account for tolerance of camel cells to extreme environmental conditions. A high degree of evolutionary conservation observed for the hsp70 cluster always linked with MHC locus in mammals suggests an important role of such organization for coordinated functioning of these vital genes. PMID:22096537
Kettle, Andrew J; Carere, Jason; Batley, Jacqueline; Manners, John M; Kazan, Kemal; Gardiner, Donald M
2016-03-01
A number of cereals produce the benzoxazolinone class of phytoalexins. Fusarium species pathogenic towards these hosts can typically degrade these compounds via an aminophenol intermediate, and the ability to do so is encoded by a group of genes found in the Fusarium Detoxification of Benzoxazolinone (FDB) cluster. A zinc finger transcription factor encoded by one of the FDB cluster genes (FDB3) has been proposed to regulate the expression of other genes in the cluster and hence is potentially involved in benzoxazolinone degradation. Herein we show that Fdb3 is essential for the ability of Fusarium pseudograminearum to efficiently detoxify the predominant wheat benzoxazolinone, 6-methoxy-benzoxazolin-2-one (MBOA), but not benzoxazoline-2-one (BOA). Furthermore, additional genes thought to be part of the FDB gene cluster, based upon transcriptional response to benzoxazolinones, are regulated by Fdb3. However, deletion mutants for these latter genes remain capable of benzoxazolinone degradation, suggesting that they are not essential for this process. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Chromosome Gene Orientation Inversion Networks (GOINs) of Plasmodium Proteome.
Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert
2018-03-02
The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network termed the gene orientation inversion network (GOIN). The GOIN complex network encodes short- and long-range patterns of inversion of the orientation of pairs of genes in the chromosome. We selected Plasmodium falciparum as a case study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of genes with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdős-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and specificity of 70-80% in training and external validation series. All of these results may point to a possible biological relevance of gene orientation inversion not directly dependent on genetic sequence information. This work opens the gate to the use of GOINs as a tool for the study of the structure of chromosomes and the study of protein function in proteome research.
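To make the idea of a gene orientation inversion network more concrete, here is a small sketch that links genes of opposite strand orientation and computes a degree centrality for each node. The abstract does not state the exact linking rule, so the `max_gap` neighbourhood window, the `(gene_id, strand)` input format, and the function names are all assumptions made for illustration only.

```python
def build_goin(genes, max_gap=2):
    """Build an undirected network linking nearby genes with opposite orientation.

    genes: list of (gene_id, strand) tuples in chromosomal order, strand '+' or '-'.
    max_gap: hypothetical window; genes at most this many positions apart may link.
    """
    adjacency = {gene_id: set() for gene_id, _ in genes}
    for i, (gene_a, strand_a) in enumerate(genes):
        for j in range(i + 1, min(i + 1 + max_gap, len(genes))):
            gene_b, strand_b = genes[j]
            if strand_a != strand_b:  # orientation inversion between the pair
                adjacency[gene_a].add(gene_b)
                adjacency[gene_b].add(gene_a)
    return adjacency

def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is linked to."""
    n = len(adjacency)
    if n < 2:
        return {gene: 0.0 for gene in adjacency}
    return {gene: len(neigh) / (n - 1) for gene, neigh in adjacency.items()}

# Toy chromosome with four hypothetical genes.
toy = [("g1", "+"), ("g2", "-"), ("g3", "-"), ("g4", "+")]
network = build_goin(toy, max_gap=2)
centrality = degree_centrality(network)
```

Centralities computed this way are the kind of numerical node parameter that could then be fed to a classifier, as the abstract describes for RIFIN-like proteins.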
Booma, P. M.; Prabhakaran, S.; Dhanalakshmi, R.
2014-01-01
Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor a higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using a proximity measure based on improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality. PMID:25136661
Zhu, Yun J; Fitch, Maureen M M; Moore, Paul H
2006-01-01
Transgenic papaya plants were initially obtained using particle bombardment, a method having poor efficiency in producing intact, single-copy insertion of transgenes. Single-copy gene insertion was improved using Agrobacterium tumefaciens. With progress being made in genome sequencing and gene discovery, there is a need for more efficient methods of transformation in order to study the function of these genes. We describe a protocol for Agrobacterium-mediated transformation using carborundum-wounded papaya embryogenic calli. This method should lead to high-throughput transformation; on average, 10 to 20% of the callus clusters that had been co-cultivated with Agrobacterium produced at least one plant that was positive by polymerase chain reaction (PCR), histochemical staining, or Southern blot hybridization. Plants regenerated from the callus clusters in 9 to 13 months.
Genomic organization of the rat alpha 2u-globulin gene cluster.
McFadyen, D A; Addison, W; Locke, J
1999-05-01
The alpha 2u-globulins are a group of similar proteins, belonging to the lipocalin superfamily of proteins, that are synthesized in a subset of secretory tissues in rats. The many alpha 2u-globulin isoforms are encoded by a multigene family that exhibits extensive homology. Despite a high degree of sequence identity, individual family members show diverse expression patterns involving complex hormonal, tissue-specific, and developmental regulation. Analysis suggests that there are approximately 20 alpha 2u-globulin genes in the rat genome. We have used fluorescence in situ hybridization (FISH) to show that the alpha 2u-globulin genes are clustered at a single site on rat Chromosome (Chr) 5 (5q22-24). Southern blots of rat genomic DNA separated by pulsed field gel electrophoresis indicated that the alpha 2u-globulin genes are contained on two NruI fragments with a total size of 880 kbp. Analysis of three P1 clones containing alpha 2u-globulin genes indicated that the alpha 2u-globulin genes are tandemly arranged in a head-to-tail fashion. The organization of the alpha 2u-globulin genes in the rat as a tandem array of single genes differs from the homologous major urinary protein genes in the mouse, which are organized as tandem arrays of divergently oriented gene pairs. The structure of these gene clusters may have consequences for the proposed function, as a pheromone transporter, of the protein products encoded by these genes.
GEsture: an online hand-drawing tool for gene expression pattern search.
Wang, Chunyan; Xu, Yiqing; Wang, Xuelin; Zhang, Li; Wei, Suyun; Ye, Qiaolin; Zhu, Youxiang; Yin, Hengfu; Nainwal, Manoj; Tanon-Reyes, Luis; Cheng, Feng; Yin, Tongming; Ye, Ning
2018-01-01
Gene expression profiling data provide useful information for the investigation of biological function and process. However, identifying a specific expression pattern from extensive time series gene expression data is not an easy task. Clustering, a popular method, is often used to group genes with similar expression. However, genes with a 'desirable' or 'user-defined' pattern cannot be efficiently detected by clustering methods. To address these limitations, we developed an online tool called GEsture. Users can draw or graph a curve using a mouse instead of inputting abstract parameters of clustering methods. Taking a gene expression curve as input, GEsture searches time series datasets for genes showing similar, opposite, and time-delayed expression patterns. We present three examples that illustrate the capacity of GEsture for gene hunting according to users' requirements. GEsture also provides visualization tools (such as expression pattern figures, heat maps, and correlation networks) to display the search results. The outputs may provide useful information for researchers to understand the targets, function, and biological processes of the involved genes.
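A search for similar, opposite, and time-delayed patterns against a drawn query curve could be sketched as below. The abstract does not say which similarity measure GEsture uses; Pearson correlation, the `threshold` value, the `max_lag` window, and the function names are assumptions chosen only to illustrate the idea.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def classify_pattern(query, profile, threshold=0.9, max_lag=2):
    """Label a gene profile relative to the query curve, or return None."""
    r = pearson(query, profile)
    if r >= threshold:
        return "similar"
    if r <= -threshold:
        return "opposite"
    # Time delay: the profile follows the query shifted by `lag` time points.
    for lag in range(1, max_lag + 1):
        if pearson(query[:-lag], profile[lag:]) >= threshold:
            return "time-delay"
    return None

# Hypothetical query curve and three toy expression profiles.
query = [0, 1, 2, 3, 2, 1, 0, 0]
scaled = [2 * v + 1 for v in query]
mirrored = [-v for v in query]
delayed = [0, 0, 1, 2, 3, 2, 1, 0]
labels = [classify_pattern(query, p) for p in (scaled, mirrored, delayed)]
```

Because correlation ignores scale and offset, the rescaled profile still counts as similar, which matches the intent of searching by curve shape rather than absolute expression level.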
An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus.
Coyle, Christine M; Panaccione, Daniel G
2005-06-01
The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin.
An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus†
Coyle, Christine M.; Panaccione, Daniel G.
2005-01-01
The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009
Zhu, Li-Ping; Yue, Xin-Jing; Han, Kui; Li, Zhi-Feng; Zheng, Lian-Shuai; Yi, Xiu-Nan; Wang, Hai-Long; Zhang, You-Ming; Li, Yue-Zhong
2015-07-22
Exotic genes, especially clustered multiple genes for a complex pathway, are normally integrated into the chromosome for heterologous expression. The influence of insertion sites on heterologous expression, and of allotropic expression of exotic genes on the host, remains mostly unclear. We compared the integration and expression efficiencies of single and multiple exotic genes that were inserted into the Myxococcus xanthus genome by transposition and attB-site-directed recombination. While the site-directed integration gave a rather stable chloramphenicol acetyl transferase (CAT) activity, the transposition produced varied CAT enzyme activities. We attempted to integrate the 56-kb gene cluster for the biosynthesis of the antitumor polyketides epothilones into the M. xanthus genome by site-directed integration but failed, which was determined to be due to the insertion size limitation at the attB site. The transposition technique produced many recombinants with varied epothilone production capabilities, which, however, did not parallel the transcriptional characteristics of the local sites where the genes were integrated. Comparative transcriptomics analysis demonstrated that the allopatric integrations caused selective changes of host transcriptomes, leading to varied expression of epothilone genes in different mutants. As the size of the inserted fragment increases, transposition becomes a more practicable integration method for the expression of exotic genes. Allopatric integrations selectively change host transcriptomes, which leads to varied expression efficiencies of exotic genes.
Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang
2016-09-01
With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular multiple meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. Bioconductor was used for preliminary data preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes remained for further analysis. The classification efficiency analysis showed that DE genes identified by the sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluating the classification efficiency of DE genes identified by different meta-analysis methods provided a new perspective on the choice of a suitable method for a given application. Different meta-analysis methods show different abilities, so careful overall consideration is needed when choosing meta-analysis methods for particular research. © 2015 John Wiley & Sons Ltd.
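The three measures used above to compare classification efficiency follow directly from the counts of true and false positives and negatives. The sketch below shows the standard definitions; the label strings and the toy sample assignments are hypothetical and are not data from the study.

```python
def classification_metrics(y_true, y_pred, positive="ADC"):
    """Return (accuracy, sensitivity, specificity) for a two-class labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity

# Toy example: true sample classes versus cluster-derived assignments.
truth = ["ADC", "ADC", "ADC", "ADC", "normal", "normal"]
assigned = ["ADC", "ADC", "ADC", "normal", "normal", "ADC"]
accuracy, sensitivity, specificity = classification_metrics(truth, assigned)
```

In a setting like the one described, `assigned` would come from cutting the hierarchical clustering tree into two groups and mapping each group to its majority class, which is an additional assumption not spelled out in the abstract.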
McGary, Kriston L; Slot, Jason C; Rokas, Antonis
2013-07-09
Genomic analyses have proliferated without being tied to tangible phenotypes. For example, although coordination of both gene expression and genetic linkage have been offered as genetic mechanisms for the frequently observed clustering of genes participating in fungal metabolic pathways, elucidation of the phenotype(s) favored by selection, resulting in cluster formation and maintenance, has not been forthcoming. We noted that the cause of certain well-studied human metabolic disorders is the accumulation of toxic intermediate compounds (ICs), which occurs when the product of an enzyme is not used as a substrate by a downstream neighbor in the metabolic network. This raises the hypothesis that the phenotype favored by selection to drive gene clustering is the mitigation of IC toxicity. To test this, we examined 100 diverse fungal genomes for the simplest type of cluster, gene pairs that are both metabolic neighbors and chromosomal neighbors immediately adjacent to each other, which we refer to as "double neighbor gene pairs" (DNGPs). Examination of the toxicity of their corresponding ICs shows that, compared with chromosomally nonadjacent metabolic neighbors, DNGPs are enriched for ICs that have acutely toxic LD50 doses or reactive functional groups. Furthermore, DNGPs are significantly more likely to be divergently oriented on the chromosome; remarkably, ∼40% of these DNGPs have ICs known to be toxic. We submit that the structure of synteny in metabolic pathways of fungi is a signature of selection for protection against the accumulation of toxic metabolic intermediates.
McGary, Kriston L.; Slot, Jason C.; Rokas, Antonis
2013-01-01
Genomic analyses have proliferated without being tied to tangible phenotypes. For example, although coordination of both gene expression and genetic linkage have been offered as genetic mechanisms for the frequently observed clustering of genes participating in fungal metabolic pathways, elucidation of the phenotype(s) favored by selection, resulting in cluster formation and maintenance, has not been forthcoming. We noted that the cause of certain well-studied human metabolic disorders is the accumulation of toxic intermediate compounds (ICs), which occurs when the product of an enzyme is not used as a substrate by a downstream neighbor in the metabolic network. This raises the hypothesis that the phenotype favored by selection to drive gene clustering is the mitigation of IC toxicity. To test this, we examined 100 diverse fungal genomes for the simplest type of cluster, gene pairs that are both metabolic neighbors and chromosomal neighbors immediately adjacent to each other, which we refer to as “double neighbor gene pairs” (DNGPs). Examination of the toxicity of their corresponding ICs shows that, compared with chromosomally nonadjacent metabolic neighbors, DNGPs are enriched for ICs that have acutely toxic LD50 doses or reactive functional groups. Furthermore, DNGPs are significantly more likely to be divergently oriented on the chromosome; remarkably, ∼40% of these DNGPs have ICs known to be toxic. We submit that the structure of synteny in metabolic pathways of fungi is a signature of selection for protection against the accumulation of toxic metabolic intermediates. PMID:23798424
Boldogköi, Zsolt
2012-01-01
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too. 
PMID:22783276
Dimitrakopoulou, Konstantina; Vrahatis, Aristidis G; Wilk, Esther; Tsakalidis, Athanasios K; Bezerianos, Anastasios
2013-09-01
The increasing flow of short time series microarray experiments for the study of dynamic cellular processes poses the need for efficient clustering tools. These tools must deal with three primary issues: first, to consider the multi-functionality of genes; second, to evaluate the similarity of the relative change of amplitude in the time domain rather than the absolute values; third, to cope with the constraints of conventional clustering algorithms such as the assignment of the appropriate cluster number. To address these, we propose OLYMPUS, a novel unsupervised clustering algorithm that integrates Differential Evolution (DE) method into Fuzzy Short Time Series (FSTS) algorithm with the scope to utilize efficiently the information of population of the first and enhance the performance of the latter. Our hybrid approach provides sets of genes that enable the deciphering of distinct phases in dynamic cellular processes. We proved the efficiency of OLYMPUS on synthetic as well as on experimental data. The discriminative power of OLYMPUS provided clusters, which refined the so far perspective of the dynamics of host response mechanisms to Influenza A (H1N1). Our kinetic model sets a timeline for several pathways and cell populations, implicated to participate in host response; yet no timeline was assigned to them (e.g. cell cycle, homeostasis). Regarding the activity of B cells, our approach revealed that some antibody-related mechanisms remain activated until day 60 post infection. The Matlab codes for implementing OLYMPUS, as well as example datasets, are freely accessible via the Web (http://biosignal.med.upatras.gr/wordpress/biosignal/). Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
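The second requirement listed above, comparing the relative change of amplitude over time rather than absolute values, can be illustrated with a slope-based distance between two short profiles. The following is a minimal Python sketch and not the OLYMPUS Matlab implementation: the function name `slope_distance` is hypothetical, and the sketch assumes both profiles are sampled at the same, strictly increasing time points.

```python
from math import sqrt


def slope_distance(x, y, times):
    """Distance between two short time series based on their piecewise slopes.

    Hypothetical helper for illustration: two profiles that differ only by a
    constant offset have distance 0, because only the changes between
    consecutive time points are compared.
    """
    if not (len(x) == len(y) == len(times)) or len(times) < 2:
        raise ValueError("x, y and times need the same length (at least 2)")
    total = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        slope_x = (x[k + 1] - x[k]) / dt
        slope_y = (y[k + 1] - y[k]) / dt
        total += (slope_x - slope_y) ** 2
    return sqrt(total)
```

With such a measure, two genes whose expression rises at the same rate are treated as similar even when their absolute expression levels differ.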
Genomic analysis of a new mammalian distal-less gene: Dlx7.
Nakamura, S; Stock, D W; Wydner, K L; Bollekens, J A; Takeshita, K; Nagai, B M; Chiba, S; Kitamura, T; Freeland, T M; Zhao, Z; Minowada, J; Lawrence, J B; Weiss, K M; Ruddle, F H
1996-12-15
We have cloned a new Dlx gene (Dlx7) from human and mouse that may represent the mammalian orthologue of the newt gene NvHBox-5. The homeodomains of these genes are highly similar to all other vertebrate Dlx genes, and regions of similarity also exist between mammalian Dlx7 and a subset of vertebrate Dlx genes downstream of the homeodomain. The sequence divergence between human and mouse Dlx7 in these regions is greater than that predicted from comparisons of other vertebrate Dlx genes, however, and there is little sequence similarity upstream of the homeodomain both between these two genes and with other Dlx genes. We present evidence for alternative splicing of mouse Dlx7 upstream of the homeodomain that may account for some of this divergence. We have mapped human DLX7 distal to the 5' end of the HOXB cluster at an estimated distance of between 1 and 2 Mb by FISH. Both the human and the mouse Dlx7 are shown to be closely linked to Dlx3 in a convergently transcribed orientation. These mapping results support the possibility that vertebrate distal-less genes have been duplicated in concert with the Hox clusters.
The Productivity Analysis of Chennai Automotive Industry Cluster
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2014-07-01
Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to the US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in the Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is Data Envelopment Analysis with the Output Oriented Banker Charnes Cooper model, taking net worth, fixed assets and employment as inputs and gross output as the output. The non-zero peer weights represent the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets and employment and the shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease their fixed assets or employment. Moreover, for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.
Menges, R; Muth, G; Wohlleben, W; Stegmann, E
2007-11-01
All known gene clusters for glycopeptide antibiotic biosynthesis contain a conserved gene supposed to encode an ABC-transporter. In the balhimycin-producer Amycolatopsis balhimycina this gene (tba) is localised between the prephenate dehydrogenase gene pdh and the peptide synthetase gene bpsA. Inactivation of tba in A. balhimycina by gene replacement did not interfere with growth and did not affect balhimycin resistance. However, in the supernatant of the tba mutant RM43 less balhimycin was accumulated compared to the wild type; and the intra-cellular balhimycin concentration was ten times higher in the tba mutant RM43 than in the wild type. These data suggest that the ABC transporter encoded in the balhimycin biosynthesis gene cluster is not involved in resistance but is required for the efficient export of the antibiotic. To elucidate the activity of Tba it was heterologously expressed in Escherichia coli with an N-terminal His-tag and purified by nickel chromatography. A photometric assay revealed that His(6)-Tba solubilised in dodecylmaltoside possesses ATPase activity, characteristic for ABC-transporters.
Novel genomic island modifies DNA with 7-deazaguanine derivatives
Thiaville, Jennifer J.; Kellner, Stefanie M.; Yuan, Yifeng; Hutinet, Geoffrey; Thiaville, Patrick C.; Jumpathong, Watthanachai; Mohapatra, Susovan; Brochier-Armanet, Celine; Letarov, Andrey V.; Hillebrand, Roman; Malik, Chanchal K.; Rizzo, Carmelo J.; Dedon, Peter C.; de Crécy-Lagard, Valérie
2016-01-01
The discovery of ∼20-kb gene clusters containing a family of paralogs of tRNA guanosine transglycosylase genes, called tgtA5, alongside 7-cyano-7-deazaguanine (preQ0) synthesis and DNA metabolism genes, led to the hypothesis that 7-deazaguanine derivatives are inserted in DNA. This was established by detecting 2’-deoxy-preQ0 and 2’-deoxy-7-amido-7-deazaguanosine in enzymatic hydrolysates of DNA extracted from the pathogenic, Gram-negative bacteria Salmonella enterica serovar Montevideo. These modifications were absent in the closely related S. enterica serovar Typhimurium LT2 and from a mutant of S. Montevideo, each lacking the gene cluster. This led us to rename the genes of the S. Montevideo cluster as dpdA-K for 7-deazapurine in DNA. Similar gene clusters were analyzed in ∼150 phylogenetically diverse bacteria, and the modifications were detected in DNA from other organisms containing these clusters, including Kineococcus radiotolerans, Comamonas testosteroni, and Sphingopyxis alaskensis. Comparative genomic analysis shows that, in Enterobacteriaceae, the cluster is a genomic island integrated at the leuX locus, and the phylogenetic analysis of the TgtA5 family is consistent with widespread horizontal gene transfer. Comparison of transformation efficiencies of modified or unmodified plasmids into isogenic S. Montevideo strains containing or lacking the cluster strongly suggests a restriction–modification role for the cluster in Enterobacteriaceae. Another preQ0 derivative, 2’-deoxy-7-formamidino-7-deazaguanosine, was found in the Escherichia coli bacteriophage 9g, as predicted from the presence of homologs of genes involved in the synthesis of the archaeosine tRNA modification. These results illustrate a deep and unexpected evolutionary connection between DNA and tRNA metabolism. PMID:26929322
Clustered Integrin Ligands as a Novel Approach for the Targeting of Non-Viral Vectors
NASA Astrophysics Data System (ADS)
Ng, Quinn Kwan Tai
Gene transfer or gene delivery is described as the process in which foreign DNA is introduced into cells. Over the years, gene delivery has gained the attention of many researchers and has been developed into a powerful tool for use in biotechnology and medicine. With the completion of the Human Genome Project, such advances in technology allowed for the identification of diseases ranging from hereditary disorders to acquired ones (cancer) which were thought to be incurable. Gene therapy provides the means necessary to treat or eliminate genetic diseases at their origin, unlike traditional medicine, which only treats symptoms. With ongoing clinical trials for gene therapy increasing, the greatest difficulty still lies in developing safe systems which can target cells of interest to provide efficient delivery. Nature, over millions of years of evolution, has provided an example of one of the most efficient delivery systems: viruses. Although the use of viruses for gene delivery has been well studied, safety issues involving immunogenicity and insertional mutagenesis, together with high cost and poor reproducibility, have posed problems for their clinical application. From understanding viruses, we gain insight into designing new systems for non-viral gene delivery. One of the techniques utilized by adenoviruses is the clustering of ligands on their surface through the use of a protein called a penton base. Through the use of nanotechnology we can mimic this basic concept in non-viral gene delivery systems. This dissertation research is focused on developing and applying a novel system for displaying the integrin binding ligand (RGD) in a constrained manner to form a clustered integrin ligand binding platform to be used to enhance the targeting and efficiency of non-viral gene delivery vectors. Peptide mixed monolayer protected gold nanoparticles provide a suitable surface for ligand clustering.
The relationship between the peptide ratios in the reaction solution used to form these ligand clusters and the reacted amounts on the surface of the particle was studied. This provided us with the ability to control the size of the clusters formed and the spacing between the integrins for gold nanoparticles of various sizes. We then applied the clustered ligand binding system to the targeting of DNA/PEI polyplexes and demonstrated that the use of RGD nanoclusters enhances gene transfer up to 35-fold, an effect that was dependent on the density of αvβ3 integrins on the cell surface. Cell integrin sensitivity was shown, in that cells with higher αvβ3 densities exhibited higher luciferase transgene expression. The targeting of RGD nanoclusters for DNA/PEI polyplexes was further shown in vivo using PET/CT technology, which displayed improved targeting towards tumors with high-level αvβ3 integrin expression (U87MG) over those with medium-level αvβ3 integrin expression (HeLa). Beyond the clustered integrin binding system, the non-viral vectors currently in use suffer from stability and toxicity issues in vitro and in vivo. We have applied a new chemistry for synthesizing nanogels, utilizing a Traut's reagent initiated Michael addition reaction for the modification of diamine-containing crosslinkers, which will allow for the development of stable, cell-demanded release of oligonucleotides. We have shown that the bulk gels made were capable of encapsulating and holding DNA within the gel, and we were able to synthesize them into nanogels. The combined research shown here, using clustered integrin ligands and a new type of nanogel synthesis, provides an ideal system for gene delivery in the future.
Wiese, A; Syldatk, C; Mattes, R; Altenbuchner, J
2001-09-01
Arthrobacter aurescens DSM 3747 hydrolyzes stereospecifically 5'-monosubstituted hydantoins to alpha-amino acids. The genes involved in hydantoin utilization (hyu) were isolated on an 8.7-kb DNA fragment, and by DNA sequence analysis eight ORFs were identified. The hyu gene cluster includes four genes: hyuP encoding a putative transport protein, the hydantoin racemase gene hyuA, the hydantoinase gene hyuH, and the carbamoylase gene hyuC. The four genes are transcribed in the same direction. Upstream of hyuP and in opposite orientation to the hyu genes, three ORFs were found showing similarities to cytochrome P450 monooxygenase (ORF1, incomplete), to membrane proteins (ORF2), and to ferredoxin (ORF3). ORF8 was found downstream of hyuC and again in opposite orientation to the hyu genes. The gene product of ORF8 displayed similarities to the LacI/GalR family of transcriptional regulators. Reverse transcriptase PCR experiments and Northern blot analysis revealed that the genes hyuPAHC are coexpressed in A. aurescens after induction with 3-N-CH3-IMH. The expression of the hyu operon was not regulated by the putative regulator ORF8 as shown by gene disruption and mobility-shift experiments.
Nonlinear dimensionality reduction of data lying on the multicluster manifold.
Meng, Deyu; Leung, Yee; Fung, Tung; Xu, Zongben
2008-08-01
A new method, which is called decomposition-composition (D-C) method, is proposed for the nonlinear dimensionality reduction (NLDR) of data lying on the multicluster manifold. The main idea is first to decompose a given data set into clusters and independently calculate the low-dimensional embeddings of each cluster by the decomposition procedure. Based on the intercluster connections, the embeddings of all clusters are then composed into their proper positions and orientations by the composition procedure. Different from other NLDR methods for multicluster data, which consider associatively the intracluster and intercluster information, the D-C method capitalizes on the separate employment of the intracluster neighborhood structures and the intercluster topologies for effective dimensionality reduction. This, on one hand, isometrically preserves the rigid-body shapes of the clusters in the embedding process and, on the other hand, guarantees the proper locations and orientations of all clusters. The theoretical arguments are supported by a series of experiments performed on the synthetic and real-life data sets. In addition, the computational complexity of the proposed method is analyzed, and its efficiency is theoretically analyzed and experimentally demonstrated. Related strategies for automatic parameter selection are also examined.
Chen, Lin-Yuan; Tang, Ping-Han; Wu, Ten-Ming
2016-07-14
In terms of the local bond-orientational order (LBOO) parameters, a cluster approach to analyze local structures of simple liquids was developed. In this approach, a cluster is defined as a combination of neighboring seeds having at least nb local-orientational bonds and their nearest neighbors, and a cluster ensemble is a collection of clusters with a specified nb and number of seeds ns. This cluster analysis was applied to investigate the microscopic structures of liquid Ga at ambient pressure (AP). The liquid structures studied were generated through ab initio molecular dynamics simulations. By scrutinizing the static structure factors (SSFs) of cluster ensembles with different combinations of nb and ns, we found that liquid Ga at AP contained two types of cluster structures, one characterized by sixfold orientational symmetry and the other showing fourfold orientational symmetry. The SSFs of cluster structures with sixfold orientational symmetry were akin to the SSF of a hard-sphere fluid. On the contrary, the SSFs of cluster structures showing fourfold orientational symmetry behaved similarly as the anomalous SSF of liquid Ga at AP, which is well known for exhibiting a high-q shoulder. The local structures of a highly LBOO cluster whose SSF displayed a high-q shoulder were found to be more similar to the structure of β-Ga than those of other solid phases of Ga. More generally, the cluster structures showing fourfold orientational symmetry have an inclination to resemble more to β-Ga.
A cross-species bi-clustering approach to identifying conserved co-regulated genes.
Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo
2016-06-15
A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective in dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis.
This approach was first validated on synthetic data and compared to the two-step method and several recent joint clustering methods. We then applied this approach to two real world datasets of gene expression during the pre-implantation embryonic development of the human and mouse. Co-regulated genes consistent between the human and mouse were identified, offering insights into conserved functions, as well as similarities and differences in genome activation timing between the human and mouse embryos. The R package containing the implementation of the proposed method in C++ is available at https://github.com/JavonSun/mvbc.git and also at the R platform https://www.r-project.org/. Contact: jinbo@engr.uconn.edu. © The Author 2016. Published by Oxford University Press.
paraGSEA: a scalable approach for large-scale gene expression profiling
Peng, Shaoliang; Yang, Shunyun
2017-01-01
More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the significance-level estimation step and the multiple hypothesis testing step, its computational scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes a more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters in an efficient manner with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000 node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
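To make the complexity argument concrete, the running-sum statistic at the core of GSEA can be computed in a single pass over a ranked profile once gene-set membership is stored in a hash set. The sketch below is a simplified, unweighted (Kolmogorov-Smirnov style) enrichment score in Python; it is an illustration under stated assumptions and not the paraGSEA code. The function name `enrichment_score` is hypothetical, and gene identifiers in the ranked list are assumed to be unique.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    Building the membership set costs O(m) and the scan costs O(n),
    instead of searching the gene set again for every ranked gene.
    """
    members = set(gene_set) & set(ranked_genes)
    n = len(ranked_genes)
    hits = len(members)
    if hits == 0 or hits == n:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    hit_step = 1.0 / hits          # increment when a set member is met
    miss_step = 1.0 / (n - hits)   # decrement for every other gene
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running         # keep the largest deviation from zero
    return best
```

A set concentrated at the top of the ranking yields a score near 1, and a set concentrated at the bottom yields a score near -1.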
Nanospectroscopy of thiacyanine dye molecules adsorbed on silver nanoparticle clusters
NASA Astrophysics Data System (ADS)
Ralević, Uroš; Isić, Goran; Anicijević, Dragana Vasić; Laban, Bojana; Bogdanović, Una; Lazović, Vladimir M.; Vodnik, Vesna; Gajić, Radoš
2018-03-01
The adsorption of thiacyanine dye molecules on citrate-stabilized silver nanoparticle clusters drop-cast onto freshly cleaved mica or highly oriented pyrolytic graphite surfaces is examined using colocalized surface-enhanced Raman spectroscopy and atomic force microscopy. The incidence of dye Raman signatures in photoluminescence hotspots identified around nanoparticle clusters is considered for both citrate- and borate-capped silver nanoparticles and found to be substantially lower in the former case, suggesting that the citrate anions impede the efficient dye adsorption. Rigorous numerical simulations of light scattering on random nanoparticle clusters are used for estimating the electromagnetic enhancement and elucidating the hotspot formation mechanism. The majority of the enhanced Raman signal, estimated to be more than 90%, is found to originate from the nanogaps between adjacent nanoparticles in the cluster, regardless of the cluster size and geometry.
Lateralized activation of Cluster N in the brains of migratory songbirds
Liedvogel, Miriam; Feenders, Gesa; Wada, Kazuhiro; Troje, Nikolaus F.; Jarvis, Erich D.; Mouritsen, Henrik
2008-01-01
Cluster N is a cluster of forebrain regions found in night-migratory songbirds that shows high activation of activity-dependent gene expression during night-time vision. We have suggested that Cluster N may function as a specialized night-vision area in night-migratory birds and that it may be involved in processing light-mediated magnetic compass information. Here, we investigated these ideas. We found a significant lateralized dominance of Cluster N activation in the right hemisphere of European robins (Erithacus rubecula). Activation predominantly originated from the contralateral (left) eye. Garden warblers (Sylvia borin) tested under different magnetic field conditions and under monochromatic red light did not show significant differences in Cluster N activation. In the fairly sedentary Sardinian warbler (Sylvia melanocephala), which belongs to the same phylogenetic clade, Cluster N showed prominent activation levels, similar to those observed in garden warblers and European robins. Thus, it seems that Cluster N activation occurs at night in all species within predominantly migratory groups of birds, probably because such birds have the capability of switching between migratory and sedentary life styles. The activation studies suggest that although Cluster N is lateralized, as is the dependence on magnetic compass orientation, either Cluster N is not involved in magnetic processing or the magnetic modulations of the primary visual signal, forming the basis for the currently supported light-dependent magnetic compass mechanism, are relatively small such that activity-dependent gene expression changes are not sensitive enough to pick them up. PMID:17331212
The Productivity and Technical Efficiency of Textile Industry Clusters in India
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2013-09-01
The Indian textile industry is one of the largest and oldest sectors in the country and among the most important in the economy in terms of output, investment and employment (E). The sector employs nearly 35 million people and, after agriculture, is the second-highest employer in the country. Its importance is underlined by the fact that it accounts for around 4 % of Gross Domestic Product, 14 % of industrial production, 9 % of excise collections, 18 % of E in the industrial sector, and 16 % of the country's total export (Ex) earnings. For inclusive growth and sustainable development, most of the textile manufacturers have adopted the Cluster Development Approach. The objective is to study the physical and financial performance, correlation, regression and Data Envelopment Analysis by measuring technical efficiency (Ø), peer weights (λi), input slacks (S-), output slacks (S+) and return to scale of four textile clusters (TCs), namely IchalKaranji Textile Cluster, Maharashtra; Ludhiana Textile Cluster, Punjab; Tirupur Textile Cluster, Tamilnadu and Panipat Textile Cluster, Haryana in India. The methodology adopted is Data Envelopment Analysis with the Output Oriented Banker Charnes Cooper Model, taking the number of units (U) and the number of E as inputs and sales (S) and Ex in crores as outputs. The non-zero λi's represent the weights for efficient clusters. The S > 0 obtained for one TC reveals the excess U (S-) and E (S-) and the shortage in sales (S+) and Ex (S+). To conclude, for inclusive growth and sustainable development, the inefficient TC should increase its S/turnover and Ex, as a decrease in the number of enterprises and E is practically not possible. Moreover, for sustainable development, the TC should strengthen infrastructure interrelationships, technology interrelationships, procurement interrelationships, production interrelationships and marketing interrelationships to decrease cost and increase productivity and efficiency to compete in the world market.
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A.; Marks, Jonathan A.; Haiser, Henry J.; Turnbaugh, Peter J.
2015-01-01
ABSTRACT Elucidation of the molecular mechanisms underlying the human gut microbiota’s effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. PMID:25873372
Bhattacharya, Anindya; De, Rajat K
2010-08-01
Distance based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail in certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than those produced by some other methods. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. Like DCCA, ACCA uses the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some other methods in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using the C and Visual Basic languages, and can be executed on Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed.
Two word files (included in the zip file) need to be consulted before installation and execution of the software. Copyright 2010 Elsevier Inc. All rights reserved.
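The criterion stated in this abstract, that every gene should sit in the cluster with which it has the highest average correlation, can be sketched as a simple reassignment loop. The Python code below is a simplified illustration of that criterion only and is not the published ACCA procedure or its C/Visual Basic software. The function names, the in-place update order, and the convention that an empty cluster scores 0.0 are assumptions made for this sketch; expression profiles are assumed to be non-constant so that the Pearson correlation is defined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_correlation(gene, label, labels, profiles):
    """Mean correlation of `gene` with the other genes labelled `label`."""
    others = [g for g in profiles if g != gene and labels[g] == label]
    if not others:
        return 0.0  # assumption: an empty cluster is scored as neutral
    total = sum(pearson(profiles[gene], profiles[g]) for g in others)
    return total / len(others)


def reassign(profiles, labels, max_passes=20):
    """Move genes until each has its highest average correlation at home."""
    labels = dict(labels)  # work on a copy of the initial assignment
    cluster_ids = sorted(set(labels.values()))
    for _ in range(max_passes):
        changed = False
        for gene in profiles:
            best = labels[gene]
            best_score = average_correlation(gene, best, labels, profiles)
            for label in cluster_ids:
                score = average_correlation(gene, label, labels, profiles)
                if score > best_score:  # strict: ties keep the current cluster
                    best, best_score = label, score
            if best != labels[gene]:
                labels[gene] = best
                changed = True
        if not changed:
            break
    return labels
```

Because the score is a correlation rather than a distance, two genes with the same pattern of variation but different expression magnitudes end up in the same cluster.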
Ding, Jiarui; Shah, Sohrab; Condon, Anne
2016-01-01
Motivation: Many biological data processing problems can be formalized as clustering problems to partition data points into sensible and biologically interpretable groups. Results: This article introduces densityCut, a novel density-based clustering algorithm, which is both time- and space-efficient and proceeds as follows: densityCut first roughly estimates the densities of data points from a K-nearest neighbour graph and then refines the densities via a random walk. A cluster consists of points falling into the basin of attraction of an estimated mode of the underlying density function. A post-processing step merges clusters and generates a hierarchical cluster tree. The number of clusters is selected from the most stable clustering in the hierarchical cluster tree. Experimental results on ten synthetic benchmark datasets and two microarray gene expression datasets demonstrate that densityCut performs better than state-of-the-art algorithms for clustering biological datasets. For applications, we focus on the recent cancer mutation clustering and single cell data analyses, namely to cluster variant allele frequencies of somatic mutations to reveal clonal architectures of individual tumours, to cluster single-cell gene expression data to uncover cell population compositions, and to cluster single-cell mass cytometry data to detect communities of cells of the same functional states or types. densityCut performs better than competing algorithms and is scalable to large datasets. Availability and Implementation: Data and the densityCut R package are available from https://bitbucket.org/jerry00/densitycut_dev. Contact: condon@cs.ubc.ca or sshah@bccrc.ca or jiaruid@cs.ubc.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153661
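The first and third steps described in this abstract (a rough K-nearest-neighbour density estimate, then assigning each point to the basin of attraction of a density mode) can be sketched for one-dimensional data. The Python code below is a reduced illustration and not the densityCut R package: it omits the random-walk refinement, the cluster merging and the stability-based selection of the number of clusters. The function name `knn_mode_clusters` is hypothetical, and the points are assumed to be distinct so that every neighbour radius is positive.

```python
def knn_mode_clusters(points, k):
    """Label 1-D points by the density mode they climb to.

    Returns, for each point, the index of the mode (a local density
    maximum) reached by repeatedly stepping to the densest neighbour.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    neighbours = []
    density = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: abs(points[j] - points[i]))
        knn = order[:k]
        neighbours.append(knn)
        radius = abs(points[knn[-1]] - points[i])  # distance to k-th neighbour
        density.append(k / (n * 2.0 * radius))     # 1-D kNN density estimate
    parent = []
    for i in range(n):
        densest = max(neighbours[i], key=lambda j: density[j])
        # Link upward only on a strict density increase, so no cycles arise.
        parent.append(densest if density[densest] > density[i] else i)

    def mode_of(i):
        while parent[i] != i:
            i = parent[i]
        return i

    return [mode_of(i) for i in range(n)]
```

Points that reach the same mode share a label, so well-separated groups form separate clusters without the number of clusters being specified in advance.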
A Multiple Sphere T-Matrix Fortran Code for Use on Parallel Computer Clusters
NASA Technical Reports Server (NTRS)
Mackowski, D. W.; Mishchenko, M. I.
2011-01-01
A general-purpose Fortran-90 code for calculation of the electromagnetic scattering and absorption properties of multiple sphere clusters is described. The code can calculate the efficiency factors and scattering matrix elements of the cluster for either fixed or random orientation with respect to the incident beam and for plane wave or localized-approximation Gaussian incident fields. In addition, the code can calculate maps of the electric field both interior and exterior to the spheres. The code is written with message passing interface instructions to enable the use on distributed memory compute clusters, and for such platforms the code can make feasible the calculation of absorption, scattering, and general EM characteristics of systems containing several thousand spheres.
A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli.
Li, Mingji; Wang, Junshu; Geng, Yanping; Li, Yikui; Wang, Qian; Liang, Quanfeng; Qi, Qingsheng
2012-02-06
For metabolic engineering, many rate-limiting steps may exist in the pathways of accumulating the target metabolites. Increasing the copy number of the desired genes in these pathways is a general method to solve the problem, for example, by employing a multi-copy plasmid-based expression system. However, this method may bring genetic instability, structural instability and metabolic burden to the host, while integrating the desired gene into the chromosome may cause inadequate transcription or expression. In this study, we developed a strategy for obtaining gene overexpression by engineering promoter clusters consisting of multiple core-tac-promoters (MCPtacs) in tandem. Through a uniquely designed in vitro assembling process, a series of promoter clusters were constructed. The transcription strength of these promoter clusters showed a stepwise enhancement with the increase of tandem repeats number until it reached the critical value of five. Application of the MCPtacs promoter clusters in polyhydroxybutyrate (PHB) production proved that it was efficient. Integration of the phaCAB genes with the 5CPtacs promoter cluster resulted in an engineered E. coli that can accumulate 23.7% PHB of the cell dry weight in batch cultivation. The transcription strength of the MCPtacs promoter cluster can be greatly improved by increasing the tandem repeats number of the core-tac-promoter. By integrating the desired gene together with the MCPtacs promoter cluster into the chromosome of E. coli, we can achieve high and stable overexpression with only a small size. This strategy has an application potential in many fields and can be extended to other bacteria.
Singh, Dadabhai T; Trehan, Rahul; Schmidt, Bertil; Bretschneider, Timo
2008-01-01
Preparedness for a possible global pandemic caused by viruses such as the highly pathogenic influenza A subtype H5N1 has become a global priority. In particular, it is critical to monitor the appearance of any new emerging subtypes. Comparative phyloinformatics can be used to monitor, analyze, and possibly predict the evolution of viruses. However, in order to utilize the full functionality of available analysis packages for large-scale phyloinformatics studies, a team of computer scientists, biostatisticians and virologists is needed--a requirement which cannot be fulfilled in many cases. Furthermore, the time complexities of many of the algorithms involved lead to prohibitive runtimes on sequential computer platforms. This has so far hindered the use of comparative phyloinformatics as a commonly applied tool in this area. In this paper the graphical-oriented workflow design system called Quascade and its efficient usage for comparative phyloinformatics are presented. In particular, we focus on how this task can be effectively performed in a distributed computing environment. As a proof of concept, the designed workflows are used for the phylogenetic analysis of neuraminidase of H5N1 isolates (micro level) and influenza viruses (macro level). The results of this paper are hence twofold. Firstly, this paper demonstrates the usefulness of a graphical user interface system to design and execute complex distributed workflows for large-scale phyloinformatics studies of virus genes. Secondly, the analysis of neuraminidase on different levels of complexity provides valuable insights into this virus's tendency toward geography-based clustering in the phylogenetic tree and also shows the importance of glycan sites in its molecular evolution. The current study demonstrates the efficiency and utility of workflow systems providing a biologist-friendly approach to complex biological dataset analysis using high performance computing. 
In particular, the utility of the platform Quascade for deploying distributed and parallelized versions of a variety of computationally intensive phylogenetic algorithms has been shown. Secondly, the analysis of the utilized H5N1 neuraminidase datasets at macro and micro levels has clearly indicated a pattern of spatial clustering of the H5N1 viral isolates based on geographical distribution rather than temporal or host range based clustering.
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A; Marks, Jonathan A; Haiser, Henry J; Turnbaugh, Peter J; Balskus, Emily P
2015-04-14
Elucidation of the molecular mechanisms underlying the human gut microbiota's effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. Anaerobic choline utilization is a bacterial metabolic activity that occurs in the human gut and is linked to multiple diseases. 
While bacterial genes responsible for choline fermentation (the cut gene cluster) have been recently identified, there has been no characterization of these genes in human gut isolates and microbial communities. In this work, we use multiple approaches to demonstrate that the pathway encoded by the cut genes is present and functional in a diverse range of human gut bacteria and is also widespread in stool metagenomes. We also developed a PCR-based strategy to detect a key functional gene (cutC) involved in this pathway and applied it to characterize newly isolated choline-utilizing strains. Both our analyses of the cut gene cluster and this molecular tool will aid efforts to further understand the role of choline metabolism in the human gut microbiota and its link to disease. Copyright © 2015 Martínez-del Campo et al.
Chapter 7. Cloning and analysis of natural product pathways.
Gust, Bertolt
2009-01-01
The identification of gene clusters of natural products has led to an enormous wealth of information about their biosynthesis and its regulation, and about self-resistance mechanisms. Well-established routine techniques are now available for the cloning and sequencing of gene clusters. The subsequent functional analysis of the complex biosynthetic machinery requires efficient genetic tools for manipulation. Until recently, techniques for the introduction of defined changes into Streptomyces chromosomes were very time-consuming. In particular, manipulation of large DNA fragments has been challenging due to the absence of suitable restriction sites for restriction- and ligation-based techniques. The homologous recombination approach called recombineering (referred to as Red/ET-mediated recombination in this chapter) has greatly facilitated targeted genetic modifications of complex biosynthetic pathways from actinomycetes by eliminating many of the time-consuming and labor-intensive steps. This chapter describes techniques for the cloning and identification of biosynthetic gene clusters, for the generation of gene replacements within such clusters, for the construction of integrative library clones and their expression in heterologous hosts, and for the assembly of entire biosynthetic gene clusters from the inserts of individual library clones. A systematic approach toward insertional mutation of a complete Streptomyces genome is shown by the use of an in vitro transposon mutagenesis procedure.
Jang, Hongje; Min, Dal-Hee
2015-03-24
The polyvinylpyrrolidone (PVP)-coated spherically clustered porous gold-silver alloy nanoparticle (PVP-SPAN) was prepared by low temperature mediated, partially inhibited galvanic replacement reaction followed by silver etching process. The prepared porous nanostructures exhibited excellent photothermal conversion efficiency under irradiation of near-infrared light (NIR) and allowed a high payload of both doxorubicin (Dox) and thiolated dye-labeled oligonucleotide, DNAzyme (FDz). Especially, PVP-SPAN provided 10 times higher loading capacity for oligonucleotide than conventional hollow nanoshells due to increased pore diameter and surface-to-volume ratio. We demonstrated highly efficient chemo-thermo-gene multitherapy based on codelivery of Dox and FDz with NIR-mediated photothermal therapeutic effect using a model system of hepatitis C virus infected human liver cells (Huh7 human hepatocarcinoma cell line containing hepatitis C virus NS3 gene replicon) compared to conventional hollow nanoshells.
Improving cluster-based missing value estimation of DNA microarray data.
Brás, Lígia P; Menezes, José C
2007-06-01
We present a modification of the weighted K-nearest neighbours imputation method (KNNimpute) for missing values (MVs) estimation in microarray data based on the reuse of estimated data. The method was called iterative KNN imputation (IKNNimpute) as the estimation is performed iteratively using the recently estimated values. The estimation efficiency of IKNNimpute was assessed under different conditions (data type, fraction and structure of missing data) by the normalized root mean squared error (NRMSE) and the correlation coefficients between estimated and true values, and compared with that of other cluster-based estimation methods (KNNimpute and sequential KNN). We further investigated the influence of imputation on the detection of differentially expressed genes using SAM by examining the differentially expressed genes that are lost after MV estimation. The performance measures give consistent results, indicating that the iterative procedure of IKNNimpute can enhance the prediction ability of cluster-based methods in the presence of high missing rates, in non-time series experiments and in data sets comprising both time series and non-time series data, because the information of the genes having MVs is used more efficiently and the iterative procedure allows refining the MV estimates. More importantly, IKNN has a smaller detrimental effect on the detection of differentially expressed genes.
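The central idea of the abstract above, re-estimating missing values iteratively while reusing the values estimated in the previous pass, can be sketched as follows. This is a minimal illustration under stated assumptions, not the published IKNNimpute code: the initial fill uses row means, distances are Euclidean over all other columns, weights are inverse distances, and the function name `iknn_impute` and its parameters are hypothetical.

```python
import math


def iknn_impute(rows, k=2, n_iter=3):
    """Iterative weighted k-NN imputation sketch.
    `rows` is a list of gene profiles; missing entries are None."""
    missing = [(i, j) for i, r in enumerate(rows)
               for j, v in enumerate(r) if v is None]
    # Pass 0: fill each missing entry with the mean of its row's observed values.
    filled = []
    for r in rows:
        obs = [v for v in r if v is not None]
        mean = sum(obs) / len(obs)
        filled.append([mean if v is None else v for v in r])
    n_cols = len(filled[0])
    for _ in range(n_iter):
        new = [r[:] for r in filled]
        for i, j in missing:
            cand = []
            for g in range(len(filled)):
                if g == i:
                    continue
                # Distance uses current estimates too (the "reuse" step),
                # but skips the column being estimated.
                d = math.sqrt(sum((filled[i][c] - filled[g][c]) ** 2
                                  for c in range(n_cols) if c != j))
                cand.append((d, filled[g][j]))
            cand.sort(key=lambda t: t[0])
            top = cand[:k]
            weights = [1.0 / (d + 1e-9) for d, _ in top]
            new[i][j] = sum(w * v for w, (_, v) in zip(weights, top)) / sum(weights)
        filled = new
    return filled


# Illustrative data: the second profile matches the first on observed columns.
data = [[1.0, 2.0, 3.0],
        [1.0, 2.0, None],
        [5.0, 6.0, 7.0]]
completed = iknn_impute(data, k=1)
```

With `k=1` the missing entry is copied from the single closest profile, which here is the first row. Larger `k` values blend several neighbours by inverse-distance weight.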
Detecting communities in large networks
NASA Astrophysics Data System (ADS)
Capocci, A.; Servedio, V. D. P.; Caldarelli, G.; Colaiori, F.
2005-07-01
We develop an algorithm to detect community structure in complex networks. The algorithm is based on spectral methods and takes into account weights and link orientation. Since the method efficiently detects clustered nodes in large networks even when these are not sharply partitioned, it turns out to be especially suitable for the analysis of social and information networks. We test the algorithm on a large-scale data set from a psychological experiment of word association. In this case, it proves to be successful both in clustering words and in uncovering mental association patterns.
An efficient method to identify differentially expressed genes in microarray experiments
Qin, Huaizhen; Feng, Tao; Harding, Scott A.; Tsai, Chung-Jui; Zhang, Shuanglin
2013-01-01
Motivation Microarray experiments typically analyze thousands to tens of thousands of genes from small numbers of biological replicates. The fact that genes are normally expressed in functionally relevant patterns suggests that gene-expression data can be stratified and clustered into relatively homogenous groups. Cluster-wise dimensionality reduction should make it feasible to improve screening power while minimizing information loss. Results We propose a powerful and computationally simple method for finding differentially expressed genes in small microarray experiments. The method incorporates a novel stratification-based tight clustering algorithm, principal component analysis and information pooling. Comprehensive simulations show that our method is substantially more powerful than the popular SAM and eBayes approaches. We applied the method to three real microarray datasets: one from a Populus nitrogen stress experiment with 3 biological replicates; and two from public microarray datasets of human cancers with 10 to 40 biological replicates. In all three analyses, our method proved more robust than the popular alternatives for identification of differentially expressed genes. Availability The C++ code to implement the proposed method is available upon request for academic use. PMID:18453554
Census of solo LuxR genes in prokaryotic genomes
Hudaiberdiev, Sanjarbek; Choudhary, Kumari S.; Vera Alvarez, Roberto; Gelencsér, Zsolt; Ligeti, Balázs; Lamba, Doriano; Pongor, Sándor
2015-01-01
luxR genes encode transcriptional regulators that control acyl homoserine lactone-based quorum sensing (AHL QS) in Gram negative bacteria. On the bacterial chromosome, luxR genes are usually found next or near to a luxI gene encoding the AHL signal synthase. Recently, a number of luxR genes were described that have no luxI genes in their vicinity on the chromosome. These so-called solo luxR genes may either respond to internal AHL signals produced by a non-adjacent luxI in the chromosome, or can respond to exogenous signals. Here we present a survey of solo luxR genes found in complete and draft bacterial genomes in the NCBI databases using HMMs. We found that 2698 of the 3550 luxR genes found are solos, which is an unexpectedly high number even if some of the hits may be false positives. We also found that solo LuxR sequences form distinct clusters that are different from the clusters of LuxR sequences that are part of the known luxR-luxI topological arrangements. We also found a number of cases that we termed twin luxR topologies, in which two adjacent luxR genes were in tandem or divergent orientation. Many of the luxR solo clusters were devoid of the sequence motifs characteristic of AHL binding LuxR proteins so there is room to speculate that the solos may be involved in sensing hitherto unknown signals. It was noted that only some of the LuxR clades are rich in conserved cysteine residues. Molecular modeling suggests that some of the cysteines may be involved in disulfide formation, which makes us speculate that some LuxR proteins, including some of the solos may be involved in redox regulation. PMID:25815274
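The notion of a "solo" luxR gene, one with no luxI in its chromosomal vicinity, can be made concrete with a small sketch. The abstract does not state how "vicinity" was defined, so the window of neighbouring genes used here is an assumption, as are the function name `find_solo_luxr` and the simplified input (a list of gene-family labels in chromosomal order, treated as linear rather than circular).

```python
def find_solo_luxr(gene_families, window=3):
    """Return indices of luxR genes with no luxI within `window` genes
    on either side. `window` is an assumed, illustrative threshold."""
    solos = []
    for i, family in enumerate(gene_families):
        if family != "luxR":
            continue
        lo = max(0, i - window)
        hi = min(len(gene_families), i + window + 1)
        if "luxI" not in gene_families[lo:hi]:
            solos.append(i)
    return solos


# Illustrative gene order: the first luxR sits next to a luxI, the second does not.
chromosome = ["luxI", "luxR", "other", "other", "other", "other", "luxR", "other"]
solo_indices = find_solo_luxr(chromosome, window=2)

# Fraction of solos reported in the abstract: 2698 of 3550 luxR genes.
solo_fraction = 2698 / 3550
```

The reported counts correspond to 76% of the luxR genes found being solos.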
Cui, G F; Wu, L F; Wang, X N; Jia, W J; Duan, Q; Ma, L L; Jiang, Y L; Wang, J H
2014-07-29
Inter-simple sequence repeat (ISSR) markers were used to discriminate 62 lily cultivars of 5 hybrid series. Eight ISSR primers generated 104 bands in total, which all showed 100% polymorphism, and an average of 13 bands were amplified by each primer. Two software packages, POPGENE 1.32 and NTSYSpc 2.1, were used to analyze the data matrix. Our results showed that the observed number of alleles (NA), effective number of alleles (NE), Nei's genetic diversity (H), and Shannon's information index (I) were 1.9630, 1.4179, 0.2606, and 0.4080, respectively. The highest genetic similarity (0.9601) was observed between the Oriental x Trumpet and Oriental lilies, which indicated that the two hybrids had a close genetic relationship. An unweighted pair-group method with arithmetic means dendrogram showed that the 62 lily cultivars clustered into two discrete groups. The first group included the Oriental and OT cultivars, while the Asiatic, LA, and Longiflorum lilies were placed in the second cluster. The distribution of individuals in the principal component analysis was consistent with the clustering of the dendrogram. Fingerprints of all lily cultivars built from 8 primers could be separated completely. This study confirmed the effect and efficiency of ISSR identification in lily cultivars.
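The diversity statistics named above have standard per-locus definitions for a two-allele locus: effective number of alleles 1/(p² + q²), Nei's gene diversity 1 − (p² + q²), and Shannon's index −(p ln p + q ln q). The sketch below computes them for one locus. How the study's software derived allele frequencies is not stated in the abstract; the helper `band_allele_freq` assumes the common treatment of dominant markers under Hardy-Weinberg equilibrium, and both function names are hypothetical.

```python
import math


def locus_diversity(p):
    """Per-locus statistics for a two-allele locus with allele frequency p."""
    freqs = [p, 1.0 - p]
    homozygosity = sum(f * f for f in freqs)
    n_effective = 1.0 / homozygosity          # effective number of alleles
    nei_h = 1.0 - homozygosity                # Nei's gene diversity
    shannon = -sum(f * math.log(f) for f in freqs if f > 0)
    return n_effective, nei_h, shannon


def band_allele_freq(absent, total):
    """Assumption: dominant marker under Hardy-Weinberg equilibrium, so the
    null-allele frequency is sqrt(fraction of samples lacking the band)."""
    return 1.0 - math.sqrt(absent / total)


# Illustrative locus: the band is absent in 25 of 100 samples.
p_band = band_allele_freq(25, 100)
ne, h, shannon_i = locus_diversity(p_band)
```

Summary values such as those in the abstract are typically averages of these per-locus quantities over all scored bands, which is why the reported mean NE and mean H need not satisfy H = 1 − 1/NE exactly.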
Object-Oriented Image Clustering Method Using UAS Photogrammetric Imagery
NASA Astrophysics Data System (ADS)
Lin, Y.; Larson, A.; Schultz-Fellenz, E. S.; Sussman, A. J.; Swanson, E.; Coppersmith, R.
2016-12-01
Unmanned Aerial Systems (UAS) have been used widely as an imaging modality to obtain remotely sensed multi-band surface imagery, and are growing in popularity due to their efficiency, ease of use, and affordability. Los Alamos National Laboratory (LANL) has employed UAS for geologic site characterization and change detection studies at a variety of field sites. The deployed UAS are equipped with a standard visible band camera to collect imagery datasets. Based on the imagery collected, we use deep sparse algorithmic processing to detect and discriminate subtle topographic features created or impacted by subsurface activities. In this work, we develop an object-oriented remote sensing imagery clustering method for land cover classification. To improve the clustering and segmentation accuracy, instead of using conventional pixel-based clustering methods, we integrate the spatial information from neighboring regions to create super-pixels to avoid salt-and-pepper noise and subsequent over-segmentation. To further improve robustness of our clustering method, we also incorporate a custom digital elevation model (DEM) dataset generated using a structure-from-motion (SfM) algorithm together with the red, green, and blue (RGB) band data for clustering. In particular, we first employ an agglomerative clustering to create an initial segmentation map, in which every object is treated as a single (new) pixel. Based on the new pixels obtained, we generate new features to implement another level of clustering. We apply our clustering method to the RGB+DEM datasets collected at the field site. Through binary clustering and multi-object clustering tests, we verify that our method can accurately separate vegetation from non-vegetation regions, and is also able to differentiate object features on the surface.
Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W
1998-08-01
The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
dndDB: a database focused on phosphorothioation of the DNA backbone.
Ou, Hong-Yu; He, Xinyi; Shao, Yucheng; Tai, Cui; Rajakumar, Kumar; Deng, Zixin
2009-01-01
The Dnd DNA degradation phenotype was first observed during electrophoresis of genomic DNA from Streptomyces lividans more than 20 years ago. It was subsequently shown to be governed by the five-gene dnd cluster. Similar gene clusters have now been found to be widespread among many other distantly related bacteria. Recently the dnd cluster was shown to mediate the incorporation of sulphur into the DNA backbone via a sequence-selective, stereo-specific phosphorothioate modification in Escherichia coli B7A. Intriguingly, to date all identified dnd clusters lie within mobile genetic elements, the vast majority in laterally transferred genomic islands. We organized available data from experimental and bioinformatics analyses about the DNA phosphorothioation phenomenon and associated documentation as the dndDB database. It contains the following detailed information: (i) Dnd phenotype; (ii) dnd gene clusters; (iii) genomic islands harbouring dnd genes; (iv) Dnd proteins and conserved domains. As of 25 December 2008, dndDB contained data corresponding to 24 bacterial species exhibiting the Dnd phenotype reported in the scientific literature. In addition, via in silico analysis, dndDB identified 26 syntenic dnd clusters from 25 species of Eubacteria and Archaea, 25 dnd-bearing genomic islands and one dnd plasmid containing 114 dnd genes. A further 397 other genes coding for proteins with varying levels of similarity to Dnd proteins were also included in dndDB. A broad range of similarity search, sequence alignment and phylogenetic tools are readily accessible to allow for individualized directions of research focused on dnd genes. dndDB can facilitate efficient investigation of a wide range of aspects relating to dnd DNA modification and other island-encoded functions in host organisms. dndDB version 1.0 is freely available at http://mml.sjtu.edu.cn/dndDB/.
Banelli, Barbara; Brigati, Claudio; Di Vinci, Angela; Casciano, Ida; Forlani, Alessandra; Borzì, Luana; Allemanni, Giorgio; Romani, Massimo
2012-03-01
Epigenetic alterations are hallmarks of cancer and powerful biomarkers, whose clinical utilization is made difficult by the absence of standardization and of common methods of data interpretation. The coordinate methylation of many loci in cancer is defined as 'CpG island methylator phenotype' (CIMP) and identifies clinically distinct groups of patients. In neuroblastoma (NB), CIMP is defined by a methylation signature, which includes different loci, but its predictive power on outcome is entirely recapitulated by the PCDHB cluster only. We have developed a robust and cost-effective pyrosequencing-based assay that could facilitate the clinical application of CIMP in NB. This assay permits the unbiased simultaneous amplification and sequencing of 17 out of 19 genes of the PCDHB cluster for quantitative methylation analysis, taking into account all the sequence variations. As some of these variations were at CpG doublets, we bypassed the data interpretation conducted by the methylation analysis software to assign the corrected methylation value at these sites. The final result of the assay is the mean methylation level of 17 gene fragments in the protocadherin B (PCDHB) cluster. We have utilized this assay to compare the methylation levels of the PCDHB cluster between high-risk and very low-risk NB patients, confirming the predictive value of CIMP. Our results demonstrate that the pyrosequencing-based assay herein described is a powerful instrument for the analysis of this gene cluster that may simplify the data comparison between different laboratories and, in perspective, could facilitate its clinical application. Furthermore, our results demonstrate that, in principle, pyrosequencing can be efficiently utilized for the methylation analysis of gene clusters with high internal homologies.
eMBI: Boosting Gene Expression-based Clustering for Cancer Subtypes
Chang, Zheng; Wang, Zhenjia; Ashby, Cody; Zhou, Chuan; Li, Guojun; Zhang, Shuzhong; Huang, Xiuzhen
2014-01-01
Identifying clinically relevant subtypes of a cancer using gene expression data is a challenging and important problem in medicine, and is a necessary premise to provide specific and efficient treatments for patients of different subtypes. Matrix factorization provides a solution by finding checker-board patterns in the matrices of gene expression data. In the context of gene expression profiles of cancer patients, these checkerboard patterns correspond to genes that are up- or down-regulated in patients with particular cancer subtypes. Recently, a new matrix factorization framework for biclustering called Maximum Block Improvement (MBI) is proposed; however, it still suffers several problems when applied to cancer gene expression data analysis. In this study, we developed many effective strategies to improve MBI and designed a new program called enhanced MBI (eMBI), which is more effective and efficient to identify cancer subtypes. Our tests on several gene expression profiling datasets of cancer patients consistently indicate that eMBI achieves significant improvements in comparison with MBI, in terms of cancer subtype prediction accuracy, robustness, and running time. In addition, the performance of eMBI is much better than another widely used matrix factorization method called nonnegative matrix factorization (NMF) and the method of hierarchical clustering, which is often the first choice of clinical analysts in practice. PMID:25374455
NASA Astrophysics Data System (ADS)
Chen, Haichao; Meng, Xiaobo; Niu, Fenglin; Tang, Youcai; Yin, Chen; Wu, Furong
2018-02-01
Microseismic monitoring is crucial to improving stimulation efficiency of hydraulic fracturing treatment, as well as to mitigating potential induced seismic hazard. We applied an improved matching and locating technique to the downhole microseismic data set during one treatment stage along a horizontal well within the Weiyuan shale gas play inside Sichuan Basin in SW China, resulting in 3,052 well-located microseismic events. We employed this expanded catalog to investigate the spatiotemporal evolution of the microseismicity in order to constrain migration of the injected fluids and the associated dynamic processes. The microseismicity is generally characterized by two distinctly different clusters, both of which are highly correlated with the injection activity spatially and temporally. The distant and well-confined cluster (cluster A) features relatively large-magnitude events, with 40 events of M -1 or greater, whereas the cluster in the immediate vicinity of the wellbore (cluster B) includes two apparent lineations of seismicity with a NE-SW trend, consistent with the predominant orientation of natural fractures. We calculated the b-value and D-value, an index of fracture complexity, and found significant differences between the two seismicity clusters. Particularly, the distant cluster showed an extremely low b-value (0.47) and D-value (1.35). We speculate that the distant cluster is triggered by reactivation of a preexisting critically stressed fault, whereas the two lineations are induced by shear failures of optimally oriented natural fractures associated with fluid diffusion. In both cases, the spatially clustered microseismicity related to hydraulic stimulation is strongly controlled by the preexisting faults and fractures.
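The b-value mentioned above is the slope of the Gutenberg-Richter magnitude-frequency relation. The abstract does not say how it was estimated; one widely used option is the Aki maximum-likelihood estimator, b = log10(e) / (mean(M) − (Mc − ΔM/2)), where Mc is the completeness magnitude and ΔM the magnitude bin width. The sketch below assumes that estimator; the function name `b_value_mle` and the sample catalog are illustrative only.

```python
import math


def b_value_mle(magnitudes, mc, bin_width=0.0):
    """Aki maximum-likelihood b-value for events with magnitude >= mc.
    `bin_width` applies the usual half-bin correction for binned magnitudes;
    leave it at 0.0 for continuous magnitudes."""
    selected = [m for m in magnitudes if m >= mc]
    if not selected:
        raise ValueError("no events at or above the completeness magnitude")
    mean_mag = sum(selected) / len(selected)
    return math.log10(math.e) / (mean_mag - (mc - bin_width / 2.0))


# Illustrative catalog (not data from the study); the -2.0 event is below Mc.
catalog = [-1.0, -0.5, 0.0, -2.0]
b = b_value_mle(catalog, mc=-1.0)
```

A low b-value means the catalog is relatively rich in larger events, which is consistent with the abstract's interpretation of the distant cluster as fault reactivation.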
Song, Jia; Che, Jiaqian; You, Zhengying; Ye, Xiaogang; Li, Jisheng; Ye, Lupeng; Zhang, Yuyu; Qian, Qiujie; Zhong, Boxiong
2016-12-01
To probe the general phenomena of gene mutations, Bombyx mori, the lepidopterous model organism, was chosen as the experimental model. To easily detect phenotypic variations, the piggyBac system was utilized to introduce two marker genes into the silkworm, and the 23.4% transposition efficiency made it easy to breed a new strain for the entire experiment. Then, the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system was utilized. The results showed that the Cas9 system can induce efficient gene mutations, that the base changes could be detected as early as the G0 individuals in B. mori, and that the mutation rates at different target sites varied. Next, the gRNA2-targeted site, which generated a higher mutation rate, was chosen, and the experimental results were enumerated. First, the mutation proportion in the G1 generation was 30.1%, and some gene mutations were not inherited from the G0 generation; second, occasionally, base substitutions did not lead to variation in the amino-acid sequence, which decreased the efficiency of phenotypic changes compared with that of genotypic changes. These results laid the foundation for better use of the Cas9 system in silkworm gene editing. © The Author 2016. Published by Oxford University Press on behalf of the Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences.
Technical Efficiency of Automotive Industry Cluster in Chennai
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2012-07-01
Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, a diagnostic study was conducted on the Automotive Component Industries (ACI) in Ambattur Industrial Estate, Chennai, and a SWOT analysis found that they faced problems with infrastructure, technology, procurement, production and marketing. In 2004-2005, under the cluster development approach (CDA), they formed the Chennai auto cluster under a public-private partnership concept and received grants from the Government of India, the Government of Tamil Nadu and Ambattur Municipality, as well as bank loans and funds from stakeholders. This resulted in the development of infrastructure, technology, procurement, production and marketing interrelationships among ACI. The objective is to determine the correlation coefficient, regression equation, technical efficiency, peer weights, slack variables and returns to scale of the cluster before and after the CDA. The methodology adopted is the collection of primary data from ACI and its analysis using data envelopment analysis (DEA) with the input-oriented Banker-Charnes-Cooper model. There is a significant increase in the correlation coefficient, and the regression analysis reveals that for a one percent increase in employment and net worth, the gross output increases significantly after the CDA. The DEA solver gives the technical efficiency of ACI by taking shift, employment and net worth as input data and quality, gross output and export ratio as output data. From the technical score and ranking of ACI, it is found that there is a significant increase in the technical efficiency of ACI after the CDA compared with before it. The slack variables obtained clearly reveal excess employment and net worth and no shortage of gross output. To conclude, there is an increase in technical efficiency not only of the Chennai auto cluster in general but also of the Chennai auto components industries in particular.
Jensen, Kristopher Torp; Fløe, Lasse; Petersen, Trine Skov; Huang, Jinrong; Xu, Fengping; Bolund, Lars; Luo, Yonglun; Lin, Lin
2017-07-01
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (CRISPR-Cas9) systems have emerged as the method of choice for genome editing, but large variations in on-target efficiencies continue to limit their applicability. Here, we investigate the effect of chromatin accessibility on Cas9-mediated gene editing efficiency for 20 gRNAs targeting 10 genomic loci in HEK293T cells using both SpCas9 and the eSpCas9(1.1) variant. Our study indicates that gene editing is more efficient in euchromatin than in heterochromatin, and we validate this finding in HeLa cells and in human fibroblasts. Furthermore, we investigate the gRNA sequence determinants of CRISPR-Cas9 activity using a surrogate reporter system and find that the efficiency of Cas9-mediated gene editing is dependent on guide sequence secondary structure formation. This knowledge can aid in the further improvement of tools for gRNA design. © 2017 Federation of European Biochemical Societies.
Merino-Puerto, Victoria; Herrero, Antonia
2013-01-01
The filamentous, heterocyst-forming cyanobacteria perform oxygenic photosynthesis in vegetative cells and nitrogen fixation in heterocysts, and their filaments can be hundreds of cells long. In the model heterocyst-forming cyanobacterium Anabaena sp. strain PCC 7120, the genes in the fraC-fraD-fraE operon are required for filament integrity mainly under conditions of nitrogen deprivation. The fraC operon transcript partially overlaps gene all2395, which lies in the opposite DNA strand and ends 1 bp beyond fraE. Gene all2395 produces transcripts of 1.35 kb (major transcript) and 2.2 kb (minor transcript) that overlap fraE and whose expression is dependent on the N-control transcription factor NtcA. Insertion of a gene cassette containing transcriptional terminators between fraE and all2395 prevented production of the antisense RNAs and resulted in an increased length of the cyanobacterial filaments. Deletion of all2395 resulted in a larger increase of filament length and in impaired growth, mainly under N2-fixing conditions and specifically on solid medium. We denote all2395 the fraF gene, which encodes a protein restricting filament length. A FraF-green fluorescent protein (GFP) fusion protein accumulated significantly in heterocysts. Similar to some heterocyst differentiation-related proteins such as HglK, HetL, and PatL, FraF is a pentapeptide repeat protein. We conclude that the fraC-fraD-fraE←fraF gene cluster (where the arrow indicates a change in orientation), in which cis antisense RNAs are produced, regulates morphology by encoding proteins that influence positively (FraC, FraD, FraE) or negatively (FraF) the length of the filament mainly under conditions of nitrogen deprivation. This gene cluster is often conserved in heterocyst-forming cyanobacteria. PMID:23813733
A regulatory circuit for piwi by the large Maf gene traffic jam in Drosophila.
Saito, Kuniaki; Inagaki, Sachi; Mituyama, Toutai; Kawamura, Yoshinori; Ono, Yukiteru; Sakota, Eri; Kotani, Hazuki; Asai, Kiyoshi; Siomi, Haruhiko; Siomi, Mikiko C
2009-10-29
PIWI-interacting RNAs (piRNAs) silence retrotransposons in Drosophila germ lines by associating with the PIWI proteins Argonaute 3 (AGO3), Aubergine (Aub) and Piwi. piRNAs in Drosophila are produced from intergenic repetitive genes and piRNA clusters by two systems: the primary processing pathway and the amplification loop. The amplification loop occurs in a Dicer-independent, PIWI-Slicer-dependent manner. However, primary piRNA processing remains elusive. Here we analysed piRNA processing in a Drosophila ovarian somatic cell line where Piwi, but not Aub or AGO3, is expressed; thus, only the primary piRNAs exist. In addition to flamenco, a Piwi-specific piRNA cluster, traffic jam (tj), a large Maf gene, was determined as a new piRNA cluster. piRNAs arising from tj correspond to the untranslated regions of tj messenger RNA and are sense-oriented. piRNA loading on to Piwi may occur in the cytoplasm. zucchini, a gene encoding a putative cytoplasmic nuclease, is required for tj-derived piRNA production. In tj and piwi mutant ovaries, somatic cells fail to intermingle with germ cells and Fasciclin III is overexpressed. Loss of tj abolishes Piwi expression in gonadal somatic cells. Thus, in gonadal somatic cells, tj gives rise simultaneously to two different molecules: the TJ protein, which activates Piwi expression, and piRNAs, which define the Piwi targets for silencing.
Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q
2015-07-01
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Career Orientation Curriculum Guide: 7-8.
ERIC Educational Resources Information Center
Willoughby-Eastlake School District, Willoughby, OH.
The Ohio Career Development Model at the 7th and 8th grade level, the career orientation segment, states that students are to be exposed or oriented to the 15 USOE occupational clusters. Units are outlined relating each subject area to a specific cluster or clusters. Each unit includes a developmental objective, related behavioral objectives, and…
Highly efficient biallelic genome editing of human ES/iPS cells using a CRISPR/Cas9 or TALEN system.
Takayama, Kazuo; Igai, Keisuke; Hagihara, Yasuko; Hashimoto, Rina; Hanawa, Morifumi; Sakuma, Tetsushi; Tachibana, Masashi; Sakurai, Fuminori; Yamamoto, Takashi; Mizuguchi, Hiroyuki
2017-05-19
Genome editing research of human ES/iPS cells has been accelerated by clustered regularly interspaced short palindromic repeats/CRISPR-associated 9 (CRISPR/Cas9) and transcription activator-like effector nucleases (TALEN) technologies. However, the efficiency of biallelic genetic engineering in transcriptionally inactive genes is still low, unlike that in transcriptionally active genes. To enhance the biallelic homologous recombination efficiency in human ES/iPS cells, we performed screenings of accessorial genes and compounds. We found that RAD51 overexpression and valproic acid treatment enhanced biallelic-targeting efficiency in human ES/iPS cells regardless of the transcriptional activity of the targeted locus. Importantly, RAD51 overexpression and valproic acid treatment synergistically increased the biallelic homologous recombination efficiency. Our findings would facilitate genome editing study using human ES/iPS cells. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ji, Shuiwang
2013-07-11
The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship.
Liu, Shi-Huo; Li, Hong-Fei; Yang, Yang; Yang, Rui-Lin; Yang, Wen-Jia; Jiang, Hong-Bo; Dou, Wei; Smagghe, Guy; Wang, Jin-Jun
2018-05-01
Chitinases (Chts) and chitin deacetylases (CDAs) are important enzymes required for chitin metabolism in insects. In this study, 12 Cht-related genes (including seven Cht genes and five imaginal disc growth factor genes) and 6 CDA genes (encoding seven proteins) were identified in Bactrocera dorsalis using genome-wide searching and transcript profiling. Based on the conserved sequences and phylogenetic relationships, 12 Cht-related proteins were clustered into eight groups (group I-V and VII-IX). Further domain architecture analysis showed that all contained at least one chitinase catalytic domain, however, only four (BdCht5, BdCht7, BdCht8 and BdCht10) possessed chitin-binding domains. The subsequent phylogenetic analysis revealed that seven CDAs were clustered into five groups (group I-V), and all had one chitin deacetylase catalytic domain. However, only six exhibited chitin-binding domains. Finally, the development- and tissue-specific expression profiling showed that transcript levels of the 12 Cht-related genes and 6 CDA genes varied considerably among eggs, larvae, pupae and adults, as well as among different tissues of larvae and adults. Our findings illustrate the structural differences and expression patterns of Cht and CDA genes in B. dorsalis, and provide important information for the development of new pest control strategies based on these vital enzymes. Copyright © 2018. Published by Elsevier Inc.
The PhytoClust tool for metabolic gene clusters discovery in plant genomes
Fuchs, Lisa-Maria
2017-01-01
The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. PMID:28486689
The PhytoClust tool for metabolic gene clusters discovery in plant genomes.
Töpfer, Nadine; Fuchs, Lisa-Maria; Aharoni, Asaph
2017-07-07
The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Optimized Clustering Estimators for BAO Measurements Accounting for Significant Redshift Uncertainty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ross, Ashley J.; Banik, Nilanjan; Avila, Santiago
2017-05-15
We determine an optimized clustering statistic to be used for galaxy samples with significant redshift uncertainty, such as those that rely on photometric redshifts. To do so, we study the BAO information content as a function of the orientation of galaxy clustering modes with respect to their angle to the line-of-sight (LOS). The clustering along the LOS, as observed in a redshift-space with significant redshift uncertainty, has contributions from clustering modes with a range of orientations with respect to the true LOS. For redshift uncertainty σz ≥ 0.02(1 + z) we find that while the BAO information is confined to transverse clustering modes in the true space, it is spread nearly evenly in the observed space. Thus, measuring clustering in terms of the projected separation (regardless of the LOS) is an efficient and nearly lossless compression of the signal for σz ≥ 0.02(1 + z). For reduced redshift uncertainty, a more careful consideration is required. We then use more than 1700 realizations of galaxy simulations mimicking the Dark Energy Survey Year 1 sample to validate our analytic results and optimized analysis procedure. We find that using the correlation function binned in projected separation, we can achieve uncertainties that are within 10 per cent of those predicted by Fisher matrix forecasts. We predict that DES Y1 should achieve a 5 per cent distance measurement using our optimized methods. We expect the results presented here to be important for any future BAO measurements made using photometric redshift data.
Optimized clustering estimators for BAO measurements accounting for significant redshift uncertainty
NASA Astrophysics Data System (ADS)
Ross, Ashley J.; Banik, Nilanjan; Avila, Santiago; Percival, Will J.; Dodelson, Scott; Garcia-Bellido, Juan; Crocce, Martin; Elvin-Poole, Jack; Giannantonio, Tommaso; Manera, Marc; Sevilla-Noarbe, Ignacio
2017-12-01
We determine an optimized clustering statistic to be used for galaxy samples with significant redshift uncertainty, such as those that rely on photometric redshifts. To do so, we study the baryon acoustic oscillation (BAO) information content as a function of the orientation of galaxy clustering modes with respect to their angle to the line of sight (LOS). The clustering along the LOS, as observed in a redshift-space with significant redshift uncertainty, has contributions from clustering modes with a range of orientations with respect to the true LOS. For redshift uncertainty σz ≥ 0.02(1 + z), we find that while the BAO information is confined to transverse clustering modes in the true space, it is spread nearly evenly in the observed space. Thus, measuring clustering in terms of the projected separation (regardless of the LOS) is an efficient and nearly lossless compression of the signal for σz ≥ 0.02(1 + z). For reduced redshift uncertainty, a more careful consideration is required. We then use more than 1700 realizations (combining two separate sets) of galaxy simulations mimicking the Dark Energy Survey Year 1 (DES Y1) sample to validate our analytic results and optimized analysis procedure. We find that using the correlation function binned in projected separation, we can achieve uncertainties that are within 10 per cent of those predicted by Fisher matrix forecasts. We predict that DES Y1 should achieve a 5 per cent distance measurement using our optimized methods. We expect the results presented here to be important for any future BAO measurements made using photometric redshift data.
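The abstract's key quantity is the projected (transverse) separation of a galaxy pair. As an illustrative sketch, not the authors' pipeline, the function below splits a pair separation into components along and across the line of sight, assuming the observer sits at the origin and taking the LOS as the direction to the pair midpoint (a common convention, assumed here). The function name and coordinates are hypothetical.

```python
import math


def split_separation(p1, p2):
    """Return (s_parallel, s_perp) for two comoving 3D positions.

    Assumes the observer is at the origin and defines the line of sight
    as the direction to the pair midpoint (an assumed convention).
    """
    sep = [b - a for a, b in zip(p1, p2)]
    mid = [(a + b) / 2.0 for a, b in zip(p1, p2)]
    mid_norm = math.sqrt(sum(c * c for c in mid))
    if mid_norm == 0.0:
        raise ValueError("pair midpoint coincides with the observer")
    # Component of the separation along the line of sight.
    s_par = abs(sum(s * m for s, m in zip(sep, mid)) / mid_norm)
    s_sq = sum(c * c for c in sep)
    # Transverse component; clamp tiny negative values from rounding.
    s_perp = math.sqrt(max(s_sq - s_par * s_par, 0.0))
    return s_par, s_perp


# A purely transverse pair and a purely radial pair (toy coordinates).
transverse = split_separation((-5.0, 0.0, 100.0), (5.0, 0.0, 100.0))
radial = split_separation((0.0, 0.0, 100.0), (0.0, 0.0, 110.0))
```

Binning pair counts in `s_perp` alone discards the LOS component, which is the component that large photometric redshift errors smear out.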
Gwiazda, Kamila S; Grier, Alexandra E; Sahni, Jaya; Burleigh, Stephen M; Martin, Unja; Yang, Julia G; Popp, Nicholas A; Krutein, Michelle C; Khan, Iram F; Jacoby, Kyle; Jensen, Michael C; Rawlings, David J; Scharenberg, Andrew M
2016-09-29
Many future therapeutic applications of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9 and related RNA-guided nucleases are likely to require their use to promote gene targeting, thus necessitating development of methods that provide for delivery of three components-Cas9, guide RNAs and recombination templates-to primary cells rendered proficient for homology-directed repair. Here, we demonstrate an electroporation/transduction codelivery method that utilizes mRNA to express both Cas9 and mutant adenoviral E4orf6 and E1b55k helper proteins in association with adeno-associated virus (AAV) vectors expressing guide RNAs and recombination templates. By transiently enhancing target cell permissiveness to AAV transduction and gene editing efficiency, this novel approach promotes efficient gene disruption and/or gene targeting at multiple loci in primary human T-cells, illustrating its broad potential for application in translational gene editing.
Sustainable Development in Indian Automotive Component Clusters
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2013-01-01
India is the world's second fastest growing auto market and boasts the sixth largest automobile industry after China, the US, Germany, Japan and Brazil. The Indian auto component industry recorded its highest year-on-year growth of 34.2% in 2010-2011, raking in revenue of US$ 39.9 billion, with major contributions coming from exports at US$ 5 billion and fresh investment from the US at around US$ 2 billion. For inclusive growth and sustainable development, most of the auto component manufacturers have adopted the cluster development approach. The objective is to study the technical efficiency (θ), peer weights (λi), input slacks (S-) and output slacks (S+) of four Auto Component Clusters (ACC) in India. The methodology adopted is Data Envelopment Analysis with the input-oriented Banker-Charnes-Cooper model, taking the number of units and employment as inputs and sales and exports in crores as outputs. The non-zero λi values represent the weights for efficient clusters. The S > 0 obtained for one ACC reveals an excess number of units (S-) and employment (S-) and a shortage in sales (S+) and exports (S+). The variable returns to scale are increasing for three clusters and constant for the remaining one, with none decreasing. To conclude, for inclusive growth and sustainable development, the inefficient ACC should increase their turnover and exports, as a decrease in the number of enterprises and employment is practically not possible. Moreover, for sustainable development, the ACC should strengthen infrastructure interrelationships, technology interrelationships, procurement interrelationships, production interrelationships and marketing interrelationships to increase productivity and efficiency to compete in the world market.
Beites, Tiago; Mendes, Marta V
2015-01-01
The increased number of bacterial genome sequencing projects has generated over the last years a large reservoir of genomic information. In silico analysis of this genomic data has renewed the interest in bacterial bioprospecting for bioactive compounds by unveiling novel biosynthetic gene clusters of unknown or uncharacterized metabolites. However, only a small fraction of those metabolites is produced under laboratory-controlled conditions; the remaining clusters represent a pool of novel metabolites that are waiting to be "awakened". Activation of the biosynthetic gene clusters that present reduced or no expression (known as cryptic or silent clusters) by heterologous expression has emerged as a strategy for the identification and production of novel bioactive molecules. Synthetic biology, with engineering principles at its core, provides an excellent framework for the development of efficient heterologous systems for the expression of biosynthetic gene clusters. However, a common problem in its application is the host-interference problem, i.e., the unpredictable interactions between the device and the host that can hamper the desired output. Although an effort has been made to develop orthogonal devices, the most proficient way to overcome the host-interference problem is through genome simplification. In this review we present an overview on the strategies and tools used in the development of hosts/chassis for the heterologous expression of specialized metabolites biosynthetic gene clusters. Finally, we introduce the concept of specialized host as the next step of development of expression hosts.
Zhang, Bo; Zhang, Lin; Dai, Ruixue; Yu, Meiying; Zhao, Guoping; Ding, Xiaoming
2013-01-01
Streptomyces bacteria are known for producing important natural compounds by secondary metabolism, especially antibiotics with novel biological activities. Functional studies of antibiotic-biosynthesizing gene clusters are generally through homologous genomic recombination by gene-targeting vectors. Here, we present a rapid and efficient method for construction of gene-targeting vectors. This approach is based on Streptomyces phage φBT1 integrase-mediated multisite in vitro site-specific recombination. Four 'entry clones' were assembled into a circular plasmid to generate the destination gene-targeting vector by a one-step reaction. The four 'entry clones' contained two clones of the upstream and downstream flanks of the target gene, a selectable marker and an E. coli-Streptomyces shuttle vector. After targeted modification of the genome, the selectable markers were removed by φC31 integrase-mediated in vivo site-specific recombination between pre-placed attB and attP sites. Using this method, part of the calcium-dependent antibiotic (CDA) and actinorhodin (Act) biosynthetic gene clusters were deleted, and the rrdA encoding RrdA, a negative regulator of Red production, was also deleted. The final prodiginine production of the engineered strain was over five times that of the wild-type strain. This straightforward φBT1 and φC31 integrase-based strategy provides an alternative approach for rapid gene-targeting vector construction and marker removal in streptomycetes.
He, S Y; Lindeberg, M; Chatterjee, A K; Collmer, A
1991-02-01
The out genes of the enterobacterial plant pathogen Erwinia chrysanthemi are responsible for the efficient extracellular secretion of multiple plant cell wall-degrading enzymes, including four isozymes of pectate lyase, exo-poly-alpha-D-galacturonosidase, pectin methylesterase, and cellulase. Out- mutants of Er. chrysanthemi are unable to export any of these proteins beyond the periplasm and are severely reduced in virulence. We have cloned out genes from Er. chrysanthemi in the stable, low-copy-number cosmid pCPP19 by complementing several transposon-induced mutations. The cloned out genes were clustered in a 12-kilobase chromosomal DNA region, complemented all existing out mutations in Er. chrysanthemi EC16, and enabled Escherichia coli strains to efficiently secrete the extracellular pectic enzymes produced from cloned Er. chrysanthemi genes, while retaining the periplasmic marker protein beta-lactamase. DNA sequencing of a 2.4-kilobase EcoRI fragment within the out cluster revealed four genes arranged colinearly and sharing substantial similarity with the Klebsiella pneumoniae genes pulH, pulI, pulJ, and pulK, which are necessary for pullulanase secretion. However, K. pneumoniae cells harboring the cloned Er. chrysanthemi pelE gene were unable to secrete the Erwinia pectate lyase. Furthermore, the Er. chrysanthemi Out system was unable to secrete an extracellular pectate lyase encoded by a gene from a closely related plant pathogen, Erwinia carotovora ssp. carotovora. The results suggest that these enterobacteria secrete polysaccharidases by a conserved mechanism whose protein-recognition capacities have diverged.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, M; Craft, D
Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve cancer classification using biological pathways. Patients are classified with greater specificity and physiological relevance as compared to current gene-specific approaches. Focus now moves to utilizing PICS for pan-cancer patient-specific treatment response prediction.
Fluorescent in situ hybridisation to amphioxus chromosomes.
Castro, Luis Filipe Costa; Holland, Peter William Harold
2002-12-01
We describe an efficient protocol for mapping genes and other DNA sequences to amphioxus chromosomes using fluorescent in situ hybridisation. We apply this method to identify the number and location of ribosomal DNA gene clusters and telomere sequences in metaphase spreads of Branchiostoma floridae. We also describe how the locations of two single copy genes can be mapped relative to each other, and demonstrate this by mapping an amphioxus Pax gene relative to a homologue of the Notch gene. These methods have great potential for performing comparative genomics between amphioxus and vertebrates.
A Hybrid Approach for CpG Island Detection in the Human Genome.
Yang, Cheng-Hong; Lin, Yu-Da; Chiang, Yi-Cheng; Chuang, Li-Yeh
2016-01-01
CpG islands have been demonstrated to influence local chromatin structures and simplify the regulation of gene activity. However, the accurate and rapid determination of CpG islands for whole DNA sequences remains experimentally and computationally challenging. A novel procedure is proposed to detect CpG islands by combining clustering technology with the sliding-window method (PSO-based). Clustering technology is used to detect the locations of all possible CpG islands and process the data, thus obviating the need for extensive and unnecessary processing of DNA fragments and improving the efficiency of the sliding-window based particle swarm optimization (PSO) search. This proposed approach, named ClusterPSO, provides versatile and highly sensitive detection of CpG islands in the human genome. In addition, the detection efficiency of ClusterPSO is compared with eight CpG island detection methods in the human genome. In a comparison of detection efficiency for CpG islands in the human genome, including sensitivity, specificity, accuracy, performance coefficient (PC), and correlation coefficient (CC), ClusterPSO showed superior detection ability among all of the tested methods. Moreover, the combination of clustering technology and the PSO method can successfully overcome their respective drawbacks while maintaining their advantages. Thus, clustering technology could be hybridized with the optimization algorithm method to optimize CpG island detection. The prediction accuracy of ClusterPSO was quite high, indicating that the combination of CpGcluster and PSO has several advantages over CpGcluster and PSO alone. In addition, ClusterPSO significantly reduced implementation time.
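For readers unfamiliar with the sliding-window baseline that the abstract above builds on, a minimal sketch follows. It is not ClusterPSO and contains no clustering or particle swarm step. The thresholds are the commonly cited Gardiner-Garden and Frommer criteria (window of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6), which are an assumption here because the abstract does not state which criteria were used; the function names are hypothetical.

```python
def cpg_window_stats(window):
    """Return (GC content, observed/expected CpG ratio) for one window."""
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Obs/Exp = (#CpG * N) / (#C * #G); defined as 0 when C or G is absent.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def scan_cpg_windows(seq, window=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Return start positions of all windows that meet both thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, obs_exp = cpg_window_stats(seq[start:start + window])
        if gc >= min_gc and obs_exp >= min_obs_exp:
            hits.append(start)
    return hits
```

Scanning every window of a whole genome this way is expensive, which is the cost the abstract says a clustering pre-filter is meant to reduce.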
Modification of the Genome of Domestic Animals.
Lotti, Samantha N; Polkoff, Kathryn M; Rubessa, Marcello; Wheeler, Matthew B
2017-07-03
In the past few years, new technologies have arisen that enable higher efficiency of gene editing. With the increased ease of using gene editing technologies, it is important to consider the best method for transferring new genetic material to livestock animals. Microinjection is a technique that has proven to be effective in mice but is less efficient in large livestock animals. Over the years, a variety of methods have been used for cloning as well as gene transfer, including nuclear transfer, sperm-mediated gene transfer (SMGT), and liposome-mediated DNA transfer. This review looks at the different success rates of these methods and how they have evolved to become more efficient. It also covers gene editing technologies, including zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the most recent, clustered regularly interspaced short palindromic repeats (CRISPRs). Through the advancements in gene-editing technologies, generating transgenic animals is now more accessible and affordable. The goals of producing transgenic animals are to 1) increase our understanding of biology and biomedical science; 2) increase our ability to produce more efficient animals; and 3) produce disease-resistant animals. ZFNs, TALENs, and CRISPRs combined with gene transfer methods increase the possibility of achieving these goals.
Zhang, Weipeng; Lu, Liang; Lai, Qiliang; Zhu, Beika; Li, Zhongrui; Xu, Ying; Shao, Zongze; Herrup, Karl; Moore, Bradley S.; Ross, Avena C.; Qian, Pei-Yuan
2016-01-01
The thalassospiramide lipopeptides have great potential for therapeutic applications; however, their structural and functional diversity and biosynthesis are poorly understood. Here, by cultivating 130 Rhodospirillaceae strains sampled from oceans worldwide, we discovered 21 new thalassospiramide analogues and demonstrated their neuroprotective effects. To investigate the diversity of biosynthetic gene cluster (BGC) architectures, we sequenced the draft genomes of 28 Rhodospirillaceae strains. Our family-wide genomic analysis revealed three types of dysfunctional BGCs and four functional BGCs whose architectures correspond to four production patterns. This correlation allowed us to reassess the “diversity-oriented biosynthesis” proposed for the microbial production of thalassospiramides, which involves iteration of several key modules. Preliminary evolutionary investigation suggested that the functional BGCs could have arisen through module/domain loss, whereas the dysfunctional BGCs arose through horizontal gene transfer. Further comparative genomics indicated that thalassospiramide production is likely to be attendant on particular genes/pathways for amino acid metabolism, signaling transduction, and compound efflux. Our findings provide a systematic understanding of thalassospiramide production and new insights into the underlying mechanism. PMID:27875306
Geffroy, V; Sicard, D; de Oliveira, J C; Sévignac, M; Cohen, S; Gepts, P; Neema, C; Langin, T; Dron, M
1999-09-01
The recent cloning of plant resistance (R) genes and the sequencing of resistance gene clusters have shed light on the molecular evolution of R genes. However, up to now, no attempt has been made to correlate this molecular evolution with the host-pathogen coevolution process at the population level. Cross-inoculations were carried out between 26 strains of the fungal pathogen Colletotrichum lindemuthianum and 48 Phaseolus vulgaris plants collected in the three centers of diversity of the host species. A high level of diversity for resistance against the pathogen was revealed. Most of the resistance specificities were overcome in sympatric situations, indicating an adaptation of the pathogen to the local host. In contrast, plants were generally resistant to allopatric strains, suggesting that R genes that were efficient against exotic strains but had been overcome locally were maintained in the plant genome. These results indicated that coevolution processes between the two protagonists led to a differentiation for resistance in the three centers of diversity of the host. To improve our understanding of the molecular evolution of these different specificities, a recombinant inbred (RI) population derived from two representative genotypes of the Andean (JaloEEP558) and Mesoamerican (BAT93) gene pools was used to map anthracnose specificities. A gene cluster comprising both Andean (Co-y; Co-z) and Mesoamerican (Co-9) host resistance specificities was identified, suggesting that this locus existed prior to the separation of the two major gene pools of P. vulgaris. Molecular analysis revealed a high level of complexity at this locus. It harbors 11 restriction fragment length polymorphisms when R gene analog (RGA) clones are used. The relationship between the coevolution process and diversification of resistance specificities at resistance gene clusters is discussed.
Wang, H-X; Chen, Y-Y; Ge, L; Fang, T-T; Meng, J; Liu, Z; Fang, X-Y; Ni, S; Lin, C; Wu, Y-Y; Wang, M-L; Shi, N-N; He, H-G; Hong, K; Shen, Y-M
2013-07-01
Ansamycins are a family of macrolactams that are synthesized by type I polyketide synthase (PKS) using 3-amino-5-hydroxybenzoic acid (AHBA) as the starter unit. Most members of the family have strong antimicrobial, antifungal, anticancer and/or antiviral activities. We aimed to discover new ansamycins and/or other AHBA-containing natural products from actinobacteria. Through PCR screening of AHBA synthase gene, we identified 26 AHBA synthase gene-positive strains from 206 plant-associated actinomycetes (five positives) and 688 marine-derived actinomycetes (21 positives), representing a positive ratio of 2·4-3·1%. Twenty-five ansamycins, including eight new compounds, were isolated from six AHBA synthase gene-positive strains through TLC-guided fractionations followed by repeated column chromatography. To gain information about those potential ansamycin gene clusters whose products were unknown, seven strains with phylogenetically divergent AHBA synthase genes were subjected to fosmid library construction. Of the seven gene clusters we obtained, three show characteristics for typical ansamycin gene clusters, and other four, from Micromonospora spp., appear to lack the amide synthase gene, which is unusual for ansamycin biosynthesis. The gene composition of these four gene clusters suggests that they are involved in the biosynthesis of a new family of hybrid PK-NRP compounds containing AHBA substructure. PCR screening of AHBA synthase is an efficient approach to discover novel ansamycins and other AHBA-containing natural products. This work demonstrates that the AHBA-based screening method is a useful approach for discovering novel ansamycins and other AHBA-containing natural products from new microbial resources. Journal of Applied Microbiology © 2013 The Society for Applied Microbiology.
An analysis of cluster headache information provided on internet websites.
Peterlin, B Lee; Gambini-Suarez, Eduardo; Lidicker, Jeffrey; Levin, Morris
2008-03-01
To evaluate the quality of websites providing cluster headache information for patients and healthcare providers. The Internet has become an increasingly important source of healthcare information. However, limited data exist regarding the quality of websites providing headache information. This was a cross-sectional study conducted in February 2007. Websites providing cluster headache information were determined on the search engine MetaCrawler and classified as either patient oriented or healthcare provider oriented. The overall quality of each site was evaluated using a score system. Readability was evaluated using the Flesch-Kincaid Grade Level Readability Score (FKRS). Website quality was analyzed based on ownership, purpose, authorship, author qualifications, attribution, interactivity, and currency. The technical quality of the cluster headache information was analyzed based on content specific to cluster headache. The final ranking, based on the sum of the ranks of all 3 categories, was determined and then contrasted between the patient-oriented and healthcare professional-oriented websites using 2-sample t-tests. Of the first 40 websites found on MetaCrawler, 72.5% were advertisements, unrelated to headache, or repeated websites. Although the standard US writing averages are at a seventh to eighth grade level, the mean FKRS of all sites was at a 12th grade level of difficulty, with no significant difference between the patient-oriented or healthcare provider-oriented websites (P = .54). Of a total possible 14 points, the overall mean quality component score was 9.9 for all sites; and of a total possible 23 points, the overall mean technical component score was 13.9. There was no significant difference for either the quality or technical component scores between patient-oriented or healthcare provider-oriented websites (P = .45 and P = .80, respectively). There are numerous cluster headache websites that can be found on the Internet. 
The quality of most of the websites dedicated to cluster headache is mediocre, and although there are some excellent cluster headache websites, these sites may be challenging for many users to locate. There was no significant difference in the overall quality of websites oriented for patients or healthcare providers providing cluster headache information evaluated in this study. In addition, websites providing high-quality cluster headache information are written at an educational level too high for a significant portion of the general population to fully utilize. Physicians should strongly consider providing lists of quality websites on cluster headache for their patients.
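The Flesch-Kincaid grade level used in the study above is a fixed formula over word, sentence, and syllable counts, so it can be sketched in a few lines. The formula constants are the standard published ones; the text-based helper is an assumption for illustration, since it estimates syllables by counting vowel groups, which is cruder than the counters in dedicated readability tools and will not reproduce the study's scores exactly.

```python
import re


def fk_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fk_grade_from_text(text):
    """Rough estimate from plain text (vowel-group syllable heuristic)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # Count each run of vowels as one syllable, with at least one per word.
    syllables = sum(
        max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words
    )
    return fk_grade(len(words), len(sentences), syllables)
```

Longer sentences and more syllables per word both raise the score, which is why dense clinical wording pushes a page toward the 12th grade level reported in the abstract.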
The AAV-mediated and RNA-guided CRISPR/Cas9 system for gene therapy of DMD and BMD.
Wang, Jing-Zhang; Wu, Peng; Shi, Zhi-Min; Xu, Yan-Li; Liu, Zhi-Jun
2017-08-01
Mutations in the dystrophin gene (Dmd) result in Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD), which afflict many newborn boys. In 2016, Brain and Development published several interesting articles on DMD treatment with antisense oligonucleotide, kinase inhibitor, and prednisolone. Even more strikingly, three articles in issue 6271 of Science in 2016 provide new insights into gene therapy of DMD and BMD via the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system. In brief, adeno-associated virus (AAV) vectors transport guide RNAs (gRNAs) and Cas9 into the mdx mouse model, gRNAs recognize the mutated Dmd exon 23 (having a stop codon), and Cas9 cuts the mutated exon 23 out of the Dmd gene. These manipulations restored expression of truncated but partially functional dystrophin, improved skeletal and cardiac muscle function, and increased survival of mdx mice significantly. This review concisely summarizes the related advancements and discusses their primary implications for the future gene therapy of DMD, including AAV-vector selection, gRNA design, Cas9 optimization, dystrophin-restoration efficiency, administration routes, and systemic and long-term therapeutic efficacy. Future directions, including off-target effects, safety concerns, immune responses, precision medicine, and Dmd editing in the brain (potentially blocked by the blood-brain barrier), are also discussed briefly. Collectively, the AAV-mediated and RNA-guided CRISPR/Cas9 system has major advantages over traditional gene therapy, and might contribute substantially to the treatment of DMD and BMD in the near future. Copyright © 2017 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Orientation selectivity and the functional clustering of synaptic inputs in primary visual cortex
Wilson, Daniel E.; Whitney, David E.; Scholl, Benjamin; Fitzpatrick, David
2016-01-01
The majority of neurons in primary visual cortex are tuned for stimulus orientation, but the factors that account for the range of orientation selectivities exhibited by cortical neurons remain unclear. To address this issue, we used in vivo 2-photon calcium imaging to characterize the orientation tuning and spatial arrangement of synaptic inputs to the dendritic spines of individual pyramidal neurons in layer 2/3 of ferret visual cortex. The summed synaptic input to individual neurons reliably predicted the neuron’s orientation preference, but did not account for differences in orientation selectivity among neurons. These differences reflected a robust input-output nonlinearity that could not be explained by spike threshold alone, and was strongly correlated with the spatial clustering of co-tuned synaptic inputs within the dendritic field. Dendritic branches with more co-tuned synaptic clusters exhibited greater rates of local dendritic calcium events supporting a prominent role for functional clustering of synaptic inputs in dendritic nonlinearities that shape orientation selectivity. PMID:27294510
CRISPR/Cas9 nuclease-mediated gene knock-in in bovine-induced pluripotent cells.
Heo, Young Tae; Quan, Xiaoyuan; Xu, Yong Nan; Baek, Soonbong; Choi, Hwan; Kim, Nam-Hyung; Kim, Jongpil
2015-02-01
Efficient and precise genetic engineering in livestock such as cattle holds great promise in agriculture and biomedicine. However, techniques that generate pluripotent stem cells, as well as reliable tools for gene targeting in livestock, are still inefficient, and thus not routinely used. Here, we report highly efficient gene targeting in the bovine genome using bovine pluripotent cells and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 nuclease. First, we generated induced pluripotent stem cells (iPSCs) from bovine somatic fibroblasts by the ectopic expression of Yamanaka factors and treatment with GSK3β and MEK inhibitors (2i). We observed that these bovine iPSCs are highly similar to naïve pluripotent stem cells with regard to gene expression and developmental potential in teratomas. Moreover, CRISPR/Cas9 nuclease, which was specific for the bovine NANOG locus, showed highly efficient editing of the bovine genome in bovine iPSCs and embryos. To conclude, CRISPR/Cas9 nuclease-mediated homologous recombination targeting in bovine pluripotent cells is an efficient gene editing method that can be used to generate transgenic livestock in the future.
Clustering by soft-constraint affinity propagation: applications to gene-expression data.
Leone, Michele; Sumedha; Weigt, Martin
2007-10-15
Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP), based on message-passing techniques, was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g. in analyzing gene expression data. This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows interpolation between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) is more informative and accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows the extraction of sparse gene expression signatures for each cluster.
2009-01-01
Background: Soybeans grown in the upper Midwestern United States often suffer from iron deficiency chlorosis, which results in yield loss at the end of the season. To better understand the effect of iron availability on soybean yield, we identified genes in two near isogenic lines with changes in expression patterns when plants were grown in iron sufficient and iron deficient conditions. Results: Transcriptional profiles of soybean (Glycine max, L. Merr) near isogenic lines Clark (PI548553, iron efficient) and IsoClark (PI547430, iron inefficient) grown under Fe-sufficient and Fe-limited conditions were analyzed and compared using the Affymetrix® GeneChip® Soybean Genome Array. There were 835 candidate genes in the Clark (PI548553) genotype and 200 candidate genes in the IsoClark (PI547430) genotype putatively involved in soybean's iron stress response. Of these candidate genes, fifty-eight genes in the Clark genotype were identified with a genetic location within known iron efficiency QTL and 21 in the IsoClark genotype. The arrays also identified 170 single feature polymorphisms (SFPs) specific to either Clark or IsoClark. A sliding window analysis of the microarray data and the 7X genome assembly coupled with an iterative model of the data showed the candidate genes are clustered in the genome. An analysis of 5' untranslated regions in the promoter of candidate genes identified 11 conserved motifs in 248 differentially expressed genes, all from the Clark genotype, representing 129 clusters identified earlier, confirming the cluster analysis results. Conclusion: These analyses have identified the first genes with expression patterns that are affected by iron stress and are located within QTL specific to iron deficiency stress. The genetic location and promoter motif analysis results support the hypothesis that the differentially expressed genes are co-regulated. 
The combined results of all analyses lead us to postulate iron inefficiency in soybean is a result of a mutation in a transcription factor(s), which controls the expression of genes required in inducing an iron stress response. PMID:19678937
Formation of Nitrogenase NifDK Tetramers in the Mitochondria of Saccharomyces cerevisiae
2017-01-01
Transferring the prokaryotic enzyme nitrogenase into a eukaryotic host with the final aim of developing N2 fixing cereal crops would revolutionize agricultural systems worldwide. Targeting it to mitochondria has potential advantages because of the organelle’s high O2 consumption and the presence of bacterial-type iron–sulfur cluster biosynthetic machinery. In this study, we constructed 96 strains of Saccharomyces cerevisiae in which transcriptional units comprising nine Azotobacter vinelandii nif genes (nifHDKUSMBEN) were integrated into the genome. Two combinatorial libraries of nif gene clusters were constructed: a library of mitochondrial leading sequences consisting of 24 clusters within four subsets of nif gene expression strength, and an expression library of 72 clusters with fixed mitochondrial leading sequences and nif expression levels assigned according to factorial design. In total, 29 promoters and 18 terminators were combined to adjust nif gene expression levels. Expression and mitochondrial targeting was confirmed at the protein level as immunoblot analysis showed that Nif proteins could be efficiently accumulated in mitochondria. NifDK tetramer formation, an essential step of nitrogenase assembly, was experimentally proven both in cell-free extracts and in purified NifDK preparations. This work represents a first step toward obtaining functional nitrogenase in the mitochondria of a eukaryotic cell. PMID:28221768
NASA Astrophysics Data System (ADS)
Maczewski, Lukasz
2010-05-01
The International Linear Collider (ILC) is a project of an electron-positron (e+e-) linear collider with the centre-of-mass energy of 200-500 GeV. Monolithic Active Pixel Sensors (MAPS) are one of the proposed silicon pixel detector concepts for the ILC vertex detector (VTX). Basic characteristics of two MAPS pixel matrices, MIMOSA-5 (17 μm pixel pitch) and MIMOSA-18 (10 μm pixel pitch), are studied and compared (pedestals, noise, calibration of the ADC-to-electron conversion gain, detector efficiency and charge collection properties). The e+e- collisions at the ILC will be accompanied by an intense beamstrahlung background of electrons and positrons hitting the inner planes of the vertex detector. Tracks of this origin leave elongated clusters, contrary to those of secondary hadrons. Cluster characteristics and orientation with respect to the pixels netting are studied for perpendicular and inclined tracks. Elongation and precision of determining the cluster orientation as a function of the angle of incidence were measured. A simple model of signal formation (based on charge diffusion) is proposed and tested using the collected data.
[siRNA-mediated tissue factor knockdown in porcine neonatal islet cell clusters in vitro].
Ji, Ming; Yi, Shounan; Yu, Deling; Wang, Wei
2011-12-01
To determine the genetic modification on neonatal porcine islet cell clusters (NICC) by small interfering RNA (siRNA)-mediated tissue factor (TF) knockdown in vitro. Porcine NICC were transfected with 5 pairs of designed siRNA, individually or in different combinations, using Lipofectamine 2000. Transfected NICC were analyzed for TF gene expression by real-time PCR to select the siRNA that worked best. Meanwhile, the viability of NICC after the TF siRNA transfection was examined by FACS. The efficiency of TF gene and protein suppression was measured by real-time PCR and FACS, respectively. Real-time PCR and FACS showed that a 60% reduction in TF gene expression and a 50% reduction in the protein level of TF on NICC were achieved by transfecting 3 pairs of selected siRNA. The siRNA transfection had no significant effect on the viability of NICC, as analyzed by FACS. The expression of TF on porcine NICC is efficiently suppressed by 3 pairs of designed siRNA in vitro.
NASA Astrophysics Data System (ADS)
Guo, Jingyu; Tian, Dehua; McKinney, Brett A.; Hartman, John L.
2010-06-01
Interactions between genetic and/or environmental factors are ubiquitous, affecting the phenotypes of organisms in complex ways. Knowledge about such interactions is becoming rate-limiting for our understanding of human disease and other biological phenomena. Phenomics refers to the integrative analysis of how all genes contribute to phenotype variation, entailing genome and organism level information. A systems biology view of gene interactions is critical for phenomics. Unfortunately the problem is intractable in humans; however, it can be addressed in simpler genetic model systems. Our research group has focused on the concept of genetic buffering of phenotypic variation, in studies employing the single-cell eukaryotic organism, S. cerevisiae. We have developed a methodology, quantitative high throughput cellular phenotyping (Q-HTCP), for high-resolution measurements of gene-gene and gene-environment interactions on a genome-wide scale. Q-HTCP is being applied to the complete set of S. cerevisiae gene deletion strains, a unique resource for systematically mapping gene interactions. Genetic buffering is the idea that comprehensive and quantitative knowledge about how genes interact with respect to phenotypes will lead to an appreciation of how genes and pathways are functionally connected at a systems level to maintain homeostasis. However, extracting biologically useful information from Q-HTCP data is challenging, due to the multidimensional and nonlinear nature of gene interactions, together with a relative lack of prior biological information. Here we describe a new approach for mining quantitative genetic interaction data called recursive expectation-maximization clustering (REMc). We developed REMc to help discover phenomic modules, defined as sets of genes with similar patterns of interaction across a series of genetic or environmental perturbations. 
Such modules are reflective of buffering mechanisms, i.e., genes that play a related role in the maintenance of physiological homeostasis. To develop the method, 297 gene deletion strains were selected based on gene-drug interactions with hydroxyurea, an inhibitor of ribonucleotide reductase enzyme activity, which is critical for DNA synthesis. To partition the gene functions, these 297 deletion strains were challenged with growth inhibitory drugs known to target different genes and cellular pathways. Q-HTCP-derived growth curves were used to quantify all gene interactions, and the data were used to test the performance of REMc. Fundamental advantages of REMc include objective assessment of total number of clusters and assignment to each cluster a log-likelihood value, which can be considered an indicator of statistical quality of clusters. To assess the biological quality of clusters, we developed a method called gene ontology information divergence z-score (GOid_z). GOid_z summarizes total enrichment of GO attributes within individual clusters. Using these and other criteria, we compared the performance of REMc to hierarchical and K-means clustering. The main conclusion is that REMc provides distinct efficiencies for mining Q-HTCP data. It facilitates identification of phenomic modules, which contribute to buffering mechanisms that underlie cellular homeostasis and the regulation of phenotypic expression.
Chee, S Y
2015-05-25
The mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) gene has been universally and successfully utilized as a barcoding gene, mainly because it can be amplified easily, applied across a wide range of taxa, and results can be obtained cheaply and quickly. However, in rare cases, the gene can fail to distinguish between species, particularly when exposed to highly sensitive methods of data analysis, such as the Bayesian method, or when taxa have undergone introgressive hybridization, over-splitting, or incomplete lineage sorting. Such cases require the use of alternative markers, and nuclear DNA markers are commonly used. In this study, a dendrogram produced by Bayesian analysis of an mtDNA COI dataset was compared with that of a nuclear DNA ATPS-α dataset, in order to evaluate the efficiency of COI in barcoding Malaysian nerites (Neritidae). In the COI dendrogram, most of the species were in individual clusters, except for two species: Nerita chamaeleon and N. histrio. These two species were placed in the same subcluster, whereas in the ATPS-α dendrogram they were in their own subclusters. Analysis of the ATPS-α gene also placed the two genera of nerites (Nerita and Neritina) in separate clusters, whereas COI gene analysis placed both genera in the same cluster. Therefore, in the case of the Neritidae, the ATPS-α gene is a better barcoding gene than the COI gene.
[Construction of screening system for mutation of negative regulatory genes in Streptomyces].
Zhu, Yu; Feng, Chi; Tan, Huarong; Tian, Yuqing
2013-10-04
We aimed to create a novel reporter system for screening for mutations of negative regulatory genes, especially those repressing the expression of cryptic antibiotic clusters. We used a marker-free gene disruption strategy, which combines the "REDIRECT (Rapid Efficient Directed Recombination Time Saving)" technology with in vivo site-specific recombination by Streptomyces phage phiBT1 integrase, to construct a scbR2/inoA double mutant strain of S. coelicolor M145. This strain was used as the host of the reporter system. For the construction of the reporter plasmid, the ScbR2-repressed promoter of cpkO from the CPK (cryptic polyketide) cluster was used to drive the expression of a promoterless conserved gene, inoA, of S. coelicolor. Then the reporter plasmid was introduced into the host strain described above to test the availability of inoA as a reporter gene in this system. The scbR2/inoA double mutant strain gave rise to a bald phenotype on MM medium in the absence of inositol, and produced a yellow-pigmented secondary metabolite because the disruption of scbR2 released the repression of cpkO, a pathway-specific activator gene situated in the CPK cluster. After introducing the reporter plasmid into this test strain, the resulting strain recovered the wild-type phenotype, indicating that the promoter of cpkO can drive the expression of inoA in the scbR2 mutant and consequently restore the biosynthesis of inositol. Our results indicated that inoA can be used as a novel reporter gene for Streptomyces, especially for detecting the activation of a "silent" promoter. This reporter system might be useful for screening for mutations of the negative regulatory genes of cryptic secondary metabolic gene clusters.
Stachler, Aris-Edda; Marchfelder, Anita
2016-07-15
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system is used by bacteria and archaea to fend off foreign genetic elements. Since its discovery it has been developed into numerous applications like genome editing and regulation of transcription in eukaryotes and bacteria. For archaea currently no tools for transcriptional repression exist. Because molecular biology analyses in archaea become more and more widespread such a tool is vital for investigating the biological function of essential genes in archaea. Here we use the model archaeon Haloferax volcanii to demonstrate that its endogenous CRISPR-Cas system I-B can be harnessed to repress gene expression in archaea. Deletion of cas3 and cas6b genes results in efficient repression of transcription. crRNAs targeting the promoter region reduced transcript levels down to 8%. crRNAs targeting the reading frame have only slight impact on transcription. crRNAs that target the coding strand repress expression only down to 88%, whereas crRNAs targeting the template strand repress expression down to 8%. Repression of an essential gene results in reduction of transcription levels down to 22%. Targeting efficiencies can be enhanced by expressing a catalytically inactive Cas3 mutant. Genes can be targeted on plasmids or on the chromosome, they can be monocistronic or part of a polycistronic operon. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Stachler, Aris-Edda; Marchfelder, Anita
2016-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system is used by bacteria and archaea to fend off foreign genetic elements. Since its discovery it has been developed into numerous applications like genome editing and regulation of transcription in eukaryotes and bacteria. For archaea currently no tools for transcriptional repression exist. Because molecular biology analyses in archaea become more and more widespread such a tool is vital for investigating the biological function of essential genes in archaea. Here we use the model archaeon Haloferax volcanii to demonstrate that its endogenous CRISPR-Cas system I-B can be harnessed to repress gene expression in archaea. Deletion of cas3 and cas6b genes results in efficient repression of transcription. crRNAs targeting the promoter region reduced transcript levels down to 8%. crRNAs targeting the reading frame have only slight impact on transcription. crRNAs that target the coding strand repress expression only down to 88%, whereas crRNAs targeting the template strand repress expression down to 8%. Repression of an essential gene results in reduction of transcription levels down to 22%. Targeting efficiencies can be enhanced by expressing a catalytically inactive Cas3 mutant. Genes can be targeted on plasmids or on the chromosome, they can be monocistronic or part of a polycistronic operon. PMID:27226589
Sporulation genes associated with sporulation efficiency in natural isolates of yeast.
Tomar, Parul; Bhatia, Aatish; Ramdas, Shweta; Diao, Liyang; Bhanot, Gyan; Sinha, Himanshu
2013-01-01
Yeast sporulation efficiency is a quantitative trait and is known to vary among experimental populations and natural isolates. Some studies have uncovered the genetic basis of this variation and have identified the role of sporulation genes (IME1, RME1) and sporulation-associated genes (FKH2, PMS1, RAS2, RSF1, SWS2), as well as non-sporulation pathway genes (MKT1, TAO3) in maintaining this variation. However, these studies have been done mostly in experimental populations. Sporulation is a response to nutrient deprivation. Unlike laboratory strains, natural isolates have likely undergone multiple selections for quick adaptation to varying nutrient conditions. As a result, sporulation efficiency in natural isolates may have different genetic factors contributing to phenotypic variation. Using Saccharomyces cerevisiae strains in the genetically and environmentally diverse SGRP collection, we have identified genetic loci associated with sporulation efficiency variation in a set of sporulation and sporulation-associated genes. Using two independent methods for association mapping and correcting for population structure biases, our analysis identified two linked clusters containing 4 non-synonymous mutations in genes - HOS4, MCK1, SET3, and SPO74. Five regulatory polymorphisms in five genes such as MLS1 and CDC10 were also identified as putative candidates. Our results provide candidate genes contributing to phenotypic variation in the sporulation efficiency of natural isolates of yeast.
Sporulation Genes Associated with Sporulation Efficiency in Natural Isolates of Yeast
Ramdas, Shweta; Diao, Liyang; Bhanot, Gyan; Sinha, Himanshu
2013-01-01
Yeast sporulation efficiency is a quantitative trait and is known to vary among experimental populations and natural isolates. Some studies have uncovered the genetic basis of this variation and have identified the role of sporulation genes (IME1, RME1) and sporulation-associated genes (FKH2, PMS1, RAS2, RSF1, SWS2), as well as non-sporulation pathway genes (MKT1, TAO3) in maintaining this variation. However, these studies have been done mostly in experimental populations. Sporulation is a response to nutrient deprivation. Unlike laboratory strains, natural isolates have likely undergone multiple selections for quick adaptation to varying nutrient conditions. As a result, sporulation efficiency in natural isolates may have different genetic factors contributing to phenotypic variation. Using Saccharomyces cerevisiae strains in the genetically and environmentally diverse SGRP collection, we have identified genetic loci associated with sporulation efficiency variation in a set of sporulation and sporulation-associated genes. Using two independent methods for association mapping and correcting for population structure biases, our analysis identified two linked clusters containing 4 non-synonymous mutations in genes – HOS4, MCK1, SET3, and SPO74. Five regulatory polymorphisms in five genes such as MLS1 and CDC10 were also identified as putative candidates. Our results provide candidate genes contributing to phenotypic variation in the sporulation efficiency of natural isolates of yeast. PMID:23874994
IoT Service Clustering for Dynamic Service Matchmaking.
Zhao, Shuai; Yu, Le; Cheng, Bo; Chen, Junliang
2017-07-27
With the adoption of service-oriented paradigms in the IoT (Internet of Things) environment, real-world devices will expose their capabilities through service interfaces, which enable other functional entities to interact with them. In an IoT application, it is indispensable to find suitable services to satisfy users' requirements or to replace unavailable services. However, from the perspective of performance, it is inappropriate to search the service repository for desired services directly online. Instead, it is necessary to cluster services offline according to their similarity and to perform matchmaking or service discovery online within a limited number of clusters. This paper proposes a multidimensional model-based approach to measure the similarity between IoT services. Then, density-peaks-based clustering is employed to group similar services according to the similarity measurements. Based on the service clustering, the algorithms for dynamic service matchmaking, discovery, and replacement can be performed efficiently. Evaluation experiments are conducted to validate the performance of the proposed approaches, and the results are promising.
IoT Service Clustering for Dynamic Service Matchmaking
Yu, Le; Cheng, Bo; Chen, Junliang
2017-01-01
With the adoption of service-oriented paradigms in the IoT (Internet of Things) environment, real-world devices will expose their capabilities through service interfaces, which enable other functional entities to interact with them. In an IoT application, it is indispensable to find suitable services to satisfy users’ requirements or to replace unavailable services. However, from the perspective of performance, it is inappropriate to search the service repository for desired services directly online. Instead, it is necessary to cluster services offline according to their similarity and to perform matchmaking or service discovery online within a limited number of clusters. This paper proposes a multidimensional model-based approach to measure the similarity between IoT services. Then, density-peaks-based clustering is employed to group similar services according to the similarity measurements. Based on the service clustering, the algorithms for dynamic service matchmaking, discovery, and replacement can be performed efficiently. Evaluation experiments are conducted to validate the performance of the proposed approaches, and the results are promising. PMID:28749431
2013-01-01
Background The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. Results In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Conclusions Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship. PMID:23845024
Golubovskaya, Inna N; Harper, Lisa C; Pawlowski, Wojciech P; Schichnes, Denise; Cande, W Zacheus
2002-01-01
The clustering of telomeres on the nuclear envelope (NE) during meiotic prophase to form the bouquet arrangement of chromosomes may facilitate homologous chromosome synapsis. The pam1 (plural abnormalities of meiosis 1) gene is the first maize gene that appears to be required for telomere clustering, and homologous synapsis is impaired in pam1. Telomere clustering on the NE is arrested or delayed at an intermediate stage in pam1. Telomeres associate with the NE during the leptotene-zygotene transition but cluster slowly if at all as meiosis proceeds. Intermediate stages in telomere clustering including miniclusters are observed in pam1 but not in wild-type meiocytes. The tight bouquet normally seen at zygotene is a rare event. In contrast, the polarization of centromeres vs. telomeres in the nucleus at the leptotene-zygotene transition is the same in mutant and wild-type cells. Defects in homologous chromosome synapsis include incomplete synapsis, nonhomologous synapsis, and unresolved interlocks. However, the number of RAD51 foci on chromosomes in pam1 is similar to that of wild type. We suggest that the defects in homologous synapsis and the retardation of prophase I arise from the irregularity of telomere clustering and propose that pam1 is involved in the control of bouquet formation and downstream meiotic prophase I events. PMID:12524364
A YAC contig of the human CC chemokine genes clustered on chromosome 17q11.2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naruse, Kuniko; Nomiyama, Hisayuki; Miura, Retsu
1996-06-01
CC chemokines are cytokines that attract and activate leukocytes. The human genes for the CC chemokines are clustered on chromosome 17. To elucidate the genomic organization of the CC chemokine genes, we constructed a YAC contig comprising 34 clones. The contig was shown to contain all 10 CC chemokine genes reported so far, except for one gene whose nucleotide sequence is not available. The contig also contains 4 CC chemokine-like genes, which were deposited in GenBank as ESTs and are here referred to as NCC-1, NCC-2, NCC-3, and NCC-4. Within the contig, the CC chemokine genes were localized in two regions. In addition, the CC chemokine genes were more precisely mapped on chromosome 17q11.2 using a somatic cell hybrid cell DNA panel containing various portions of human chromosome 17. Interestingly, a reciprocal translocation t(Y;17) breakpoint, contained in the hybrid cell line Y1741, lay between the two chromosome 17 chemokine gene regions covered by our YAC contig. From these results, the order and the orientation of CC chemokine genes on chromosome 17 were determined as follows: centromere-neurofibromatosis 1-(MCP-3, MCP-1, NCC-1, I-309)-Y1741 breakpoint-RANTES-(LD78γ, AT744.2, LD78β)-(NCC-3, NCC-2, AT744.1, LD78α)-NCC-4-retinoic acid receptor α-telomere. 22 refs., 1 fig., 2 tabs.
The capacity limitations of orientation summary statistics
Attarha, Mouna; Moore, Cathleen M.
2015-01-01
The simultaneous–sequential method was used to test the processing capacity of establishing mean orientation summaries. Four clusters of oriented Gabor patches were presented in the peripheral visual field. One of the clusters had a mean orientation that was tilted either left or right while the mean orientations of the other three clusters were roughly vertical. All four clusters were presented at the same time in the simultaneous condition whereas the clusters appeared in temporal subsets of two in the sequential condition. Performance was lower when the means of all four clusters had to be processed concurrently than when only two had to be processed in the same amount of time. The advantage for establishing fewer summaries at a given time indicates that the processing of mean orientation engages limited-capacity processes (Experiment 1). This limitation cannot be attributed to crowding, low target-distractor discriminability, or a limited-capacity comparison process (Experiments 2 and 3). In contrast to the limitations of establishing multiple summary representations, establishing a single summary representation unfolds without interference (Experiment 4). When interpreted in the context of recent work on the capacity of summary statistics, these findings encourage reevaluation of the view that early visual perception consists of summary statistic representations that unfold independently across multiple areas of the visual field. PMID:25810160
Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin
2016-01-01
ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery. PMID:27451447
Kraakman, L S; Mager, W H; Maurer, K T; Nieuwint, R T; Planta, R J
1989-01-01
Transcription of the majority of the ribosomal protein (rp) genes in yeast is activated through common cis-acting elements, designated RPG-boxes. These elements have been shown to act as specific binding sites for the protein factor TUF/RAP1/GRF1 in vitro. Two such elements occur in the intergenic region separating the divergently transcribed genes encoding L46 and S24. To investigate whether the two RPG-boxes mediate transcription activation of both the L46 and S24 genes, two experimental strategies were followed: cloning of the respective genes on multicopy vectors and construction of fusion genes. Cloning of the L46 + S24 gene including the intergenic region in a multicopy yeast vector indicated that both genes are transcriptionally active. Using constructs in which only the S24 or the L46 gene is present, with or without the intergenic region, we obtained evidence that the intergenic region is indispensable for transcription activation of either gene. To demarcate the element(s) responsible for this activation, fusions of the intergenic region in either orientation to the galK reporter gene were made. Northern analysis of the levels of hybrid mRNA demonstrated that the intergenic region can serve as a heterologous promoter when it is in the 'S24-orientation'. Surprisingly, however, when fused in the reverse orientation the intergenic region conferred hardly any transcription activity on the fusion gene. Furthermore, a 274 bp FnuDII-FnuDII fragment from the intergenic region that contains the RPG-boxes could replace the naturally occurring upstream activation site (UASrpg) of the L25 rp-gene only when inserted in the 'S24-orientation'. Removal of 15 bp from the FnuDII fragment appeared to be sufficient to obtain transcription activation in the 'L46 orientation' as well.
Analysis of a construct in which the RPG-boxes were selectively deleted from the promoter region of the L46 gene indicated that the RPG-boxes are needed for efficient transcriptional activation of the L46 gene. We conclude that all promoter elements for the S24 gene are located within the intergenic region, where the RPG-boxes are the most likely UAS-elements. However, the intergenic region (including the RPG-boxes) is required but not sufficient to confer transcription activity on the L46 gene. PMID:2602141
Kraakman, L S; Mager, W H; Maurer, K T; Nieuwint, R T; Planta, R J
1989-12-11
Transcription of the majority of the ribosomal protein (rp) genes in yeast is activated through common cis-acting elements, designated RPG-boxes. These elements have been shown to act as specific binding sites for the protein factor TUF/RAP1/GRF1 in vitro. Two such elements occur in the intergenic region separating the divergently transcribed genes encoding L46 and S24. To investigate whether the two RPG-boxes mediate transcription activation of both the L46 and S24 genes, two experimental strategies were followed: cloning of the respective genes on multicopy vectors and construction of fusion genes. Cloning of the L46 + S24 gene including the intergenic region in a multicopy yeast vector indicated that both genes are transcriptionally active. Using constructs in which only the S24 or the L46 gene is present, with or without the intergenic region, we obtained evidence that the intergenic region is indispensable for transcription activation of either gene. To demarcate the element(s) responsible for this activation, fusions of the intergenic region in either orientation to the galK reporter gene were made. Northern analysis of the levels of hybrid mRNA demonstrated that the intergenic region can serve as a heterologous promoter when it is in the 'S24-orientation'. Surprisingly, however, when fused in the reverse orientation the intergenic region conferred hardly any transcription activity on the fusion gene. Furthermore, a 274 bp FnuDII-FnuDII fragment from the intergenic region that contains the RPG-boxes could replace the naturally occurring upstream activation site (UASrpg) of the L25 rp-gene only when inserted in the 'S24-orientation'. Removal of 15 bp from the FnuDII fragment appeared to be sufficient to obtain transcription activation in the 'L46 orientation' as well.
Analysis of a construct in which the RPG-boxes were selectively deleted from the promoter region of the L46 gene indicated that the RPG-boxes are needed for efficient transcriptional activation of the L46 gene. We conclude that all promoter elements for the S24 gene are located within the intergenic region, where the RPG-boxes are the most likely UAS-elements. However, the intergenic region (including the RPG-boxes) is required but not sufficient to confer transcription activity on the L46 gene.
Seki, Akiko; Rutz, Sascha
2018-03-05
CRISPR (clustered, regularly interspaced, short palindromic repeats)/Cas9 (CRISPR-associated protein 9) has become the tool of choice for generating gene knockouts across a variety of species. The ability for efficient gene editing in primary T cells not only represents a valuable research tool to study gene function but also holds great promise for T cell-based immunotherapies, such as next-generation chimeric antigen receptor (CAR) T cells. Previous attempts to apply CRISPR/Cas9 for gene editing in primary T cells have resulted in highly variable knockout efficiency and required T cell receptor (TCR) stimulation, thus largely precluding the study of genes involved in T cell activation or differentiation. Here, we describe an optimized approach for Cas9/RNP transfection of primary mouse and human T cells without TCR stimulation that results in near complete loss of target gene expression at the population level, mitigating the need for selection. We believe that this method will greatly extend the feasibility of target gene discovery and validation in primary T cells and simplify the gene editing process for next-generation immunotherapies. © 2018 Genentech.
A new fast method for inferring multiple consensus trees using k-medoids.
Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir
2018-04-05
Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. 
The main advantage of our method is that it is much faster than the existing tree clustering approaches, while providing similar or better clustering results in most cases. This makes it particularly well suited for the analysis of large genomic and phylogenetic datasets.
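The partitioning step described in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each tree is represented by its set of clades and that trees are compared with a symmetric-difference (Robinson-Foulds-style) distance; the abstract does not state which distance or initialisation the authors use, and the adapted Silhouette and Caliński-Harabasz indices are not shown. The names `rf_distance` and `k_medoids` are illustrative only.

```python
def rf_distance(tree_a, tree_b):
    """Symmetric-difference distance between two trees given as sets of clades (assumed metric)."""
    return len(tree_a ^ tree_b)


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Initialisation with the first k items is a simplifying assumption.
    """
    n = len(dist)
    medoids = list(range(k))

    def assign(current):
        # Each item joins the cluster of its closest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # The new medoid minimises the total distance to its cluster members.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


def clade(names):
    return frozenset(names)


# Toy input: four rooted trees on taxa A-E, each reduced to its non-trivial clades.
trees = [
    {clade("AB"), clade("CD")},
    {clade("AB"), clade("CDE")},
    {clade("AC"), clade("BD")},
    {clade("AC"), clade("BDE")},
]
dist = [[rf_distance(a, b) for b in trees] for a in trees]
medoids, labels = k_medoids(dist, k=2)
```

In this toy input the first two trees share the clade {A, B} and the last two share {A, C}, so they end up in separate clusters; a consensus tree could then be built per cluster rather than for the whole collection.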
Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin
2018-01-01
Abstract Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
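For orientation, the serial MCL iteration that HipMCL parallelises can be sketched in a few lines. This is a toy, pure-Python illustration of the expansion and inflation steps on a dense matrix, not the HipMCL implementation; the function names, the added self-loops, the fixed iteration count, and the `1e-6` threshold are assumptions made for the example.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion: square the matrix, i.e. take two steps of the random walk."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def inflate(m, r):
    """Inflation: raise entries to the power r and renormalise, boosting strong flows."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=50):
    n = len(adjacency)
    # Self-loops are added before normalisation (a common convention, assumed here).
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Rows that keep non-negligible mass act as attractors; their columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Toy graph: a triangle (0, 1, 2) and a separate path 3 - 4 - 5.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5)]
adjacency = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1.0
clusters = mcl(adjacency)
```

The dense triple loop in `expand` is cubic in the number of nodes, which hints at why running time and memory become the bottleneck the abstract mentions for networks with millions of nodes.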
Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...
2018-01-05
Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.
Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
Consensus properties and their large-scale applications for the gene duplication problem.
Moon, Jucheol; Lin, Harris T; Eulenstein, Oliver
2016-06-01
Solving the gene duplication problem is a classical approach for species tree inference from gene trees that are confounded by gene duplications. This problem takes a collection of gene trees and seeks a species tree that implies the minimum number of gene duplications. Wilkinson et al. posed the conjecture that the gene duplication problem satisfies the desirable Pareto property for clusters. That is, for every instance of the problem, all clusters that are commonly present in the input gene trees of this instance, called strict consensus, will also be found in every solution to this instance. We prove that this conjecture does not generally hold. Despite this negative result we show that the gene duplication problem satisfies a weaker version of the Pareto property where the strict consensus is found in at least one solution (rather than all solutions). This weaker property contributes to our design of an efficient scalable algorithm for the gene duplication problem. We demonstrate the performance of our algorithm in analyzing large-scale empirical datasets. Finally, we utilize the algorithm to evaluate the accuracy of standard heuristics for the gene duplication problem using simulated datasets.
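The two notions used in this abstract, the number of duplications a species tree implies for a gene tree and the strict consensus of the input trees, can be made concrete with a small sketch. It assumes rooted binary trees written as nested tuples whose leaves are species names, and it uses the standard least-common-ancestor mapping to count duplications; the helper names are hypothetical, and this is not the authors' scalable algorithm, which searches for the species tree itself.

```python
def leaves(tree):
    """Set of species below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def clusters(tree):
    """All clusters (leaf sets of subtrees) of a rooted binary tree."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    return [leaves(tree)] + clusters(tree[0]) + clusters(tree[1])


def count_duplications(gene_tree, species_tree):
    """Duplications implied by the LCA mapping of gene_tree into species_tree."""
    species_clusters = clusters(species_tree)

    def lca(species):
        # Smallest species-tree cluster containing the given species.
        return min((c for c in species_clusters if species <= c), key=len)

    def visit(node):
        if isinstance(node, str):
            return 0, lca(frozenset([node]))
        left_dups, left_map = visit(node[0])
        right_dups, right_map = visit(node[1])
        here = lca(left_map | right_map)
        # A node is a duplication if it maps to the same species node as a child.
        is_dup = 1 if here in (left_map, right_map) else 0
        return left_dups + right_dups + is_dup, here

    return visit(gene_tree)[0]


def strict_consensus(trees):
    """Clusters present in every input tree."""
    return set.intersection(*(set(clusters(t)) for t in trees))


species_tree = (("A", "B"), "C")
gene_trees = [(("A", "B"), "C"), (("A", "C"), "B")]
costs = [count_duplications(g, species_tree) for g in gene_trees]
shared = strict_consensus(gene_trees)
```

Here the first gene tree agrees with the species tree and implies no duplication, while the second implies one at its root; the gene duplication problem asks for the species tree that minimises the sum of such costs over all input gene trees.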
Regulatory genes and their roles for improvement of antibiotic biosynthesis in Streptomyces.
Lu, Fengjuan; Hou, Yanyan; Zhang, Heming; Chu, Yiwen; Xia, Haiyang; Tian, Yongqiang
2017-08-01
The numerous secondary metabolites in Streptomyces spp. are crucial for various applications. For example, cephamycin C is used as an antibiotic, and avermectin is used as an insecticide. Specifically, antibiotic yield is closely related to many factors, such as the external environment, nutrition (including nitrogen and carbon sources), biosynthetic efficiency and the regulatory mechanisms in producing strains. There are various types of regulatory genes that work in different ways, such as pleiotropic (or global) regulatory genes, cluster-situated regulators, which are also called pathway-specific regulatory genes, and many other regulators. The study of regulatory genes that influence antibiotic biosynthesis in Streptomyces spp. not only provides a theoretical basis for antibiotic biosynthesis in Streptomyces but also helps to increase the yield of antibiotics via molecular manipulation of these regulatory genes. Currently, more and more emphasis is being placed on the regulatory genes of antibiotic biosynthetic gene clusters in Streptomyces spp., and many studies on these genes have been performed to improve the yield of antibiotics in Streptomyces. This paper lists many antibiotic biosynthesis regulatory genes in Streptomyces spp. and focuses on frequently investigated regulatory genes that are involved in pathway-specific regulation and pleiotropic regulation and their applications in genetic engineering.
Chin, John J; Kim, Anna J; Takahashi, Lois; Wiebe, Douglas J
2015-01-01
Social determinants of health may be substantially affected by spatial factors, which together may explain the persistence of health inequities. Clustering of possible sources of negative health and social outcomes points to a spatial focus for future interventions. We analyzed the spatial clustering of sex work businesses in Southern California to examine where and why they cluster. We explored economic and legal factors as possible explanations of clustering. We manually coded data from a website used by paying members to post reviews of female massage parlor workers. We identified clusters of sexually oriented massage parlor businesses using spatial autocorrelation tests. We conducted spatial regression using census tract data to identify predictors of clustering. A total of 889 venues were identified. Clusters of tracts having higher-than-expected numbers of sexually oriented massage parlors ("hot spots") were located outside downtowns. These hot spots were characterized by a higher proportion of adult males, a higher proportion of households below the federal poverty level, and a smaller average household size. Sexually oriented massage parlors in Los Angeles and Orange counties cluster in particular neighborhoods. More research is needed to ascertain the causal factors of such clusters and how interventions can be designed to leverage these spatial factors.
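As an illustration of what a spatial autocorrelation test measures, the sketch below computes global Moran's I for counts observed in neighbouring areas. The abstract does not say which statistic the authors used, so Moran's I is an assumption here, and the four-tract chain with made-up counts is purely hypothetical data for the example.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / W) * sum_ij w_ij d_i d_j / sum_i d_i^2, with d = value - mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / w_total) * (cross / variance_sum)


# Hypothetical layout: four census tracts in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar counts sit next to each other
alternating = morans_i([1, 5, 1, 5], chain)  # high and low counts alternate
```

Positive values indicate that similar counts tend to be neighbours (the "hot spot" pattern the abstract describes), while negative values indicate a checkerboard-like arrangement; a formal test would additionally compare the statistic against its distribution under spatial randomness.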
Guo, Shengye; Li, Xingyu; He, Pengfei; Ho, Honhing; Wu, Yixin; He, Yueqiu
2015-06-01
Bacillus subtilis XF-1 is a gram-positive, plant-associated bacterium that stimulates plant growth and produces secondary metabolites that suppress soil-borne plant pathogens. It is particularly efficient at controlling clubroot disease of cruciferous crops. Its 4,061,186-bp genome contains an estimated 3853 protein-coding sequences and the 1155 genes of XF-1 are present in most genome-sequenced Bacillus strains: 3757 genes in B. subtilis 168, and 1164 in B. amyloliquefaciens FZB42. Analysis using the Cluster of Orthologous Groups database of proteins shows that 60 genes control bacterial mobility, 221 genes are related to cell wall and membrane biosynthesis, and more than 112 genes are associated with secondary metabolites. In addition, the genes contributed to the strain's plant colonization, bio-control and stimulation of plant growth. Sequencing of the genome is a fundamental step for developing a desired strain to serve as an efficient biological control agent and plant growth stimulator. Similar to other members of the taxon, XF-1 has a genome that contains giant gene clusters for the non-ribosomal synthesis of antifungal lipopeptides (surfactin and fengycin), the polyketides (macrolactin and bacillaene), the siderophore bacillibactin, and the dipeptide bacilysin. There are two synthesis pathways for volatile growth-promoting compounds. The expression of biosynthesized antibiotic peptides in XF-1 was revealed by matrix-assisted laser desorption/ionization-time of flight mass spectrometry.
He, S Y; Lindeberg, M; Chatterjee, A K; Collmer, A
1991-01-01
The out genes of the enterobacterial plant pathogen Erwinia chrysanthemi are responsible for the efficient extracellular secretion of multiple plant cell wall-degrading enzymes, including four isozymes of pectate lyase, exo-poly-alpha-D-galacturonosidase, pectin methylesterase, and cellulase. Out- mutants of Er. chrysanthemi are unable to export any of these proteins beyond the periplasm and are severely reduced in virulence. We have cloned out genes from Er. chrysanthemi in the stable, low-copy-number cosmid pCPP19 by complementing several transposon-induced mutations. The cloned out genes were clustered in a 12-kilobase chromosomal DNA region, complemented all existing out mutations in Er. chrysanthemi EC16, and enabled Escherichia coli strains to efficiently secrete the extracellular pectic enzymes produced from cloned Er. chrysanthemi genes, while retaining the periplasmic marker protein beta-lactamase. DNA sequencing of a 2.4-kilobase EcoRI fragment within the out cluster revealed four genes arranged colinearly and sharing substantial similarity with the Klebsiella pneumoniae genes pulH, pulI, pulJ, and pulK, which are necessary for pullulanase secretion. However, K. pneumoniae cells harboring the cloned Er. chrysanthemi pelE gene were unable to secrete the Erwinia pectate lyase. Furthermore, the Er. chrysanthemi Out system was unable to secrete an extracellular pectate lyase encoded by a gene from a closely related plant pathogen, Erwinia carotovora ssp. carotovora. The results suggest that these enterobacteria secrete polysaccharidases by a conserved mechanism whose protein-recognition capacities have diverged. PMID:1992458
An improved K-means clustering method for cDNA microarray image segmentation.
Wang, T N; Li, T J; Shao, G F; Wu, S X
2015-07-14
Microarray technology is a powerful tool for human genetic research and other biomedical applications. Numerous improvements to the standard K-means algorithm have been carried out to complete the image segmentation step. However, most of the previous studies classify the image into two clusters. In this paper, we propose a novel K-means algorithm, which first classifies the image into three clusters; one of the three clusters is then assigned to the background region and the other two clusters to the foreground region. The proposed method was evaluated on six different data sets. The analyses of accuracy, efficiency, expression values, special gene spots, and noise images demonstrate the effectiveness of our method in improving the segmentation quality.
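The three-cluster idea can be illustrated with a small sketch. This is not the authors' algorithm: it runs plain Lloyd's K-means on scalar pixel intensities, and the rule that the darkest cluster becomes the background is an assumption made only for this example, as are the function names and the toy pixel values.

```python
def kmeans_1d(values, k=3, iters=50):
    """Plain Lloyd's k-means on scalar intensities; returns (centers, labels). Needs k >= 2."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centers evenly over the intensity range.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def segment_spot(pixels):
    """Cluster into three groups, then merge: darkest -> background (0), other two -> foreground (1)."""
    centers, labels = kmeans_1d(pixels, k=3)
    background = min(range(3), key=lambda c: centers[c])  # assumption: darkest cluster is background
    return [0 if lab == background else 1 for lab in labels]


# Hypothetical intensities: dark background, a dim rim, and a bright spot core.
pixels = [10, 12, 11, 13, 60, 65, 58, 200, 210, 205]
mask = segment_spot(pixels)
```

With only two clusters the dim rim pixels would have to be forced into either the background or the spot; a third cluster lets them form their own group before the merge.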
Zheng, Xiaomei; Zheng, Ping; Zhang, Kun; Cairns, Timothy C; Meyer, Vera; Sun, Jibin; Ma, Yanhe
2018-04-30
The CRISPR/Cas9 system is a revolutionary genome editing tool. However, in eukaryotes, search and optimization of a suitable promoter for guide RNA expression is a significant technical challenge. Here we used the industrially important fungus, Aspergillus niger, to demonstrate that the 5S rRNA gene, which is both highly conserved and efficiently expressed in eukaryotes, can be used as a guide RNA promoter. The gene editing system was established with 100% rates of precision gene modifications among dozens of transformants using short (40-bp) homologous donor DNA. This system was also applicable for generation of designer chromosomes, as evidenced by deletion of a 48 kb gene cluster required for biosynthesis of the mycotoxin fumonisin B1. Moreover, this system also facilitated simultaneous mutagenesis of multiple genes in A. niger. We anticipate that the use of the 5S rRNA gene as guide RNA promoter can broadly be applied for engineering highly efficient eukaryotic CRISPR/Cas9 toolkits. Additionally, the system reported here will enable development of designer chromosomes in model and industrially important fungi.
Highly efficient CRISPR/HDR-mediated knock-in for mouse embryonic stem cells and zygotes.
Wang, Bangmei; Li, Kunyu; Wang, Amy; Reiser, Michelle; Saunders, Thom; Lockey, Richard F; Wang, Jia-Wang
2015-10-01
The clustered regularly interspaced short palindromic repeat (CRISPR) gene editing technique, based on the non-homologous end-joining (NHEJ) repair pathway, has been used to generate gene knock-outs with variable sizes of small insertion/deletions with high efficiency. More precise genome editing, either the insertion or deletion of a desired fragment, can be done by combining the homology-directed-repair (HDR) pathway with CRISPR cleavage. However, HDR-mediated gene knock-in experiments are typically inefficient, and there have been no reports of successful gene knock-in with DNA fragments larger than 4 kb. Here, we describe the targeted insertion of large DNA fragments (7.4 and 5.8 kb) into the genomes of mouse embryonic stem (ES) cells and zygotes, respectively, using the CRISPR/HDR technique without NHEJ inhibitors. Our data show that CRISPR/HDR without NHEJ inhibitors can result in highly efficient gene knock-in, equivalent to CRISPR/HDR with NHEJ inhibitors. Although NHEJ is the dominant repair pathway associated with CRISPR-mediated double-strand breaks (DSBs), and biallelic gene knock-ins are common, NHEJ and biallelic gene knock-ins were not detected. Our results demonstrate that efficient targeted insertion of large DNA fragments without NHEJ inhibitors is possible, a result that should stimulate interest in understanding the mechanisms of high efficiency CRISPR targeting in general.
Bao, Zehua; Xiao, Han; Liang, Jing; Zhang, Lu; Xiong, Xiong; Sun, Ning; Si, Tong; Zhao, Huimin
2015-05-15
One-step multiple gene disruption in the model organism Saccharomyces cerevisiae is a highly useful tool for both basic and applied research, but it remains a challenge. Here, we report a rapid, efficient, and potentially scalable strategy based on the type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR associated proteins (Cas) system to generate multiple gene disruptions simultaneously in S. cerevisiae. A 100 bp dsDNA mutagenizing homologous recombination donor is inserted between two direct repeats for each target gene in a CRISPR array consisting of multiple donor and guide sequence pairs. An ultrahigh copy number plasmid carrying iCas9, a variant of wild-type Cas9, trans-encoded RNA (tracrRNA), and a homology-integrated crRNA cassette is designed to greatly increase the gene disruption efficiency. As proof of concept, three genes, CAN1, ADE2, and LYP1, were simultaneously disrupted in 4 days with an efficiency ranging from 27 to 87%. Another three genes involved in an artificial hydrocortisone biosynthetic pathway, ATF2, GCY1, and YPR1, were simultaneously disrupted in 6 days with 100% efficiency. This homology-integrated CRISPR (HI-CRISPR) strategy represents a powerful tool for creating yeast strains with multiple gene knockouts.
Song, Bing; Fan, Yong; He, Wenyin; Zhu, Detu; Niu, Xiaohua; Wang, Ding; Ou, Zhanhui; Luo, Min; Sun, Xiaofang
2015-05-01
The generation of beta-thalassemia (β-Thal) patient-specific induced pluripotent stem cells (iPSCs), subsequent homologous recombination-based gene correction of disease-causing mutations/deletions in the β-globin gene (HBB), and their derived hematopoietic stem cell (HSC) transplantation offers an ideal therapeutic solution for treating this disease. However, the hematopoietic differentiation efficiency of gene-corrected β-Thal iPSCs has not been well evaluated in previous studies. In this study, we used the latest gene-editing tool, clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9), to correct β-Thal iPSCs; the gene-corrected cells exhibited normal karyotypes and full pluripotency comparable to human embryonic stem cells (hESCs), and showed no off-target effects. Then, we evaluated the differentiation efficiency of the gene-corrected β-Thal iPSCs. We found that during hematopoietic differentiation, gene-corrected β-Thal iPSCs showed an increased embryoid body ratio and various hematopoietic progenitor cell percentages. More importantly, the gene-corrected β-Thal iPSC lines restored HBB expression and reduced reactive oxygen species production compared with the uncorrected group. Our study suggested that hematopoietic differentiation efficiency of β-Thal iPSCs was greatly improved once corrected by the CRISPR/Cas9 system, and the information gained from our study would greatly promote the clinical application of β-Thal iPSC-derived HSCs in transplantation.
Cell type-specific termination of transcription by transposable element sequences.
Conley, Andrew B; Jordan, I King
2012-09-30
Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3' UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family, and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young.
The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
Abramyan, Tigran M; Snyder, James A; Thyparambil, Aby A; Stuart, Steven J; Latour, Robert A
2016-08-05
Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Oriwol, Daniel; Trempa, Matthias; Sylla, Lamine; Leipner, Hartmut S.
2017-04-01
Dislocation clusters are the main crystal defects in multicrystalline silicon and are detrimental to solar cell efficiency. They form during silicon ingot casting due to the relaxation of strain energy. The evolution of the dislocation clusters was studied by means of automated analysis tools from standard wafer and cell production, giving information about the cluster development as a function of the ingot height. Because the whole wafer surface is observed, the point of view is macroscopic. It was found that the dislocations tend to build clusters of high density which usually expand in diameter as a function of ingot height. According to their structure, the dislocation clusters can be divided into light and dense clusters. The appearance of both types shows a clear dependence on the orientation of the grain growth direction. Additionally, a process of annihilation of dislocation clusters during the crystallization has been observed. To complement the macroscopic description, the dislocation clusters were also investigated by TEM. It is shown that the dislocations within the subgrain boundaries are closely arranged. Distances of 40-30 nm were found. These results lead to the conclusion that the dislocation density within the cluster structure is impossible to quantify by means of etch pit counting.
Dahms, Sven O.; Kuester, Miriam; Streb, Carsten; Roth, Christian; Sträter, Norbert; Than, Manuel E.
2013-01-01
Heavy-atom clusters (HA clusters) containing a large number of specifically arranged electron-dense scatterers are especially useful for experimental phase determination of large complex structures, weakly diffracting crystals or structures with large unit cells. Often, the determination of the exact orientation of the HA cluster and hence of the individual heavy-atom positions proves to be the critical step in successful phasing and subsequent structure solution. Here, it is demonstrated that molecular replacement (MR) with either anomalous or isomorphous differences is a useful strategy for the correct placement of HA cluster compounds. The polyoxometallate cluster hexasodium α-metatungstate (HMT) was applied in phasing the structure of death receptor 6. Even though the HA cluster is bound in alternate partially occupied orientations and is located at a special position, its correct localization and orientation could be determined at resolutions as low as 4.9 Å. The broad applicability of this approach was demonstrated for five different derivative crystals that included the compounds tantalum tetradecabromide and trisodium phosphotungstate in addition to HMT. The correct placement of the HA cluster depends on the length of the intramolecular vectors chosen for MR, such that both a larger cluster size and the optimal choice of the wavelength used for anomalous data collection strongly affect the outcome. PMID:23385464
Annotation of gene function in citrus using gene expression information and co-expression networks
2014-01-01
Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for gene co-expression analysis in citrus. PMID:25023870
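The guilt-by-association principle behind a guide-gene query can be sketched in a few lines. This is an illustration only, not the NICCE implementation: Pearson correlation, the 0.9 threshold, the gene names, and the expression values are all assumptions chosen for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_with(guide, expression, threshold=0.9):
    """Return genes whose profiles correlate with the guide gene at or above the threshold."""
    ref = expression[guide]
    hits = {}
    for gene, profile in expression.items():
        if gene == guide:
            continue
        r = pearson(ref, profile)
        if r >= threshold:
            hits[gene] = r
    # Strongest partners first.
    return sorted(hits, key=hits.get, reverse=True)


# Hypothetical expression matrix: one profile per gene across five conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with geneA
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],  # mirrors geneA
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],  # weakly related
}
partners = coexpressed_with("geneA", expression)
```

Genes returned for a guide of known function become candidates for the same or a related process; enrichment analysis on the resulting set is the usual next step.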
Turolla, Andrea; Sabatino, Raffaella; Fontaneto, Diego; Eckert, Ester M; Colinas, Noemi; Corno, Gianluca; Citterio, Barbara; Biavasco, Francesca; Antonelli, Manuela; Mauro, Alessandro; Mangiaterra, Gianmarco; Di Cesare, Andrea
2017-10-01
Peracetic acid (PAA) is an organic compound used efficiently as a disinfectant in wastewater treatment. Yet, at low doses it may cause selection; thus, the effect of low doses of PAA on Enterococcus faecium as a proxy of human-related microbial waste was evaluated. Bacteria were treated with increasing doses of PAA (from 0 to 25 mg L⁻¹ min) and incubated in regrowth experiments under non-growing, limiting conditions and under growing, favorable conditions. The changes in bacterial abundance, in bacterial phenotype (number and composition of small cell clusters), and in the abundance of an antibiotic resistance gene (ARG) were evaluated. The experiment demonstrated that the selected doses of PAA efficiently removed enterococci, and induced a long-lasting effect after PAA inactivation. The relative abundance of small clusters increased during the experiment when compared with that of the inoculum. Moreover, under growing, favorable conditions the relative abundance of small clusters decreased and the number of cells per cluster increased with increasing PAA doses. The measured ARG was highly stable, showing no effect during the whole experiment. The results demonstrated the feasibility of using low doses of PAA to inactivate bacteria. However, the stress induced by PAA disinfection promoted a bacterial adaptation, even if potentially without affecting the abundance of the ARG. Copyright © 2017 Elsevier Ltd. All rights reserved.
Clustering cancer gene expression data by projective clustering ensemble
Yu, Xianxue; Yu, Guoxian
2017-01-01
Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool to analyze gene expression data. Gene expression data is often characterized by a large number of genes but limited samples; thus, various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques so as to avoid the curse of dimensionality and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) compared with other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920
Kim, Anna J.; Takahashi, Lois; Wiebe, Douglas J.
2015-01-01
Objective Social determinants of health may be substantially affected by spatial factors, which together may explain the persistence of health inequities. Clustering of possible sources of negative health and social outcomes points to a spatial focus for future interventions. We analyzed the spatial clustering of sex work businesses in Southern California to examine where and why they cluster. We explored economic and legal factors as possible explanations of clustering. Methods We manually coded data from a website used by paying members to post reviews of female massage parlor workers. We identified clusters of sexually oriented massage parlor businesses using spatial autocorrelation tests. We conducted spatial regression using census tract data to identify predictors of clustering. Results A total of 889 venues were identified. Clusters of tracts having higher-than-expected numbers of sexually oriented massage parlors (“hot spots”) were located outside downtowns. These hot spots were characterized by a higher proportion of adult males, a higher proportion of households below the federal poverty level, and a smaller average household size. Conclusion Sexually oriented massage parlors in Los Angeles and Orange counties cluster in particular neighborhoods. More research is needed to ascertain the causal factors of such clusters and how interventions can be designed to leverage these spatial factors. PMID:26327731
Chen, Zeming; Liu, Fuyao; Chen, Yanke; Liu, Jun; Wang, Xiaoying; Chen, Ann T; Deng, Gang; Zhang, Hongyi; Liu, Jie; Hong, Zhangyong; Zhou, Jiangbing
2017-12-08
Due to its simplicity, versatility, and high efficiency, the clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 technology has emerged as one of the most promising approaches for treatment of a variety of genetic diseases, including human cancers. However, further translation of CRISPR/Cas9 for cancer gene therapy requires development of safe approaches for efficient, highly specific delivery of both Cas9 and single guide RNA to tumors. Here, a novel core-shell nanostructure, liposome-templated hydrogel nanoparticles (LHNPs), optimized for efficient codelivery of Cas9 protein and nucleic acids, is reported. It is demonstrated that, when coupled with the minicircle DNA technology, LHNPs deliver CRISPR/Cas9 with efficiency greater than the commercial agent Lipofectamine 2000 in cell culture and can be engineered for targeted inhibition of genes in tumors, including tumors in the brain. When CRISPR/Cas9 targeting a model therapeutic gene, polo-like kinase 1 (PLK1), is delivered, LHNPs effectively inhibit tumor growth and improve tumor-bearing mouse survival. The results suggest LHNPs as a versatile CRISPR/Cas9-delivery tool that can be adapted for experimentally studying the biology of cancer as well as for clinically translating cancer gene therapy.
Chromosomal arrangement of leghemoglobin genes in soybean.
Lee, J S; Brown, G G; Verma, D P
1983-01-01
A cluster of four different leghemoglobin (Lb) genes was isolated from AluI-HaeIII and EcoRI genomic libraries of soybean in a set of overlapping clones which together include 45 kilobases (kb) of contiguous DNA. These four genes, including a pseudogene, are present in the same orientation and are arranged in the order: 5'-Lba-Lbc1-Lb psi-Lbc3-3'. The intergenic regions average 2.5 kb. In addition to this main Lb locus, there are other Lb genes which do not appear to be contiguous to this locus. A sequence probably common to the 3' region of Lb loci was found flanking the Lbc3 gene. The 3' flanking region of the main Lb locus also contains a sequence that appears to be expressed more abundantly in root tissue. Another sequence which is primarily expressed in root and leaf is found 5' to two Lb loci. Overall, the main leghemoglobin locus is similar in structure to the mammalian globin gene loci. PMID:6310504
Kawamura, M; Wright, F A C; Declerck, D; Freire, M C M; Hu, D Y; Honkala, E; Lévy, G; Kalwitzki, M; Polychronopoulou, A; Yip, H K; Kinirons, M J; Eli, I; Petti, S; Komabayashi, T; Kim, K J; Razak, A A A; Srisilapanan, P; Kwan, S Y L
2005-08-01
To identify similarities and differences in oral health attitudes, behaviour and values among freshman dental students. Cross-cultural survey of dental students. 18 cultural areas. 904 first-year dental students completed the Hiroshima University-Dental Behavioural Inventory (HU-DBI) translated into their own languages. Individual areas were clustered by similarity in responses to the questions. The first group displayed an 'occidental-culture orientation' with the exception of Brazil (Cluster 1 comprised: Australia, United Kingdom, Ireland, Belgium and Brazil, Cluster 2: Germany, Italy, Finland and France). The second group displayed an 'oriental-cultural orientation' with the exception of Greece and Israel (Cluster 3 comprised: China and Indonesia, and Cluster 4: Japan, Korea, Israel, Hong Kong, Malaysia, Thailand and Greece). Australia and United Kingdom were the countries that were most alike. Ireland was the 'neighbour' to these countries. Greece and Malaysia had similar patterns of oral health behaviour although geographic conditions are very different. Although it was considered that in Hong Kong, occidental nations have affected the development of education, it remained in the oriental-culture group. Comparison with the data from the occidentals indicates that a higher percentage of the orientals put off going to the dentist until they have toothache (p < 0.001). Only a small proportion of the occidentals (8%) reported a perception of inevitability in having false teeth, whereas 33% of the orientals held this fatalistic belief (p = 0.001). Grouping the countries into key cultural orientations and international clusters yielded plausible results, using the HU-DBI.
Liu, Li Xue; Li, Qin Qin; Zhang, Yun Zeng; Hu, Yue; Jiao, Jian; Guo, Hui Juan; Zhang, Xing Xing; Zhang, Biliang; Chen, Wen Xin; Tian, Chang Fu
2017-12-01
Receiving nodulation and nitrogen fixation genes does not guarantee rhizobia an effective symbiosis with legumes. Here, variations in gene content were determined for three Sinorhizobium species showing contrasting symbiotic efficiency on soybeans. A nitrate-reduction gene cluster absent in S. sojae was found to be essential for symbiotic adaptations of S. fredii and S. sp. III. In S. fredii, the deletion mutation of nap (nitrate reductase), but not of nir (nitrite reductase) or nor (nitric oxide reductase), led to defects in nitrogen fixation (Fix⁻). By contrast, none of these core nitrate-reduction genes were required for the symbiosis of S. sp. III. However, within the same gene cluster, the deletion of hemN1 (encoding oxygen-independent coproporphyrinogen III oxidase) in both S. fredii and S. sp. III led to the formation of nitrogen-fixing (Fix⁺) but ineffective (Eff⁻) nodules. These Fix⁺/Eff⁻ nodules were characterized by significantly lower enzyme activity of glutamine synthetase, indicating rhizobial modulation of nitrogen assimilation by plants. A distant homologue of HemN1 from S. sojae can complement this defect in S. fredii and S. sp. III, but exhibited a more pleiotropic role in symbiosis establishment. These findings highlighted the lineage-dependent optimization of symbiotic functions in different rhizobial species associated with the same host. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
Multiplexed CRISPR/Cas9 Genome Editing and Gene Regulation Using Csy4 in Saccharomyces cerevisiae.
Ferreira, Raphael; Skrekas, Christos; Nielsen, Jens; David, Florian
2018-01-19
Clustered regularly interspaced short palindromic repeats (CRISPR) technology has greatly accelerated the field of strain engineering. However, insufficient efforts have been made toward developing robust multiplexing tools in Saccharomyces cerevisiae. Here, we exploit the RNA processing capacity of the bacterial endoribonuclease Csy4 from Pseudomonas aeruginosa to generate multiple gRNAs from a single transcript for genome editing and gene interference applications in S. cerevisiae. With regard to genome editing, we performed a quadruple deletion of FAA1, FAA4, POX1 and TES1, reaching 96% efficiency out of 24 colonies tested. Then, we used this system to efficiently regulate the transcription of three genes, OLE1, HMG1 and ACS1. Thus, we demonstrate that multiplexed genome editing and gene regulation can be performed in a fast and effective manner using Csy4.
Hensman, James; Lawrence, Neil D; Rattray, Magnus
2013-08-20
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
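The abstract does not spell out the covariance structure, so the following is a sketch under stated assumptions: a two-level hierarchy in which every replicate shares a gene-level function and adds its own deviation, both with squared-exponential kernels. The kernel choice, hyperparameter values, and function names are hypothetical. Because the covariance is defined for any pair of time points, replicates need not be sampled on the same grid.

```python
from math import exp


def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential kernel between two time points."""
    return variance * exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)


def hierarchical_cov(obs, gene_kernel, replicate_kernel):
    """Covariance matrix for observations given as (replicate_id, time) pairs.

    Every pair shares the gene-level kernel; pairs from the same replicate
    additionally share the replicate-level kernel.
    """
    K = []
    for r1, t1 in obs:
        row = []
        for r2, t2 in obs:
            k = gene_kernel(t1, t2)
            if r1 == r2:
                k += replicate_kernel(t1, t2)
            row.append(k)
        K.append(row)
    return K


# Two replicates sampled at different, irregular times (hypothetical values).
obs = [(0, 0.0), (0, 1.5), (1, 0.0), (1, 4.0)]
K = hierarchical_cov(
    obs,
    gene_kernel=lambda a, b: rbf(a, b, variance=1.0, lengthscale=2.0),
    replicate_kernel=lambda a, b: rbf(a, b, variance=0.3, lengthscale=1.0),
)
```

Here `K[0][2]` holds only the gene-level term because the two observations come from different replicates, while the diagonal carries both levels. A third level (for example, cluster above gene) would add one more kernel gated on shared cluster membership.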
Wei, Dong; Tian, Chuan-Bei; Liu, Shi-Huo; Wang, Tao; Smagghe, Guy; Jia, Fu-Xian; Dou, Wei; Wang, Jin-Jun
2016-06-01
In the male reproductive system of insects, the male accessory glands and ejaculatory duct (MAG/ED) are important organs and their primary function is to enhance the fertility of spermatozoa. Proteins secreted by the MAG/ED are also known to induce post-mating changes and immunity responses in the female insect. To understand the gene expression profile in the MAG/ED of the oriental fruit fly Bactrocera dorsalis (Hendel), an important pest of fruit, we performed Illumina-based deep sequencing of mRNA. This yielded 54,577,630 clean reads corresponding to 4.91 Gb total nucleotides that were assembled and clustered into 30,669 unigenes (average 645 bp). Among them, 20,419 unigenes were functionally annotated to known proteins/peptides in the Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes pathway databases. Many genes were involved in immunity, including microbial recognition proteins and antimicrobial peptides. Subsequently, the inducible expression of these immunity-related genes was confirmed by qRT-PCR analysis when insects were challenged with immunity-inducible factors, suggesting their function in guaranteeing fertilization success. In addition, we identified some important reproductive genes such as juvenile hormone- and ecdysteroid-related genes in this de novo assembly. In conclusion, this transcriptomic sequencing of B. dorsalis MAG/ED provides insights to facilitate further functional research of reproduction, immunity and molecular evolution of reproductive proteins in this important agricultural pest.
On the three-quarter view advantage of familiar object recognition.
Nonose, Kohei; Niimi, Ryosuke; Yokosawa, Kazuhiko
2016-11-01
A three-quarter view, i.e., an oblique view, of familiar objects often leads to a higher subjective goodness rating when compared with other orientations. What is the source of the high goodness for oblique views? First, we confirmed that object recognition performance was also best for oblique views around 30° view, even when the foreshortening disadvantage of front- and side-views was minimized (Experiments 1 and 2). In Experiment 3, we measured subjective ratings of view goodness and two possible determinants of view goodness: familiarity of view, and subjective impression of three-dimensionality. Three-dimensionality was measured as the subjective saliency of visual depth information. The oblique views were rated best, most familiar, and as approximating greatest three-dimensionality on average; however, the cluster analyses showed that the "best" orientation systematically varied among objects. We found three clusters of objects: front-preferred objects, oblique-preferred objects, and side-preferred objects. Interestingly, recognition performance and the three-dimensionality rating were higher for oblique views irrespective of the clusters. It appears that recognition efficiency is not the major source of the three-quarter view advantage. There are multiple determinants and variability among objects. This study suggests that the classical idea that a canonical view has a unique advantage in object perception requires further discussion.
Preclinical Evaluation of An Anti-HCV miRNA Cluster for Treatment of HCV Infection
Yang, Xiao; Marcucci, Katherine; Anguela, Xavier; Couto, Linda B.
2013-01-01
We developed a strategy to treat hepatitis C virus (HCV) infection by replacing five endogenous microRNA (miRNA) sequences of a natural miRNA cluster (miR-17–92) with sequences that are complementary to the HCV genome. This miRNA cluster (HCV-miR-Cluster 5) is delivered to cells using adeno-associated virus (AAV) vectors and the miRNAs are expressed in the liver, the site of HCV replication and assembly. AAV-HCV-miR-Cluster 5 inhibited bona fide HCV replication in vitro by up to 95% within 2 days, and the spread of HCV to uninfected cells was prevented by continuous expression of the anti-HCV miRNAs. Furthermore, the number of cells harboring HCV RNA replicons decreased dramatically by sustained expression of the anti-HCV miRNAs, suggesting that the vector is capable of curing cells of HCV. Delivery of AAV-HCV-miR-Cluster 5 to mice resulted in efficient transfer of the miRNA gene cluster and expression of all five miRNAs in liver tissue, at levels up to 1,300 copies/cell. These levels achieved up to 98% gene silencing of cognate HCV sequences, and no liver toxicity was observed, supporting the safety of this approach. Therefore, AAV-HCV-miR-Cluster 5 represents a different paradigm for the treatment of HCV infection. PMID:23295950
Ten billion years of brightest cluster galaxy alignments
NASA Astrophysics Data System (ADS)
West, Michael J.; de Propris, Roberto; Bremer, Malcolm N.; Phillipps, Steven
2017-07-01
A galaxy's orientation is one of its most basic observable properties. Astronomers once assumed that galaxies are randomly oriented in space; however, it is now clear that some have preferred orientations with respect to their surroundings. Chief among these are giant elliptical galaxies found in the centres of rich galaxy clusters. Numerous studies have shown that the major axes of these galaxies often share the same orientation as the surrounding matter distribution on larger scales [1-6]. Using Hubble Space Telescope observations of 65 distant galaxy clusters, we show that similar alignments are seen at earlier epochs when the Universe was only one-third of its current age. These results suggest that the brightest galaxies in clusters are the product of a special formation history, one influenced by development of the cosmic web over billions of years.
Evidence for an ergot alkaloid gene cluster in Claviceps purpurea.
Tudzynski, P; Hölter, K; Correia, T; Arntz, C; Grammel, N; Keller, U
1999-02-01
A gene (cpd1) coding for the dimethylallyltryptophan synthase (DMATS) that catalyzes the first specific step in the biosynthesis of ergot alkaloids, was cloned from a strain of Claviceps purpurea that produces alkaloids in axenic culture. The derived gene product (CPD1) shows only 70% similarity to the corresponding gene previously isolated from Claviceps strain ATCC 26245, which is likely to be an isolate of C. fusiformis. Therefore, the related cpd1 most probably represents the first C. purpurea gene coding for an enzymatic step of the alkaloid biosynthetic pathway to be cloned. Analysis of the 3'-flanking region of cpd1 revealed a second, closely linked ergot alkaloid biosynthetic gene named cpps1, which codes for a 356-kDa polypeptide showing significant similarity to fungal modular peptide synthetases. The protein contains three amino acid-activating modules, and in the second module a sequence is found which matches that of an internal peptide (17 amino acids in length) obtained from a tryptic digest of lysergyl peptide synthetase 1 (LPS1) of C. purpurea, thus confirming that cpps1 encodes LPS1. LPS1 activates the three amino acids of the peptide portion of ergot peptide alkaloids during D-lysergyl peptide assembly. Chromosome walking revealed the presence of additional genes upstream of cpd1 which are probably also involved in ergot alkaloid biosynthesis: cpox1 probably codes for an FAD-dependent oxidoreductase (which could represent the chanoclavine cyclase), and a second putative oxidoreductase gene, cpox2, is closely linked to it in inverse orientation. RT-PCR experiments confirm that all four genes are expressed under conditions of peptide alkaloid biosynthesis. These results strongly suggest that at least some genes of ergot alkaloid biosynthesis in C. purpurea are clustered, opening the way for a detailed molecular genetic analysis of the pathway.
Liu, Yong; Wei, Wen-Ping; Ye, Bang-Ce
2018-05-18
The overexpression of bacterial secondary metabolite biosynthetic enzymes is the basis for industrial overproducing strains. Genome editing tools can be used to further improve gene expression and yield. Saccharopolyspora erythraea produces erythromycin, which has extensive clinical applications. In this study, the CRISPR-Cas9 system was used to edit genes in the S. erythraea genome. A temperature-sensitive plasmid containing the PermE promoter, to drive Cas9 expression, and the Pj23119 and PkasO promoters, to drive sgRNAs, was designed. Erythromycin esterase, encoded by S. erythraea SACE_1765, inactivates erythromycin by hydrolyzing the macrolactone ring. Sequencing and qRT-PCR confirmed that reporter genes were successfully inserted into the SACE_1765 gene. Deletion of SACE_1765 in a high-producing strain resulted in a 12.7% increase in erythromycin levels. Subsequent PermE-egfp knock-in at the SACE_0712 locus resulted in an 80.3% increase in erythromycin production compared with that of wild type. Further investigation showed that PermE promoter knock-in activated the erythromycin biosynthetic gene clusters at the SACE_0712 locus. Additionally, deletion of indA (SACE_1229) using dual sgRNA targeting without markers increased the editing efficiency to 65%. In summary, we have successfully applied Cas9-based genome editing to a bacterial strain, S. erythraea, with a high GC content. This system has potential application for both genome-editing and biosynthetic gene cluster activation in Actinobacteria.
Pan, Hung-Yin; Chen, Carton W; Huang, Chih-Hung
2018-04-17
Soil bacteria Streptomyces are the most important producers of secondary metabolites, including most known antibiotics. These bacteria and their close relatives are unique in possessing linear chromosomes, which typically harbor 20 to 30 biosynthetic gene clusters of tens to hundreds of kb in length. Many Streptomyces chromosomes are accompanied by linear plasmids with sizes ranging from several to several hundred kb. The large linear plasmids also often contain biosynthetic gene clusters. We have developed a targeted recombination procedure for arm exchanges between a linear plasmid and a linear chromosome. A chromosomal segment inserted in an artificially constructed plasmid allows homologous recombination between the two replicons at the homology. Depending on the design, the recombination may result in two recombinant replicons or a single recombinant chromosome with the loss of the recombinant plasmid that lacks a replication origin. The efficiency of such targeted recombination ranges from 9 to 83% depending on the locations of the homology (and thus the size of the chromosomal arm exchanged), essentially eliminating the necessity of selection. The targeted recombination is useful for the efficient engineering of the Streptomyces genome for large-scale deletion, addition, and shuffling.
ERIC Educational Resources Information Center
Ohio State Dept. of Education, Columbus.
Skills to be developed by junior high school students (grades 7-8) along with activities and procedures for achieving desired performance objectives for each of the 15 U.S. Office of Education (USOE) occupational clusters are outlined in this career orientation guide, designed to implement the second phase (career orientation) of Ohio's…
Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko
2012-07-15
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or by neighboring positions in the immediate vicinity of one another on the chromosome. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
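The comparison of neighboring-gene correlations against random gene pairs can be sketched as follows. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step here is an assumption. The function names, the empirical p-value formula and the toy expression profiles are also hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def neighbor_correlations(expr):
    """Correlation of each gene with its next neighbor; expr is in chromosomal order."""
    return [pearson(expr[i], expr[i + 1]) for i in range(len(expr) - 1)]

def empirical_pvalue(observed, null_values):
    """Fraction of random-pair correlations at least as large as the observed one."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

def benjamini_hochberg(pvalues, fdr=0.1):
    """Indices of hypotheses rejected at the given FDR (assumed procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff = rank  # largest rank that still passes its threshold
    return sorted(order[:cutoff])

# Toy profiles for three adjacent genes measured under four conditions.
expr = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1]]
r = neighbor_correlations(expr)
```

In a full analysis the null distribution would come from correlations of many randomly drawn gene pairs, and runs of adjacent significant pairs would then be merged into candidate clusters.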
The Psychology of Yoga Practitioners: A Cluster Analysis.
Genovese, Jeremy E C; Fondran, Kristine M
2017-11-01
Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.
The Glucuronic Acid Utilization Gene Cluster from Bacillus stearothermophilus T-6
Shulami, Smadar; Gat, Orit; Sonenshein, Abraham L.; Shoham, Yuval
1999-01-01
A λ-EMBL3 genomic library of Bacillus stearothermophilus T-6 was screened for hemicellulolytic activities, and five independent clones exhibiting β-xylosidase activity were isolated. The clones overlap each other and together represent a 23.5-kb chromosomal segment. The segment contains a cluster of xylan utilization genes, which are organized in at least three transcriptional units. These include the gene for the extracellular xylanase, xylanase T-6; part of an operon coding for an intracellular xylanase and a β-xylosidase; and a putative 15.5-kb-long transcriptional unit, consisting of 12 genes involved in the utilization of α-d-glucuronic acid (GlcUA). The first four genes in the potential GlcUA operon (orf1, -2, -3, and -4) code for a putative sugar transport system with characteristic components of the binding-protein-dependent transport systems. The most likely natural substrate for this transport system is aldotetraouronic acid [2-O-α-(4-O-methyl-α-d-glucuronosyl)-xylotriose] (MeGlcUAXyl3). The following two genes code for an intracellular α-glucuronidase (aguA) and a β-xylosidase (xynB). Five more genes (kdgK, kdgA, uxaC, uxuA, and uxuB) encode proteins that are homologous to enzymes involved in galacturonate and glucuronate catabolism. The gene cluster also includes a potential regulatory gene, uxuR, the product of which resembles repressors of the GntR family. The apparent transcriptional start point of the cluster was determined by primer extension analysis and is located 349 bp from the initial ATG codon. The potential operator site is a perfect 12-bp inverted repeat located downstream from the promoter between nucleotides +170 and +181. Gel retardation assays indicated that UxuR binds specifically to this sequence and that this binding is efficiently prevented in vitro by MeGlcUAXyl3, the most likely molecular inducer. PMID:10368143
Transcriptional organization of the DNA region controlling expression of the K99 gene cluster.
Roosendaal, B; Damoiseaux, J; Jordi, W; de Graaf, F K
1989-01-01
The transcriptional organization of the K99 gene cluster was investigated in two ways. First, the DNA region, containing the transcriptional signals was analyzed using a transcription vector system with Escherichia coli galactokinase (GalK) as assayable marker and second, an in vitro transcription system was employed. A detailed analysis of the transcription signals revealed that a strong promoter PA and a moderate promoter PB are located upstream of fanA and fanB, respectively. No promoter activity was detected in the intercistronic region between fanB and fanC. Factor-dependent terminators of transcription were detected and are probably located in the intercistronic region between fanA and fanB (T1), and between fanB and fanC (T2). A third terminator (T3) was observed between fanC and fanD and has an efficiency of 90%. Analysis of the regulatory region in an in vitro transcription system confirmed the location of the respective transcription signals. A model for the transcriptional organization of the K99 cluster is presented. Indications were obtained that the trans-acting regulatory polypeptides FanA and FanB both function as anti-terminators. A model for the regulation of expression of the K99 gene cluster is postulated.
Molecular evidence of Burkholderia pseudomallei genotypes based on geographical distribution.
Zulkefli, Noorfatin Jihan; Mariappan, Vanitha; Vellasamy, Kumutha Malar; Chong, Chun Wie; Thong, Kwai Lin; Ponnampalavanar, Sasheela; Vadivelu, Jamuna; Teh, Cindy Shuan Ju
2016-01-01
Background. Central intermediary metabolism (CIM) in bacteria is defined as a set of metabolic biochemical reactions within a cell, which is essential for the cell to survive in response to environmental perturbations. The genes associated with CIM are commonly found in both pathogenic and non-pathogenic strains. As these genes are involved in vital metabolic processes of bacteria, we explored the efficiency of the genes in genotypic characterization of Burkholderia pseudomallei isolates, compared with the established pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing (MLST) schemes. Methods. Nine previously sequenced B. pseudomallei isolates from Malaysia were characterized by PFGE, MLST and CIM genes. The isolates were later compared to the other 39 B. pseudomallei strains, retrieved from GenBank, using both MLST and sequence analysis of CIM genes. UniFrac and hierarchical clustering analyses were performed using the results generated by both MLST and sequence analysis of CIM genes. Results. Genetic relatedness of nine Malaysian B. pseudomallei isolates and the other 39 strains was investigated. The nine Malaysian isolates were subtyped into six PFGE profiles, four MLST profiles and five sequence types based on CIM genes alignment. All methods demonstrated the clonality of OB and CB as well as CMS and THE. However, PFGE showed less than 70% similarity between a pair of morphology variants, OS and OB. In contrast, OS was identical to the soil isolate, MARAN. To have a better understanding of the genetic diversity of B. pseudomallei worldwide, we further aligned the sequences of genes used in MLST and genes associated with CIM for the nine Malaysian isolates and 39 B. pseudomallei strains from the NCBI database. Overall, based on the CIM genes, the strains were subtyped into 33 profiles where the majority of the strains from Asian countries were clustered together. On the other hand, MLST resolved the isolates into 31 profiles which formed three clusters. Hierarchical clustering using UniFrac distance suggested that the isolates from Australia were genetically distinct from the Asian isolates. Nevertheless, statistically significant differences were detected between isolates from Malaysia, Thailand and Australia. Discussion. Overall, PFGE showed higher discriminative power in clustering the nine Malaysian B. pseudomallei isolates and indicated its suitability for localized epidemiological study. Compared to MLST, CIM genes showed higher resolution in distinguishing those non-related strains and better clustering of strains from different geographical regions. A closer genetic relatedness of Malaysian isolates with all Asian strains in comparison to Australian strains was observed. This finding was supported by UniFrac analysis which resulted in geographical segregation between Australia and the Asian countries.
Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering
Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Background: Gravitation field algorithm (GFA) is a new optimization algorithm based on an imitation of natural phenomena. GFA performs well in searching for both a global minimum and multiple minima in computational biology. However, GFA needs to be improved to increase its efficiency, and modified so that it can be applied to some discrete data problems in systems biology. Method: An improved GFA, called IGFA, is proposed in this paper. Two parts were improved in IGFA. The first is the rule of random division, which is a reasonable strategy and shortens the running time. The other is the rotation factor, which can improve the accuracy of IGFA. To apply IGFA to hierarchical clustering, the initial part and the movement operator were modified. Results: Two kinds of experiments were used to test IGFA, and IGFA was applied to hierarchical clustering. The global minimum experiment was run with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). The multi-minima experiment was run with IGFA and GFA. The results of the two experiments were compared and demonstrated the efficiency of IGFA. IGFA is better than GFA in both accuracy and running time. For hierarchical clustering, IGFA was used to optimize the smallest distance between gene pairs, and the results were compared with GA and SA, single-linkage clustering and UPGMA. The efficiency of IGFA is demonstrated. PMID:23173043
[Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].
Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong
2015-11-01
The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations to gene coding region; to knock in specific sequences (e.g. FLAG tag DNA sequence) to targeted genomic locus via homology directed repair; to induce large genomic deletion through dual-guide multiplex. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.
USDA-ARS's Scientific Manuscript database
Among the various genome editing tools available for functional genomic studies, reagents based on clustered regularly interspaced short palindromic repeats (CRISPR) have gained popularity due to their ease of use and versatility. CRISPR reagents consist of ribonucleoprotein (RNP) complexes formed by combining guide RNA...
Are Early Somatic Embryos of the Norway Spruce (Picea abies (L.) Karst.) Organised?
Petrek, Jiri; Zitka, Ondrej; Adam, Vojtech; Bartusek, Karel; Anjum, Naser A.; Pereira, Eduarda; Havel, Ladislav; Kizek, Rene
2015-01-01
Background Somatic embryogenesis in conifer species has great potential for the forestry industry. Hence, a number of methods have been developed for their efficient and rapid propagation through somatic embryogenesis. Although information is available regarding the previous process-mediated generation of embryogenic cells to form somatic embryos, there is a dearth of information in the literature on the detailed structure of these clusters. Methodology/Principal Findings The main aim of this study was to provide a more detailed structure of the embryogenic tissue clusters obtained through the in vitro propagation of the Norway spruce (Picea abies (L.) Karst.). We primarily focused on the growth of early somatic embryos (ESEs). The data on ESE growth suggested that there may be clear distinctions between their inner and outer regions. Therefore, we selected ESEs collected on the 56th day after sub-cultivation to dissect the homogeneity of the ESE clusters. Two colourimetric assays (acetocarmine and fluorescein diacetate/propidium iodide staining) and one metabolic assay based on the use of 2,3,5-triphenyltetrazolium chloride uncovered large differences in the metabolic activity inside the cluster. Next, we performed nuclear magnetic resonance measurements. The ESE cluster seemed to be compactly aggregated during the first four weeks of cultivation; thereafter, the difference between the 1H nuclei concentration in the inner and outer clusters was more evident. There were clear differences in the visual appearance of embryos from the outer and inner regions. Finally, a cluster was divided into six parts (three each from the inner and the outer regions of the embryo) to determine their growth and viability. The innermost embryos (centripetally towards the cluster centre) could grow after sub-cultivation but exhibited the slowest rate and required the longest time to reach the common growth rate. 
To confirm our hypothesis on the organisation of the ESE cluster, we investigated the effect of cluster orientation on the cultivation medium and the influence of the change of the cluster’s three-dimensional orientation on its development. Maintaining the same position when transferring ESEs into new cultivation medium seemed to be necessary because changes in the orientation significantly affected ESE growth. Conclusions and Significance This work illustrated the possible inner organisation of ESEs. The outer layer of ESEs is formed by individual somatic embryos with high metabolic activity (and with high demands for nutrients, oxygen and water), while an embryonal group is directed outside of the ESE cluster. Somatic embryos with depressed metabolic activity were localised in the inner regions, where these embryonic tissues probably have a very important transport function. PMID:26624287
Crystal structures of the NO sensor NsrR reveal how its iron-sulfur cluster modulates DNA binding
NASA Astrophysics Data System (ADS)
Volbeda, Anne; Dodd, Erin L.; Darnault, Claudine; Crack, Jason C.; Renoux, Oriane; Hutchings, Matthew I.; Le Brun, Nick E.; Fontecilla-Camps, Juan C.
2017-04-01
NsrR from Streptomyces coelicolor (Sc) regulates the expression of three genes through the progressive degradation of its [4Fe-4S] cluster on nitric oxide (NO) exposure. We report the 1.95 Å resolution crystal structure of dimeric holo-ScNsrR and show that the cluster is coordinated by the three invariant Cys residues from one monomer and, unexpectedly, Asp8 from the other. A cavity map suggests that NO displaces Asp8 as a cluster ligand and, while D8A and D8C variants remain NO sensitive, DNA binding is affected. A structural comparison of holo-ScNsrR with an apo-IscR-DNA complex shows that the [4Fe-4S] cluster stabilizes a turn between ScNsrR Cys93 and Cys99 properly oriented to interact with the DNA backbone. In addition, an apo ScNsrR structure suggests that Asn97 from this turn, along with Arg12, which forms a salt-bridge with Asp8, are instrumental in modulating the position of the DNA recognition helix region relative to its major groove.
[The Russian gene pool: gene geography of Alu-insertions (ACE, APOA1, B65, PV92, TPA25)].
Solov'eva, D S; Balanovskaia, E V; Kuznetsova, M A; Vasinskaia, O A; Frolova, S A; Pocheshkhova, E A; Evseeva, I V; Boldyreva, M N; Balanovskiĭ, O P
2010-01-01
The analysis of five Alu insertion loci (ACE, APOA1, B65, PV92, TPA25) has been carried out for the first time in 10 Russian populations (1088 individuals), covering all parts of the historical area of the Russian ethnos. Depending on the locus, Russian populations exhibit similarity with their western neighbors (European populations) or with their eastern neighbors (populations of the Ural region). Considering frequencies of the studied Alu insertions, the Russian gene pool exhibits low variation: the average difference between populations is d = 0.007, whereas for classical markers, mtDNA and the Y chromosome the heterogeneity of the Russian gene pool is substantially higher (0.013, 0.033 and 0.142, respectively). Therefore, this set of five Alu insertions has lower variability at the intra-ethnic level. However, in inter-ethnic comparisons a clear pattern was obtained: 13 Eastern European ethnic groups formed three clusters (East Slavic, Caucasian and South Ural) in accordance with their historical and geographical position. The obtained data confirm the efficiency of using Alu insertions for studying genetic differentiation and the history of the gene pool of the Eastern European populations.
Fractal Clustering and Knowledge-driven Validation Assessment for Gene Expression Profiling.
Wang, Lu-Yong; Balasubramanian, Ammaiappan; Chakraborty, Amit; Comaniciu, Dorin
2005-01-01
DNA microarray experiments generate a substantial amount of information about global gene expression. Gene expression profiles can be represented as points in multi-dimensional space. It is essential to identify relevant groups of genes in biomedical research. Clustering is helpful in pattern recognition in gene expression profiles. A number of clustering techniques have been introduced. However, these traditional methods mainly rely on shape-based assumptions or some distance metric to cluster the points in multi-dimensional linear Euclidean space. Their results show poor consistency with the functional annotation of genes in previous validation studies. From a different perspective, we propose a fractal clustering method to cluster genes using the intrinsic (fractal) dimension from modern geometry. This method clusters points in such a way that points in the same cluster are more self-affine among themselves than to the points in other clusters. We assess this method using annotation-based validation assessment for gene clusters. The assessment shows that this method is superior to other traditional methods in identifying functionally related gene groups.
Influence of oxygen on the chemical stage of radiobiological mechanism
NASA Astrophysics Data System (ADS)
Barilla, Jiří; Lokajíček, Miloš V.; Pisaková, Hana; Simr, Pavel
2016-07-01
The simulation of the chemical stage of the radiobiological mechanism may be very helpful in studying the radiobiological effect of ionizing radiation, in which the water radical clusters formed by the densely ionizing ends of primary or secondary charged particles may form DSBs that damage DNA molecules in living cells. It is possible to study not only the efficiency of individual radicals but also the influence of other species or radiomodifiers (mainly oxygen) present in the water medium during irradiation. The mathematical model based on Continuous Petri nets (proposed by us recently) will be described. It makes it possible to analyze two main processes running at the same time: chemical radical reactions and the diffusion of radical clusters formed during energy transfer. One may study the time change of radical concentrations due to the chemical reactions running during the diffusion process. Some indicative results concerning the efficiency of individual radicals in DSB formation (in the case of Co60 radiation) will be presented; the influence of oxygen present in the water medium during irradiation will be shown, too.
Swarnkar, Mohit Kumar; Vyas, Pratibha; Rahi, Praveen; Thakur, Rishu; Thakur, Namika; Singh, Anil Kumar
2015-01-01
The complete genome sequence of 6.45 Mb is reported here for Pseudomonas trivialis strain IHBB745 (MTCC 5336), which is an efficient, stress-tolerant, and broad-spectrum plant growth-promoting rhizobacterium. The gene-coding clusters predicted the genes for phosphate solubilization, siderophore production, 1-aminocyclopropane-1-carboxylate (ACC) deaminase activity, indole-3-acetic acid (IAA) production, and stress response. PMID:26337878
Finding gene clusters for a replicated time course study
2014-01-01
Background Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. Findings In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656
Multiconstrained gene clustering based on generalized projections
2010-01-01
Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints into an optimal clustering solution remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence belonging to more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for soft clustering solutions. PMID:20356386
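The projection idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' MGC method: the two constraint sets (a box and a hyperplane), the function names, and the starting point are all assumptions chosen so that each projection has a closed form. The sketch only shows how cyclic projection onto convex sets moves an initial guess into their intersection.

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box {x : lo <= x <= hi}."""
    return np.clip(x, lo, hi)

def project_hyperplane(x, a, b):
    """Euclidean projection onto the hyperplane {x : a.x = b}."""
    return x - (a @ x - b) / (a @ a) * a

def pocs(x0, projections, n_iter=200, tol=1e-10):
    """Cyclically apply each projection until the iterate stops moving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x_prev = x
        for proj in projections:
            x = proj(x)
        if np.linalg.norm(x - x_prev) < tol:
            break
    return x

# Hypothetical toy constraints: soft memberships in [0, 1] that sum to 1.
a = np.ones(3)
x = pocs([2.0, -1.0, 0.5],
         [lambda v: project_box(v, 0.0, 1.0),
          lambda v: project_hyperplane(v, a, 1.0)])
```

Because both sets are convex and intersect, the iterates converge to a point that satisfies both constraints; with non-convex constraints (which a generalized projection framework may allow) that guarantee no longer holds in general.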
Neurons in cat V1 show significant clustering by degree of tuning
Ziskind, Avi J.; Emondi, Al A.; Kurgansky, Andrei V.; Rebrik, Sergei P.
2015-01-01
Neighboring neurons in cat primary visual cortex (V1) have similar preferred orientation, direction, and spatial frequency. How diverse is their degree of tuning for these properties? To address this, we used single-tetrode recordings to simultaneously isolate multiple cells at single recording sites and record their responses to flashed and drifting gratings of multiple orientations, spatial frequencies, and, for drifting gratings, directions. Orientation tuning width, spatial frequency tuning width, and direction selectivity index (DSI) all showed significant clustering: pairs of neurons recorded at a single site were significantly more similar in each of these properties than pairs of neurons from different recording sites. The strength of the clustering was generally modest. The percent decrease in the median difference between pairs from the same site, relative to pairs from different sites, was as follows: for different measures of orientation tuning width, 29–35% (drifting gratings) or 15–25% (flashed gratings); for DSI, 24%; and for spatial frequency tuning width measured in octaves, 8% (drifting gratings). The clusterings of all of these measures were much weaker than for preferred orientation (68% decrease) but comparable to that seen for preferred spatial frequency in response to drifting gratings (26%). For the above properties, little difference in clustering was seen between simple and complex cells. In studies of spatial frequency tuning to flashed gratings, strong clustering was seen among simple-cell pairs for tuning width (70% decrease) and preferred frequency (71% decrease), whereas no clustering was seen for simple-complex or complex-complex cell pairs. PMID:25652921
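The clustering measure quoted above (percent decrease in the median difference for same-site pairs relative to different-site pairs) can be sketched in Python as follows. The function name and the toy tuning widths are hypothetical, and the study's actual pairing and statistical testing procedures are not reproduced here.

```python
import numpy as np
from itertools import combinations

def percent_decrease_in_median_difference(values, sites):
    """Compare |difference| for same-site pairs against different-site pairs.

    Returns 100 * (1 - median_same / median_diff); larger values mean
    stronger clustering of the property within recording sites.
    """
    same, diff = [], []
    for i, j in combinations(range(len(values)), 2):
        d = abs(values[i] - values[j])
        (same if sites[i] == sites[j] else diff).append(d)
    return 100.0 * (1.0 - np.median(same) / np.median(diff))

# Hypothetical tuning widths (degrees) for six cells at three recording sites.
widths = [20.0, 22.0, 40.0, 44.0, 60.0, 63.0]
sites = ["A", "A", "B", "B", "C", "C"]
score = percent_decrease_in_median_difference(widths, sites)
```

A score near 0 would indicate that cells at the same site are no more alike than cells at different sites.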
Role of higher-multipole deformations in exotic ¹⁴C cluster radioactivity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sawhney, Gudveen; Sharma, Manoj K.; Gupta, Raj K.
2011-06-15
We have studied nine cases of spontaneous emission of ¹⁴C clusters in the ground-state decays of the same number of parent nuclei from the trans-lead region, specifically from ²²¹Fr to ²²⁶Th, using the preformed cluster model (PCM) of Gupta and collaborators, with choices of spherical, quadrupole deformation (β₂) alone, and higher-multipole deformations (β₂, β₃, β₄) with cold "compact" orientations θ^c of decay products. The calculated ¹⁴C cluster decay half-life times are found to be in nice agreement with experimental data only for the case of higher-multipole deformations (β₂-β₄) and θ^c orientations of cold elongated configurations. In other words, compared to our earlier study of clusters heavier than ¹⁴C, where the inclusion of β₂ alone, with "optimum" orientations, was found to be enough to give the best comparison with data, here for ¹⁴C cluster decay the inclusion of higher-multipole deformations (up to hexadecapole), together with θ^c orientations, is found to be essential on the basis of the PCM. Interestingly, whereas both the penetration probability and assault frequency work simply as scaling factors, the preformation probability is strongly influenced by the order of multipole deformations and orientations of nuclei. The possible role of Q value and angular-momentum effects are also considered in reference to ¹⁴C cluster radioactivity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Su, S.
1992-01-01
An equivalent circuit model was postulated for PFSI (perfluorosulfonate ionomer) polymers. It successfully models three different dielectric relaxation mechanisms taking place within long and short sidechain PFSIs in an alternating electric field. The three dielectric processes are long-range inter-cluster ion hopping in the low-frequency region, short-range intra-cluster polarization occurring at frequencies of about 10³ to 10⁶ Hz, and Debye-like orientation of water molecules taking place at very high frequencies. When membranes are annealed near the glass transition temperature of the ionic clusters, the packing of sulfonate groups becomes more efficient. This is shown by the fact that the symmetrical parameter of the distribution of relaxation times of the Cole-Cole equation increases with annealing time. The clusters of the long and short sidechain polymers behave differently in different electrolyte solutions. The sidechains of the long sidechain polymer act like a spring: they contract when the material is equilibrated in low-concentration solutions and expand when it is equilibrated in concentrated solutions. The cluster dimension of the long sidechain material does not vary much. The cluster dimension of short sidechain polymers can vary significantly in different electrolyte solutions.
Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki
2014-08-01
Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Detecting false positive sequence homology: a machine learning approach.
Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Bybee, Seth M
2016-02-24
Accurate detection of homologous relationships of biological sequences (DNA or amino acid) amongst organisms is an important and often difficult task that is essential to various evolutionary studies, ranging from building phylogenies to predicting functional gene annotations. There are many existing heuristic tools, most commonly based on bidirectional BLAST searches, that are used to identify homologous genes and combine them into two fundamentally distinct classes: orthologs and paralogs. Because they use only heuristic filtering based on significance score cutoffs and have no cluster post-processing tools available, these methods can often produce multiple clusters containing unrelated (non-homologous) sequences. Therefore, sequencing data extracted from incomplete genome/transcriptome assemblies that originate from low-coverage sequencing or are produced by de novo processes without a reference genome are susceptible to high false positive rates of homology detection. In this paper we develop biologically informative features that can be extracted from multiple sequence alignments of putative homologous genes (orthologs and paralogs) and further utilized in the context of guided experimentation to verify false positive outcomes. We demonstrate that our machine learning method, trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully identifies apparent false positives inferred by heuristic algorithms, especially among proteomes recovered from low-coverage RNA-seq data. Approximately 42% and 25% of the putative homologies predicted by InParanoid and HaMStR, respectively, were classified as false positives on an experimental data set. Our process increases the quality of output from other clustering algorithms by providing a novel post-processing method that is both fast and efficient at removing low-quality clusters of putative homologous genes recovered by heuristic-based approaches.
Haarmann, Thomas; Lorenz, Nicole; Tudzynski, Paul
2008-01-01
The ergot fungus Claviceps purpurea uses mainly the nonhomologous-end-joining (NHEJ) system for integration of exogenous DNA, leading to a low frequency of homologous integration (1-2%). To improve gene targeting efficiency we deleted the C. purpurea ku70 gene in two different strains: the pathogenic strain 20.1 and the apathogenic, ergot alkaloid producing strain P1. The mutants were not impaired in vegetative or pathogenic development or in alkaloid production. Gene targeting efficiency was significantly increased (50-60%) in the Δku70 mutants. The P1 Δku70 strain (producing ergotamine and ergocryptine) was used for targeted deletion of lpsA1, one of the two trimodular NRPS genes present in the alkaloid gene cluster, encoding D-lysergyl peptide synthetases involved in formation of the tripeptide moiety of ergopeptines. Mutants lacking the lpsA1 gene were shown to be incapable of producing ergotamine but were still able to produce ergocryptine, proving that LpsA1 is involved in ergotamine biosynthesis.
Horizontal transfer of a large and highly toxic secondary metabolic gene cluster between fungi.
Slot, Jason C; Rokas, Antonis
2011-01-25
Genes involved in intermediary and secondary metabolism in fungi are frequently physically linked or clustered. For example, in Aspergillus nidulans the entire pathway for the production of sterigmatocystin (ST), a highly toxic secondary metabolite and a precursor to the aflatoxins (AF), is located in a ∼54 kb, 23 gene cluster. We discovered that a complete ST gene cluster in Podospora anserina was horizontally transferred from Aspergillus. Phylogenetic analysis shows that most Podospora cluster genes are adjacent to or nested within Aspergillus cluster genes, although the two genera belong to different taxonomic classes. Furthermore, the Podospora cluster is highly conserved in content, sequence, and microsynteny with the Aspergillus ST/AF clusters and its intergenic regions contain 14 putative binding sites for AflR, the transcription factor required for activation of the ST/AF biosynthetic genes. Examination of ∼52,000 Podospora expressed sequence tags identified transcripts for 14 genes in the cluster, with several expressed at multiple life cycle stages. The presence of putative AflR-binding sites and the expression evidence for several cluster genes, coupled with the recent independent discovery of ST production in Podospora [1], suggest that this HGT event probably resulted in a functional cluster. Given the abundance of metabolic gene clusters in fungi, our finding that one of the largest known metabolic gene clusters moved intact between species suggests that such transfers might have significantly contributed to fungal metabolic diversity. Copyright © 2011 Elsevier Ltd. All rights reserved.
Heritability of targeted gene modifications induced by plant-optimized CRISPR systems.
Mao, Yanfei; Botella, Jose Ramon; Zhu, Jian-Kang
2017-03-01
The Streptococcus-derived CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 (CRISPR-associated protein 9) system has emerged as a very powerful tool for targeted gene modifications in many living organisms including plants. Since the first application of this system for plant gene modification in 2013, this RNA-guided DNA endonuclease system has been extensively engineered to meet the requirements of functional genomics and crop trait improvement in a number of plant species. Given its short history, the emphasis of many studies has been the optimization of the technology to improve its reliability and efficiency to generate heritable gene modifications in plants. Here we review and analyze the features of customized CRISPR/Cas9 systems developed for plant genetic studies and crop breeding. We focus on two essential aspects: the heritability of gene modifications induced by CRISPR/Cas9 and the factors affecting its efficiency, and we provide strategies for future design of systems with improved activity and heritability in plants.
Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F.; Shaw, Peter
2017-01-01
Abstract Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. PMID:28175342
Cai, Haiyuan
2012-01-01
Gene Transfer Agent (GTA) particles are released by bacteria and resemble small, tailed bacteriophages. GTA particles contain small, random pieces of host DNA rather than GTA structural genes or a phage genome. Gene transfer mediated by GTAs is efficient and species specific, based on knowledge of the currently best studied GTAs produced by 4 anaerobes. Genome sequencing projects have revealed a remarkable distribution of GTA gene clusters in the genomes of marine bacterioplankton, implying that GTAs may be an important mechanism for horizontal gene transfer in the ocean. On the basis of the characterization of the 4 best studied GTAs, this review describes GTAs released by numerically dominant marine bacteria, discusses their properties that are important for horizontal gene transfer in the ocean, and gives future perspectives to advance GTA research.
High-Efficiency Genome Editing of Streptomyces Species by an Engineered CRISPR/Cas System.
Wang, Y; Cobb, R E; Zhao, H
2016-01-01
Next-generation sequencing technologies have rapidly expanded the genomic information of numerous organisms and revealed a rich reservoir of natural product gene clusters from microbial genomes, especially from Streptomyces, the largest genus of known actinobacteria at present. However, genetic engineering of these bacteria is often time consuming and labor intensive, if even possible. In this chapter, we describe the design and construction of pCRISPomyces, an engineered Type II CRISPR/Cas system, for targeted multiplex gene deletions in Streptomyces lividans, Streptomyces albus, and Streptomyces viridochromogenes with editing efficiency ranging from 70% to 100%. We demonstrate pCRISPomyces as a powerful tool for genome editing in Streptomyces. © 2016 Elsevier Inc. All rights reserved.
Constrained clusters of gene expression profiles with pathological features.
Sese, Jun; Kurokawa, Yukinori; Monden, Morito; Kato, Kikuya; Morishita, Shinichi
2004-11-22
Gene expression profiles should be useful in distinguishing variations in disease, since they reflect accurately the status of cells. The primary clustering of gene expression reveals the genotypes that are responsible for the proximity of members within each cluster, while further clustering elucidates the pathological features of the individual members of each cluster. However, since the first clustering process and the second classification step, in which the features are associated with clusters, are performed independently, the initial set of clusters may omit genes that are associated with pathologically meaningful features. Therefore, it is important to devise a way of identifying gene expression clusters that are associated with pathological features. We present the novel technique of 'itemset constrained clustering' (IC-Clustering), which computes the optimal cluster that maximizes the interclass variance of gene expression between groups, which are divided according to the restriction that only divisions that can be expressed using common features are allowed. This constraint automatically labels each cluster with a set of pathological features which characterize that cluster. When applied to liver cancer datasets, IC-Clustering revealed informative gene expression clusters, which could be annotated with various pathological features, such as 'tumor' and 'man', or 'except tumor' and 'normal liver function'. In contrast, the k-means method overlooked these clusters.
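The constrained search described above can be sketched as a brute-force procedure in Python. This is only an illustration under stated assumptions, not the authors' IC-Clustering algorithm: the function names, the two-way split (samples that carry every feature in an itemset versus the rest), the between-group sum of squares used as the interclass variance, and the toy data are all hypothetical.

```python
import numpy as np
from itertools import combinations

def interclass_variance(X, mask):
    """Between-group sum of squares for the split mask / ~mask over rows of X."""
    overall = X.mean(axis=0)
    total = 0.0
    for group in (X[mask], X[~mask]):
        if len(group):
            total += len(group) * np.sum((group.mean(axis=0) - overall) ** 2)
    return total

def best_itemset_split(X, features, names, max_items=2):
    """Exhaustively score every itemset of up to max_items features.

    Returns (score, itemset) for the split with the largest interclass variance.
    """
    best = (-1.0, ())
    for k in range(1, max_items + 1):
        for items in combinations(range(len(names)), k):
            mask = features[:, list(items)].all(axis=1)
            if mask.all() or not mask.any():
                continue  # skip splits that leave one side empty
            score = interclass_variance(X, mask)
            if score > best[0]:
                best = (score, tuple(names[i] for i in items))
    return best

# Hypothetical toy data: 6 samples x 2 genes, with two binary features.
X = np.array([[5.0, 5.0], [5.2, 4.8], [1.0, 1.0],
              [1.1, 0.9], [0.9, 1.2], [1.0, 1.0]])
features = np.array([[1, 1], [1, 1], [1, 0],
                     [0, 1], [0, 0], [0, 0]], dtype=bool)
score, itemset = best_itemset_split(X, features, ["tumor", "male"])
```

In this toy case the winning split is labeled by the combined itemset rather than by either feature alone, which mirrors how a cluster can be annotated with a set of features. Exhaustive enumeration grows quickly with the number of features, so a practical implementation would need pruning.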
Diametrical clustering for identifying anti-correlated gene clusters.
Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman
2003-09-01
Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
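The two alternating steps named in the abstract can be sketched in Python with NumPy. Several details below are assumptions rather than facts from the abstract: rows are centered and scaled so that dot products equal Pearson correlations, genes are assigned to the prototype with the largest squared correlation, and the prototypes are seeded with a simple farthest-first rule. The function name and toy data are hypothetical.

```python
import numpy as np

def diametrical_clustering(X, k, n_iter=50):
    """Cluster rows of X so correlated and anti-correlated rows share a cluster."""
    # Center and scale each row so dot products are Pearson correlations.
    Z = X - X.mean(axis=1, keepdims=True)
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    # Assumed seeding: start from row 0, then add the row least correlated
    # (in squared terms) with the prototypes chosen so far.
    protos = [Z[0]]
    while len(protos) < k:
        sim = np.max((Z @ np.array(protos).T) ** 2, axis=1)
        protos.append(Z[np.argmin(sim)])
    P = np.array(protos)
    labels = None
    for _ in range(n_iter):
        # Step (i): re-partition by squared correlation with each prototype.
        new_labels = np.argmax((Z @ P.T) ** 2, axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step (ii): the dominant right singular vector becomes the prototype.
        for c in range(k):
            members = Z[labels == c]
            if len(members):
                _, _, vt = np.linalg.svd(members, full_matrices=False)
                P[c] = vt[0]
    return labels

# Hypothetical profiles: two patterns, each with an anti-correlated copy.
t = np.arange(8)
p1, p2 = np.sin(2 * np.pi * t / 8), np.cos(2 * np.pi * t / 8)
X = np.array([p1, -2 * p1, 0.5 * p1, p2, -p2, 3 * p2])
labels = diametrical_clustering(X, k=2)
```

Squaring the correlation is what lets a profile and its mirror image land in the same cluster; a standard correlation-based k-means would separate them.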
Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel
2018-01-16
The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plant cells). Here we describe a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping of up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. Copyright © 2018 John Wiley & Sons, Inc.
Gulati, Arvind; Swarnkar, Mohit Kumar; Vyas, Pratibha; Rahi, Praveen; Thakur, Rishu; Thakur, Namika; Singh, Anil Kumar
2015-09-03
The complete genome sequence of 6.45 Mb is reported here for Pseudomonas trivialis strain IHBB745 (MTCC 5336), which is an efficient, stress-tolerant, and broad-spectrum plant growth-promoting rhizobacterium. The gene-coding clusters predicted the genes for phosphate solubilization, siderophore production, 1-aminocyclopropane-1-carboxylate (ACC) deaminase activity, indole-3-acetic acid (IAA) production, and stress response. Copyright © 2015 Gulati et al.
Breakup of a homeobox cluster after genome duplication in teleosts
Mulley, John F.; Chiu, Chi-hua; Holland, Peter W. H.
2006-01-01
Several families of homeobox genes are arranged in genomic clusters in metazoan genomes, including the Hox, ParaHox, NK, Rhox, and Iroquois gene clusters. The selective pressures responsible for maintenance of these gene clusters are poorly understood. The ParaHox gene cluster is evolutionarily conserved between amphioxus and human but is fragmented in teleost fishes. We show that two basal ray-finned fish, Polypterus and Amia, each possess an intact ParaHox cluster; this implies that the selective pressure maintaining clustering was lost after whole-genome duplication in teleosts. Cluster breakup is because of gene loss, not transposition or inversion, and the total number of ParaHox genes is the same in teleosts, human, mouse, and frog. We propose that this homeobox gene cluster is held together in chordates by the existence of interdigitated control regions that could be separated after locus duplication in the teleost fish. PMID:16801555
McDonald, G.D.; Paillet, Frederick L.; Barton, C.C.; Johnson, C.D.
1997-01-01
The clustering of orientations of hydraulically conductive fractures in bedrock at the Mirror Lake, New Hampshire fractured rock study site was investigated by comparing the orientations of fracture populations in two subvertical borehole arrays with those mapped on four adjacent subvertical roadcuts. In the boreholes and the roadcuts, the orientation of fracture populations appears very similar after borehole data are compensated for undersampling of steeply dipping fractures. Compensated borehole and pavement fracture data indicate a northeast-striking population of fractures with varying dips concentrated near that of the local foliation in the adjacent rock. The data show no correlation between fracture density (fractures/linear meter) and distance from lithologic contacts in both the boreholes and the roadcuts. The population of water-producing borehole fractures is too small (28 out of 610 fractures) to yield meaningful orientation comparisons. However, the orientation of large aperture fractures (which contains all the producing fractures) contains two or three subsidiary clusters in orientation frequency that are not evident in stereographic projections of the entire population containing all aperture sizes. Further, these subsidiary orientation clusters do not coincide with the dominant (subhorizontal and subvertical) regional fracture orientations.
Charles, J. P.; Chihara, C.; Nejad, S.; Riddiford, L. M.
1997-01-01
A 36-kb genomic DNA segment of the Drosophila melanogaster genome containing 12 clustered cuticle genes has been mapped and partially sequenced. The cluster maps at 65A 5-6 on the left arm of the third chromosome, in agreement with the previously determined location of a putative cluster encompassing the genes for the third instar larval cuticle proteins LCP5, LCP6 and LCP8. This cluster is the largest cuticle gene cluster discovered to date and shows a number of surprising features that explain in part the genetic complexity of the LCP5, LCP6 and LCP8 loci. The genes encoding LCP5 and LCP8 are multiple copy genes and the presence of extensive similarity in their coding regions gives the first evidence for gene conversion in cuticle genes. In addition, five genes in the cluster are intronless. Four of these five have arisen by retroposition. The other genes in the cluster have a single intron located at an unusual location for insect cuticle genes. PMID:9383064
Wang, Zhengrui; Rahman, A B M Moshiur; Wang, Guoying; Ludewig, Uwe; Shen, Jianbo; Neumann, Günter
2015-04-01
This study addresses hormonal interactions involved in cluster-root (CR) development of phosphate (Pi)-deficient white lupin (Lupinus albus), which represents the most efficient plant strategy for root-induced mobilisation of sparingly soluble soil phosphorus (P) sources. Shoot-to-root translocation of auxin was unaffected by P-limitation, while strong stimulatory effects of external sucrose on CR formation, even in P-sufficient plants, suggest that sucrose, rather than auxin, acts as a shoot-borne signal triggering the induction of CR primordia. Ethylene may act as a mediator of the sucrose signal, as indicated by moderately increased expression of genes involved in ethylene biosynthesis in pre-emergent clusters and by strong inhibitory effects of the ethylene antagonist CoCl2 on CR formation induced by sucrose amendments or P-limitation. As reported in other plants, moderately increased production of brassinosteroids (BRs) and cytokinin in pre-emergent clusters may be required for the formation of auxin gradients necessary for induction of CR primordia via interference with auxin biosynthesis and transport. The well-documented inhibition of root elongation by high doses of ethylene may be involved in the growth inhibition of lateral rootlets during CR maturation, as indicated by massively increased expression of genes involved in ethylene production, associated with decreased expression of transcripts with stimulatory effects (BR and auxin-related genes). Copyright © 2014 Elsevier GmbH. All rights reserved.
Magnetic fields in the Perseus Spiral Arm and in Infrared Dark Clouds
NASA Astrophysics Data System (ADS)
Hoq, Sadia
2017-04-01
The magnetic (B) field is ubiquitous throughout the Milky Way. Several fundamental questions about the B-field in the cool, star-forming interstellar medium (ISM) remain unanswered. In this dissertation, near-infrared (NIR) polarimetric observations are used to study the large-scale Galactic B-field in the cool ISM in a spiral arm and to determine the role of B-fields in the formation of Infrared Dark Clouds (IRDCs). NIR polarimetry of 31 star clusters, located in and around the Perseus spiral arm, was obtained to determine the orientation of the plane-of-sky B-field in the outer Galaxy, and whether the presence of a spiral arm influenced B-field properties. Cluster distances, which provide upper limits to the B-field probed by observations, were estimated by developing a maximum likelihood method to fit theoretical stellar isochrones to stars in cluster color-magnitude diagrams (CMDs). Using the distance estimates, the cluster locations relative to the Perseus arm were found. The cluster polarization percentages and orientations were compared between clusters foreground to the arm and clusters inside or behind the arm. The cluster polarization orientations are predominantly parallel to the Galactic plane. Clusters inside and behind the arm have larger polarization percentages, likely a result of more polarizing material along the line of sight. The cluster polarization data were also compared to optical, inner Galaxy NIR, and Planck submm polarimetry data, and showed agreement with all three data sets. The polarimetric properties of one IRDC, G28.23, were determined using deep NIR observations. The polarization orientations relative to the cloud major axis were found to change direction with distance from the cloud axis. The B-field strength was estimated to be 10 to 100 μG. Despite these large inferred B-field strengths, the B-field was found not to be the dominant force in the formation of the IRDC, though the B-field morphology was influenced by the cloud.
Using NIR observations, the B-fields of 27 IRDCs were studied. The relative polarization orientations with respect to the cloud major axes were found. No preferential relative orientation was found, implying that the B-field did not greatly influence the formation of this sample of IRDCs.
Nerys-Junior, Arildo; Braga-Dias, Luciene P.; Pezzuto, Paula; Cotta-de-Almeida, Vinícius; Tanuri, Amilcar
2018-01-01
Abstract The human C-C chemokine receptor type-5 (CCR5) is the major transmembrane co-receptor that mediates HIV-1 entry into target CD4+ cells. Gene therapy to knock-out the CCR5 gene has shown encouraging results in providing a functional cure for HIV-1 infection. In gene therapy strategies, the initial region of the CCR5 gene is a hotspot for producing functional gene knock-out. Such target gene editing can be done using programmable endonucleases such as transcription activator-like effector nucleases (TALEN) or clustered regularly interspaced short palindromic repeats (CRISPR-Cas9). These two gene editing approaches are the most modern and effective tools for precise gene modification. However, little is known of potential differences in the efficiencies of TALEN and CRISPR-Cas9 for editing the beginning of the CCR5 gene. To examine which of these two methods is best for gene therapy, we compared the patterns and amount of editing at the beginning of the CCR5 gene using TALEN and CRISPR-Cas9 followed by DNA sequencing. This comparison revealed that CRISPR-Cas9 mediated the sorting of cells that contained 4.8 times more gene editing than TALEN+ transfected cells. PMID:29583154
Kong, Xiang-Zhen; Liu, Jin-Xing; Zheng, Chun-Hou; Hou, Mi-Xiao; Wang, Juan
2017-07-01
High dimensionality has become a typical feature of biomolecular data. In this paper, a novel dimension reduction method named p-norm singular value decomposition (PSVD) is proposed to seek the low-rank approximation matrix to the biomolecular data. To enhance the robustness to outliers, the Lp-norm is taken as the error function and the Schatten p-norm is used as the regularization function in the optimization model. To evaluate the performance of PSVD, the Kmeans clustering method is then employed for tumor clustering based on the low-rank approximation matrix. Extensive experiments are carried out on five gene expression data sets including two benchmark data sets and three higher dimensional data sets from the cancer genome atlas. The experimental results demonstrate that the PSVD-based method outperforms many existing methods. Especially, it is experimentally proved that the proposed method is more efficient for processing higher dimensional data with good robustness, stability, and superior time performance.
Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun
2014-11-25
The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. 
To know whether there is a set of genes not only conserved in position among isolates but also functionally essential for a given species and to further evaluate the stability or flexibility of such genome structures across lineages are of importance. Based on a large number of multi-isolate pangenomic data, our analysis reveals that a subset of core genes is organized into a core-gene-defined genome organizational framework, or cGOF. Furthermore, the lineage-associated cGOFs among Gram-positive and Gram-negative bacteria behave differently: the former, composed of 2 to 4 segments, have their fragments symmetrically rearranged around the origin-terminus axis, whereas the latter show more complex segmentation and are partitioned asymmetrically into chromosomal structures. The definition of cGOFs provides new insights into prokaryotic genome organization and efficient guidance for genome assembly and analysis. Copyright © 2014 Kang et al.
Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich
2017-01-01
Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues, including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects, seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S. antibioticus. PMID:28435299
Matamoros, Sébastien; van Hattem, Jarne M; Arcilla, Maris S; Willemse, Niels; Melles, Damian C; Penders, John; Vinh, Trung Nguyen; Thi Hoa, Ngo; de Jong, Menno D; Schultsz, Constance
2017-11-10
To understand the dynamics behind the worldwide spread of the mcr-1 gene, we determined the population structure of Escherichia coli and of mobile genetic elements (MGEs) carrying the mcr-1 gene. After a systematic review of the literature we included 65 E. coli whole genome sequences (WGS), adding 6 recently sequenced travel related isolates, and 312 MLST profiles. We included 219 MGEs described in 7 Enterobacteriaceae species isolated from human, animal and environmental samples. Despite a high overall diversity, 2 lineages were observed in the E. coli population that may function as reservoirs of the mcr-1 gene, the largest of which was linked to ST10, a sequence type known for its ubiquity in human faecal samples and in food samples. No genotypic clustering by geographical origin or isolation source was observed. Amongst a total of 13 plasmid incompatibility types, the IncI2, IncX4 and IncHI2 plasmids accounted for more than 90% of MGEs carrying the mcr-1 gene. We observed significant geographical clustering with regional spread of IncHI2 plasmids in Europe and IncI2 in Asia. These findings point towards promiscuous spread of the mcr-1 gene by efficient horizontal gene transfer dominated by a limited number of plasmid incompatibility types.
Cary, J. W.; Han, Z.; Yin, Y.; Lohmar, J. M.; Shantappa, S.; Harris-Coward, P. Y.; Mack, B.; Ehrlich, K. C.; Wei, Q.; Arroyo-Manzanares, N.; Uka, V.; Vanhaecke, L.; Bhatnagar, D.; Yu, J.; Nierman, W. C.; Johns, M. A.; Sorensen, D.; Shen, H.; De Saeger, S.; Diana Di Mavungu, J.
2015-01-01
The global regulatory veA gene governs development and secondary metabolism in numerous fungal species, including Aspergillus flavus. This is especially relevant since A. flavus infects crops of agricultural importance worldwide, contaminating them with potent mycotoxins. The most well-known are aflatoxins, which are cytotoxic and carcinogenic polyketide compounds. The production of aflatoxins and the expression of genes implicated in the production of these mycotoxins are veA dependent. The genes responsible for the synthesis of aflatoxins are clustered, a signature common for genes involved in fungal secondary metabolism. Studies of the A. flavus genome revealed many gene clusters possibly connected to the synthesis of secondary metabolites. Many of these metabolites are still unknown, or the association between a known metabolite and a particular gene cluster has not yet been established. In the present transcriptome study, we show that veA is necessary for the expression of a large number of genes. Twenty-eight out of the predicted 56 secondary metabolite gene clusters include at least one gene that is differentially expressed depending on presence or absence of veA. One of the clusters under the influence of veA is cluster 39. The absence of veA results in a downregulation of the five genes found within this cluster. Interestingly, our results indicate that the cluster is expressed mainly in sclerotia. Chemical analysis of sclerotial extracts revealed that cluster 39 is responsible for the production of aflavarin. PMID:26209694
A tripartite clustering analysis on microRNA, gene and disease model.
Shen, Chengcheng; Liu, Ying
2012-02-01
Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.
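The density comparison mentioned in this abstract can be illustrated with a small sketch. The abstract does not define density, so the definition below (connected member pairs divided by all member pairs) is an assumption, and the node names and edges are hypothetical toy data rather than anything from the study.

```python
def cluster_density(members, edges):
    """Fraction of member pairs joined by an edge (assumed definition).

    `edges` is a list of unique undirected (u, v) pairs.
    """
    members = set(members)
    n = len(members)
    possible_pairs = n * (n - 1) // 2
    if possible_pairs == 0:
        return 0.0
    # Count edges whose two endpoints both lie inside the cluster.
    inside = sum(1 for u, v in edges
                 if u != v and u in members and v in members)
    return inside / possible_pairs


# Hypothetical tripartite toy network: miRNA -> gene -> disease links.
edges = [
    ("miR-a", "GENE1"), ("miR-a", "GENE2"),
    ("GENE1", "disease-x"), ("GENE2", "disease-x"),
    ("miR-b", "GENE3"),
]

dense = cluster_density(["miR-a", "GENE1", "GENE2", "disease-x"], edges)
sparse = cluster_density(["miR-b", "GENE1", "disease-x"], edges)
```

In a tripartite graph not every pair can be linked (for example, two genes are never directly connected), so a stricter definition would divide by the number of admissible cross-type pairs instead; the simpler form is used here only to show the comparison.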
Pyeon, Hye-Rim; Nah, Hee-Ju; Kang, Seung-Hoon; Choi, Si-Sun; Kim, Eung-Soo
2017-05-31
Heterologous expression of biosynthetic gene clusters of natural microbial products has become an essential strategy for titer improvement and pathway engineering of various potentially-valuable natural products. A Streptomyces artificial chromosomal conjugation vector, pSBAC, was previously successfully applied for precise cloning and tandem integration of a large polyketide tautomycetin (TMC) biosynthetic gene cluster (Nah et al. in Microb Cell Fact 14(1):1, 2015), implying that this strategy could be employed to develop a custom overexpression scheme of natural product pathway clusters present in actinomycetes. To validate the pSBAC system as a generally-applicable heterologous overexpression system for a large-sized polyketide biosynthetic gene cluster in Streptomyces, the biosynthetic gene cluster of another model polyketide compound, pikromycin, was precisely cloned and heterologously expressed using the pSBAC system. A unique HindIII restriction site was precisely inserted at one of the border regions of the pikromycin biosynthetic gene cluster within the chromosome of Streptomyces venezuelae, followed by site-specific recombination of pSBAC into the flanking region of the pikromycin gene cluster. Unlike the previous cloning process, one HindIII site integration step was skipped through pSBAC modification. pPik001, a pSBAC containing the pikromycin biosynthetic gene cluster, was directly introduced into two heterologous hosts, Streptomyces lividans and Streptomyces coelicolor, resulting in the production of 10-deoxymethynolide, a major pikromycin derivative. When two entire pikromycin biosynthetic gene clusters were tandemly introduced into the S. lividans chromosome, overproduction of 10-deoxymethynolide and the presence of pikromycin, which was previously not detected, were both confirmed. Moreover, comparative qRT-PCR results confirmed that the transcription of pikromycin biosynthetic genes was significantly upregulated in S. lividans containing tandem copies of the pikromycin biosynthetic gene cluster. The 60 kb pikromycin biosynthetic gene cluster was isolated in a single integration pSBAC vector. Introduction of the pikromycin biosynthetic gene cluster into the pikromycin non-producing strains resulted in higher pikromycin production. The utility of the pSBAC system as a precise cloning tool for large-sized biosynthetic gene clusters was verified through heterologous expression of the pikromycin biosynthetic gene cluster. Moreover, this pSBAC-driven heterologous expression strategy was confirmed to be an ideal approach for production of natural products with low and inconsistent yields, such as pikromycin in S. venezuelae, implying that this strategy could be employed for development of a custom overexpression scheme of natural product biosynthetic gene clusters in actinomycetes.
From hormones to secondary metabolism: the emergence of metabolic gene clusters in plants.
Chu, Hoi Yee; Wegel, Eva; Osbourn, Anne
2011-04-01
Gene clusters for the synthesis of secondary metabolites are a common feature of microbial genomes. Well-known examples include clusters for the synthesis of antibiotics in actinomycetes, and also for the synthesis of antibiotics and toxins in filamentous fungi. Until recently it was thought that genes for plant metabolic pathways were not clustered, and this is certainly true in many cases; however, five plant secondary metabolic gene clusters have now been discovered, all of them implicated in synthesis of defence compounds. An obvious assumption might be that these eukaryotic gene clusters have arisen by horizontal gene transfer from microbes, but there is compelling evidence to indicate that this is not the case. This raises intriguing questions about how widespread such clusters are, what the significance of clustering is, why genes for some metabolic pathways are clustered and those for others are not, and how these clusters form. In answering these questions we may hope to learn more about mechanisms of genome plasticity and adaptive evolution in plants. It is noteworthy that for the five plant secondary metabolic gene clusters reported so far, the enzymes for the first committed steps all appear to have been recruited directly or indirectly from primary metabolic pathways involved in hormone synthesis. This may or may not turn out to be a common feature of plant secondary metabolic gene clusters as new clusters emerge. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong
2013-07-01
A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
Sykes, Timothy; Yates, Steven; Nagy, Istvan; Asp, Torben; Small, Ian
2017-01-01
Perennial ryegrass (Lolium perenne L.) is widely used for forage production in both permanent and temporary grassland systems. To increase yields in perennial ryegrass, recent breeding efforts have been focused on strategies to more efficiently exploit heterosis by hybrid breeding. Cytoplasmic male sterility (CMS) is a widely applied mechanism to control pollination for commercial hybrid seed production and although CMS systems have been identified in perennial ryegrass, they are yet to be fully characterized. Here, we present a bioinformatics pipeline for efficient identification of candidate restorer of fertility (Rf) genes for CMS. From a high-quality draft of the perennial ryegrass genome, 373 pentatricopeptide repeat (PPR) genes were identified and classified, further identifying 25 restorer of fertility-like PPR (RFL) genes through a combination of DNA sequence clustering and comparison to known Rf genes. This extensive gene family was targeted as the majority of Rf genes in higher plants are RFL genes. These RFL genes were further investigated by phylogenetic analyses, identifying three groups of perennial ryegrass RFLs. These three groups likely represent genomic regions of active RFL generation and identify the probable location of perennial ryegrass PPR-Rf genes. This pipeline allows for the identification of candidate PPR-Rf genes from genomic sequence data and can be used in any plant species. Functional markers for PPR-Rf genes will facilitate map-based cloning of Rf genes and enable the use of CMS as an efficient tool to control pollination for hybrid crop production. PMID:26951780
Supervised group Lasso with applications to microarray data analysis
Ma, Shuangge; Song, Xiao; Huang, Jian
2007-01-01
Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
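The preliminary clustering step described above (dividing genes into clusters with K-means) might look roughly like the following pure-Python sketch of Lloyd's algorithm. It covers only that step, not the Gap method or the Lasso stages; the gene names, expression values, and the choice of initial centroids are hypothetical illustrations, not data or settings from the paper.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's K-means; returns (labels, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments unchanged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Hypothetical expression profiles: one row per gene, one column per sample.
profiles = {
    "g1": [1.0, 1.1, 0.9], "g2": [0.9, 1.0, 1.0], "g3": [1.1, 0.9, 1.0],
    "g4": [5.0, 5.2, 4.9], "g5": [5.1, 4.9, 5.0], "g6": [4.9, 5.0, 5.1],
}
points = list(profiles.values())
# Deterministic start for illustration: seed with the profiles of g1 and g4.
labels, centroids = kmeans(points, [profiles["g1"], profiles["g4"]])
```

Here the low-expression genes g1 to g3 and the high-expression genes g4 to g6 end up in separate clusters. In practice the initial centroids would usually be chosen at random over several restarts, and the number of clusters would be selected separately.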
Mugford, Sam T.; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J.; Lomonossoff, George P.; Osbourn, Anne
2013-01-01
Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069
Optimizing sgRNA structure to improve CRISPR-Cas9 knockout efficiency.
Dang, Ying; Jia, Gengxiang; Choi, Jennie; Ma, Hongming; Anaya, Edgar; Ye, Chunting; Shankar, Premlata; Wu, Haoquan
2015-12-15
Single-guide RNA (sgRNA) is one of the two key components of the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome-editing system. The current commonly used sgRNA structure has a shortened duplex compared with the native bacterial CRISPR RNA (crRNA)-transactivating crRNA (tracrRNA) duplex and contains a continuous sequence of thymines, which is the pause signal for RNA polymerase III and thus could potentially reduce transcription efficiency. Here, we systematically investigate the effect of these two elements on knockout efficiency and show that modifying the sgRNA structure by extending the duplex length and mutating the fourth thymine of the continuous sequence of thymines to cytosine or guanine significantly, and sometimes dramatically, improves knockout efficiency in cells. In addition, the optimized sgRNA structure also significantly increases the efficiency of more challenging genome-editing procedures, such as gene deletion, which is important for inducing a loss of function in non-coding genes. By a systematic investigation of sgRNA structure, we find that extending the duplex by approximately 5 bp combined with mutating the continuous sequence of thymines at position 4 to cytosine or guanine significantly increases gene knockout efficiency in CRISPR-Cas9-based genome editing experiments.
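The thymine substitution described in this abstract can be expressed as a simple string operation on the DNA template. The sketch below is only an illustration: the function name and the input fragment are hypothetical, the fragment is a toy sequence rather than the scaffold used in the study, and the duplex extension is not modeled.

```python
import re


def mutate_fourth_t(template, replacement="C"):
    """Replace the 4th T of the first run of four or more Ts with C or G."""
    if replacement not in ("C", "G"):
        raise ValueError("replacement must be 'C' or 'G'")
    run = re.search(r"T{4,}", template)
    if run is None:
        return template  # no continuous thymine run to interrupt
    pos = run.start() + 3  # index of the fourth T in the run
    return template[:pos] + replacement + template[pos + 1:]


toy_fragment = "GTTTTAGAGC"  # hypothetical fragment containing a TTTT run
to_c = mutate_fourth_t(toy_fragment)        # fourth T -> C
to_g = mutate_fourth_t(toy_fragment, "G")   # fourth T -> G
```

After the substitution the run of thymines is shortened to three, which is the property the abstract associates with avoiding the RNA polymerase III pause signal.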
Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology
Fischbach, Michael; Voigt, Christopher A.
2014-01-01
Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes
Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J
2008-01-01
Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at . PMID:18366802
Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling
Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim; ...
2015-02-05
The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here in this paper we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.
Schulz, Tizian; Stoye, Jens; Doerr, Daniel
2018-05-08
Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
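For orientation, the sequence-based δ-teams model that this abstract generalizes can be sketched as follows. This is not the GraphTeams algorithm: it is a simplified, one-dimensional illustration that assumes each gene occurs exactly once per genome (no families) and uses a naive split-and-recheck strategy. The function names and the toy positions are assumptions made for the example.

```python
def split_by_gaps(genes, positions, delta):
    """Split genes into maximal groups whose consecutive positions differ by <= delta."""
    ordered = sorted(genes, key=positions.__getitem__)
    groups, current = [], [ordered[0]]
    for gene in ordered[1:]:
        if positions[gene] - positions[current[-1]] <= delta:
            current.append(gene)
        else:
            groups.append(current)
            current = [gene]
    groups.append(current)
    return [frozenset(group) for group in groups]


def delta_teams(genomes, delta):
    """Return maximal gene sets that form a delta-chain in every genome.

    `genomes` is a list of dicts mapping gene name -> position.
    Simplified sketch: one copy per gene, linear genomes only.
    """
    common = frozenset(set.intersection(*(set(g) for g in genomes)))
    stack = [common] if common else []
    teams = []
    while stack:
        candidate = stack.pop()
        for positions in genomes:
            parts = split_by_gaps(candidate, positions, delta)
            if len(parts) > 1:
                # The candidate breaks apart in this genome: recheck each part.
                stack.extend(parts)
                break
        else:
            # Unbroken in every genome, so the candidate is a team.
            teams.append(candidate)
    return teams


# Toy positions (hypothetical data).
genome_1 = {"a": 1, "b": 2, "c": 3, "d": 10, "e": 11}
genome_2 = {"a": 5, "b": 4, "c": 20, "d": 1, "e": 2}
teams = delta_teams([genome_1, genome_2], delta=2)
nontrivial_teams = {team for team in teams if len(team) > 1}
```

Single genes are trivially teams under this definition, so the last line keeps only teams with at least two members.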
Engineering Synthetic Gene Circuits in Living Cells with CRISPR Technology.
Jusiak, Barbara; Cleto, Sara; Perez-Piñera, Pablo; Lu, Timothy K
2016-07-01
One of the goals of synthetic biology is to build regulatory circuits that control cell behavior, for both basic research purposes and biomedical applications. The ability to build transcriptional regulatory devices depends on the availability of programmable, sequence-specific, and effective synthetic transcription factors (TFs). The prokaryotic clustered regularly interspaced short palindromic repeat (CRISPR) system, recently harnessed for transcriptional regulation in various heterologous host cells, offers unprecedented ease in designing synthetic TFs. We review how CRISPR can be used to build synthetic gene circuits and discuss recent advances in CRISPR-mediated gene regulation that offer the potential to build increasingly complex, programmable, and efficient gene circuits in the future. Copyright © 2016. Published by Elsevier Ltd.
A Cluster Analytic Study of Clinical Orientations among Chemical Dependency Counselors.
ERIC Educational Resources Information Center
Thombs, Dennis L.; Osborn, Cynthia J.
2001-01-01
Three distinct clinical orientations were identified in a sample of chemical dependency counselors (N=406). Based on cluster analysis, the largest group, identified and labeled as "uniform counselors," endorsed a simple, moral-disease model with little interest in psychosocial interventions. (Contains 50 references and 4 tables.) (GCP)
Radial alignment of elliptical galaxies by the tidal force of a cluster of galaxies
NASA Astrophysics Data System (ADS)
Rong, Yu; Yi, Shu-Xu; Zhang, Shuang-Nan; Tu, Hong
2015-08-01
Unlike the random radial orientation distribution of field elliptical galaxies, galaxies in a cluster are expected to point preferentially towards the centre of the cluster, as a result of the cluster's tidal force on its member galaxies. In this work, an analytic model is formulated to simulate this effect. The deformation time-scale of a galaxy in a cluster is usually much shorter than the time-scale of change of the tidal force; the dynamical process of tidal interaction within the galaxy can thus be ignored. The equilibrium shape of a galaxy is then assumed to be the surface of equipotential that is the sum of the self-gravitational potential of the galaxy and the tidal potential of the cluster at this location. We use a Monte Carlo method to calculate the radial orientation distribution of cluster galaxies, by assuming a Navarro-Frenk-White mass profile for the cluster and the initial ellipticity of field galaxies. The radial angles show a single-peak distribution centred at zero. The Monte Carlo simulations also show that a shift of the reference centre from the real cluster centre weakens the anisotropy of the radial angle distribution. Therefore, the expected radial alignment cannot be revealed if the distribution of spatial position angle is used instead of that of radial angle. The observed radial orientations of elliptical galaxies in cluster Abell 2744 are consistent with the simulated distribution.
Finding approximate gene clusters with Gecko 3.
Winter, Sascha; Jahn, Katharina; Wehner, Stefanie; Kuchenbecker, Leon; Marz, Manja; Stoye, Jens; Böcker, Sebastian
2016-11-16
Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Functional clustering of time series gene expression data by Granger causality
2012-01-01
Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
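The notion of Granger causality used in this abstract can be made concrete with a small pairwise sketch: series x is said to Granger-cause series y if past values of x improve the prediction of y beyond what past values of y already provide. The code below computes the usual F statistic from a restricted and an unrestricted lagged regression. It is a generic textbook formulation in pure Python, not the set-based method of the paper, and the simulated series, function names and coefficients are assumptions.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(cause, effect, lag=1):
    """F statistic for 'cause Granger-causes effect' with the given lag order."""
    n = len(effect)
    targets = effect[lag:]
    # Restricted model: intercept plus the effect's own past values.
    restricted = [[1.0] + [effect[t - l] for l in range(1, lag + 1)]
                  for t in range(lag, n)]
    # Unrestricted model: additionally uses the candidate cause's past values.
    unrestricted = [row + [cause[t - l] for l in range(1, lag + 1)]
                    for row, t in zip(restricted, range(lag, n))]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Simulated expression profiles (hypothetical): y follows x with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
```

In this simulation the statistic for x → y should be far larger than for y → x, which is the asymmetry that a Granger-based clustering would exploit.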
Cao, Yingxiu; Li, Xiaofei; Li, Feng; Song, Hao
2017-09-15
Extracellular electron transfer (EET) in Shewanella oneidensis MR-1, which is one of the most well-studied exoelectrogens, underlies many microbial electrocatalysis processes, including microbial fuel cells, microbial electrolysis cells, and microbial electrosynthesis. However, regulating the efficiency of EET remains challenging due to the lack of efficient genome regulation tools that regulate gene expression levels in S. oneidensis. Here, we systematically established a transcriptional regulation technology, i.e., clustered regularly interspaced short palindromic repeats interference (CRISPRi), in S. oneidensis MR-1 using green fluorescent protein (GFP) as a reporter. We used this CRISPRi technology to repress the expression levels of target genes, individually and in combination, in the EET pathways (e.g., the MtrCAB pathway and genes affecting the formation of electroactive biofilms in S. oneidensis), which in turn enabled the efficient regulation of EET efficiency. We then established a translational regulation technology, i.e., Hfq-dependent small regulatory RNA (sRNA), in S. oneidensis by repressing the GFP reporter and mtrA, which is a critical gene in the EET pathways in S. oneidensis. To achieve coordinated transcriptional and translational regulation at the genomic level, the CRISPRi and Hfq-dependent sRNA systems were incorporated into a single plasmid harbored in a recombinant S. oneidensis strain, which enabled an even higher efficiency of mtrA gene repression in the EET pathways than that achieved by the CRISPRi and Hfq-dependent sRNA system alone, as exhibited by the reduced electricity output. Overall, we developed a combined CRISPRi-sRNA method that enabled the synergistic transcriptional and translational regulation of target genes in S. oneidensis. This technology involving CRISPRi-sRNA transcriptional-translational regulation of gene expression at the genomic level could be applied to other microorganisms.
Swaminathan, Sivakumar; Morrone, Dana; Wang, Qiang; Fulton, D. Bruce; Peters, Reuben J.
2009-01-01
Biosynthetic gene clusters are common in microbial organisms, but rare in plants, raising questions regarding the evolutionary forces that drive their assembly in multicellular eukaryotes. Here, we characterize the biochemical function of a rice (Oryza sativa) cytochrome P450 monooxygenase, CYP76M7, which seems to act in the production of antifungal phytocassanes and defines a second diterpenoid biosynthetic gene cluster in rice. This cluster is uniquely multifunctional, containing enzymatic genes involved in the production of two distinct sets of phytoalexins, the antifungal phytocassanes and antibacterial oryzalides/oryzadiones, with the corresponding genes being subject to distinct transcriptional regulation. The lack of uniform coregulation of the genes within this multifunctional cluster suggests that this was not a primary driving force in its assembly. However, the cluster is dedicated to specialized metabolism, as all genes in the cluster are involved in phytoalexin metabolism. We hypothesize that this dedication to specialized metabolism led to the assembly of the corresponding biosynthetic gene cluster. Consistent with this hypothesis, molecular phylogenetic comparison demonstrates that the two rice diterpenoid biosynthetic gene clusters have undergone independent elaboration to their present-day forms, indicating continued evolutionary pressure for coclustering of enzymatic genes encoding components of related biosynthetic pathways. PMID:19825834
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.
Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C
2015-10-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Deciphering principles of transcription regulation in eukaryotic genomes
Nguyen, Dat H; D'haeseleer, Patrik
2006-01-01
Transcription regulation has been responsible for organismal complexity and diversity in the course of biological evolution and adaptation, and it is determined largely by the context-dependent behavior of cis-regulatory elements (CREs). Therefore, understanding principles underlying CRE behavior in regulating transcription constitutes a fundamental objective of quantitative biology, yet these remain poorly understood. Here we present a deterministic mathematical strategy, the motif expression decomposition (MED) method, for deriving principles of transcription regulation at the single-gene resolution level. MED operates on all genes in a genome without requiring any a priori knowledge of gene cluster membership, or manual tuning of parameters. Applying MED to Saccharomyces cerevisiae transcriptional networks, we identified four functions describing four different ways that CREs can quantitatively affect gene expression levels. These functions, three of which have extrema in different positions in the gene promoter (short-, mid-, and long-range) whereas the other depends on the motif orientation, are validated by expression data. We illustrate how nature could use these principles as an additional dimension to amplify the combinatorial power of a small set of CREs in regulating transcription. PMID:16738557
Drivers of genetic diversity in secondary metabolic gene clusters within a fungal species
Lind, Abigail L.; Wisecaver, Jennifer H.; Lameiras, Catarina; Wiemann, Philipp; Palmer, Jonathan M.; Keller, Nancy P.; Rodrigues, Fernando; Goldman, Gustavo H.
2017-01-01
Filamentous fungi produce a diverse array of secondary metabolites (SMs) critical for defense, virulence, and communication. The metabolic pathways that produce SMs are found in contiguous gene clusters in fungal genomes, an atypical arrangement for metabolic pathways in other eukaryotes. Comparative studies of filamentous fungal species have shown that SM gene clusters are often either highly divergent or uniquely present in one or a handful of species, hampering efforts to determine the genetic basis and evolutionary drivers of SM gene cluster divergence. Here, we examined SM variation in 66 cosmopolitan strains of a single species, the opportunistic human pathogen Aspergillus fumigatus. Investigation of genome-wide within-species variation revealed 5 general types of variation in SM gene clusters: nonfunctional gene polymorphisms; gene gain and loss polymorphisms; whole cluster gain and loss polymorphisms; allelic polymorphisms, in which different alleles corresponded to distinct, nonhomologous clusters; and location polymorphisms, in which a cluster was found to differ in its genomic location across strains. These polymorphisms affect the function of representative A. fumigatus SM gene clusters, such as those involved in the production of gliotoxin, fumigaclavine, and helvolic acid as well as the function of clusters with undefined products. In addition to enabling the identification of polymorphisms, the detection of which requires extensive genome-wide synteny conservation (e.g., mobile gene clusters and nonhomologous cluster alleles), our approach also implicated multiple underlying genetic drivers, including point mutations, recombination, and genomic deletion and insertion events as well as horizontal gene transfer from distant fungi. Finally, most of the variants that we uncover within A. fumigatus have been previously hypothesized to contribute to SM gene cluster diversity across entire fungal classes and phyla. 
We suggest that the drivers of genetic diversity operating within a fungal species shown here are sufficient to explain SM cluster macroevolutionary patterns. PMID:29149178
A highly efficient multi-core algorithm for clustering extremely large datasets
2010-01-01
Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
Koike-Yusa, Hiroko; Li, Yilong; Tan, E-Pien; Velasco-Herrera, Martin Del Castillo; Yusa, Kosuke
2014-03-01
Identification of genes influencing a phenotype of interest is frequently achieved through genetic screening by RNA interference (RNAi) or knockouts. However, RNAi may only achieve partial depletion of gene activity, and knockout-based screens are difficult in diploid mammalian cells. Here we took advantage of the efficiency and high throughput of genome editing based on type II, clustered, regularly interspaced, short palindromic repeats (CRISPR)-CRISPR-associated (Cas) systems to introduce genome-wide targeted mutations in mouse embryonic stem cells (ESCs). We designed 87,897 guide RNAs (gRNAs) targeting 19,150 mouse protein-coding genes and used a lentiviral vector to express these gRNAs in ESCs that constitutively express Cas9. Screening the resulting ESC mutant libraries for resistance to either Clostridium septicum alpha-toxin or 6-thioguanine identified 27 known and 4 previously unknown genes implicated in these phenotypes. Our results demonstrate the potential for efficient loss-of-function screening using the CRISPR-Cas9 system.
Properties of a U1 RNA enhancer-like sequence.
Ciliberto, G; Palla, F; Tebb, G; Mattaj, I W; Philipson, L
1987-01-01
The properties of a X.laevis U1B snRNA gene enhancer have been studied by microinjection in Xenopus oocytes. The enhancer-like sequence, defined as a short DNA stretch that is able to activate transcription in an orientation independent manner, is interchangeable between different U snRNA genes. The enhancer sequence alone does not, however, efficiently activate transcription from an SV40 pol II promoter but regains its activity when combined with the U-gene specific proximal sequence element. DNase I protection experiments show that the X.laevis U1B enhancer can interact specifically with a nuclear factor present in mammalian cells. PMID:3031597
Parallel goal-oriented adaptive finite element modeling for 3D electromagnetic exploration
NASA Astrophysics Data System (ADS)
Zhang, Y.; Key, K.; Ovall, J.; Holst, M.
2014-12-01
We present a parallel goal-oriented adaptive finite element method for accurate and efficient electromagnetic (EM) modeling of complex 3D structures. An unstructured tetrahedral mesh allows this approach to accommodate arbitrarily complex 3D conductivity variations and a priori known boundaries. The total electric field is approximated by the lowest order linear curl-conforming shape functions and the discretized finite element equations are solved by a sparse LU factorization. Accuracy of the finite element solution is achieved through adaptive mesh refinement that is performed iteratively until the solution converges to the desired accuracy tolerance. Refinement is guided by a goal-oriented error estimator that uses a dual-weighted residual method to optimize the mesh for accurate EM responses at the locations of the EM receivers. As a result, the mesh refinement is highly efficient since it only targets the elements where the inaccuracy of the solution corrupts the response at the possibly distant locations of the EM receivers. We compare the accuracy and efficiency of two approaches for estimating the primary residual error required at the core of this method: one uses local element and inter-element residuals and the other relies on solving a global residual system using a hierarchical basis. For computational efficiency our method follows the Bank-Holst algorithm for parallelization, where solutions are computed in subdomains of the original model. To resolve the load-balancing problem, this approach applies a spectral bisection method to divide the entire model into subdomains that have approximately equal error and the same number of receivers. The finite element solutions are then computed in parallel with each subdomain carrying out goal-oriented adaptive mesh refinement independently. 
We validate the newly developed algorithm by comparison with controlled-source EM solutions for 1D layered models and with 2D results from our earlier 2D goal oriented adaptive refinement code named MARE2DEM. We demonstrate the performance and parallel scaling of this algorithm on a medium-scale computing cluster with a marine controlled-source EM example that includes a 3D array of receivers located over a 3D model that includes significant seafloor bathymetry variations and a heterogeneous subsurface.
Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray
2004-01-01
One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.
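To make the TFIDF weighting mentioned in this abstract concrete, the following sketch computes keyword weights per gene using the common textbook form, term frequency multiplied by log(N / document frequency). The exact variant, background set and stemming used by the authors are not given here, so this formula, the function name and the toy keyword lists are assumptions.

```python
import math
from collections import Counter


def tfidf_weights(gene_keywords):
    """Return {gene: {keyword: weight}} using tf * log(N / df).

    `gene_keywords` maps each gene to the list of keywords extracted
    from its literature (hypothetical input format).
    """
    n_genes = len(gene_keywords)
    # Document frequency: in how many genes' keyword lists a keyword occurs.
    doc_freq = Counter(k for kws in gene_keywords.values() for k in set(kws))
    weights = {}
    for gene, kws in gene_keywords.items():
        counts = Counter(kws)
        total = len(kws)
        weights[gene] = {
            k: (c / total) * math.log(n_genes / doc_freq[k])
            for k, c in counts.items()
        }
    return weights


# Toy keyword lists (hypothetical data).
example = {
    "geneA": ["kinase", "kinase", "membrane"],
    "geneB": ["membrane", "transport"],
    "geneC": ["membrane", "kinase"],
}
weights = tfidf_weights(example)
```

A keyword shared by every gene ("membrane" here) receives zero weight, which is why such a scheme favors keywords that discriminate between functional groups. The resulting weight dictionaries could serve as feature vectors for a clustering step.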
OAS1 Polymorphisms Are Associated with Susceptibility to West Nile Encephalitis in Horses
Rios, Jonathan J.; Fleming, JoAnn G. W.; Bryant, Uneeda K.; Carter, Craig N.; Huber, John C.; Long, Maureen T.; Spencer, Thomas E.; Adelson, David L.
2010-01-01
West Nile virus, first identified within the United States in 1999, has since spread across the continental states and infected birds, humans and domestic animals, resulting in numerous deaths. Previous studies in mice identified the Oas1b gene, a member of the OAS/RNASEL innate immune system, as a determining factor for resistance to West Nile virus (WNV) infection. A recent case-control association study described mutations of human OAS1 associated with clinical susceptibility to WNV infection. Similar studies in horses, a particularly susceptible species, have been lacking, in part, because of the difficulty in collecting populations sufficiently homogenous in their infection and disease states. The equine OAS gene cluster most closely resembles the human cluster, with single copies of OAS1, OAS3 and OAS2 in the same orientation. With naturally occurring susceptible and resistant sub-populations to lethal West Nile encephalitis, we undertook a case-control association study to investigate whether, similar to humans (OAS1) and mice (Oas1b), equine OAS1 plays a role in resistance to severe WNV infection. We identified naturally occurring single nucleotide mutations in equine (Equus caballus) OAS1 and RNASEL genes and, using Fisher's Exact test, we provide evidence that mutations in equine OAS1 contribute to host susceptibility. Virtually all of the associated OAS1 polymorphisms were located within the interferon-inducible promoter, suggesting that differences in OAS1 gene expression may determine the host's ability to resist clinical manifestations associated with WNV infection. PMID:20479874
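The case-control comparison mentioned in this abstract relies on Fisher's exact test. A minimal pure-Python sketch of the two-sided test for a 2x2 table is given below; the allele counts are made up for illustration and are not data from the study, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(low, high + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical counts: rows = cases / controls, columns = allele present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

The exact test is a natural choice when, as in small case-control samples, cell counts are too low for a chi-squared approximation to be reliable.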
2010-01-01
Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index.
Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
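The adjusted Rand index used above to score each clustering against known classes can be computed from pair counts alone. The sketch below is a plain-Python version of the standard Hubert-Arabie formula. The label vectors are toy values, not data from the seven microarray sets.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items.

    Returns 1.0 for identical partitions (up to label renaming) and
    values near 0 for agreement expected by chance.
    """
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Pairs placed together in both partitions, and in each one alone.
    sum_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (sum_both - expected) / (max_index - expected)

# Toy example: known classes vs two hypothetical clusterings.
known = [0, 0, 1, 1]
ari_same = adjusted_rand_index(known, [1, 1, 0, 0])   # same split, renamed labels
ari_cross = adjusted_rand_index(known, [0, 1, 0, 1])  # split cuts across classes
```

Because the index corrects for chance, the crossing split scores below zero (-0.5) rather than merely below one, which is why the measure suits comparisons across many analysis variants.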
Ortholog-based screening and identification of genes related to intracellular survival.
Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin
2018-04-20
Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.
Nowrousian, Minou
2009-04-01
During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.
Doublet, Benoît; Praud, Karine; Bertrand, Sophie; Collard, Jean-Marc; Weill, François-Xavier; Cloeckaert, Axel
2008-10-01
Salmonella genomic island 1 (SGI1) is an integrative mobilizable element that harbors a multidrug resistance (MDR) gene cluster. Since its identification in epidemic Salmonella enterica serovar Typhimurium DT104 strains, variant SGI1 MDR gene clusters conferring different MDR phenotypes have been identified in several S. enterica serovars and classified as SGI1-A to -O. A study was undertaken to characterize SGI1 from serovar Kentucky strains isolated from travelers returning from Africa. Several strains tested were found to contain the partially characterized variant SGI1-K, recently described in a serovar Kentucky strain isolated in Australia. This variant contained only one cassette array, aac(3)-Id-aadA7, and an adjacent mercury resistance module. Here, the uncharacterized part of SGI1-K was sequenced. Downstream of the mer module similar to that found in Tn21, a mosaic genetic structure was found, comprising (i) part of Tn1721 containing the tetracycline resistance genes tetR and tet(A); (ii) part of Tn5393 containing the streptomycin resistance genes strAB, IS1133, and a truncated tnpR gene; and (iii) a Tn3-like region containing the tnpR gene and the beta-lactamase bla(TEM-1) gene flanked by two IS26 elements in opposite orientations. The rightmost IS26 element was shown to be inserted into the S044 open reading frame of the SGI1 backbone. This variant MDR region was named SGI1-K1 according to the previously described variant SGI1-K. Other SGI1-K MDR regions due to different IS26 locations, inversion, and partial deletions were characterized and named SGI1-K2 to -K5. Two new SGI1 variants named SGI1-P1 and -P2 contained only the Tn3-like region comprising the beta-lactamase bla(TEM-1) gene flanked by the two IS26 elements inserted into the SGI1 backbone. Three other new variants harbored only one IS26 element inserted in place of the MDR region of SGI1 and were named SGI1-Q1 to -Q3. 
Thus, in serovar Kentucky, the SGI1 MDR region undergoes recombinational and insertional events of transposon and insertion sequences, resulting in a higher diversity of MDR gene clusters than previously reported and consequently a higher diversity of MDR phenotypes.
Poole, William; Leinonen, Kalle; Shmulevich, Ilya
2017-01-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady
2017-02-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.
The Health Cluster. Career Orientation Series.
ERIC Educational Resources Information Center
Ohio State Dept. of Education, Columbus. Div. of Vocational Education.
Developed to provide seventh and eighth grade students information about careers in the health occupational cluster, this booklet may be used to integrate career information with various subject areas. (It is one of several student booklets developed for use in the Ohio Career Orientation Program at grades 7 and 8 to assist students in making…
The Hospitality and Recreation Cluster. Career Orientation Series.
ERIC Educational Resources Information Center
Ohio State Dept. of Education, Columbus. Div. of Vocational Education.
Developed to provide seventh and eighth grade students information about careers in the hospitality and recreation occupational cluster, this booklet may be used to integrate career information with various subject areas. (It is one of several student booklets developed for use in the Ohio Career Orientation Program at grades 7 and 8 to assist…
The Personal Service Cluster. Career Orientation Series.
ERIC Educational Resources Information Center
Ohio State Dept. of Education, Columbus. Div. of Vocational Education.
Developed to provide seventh and eighth grade students information about careers in the personal service occupational cluster, this booklet may be used to integrate career information with various subject areas. (It is one of several student booklets developed for use in the Ohio Career Orientation Program at grades 7 and 8 to assist students in…
Díez-Villaseñor, César; Guzmán, Noemí M.; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J.M.
2013-01-01
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism. PMID:23445770
Díez-Villaseñor, César; Guzmán, Noemí M; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J M
2013-05-01
Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism.
Analysis of multiplex gene expression maps obtained by voxelation.
An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios
2009-04-29
Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
Haarmann, Thomas; Machado, Caroline; Lübbe, Yvonne; Correia, Telmo; Schardl, Christopher L; Panaccione, Daniel G; Tudzynski, Paul
2005-06-01
The genomic region of Claviceps purpurea strain P1 containing the ergot alkaloid gene cluster [Tudzynski, P., Hölter, K., Correia, T., Arntz, C., Grammel, N., Keller, U., 1999. Evidence for an ergot alkaloid gene cluster in Claviceps purpurea. Mol. Gen. Genet. 261, 133-141] was explored by chromosome walking, and additional genes probably involved in the ergot alkaloid biosynthesis have been identified. The putative cluster sequence (extending over 68.5kb) contains 4 different nonribosomal peptide synthetase (NRPS) genes and several putative oxidases. Northern analysis showed that most of the genes were co-regulated (repressed by high phosphate), and identified probable flanking genes by lack of co-regulation. Comparison of the cluster sequences of strain P1, an ergotamine producer, with that of strain ECC93, an ergocristine producer, showed high conservation of most of the cluster genes, but significant variation in the NRPS modules, strongly suggesting that evolution of these chemical races of C. purpurea is determined by evolution of NRPS module specificity.
Patel, Vidushi S; Cooper, Steven J B; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer A M
2008-07-25
Vertebrate alpha (alpha)- and beta (beta)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the alpha- and beta-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil beta-globin gene (omega) in the marsupial alpha-cluster, however, suggested that duplication of the alpha-beta cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous alpha- and beta-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. The platypus alpha-globin cluster (chromosome 21) contains embryonic and adult alpha- globin genes, a beta-like omega-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-zeta-zeta'-alphaD-alpha3-alpha2-alpha1-omega-GBY-3'. The platypus beta-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-epsilon-beta-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate alpha-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal beta-globin clusters are embedded in olfactory genes. Thus, the mammalian alpha- and beta-globin clusters are orthologous to the bird alpha- and beta-globin clusters respectively. We propose that alpha- and beta-globin clusters evolved from an ancient MPG-C16orf35-alpha-beta-GBY-LUC7L arrangement 410 million years ago. 
A copy of the original beta (represented by omega in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of beta-globin genes with different expression profiles in different lineages.
Yasue, Akihiro; Mitsui, Silvia Naomi; Watanabe, Takahito; Sakuma, Tetsushi; Oyadomari, Seiichi; Yamamoto, Takashi; Noji, Sumihare; Mito, Taro; Tanaka, Eiji
2014-07-16
Since the establishment of embryonic stem (ES) cell lines, the combined use of gene targeting with homologous recombination has aided in elucidating the functions of various genes. However, the ES cell technique is inefficient and time-consuming. Recently, two new gene-targeting technologies have been developed: the transcription activator-like effector nuclease (TALEN) system, and the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) system. In addition to aiding researchers in solving conventional problems, these technologies can be used to induce site-specific mutations in various species for which ES cells have not been established. Here, by targeting the Fgf10 gene through RNA microinjection in one-cell mouse embryos with the TALEN and CRISPR/Cas systems, we produced the known limb-defect phenotypes of Fgf10-deficient embryos at the F0 generation. Compared to the TALEN system, the CRISPR/Cas system induced the limb-defect phenotypes with a strikingly higher efficiency. Our results demonstrate that although both gene-targeting technologies are useful, the CRISPR/Cas system more effectively elicits single-step biallelic mutations in mice.
Garrido, Daniel; Ruiz-Moyano, Santiago; Kirmiz, Nina; Davis, Jasmine C.; Totten, Sarah M.; Lemay, Danielle G.; Ugalde, Juan A.; German, J. Bruce; Lebrilla, Carlito B.; Mills, David A.
2016-01-01
The infant intestinal microbiota is often colonized by two subspecies of Bifidobacterium longum: subsp. infantis (B. infantis) and subsp. longum (B. longum). Competitive growth of B. infantis in the neonate intestine has been linked to the utilization of human milk oligosaccharides (HMO). However, little is known about how B. longum consumes HMO. In this study, infant-borne B. longum strains exhibited varying HMO growth phenotypes. While all strains efficiently utilized lacto-N-tetraose, certain strains additionally metabolized fucosylated HMO. B. longum SC596 grew vigorously on HMO, and glycoprofiling revealed a preference for consumption of fucosylated HMO. Transcriptomes of SC596 during early-stage growth on HMO were more similar to growth on fucosyllactose, transiting later to a pattern similar to growth on neutral HMO. B. longum SC596 contains a novel gene cluster devoted to the utilization of fucosylated HMO, including genes for import of fucosylated molecules, fucose metabolism and two α-fucosidases. This cluster showed a modular induction during early growth on HMO and fucosyllactose. This work clarifies the genomic and physiological variation of infant-borne B. longum to HMO consumption, which resembles B. infantis. The capability to preferentially consume fucosylated HMO suggests a competitive advantage for these unique B. longum strains in the breast-fed infant gut. PMID:27756904
Garrido, Daniel; Ruiz-Moyano, Santiago; Kirmiz, Nina; Davis, Jasmine C; Totten, Sarah M; Lemay, Danielle G; Ugalde, Juan A; German, J Bruce; Lebrilla, Carlito B; Mills, David A
2016-10-19
The infant intestinal microbiota is often colonized by two subspecies of Bifidobacterium longum: subsp. infantis (B. infantis) and subsp. longum (B. longum). Competitive growth of B. infantis in the neonate intestine has been linked to the utilization of human milk oligosaccharides (HMO). However, little is known about how B. longum consumes HMO. In this study, infant-borne B. longum strains exhibited varying HMO growth phenotypes. While all strains efficiently utilized lacto-N-tetraose, certain strains additionally metabolized fucosylated HMO. B. longum SC596 grew vigorously on HMO, and glycoprofiling revealed a preference for consumption of fucosylated HMO. Transcriptomes of SC596 during early-stage growth on HMO were more similar to growth on fucosyllactose, transiting later to a pattern similar to growth on neutral HMO. B. longum SC596 contains a novel gene cluster devoted to the utilization of fucosylated HMO, including genes for import of fucosylated molecules, fucose metabolism and two α-fucosidases. This cluster showed a modular induction during early growth on HMO and fucosyllactose. This work clarifies the genomic and physiological variation of infant-borne B. longum to HMO consumption, which resembles B. infantis. The capability to preferentially consume fucosylated HMO suggests a competitive advantage for these unique B. longum strains in the breast-fed infant gut.
Unveiling the biotransformation mechanism of indole in a Cupriavidus sp. strain.
Qu, Yuanyuan; Ma, Qiao; Liu, Ziyan; Wang, Weiwei; Tang, Hongzhi; Zhou, Jiti; Xu, Ping
2017-12-01
Indole, an important signaling molecule as well as a typical N-heterocyclic aromatic pollutant, is widespread in nature. However, the biotransformation mechanisms of indole are still poorly studied. Here, we sought to unlock the genetic determinants of indole biotransformation in strain Cupriavidus sp. SHE based on genomics, proteomics and functional studies. A total of 177 proteins were notably altered (118 up- and 59 downregulated) in cells grown in indole mineral salt medium when compared with that in sodium citrate medium. RT-qPCR and gene knockout assays demonstrated that an indole oxygenase gene cluster was responsible for the indole upstream metabolism. A functional indole oxygenase, termed IndA, was identified in the cluster, and its catalytic efficiency was higher than those of previously reported indole oxidation enzymes. Furthermore, the indole downstream metabolism was found to proceed via the atypical CoA-thioester pathway rather than conventional gentisate and salicylate pathways. This unusual pathway was catalyzed by a conserved 2-aminobenzoyl-CoA gene cluster, among which the 2-aminobenzoyl-CoA ligase initiated anthranilate transformation. This study unveils the genetic determinants of indole biotransformation and will provide new insights into our understanding of indole biodegradation in natural environments and its functional studies. © 2017 John Wiley & Sons Ltd.
RRW: repeated random walks on genome-scale protein networks for local cluster discovery
Macropol, Kathy; Can, Tolga; Singh, Ambuj K
2009-01-01
Background We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long range interactions between proteins. Results We apply the proposed technique on a functional network of yeast genes and accurately identify statistically significant clusters of proteins. We validate the biological significance of the results using known complexes in the MIPS complex catalogue database and well-characterized biological processes. We find that 90% of the created clusters have the majority of their catalogued proteins belonging to the same MIPS complex, and about 80% have the majority of their proteins involved in the same biological process. We compare our method to various other clustering techniques, such as the Markov Clustering Algorithm (MCL), and find a significant improvement in the RRW clusters' precision and accuracy values. Conclusion RRW, which is a technique that exploits the topology of the network, is more precise and robust in finding local clusters. In addition, it has the added flexibility of being able to find multi-functional proteins by allowing overlapping clusters. PMID:19740439
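The abstract does not spell out the RRW procedure, so the sketch below shows only a generic random walk with restart on a small weighted graph, a common way to score how close each node is to a seed node using topology and edge weights. The function name, the restart probability of 0.3, and the toy protein network are all assumptions for illustration and should not be read as the authors' algorithm.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    adj: dict node -> dict neighbour -> edge weight (hypothetical network);
         every neighbour must also be a key of adj.
    At each step the walker jumps back to the seed with probability
    `restart`, otherwise it follows an edge chosen in proportion to weight.
    """
    nodes = list(adj)
    out_weight = {u: sum(adj[u].values()) for u in nodes}
    prob = {u: 1.0 if u == seed else 0.0 for u in nodes}
    for _ in range(max_iter):
        nxt = {u: (restart if u == seed else 0.0) for u in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: send its mass back to the seed.
                nxt[seed] += (1 - restart) * prob[u]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * prob[u] * w / out_weight[u]
        change = sum(abs(nxt[u] - prob[u]) for u in nodes)
        prob = nxt
        if change < tol:
            break
    return prob

# Toy network: two tight triangles joined by one weak edge (P3-P4).
network = {
    "P1": {"P2": 1.0, "P3": 1.0},
    "P2": {"P1": 1.0, "P3": 1.0},
    "P3": {"P1": 1.0, "P2": 1.0, "P4": 0.1},
    "P4": {"P3": 0.1, "P5": 1.0, "P6": 1.0},
    "P5": {"P4": 1.0, "P6": 1.0},
    "P6": {"P4": 1.0, "P5": 1.0},
}
scores = random_walk_with_restart(network, seed="P1")
```

Nodes in the seed's own triangle end up with much higher scores than those across the weak edge, so ranking nodes by score and keeping the top few gives a local cluster around the seed. Repeating the walk from different seeds would allow overlapping clusters.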
Long, Nicole M.; Kahana, Michael J.
2016-01-01
Although episodic and semantic memory share overlapping neural mechanisms, it remains unclear how our pre-existing semantic associations modulate the formation of new, episodic associations. When freely recalling recently studied words, people rely on both episodic and semantic associations, shown through temporal and semantic clustering of responses. We asked whether orienting participants toward semantic associations interferes with or facilitates the formation of episodic associations. We compared electroencephalographic (EEG) activity recorded during the encoding of subsequently recalled words that were either temporally or semantically clustered. Participants studied words with or without a concurrent semantic orienting task. We identified a neural signature of successful episodic association formation whereby high frequency EEG activity (HFA, 44 – 100 Hz) overlying left prefrontal regions increased for subsequently temporally clustered words, but only for those words studied without a concurrent semantic orienting task. To confirm that this disruption in the formation of episodic associations was driven by increased semantic processing, we measured the neural correlates of subsequent semantic clustering. We found that HFA increased for subsequently semantically clustered words only for lists with a concurrent semantic orienting task. This dissociation suggests that increased semantic processing of studied items interferes with the neural processes that support the formation of novel episodic associations. PMID:27617775
Long, Nicole M; Kahana, Michael J
2017-02-01
Although episodic and semantic memory share overlapping neural mechanisms, it remains unclear how our pre-existing semantic associations modulate the formation of new, episodic associations. When freely recalling recently studied words, people rely on both episodic and semantic associations, shown through temporal and semantic clustering of responses. We asked whether orienting participants toward semantic associations interferes with or facilitates the formation of episodic associations. We compared electroencephalographic (EEG) activity recorded during the encoding of subsequently recalled words that were either temporally or semantically clustered. Participants studied words with or without a concurrent semantic orienting task. We identified a neural signature of successful episodic association formation whereby high-frequency EEG activity (HFA, 44-100 Hz) overlying left prefrontal regions increased for subsequently temporally clustered words, but only for those words studied without a concurrent semantic orienting task. To confirm that this disruption in the formation of episodic associations was driven by increased semantic processing, we measured the neural correlates of subsequent semantic clustering. We found that HFA increased for subsequently semantically clustered words only for lists with a concurrent semantic orienting task. This dissociation suggests that increased semantic processing of studied items interferes with the neural processes that support the formation of novel episodic associations. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
CORM: An R Package Implementing the Clustering of Regression Models Method for Gene Clustering
Shi, Jiejun; Qin, Li-Xuan
2014-01-01
We report a new R package implementing the clustering of regression models (CORM) method for clustering genes using gene expression data and provide data examples illustrating each clustering function in the package. The CORM package is freely available at CRAN from http://cran.r-project.org. PMID:25452684
Clustering approaches to identifying gene expression patterns from DNA microarray data.
Do, Jin Hwan; Choi, Dong-Kug
2008-04-30
The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
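To make the role of the user-selectable parameters concrete, here is a minimal K-means sketch (Lloyd's algorithm with squared Euclidean distance) on made-up expression profiles. The data, the choice of initial centers, and the function name are illustrative assumptions, not taken from the review.

```python
def kmeans(profiles, centers, iters=20):
    """Lloyd's algorithm; returns (labels, centers).

    profiles: list of expression vectors, one per gene.
    centers:  initial cluster centers chosen by the user; their number fixes k.
    """
    centers = [list(c) for c in centers]
    labels = [0] * len(profiles)
    for _ in range(iters):
        # Assignment step: each gene joins its nearest center.
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centers[k])),
            )
            for p in profiles
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Four hypothetical genes measured under three conditions.
profiles = [
    [1.0, 2.0, 3.0],  # rising
    [1.1, 2.1, 2.9],  # rising
    [3.0, 2.0, 1.0],  # falling
    [2.9, 2.2, 1.0],  # falling
]
labels, centers = kmeans(profiles, centers=[profiles[0], profiles[2]])
```

Both the number of centers and their starting positions are inputs here, so rerunning with different choices can give a different partition of the same data.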
Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru
2015-01-01
Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.
Anitha, P; Anbarasu, Anand; Ramaiah, Sudha
2014-05-01
Multi Drug Resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in hospital environment. Multidrug-resistance in A. baumannii may be due to the implementation of multi-combination resistance mechanisms such as β-lactamase synthesis, Penicillin-Binding Proteins (PBPs) changes, alteration in porin proteins and in efflux pumps against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions in antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis on various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting gene/protein (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated protein we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach on resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance. Copyright © 2014 Elsevier Ltd. All rights reserved.
Bowen, Alice M; Johnson, Eachan O D; Mercuri, Francesco; Hoskins, Nicola J; Qiao, Ruihong; McCullagh, James S O; Lovett, Janet E; Bell, Stephen G; Zhou, Weihong; Timmel, Christiane R; Wong, Luet Lok; Harmer, Jeffrey R
2018-02-21
Cytochrome P450 (CYP) monooxygenases catalyze the oxidation of chemically inert carbon-hydrogen bonds in diverse endogenous and exogenous organic compounds by atmospheric oxygen. This C-H bond oxy-functionalization activity has huge potential in biotechnological applications. Class I CYPs receive the two electrons required for oxygen activation from NAD(P)H via a ferredoxin reductase and ferredoxin. The interaction of Class I CYPs with their cognate ferredoxin is specific. In order to reconstitute the activity of diverse CYPs, structural characterization of CYP-ferredoxin complexes is necessary, but little structural information is available. Here we report a structural model of such a complex (CYP199A2-HaPux) in frozen solution derived from distance and orientation restraints gathered by the EPR technique of orientation-selective double electron-electron resonance (os-DEER). The long-lived oscillations in the os-DEER spectra were well modeled by a single orientation of the CYP199A2-HaPux complex. The structure is different from the two known Class I CYP-Fdx structures: CYP11A1-Adx and CYP101A1-Pdx. At the protein interface, HaPux residues in the [Fe2S2] cluster-binding loop and the α3 helix and the C-terminus residue interact with CYP199A2 residues in the proximal loop and the C helix. These residue contacts are consistent with biochemical data on CYP199A2-ferredoxin binding and electron transfer. Electron-tunneling calculations indicate an efficient electron-transfer pathway from the [Fe2S2] cluster to the heme. This new structural model of a CYP-Fdx complex provides the basis for tailoring CYP enzymes for which the cognate ferredoxin is not known, to accept electrons from HaPux and display monooxygenase activity.
A Self-Organizing Spatial Clustering Approach to Support Large-Scale Network RTK Systems.
Shen, Lili; Guo, Jiming; Wang, Lei
2018-06-06
The network real-time kinematic (RTK) technique can provide centimeter-level real time positioning solutions and play a key role in geo-spatial infrastructure. With ever-increasing popularity, network RTK systems will face issues in the support of large numbers of concurrent users. In the past, high-precision positioning services were oriented towards professionals and only supported a few concurrent users. Currently, precise positioning provides a spatial foundation for artificial intelligence (AI), and countless smart devices (autonomous cars, unmanned aerial-vehicles (UAVs), robotic equipment, etc.) require precise positioning services. Therefore, the development of approaches to support large-scale network RTK systems is urgent. In this study, we proposed a self-organizing spatial clustering (SOSC) approach which automatically clusters online users to reduce the computational load on the network RTK system server side. The experimental results indicate that both the SOSC algorithm and the grid algorithm can reduce the computational load efficiently, while the SOSC algorithm gives a more elastic and adaptive clustering solution with different datasets. The SOSC algorithm determines the cluster number and the mean distance to cluster center (MDTCC) according to the data set, while the grid approaches are all predefined. The side-effects of clustering algorithms on the user side are analyzed with real global navigation satellite system (GNSS) data sets. The experimental results indicate that 10 km can be safely used as the cluster radius threshold for the SOSC algorithm without significantly reducing the positioning precision and reliability on the user side.
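The idea of grouping online users under a cluster radius threshold can be sketched with a simple greedy "leader" clustering. This is not the SOSC algorithm described in the abstract; the planar kilometre coordinates, the function names, and the rule that the first user of a cluster acts as its center are all assumptions for illustration. Only the 10 km threshold comes from the abstract.

```python
import math


def leader_clusters(users, radius_km=10.0):
    """Greedy clustering: a user joins the first cluster whose center lies
    within `radius_km`, otherwise the user starts a new cluster."""
    clusters = []  # each entry: {"center": (x, y), "members": [(x, y), ...]}
    for user in users:
        for cluster in clusters:
            if math.dist(user, cluster["center"]) <= radius_km:
                cluster["members"].append(user)
                break
        else:
            clusters.append({"center": user, "members": [user]})
    return clusters


def mean_distance_to_center(cluster):
    """Mean distance to cluster center (MDTCC) for one cluster."""
    members = cluster["members"]
    return sum(math.dist(m, cluster["center"]) for m in members) / len(members)


# Hypothetical user positions in a local planar frame, in kilometres.
users = [(0.0, 0.0), (3.0, 4.0), (30.0, 0.0), (33.0, 4.0), (1.0, 1.0)]
clusters = leader_clusters(users)
mdtcc = [mean_distance_to_center(c) for c in clusters]
```

With one set of corrections computed per cluster rather than per user, the server-side load scales with the number of clusters, which is the motivation given in the abstract.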
Efficient and Heritable Gene Targeting in Tilapia by CRISPR/Cas9
Li, Minghui; Yang, Huihui; Zhao, Jiue; Fang, Lingling; Shi, Hongjuan; Li, Mengru; Sun, Yunlv; Zhang, Xianbo; Jiang, Dongneng; Zhou, Linyan; Wang, Deshou
2014-01-01
Studies of gene function in non-model animals have been limited by the approaches available for eliminating gene function. The CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR associated) system has recently become a powerful tool for targeted genome editing. Here, we report the use of the CRISPR/Cas9 system to disrupt selected genes, including nanos2, nanos3, dmrt1, and foxl2, with efficiencies as high as 95%. In addition, mutations in dmrt1 and foxl2 induced by CRISPR/Cas9 were efficiently transmitted through the germline to F1. Obvious phenotypes were observed in the G0 generation after mutation of germ cell or somatic cell-specific genes. For example, loss of Nanos2 and Nanos3 in XY and XX fish resulted in germ cell-deficient gonads as demonstrated by GFP labeling and Vasa staining, respectively, while masculinization of somatic cells in both XY and XX gonads was demonstrated by Dmrt1 and Cyp11b2 immunohistochemistry and by up-regulation of serum androgen levels. Our data demonstrate that targeted, heritable gene editing can be achieved in tilapia, providing a convenient and effective approach for generating loss-of-function mutants. Furthermore, our study shows the utility of the CRISPR/Cas9 system for genetic engineering in non-model species like tilapia and potentially in many other teleost species. PMID:24709635
Nykyri, Johanna; Mattinen, Laura; Niemi, Outi; Adhikari, Satish; Kõiv, Viia; Somervuo, Panu; Fang, Xin; Auvinen, Petri; Mäe, Andres; Palva, E. Tapio; Pirhonen, Minna
2013-01-01
In this study, we characterized a putative Flp/Tad pilus-encoding gene cluster, and we examined its regulation at the transcriptional level and its role in the virulence of potato pathogenic enterobacteria of the genus Pectobacterium. The Flp/Tad pilus-encoding gene clusters in Pectobacterium atrosepticum, Pectobacterium wasabiae and Pectobacterium aroidearum were compared to previously characterized flp/tad gene clusters, including that of the well-studied Flp/Tad pilus model organism Aggregatibacter actinomycetemcomitans, in which this pilus is a major virulence determinant. Comparative analyses revealed substantial protein sequence similarity and open reading frame synteny between the previously characterized flp/tad gene clusters and the cluster in Pectobacterium, suggesting that the predicted flp/tad gene cluster in Pectobacterium encodes a Flp/Tad pilus-like structure. We detected genes for a novel two-component system adjacent to the flp/tad gene cluster in Pectobacterium, and mutant analysis demonstrated that this system has a positive effect on the transcription of selected Flp/Tad pilus biogenesis genes, suggesting that this response regulator regulates the flp/tad gene cluster. Mutagenesis of either the predicted regulator gene or selected Flp/Tad pilus biogenesis genes had a significant impact on the maceration ability of the bacterial strains in potato tubers, indicating that the Flp/Tad pilus-encoding gene cluster represents a novel virulence determinant in Pectobacterium. Soft-rot enterobacteria in the genera Pectobacterium and Dickeya are of great agricultural importance, and an investigation of the virulence of these pathogens could facilitate improvements in agricultural practices, thus benefiting farmers, the potato industry and consumers. PMID:24040039
2015-01-01
Background Cellular processes are known to be modular and are realized by groups of proteins implicated in common biological functions. Such groups of proteins are called functional modules, and many community detection methods have been devised for their discovery from protein interaction network (PIN) data. In current agglomerative clustering approaches, vertices with very few neighbors are often classified as separate clusters, which does not make sense biologically. Also, a major limitation of agglomerative techniques is that their computational efficiency does not scale well to large PINs. Finally, PIN data obtained from large-scale experiments generally contain many false positives, and this makes it hard for agglomerative clustering methods to find the correct clusters, since they are known to be sensitive to noisy data. Results We propose a local similarity premetric, the relative vertex clustering value, as a new criterion for deciding when a node can be added to a given node's cluster; it addresses the above three issues. Based on this criterion, we introduce a novel and very fast agglomerative clustering technique, FAC-PIN, for discovering functional modules and protein complexes from PIN data. Conclusions Our proposed FAC-PIN algorithm is applied to nine PIN datasets from eight different species including the yeast PIN, and the identified functional modules are validated using Gene Ontology (GO) annotations from DAVID Bioinformatics Resources. Identified protein complexes are also validated using experimentally verified complexes. Computational results show that FAC-PIN can discover functional modules or protein complexes from PINs more accurately and more efficiently than HC-PIN and CNM, the current state-of-the-art approaches for clustering PINs in an agglomerative manner. PMID:25734691
Mi, Shuofu; Jia, Xiaojing; Wang, Jinzhi; Qiao, Weibo; Peng, Xiaowei; Han, Yejun
2014-01-01
The xylanolytic extremely thermophilic bacterium Caldicellulosiruptor owensensis provides a promising platform for xylan utilization. In the present study, two novel xylanolytic enzymes, GH10 endo-β-1,4-xylanase (Coxyn A) and GH39 β-1,4-xylosidase (Coxyl A), encoded in one gene cluster of C. owensensis, were heterologously expressed and biochemically characterized. The optimum temperature of the two xylanolytic enzymes was 75°C, and the optimum pH for Coxyn A and Coxyl A was 7.0 and 5.0, respectively. In solution, Coxyn A existed as a monomer and Coxyl A as a homodimer, a difference that was also observed in their predicted secondary structures. Under optimum conditions, the catalytic efficiency (kcat/Km) of Coxyn A was 366 mg ml(-1) s(-1) on beechwood xylan, and that of Coxyl A was 2253 mM(-1) s(-1) on pNP-β-D-xylopyranoside. Coxyn A degraded xylan to oligosaccharides, which were converted to monomers by Coxyl A. The two intracellular enzymes might be responsible for xylooligosaccharide utilization in C. owensensis and also provide a potential route for xylan degradation in vitro.
NASA Technical Reports Server (NTRS)
Zhang, Ye; Wong, Michael; Hada, Megumi; Wu, Honglu
2015-01-01
Microgravity has been shown to alter global gene expression patterns and protein levels both in cultured cells and animal models. It has been suggested that the packaging of chromatin fibers in the interphase nucleus is closely related to genome function, and the changes in transcriptional activity are tightly correlated with changes in chromatin folding. This study explores the changes of chromatin conformation and chromatin-chromatin interactions in the simulated microgravity environment, and investigates their correlation to the expression of genes located at different regions of the chromosome. To investigate the folding of chromatin in interphase under various culture conditions, human epithelial cells, fibroblasts, and lymphocytes were fixed in the G1 phase. Interphase chromosomes were hybridized with a multicolor banding in situ hybridization (mBAND) probe for chromosome 3 which distinguishes six regions of the chromosome as separate colors. After images were captured with a laser scanning confocal microscope, the 3-dimensional structure of interphase chromosome 3 was reconstructed at the multi-megabase-pair scale. In order to determine the effects of microgravity on chromosome conformation and orientation, measures such as distance between homologous pairs, relative orientation of chromosome arms about a shared midpoint, and orientation of arms within individual chromosomes were all considered as potentially impacted by simulated microgravity conditions. The studies revealed non-random folding of chromatin in interphase, and suggested an association of interphase chromatin folding with radiation-induced chromosome aberration hotspots. Interestingly, the distributions of genes with expression changes over chromosome 3 in cells cultured under a microgravity environment are apparently clustered on specific loci and chromosomes. These data provide important insights into how mammalian cells respond to microgravity at the molecular level.
An effective fuzzy kernel clustering analysis approach for gene expression data.
Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao
2015-01-01
Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering methods to microarray gene expression data is the choice of parameters, namely the cluster number and the cluster centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies the desired cluster number and obtains more stable results for gene expression data. First, to optimize characteristic differences and estimate the optimal cluster number, a Gaussian kernel function is introduced to improve the spectrum analysis method (SAM). By combining subtractive clustering with the max-min distance mean, the maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of the improved SAM (ISAM) and MDM are given, and their superiority and stability are illustrated through experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and the UCI database show that the proposed algorithms are feasible for cluster analysis, and that the clustering accuracy is higher than that of the other related clustering algorithms.
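For readers unfamiliar with fuzzy kernel clustering, the sketch below computes fuzzy memberships of one sample with respect to fixed cluster centers, using a Gaussian kernel and the kernel-induced distance 2(1 - K(x, v)) from one common kernel fuzzy c-means formulation. It is not the FKCA, ISAM, or MDM procedure of the abstract; the fuzzifier `m`, the width `sigma`, the sample values, and the function names are assumptions.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between a sample x and a center v."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def fuzzy_memberships(x, centers, m=2.0, sigma=1.0):
    """Membership of x in each cluster; the values sum to 1.

    Uses the kernel-induced squared distance d^2 = 2 * (1 - K(x, v)).
    """
    d2 = [2.0 * (1.0 - gaussian_kernel(x, v, sigma)) for v in centers]
    if any(d == 0.0 for d in d2):
        # x coincides with at least one center: share membership among those.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]


centers = [[0.0, 0.0], [3.0, 3.0]]           # hypothetical cluster centers
near_first = fuzzy_memberships([0.5, 0.5], centers)
midway = fuzzy_memberships([1.5, 1.5], centers)
```

Unlike a crisp assignment, a sample lying between two centers receives partial membership in both, which is why the choice of cluster number and centers matters so much for the result.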
Hu, Peinan; Zhao, Xueying; Zhang, Qinghua; Li, Weiming; Zu, Yao
2018-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system has been proven to be an efficient and precise genome editing technology in various organisms. However, the gene editing efficiencies of Cas9 proteins with a nuclear localization signal (NLS) fused to different termini and Cas9 mRNA have not been systematically compared. Here, we compared the ability of Cas9 proteins with NLS fused to the N-, C-, or both the N- and C-termini and N-NLS-Cas9-NLS-C mRNA to target two sites in the tyr gene and two sites in the gol gene related to pigmentation in zebrafish. Phenotypic analysis revealed that all types of Cas9 led to hypopigmentation in similar proportions of injected embryos. Genome analysis by T7 Endonuclease I (T7E1) assays demonstrated that all types of Cas9 similarly induced mutagenesis in four target sites. Sequencing results further confirmed that a high frequency of indels occurred in the target sites (tyr1 > 66%, tyr2 > 73%, gol1 > 50%, and gol2 > 35%), as well as various types (more than six) of indel mutations observed in all four types of Cas9-injected embryos. Furthermore, all types of Cas9 showed efficient targeted mutagenesis on multiplex genome editing, resulting in multiple phenotypes simultaneously. Collectively, we conclude that various NLS-fused Cas9 proteins and Cas9 mRNAs have similar genome editing efficiencies on targeting single or multiple genes, suggesting that the efficiency of CRISPR/Cas9 genome editing is highly dependent on guide RNAs (gRNAs) and gene loci. These findings may help to simplify the selection of Cas9 for gene editing using the CRISPR/Cas9 system. PMID:29295818
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.
Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki
2013-07-09
The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with useful annotation information with easy-to-use web interfaces, which helps researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase will be continuously updated and additional genomic/transcriptomic resources and analysis tools will be provided for further efficient analysis of the mechanism of insecticide resistance and the development of effective insecticides with a novel mode of action for DBM.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhai, Ying; Bai, Silei; Liu, Jingjing
Dithiolopyrrolone group antibiotics characterized by an electronically unique dithiolopyrrolone heterobicyclic core are known for their antibacterial, antifungal, insecticidal and antitumor activities. Recently the biosynthetic gene clusters for two dithiolopyrrolone compounds, holomycin and thiomarinol, have been identified respectively in different bacterial species. Here, we report a novel dithiolopyrrolone biosynthetic gene cluster (aut) isolated from Streptomyces thioluteus DSM 40027 which produces two pyrrothine derivatives, aureothricin and thiolutin. By comparison with other characterized dithiolopyrrolone clusters, eight genes in the aut cluster were verified to be responsible for the assembly of dithiolopyrrolone core. The aut cluster was further confirmed by heterologous expression and in-frame gene deletion experiments. Intriguingly, we found that the heterogenetic thioesterase HlmK derived from the holomycin (hlm) gene cluster in Streptomyces clavuligerus significantly improved heterologous biosynthesis of dithiolopyrrolones in Streptomyces albus through coexpression with the aut cluster. In the previous studies, HlmK was considered invalid because it has a Ser to Gly point mutation within the canonical Ser-His-Asp catalytic triad of thioesterases. However, gene inactivation and complementation experiments in our study unequivocally demonstrated that HlmK is an active distinctive type II thioesterase that plays a beneficial role in dithiolopyrrolone biosynthesis. - Highlights: • Cloning of the aureothricin biosynthetic gene cluster from Streptomyces thioluteus DSM 40027. • Identification of the aureothricin gene cluster by heterologous expression and in-frame gene deletion. • The heterogenetic thioesterase HlmK significantly improved dithiolopyrrolones production of the aureothricin gene cluster. • Identification of HlmK as an unusual type II thioesterase.
Osborne, Peter W; Benoit, Gérard; Laudet, Vincent; Schubert, Michael; Ferrier, David E K
2009-03-01
The ParaHox cluster is the evolutionary sister to the Hox cluster. Like the Hox cluster, the ParaHox cluster displays spatial and temporal regulation of the component genes along the anterior/posterior axis in a manner that correlates with the gene positions within the cluster (a feature called collinearity). The ParaHox cluster is however a simpler system to study because it is composed of only three genes. We provide a detailed analysis of the amphioxus ParaHox cluster and, for the first time in a single species, examine the regulation of the cluster in response to a single developmental signalling molecule, retinoic acid (RA). Embryos treated with either RA or RA antagonist display altered ParaHox gene expression: AmphiGsx expression shifts in the neural tube, and the endodermal boundary between AmphiXlox and AmphiCdx shifts its anterior/posterior position. We identified several putative retinoic acid response elements and in vitro assays suggest some may participate in RA regulation of the ParaHox genes. By comparison to vertebrate ParaHox gene regulation we explore the evolutionary implications. This work highlights how insights into the regulation and evolution of more complex vertebrate arrangements can be obtained through studies of a simpler, unduplicated amphioxus gene cluster.
A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression
Nguyen, Nha; Vo, An; Choi, Inchan
2015-01-01
Abstract Studying epigenetic landscapes is important to understand the condition for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches to maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of stationary wavelet of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in the assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910
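The notion of scoring a signal by the entropy of its wavelet coefficients can be illustrated with a very small stand-in: a single-level, undecimated Haar-style difference with circular boundary handling, followed by the Shannon entropy of the normalized coefficient energies. This is not the Dewer method from the abstract; the transform depth, the boundary handling, the toy signals, and the function names are all assumptions.

```python
import math


def haar_detail(signal):
    """Single-level undecimated Haar-style detail coefficients
    (scaled neighbor differences, circular at the boundary)."""
    n = len(signal)
    return [(signal[i] - signal[(i + 1) % n]) / math.sqrt(2.0) for i in range(n)]


def wavelet_entropy(signal):
    """Shannon entropy of the normalized detail-coefficient energies."""
    energies = [d * d for d in haar_detail(signal)]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # a flat signal carries no detail energy
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0.0)


# Toy "epigenetic signal" tracks; values are made up.
flat = [1.0] * 8
spike = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
alternating = [0.0, 1.0] * 4

entropies = {
    "flat": wavelet_entropy(flat),
    "spike": wavelet_entropy(spike),
    "alternating": wavelet_entropy(alternating),
}
```

A window average would give the spike and the alternating track similar summary values, whereas the entropy separates them because it reflects how the changes are distributed along the region.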
de Lima-Morales, Daiana; Chaves-Moreno, Diego; Wos-Oxley, Melissa L; Jáuregui, Ruy; Vilchez-Vargas, Ramiro; Pieper, Dietmar H
2016-01-01
Pseudomonas veronii 1YdBTEX2, a benzene and toluene degrader, and Pseudomonas veronii 1YB2, a benzene degrader, have previously been shown to be key players in a benzene-contaminated site. These strains harbor unique catabolic pathways for the degradation of benzene comprising a gene cluster encoding an isopropylbenzene dioxygenase where genes encoding downstream enzymes were interrupted by stop codons. Extradiol dioxygenases were recruited from gene clusters comprising genes encoding a 2-hydroxymuconic semialdehyde dehydrogenase necessary for benzene degradation but typically absent from isopropylbenzene dioxygenase-encoding gene clusters. The benzene dihydrodiol dehydrogenase-encoding gene was not clustered with any other aromatic degradation genes, and the encoded protein was only distantly related to dehydrogenases of aromatic degradation pathways. The involvement of the different gene clusters in the degradation pathways was suggested by real-time quantitative reverse transcription PCR. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
NASA Astrophysics Data System (ADS)
Lau, Yun-Fai; Kan, Yuet Wai
1983-09-01
We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.
The DOPA decarboxylase (DDC) gene is associated with alerting attention.
Zhu, Bi; Chen, Chuansheng; Moyzis, Robert K; Dong, Qi; Chen, Chunhui; He, Qinghua; Li, Jin; Li, Jun; Lei, Xuemei; Lin, Chongde
2013-06-03
DOPA decarboxylase (DDC) is involved in the synthesis of dopamine, norepinephrine and serotonin. It has been suggested that genes involved in the dopamine, norepinephrine, and cholinergic systems play an essential role in the efficiency of human attention networks. Attention refers to the cognitive process of obtaining and maintaining the alert state, orienting to sensory events, and regulating the conflicts of thoughts and behavior. The present study tested seven single nucleotide polymorphisms (SNPs) within the DDC gene for association with attention, which was assessed by the Attention Network Test to detect three networks of attention, including alerting, orienting, and executive attention, in a healthy Han Chinese sample (N=451). Association analysis for individual SNPs indicated that four of the seven SNPs (rs3887825, rs7786398, rs10499695, and rs6969081) were significantly associated with alerting attention. Haplotype-based association analysis revealed that alerting was associated with the haplotype G-A-T for SNPs rs7786398-rs10499695-rs6969081. These associations remained significant after correcting for multiple testing by max(T) permutation. No association was found for orienting and executive attention. This study provides the first evidence for the involvement of the DDC gene in alerting attention. A better understanding of the genetic basis of distinct attention networks would allow us to develop more effective diagnosis, treatment, and prevention of deficient or underdeveloped alerting attention as well as its related prevalent neuropsychiatric disorders. Copyright © 2012 Elsevier Inc. All rights reserved.
Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri
2007-01-01
Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC), was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three clusters, two large and one small gene clusters, similar to their results which used Gaussian mixture clustering. The correlation of each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for the existence of tumor or normal, the existence of distant metastasis and the existence of lymph node metastasis. PMID:18305825
A cluster merging method for time series microarray with production values.
Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio
2014-09-01
A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
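The merging idea described above (counting how often two genes are grouped together across per-replicate clusterings) can be sketched as a co-association step followed by a simple linkage. This is a minimal illustration, not the authors' algorithm: the threshold, the union-find merge, the function names, and the toy gene labels are all assumptions.

```python
from itertools import combinations

def co_association(clusterings, genes):
    """Fraction of clusterings in which each gene pair shares a cluster."""
    counts = {pair: 0 for pair in combinations(sorted(genes), 2)}
    for labels in clusterings:
        for a, b in counts:
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: c / len(clusterings) for pair, c in counts.items()}

def merge_clusters(clusterings, genes, threshold=0.5):
    """Link genes whose pairwise agreement exceeds the (assumed) threshold."""
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), freq in co_association(clusterings, genes).items():
        if freq > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), set()).add(g)
    return sorted(groups.values(), key=lambda s: sorted(s))

# Three hypothetical replicates, each mapping gene -> cluster label.
genes = ["g1", "g2", "g3", "g4"]
r1 = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
r2 = {"g1": 0, "g2": 0, "g3": 0, "g4": 1}
r3 = {"g1": 1, "g2": 1, "g3": 2, "g4": 2}

print(merge_clusters([r1, r2, r3], genes))
```

The pairwise frequencies also give a natural ranking of the merged groups by level of agreement, which matches the ranking step mentioned in the abstract.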
Chromatin-Specific Regulation of Mammalian rDNA Transcription by Clustered TTF-I Binding Sites
Diermeier, Sarah D.; Németh, Attila; Rehli, Michael; Grummt, Ingrid; Längst, Gernot
2013-01-01
Enhancers and promoters often contain multiple binding sites for the same transcription factor, suggesting that homotypic clustering of binding sites may serve a role in transcription regulation. Here we show that clustering of binding sites for the transcription termination factor TTF-I downstream of the pre-rRNA coding region specifies transcription termination, increases the efficiency of transcription initiation and affects the three-dimensional structure of rRNA genes. On chromatin templates, but not on free rDNA, clustered binding sites promote cooperative binding of TTF-I, loading TTF-I to the downstream terminators before it binds to the rDNA promoter. Interaction of TTF-I with target sites upstream and downstream of the rDNA transcription unit connects these distal DNA elements by forming a chromatin loop between the rDNA promoter and the terminators. The results imply that clustered binding sites increase the binding affinity of transcription factors in chromatin, thus influencing the timing and strength of DNA-dependent processes. PMID:24068958
Orthology Detection Combining Clustering and Synteny for Very Large Datasets
Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Stadler, Peter F.
2014-01-01
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. PMID:25137074
O'Connell Motherway, Mary; Zomer, Aldert; Leahy, Sinead C.; Reunanen, Justus; Bottacini, Francesca; Claesson, Marcus J.; O'Brien, Frances; Flynn, Kiera; Casey, Patrick G.; Moreno Munoz, Jose Antonio; Kearney, Breda; Houston, Aileen M.; O'Mahony, Caitlin; Higgins, Des G.; Shanahan, Fergus; Palva, Airi; de Vos, Willem M.; Fitzgerald, Gerald F.; Ventura, Marco; O'Toole, Paul W.; van Sinderen, Douwe
2011-01-01
Development of the human gut microbiota commences at birth, with bifidobacteria being among the first colonizers of the sterile newborn gastrointestinal tract. To date, the genetic basis of Bifidobacterium colonization and persistence remains poorly understood. Transcriptome analysis of the Bifidobacterium breve UCC2003 2.42-Mb genome in a murine colonization model revealed differential expression of a type IVb tight adherence (Tad) pilus-encoding gene cluster designated “tad2003.” Mutational analysis demonstrated that the tad2003 gene cluster is essential for efficient in vivo murine gut colonization, and immunogold transmission electron microscopy confirmed the presence of Tad pili at the poles of B. breve UCC2003 cells. Conservation of the Tad pilus-encoding locus among other B. breve strains and among sequenced Bifidobacterium genomes supports the notion of a ubiquitous pili-mediated host colonization and persistence mechanism for bifidobacteria. PMID:21690406
Liu, Ying; Navathe, Shamkant B; Pivoshenko, Alex; Dasigi, Venu G; Dingledine, Ray; Ciliax, Brian J
2006-01-01
One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
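For readers unfamiliar with the TFIDF scheme compared above, a minimal sketch of the weighting follows. It uses the common textbook form (term frequency times the natural log of inverse document frequency); the exact variant used in the study is not stated in the abstract, and the gene names and keyword lists are invented for illustration.

```python
import math
from collections import Counter

def tfidf(gene_keywords):
    """TF-IDF weight of each keyword for each gene.

    gene_keywords maps a gene name to the list of keywords extracted
    from its literature; each gene is treated as one "document".
    """
    n = len(gene_keywords)
    df = Counter()
    for words in gene_keywords.values():
        df.update(set(words))  # count each keyword once per gene
    weights = {}
    for gene, words in gene_keywords.items():
        if not words:
            weights[gene] = {}
            continue
        tf = Counter(words)
        weights[gene] = {
            w: (tf[w] / len(words)) * math.log(n / df[w]) for w in tf
        }
    return weights

# Hypothetical keyword lists.
docs = {
    "GRIN1": ["glutamate", "receptor", "synapse", "receptor"],
    "GRIA2": ["glutamate", "receptor", "ampa"],
    "TP53": ["apoptosis", "tumor", "receptor"],
}
weights = tfidf(docs)
print(weights["TP53"])
```

A keyword that occurs for every gene ("receptor" here) receives weight zero, while a keyword specific to one gene is weighted up, which is why the keyword list quality matters for the downstream clustering.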
Zhang, Xiujun; Alemany, Lawrence B.; Fiedler, Hans-Peter; Goodfellow, Michael; Parry, Ronald J.
2008-01-01
The antibiotics lactonamycin and lactonamycin Z provide attractive leads for antibacterial drug development. Both antibiotics contain a novel aglycone core called lactonamycinone. To gain insight into lactonamycinone biosynthesis, cloning and precursor incorporation experiments were undertaken. The lactonamycin gene cluster was initially cloned from Streptomyces rishiriensis. Sequencing of ca. 61 kb of S. rishiriensis DNA revealed the presence of 57 open reading frames. These included genes coding for the biosynthesis of l-rhodinose, the sugar found in lactonamycin, and genes similar to those in the tetracenomycin biosynthetic gene cluster. Since lactonamycin production by S. rishiriensis could not be sustained, additional proof for the identity of the S. rishiriensis cluster was obtained by cloning the lactonamycin Z gene cluster from Streptomyces sanglieri. Partial sequencing of the S. sanglieri cluster revealed 15 genes that exhibited a very high degree of similarity to genes within the lactonamycin cluster, as well as an identical organization. Double-crossover disruption of one gene in the S. sanglieri cluster abolished lactonamycin Z production, and production was restored by complementation. These results confirm the identity of the genetic locus cloned from S. sanglieri and indicate that the highly similar locus in S. rishiriensis encodes lactonamycin biosynthetic genes. Precursor incorporation experiments with S. sanglieri revealed that lactonamycinone is biosynthesized in an unusual manner whereby glycine or a glycine derivative serves as a starter unit that is extended by nine acetate units. Analysis of the gene clusters and of the precursor incorporation data suggested a hypothetical scheme for lactonamycinone biosynthesis. PMID:18070976
The Low-Power Nucleus of PKS 1246-410 in the Centaurus Cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, G.B.; /KIPAC, Menlo Park /NRAO, Socorro /New Mexico U.; Sanders, J.S.
2005-10-21
We present Chandra, Very Large Array (VLA), and Very Long Baseline Array (VLBA) observations of the nucleus of NGC 4696, a giant elliptical in the Centaurus cluster of galaxies. Like M87 in the Virgo cluster, PKS 1246-410 in the Centaurus cluster is a nearby example of a radio galaxy in a dense cluster environment. In analyzing the new X-ray data we have found a compact X-ray feature coincident with the optical and radio core. While nuclear emission from the X-ray source is expected, its luminosity is low, < 10^40 erg s^-1. We estimate the Bondi accretion radius to be 30 pc and the accretion rate to be 0.01 M_⊙ yr^-1, which under the canonical radiative efficiency of 10% would overproduce by 3.5 orders of magnitude the radiative luminosity. Much of this energy can be directed into the kinetic energy of the jet, which over time inflates the observed cavities seen in the thermal gas. The VLBA observations reveal a weak nucleus and a broad, one-sided jet extending over 25 parsecs in position angle -150 degrees. This jet is deflected on the kpc-scale to a more east-west orientation (position angle of -80 degrees).
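The overproduction argument in this abstract can be checked with a short worked estimate. The rounded values for the solar mass, the length of a year, and the speed of light are assumptions supplied here, not figures from the abstract.

```latex
\begin{align*}
\dot{M} &= 0.01\,M_\odot\,\mathrm{yr^{-1}}
  \approx \frac{0.01 \times 1.99\times10^{33}\,\mathrm{g}}{3.16\times10^{7}\,\mathrm{s}}
  \approx 6.3\times10^{23}\,\mathrm{g\,s^{-1}},\\
L_{\mathrm{acc}} &= \eta\,\dot{M}c^{2}
  \approx 0.1 \times 6.3\times10^{23} \times \left(3.0\times10^{10}\right)^{2}\,\mathrm{erg\,s^{-1}}
  \approx 5.7\times10^{43}\,\mathrm{erg\,s^{-1}},\\
\frac{L_{\mathrm{acc}}}{10^{40}\,\mathrm{erg\,s^{-1}}} &\approx 5.7\times10^{3}.
\end{align*}
```

A ratio of a few thousand is in line with the overproduction of roughly 3.5 orders of magnitude quoted above; the exact figure depends on the constants and on the luminosity limit adopted.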
Gopalappa, Ramu; Song, Myungjae; Chandrasekaran, Arun Pandian; Das, Soumyadip; Haq, Saba; Koh, Hyun Chul; Ramakrishna, Suresh
2018-05-31
Targeted genome editing by clustered regularly interspaced short palindromic repeats (CRISPR-Cas9) has raised concerns over off-target effects. A double-nicking strategy using paired Cas9 nickases has been developed to minimize off-target effects. However, it was reported that the efficiency of paired nickases was comparable to or lower than that of either corresponding nuclease alone. Recently, we conducted a systematic comparison of the efficiencies of several paired Cas9 nickases with their corresponding Cas9 nucleases and showed that paired D10A Cas9 nickases are sometimes more efficient than individual nucleases for gene disruption. However, the designed paired Cas9 nickases sometimes exhibited significantly lower mutation frequencies than nucleases, hampering the generation of cells containing paired Cas9 nickase-induced mutations. Here we implemented IRES peptide-conjugation of a fluorescent protein to Cas9 nickase and subjected the cells to fluorescence-activated cell sorting (FACS). The sorted cell populations are highly enriched with cells containing paired Cas9 nickase-induced mutations, by a factor of up to 40-fold as compared with the unsorted population. Furthermore, gene-disrupted single-cell clones were generated highly efficiently using paired nickases followed by FACS, without compromising the low off-target effects. We envision that our fluorescent protein-coupled paired nickase-mediated gene disruption will facilitate efficient and highly specific genome editing in medical research.
NASA Astrophysics Data System (ADS)
Li, ZhaoYu; Chen, Tao; Yan, GuangQing
2016-10-01
A new method for determining the central axial orientation of a two-dimensional coherent magnetic flux rope (MFR) via multipoint analysis of the magnetic-field structure is developed. The method is devised under the following geometrical assumptions: (1) on its cross section, the structure is left-right symmetric; (2) the projected structure velocity is vertical to the line of symmetry. The two conditions can be naturally satisfied for cylindrical MFRs and are expected to be satisfied for MFRs that are flattened within current sheets. The model test demonstrates that, for determining the axial orientation of such structures, the new method is more efficient and reliable than traditional techniques such as minimum-variance analysis of the magnetic field, Grad-Shafranov (GS) reconstruction, and the more recent method based on the cylindrically symmetric assumption. A total of five flux transfer events observed by Cluster are studied using the proposed approach, and the application results indicate that the observed structures, regardless of their actual physical properties, fit the assumed geometrical model well. For these events, the inferred axial orientations are all in excellent agreement with those obtained using the multi-GS reconstruction technique.
Janevska, Slavica; Arndt, Birgit; Baumann, Leonie; Apken, Lisa Helene; Mauriz Marques, Lucas Maciel; Humpf, Hans-Ulrich; Tudzynski, Bettina
2017-01-01
The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and consequently, trichosetin was isolated as final product. The adaption of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)2Cys6 TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in consort and contribute to detoxification of trichosetin and therefore, self-protection of the producing fungus. PMID:28379186
Computational gene expression profiling under salt stress reveals patterns of co-expression
Sanchita; Sharma, Ashok
2016-01-01
Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters, which were common in multiple algorithms were taken further for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
Mihali, Troco K; Kellmann, Ralf; Neilan, Brett A
2009-01-01
Background Saxitoxin and its analogues collectively known as the paralytic shellfish toxins (PSTs) are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway, include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. Results We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. Conclusion The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seems to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event. 
The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved in the biosynthesis, may also afford the identification of these gene clusters in dinoflagellates, the cause of human mortalities and significant financial loss to the tourism and shellfish industries. PMID:19331657
The miR-290-295 cluster as multi-faceted players in mouse embryonic stem cells.
Yuan, Kai; Ai, Wen-Bing; Wan, Lin-Yan; Tan, Xiao; Wu, Jiang-Feng
2017-01-01
Increasing evidence indicates that embryonic stem cell-specific microRNAs (miRNAs) play an essential role in early embryonic development. Among them, the miR-290-295 cluster is the most highly expressed in mouse embryonic stem cells and is involved in various biological processes. In this paper, we review research progress on the function of the miR-290-295 cluster in embryonic stem cells. The miR-290-295 cluster is involved in regulating embryonic stem cell pluripotency maintenance, self-renewal, and the reprogramming of somatic cells to an embryonic stem cell-like state. Moreover, the miR-290-295 cluster has a latent pro-survival function in embryonic stem cells and is involved in tumourigenesis and senescence, which is of great significance. Elucidating the interaction between the miR-290-295 cluster and other modes of gene regulation will provide new ideas on the biology of pluripotent stem cells. In the near future, the miRNA cluster may find broad application in the stem cell field, such as altering cell identities with high efficiency through the transient introduction of tissue-specific miRNA clusters.
Li, Zhi-yong; Bao, Hong-juan; Zhang, Shuo-feng; Ye, Tian-yuan; Yang, Ce; Li, Yan-wen
2015-02-01
To explore the intersection and regulation mechanism of the "efficacy-toxicity network" of the action genes of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata in the combination environment of Sini decoction using a network pharmacological method. The gene interaction networks of Aconiti Lateralis Radix Praeparata, Glycyrrhizae Radix et Rhizoma and Zingiberis Rhizoma were mined and established with Cytoscape software and the Agilent literature search plug-in. The "efficacy-toxicity network" intersection of Aconiti Lateralis Radix Praeparata was formed according to its effects in anti-heart failure, neurotoxicity and cardiotoxicity. The target genes were clustered with the Clusterviz plug-in, and the possible pathways of the "efficacy-toxicity network" intersection of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata were predicted with the DAVID database. Five genes were related to the neurotoxicity, cardiotoxicity and anti-heart failure function of Aconiti Lateralis Radix Praeparata, namely AKT1, BAX, HCC, IL6 and IL8, which formed 47 node genes in the "efficacy-toxicity network" intersection of Aconiti Lateralis Radix Praeparata. There were 29 and 27 coincident genes in the "efficacy-toxicity network" of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata. There were 23 and 17 possible regulatory pathways. In the combination environment of Sini decoction, Glycyrrhizae Radix et Rhizoma and Zingiberis Rhizoma may regulate the efficacy-toxicity network of Aconiti Lateralis Radix Praeparata by influencing the immune-inflammatory signaling pathway, the apoptosis-autophagy signaling pathway, and the nerve cell and myocardial ischemia and hypoxia protection signaling pathways.
Radial Alignment of Elliptical Galaxies by the Tidal Force of a Cluster of Galaxies
NASA Astrophysics Data System (ADS)
Zhang, Shuang-Nan; Rong, Yu; Tu, Hong
2015-08-01
Unlike the random radial orientation distribution of field elliptical galaxies, galaxies in a cluster of galaxies are expected to point preferentially toward the center of the cluster, as a result of the cluster's tidal force on its member galaxies. In this work an analytic model is formulated to simulate this effect. The deformation time scale of a galaxy in a cluster is usually much shorter than the time scale of change of the tidal force; the dynamical process of the tidal interaction within the galaxy can thus be ignored. An equilibrium shape of a galaxy is then assumed to be the surface of equipotential, which is the sum of the self-gravitational potential of the galaxy and the tidal potential of the cluster at this location. We use a Monte-Carlo method to calculate the radial orientation distribution of these galaxies, by assuming the NFW mass profile of the cluster and the initial ellipticity of field galaxies. The radial angles show a single peak distribution centered at zero. The Monte-Carlo simulations also show that a shift of the reference center from the real cluster center weakens the anisotropy of the radial angle distribution. Therefore, the expected radial alignment cannot be revealed if the distribution of spatial position angle is used instead of that of radial angle. The observed radial orientations of elliptical galaxies in cluster Abell 2744 are consistent with the simulated distribution.
Mining subspace clusters from DNA microarray data using large itemset techniques.
Chang, Ye-In; Chen, Jiun-Rung; Tsai, Yueh-Chi
2009-05-01
Mining subspace clusters from DNA microarrays could help researchers identify genes that commonly contribute to a disease, where a subspace cluster is a subset of genes whose expression levels are similar under a subset of conditions. Since the number of genes in a DNA microarray is far larger than the number of conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for every pair of genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for every pair of genes, we construct MDSs only for every pair of conditions. Then, we transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are interested only in subspace clusters with gene sets as large as possible, it is desirable to focus on gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm needs less processing time than previously proposed algorithms that construct gene-pair MDSs.
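The itemset-mining step described in this abstract can be illustrated with a minimal sketch. It assumes that each condition-pair MDS has already been reduced to a plain set of gene identifiers and treats every such set as one transaction; how LISC actually builds the MDSs is not given in the abstract and is not reproduced here. The function name `frequent_itemsets`, the parameter `min_support`, and the toy data are hypothetical, and the search is a generic Apriori-style pass rather than the authors' implementation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(genes): support} for every gene set that occurs
    in at least `min_support` transactions (generic Apriori-style search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single genes.
    counts = {}
    for t in transactions:
        for gene in t:
            key = frozenset([gene])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidates of size k: unions of two frequent (k-1)-sets whose
        # (k-1)-subsets are all frequent (the Apriori pruning rule).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Keep only candidates that reach the support threshold.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


# Hypothetical toy input: one gene set per pair of conditions.
toy_mds = {
    ("c1", "c2"): {"g1", "g2", "g3"},
    ("c1", "c3"): {"g1", "g2"},
    ("c2", "c3"): {"g1", "g2", "g4"},
}
large_sets = frequent_itemsets(toy_mds.values(), min_support=3)
```

In this toy input the gene set {g1, g2} is supported by all three condition pairs, so it would be the candidate gene set of a subspace cluster spanning conditions c1, c2 and c3.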
Patel, Vidushi S; Cooper, Steven JB; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer AM
2008-01-01
Background: Vertebrate alpha (α)- and beta (β)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the α- and β-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil β-globin gene (ω) in the marsupial α-cluster, however, suggested that duplication of the α-β cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous α- and β-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. Results: The platypus α-globin cluster (chromosome 21) contains embryonic and adult α-globin genes, a β-like ω-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-ζ-ζ'-αD-α3-α2-α1-ω-GBY-3'. The platypus β-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-ε-β-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate α-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal β-globin clusters are embedded in olfactory genes. Thus, the mammalian α- and β-globin clusters are orthologous to the bird α- and β-globin clusters respectively. Conclusion: We propose that α- and β-globin clusters evolved from an ancient MPG-C16orf35-α-β-GBY-LUC7L arrangement 410 million years ago. A copy of the original β (represented by ω in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of β-globin genes with different expression profiles in different lineages. PMID:18657265
Sleeping Beauty-baculovirus hybrid vectors for long-term gene expression in the eye.
Turunen, Tytteli Anni Kaarina; Laakkonen, Johanna Päivikki; Alasaarela, Laura; Airenne, Kari Juhani; Ylä-Herttuala, Seppo
2014-01-01
A baculovirus vector is capable of efficiently transducing many nondividing and dividing cell types. However, the potential of baculovirus is restricted for many gene delivery applications as a result of the transient gene expression that it mediates. The plasmid-based Sleeping Beauty (SB) transposon system integrates transgenes into the target cell genome efficiently, with a genomic integration pattern that is generally considered safer than that of many other integrating vectors; yet efficient delivery of therapeutic genes into cells of target tissues in vivo is a major challenge for nonviral gene therapy. In the present study, SB was introduced into baculovirus to obtain novel hybrid vectors that would combine the best features of the two vector systems (i.e. effective gene delivery and efficient integration into the genome), thus circumventing the major limitations of these vectors. We constructed and optimized SB-baculovirus hybrid vectors that bear either SB100x transposase or SB transposon in the forward or reverse orientations with respect to the viral backbone. The functionality of the novel hybrid vectors was investigated in cell cultures and in a proof-of-concept study in the mouse eye. The hybrid vectors showed high and sustained transgene expression that remained stable and demonstrated no signs of decline during the 2-month follow-up in vitro. These results were verified in the mouse eye, where persistent transgene expression was detected two months after intravitreal injection. Our results confirm that (i) SB-baculovirus hybrid vectors mediate long-term gene expression in vitro and in vivo, and (ii) the hybrid vectors are potential new tools for the treatment of ocular diseases. Copyright © 2014 John Wiley & Sons, Ltd.
Adamek, Martina; Alanjary, Mohammad; Sales-Ortells, Helena; Goodfellow, Michael; Bull, Alan T; Winkler, Anika; Wibberg, Daniel; Kalinowski, Jörn; Ziemert, Nadine
2018-06-01
Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary. Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis' strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes. Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites. 
Our results cast light on the interconnections between secondary metabolite gene clusters and provide a way to prioritize biosynthetic pathways in the search and discovery of novel compounds.
Jadhav, Rohit R; Ye, Zhenqing; Huang, Rui-Lan; Liu, Joseph; Hsu, Pei-Yin; Huang, Yi-Wen; Rangel, Leticia B; Lai, Hung-Cheng; Roa, Juan Carlos; Kirma, Nameer B; Huang, Tim Hui-Ming; Jin, Victor X
2015-01-01
Recent genome-wide analysis has shown that DNA methylation spans long stretches of chromosome regions consisting of clusters of contiguous CpG islands or gene families. Hypermethylation of various gene clusters has been reported in many types of cancer. In this study, we conducted methyl-binding domain capture (MBDCap) sequencing (MBD-seq) analysis on a breast cancer cohort consisting of 77 patients and 10 normal controls, as well as a panel of 38 breast cancer cell lines. Bioinformatics analysis determined seven gene clusters with a significant difference in overall survival (OS) and further revealed a distinct feature: the conservation of a large gene cluster (approximately 70 kb), metallothionein-1 (MT1), among 45 species is much lower than the average of all RefSeq genes. Furthermore, we found that DNA methylation is an important epigenetic regulator contributing to gene repression of the MT1 gene cluster in both ERα positive (ERα+) and ERα negative (ERα-) breast tumors. In silico analysis revealed much lower gene expression of this cluster in The Cancer Genome Atlas (TCGA) cohort for ERα+ tumors. To further investigate the role of estrogen, we conducted 17β-estradiol (E2) and demethylating agent 5-aza-2'-deoxycytidine (DAC) treatment in various breast cancer cell types. Cell proliferation and invasion assays suggested MT1F and MT1M may play an anti-oncogenic role in breast cancer. Our data suggest that DNA methylation in large contiguous gene clusters can be a potential prognostic marker of breast cancer. Further investigation of these clusters revealed that estrogen mediates epigenetic repression of the MT1 cluster in ERα+ breast cancer cell lines. In all, our studies identify thousands of breast tumor hypermethylated regions for the first time, in particular, discovering seven large contiguous hypermethylated gene clusters.
Identification and comparative analysis of the epidermal differentiation complex in snakes
Holthaus, Karin Brigit; Mlitz, Veronika; Strasser, Bettina; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold
2017-01-01
The epidermis of snakes efficiently protects against dehydration and mechanical stress. However, only few proteins of the epidermal barrier to the environment have so far been identified in snakes. Here, we determined the organization of the Epidermal Differentiation Complex (EDC), a cluster of genes encoding protein constituents of cornified epidermal structures, in snakes and compared it to the EDCs of other squamates and non-squamate reptiles. The EDC of snakes displays shared synteny with that of the green anole lizard, including the presence of a cluster of corneous beta-protein (CBP)/beta-keratin genes. We found that a unique CBP comprising 4 putative beta-sheets and multiple cysteine-rich EDC proteins are conserved in all snakes and other squamates investigated. Comparative genomics of squamates suggests that the evolution of snakes was associated with a gene duplication generating two isoforms of the S100 fused-type protein, scaffoldin, the origin of distinct snake-specific EDC genes, and the loss of other genes that were present in the EDC of the last common ancestor of snakes and lizards. Taken together, our results provide new insights into the evolution of the skin in squamates and a basis for the characterization of the molecular composition of the epidermis in snakes. PMID:28345630
Identifying Candidate Reprogramming Genes in Mouse Induced Pluripotent Stem Cells.
Gao, Fang; Li, Jingyu; Zhang, Heng; Yang, Xu; An, Tiezhu
2017-08-01
Factor-based induced reprogramming approaches have tremendous potential for human regenerative medicine, but the efficiencies of these approaches are still low. In this study, we analyzed the global transcriptional profiles of mouse induced pluripotent stem cells (miPSCs) and mouse embryonic stem cells (mESCs) from seven different labs and present here the first successful clustering according to cell type, not by lab of origin. We identified 2131 differentially expressed genes (DEs) as candidate pluripotency-associated genes by comparing mESCs/miPSCs with somatic cells, and 720 DEs between miPSCs and mESCs. Interestingly, there was a significant overlap between the two DE sets. Therefore, we defined the overlapping DEs as "consensus DEs", including 313 miPSC-specific genes expressed at a higher level in miPSCs versus mESCs and 184 mESC-specific genes in total, and reasoned that these may contribute to the differences in pluripotency between mESCs and miPSCs. A classification of "consensus DEs" according to their different expression levels between somatic cells and mESCs/miPSCs shows that 86% of the miPSC-specific genes are more highly expressed in somatic cells, while 73% of mESC-specific genes are highly expressed in mESCs/miPSCs, indicating that the miPSCs have not efficiently silenced the expression pattern of the somatic cells from which they are derived and have failed to completely induce the genes with high expression levels in mESCs. We further revealed a strong correlation between oocyte-enriched factors and insufficiently induced mESC-specific genes and identified 11 hub genes via network analysis. In light of these findings, we postulated that these key hub genes might not only drive somatic cell nuclear transfer (SCNT) reprogramming but also augment the efficiency and quality of miPSC reprogramming.
WordCluster: detecting clusters of DNA words and genomic elements
2011-01-01
Background: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method in a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
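The idea of grouping copies by the distance between consecutive occurrences can be sketched as follows. This is a minimal illustration only: the function name `chain_clusters` and the parameters `max_gap` and `min_copies` are hypothetical, and the threshold is supplied by hand, which is precisely the arbitrary choice the abstract criticizes. How WordCluster derives its threshold and assigns statistical significance is not stated in the abstract and is not reproduced here.

```python
def chain_clusters(positions, max_gap, min_copies=2):
    """Chain consecutive copies whose spacing is at most `max_gap`.

    Returns a list of (start, end, number_of_copies) tuples for every
    chain that contains at least `min_copies` copies.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        # A gap larger than the threshold closes the running chain.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_copies:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Flush the last chain.
    if len(current) >= min_copies:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical start coordinates of one DNA word on one chromosome.
word_positions = [10, 15, 18, 200, 205, 900]
found = chain_clusters(word_positions, max_gap=10)
# found == [(10, 18, 3), (200, 205, 2)]
```

The isolated copy at position 900 is dropped because a single copy does not reach `min_copies`.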
Orientational ordering of lamellar structures on closed surfaces
NASA Astrophysics Data System (ADS)
Pȩkalski, J.; Ciach, A.
2018-05-01
Self-assembly of particles with short-range attraction and long-range repulsion interactions on a flat and on a spherical surface is compared. Molecular dynamics simulations are performed for the two systems having the same area and the density optimal for formation of stripes of particles. Structural characteristics, e.g., a cluster size distribution, a number of defects, and an orientational order parameter (OP), as well as the specific heat, are obtained for a range of temperatures. In both cases, the cluster size distribution becomes bimodal and elongated clusters appear at the temperature corresponding to the maximum of the specific heat. When the temperature decreases, orientational ordering of the stripes takes place and the number of particles per cluster or stripe increases in both cases. However, only on the flat surface does the specific heat have another maximum at the temperature corresponding to a rapid change of the OP. On the sphere, the crossover between the isotropic and anisotropic structures occurs in a much broader temperature interval; the orientational order is weaker and occurs at significantly lower temperature. At low temperature, the stripes on the sphere form spirals and the defects resemble defects in the nematic phase of rods adsorbed at a sphere.
Large clusters of co-expressed genes in the Drosophila genome.
Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I
2002-12-12
Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
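A comparison of an observed cluster count with the count expected by chance can be sketched with a permutation test. Everything specific here is an assumption made for illustration: a cluster is taken to be a run of at least three adjacent flagged genes in chromosomal order, and the chance expectation is obtained by shuffling the flags among gene positions. The abstract does not state how the authors defined clusters or computed the expectation, and the names `count_runs` and `permutation_p_value` are hypothetical.

```python
import random


def count_runs(flags, min_len=3):
    """Count runs of at least `min_len` consecutive truthy flags."""
    runs = 0
    length = 0
    for flag in flags:
        if flag:
            length += 1
        else:
            if length >= min_len:
                runs += 1
            length = 0
    if length >= min_len:
        runs += 1
    return runs


def permutation_p_value(flags, min_len=3, n_perm=1000, seed=0):
    """Return (observed_runs, p) where p estimates how often shuffled
    gene orders give at least as many runs as the observed order."""
    rng = random.Random(seed)
    observed = count_runs(flags, min_len)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_runs(shuffled, min_len) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical chromosome: 1 marks a tissue-specific gene, 0 any other gene.
toy_flags = [1, 1, 1] + [0] * 20 + [1, 1, 1] + [0] * 20
observed_runs, p_value = permutation_p_value(toy_flags)
```

A small `p_value` would indicate that the flagged genes sit next to each other more often than a random arrangement of the same genes would produce.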
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cameron, R A; Rowen, L; Nesbitt, R
2005-10-11
The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew
2005-05-10
The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants
Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; ...
2017-04-01
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants
Zhang, Peifen; Kim, Taehyong; Banf, Michael; Chavali, Arvind K.; Nilo-Poyanco, Ricardo; Bernard, Thomas
2017-01-01
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. PMID:28228535
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.
Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; Kim, Taehyong; Banf, Michael; Chae, Lee; Dreher, Kate; Chavali, Arvind K; Nilo-Poyanco, Ricardo; Bernard, Thomas; Kahn, Daniel; Rhee, Seung Y
2017-04-01
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. © 2017 American Society of Plant Biologists. All Rights Reserved.
Liu, L L; Liu, M J; Ma, M
2015-09-28
The central task of this study was to mine the gene-to-medium relationship. Adequate knowledge of this relationship could potentially improve the accuracy of differentially expressed gene mining. One of the approaches to differentially expressed gene mining uses conventional clustering algorithms to identify the gene-to-medium relationship. Compared to conventional clustering algorithms, self-organization maps (SOMs) identify the nonlinear aspects of the gene-to-medium relationships by mapping the input space into another higher dimensional feature space. However, SOMs are not suitable for huge datasets consisting of millions of samples. Therefore, a new computational model, the Function Clustering Self-Organization Maps (FCSOMs), was developed. FCSOMs take advantage of the theory of granular computing as well as advanced statistical learning methodologies, and are built specifically for each information granule (a function cluster of genes), which are intelligently partitioned by the clustering algorithm provided by the DAVID_6.7 software platform. However, only the gene functions, and not their expression values, are considered in the fuzzy clustering algorithm of DAVID. Compared to the clustering algorithm of DAVID, these experimental results show a marked improvement in the accuracy of classification with the application of FCSOMs. FCSOMs can handle huge datasets and their complex classification problems, as each FCSOM (modeled for each function cluster) can be easily parallelized.
Strategic groups, performance, and strategic response in the nursing home industry.
Zinn, J S; Aaronson, W E; Rosko, M D
1994-06-01
This study examines the effect of strategic group membership on nursing home performance and strategic behavior. Data from the 1987 Medicare and Medicaid Automated Certification Survey were combined with data from the 1987 and 1989 Pennsylvania Long Term Care Facility Questionnaire. The sample consisted of 383 Pennsylvania nursing homes. Cluster analysis was used to place the 383 nursing homes into strategic groups on the basis of variables measuring scope and resource deployment. Performance was measured by indicators of the quality of nursing home care (rates of pressure ulcers, catheterization, and restraint usage) and efficiency in services provision. Changes in Medicare participation after passage of the 1988 Medicare Catastrophic Coverage Act (MCCA) measured strategic behavior. MANOVA and Tukey HSD post hoc means tests determined if significant differences were associated with strategic group membership. Cluster analysis produced an optimal seven-group solution. Differences in group means were significant for the clustering, performance, and conduct variables (p < .0001). Strategic groups characterized by facilities providing a continuum of care services had the best patient care outcomes. The most efficient groups were characterized by facilities with high Medicare census. While all strategic groups increased Medicare census following passage of the MCCA, those dominated by for-profits had the greatest increases. Our analysis demonstrates that strategic orientation influences nursing home response to regulatory initiatives, a factor that should be recognized in policy formation directed at nursing home reform.
Tanaka-Tsuno, Fumiko; Mizukami-Murata, Satomi; Murata, Yoshinori; Nakamura, Toshihide; Ando, Akira; Takagi, Hiroshi; Shima, Jun
2007-10-01
In the modern baking industry, high-sucrose-tolerant (HS) and maltose-utilizing (LS) yeast were developed using breeding techniques and are now used commercially. Sugar utilization and high-sucrose tolerance differ significantly between HS and LS yeasts. We analysed the gene expression profiles of HS and LS yeasts under different sucrose conditions in order to determine their basic physiology. Two-way hierarchical clustering was performed to obtain the overall patterns of gene expression. The clustering clearly showed that the gene expression patterns of LS yeast differed from those of HS yeast. Quality threshold clustering was used to identify the gene clusters containing upregulated genes (cluster 1) and downregulated genes (cluster 2) under high-sucrose conditions. Clusters 1 and 2 contained numerous genes involved in carbon and nitrogen metabolism, respectively. The expression level of the genes involved in the metabolism of glycerol and trehalose, which are known to be osmoprotectants, in LS yeast was higher than that in HS yeast under sucrose concentrations of 5-40%. No clear correlation was found between the expression level of the genes involved in the biosynthesis of the osmoprotectants and the intracellular contents of the osmoprotectants. The present gene expression data were compared with data previously reported in a comprehensive analysis of a gene deletion strain collection. Welch's t-test for this comparison showed that the relative growth rates of the deletion strains whose deletion occurred in genes belonging to cluster 1 were significantly higher than the average growth rates of all deletion strains. Copyright 2007 John Wiley & Sons, Ltd.
Barron, Martin; Zhang, Siyuan
2018-01-01
Abstract Cell types in cell populations change as the condition changes: some cell types die out, new cell types may emerge and surviving cell types evolve to adapt to the new condition. Using single-cell RNA-sequencing data that measure the gene expression of cells before and after the condition change, we propose an algorithm, SparseDC, which identifies cell types, traces their changes across conditions and identifies genes which are marker genes for these changes. By solving a unified optimization problem, SparseDC completes all three tasks simultaneously. SparseDC is highly computationally efficient and demonstrates its accuracy on both simulated and real data. PMID:29140455
Catone, Mariela V.; Ruiz, Jimena A.; Castellanos, Mildred; Segura, Daniel; Espin, Guadalupe; López, Nancy I.
2014-01-01
Pseudomonas extremaustralis produces mainly polyhydroxybutyrate (PHB), a short chain length polyhydroxyalkanoate (sclPHA) infrequently found in Pseudomonas species. Previous studies with this strain demonstrated that PHB genes are located in a genomic island. In this work, the analysis of the genome of P. extremaustralis revealed the presence of another PHB cluster, phbFPX, with high similarity to genes belonging to Burkholderiales, and also a cluster, phaC1ZC2D, coding for medium chain length PHA production (mclPHA). All mclPHA genes showed high similarity to genes from Pseudomonas species and, interestingly, this cluster also showed a natural insertion of seven ORFs not related to mclPHA metabolism. Besides PHB, P. extremaustralis is able to produce mclPHA, although in minor amounts. Complementation analysis demonstrated that both mclPHA synthases, PhaC1 and PhaC2, were functional. RT-qPCR analysis showed different levels of expression for the PHB synthase, phbC, and the mclPHA synthases. The expression level of phbC was significantly higher than that obtained for phaC1 and phaC2 in late exponential phase cultures. The analysis of the proteins bound to the PHA granules showed the presence of PhbC and PhaC1, whilst PhaC2 could not be detected. In addition, two phasin-like proteins (PhbP and PhaI) associated with the production of scl and mcl PHAs, respectively, were detected. The results of this work show the high efficiency of a foreign gene (phbC) in comparison with the mclPHA core genome genes (phaC1 and phaC2), indicating that the ability of P. extremaustralis to produce high amounts of PHB could be explained by the different expression levels of the genes encoding the scl and mcl PHA synthases. PMID:24887088
Barbary, Arnaud; Djian-Caporalino, Caroline; Marteu, Nathalie; Fazari, Ariane; Caromel, Bernard; Castagnone-Sereno, Philippe; Palloix, Alain
2016-01-01
With the banning of most chemical nematicides, the control of root-knot nematodes (RKNs) in vegetable crops is now based essentially on the deployment of single, major resistance genes (R-genes). However, these genes are rare and their efficacy is threatened by the capacity of RKNs to adapt. In pepper, several dominant R-genes are effective against RKNs, and their efficacy and durability have been shown to be greater in a partially resistant genetic background. However, the genetic determinants of this partial resistance were unknown. Here, a quantitative trait loci (QTL) analysis was performed on the F2:3 population from the cross between Yolo Wonder, an accession considered partially resistant or resistant, depending on the RKN species, and Doux Long des Landes, a susceptible cultivar. A genetic linkage map was constructed from 130 F2 individuals, and the 130 F3 families were tested for resistance to the three main RKN species, Meloidogyne incognita, M. arenaria, and M. javanica. For the first time in the pepper-RKN pathosystem, four major QTLs were identified and mapped to two clusters. The cluster on chromosome P1 includes three tightly linked QTLs with specific effects against individual RKN species. The fourth QTL, providing specific resistance to M. javanica, mapped to pepper chromosome P9, which is known to carry multiple NBS–LRR repeats, together with major R-genes for resistance to nematodes and other pathogens. The newly discovered cluster on chromosome P1 has a broad spectrum of action with major additive effects on resistance. These data highlight the role of host QTLs involved in plant-RKN interactions and provide innovative potential for the breeding of new pepper cultivars or rootstocks combining quantitative resistance and major R-genes, to increase both the efficacy and durability of RKN control by resistance genes. PMID:27242835
Zhang, Jing; Zhang, Qingwen; Liu, Xiaoxia; Li, Zhen
2017-01-01
MicroRNAs (miRNAs) are a group of endogenous non-coding small RNAs that have critical regulatory functions in almost all known biological processes at the post-transcriptional level in a variety of organisms. The oriental fruit moth Grapholita molesta is one of the most serious pests in orchards worldwide and threatens the production of Rosaceae fruits. In this study, a de novo small RNA library constructed from mixed stages of G. molesta was sequenced on the Illumina sequencing platform and a total of 536 mature miRNAs consisting of 291 conserved and 245 novel miRNAs were identified. Most of the conserved and novel miRNAs were detected with moderate abundance. The miRNAs in the same cluster normally showed correlated expression profiles. A comparative analysis of the 79 conserved miRNA families within 31 arthropod species indicated that these miRNA families were more conserved among insects and within orders of closer phylogenetic relationships. The KEGG pathway analysis and network prediction of target genes indicated that the complex composed of miRNAs, clock genes and developmental regulation genes may play vital roles in regulating the developmental circadian rhythm of G. molesta. Furthermore, based on the sRNA library of G. molesta, suitable reference genes were selected and validated for study of miRNA transcriptional profile in G. molesta under two biotic and six abiotic experimental conditions. This study systematically documented the miRNA profile in G. molesta, which could lay a foundation for further understanding of the regulatory roles of miRNAs in the development and metabolism in this pest and might also suggest clues to the development of genetic-based techniques for agricultural pest control. PMID:28158242
2011-01-01
Background Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora are responsible for 70 and 30% of commercial production, respectively. C. arabica is an allotetraploid from a recent hybridization of the diploid species, C. canephora and C. eugenioides. C. arabica has lower genetic diversity and results in a higher quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency. Results Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. 
Conclusion We present the first comprehensive genome-wide transcript profile study of C. arabica and C. canephora, which can be freely accessed by the scientific community at http://www.lge.ibi.unicamp.br/coffea. Our data reveal the presence of species-specific/prevalent genes in coffee that may help to explain particular characteristics of these two crops. The identification of differentially expressed transcripts offers a starting point for the correlation between gene expression profiles and Coffea spp. developmental traits, providing valuable insights for coffee breeding and biotechnology, especially concerning sugar metabolism and stress tolerance. PMID:21303543
Collonnier, Cécile; Guyon-Debast, Anouchka; Maclot, François; Mara, Kostlend; Charlot, Florence; Nogué, Fabien
2017-05-15
Beyond its predominant role in human and animal therapy, the CRISPR-Cas9 system has also become an essential tool for plant research and plant breeding. Agronomic applications rely on the mastery of gene inactivation and gene modification. However, if the knock-out of genes by non-homologous end-joining (NHEJ)-mediated repair of the targeted double-strand breaks (DSBs) induced by the CRISPR-Cas9 system is rather well mastered, the knock-in of genes by homology-driven repair or end-joining remains difficult to perform efficiently in higher plants. In this review, we describe the different approaches that can be tested to improve the efficiency of CRISPR-induced gene modification in plants, which include the use of optimal transformation and regeneration protocols, the design of appropriate guide RNAs and donor templates and the choice of nucleases and means of delivery. We also present what can be done to orient DNA repair pathways in the target cells, and we show how the moss Physcomitrella patens can be used as a model plant to better understand what DNA repair mechanisms are involved, and how this knowledge could eventually be used to define more performant strategies of CRISPR-induced gene knock-in. Copyright © 2017 Elsevier Inc. All rights reserved.
Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.
You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary
2011-02-01
The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the numbers of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that we have defined a relative efficiency that is greater than the relative efficiency in the literature under some conditions. 
Our measure of relative efficiency might be less than the measure in the literature under some conditions, underestimating the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
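The idea of a relative efficiency of unequal versus equal cluster sizes can be illustrated with a small sketch. The code below does not reproduce the noncentrality-based measure of this article, whose formula is not given in the abstract. It assumes instead the conventional variance-based definition, in which a cluster of size m contributes information m / (1 + (m - 1) * rho) for an intracluster correlation rho. The function name and the example cluster sizes are hypothetical.

```python
def relative_efficiency(cluster_sizes, icc):
    """Variance-based relative efficiency of unequal vs. equal cluster sizes.

    Assumption: each cluster of size m contributes information
    m / (1 + (m - 1) * icc); the equal-size design uses the same number
    of clusters and the same mean cluster size. This is an illustrative
    definition, not the noncentrality-based measure of the article.
    """
    k = len(cluster_sizes)
    mean_size = sum(cluster_sizes) / k
    # Total information under the observed (unequal) cluster sizes.
    info_unequal = sum(m / (1 + (m - 1) * icc) for m in cluster_sizes)
    # Total information if every cluster had the mean size.
    info_equal = k * mean_size / (1 + (mean_size - 1) * icc)
    return info_unequal / info_equal


# Hypothetical example: three clusters with the same mean size of 20.
re_equal = relative_efficiency([20, 20, 20], icc=0.05)
re_unequal = relative_efficiency([5, 20, 35], icc=0.05)
print(re_equal, re_unequal)
```

Under this definition the ratio equals 1 for equal cluster sizes and falls below 1 as the sizes become more variable, so dividing the number of clusters planned for an equal-size design by the ratio gives one rough way to compensate.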
Genomics of Clostridium taeniosporum, an organism which forms endospores with ribbon-like appendages
Cambridge, Joshua M.; Blinkova, Alexandra L.; Salvador Rocha, Erick I.; Bode Hernández, Addys; Moreno, Maday; Ginés-Candelaria, Edwin; Goetz, Benjamin M.; Hunicke-Smith, Scott; Satterwhite, Ed; Tucker, Haley O.
2018-01-01
Clostridium taeniosporum, a non-pathogenic anaerobe closely related to the C. botulinum Group II members, was isolated from Crimean lake silt about 60 years ago. Its endospores are surrounded by an encasement layer which forms a trunk at one spore pole to which about 12–14 large, ribbon-like appendages are attached. The genome consists of one 3,264,813 bp, circular chromosome (with 26.6% GC) and three plasmids. The chromosome contains 2,892 potential protein coding sequences: 2,124 have specific functions, 147 have general functions, 228 are conserved but without known function and 393 are hypothetical based on the fact that no statistically significant orthologs were found. The chromosome also contains 101 genes for stable RNAs, including 7 rRNA clusters. Over 84% of the protein coding sequences and 96% of the stable RNA coding regions are oriented in the same direction as replication. The three known appendage genes are located within a single cluster with five other genes, the protein products of which are closely related, in terms of sequence, to the known appendage proteins. The relatedness of the deduced protein products suggests that all or some of the closely related genes might code for minor appendage proteins or assembly factors. The appendage genes might be unique among the known clostridia; no statistically significant orthologs were found within other clostridial genomes for which sequence data are available. The C. taeniosporum chromosome contains two functional prophages, one Siphoviridae and one Myoviridae, and one defective prophage. Three plasmids of 5.9, 69.7 and 163.1 Kbp are present. These data are expected to contribute to future studies of developmental, structural and evolutionary biology and to potential industrial applications of this organism. PMID:29293521
Cambridge, Joshua M; Blinkova, Alexandra L; Salvador Rocha, Erick I; Bode Hernández, Addys; Moreno, Maday; Ginés-Candelaria, Edwin; Goetz, Benjamin M; Hunicke-Smith, Scott; Satterwhite, Ed; Tucker, Haley O; Walker, James R
2018-01-01
Clostridium taeniosporum, a non-pathogenic anaerobe closely related to the C. botulinum Group II members, was isolated from Crimean lake silt about 60 years ago. Its endospores are surrounded by an encasement layer which forms a trunk at one spore pole to which about 12-14 large, ribbon-like appendages are attached. The genome consists of one 3,264,813 bp, circular chromosome (with 26.6% GC) and three plasmids. The chromosome contains 2,892 potential protein coding sequences: 2,124 have specific functions, 147 have general functions, 228 are conserved but without known function and 393 are hypothetical based on the fact that no statistically significant orthologs were found. The chromosome also contains 101 genes for stable RNAs, including 7 rRNA clusters. Over 84% of the protein coding sequences and 96% of the stable RNA coding regions are oriented in the same direction as replication. The three known appendage genes are located within a single cluster with five other genes, the protein products of which are closely related, in terms of sequence, to the known appendage proteins. The relatedness of the deduced protein products suggests that all or some of the closely related genes might code for minor appendage proteins or assembly factors. The appendage genes might be unique among the known clostridia; no statistically significant orthologs were found within other clostridial genomes for which sequence data are available. The C. taeniosporum chromosome contains two functional prophages, one Siphoviridae and one Myoviridae, and one defective prophage. Three plasmids of 5.9, 69.7 and 163.1 Kbp are present. These data are expected to contribute to future studies of developmental, structural and evolutionary biology and to potential industrial applications of this organism.
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.
Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W
2017-08-01
In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Automatic Semantic Orientation of Adjectives for Indonesian Language Using PMI-IR and Clustering
NASA Astrophysics Data System (ADS)
Riyanti, Dewi; Arif Bijaksana, M.; Adiwijaya
2018-03-01
We present our work on sentiment analysis for the Indonesian language. We focus on building automatic semantic orientation using resources available in Indonesian. In this research we used an Indonesian corpus of 9 million words from kompas.txt and tempo.txt that was manually tagged and annotated with a part-of-speech tagset. We then constructed a dataset by taking all the adjectives from the corpus and removing the adjectives with no orientation. The resulting set contained 923 adjectives. The system includes several steps, such as text pre-processing and clustering. The text pre-processing aims to increase accuracy. Finally, the clustering method classifies each word by sentiment, either positive or negative. With improvements to the text pre-processing, an accuracy of 72% can be achieved.
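A PMI-based semantic orientation score of the kind named in the title can be sketched as follows. This is a minimal illustration, not the authors' system. It assumes sentence-level co-occurrence counts in place of search-engine hit counts, a single positive and a single negative seed word, and a small smoothing constant to avoid division by zero. The toy corpus, the seed words "bagus" and "buruk", and the function name are all hypothetical.

```python
import math


def semantic_orientation(word, sentences, pos_seed, neg_seed, smoothing=0.01):
    """Semantic orientation as a PMI difference between two seed words.

    Computes log2( hits(word, pos) * hits(neg) / (hits(word, neg) * hits(pos)) ),
    where hits() counts the sentences containing the given word or pair.
    A positive score suggests positive orientation, a negative score the opposite.
    """
    def hits(a, b=None):
        # Number of sentences containing a (and b, if given).
        return sum(1 for s in sentences if a in s and (b is None or b in s))

    near_pos = hits(word, pos_seed) + smoothing
    near_neg = hits(word, neg_seed) + smoothing
    total_pos = hits(pos_seed) + smoothing
    total_neg = hits(neg_seed) + smoothing
    return math.log2((near_pos * total_neg) / (near_neg * total_pos))


# Hypothetical toy corpus: each sentence is a set of tokens.
corpus = [
    {"indah", "bagus"},
    {"indah", "bagus", "kota"},
    {"kotor", "buruk"},
    {"kotor", "buruk", "jalan"},
]
so_indah = semantic_orientation("indah", corpus, "bagus", "buruk")
so_kotor = semantic_orientation("kotor", corpus, "bagus", "buruk")
print(so_indah, so_kotor)
```

In this sketch the sign of the score is used directly as the class label. The abstract instead describes a clustering step that assigns each adjective to the positive or negative group.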
Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Data Analysis and Visualization; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
2008-05-12
The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.
Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R
1996-01-01
Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852
Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex
2010-01-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
Ficklin, Stephen P; Luo, Feng; Feltus, F Alex
2010-09-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
The sirodesmin biosynthetic gene cluster of the plant pathogenic fungus Leptosphaeria maculans.
Gardiner, Donald M; Cozijnsen, Anton J; Wilson, Leanne M; Pedras, M Soledade C; Howlett, Barbara J
2004-09-01
Sirodesmin PL is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). This phytotoxin belongs to the epipolythiodioxopiperazine (ETP) class of toxins produced by fungi including mammalian and plant pathogens. We report the cloning of a cluster of genes with predicted roles in the biosynthesis of sirodesmin PL and show via gene disruption that one of these genes (encoding a two-module non-ribosomal peptide synthetase) is essential for sirodesmin PL biosynthesis. Of the nine genes in the cluster tested, all are co-regulated with the production of sirodesmin PL in culture. A similar cluster is present in the genome of the opportunistic human pathogen Aspergillus fumigatus and is most likely responsible for the production of gliotoxin, which is also an ETP. Homologues of the genes in the cluster were also identified in expressed sequence tags of the ETP producing fungus Chaetomium globosum. Two other fungi with publicly available genome sequences, Magnaporthe grisea and Fusarium graminearum, had similar gene clusters. A comparative analysis of all four clusters is presented. This is the first report of the genes responsible for the biosynthesis of an ETP. Copyright 2004 Blackwell Publishing Ltd
Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish.
Kawahara, Atsuo; Hisano, Yu; Ota, Satoshi; Taimatsu, Kiyohito
2016-05-13
The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanisms of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.
Information mining in remote sensing imagery
NASA Astrophysics Data System (ADS)
Li, Jiang
The volume of remotely sensed imagery continues to grow at an enormous rate due to advances in sensor technology, and our capability for collecting and storing images has greatly outpaced our ability to analyze and retrieve information from them. This motivates us to develop image information mining techniques, an interdisciplinary endeavor drawing upon expertise in image processing, databases, information retrieval, machine learning, and software design. This dissertation proposes and implements an extensive remote sensing image information mining (ReSIM) system prototype for mining useful information implicitly stored in remote sensing imagery. The system consists of three modules: an image processing subsystem, a database subsystem, and a visualization and graphical user interface (GUI) subsystem. Land cover and land use (LCLU) information corresponding to spectral characteristics is identified by supervised classification based on support vector machines (SVM) with automatic model selection, while textural features that characterize spatial information are extracted using Gabor wavelet coefficients. Within LCLU categories, textural features are clustered using an optimized k-means clustering approach to obtain an efficient search space. The clusters are stored in an object-oriented database (OODB) with associated images indexed in an image database (IDB). A k-nearest neighbor search is performed using a query-by-example (QBE) approach. Furthermore, an automatic parametric contour tracing algorithm and an O(n) time piecewise linear polygonal approximation (PLPA) algorithm are developed for shape information mining of interesting objects within the image. A fuzzy object-oriented database based on the fuzzy object-oriented data (FOOD) model is developed to handle fuzziness and uncertainty.
Three specific applications are presented: integrated land cover and texture pattern mining, shape information mining for change detection of lakes, and fuzzy normalized difference vegetation index (NDVI) pattern mining. The study results show the effectiveness of the proposed system prototype and its potential for other applications in remote sensing.
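The clustering and query-by-example steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the ReSIM implementation: it uses plain Lloyd's k-means with a deterministic farthest-first initialization rather than the dissertation's optimized variant, and the 2-D "texture features", the function names, and the parameters are all hypothetical stand-ins for real Gabor coefficients.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with farthest-first initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # pick the point farthest from all centroids chosen so far
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # keep the old centroid if a cluster ends up empty
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def query_by_example(query, points, centroids, labels, n_neighbors=2):
    """Search only the cluster nearest to the query, then rank its members."""
    nearest = min(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    candidates = [i for i, lab in enumerate(labels) if lab == nearest]
    return sorted(candidates, key=lambda i: dist(query, points[i]))[:n_neighbors]

# Hypothetical 2-D texture features for six image tiles.
features = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.1), (5.2, 4.9), (5.1, 5.0)]
centroids, labels = kmeans(features, k=2)
matches = query_by_example((5.05, 5.0), features, centroids, labels)
```

Restricting the nearest-neighbor ranking to one cluster is what makes the search space smaller; the trade-off is that a true neighbor sitting just across a cluster boundary can be missed.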
Zhang, Jun; Zhang, Lei; Geng, Alei; Liu, Fanghua; Zhao, Guoping; Wang, Shengyue; Zhou, Zhihua; Yan, Xing
2015-01-01
Diverse cellulolytic bacteria are essential for maintaining high lignocellulose degradation ability in biogas digesters. However, little is known about the functional genes and gene clusters of the dominant cellulolytic bacteria in biogas digesters, although this knowledge is the foundation for understanding the lignocellulose degradation mechanisms of biogas digesters and for applying these gene resources to optimize biofuel production. A combination of metagenomic and 16S rRNA gene clone library methods was used to investigate the dominant cellulolytic bacteria and their glycoside hydrolase (GH) genes in two biogas digesters. The 16S rRNA gene analysis revealed that the dominant cellulolytic bacteria were strains closely related to Clostridium straminisolvens and an uncultured cellulolytic bacterium designated BG-1. To recover GH genes from cellulolytic bacteria in general, and BG-1 in particular, a refined assembly approach developed in this study was used to assemble GH genes from metagenomic reads; 163 GH-containing contigs ≥ 1 kb in length were obtained. Six recovered GH5 genes that were expressed in E. coli demonstrated multiple lignocellulase activities, and one had high mannanase activity (1255 U/mg). Eleven fosmid clones harboring the recovered GH-containing contigs were sequenced and assembled into 10 fosmid contigs. The composition of GH genes in the 163 assembled metagenomic contigs and 10 fosmid contigs indicated that diverse GHs and lignocellulose degradation mechanisms were present in the biogas digesters. In particular, a small portion of BG-1 genome information was recovered by PhyloPythiaS analysis. The lignocellulase gene clusters in BG-1 suggested that it might use a novel lignocellulose degradation mechanism to degrade lignocellulose efficiently. The dominant cellulolytic bacteria of biogas digesters possess GH genes that are diverse not only in sequence but also in function, and these genes may be applied to biofuel production in the future. PMID:26070087
Many nonuniversal archaeal ribosomal proteins are found in conserved gene clusters
WANG, JIACHEN; DASGUPTA, INDRANI; FOX, GEORGE E.
2009-01-01
The genomic associations of the archaeal ribosomal proteins (r-proteins) were examined in detail. The archaeal versions of the universal r-protein genes are typically in clusters similar or identical to those found in bacteria. Of the 35 nonuniversal archaeal r-protein genes examined, the gene encoding L18e was found to be associated with the conserved L13 cluster, whereas the genes for S4e, L32e and L19e were found in the archaeal version of the spc operon. Eleven nonuniversal protein genes were not associated with any common genomic context. Of the remaining 19 protein genes, 17 were convincingly assigned to one of 10 previously unrecognized gene clusters. Examination of the gene content of these clusters revealed multiple associations with genes involved in the initiation of protein synthesis, transcription or other cellular processes. The lack of such associations in the universal clusters suggests that initially the ribosome evolved largely independently of other processes. More recently it likely has evolved in concert with other cellular systems. It was also verified that a second copy of the gene encoding L7ae found in some bacteria is actually a homolog of the gene encoding L30e and should be annotated as such. PMID:19478915
On the multi-scale description of micro-structured fluids composed of aggregating rods
NASA Astrophysics Data System (ADS)
Perez, Marta; Scheuer, Adrien; Abisset-Chavanne, Emmanuelle; Ammar, Amine; Chinesta, Francisco; Keunings, Roland
2018-05-01
In flows of concentrated suspensions composed of rods, dense clusters are observed. Thus, the adequate modelling and simulation of such a flow requires addressing the kinematics of these dense clusters and their impact on the flow in which they are immersed. In a former work, we proposed a first modelling framework for these clusters, assumed so dense that they were considered rigid and their kinematics (flow-induced rotation) were totally defined by a symmetric tensor c with unit trace representing the cluster conformation. Then, the rigid nature of the clusters was relaxed, assuming them deformable, and a model giving the evolution of both the cluster shape and its microstructural orientation descriptor (the so-called shape and orientation tensors) was proposed. This paper compares the predictions of those models with finer-scale discrete simulations inspired by molecular dynamics modelling.
CRISPR-Cas9 Toolkit for Actinomycete Genome Editing.
Tong, Yaojun; Robertsen, Helene Lunde; Blin, Kai; Weber, Tilmann; Lee, Sang Yup
2018-01-01
Bacteria of the order Actinomycetales are one of the most important sources of bioactive natural products, from which many drugs are derived. However, many of them still lack efficient genome editing methods, and some strains cannot be manipulated at all. This restricts systematic metabolic engineering approaches for boosting the production of known natural products and discovering novel ones. To facilitate genome editing in actinomycetes, we developed a high-efficiency CRISPR-Cas9 toolkit. This basic toolkit includes software for spacer (sgRNA) identification, a system for in-frame gene/gene cluster knockout, a system for gene loss-of-function study, a system for generating a random size deletion library, and a system for gene knockdown. For the latter, a uracil-specific excision reagent (USER) cloning technology was adapted to simplify the CRISPR vector construction process. The application of this toolkit was successfully demonstrated by perturbation of the genomes of Streptomyces coelicolor A3(2) and Streptomyces collinus Tü 365. The CRISPR-Cas9 toolkit and related protocol described here can be widely used for metabolic engineering of actinomycetes.
NASA Astrophysics Data System (ADS)
Banerjee, D.; Jiang, F.-J.; Olesen, T. Z.; Orland, P.; Wiese, U.-J.
2018-05-01
We consider the (2+1)-dimensional SU(2) quantum link model on the honeycomb lattice and show that it is equivalent to a quantum dimer model on the kagome lattice. The model has crystalline confined phases with spontaneously broken translation invariance associated with pinwheel order, which is investigated with either a Metropolis or an efficient cluster algorithm. External half-integer non-Abelian charges [which transform nontrivially under the Z(2) center of the SU(2) gauge group] are confined to each other by fractionalized strings with a delocalized Z(2) flux. The strands of the fractionalized flux strings are domain walls that separate distinct pinwheel phases. A second-order phase transition in the three-dimensional Ising universality class separates two confining phases: one with correlated pinwheel orientations, and the other with uncorrelated pinwheel orientations.
Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John
2003-02-01
We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.
Genetic analysis of biodegradation of tetralin by a Sphingomonas strain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hernaez, M.J.; Santero, E.; Reineke, W.
Tetralin (1,2,3,4-tetrahydronaphthalene) is produced for industrial purposes from naphthalene by catalytic hydrogenation or from anthracene by cracking. A strain designated TFA which very efficiently utilizes tetralin has been isolated from the Rhine river. The strain has been identified as Sphingomonas macrogoltabidus, based on 16S rDNA sequence similarity. Genetic analysis of tetralin biodegradation has been performed by insertion mutagenesis and by physical analysis and analysis of complementation between the mutants. The genes involved in tetralin utilization are clustered in a region of 9 kb, comprising at least five genes grouped in two divergently transcribed operons.
Function analysis of 5'-UTR of the cellulosomal xyl-doc cluster in Clostridium papyrosolvens.
Zou, Xia; Ren, Zhenxing; Wang, Na; Cheng, Yin; Jiang, Yuanyuan; Wang, Yan; Xu, Chenggang
2018-01-01
Anaerobic, mesophilic, and cellulolytic Clostridium papyrosolvens produces an efficient cellulolytic extracellular complex named cellulosome that hydrolyzes plant cell wall polysaccharides into simple sugars. Its genome harbors two long cellulosomal clusters: the cip-cel operon encoding major cellulosome components (including scaffolding) and the xyl-doc gene cluster encoding hemicellulases. Compared with the cip-cel operon, xyl-doc has been studied far less, mainly because it is rarely found in cellulolytic clostridia. Sequence analysis of xyl-doc revealed that it harbors a 5' untranslated region (5'-UTR) which potentially plays a role in the regulation of downstream gene expression. Here, we analyzed the function of the 5'-UTR of the xyl-doc cluster in C. papyrosolvens in vivo using a transformation method developed in this study. We first developed an electrotransformation method for C. papyrosolvens DSM 2782. Under the optimized conditions, a field with an intensity of 7.5-9.0 kV/cm was applied as a 5 ms pulse to a cuvette (0.2 cm gap) containing a mixture of plasmid and late-exponential-phase cells suspended in a sucrose-containing buffer. Next, the putative promoter and the 5'-UTR of the xyl-doc cluster were determined by sequence alignment. The alignment indicates that xyl-doc possesses a long conserved 5'-UTR with a complex secondary structure encompassing at least two perfect stem-loops, which are potential candidates for controlling transcriptional termination. Finally, we employed an oxygen-independent flavin-based fluorescent protein (FbFP) as a quantitative reporter to analyze promoter activity and 5'-UTR function in vivo. This revealed that the 5'-UTR significantly blocked transcription of downstream genes, but that corn stover can relieve this suppression.
Our results demonstrate that the 5'-UTR of the cellulosomal xyl-doc cluster blocks the transcriptional activity of the promoter. However, some substrates, such as corn stover, can relieve the repression exerted by the 5'-UTR. Thus, we speculate that the 5'-UTR of xyl-doc is a putative riboswitch that regulates the expression of downstream cellulosomal genes, which helps in understanding the complex regulation of the cellulosome.
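As a worked example of the electrotransformation condition quoted in this abstract, the field strength can be converted into the voltage that would be set on the pulser using V = E × d. The sketch below only restates that arithmetic; the function name is hypothetical and the abstract does not report the voltage setting itself.

```python
def pulse_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage in kV across a cuvette for a given field strength: V = E * d."""
    return field_kv_per_cm * gap_cm

# Field range and cuvette gap as stated in the abstract.
low = pulse_voltage_kv(7.5, 0.2)
high = pulse_voltage_kv(9.0, 0.2)
print(f"Pulser setting: {low:.2f} to {high:.2f} kV")
```

For the stated 0.2 cm gap, 7.5-9.0 kV/cm therefore corresponds to roughly 1.5-1.8 kV.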
Reconstitutional Mutagenesis of the Maize P Gene by Short-Range Ac Transpositions
Moreno, M. A.; Chen, J.; Greenblatt, I.; Dellaporta, S. L.
1992-01-01
The tendency for Ac to transpose over short intervals has been utilized to develop insertional mutagenesis and fine structure genetic mapping strategies in maize. We recovered excisions of Ac from the P gene and insertions into nearby chromosomal sites. These closely linked Ac elements reinserted into the P gene, reconstituting over 250 unstable variegated alleles. Reconstituted alleles condition a variety of variegation patterns that reflect the position and orientation of Ac within the P gene. Molecular mapping and DNA sequence analyses have shown that reinsertion sites are dispersed throughout a 12.3-kb chromosomal region in the promoter, exons and introns of the P gene, but in some regions insertion sites were clustered in a nonrandom fashion. Transposition profiles and target site sequence data obtained from these studies have revealed several features of Ac transposition including its preference for certain target sites. These results clearly demonstrate the tendency of Ac to transpose to nearby sites in both proximal and distal directions from the donor site. With minor modifications, reconstitutional mutagenesis should be applicable to many Ac-induced mutations in maize and in other plant species and can possibly be extended to other eukaryotic transposon systems as well. PMID:1325389
LocExpress: a web server for efficiently estimating expression of novel transcripts.
Hou, Mei; Tian, Feng; Jiang, Shuai; Kong, Lei; Yang, Dechang; Gao, Ge
2016-12-22
The temporal and spatial expression pattern of a transcript across multiple tissues and cell types can provide key clues about its function. While several gene atlases are available online as pre-computed databases for known gene models, it is still challenging to obtain expression profiles for previously uncharacterized (i.e. novel) transcripts efficiently. Here we developed LocExpress, a web server for efficiently estimating expression of novel transcripts across multiple tissues and cell types in human (20 normal tissues/cell types and 14 cell lines) as well as in mouse (24 normal tissues/cell types and nine cell lines). As a wrapper around an RNA-Seq quantification algorithm, LocExpress efficiently reduces the time cost by making abundance estimation calls increasingly within the minimum spanning bundle region of input transcripts. For a given novel gene model, this local context-oriented strategy allows LocExpress to estimate its FPKMs in hundreds of samples within minutes on a standard Linux box, making an online web server possible. To the best of our knowledge, LocExpress is the only web server to provide nearly real-time expression estimation for novel transcripts in common tissues and cell types. The server is publicly available at http://loc-express.cbi.pku.edu.cn .
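For readers unfamiliar with the FPKM unit mentioned in this abstract, the standard definition (fragments per kilobase of transcript per million mapped fragments) can be written as a short function. This is only the textbook formula, not LocExpress's quantification pipeline; the function name and the example counts are made up for illustration.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical numbers: 500 fragments on a 2 kb transcript in a 20 M fragment library.
value = fpkm(500, 2000, 20_000_000)
```

Because both the transcript length and the library size appear in the denominator, the value is comparable across transcripts of different lengths and across samples of different sequencing depths, at least to a first approximation.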
Clustering change patterns using Fourier transformation with time-course gene expression data.
Kim, Jaehee
2011-01-01
To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. The results showed that, because the method clusters probabilistically neighboring data, model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
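The idea of clustering on Fourier coefficients of the derivative can be illustrated with a small sketch. It assumes evenly spaced samples covering exactly one period and uses the identity that, for f(t) = a0/2 + Σ [a_k cos(kωt) + b_k sin(kωt)], the derivative has cosine coefficients kω·b_k and sine coefficients −kω·a_k. The function names and toy profiles are hypothetical, and the paper's smoothing and model-based clustering steps are not reproduced here.

```python
import math

def fourier_coefficients(values, n_harmonics):
    """Return [(a_k, b_k)] for k = 1..n_harmonics from evenly spaced samples.

    n_harmonics should stay below len(values) / 2.
    """
    n = len(values)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a = 2.0 / n * sum(v * math.cos(2 * math.pi * k * j / n) for j, v in enumerate(values))
        b = 2.0 / n * sum(v * math.sin(2 * math.pi * k * j / n) for j, v in enumerate(values))
        coeffs.append((a, b))
    return coeffs

def derivative_coefficients(coeffs, period):
    """Coefficients of f'(t): cosine part k*omega*b_k, sine part -k*omega*a_k."""
    omega = 2 * math.pi / period
    return [(k * omega * b, -k * omega * a) for k, (a, b) in enumerate(coeffs, start=1)]

# Two toy expression profiles with the same shape but different baseline levels.
gene_a = [math.sin(2 * math.pi * j / 8) for j in range(8)]
gene_b = [v + 5.0 for v in gene_a]

features_a = derivative_coefficients(fourier_coefficients(gene_a, 2), period=8)
features_b = derivative_coefficients(fourier_coefficients(gene_b, 2), period=8)
```

The two feature vectors agree up to rounding error even though the raw profiles differ by a constant offset, which is why derivative-based features group genes by how expression changes rather than by absolute level.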
Dover, Nir; Barash, Jason R.; Burke, Julianne N.; ...
2014-05-22
Botulinum neurotoxin (BoNT) is the most poisonous substance known, and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. This TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.
ERIC Educational Resources Information Center
Montgomery County Public Schools, Rockville, MD.
In developing a program to assist the individual student to plan a goal-oriented program and increase his opportunities both to select courses moving him toward his personal goals and to use the community resources as supplemental educational experiences, the Winston Churchill High School designed a Career Cluster Curriculum Project, the first…
Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio
2014-12-01
The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli.
Miyamoto, Kiyoko T.; Komatsu, Mamoru; Ikeda, Haruo
2014-01-01
Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. PMID:24907338
Cardoza, R. E.; Malmierca, M. G.; Hermosa, M. R.; Alexander, N. J.; McCormick, S. P.; Proctor, R. H.; Tijerino, A. M.; Rumbero, A.; Monte, E.; Gutiérrez, S.
2011-01-01
Trichothecenes are mycotoxins produced by Trichoderma, Fusarium, and at least four other genera in the fungal order Hypocreales. Fusarium has a trichothecene biosynthetic gene (TRI) cluster that encodes transport and regulatory proteins as well as most enzymes required for the formation of the mycotoxins. However, little is known about trichothecene biosynthesis in the other genera. Here, we identify and characterize TRI gene orthologues (tri) in Trichoderma arundinaceum and Trichoderma brevicompactum. Our results indicate that both Trichoderma species have a tri cluster that consists of orthologues of seven genes present in the Fusarium TRI cluster. Organization of genes in the cluster is the same in the two Trichoderma species but differs from the organization in Fusarium. Sequence and functional analysis revealed that the gene (tri5) responsible for the first committed step in trichothecene biosynthesis is located outside the cluster in both Trichoderma species rather than inside the cluster as it is in Fusarium. Heterologous expression analysis revealed that two T. arundinaceum cluster genes (tri4 and tri11) differ in function from their Fusarium orthologues. The Tatri4-encoded enzyme catalyzes only three of the four oxygenation reactions catalyzed by the orthologous enzyme in Fusarium. The Tatri11-encoded enzyme catalyzes a completely different reaction (trichothecene C-4 hydroxylation) than the Fusarium orthologue (trichothecene C-15 hydroxylation). The results of this study indicate that although some characteristics of the tri/TRI cluster have been conserved during evolution of Trichoderma and Fusarium, the cluster has undergone marked changes, including gene loss and/or gain, gene rearrangement, and divergence of gene function. PMID:21642405
Sawicki, Rafał; Singh, Sharda P; Mondal, Ashis K; Benes, Helen; Zimniak, Piotr
2003-01-01
From the fruitfly, Drosophila melanogaster, ten members of the cluster of Delta-class glutathione S-transferases (GSTs; formerly denoted as Class I GSTs) and one member of the Epsilon-class cluster (formerly GST-3) have been cloned, expressed in Escherichia coli, and their catalytic properties have been determined. In addition, nine more members of the Epsilon cluster have been identified through bioinformatic analysis but not further characterized. Of the 11 expressed enzymes, seven accepted the lipid peroxidation product 4-hydroxynonenal as substrate, and nine were active in glutathione conjugation of 1-chloro-2,4-dinitrobenzene. Since the enzymically active proteins included the gene products of DmGSTD3 and DmGSTD7 which were previously deemed to be pseudogenes, we investigated them further and determined that both genes are transcribed in Drosophila. Thus our present results indicate that DmGSTD3 and DmGSTD7 are probably functional genes. The existence and multiplicity of insect GSTs capable of conjugating 4-hydroxynonenal, in some cases with catalytic efficiencies approaching those of mammalian GSTs highly specialized for this function, indicates that metabolism of products of lipid peroxidation is a highly conserved biochemical pathway with probable detoxification as well as regulatory functions. PMID:12443531
Ko, Yi-An; Mukherjee, Bhramar; Smith, Jennifer A; Kardia, Sharon L R; Allison, Matthew; Diez Roux, Ana V
2016-11-01
There has been increased interest in identifying gene-environment interaction (G × E) in the context of multiple environmental exposures. Most G × E studies analyze one exposure at a time, although in reality people experience many exposures simultaneously. Efficient analysis strategies for complex G × E with multiple environmental factors in a single model are still lacking. Using the data from the Multiethnic Study of Atherosclerosis, we illustrate a two-step approach for modeling G × E with multiple environmental factors. First, we utilize common clustering and classification strategies (e.g., k-means, latent class analysis, classification and regression trees, Bayesian clustering using Dirichlet Process) to define subgroups corresponding to distinct environmental exposure profiles. Second, we illustrate the use of an additive main effects and multiplicative interaction model, instead of the conventional saturated interaction model using product terms of factors, to study G × E with the data-driven exposure subgroups defined in the first step. We demonstrate useful analytical approaches to translate multiple environmental exposures into one summary class. These tools not only allow researchers to consider several environmental exposures in G × E analysis but also provide some insight into how genes modify the effect of a comprehensive exposure profile instead of examining effect modification for each exposure in isolation.
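For reference, an additive main effects and multiplicative interaction (AMMI) model is usually written in the general form below. The indexing is an assumption made for illustration (i for genotype group, j for exposure subgroup, K retained interaction terms) and may not match the authors' exact specification.

```latex
y_{ij} = \mu + \alpha_i + \beta_j + \sum_{k=1}^{K} \lambda_k \, \gamma_{ik} \, \delta_{jk} + \varepsilon_{ij}
```

Here \mu is the overall mean, \alpha_i and \beta_j are the additive main effects, and the interaction is captured by K multiplicative terms with singular values \lambda_k and scores \gamma_{ik} and \delta_{jk}. Choosing a small K is what makes the model more parsimonious than a saturated model with one product term for every genotype-by-subgroup cell.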
Eom, Gyeong Tae; Lee, Seung Hwan; Oh, Young Hoon; Choi, Ji Eun; Park, Si Jae; Song, Jae Kwang
2014-10-01
Heterologous ABC protein exporters, the apparatus of the type I secretion pathway in Gram-negative bacteria, were used for extracellular production of Pseudomonas fluorescens lipase (TliA) in recombinant Escherichia coli. The effect of expressing different ABC protein exporter gene clusters (P. fluorescens tliDEF, Pseudomonas aeruginosa aprDEF, Erwinia chrysanthemi prtDEF, and Serratia marcescens lipBCD genes) on the secretion of TliA was examined at growth temperatures of 20, 25, 30 and 35 °C. TliA secretion in recombinant E. coli XL10-Gold varied depending on the type of ABC protein exporter and the culture temperature. E. coli expressing S. marcescens lipBCD genes showed the highest secretion level of TliA (122.8 U/ml) when cultured at 25 °C. Thus, optimized culture conditions for efficient extracellular production of lipase in recombinant E. coli can be designed by changing the type of ABC protein exporter and the growth temperature.
CRISPR-cas System as a Genome Engineering Platform: Applications in Biomedicine and Biotechnology.
Hashemi, Atieh
2018-01-01
Genome editing mediated by Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and its associated proteins (Cas) has recently come into use as an efficient, rapid and site-specific tool for modifying endogenous genes in biomedically important cell types and whole organisms. It has become a predictable and precise method of choice for genome engineering, with the target specified by a 20-nt targeting sequence within its guide RNA. This review first describes the biology of the CRISPR system. It then reviews applications of CRISPR-Cas9, including the efficient generation of a wide variety of biomedically important cellular and animal models, epigenome modification, genome-wide screens, gene therapy, labelling of specific genomic loci in living cells, metabolic engineering of yeast and bacteria, and regulation of endogenous gene expression by an altered version of the system.
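To make the 20-nt targeting sequence concrete, the sketch below scans a DNA string for candidate spacers. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of the target; the PAM is not stated in this abstract. The function name is hypothetical, and only the forward strand is searched, with no scoring or off-target checks.

```python
import re

def find_spacers(sequence, spacer_len=20):
    """Return (start, spacer) pairs for spacers followed by an NGG PAM.

    Forward strand only; a real design tool would also scan the reverse
    complement and filter candidates for off-target matches.
    """
    seq = sequence.upper()
    # A lookahead lets overlapping candidates all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]

# Hypothetical target: 2 nt of flank, a 20-nt spacer, a TGG PAM, 2 nt of flank.
target = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
candidates = find_spacers(target)
```

In this toy sequence the only candidate is the 20-nt stretch starting at index 2, because it is the only position followed by a base and then GG.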
USDA-ARS's Scientific Manuscript database
Cyclopiazonic acid (CPA), an indole-tetramic acid toxin, is produced by many species of Aspergillus and Penicillium. In addition to CPA Aspergillus flavus produces polyketide-derived carcinogenic aflatoxins (AFs). AF biosynthesis genes form a gene cluster in a subtelomeric region. Isolates of A. fla...
Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan
2014-01-01
Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417
Targeted mutagenesis in cotton (Gossypium hirsutum L.) using the CRISPR/Cas9 system
Chen, Xiugui; Lu, Xuke; Shu, Na; Wang, Shuai; Wang, Junjuan; Wang, Delong; Guo, Lixue; Ye, Wuwei
2017-01-01
The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 system has been widely used for genome editing in various plants because of its simplicity, high efficiency and design flexibility. However, to our knowledge, there is no report on the application of CRISPR/Cas9-mediated targeted mutagenesis in cotton. Here, we report the genome editing and targeted mutagenesis in upland cotton (Gossypium hirsutum L., hereafter cotton) using the CRISPR/Cas9 system. We designed two guide RNAs to target distinct sites of the cotton Cloroplastos alterados 1 (GhCLA1) and vacuolar H+-pyrophosphatase (GhVP) genes. Mutations in these two genes were detected in cotton protoplasts. Most of the mutations were nucleotide substitutions, with one nucleotide insertion and one substitution found in GhCLA1 and one deletion found in GhVP in cotton protoplasts. Subsequently, the two vectors were transformed into cotton shoot apexes through Agrobacterium-mediated transformation, resulting in efficient target gene editing. Most of the mutations were nucleotide deletions, and the mutation efficiencies were 47.6–81.8% in transgenic cotton plants. Evaluation using restriction-enzyme-PCR assay and sequence analysis detected no off-target mutations. Our results indicated that the CRISPR/Cas9 system was an efficient and specific tool for targeted mutagenesis of the cotton genome. PMID:28287154
Targeted mutagenesis in cotton (Gossypium hirsutum L.) using the CRISPR/Cas9 system.
Chen, Xiugui; Lu, Xuke; Shu, Na; Wang, Shuai; Wang, Junjuan; Wang, Delong; Guo, Lixue; Ye, Wuwei
2017-03-13
The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 system has been widely used for genome editing in various plants because of its simplicity, high efficiency and design flexibility. However, to our knowledge, there is no report on the application of CRISPR/Cas9-mediated targeted mutagenesis in cotton. Here, we report the genome editing and targeted mutagenesis in upland cotton (Gossypium hirsutum L., hereafter cotton) using the CRISPR/Cas9 system. We designed two guide RNAs to target distinct sites of the cotton Cloroplastos alterados 1 (GhCLA1) and vacuolar H+-pyrophosphatase (GhVP) genes. Mutations in these two genes were detected in cotton protoplasts. Most of the mutations were nucleotide substitutions, with one nucleotide insertion and one substitution found in GhCLA1 and one deletion found in GhVP in cotton protoplasts. Subsequently, the two vectors were transformed into cotton shoot apexes through Agrobacterium-mediated transformation, resulting in efficient target gene editing. Most of the mutations were nucleotide deletions, and the mutation efficiencies were 47.6-81.8% in transgenic cotton plants. Evaluation using restriction-enzyme-PCR assay and sequence analysis detected no off-target mutations. Our results indicated that the CRISPR/Cas9 system was an efficient and specific tool for targeted mutagenesis of the cotton genome.
Efficient CRISPR/Cas9-based genome editing in carrot cells.
Klimek-Chodacka, Magdalena; Oleszkiewicz, Tomasz; Lowder, Levi G; Qi, Yiping; Baranski, Rafal
2018-04-01
This is the first report of successful and efficient carrot genome editing using the CRISPR/Cas9 system. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated (Cas9) is a powerful genome editing tool that has been widely adopted in model organisms recently, but has not been used in carrot, a model species for in vitro culture studies and an important health-promoting crop grown worldwide. In this study, for the first time, we report the application of the CRISPR/Cas9 system for efficient targeted mutagenesis of the carrot genome. Multiplexing CRISPR/Cas9 vectors expressing two single-guide RNAs (gRNAs) targeting the carrot flavanone-3-hydroxylase (F3H) gene were tested for blockage of anthocyanin biosynthesis in a model purple-colored callus using Agrobacterium-mediated genetic transformation. This approach allowed fast and visual comparison of three codon-optimized Cas9 genes and revealed that the most efficient one in generating F3H mutants was the Arabidopsis codon-optimized AteCas9 gene, with up to 90% efficiency. Knockout of the F3H gene resulted in the discoloration of calli, validating the functional role of this gene in anthocyanin biosynthesis in carrot as well as providing a visual marker for screening successfully edited events. Most resulting mutations were small indels, but long chromosome fragment deletions of 116-119 nt were also generated with simultaneous cleavage mediated by two gRNAs. The results demonstrate successful site-directed mutagenesis in carrot with CRISPR/Cas9 and the usefulness of a model callus culture to validate genome editing systems. Given that the carrot genome has been sequenced recently, our timely study sheds light on the promising application of genome editing tools for boosting basic and translational research in this important vegetable crop.
Barakate, Abdellah; Stephens, Jennifer
2016-01-01
Modern omics platforms have made the determination of susceptible/resistance genes feasible in any species generating huge numbers of potential targets for crop protection. However, the efforts to validate these targets have been hampered by the lack of a fast, precise, and efficient gene targeting system in plants. Now, the repurposing of clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system has solved this problem. CRISPR/Cas9 is the latest synthetic endonuclease that has revolutionized basic research by allowing facile genome editing in prokaryotes and eukaryotes. Gene knockout is now feasible at an unprecedented efficiency with the possibility of multiplexing several targets and even genome-wide mutagenesis screening. In a short time, this powerful tool has been engineered for an array of applications beyond gene editing. Here, we briefly describe the CRISPR/Cas9 system, its recent improvements and applications in gene manipulation and single DNA/RNA molecule analysis. We summarize a few recent tests targeting plant pathogens and discuss further potential applications in pest control and plant–pathogen interactions that will inform plant breeding for crop protection. PMID:27313592
Sawhney, Neha
2014-01-01
Methylglucuronoarabinoxylan (MeGAXn) from agricultural residues and energy crops is a significant yet underutilized biomass resource for production of biofuels and chemicals. Mild thermochemical pretreatment of bagasse yields MeGAXn requiring saccharifying enzymes for conversion to fermentable sugars. A xylanolytic bacterium, Paenibacillus sp. strain JDR-2, produces an extracellular cell-associated GH10 endoxylanase (XynA1) which efficiently depolymerizes methylglucuronoxylan (MeGXn) from hardwoods coupled with assimilation of oligosaccharides for further processing by intracellular GH67 α-glucuronidase, GH10 endoxylanase, and GH43 β-xylosidase. This process has been ascribed to genes that comprise a xylan utilization regulon that encodes XynA1 and includes a gene cluster encoding transcriptional regulators, ABC transporters, and intracellular enzymes that convert assimilated oligosaccharides to fermentable sugars. Here we show that Paenibacillus sp. JDR-2 utilized MeGAXn without accumulation of oligosaccharides in the medium. The Paenibacillus sp. JDR-2 growth rate on MeGAXn was 3.1-fold greater than that on oligosaccharides generated from MeGAXn by XynA1. Candidate genes encoding GH51 arabinofuranosidases with potential roles were identified. Following growth on MeGAXn, quantitative reverse transcription-PCR identified a cluster of genes encoding a GH51 arabinofuranosidase (AbfB) and transcriptional regulators which were coordinately expressed along with the genes comprising the xylan utilization regulon. The action of XynA1 on MeGAXn generated arabinoxylobiose, arabinoxylotriose, xylobiose, xylotriose, and methylglucuronoxylotriose. Recombinant AbfB processed arabinoxylooligosaccharides to xylooligosaccharides and arabinose. MeGAXn processing by Paenibacillus sp. JDR-2 may be achieved by extracellular depolymerization by XynA1 coupled to assimilation of oligosaccharides and further processing by intracellular enzymes, including AbfB. Paenibacillus sp. JDR-2 provides a GH10/GH67 system complemented with genes encoding intracellular GH51 arabinofuranosidases for efficient utilization of MeGAXn. PMID:25063665
Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi
2017-01-01
Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We investigated the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr) to investigate whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci. Nevertheless, some features are shared in Hox gene components and gene arrangement on the chromosomes, suggesting that Hox gene cluster disintegration in ascidians involved early events common to tunicates as well as later ascidian lineage-specific events.
Ehrlich, Kenneth C; Mack, Brian M
2014-06-23
Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.
Ehrlich, Kenneth C.; Mack, Brian M.
2014-01-01
Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity. PMID:24960201
Hussain, Razak; Kumari, Indu; Sharma, Shikha; Ahmed, Mushtaq; Khan, Tabreiz Ahmad; Akhter, Yusuf
2017-12-01
Trichothecenes are the secondary metabolites produced by Trichoderma spp. Some of these molecules have been reported for their ability to stimulate plant growth by suppressing plant diseases and hence enabling Trichoderma spp. to be efficiently used as biocontrol agents in modern agriculture. Many of the proteins involved in the trichothecenes biosynthetic pathway in Trichoderma spp. are encoded by the genes present in the tri cluster. Tri4 protein catalyzes three consecutive oxygenation reaction steps during biosynthesis of isotrichodiol in the trichothecenes biosynthetic pathway, while tri11 protein catalyzes the C4 hydroxylation of 12, 13-epoxytrichothec-9-ene to produce trichodermol. In the present study, we have homology modelled the three-dimensional structures of tri4 and tri11 proteins. Furthermore, molecular dynamics simulations were carried out to elucidate the mechanism of their action. Both tri4 and tri11 encode for cytochrome P450 monooxygenase like proteins. These data also revealed effector-induced allosteric changes on substrate binding at an alternative binding site and showed potential homotropic negative cooperativity. These analyses also showed that their catalytic mechanism relies on protein-ligand and protein-heme interactions controlled by hydrophobic and hydrogen-bonding interactions which orient the complex in optimal conformation within the active sites.
Gene Cluster Encoding Cholate Catabolism in Rhodococcus spp.
Wilbrink, Maarten H.; Casabon, Israël; Stewart, Gordon R.; Liu, Jie; van der Geize, Robert; Eltis, Lindsay D.
2012-01-01
Bile acids are highly abundant steroids with important functions in vertebrate digestion. Their catabolism by bacteria is an important component of the carbon cycle, contributes to gut ecology, and has potential commercial applications. We found that Rhodococcus jostii RHA1 grows well on cholate, as well as on its conjugates, taurocholate and glycocholate. The transcriptome of RHA1 growing on cholate revealed 39 genes upregulated on cholate, occurring in a single gene cluster. Reverse transcriptase quantitative PCR confirmed that selected genes in the cluster were upregulated 10-fold on cholate versus on cholesterol. One of these genes, kshA3, encoding a putative 3-ketosteroid-9α-hydroxylase, was deleted and found essential for growth on cholate. Two coenzyme A (CoA) synthetases encoded in the cluster, CasG and CasI, were heterologously expressed. CasG was shown to transform cholate to cholyl-CoA, thus initiating side chain degradation. CasI was shown to form CoA derivatives of steroids with isopropanoyl side chains, likely occurring as degradation intermediates. Orthologous gene clusters were identified in all available Rhodococcus genomes, as well as that of Thermomonospora curvata. Moreover, Rhodococcus equi 103S, Rhodococcus ruber Chol-4 and Rhodococcus erythropolis SQ1 each grew on cholate. In contrast, several mycolic acid bacteria lacking the gene cluster were unable to grow on cholate. Our results demonstrate that the above-mentioned gene cluster encodes cholate catabolism and is distinct from a more widely occurring gene cluster encoding cholesterol catabolism. PMID:23024343
Chu, Van Trung; Graf, Robin; Wirtz, Tristan; Weber, Timm; Favret, Jeremy; Li, Xun; Petsch, Kerstin; Tran, Ngoc Tung; Sieweke, Michael H; Berek, Claudia; Kühn, Ralf; Rajewsky, Klaus
2016-11-01
Applying clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9)-mediated mutagenesis to primary mouse immune cells, we used high-fidelity single guide RNAs (sgRNAs) designed with an sgRNA design tool (CrispRGold) to target genes in primary B cells, T cells, and macrophages isolated from a Cas9 transgenic mouse line. Using this system, we achieved an average knockout efficiency of 80% in B cells. On this basis, we established a robust small-scale CRISPR-mediated screen in these cells and identified genes essential for B-cell activation and plasma cell differentiation. This screening system does not require deep sequencing and may serve as a precedent for the application of CRISPR/Cas9 to primary mouse cells.
Roehe, Rainer; Dewhurst, Richard J.; Duthie, Carol-Anne; Rooke, John A.; McKain, Nest; Ross, Dave W.; Hyslop, Jimmy J.; Waterhouse, Anthony; Freeman, Tom C.
2016-01-01
Methane produced by methanogenic archaea in ruminants contributes significantly to anthropogenic greenhouse gas emissions. The host genetic link controlling microbial methane production is unknown and appropriate genetic selection strategies are not developed. We used sire progeny group differences to estimate the host genetic influence on rumen microbial methane production in a factorial experiment consisting of crossbred breed types and diets. Rumen metagenomic profiling was undertaken to investigate links between microbial genes and methane emissions or feed conversion efficiency. Sire progeny groups differed significantly in their methane emissions measured in respiration chambers. Ranking of the sire progeny groups based on methane emissions or relative archaeal abundance was consistent overall and within diet, suggesting that archaeal abundance in ruminal digesta is under host genetic control and can be used to genetically select animals without measuring methane directly. In the metagenomic analysis of rumen contents, we identified 3970 microbial genes of which 20 and 49 genes were significantly associated with methane emissions and feed conversion efficiency respectively. These explained 81% and 86% of the respective variation and were clustered in distinct functional gene networks. Methanogenesis genes (e.g. mcrA and fmdB) were associated with methane emissions, whilst host-microbiome cross talk genes (e.g. TSTA3 and FucI) were associated with feed conversion efficiency. These results strengthen the idea that the host animal controls its own microbiota to a significant extent and open up the implementation of effective breeding strategies using rumen microbial gene abundance as a predictor for difficult-to-measure traits on a large number of hosts. Generally, the results provide a proof of principle to use the relative abundance of microbial genes in the gastrointestinal tract of different species to predict their influence on traits e.g. human metabolism, health and behaviour, as well as to understand the genetic link between host and microbiome. PMID:26891056
Roehe, Rainer; Dewhurst, Richard J; Duthie, Carol-Anne; Rooke, John A; McKain, Nest; Ross, Dave W; Hyslop, Jimmy J; Waterhouse, Anthony; Freeman, Tom C; Watson, Mick; Wallace, R John
2016-02-01
Methane produced by methanogenic archaea in ruminants contributes significantly to anthropogenic greenhouse gas emissions. The host genetic link controlling microbial methane production is unknown and appropriate genetic selection strategies are not developed. We used sire progeny group differences to estimate the host genetic influence on rumen microbial methane production in a factorial experiment consisting of crossbred breed types and diets. Rumen metagenomic profiling was undertaken to investigate links between microbial genes and methane emissions or feed conversion efficiency. Sire progeny groups differed significantly in their methane emissions measured in respiration chambers. Ranking of the sire progeny groups based on methane emissions or relative archaeal abundance was consistent overall and within diet, suggesting that archaeal abundance in ruminal digesta is under host genetic control and can be used to genetically select animals without measuring methane directly. In the metagenomic analysis of rumen contents, we identified 3970 microbial genes of which 20 and 49 genes were significantly associated with methane emissions and feed conversion efficiency respectively. These explained 81% and 86% of the respective variation and were clustered in distinct functional gene networks. Methanogenesis genes (e.g. mcrA and fmdB) were associated with methane emissions, whilst host-microbiome cross talk genes (e.g. TSTA3 and FucI) were associated with feed conversion efficiency. These results strengthen the idea that the host animal controls its own microbiota to a significant extent and open up the implementation of effective breeding strategies using rumen microbial gene abundance as a predictor for difficult-to-measure traits on a large number of hosts. Generally, the results provide a proof of principle to use the relative abundance of microbial genes in the gastrointestinal tract of different species to predict their influence on traits e.g. human metabolism, health and behaviour, as well as to understand the genetic link between host and microbiome.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-07
... economically dynamic regional innovation cluster focused on energy efficient buildings technologies and systems... DEPARTMENT OF ENERGY Energy Efficient Building Systems Regional Innovation Cluster Initiative... February 8, 2010, titled the Energy Efficient Building Systems Regional Innovation Cluster Initiative. A...
Holland, Peter W H
2013-01-01
Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest.
Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis
Koh, Esther G. L.; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V.; Brenner, Sydney; Venkatesh, Byrappa
2003-01-01
The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes. PMID:12547909
Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis.
Koh, Esther G L; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V; Brenner, Sydney; Venkatesh, Byrappa
2003-02-04
The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes.
Lee, Sunhee; Reth, Alexander; Meletzus, Dietmar; Sevilla, Myrna; Kennedy, Christina
2000-01-01
A major 30.5-kb cluster of nif and associated genes of Acetobacter diazotrophicus (syn. Gluconacetobacter diazotrophicus), a nitrogen-fixing endophyte of sugarcane, was sequenced and analyzed. This cluster represents the largest assembly of contiguous nif-fix and associated genes so far characterized in any diazotrophic bacterial species. Northern blots and promoter sequence analysis indicated that the genes are organized into eight transcriptional units. The overall arrangement of genes is most like that of the nif-fix cluster in Azospirillum brasilense, while the individual gene products are more similar to those in species of Rhizobiaceae or in Rhodobacter capsulatus. PMID:11092875
Social networks in primates: smart and tolerant species have more efficient networks.
Pasquaretta, Cristian; Levé, Marine; Claidière, Nicolas; van de Waal, Erica; Whiten, Andrew; MacIntosh, Andrew J J; Pelé, Marie; Bergstrom, Mackenzie L; Borgeaud, Christèle; Brosnan, Sarah F; Crofoot, Margaret C; Fedigan, Linda M; Fichtel, Claudia; Hopper, Lydia M; Mareno, Mary Catherine; Petit, Odile; Schnoell, Anna Viktoria; di Sorrentino, Eugenia Polizzi; Thierry, Bernard; Tiddi, Barbara; Sueur, Cédric
2014-12-23
Network optimality has been described in genes, proteins and human communicative networks. In the latter, optimality leads to the efficient transmission of information with a minimum number of connections. Whilst studies show that differences in centrality exist in animal networks with central individuals having higher fitness, network efficiency has never been studied in animal groups. Here we studied 78 groups of primates (24 species). We found that group size and neocortex ratio were correlated with network efficiency. Centralisation (whether several individuals are central in the group) and modularity (how a group is clustered) had opposing effects on network efficiency, showing that tolerant species have more efficient networks. Such network properties affecting individual fitness could be shaped by natural selection. Our results are in accordance with the social brain and cultural intelligence hypotheses, which suggest that the importance of network efficiency and information flow through social learning relates to cognitive abilities.
Social networks in primates: smart and tolerant species have more efficient networks
Pasquaretta, Cristian; Levé, Marine; Claidière, Nicolas; van de Waal, Erica; Whiten, Andrew; MacIntosh, Andrew J. J.; Pelé, Marie; Bergstrom, Mackenzie L.; Borgeaud, Christèle; Brosnan, Sarah F.; Crofoot, Margaret C.; Fedigan, Linda M.; Fichtel, Claudia; Hopper, Lydia M.; Mareno, Mary Catherine; Petit, Odile; Schnoell, Anna Viktoria; di Sorrentino, Eugenia Polizzi; Thierry, Bernard; Tiddi, Barbara; Sueur, Cédric
2014-01-01
Network optimality has been described in genes, proteins and human communicative networks. In the latter, optimality leads to the efficient transmission of information with a minimum number of connections. Whilst studies show that differences in centrality exist in animal networks with central individuals having higher fitness, network efficiency has never been studied in animal groups. Here we studied 78 groups of primates (24 species). We found that group size and neocortex ratio were correlated with network efficiency. Centralisation (whether several individuals are central in the group) and modularity (how a group is clustered) had opposing effects on network efficiency, showing that tolerant species have more efficient networks. Such network properties affecting individual fitness could be shaped by natural selection. Our results are in accordance with the social brain and cultural intelligence hypotheses, which suggest that the importance of network efficiency and information flow through social learning relates to cognitive abilities. PMID:25534964
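The abstract does not say how network efficiency was computed. A minimal sketch, assuming the common "global efficiency" definition (the mean of inverse shortest-path lengths over all ordered pairs of individuals), is given below. The two four-individual groups and the function names are hypothetical illustrations, not data or code from the study.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def global_efficiency(adjacency):
    """Mean of 1/d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        distances = shortest_path_lengths(adjacency, source)
        total += sum(1.0 / d for node, d in distances.items() if node != source)
    return total / (n * (n - 1))


# Hypothetical groups, both with three undirected ties among four individuals.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

print(global_efficiency(star))   # 0.75
print(global_efficiency(chain))  # about 0.722 (13/18)
```

With the same number of connections, the two arrangements give different efficiencies, which is the kind of contrast an efficiency measure is meant to capture.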
Athey, Taryn B T; Vaillancourt, Katy; Frenette, Michel; Fittipaldi, Nahuel; Gottschalk, Marcelo; Grenier, Daniel
2016-01-01
Recently, we reported the purification and characterization of three distinct lantibiotics (named suicin 90-1330, suicin 3908, and suicin 65) produced by Streptococcus suis. In this study, we investigated the distribution of the three suicin lantibiotic gene clusters among serotype 2 S. suis strains belonging to sequence type (ST) 25 and ST28, the two dominant STs identified in North America. The genomes of 102 strains were interrogated for the presence of suicin gene clusters encoding suicins 90-1330, 3908, and 65. The gene cluster encoding suicin 65 was the most prevalent and mainly found among ST25 strains. In contrast, none of the genes related to suicin 90-1330 production were identified in 51 ST25 strains nor in 35/51 ST28 strains. However, the complete suicin 90-1330 gene cluster was found in ten ST28 strains, although some genes in the cluster were truncated in three of these isolates. The vast majority (101/102) of S. suis strains did not possess any of the genes encoding suicin 3908. In conclusion, this study indicates heterogeneous distribution of suicin genes in S. suis.
2012-01-01
Background: Metallothioneins (MT) are low molecular weight, cysteine rich metal binding proteins, found across genera and species, but their function(s) in abiotic stress tolerance are not well documented. Results: We have characterized a rice MT gene, OsMT1e-P, isolated from a subtractive library generated from a stressed salinity tolerant rice genotype, Pokkali. Bioinformatics analysis of the rice genome sequence revealed that this gene belongs to a multigenic family, which consists of 13 genes with 15 protein products. OsMT1e-P is located on chromosome XI, away from the majority of other type I genes that are clustered on chromosome XII. Various members of this MT gene cluster showed a tight co-regulation pattern under several abiotic stresses. Sequence analysis revealed the presence of conserved cysteine residues in OsMT1e-P protein. Salinity stress was found to regulate the transcript abundance of OsMT1e-P in a developmental and organ specific manner. Using transgenic approach, we found a positive correlation between ectopic expression of OsMT1e-P and stress tolerance. Our experiments further suggest ROS scavenging to be the possible mechanism for multiple stress tolerance conferred by OsMT1e-P. Conclusion: We present an overview of MTs, describing their gene structure, genome localization and expression patterns under salinity and development in rice. We have found that ectopic expression of OsMT1e-P enhances tolerance towards multiple abiotic stresses in transgenic tobacco and the resultant plants could survive and set viable seeds under saline conditions. Taken together, the experiments presented here have indicated that ectopic expression of OsMT1e-P protects against oxidative stress primarily through efficient scavenging of reactive oxygen species. PMID:22780875
Kumar, Gautam; Kushwaha, Hemant Ritturaj; Panjabi-Sabharwal, Vaishali; Kumari, Sumita; Joshi, Rohit; Karan, Ratna; Mittal, Shweta; Pareek, Sneh L Singla; Pareek, Ashwani
2012-07-10
Metallothioneins (MT) are low molecular weight, cysteine rich metal binding proteins, found across genera and species, but their function(s) in abiotic stress tolerance are not well documented. We have characterized a rice MT gene, OsMT1e-P, isolated from a subtractive library generated from a stressed salinity tolerant rice genotype, Pokkali. Bioinformatics analysis of the rice genome sequence revealed that this gene belongs to a multigenic family, which consists of 13 genes with 15 protein products. OsMT1e-P is located on chromosome XI, away from the majority of other type I genes that are clustered on chromosome XII. Various members of this MT gene cluster showed a tight co-regulation pattern under several abiotic stresses. Sequence analysis revealed the presence of conserved cysteine residues in OsMT1e-P protein. Salinity stress was found to regulate the transcript abundance of OsMT1e-P in a developmental and organ specific manner. Using transgenic approach, we found a positive correlation between ectopic expression of OsMT1e-P and stress tolerance. Our experiments further suggest ROS scavenging to be the possible mechanism for multiple stress tolerance conferred by OsMT1e-P. We present an overview of MTs, describing their gene structure, genome localization and expression patterns under salinity and development in rice. We have found that ectopic expression of OsMT1e-P enhances tolerance towards multiple abiotic stresses in transgenic tobacco and the resultant plants could survive and set viable seeds under saline conditions. Taken together, the experiments presented here have indicated that ectopic expression of OsMT1e-P protects against oxidative stress primarily through efficient scavenging of reactive oxygen species.
NASA Astrophysics Data System (ADS)
Leamy, Michael J.; Springer, Adam C.
In this research we report a parallel implementation of a Cellular Automata-based simulation tool for computing elastodynamic response on complex, two-dimensional domains. Elastodynamic simulation using Cellular Automata (CA) has recently been presented as an alternative, inherently object-oriented technique for accurately and efficiently computing linear and nonlinear wave propagation in arbitrarily-shaped geometries. The local, autonomous nature of the method should lead to straightforward and efficient parallelization. We address this notion on symmetric multiprocessor (SMP) hardware using a Java-based object-oriented CA code implementing triangular state machines (i.e., automata) and the MPI bindings written in Java (MPJ Express). We use MPJ Express to reconfigure our existing CA code to distribute a domain's automata to cores present on a dual quad-core shared-memory system (eight total processors). We note that this message passing parallelization strategy is directly applicable to cluster computing, which will be the focus of follow-on research. Results on the shared memory platform indicate nearly-ideal, linear speed-up. We conclude that the CA-based elastodynamic simulator is easily configured to run in parallel, and yields excellent speed-up on SMP hardware.
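To make "distributing a domain's automata to cores" and "linear speed-up" concrete, here is a small sketch in Python rather than the authors' Java/MPJ Express code. The contiguous block partition is an assumed strategy (the abstract does not state how automata were assigned), and the timings in the usage lines are hypothetical placeholders, not reported results.

```python
def block_partition(n_automata, n_workers):
    """Assign automata 0..n_automata-1 to workers in contiguous, near-equal blocks.

    Returns one (start, stop) half-open index range per worker.
    """
    base, extra = divmod(n_automata, n_workers)
    bounds = []
    start = 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more
        bounds.append((start, start + size))
        start += size
    return bounds


def speedup(serial_seconds, parallel_seconds):
    """Speed-up S = T_serial / T_parallel."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds, parallel_seconds, n_workers):
    """Efficiency E = S / p; E = 1.0 corresponds to ideal linear speed-up."""
    return speedup(serial_seconds, parallel_seconds) / n_workers


# Hypothetical numbers: 10 automata on 8 cores, 80 s serial vs 10 s parallel.
print(block_partition(10, 8))
print(speedup(80.0, 10.0))                 # 8.0
print(parallel_efficiency(80.0, 10.0, 8))  # 1.0
```

"Nearly-ideal, linear speed-up" in these terms means an efficiency close to 1.0 as the worker count grows to eight.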
Mihali, Troco K.; Carmichael, Wayne W.; Neilan, Brett A.
2011-01-01
Saxitoxin and its analogs cause the paralytic shellfish-poisoning syndrome, adversely affecting human health and coastal shellfish industries worldwide. Here we report the isolation, sequencing, annotation, and predicted pathway of the saxitoxin biosynthetic gene cluster in the cyanobacterium Lyngbya wollei. The gene cluster spans 36 kb and encodes enzymes for the biosynthesis and export of the toxins. The Lyngbya wollei saxitoxin gene cluster differs from previously identified saxitoxin clusters as it contains genes that are unique to this cluster, whereby the carbamoyltransferase is truncated and replaced by an acyltransferase, explaining the unique toxin profile presented by Lyngbya wollei. These findings will enable the creation of toxin probes, for water monitoring purposes, as well as proof-of-concept for the combinatorial biosynthesis of these natural occurring alkaloids for the production of novel, biologically active compounds. PMID:21347365
Bhatnagar-Mathur, Pooja; Devi, M Jyostna; Reddy, D Srinivas; Lavanya, M; Vadez, Vincent; Serraj, R; Yamaguchi-Shinozaki, K; Sharma, Kiran K
2007-12-01
Water deficit is the major abiotic constraint affecting crop productivity in peanut (Arachis hypogaea L.). Water use efficiency under drought conditions is thought to be one of the most promising traits to improve and stabilize crop yields under intermittent water deficit. A transcription factor DREB1A from Arabidopsis thaliana, driven by the stress inducible promoter from the rd29A gene, was introduced in a drought-sensitive peanut cultivar JL 24 through Agrobacterium tumefaciens-mediated gene transfer. The stress inducible expression of DREB1A in these transgenic plants did not result in growth retardation or visible phenotypic alterations. T3 progeny of fourteen transgenic events were exposed to progressive soil drying in pot culture. The soil moisture threshold where their transpiration rate begins to decline relative to control well-watered (WW) plants and the number of days needed to deplete the soil water was used to rank the genotypes using the average linkage cluster analysis. Five diverse events were selected from the different clusters and further tested. All the selected transgenic events were able to maintain a transpiration rate equivalent to the WW control in soils dry enough to reduce transpiration rate in wild type JL 24. All transgenic events except one achieved higher transpiration efficiency (TE) under WW conditions and this appeared to be explained by a lower stomatal conductance. Under water limiting conditions, one of the selected transgenic events showed 40% higher TE than the untransformed control.
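The ranking step relies on average linkage cluster analysis of two measurements per genotype. A minimal sketch of that technique follows; the (threshold, days) pairs are hypothetical values chosen for illustration, and the variables are left unscaled for brevity, whereas a real analysis would normally standardize them first.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    Starts with one cluster per point and repeatedly merges the two clusters
    whose mean pairwise Euclidean distance is smallest.
    Returns clusters as lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Hypothetical (soil moisture threshold, days to deplete soil water) per event.
events = [(0.30, 10), (0.32, 11), (0.55, 16), (0.57, 17), (0.45, 13)]
print(average_linkage(events, 2))
```

Picking one event from each resulting cluster mirrors the way diverse events were selected for further testing.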
Wu, Mengmeng; Huang, Haidong; Li, Guoqiang; Ren, Yi; Shi, Zhong; Li, Xiaoyan; Dai, Xiaohui; Gao, Ge; Ren, Mengnan; Ma, Ting
2017-04-21
Although clustering of genes from the same metabolic pathway is a widespread phenomenon, the evolution of the polysaccharide biosynthetic gene cluster remains poorly understood. To determine the evolution of this pathway, we identified a scattered production pathway of the polysaccharide sanxan by Sphingomonas sanxanigenens NX02, and compared the distribution of genes between sphingan-producing and other Sphingomonadaceae strains. This allowed us to determine how the scattered sanxan pathway developed, and how the polysaccharide gene cluster evolved. Our findings suggested that the evolution of microbial polysaccharide biosynthesis gene clusters is a lengthy cyclic process comprising cluster 1 → scatter → cluster 2. The sanxan biosynthetic pathway proved the existence of a dispersive process. We also report the complete genome sequence of NX02, in which we identified many unstable genetic elements and powerful secretion systems. Furthermore, nine enzymes for the formation of activated precursors, four glycosyltransferases, four acyltransferases, and four polymerization and export proteins were identified. These genes were scattered in the NX02 genome, and the positive regulator SpnA of sphingans synthesis could not regulate sanxan production. Finally, we concluded that the evolution of the sanxan pathway was independent. NX02 evolved naturally as a polysaccharide producing strain over a long-time evolution involving gene acquisitions and adaptive mutations.
Roelen, Bernard A J; de Graaff, Wim; Forlani, Sylvie; Deschamps, Jacqueline
2002-11-01
The molecular mechanism underlying the 3' to 5' polarity of induction of mouse Hox genes is still elusive. While relief from a cluster-encompassing repression was shown to lead to all Hoxd genes being expressed like the 3'most of them, Hoxd1 (Kondo and Duboule, 1999), the molecular basis of initial activation of this 3'most gene is not yet understood. We show that, already before primitive streak formation, prior to initial expression of the first Hox gene, a dramatic transcriptional stimulation of the 3'most genes, Hoxb1 and Hoxb2, is observed upon a short pulse of exogenous retinoic acid (RA), whereas this is not the case for more 5', cluster-internal, RA-responsive Hoxb genes. In contrast, the RA-responding Hoxb1lacZ transgene that faithfully mimics the endogenous gene (Marshall et al., 1994) did not exhibit the sensitivity of Hoxb1 to precocious activation. We conclude that polarity in initial activation of Hoxb genes reflects a greater availability of 3'Hox genes for transcription, suggesting a pre-existing (susceptibility to) opening of the chromatin structure at the 3' extremity of the cluster. We discuss the data in the context of prevailing models involving differential chromatin opening in the directionality of clustered Hox gene transcription, and regarding the importance of the cluster context for correct timing of initial Hox gene expression. Interestingly, Cdx1 manifested the same early transcriptional availability as Hoxb1. Copyright 2002 Elsevier Science Ireland Ltd.
Shifting forest value orientations in the United States, 1980-2001: A computer content analysis
David N. Bengston; Trevor J. Webb; David P. Fan
2004-01-01
This paper examines three forest value orientations - clusters of interrelated values and basic beliefs about forests - that emerged from an analysis of the public discourse about forest planning, management, and policy in the United States. The value orientations include anthropocentric, biocentric, and moral/spiritual/aesthetic orientations toward forests. Computer...
Coral comparative genomics reveal expanded Hox cluster in the cnidarian-bilaterian ancestor.
DuBuc, Timothy Q; Ryan, Joseph F; Shinzato, Chuya; Satoh, Nori; Martindale, Mark Q
2012-12-01
The key developmental role of the Hox cluster of genes was established prior to the last common ancestor of protostomes and deuterostomes and the subsequent evolution of this cluster has played a major role in the morphological diversity exhibited in extant bilaterians. Despite 20 years of research into cnidarian Hox genes, the nature of the cnidarian-bilaterian ancestral Hox cluster remains unclear. In an attempt to further elucidate this critical phylogenetic node, we have characterized the Hox cluster of the recently sequenced Acropora digitifera genome. The A. digitifera genome contains two anterior Hox genes (PG1 and PG2) linked to an Eve homeobox gene and an Anthox1A gene, which is thought to be either a posterior or posterior/central Hox gene. These data show that the Hox cluster of the cnidarian-bilaterian ancestor was more extensive than previously thought. The results are congruent with the existence of an ancient set of constraints on the Hox cluster and reinforce the importance of incorporating a wide range of animal species to reconstruct critical ancestral nodes.
Broad spectrum antibiotic compounds and use thereof
Koglin, Alexander; Strieker, Matthias
2016-07-05
The discovery of a non-ribosomal peptide synthetase (NRPS) gene cluster in the genome of Clostridium thermocellum that produces a secondary metabolite that is assembled outside of the host membrane is described. Also described is the identification of homologous NRPS gene clusters from several additional microorganisms. The secondary metabolites produced by the NRPS gene clusters exhibit broad spectrum antibiotic activity. Thus, antibiotic compounds produced by the NRPS gene clusters, and analogs thereof, their use for inhibiting bacterial growth, and methods of making the antibiotic compounds are described.
On three-dimensional misorientation spaces.
Krakow, Robert; Bennett, Robbie J; Johnstone, Duncan N; Vukmanovic, Zoja; Solano-Alvarez, Wilberth; Lainé, Steven J; Einsle, Joshua F; Midgley, Paul A; Rae, Catherine M F; Hielscher, Ralf
2017-10-01
Determining the local orientation of crystals in engineering and geological materials has become routine with the advent of modern crystallographic mapping techniques. These techniques enable many thousands of orientation measurements to be made, directing attention towards how such orientation data are best studied. Here, we provide a guide to the visualization of misorientation data in three-dimensional vector spaces, reduced by crystal symmetry, to reveal crystallographic orientation relationships. Domains for all point group symmetries are presented and an analysis methodology is developed and applied to identify crystallographic relationships, indicated by clusters in the misorientation space, in examples from materials science and geology. This analysis aids the determination of active deformation mechanisms and evaluation of cluster centres and spread enables more accurate description of transformation processes supporting arguments regarding provenance.
On three-dimensional misorientation spaces
NASA Astrophysics Data System (ADS)
Krakow, Robert; Bennett, Robbie J.; Johnstone, Duncan N.; Vukmanovic, Zoja; Solano-Alvarez, Wilberth; Lainé, Steven J.; Einsle, Joshua F.; Midgley, Paul A.; Rae, Catherine M. F.; Hielscher, Ralf
2017-10-01
Determining the local orientation of crystals in engineering and geological materials has become routine with the advent of modern crystallographic mapping techniques. These techniques enable many thousands of orientation measurements to be made, directing attention towards how such orientation data are best studied. Here, we provide a guide to the visualization of misorientation data in three-dimensional vector spaces, reduced by crystal symmetry, to reveal crystallographic orientation relationships. Domains for all point group symmetries are presented and an analysis methodology is developed and applied to identify crystallographic relationships, indicated by clusters in the misorientation space, in examples from materials science and geology. This analysis aids the determination of active deformation mechanisms and evaluation of cluster centres and spread enables more accurate description of transformation processes supporting arguments regarding provenance.
Goutaudier, N; Chauchard, E; Melioli, T; Valls, M; van Leeuwen, N; Chabrol, H
2015-09-01
The aim of the study was to explore the typology of adolescents with an immigrant background based on acculturation orientations and to estimate the psychosocial adaptation of the various subtypes. A sample of 228 French high school students with an immigrant background completed a questionnaire assessing acculturation orientations (Immigrant Acculturation Scale; Barrette et al., 2004), antisocial behaviors, depressive symptoms and self-esteem. Cluster analysis based on acculturation orientations was performed using the k-means method. Cluster analysis produced four distinct acculturation profiles: bicultural (31%), separated (28%), marginalized (21%), and assimilated-individualistic (20%). Adolescents in the separated and marginalized clusters, both characterized by rejection of the host culture, reported higher levels of antisocial behavior. Depressive symptoms and self-esteem did not differ between clusters. Several hypotheses may explain the association between separation and delinquency. First, separation and rejection of the host culture may lead to rebellious behavior such as delinquency. Conversely, delinquent behavior may provoke rejection or discrimination by peers or school, or legal sanctions that induce a reciprocal process of rejection of the host culture and separation. The relationship between separation and antisocial behavior may be bidirectional, each one reinforcing the other, resulting in a negative spiral. This study confirms the value of examining acculturation orientations for understanding the antisocial behavior of adolescents with an immigrant background. Copyright © 2014 L’Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
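The k-means method named above alternates between assigning each respondent to the nearest cluster centre and recomputing each centre as the mean of its members. The sketch below assumes two hypothetical score dimensions (orientation toward the heritage culture and toward the host culture) and made-up scores; the actual subscales of the Immigrant Acculturation Scale are not described in the abstract.

```python
from math import dist


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's k-means: returns (labels, centroids) after convergence."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centre
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (heritage orientation, host orientation) scores on a 1-7 scale.
scores = [(6, 6), (7, 6), (6, 2), (7, 1), (2, 1), (1, 2), (2, 6), (1, 7)]
labels, centres = kmeans(scores, [(6, 6), (6, 2), (2, 1), (2, 6)])
print(labels)   # [0, 0, 1, 1, 2, 2, 3, 3]
print(centres)
```

Fixing the initial centroids keeps the example deterministic; in practice k-means is usually run from several random starts because the result depends on initialization.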
A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data.
Nishiyama, Takeshi; Takahashi, Kunihiko; Tango, Toshiro; Pinto, Dalila; Scherer, Stephen W; Takami, Satoshi; Kishino, Hirohisa
2011-05-26
Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.
Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug
2011-11-01
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
Manchaiah, Vinaya; Zhao, Fei; Oladeji, Susan; Ratinaud, Pierre
2018-01-01
Purpose: The current study was aimed at understanding the patterns in the social representation of loud music reported by young adults in different countries. Materials and Methods: The study included a sample of 534 young adults (18–25 years) from India, Iran, Portugal, United Kingdom, and United States. Participants were recruited using convenience sampling, and data were collected using the free association task. Participants were asked to provide up to five words or phrases that come to mind when thinking about “loud music.” The data were first analyzed using qualitative content analysis. This was followed by quantitative cluster analysis and chi-square analysis. Results: The content analysis suggested 19 main categories of responses related to loud music. The cluster analysis resulted in four main clusters, namely: (1) emotional oriented perception; (2) problem oriented perception; (3) music and enjoyment oriented perception; and (4) positive emotional and recreation-oriented perception. Country of origin was associated with the likelihood of participants being in each of these clusters. Conclusion: The current study highlights the differences and similarities in young adults’ perception of loud music. These results may have implications for hearing health education to facilitate healthy listening habits. PMID:29457602
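The association between country and cluster membership is the kind of question a chi-square test of independence addresses. The sketch below computes only the test statistic and degrees of freedom for a contingency table; the counts are hypothetical, and the function name is an illustrative choice rather than anything from the study.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts (e.g. countries x clusters).
    Expected counts assume independence: row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: two countries (rows) by two clusters (columns).
counts = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(counts)
print(statistic, dof)  # about 6.667 with 1 degree of freedom
```

The statistic is then compared against the chi-square distribution with the given degrees of freedom to obtain a p-value, a step omitted here to keep the sketch free of external libraries.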
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
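The general idea of building a co-expression network and splitting it into modules can be sketched as follows. This is a simplification under stated assumptions: it uses the plain Pearson correlation (not the weighted correlation coefficient the abstract mentions), a hypothetical cut-off of 0.9, and connected components as modules. The gene names and expression values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Link genes whose correlation is >= threshold; return connected components."""
    genes = list(expression)
    neighbours = {gene: set() for gene in genes}
    for g, h in combinations(genes, 2):
        if pearson(expression[g], expression[h]) >= threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)
    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        stack, module = [gene], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbours[node] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.5, 2.0],
}
print(coexpression_modules(profiles))
# [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Each resulting module is a candidate group of co-regulated genes that could then be annotated, for example by GO term enrichment as the abstract describes.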
Clustering Algorithms: Their Application to Gene Expression Data
Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel
2016-01-01
Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867
Genetic mechanisms underlying the methylation level of anthocyanins in grape (Vitis vinifera L.)
2011-01-01
Background: Plant color variation is due not only to the global pigment concentration but also to the proportion of different types of pigment. Variation in the color spectrum may arise from secondary modifications, such as hydroxylation and methylation, affecting the chromatic properties of pigments. In grapes (Vitis vinifera L.), the level of methylation modifies the stability and reactivity of anthocyanin, which directly influence the color of the berry. Anthocyanin methylation, as a complex trait, is controlled by multiple molecular factors likely to involve multiple regulatory steps. Results: In a Syrah × Grenache progeny, two QTLs were detected for variation in level of anthocyanin methylation. The first one, explaining up to 27% of variance, colocalized with a cluster of Myb-type transcription factor genes. The second one, explaining up to 20% of variance, colocalized with a cluster of O-methyltransferase coding genes (AOMT). In a collection of 32 unrelated cultivars, MybA and AOMT expression profiles correlated with the level of methylated anthocyanin. In addition, the newly characterized AOMT2 gene presented two SNPs associated with methylation level. These mutations, probably leading to a structural change of the AOMT2 protein, significantly affected the enzyme specific catalytic efficiency for the 3'-O-methylation of delphinidin 3-glucoside. Conclusion: We demonstrated that variation in methylated anthocyanin accumulation is likely to involve both transcriptional regulation and structural variation. We report here the identification of novel AOMT variants likely to cause methylated anthocyanin variation. The integration of QTL mapping and molecular approaches enabled a better understanding of how variation in gene expression and catalytic efficiency of the resulting enzyme may influence the grape anthocyanin profile. PMID:22171701
Genomic analyses of bacterial porin-cytochrome gene clusters
Shi, Liang; Fredrickson, James K.; Zachara, John M.
2014-11-26
In this study, the porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in the microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.
Uhong Lü, Yuhong; Liu, Xiaoli; Wang, Miao; Li, Yuanyuan; Liu, Ning; Bao, Yuxin; Liu, Minghao; Li, Xiaoqian; Wang, Yinyin; Qian, Shenyan; Yue, Changwu; Huang, Ying
2016-09-01
This study aimed to obtain the natural products synthesized by the three putative xiamycin biosynthesis gene clusters that were predicted via antiSMASH during the genome mining of marine Streptomyces sp. FXJ 7.388, Streptomyces sp. FXJ 8.012, and Streptomyces olivaceus FXJ 7.023. Sixteen genes involved in xiamycin assembly, modification, and regulation with higher identity than the newest reported xiamycin biosynthetic gene cluster from marine Streptomyces sp. SCSIO 02999, Streptomyces sp. HKI0576, and Streptomyces sp. FXJ 7.388 were discovered via gene cluster comparative analysis. A ribosome engineering strategy was adopted to activate such cryptic gene clusters with different final concentrations of antibiotics that act on the ribosome, and two indolosesquiterpenes were isolated from idlethaldose streptomycin-resistant Streptomyces sp. FXJ 7.388 strains. However, no such product was detected in Streptomyces sp. FXJ 8.012 and Streptomyces olivaceus FXJ 7.023 under the same treatment. This result suggested that these genes might hold the least gene content for xiamycin biosynthesis.
Lundqvist, M L; Middleton, D L; Hazard, S; Warr, G W
2001-12-14
The region of the duck IgH locus extending from upstream of the proximal diversity (D) segment to downstream of the constant gene cluster has been cloned and mapped. A sequence contig of 48,796 base pairs established that the organization of the genes is D-J(H)-mu-alpha-upsilon. No evidence for a functional homologue (or remnant) of a delta gene was found. The alpha gene is in inverted transcriptional orientation; class switch to IgA expression thus requires inversion of the approximately 27-kilobase pair region that includes both mu and alpha genes. The secreted forms of duck alpha and mu are each encoded by 4 constant region exons, and the hydrophobic C-terminal regions of the membrane receptor forms of alpha and mu are encoded by one and two transmembrane exons, respectively. Putative switch (S) regions were identified for duck mu and upsilon by comparison with chicken Smu and Supsilon sequences and for duck alpha by comparison with mouse Salpha. The duck IgH locus is rich in complex variable number tandem repeats, which occupy approximately 60% of the sequenced region, and occur at a much higher frequency in the IgH locus than in other sequenced regions of the duck genome.
Genes encoding cuticular proteins are components of the Nimrod gene cluster in Drosophila.
Cinege, Gyöngyi; Zsámboki, János; Vidal-Quadras, Maite; Uv, Anne; Csordás, Gábor; Honti, Viktor; Gábor, Erika; Hegedűs, Zoltán; Varga, Gergely I B; Kovács, Attila L; Juhász, Gábor; Williams, Michael J; Andó, István; Kurucz, Éva
2017-08-01
The Nimrod gene cluster, located on the second chromosome of Drosophila melanogaster, is the largest syntenic unit of the Drosophila genome. Nimrod genes show blood cell specific expression and code for phagocytosis receptors that play a major role in fruit fly innate immune functions. We previously identified three homologous genes (vajk-1, vajk-2 and vajk-3) located within the Nimrod cluster, which are unrelated to the Nimrod genes, but are homologous to a fourth gene (vajk-4) located outside the cluster. Here we show that, unlike the Nimrod candidates, the Vajk proteins are expressed in cuticular structures of the late embryo and the late pupa, indicating that they contribute to cuticular barrier functions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma.
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D; Ober, Carole; Nicolae, Dan L; Barnes, Kathleen C; London, Stephanie J; Gilliland, Frank; Weiss, Scott T; Raby, Benjamin A; Cohn, Lauren; Chupp, Geoffrey L
2015-05-15
The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10−6) and hospitalization (P = 0.01), respectively. There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.
Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren
2015-01-01
Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10−6) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
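The idea of scoring an individual candidate gene group with a cluster validation index can be illustrated with a minimal sketch. The abstract does not name the indices used, so the silhouette width and the correlation distance (1 minus Pearson correlation) below are assumptions chosen for illustration, as are the function names and the toy expression profiles.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance(x, y):
    """Correlation distance: 0 for perfectly co-expressed genes, 2 for opposite."""
    return 1.0 - pearson(x, y)


def group_silhouette(profiles, group):
    """Mean silhouette width of a candidate group against all remaining genes.

    Assumes the group has at least two genes and that at least one gene
    lies outside it.
    """
    others = [g for g in profiles if g not in group]
    scores = []
    for g in group:
        mates = [m for m in group if m != g]
        a = sum(distance(profiles[g], profiles[m]) for m in mates) / len(mates)
        b = sum(distance(profiles[g], profiles[o]) for o in others) / len(others)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: g1/g2 rise together, g3/g4 fall together.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}
print(group_silhouette(profiles, {"g1", "g2"}))  # close to 1: coherent group
print(group_silhouette(profiles, {"g1", "g3"}))  # negative: incoherent group
```

In a workflow like the one described, candidate groups would come from the nodes of one or more hierarchical clustering trees, and each would be ranked by such an index rather than being judged only as part of a full partition.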
Morata, Jordi; Puigdomènech, Pere
2017-02-08
Cucurbitaceae species contain a significantly lower number of genes coding for proteins with similarity to plant resistance genes belonging to the NBS-LRR family than other plant species of similar genome size. A large proportion of these genes are organized in clusters that appear to be hotspots of variability. The genomes of the Cucurbitaceae species measured until now are intermediate in size (between 350 and 450 Mb) and they apparently have not undergone any genome duplications beside those at the origin of eudicots. The cluster containing the largest number of NBS-LRR genes has previously been analyzed in melon and related species and showed a high degree of interspecific and intraspecific variability. It was of interest to study whether similar behavior occurred in other clusters of the same family of genes. The cluster of NBS-LRR genes located in melon chromosome 9 was analyzed and compared with the syntenic regions in other cucurbit genomes. This is the second-largest cluster in this species and it contains nine sequences with an NBS-LRR annotation including two genes, Fom1 and Prv, providing resistance against Fusarium and Papaya ring-spot virus (PRSV). The variability within the melon species appears to consist essentially of single nucleotide polymorphisms. Clusters of similar genes are present in the syntenic regions of the two species of Cucurbitaceae that were sequenced, cucumber and watermelon. Most of the genes in the syntenic clusters can be aligned between species and a hypothesis of generation of the cluster is proposed. The number of genes in the watermelon cluster is similar to that in melon while a higher number of genes (12) is present in cucumber, a species with a smaller genome than melon. After comparing genome resequencing data of 115 cucumber varieties, deletion of a group of genes is observed in a group of varieties of Indian origin.
Clusters of genes coding for NBS-LRR proteins in cucurbits appear to have specific variability in different regions of the genome and between different species. This observation is in favour of considering that the adaptation of plant species to changing environments is based upon the variability that may occur at any location in the genome and that has been produced by specific mechanisms of sequence variation acting on plant genomes. This information could be useful both to understand the evolution of species and for plant breeding.
Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita
2016-04-01
Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
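A neighborhood graph built from several rounds of MST can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define a "round", so the sketch assumes each round computes an MST on the edges not used by earlier rounds and takes the union of all rounds. The eigenanalysis step is omitted, and the function names and sample points are hypothetical.

```python
import itertools
import math


def mst_edges(n, weighted_edges):
    """Kruskal's algorithm: edges (i, j) of a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for _, i, j in sorted(weighted_edges):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            chosen.append((i, j))
    return chosen


def k_mst_graph(points, k):
    """Union of k successive MSTs, each built on edges unused so far (assumed)."""
    n = len(points)
    remaining = [
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    ]
    graph = set()
    for _ in range(k):
        round_edges = set(mst_edges(n, remaining))
        graph |= round_edges
        remaining = [e for e in remaining if (e[1], e[2]) not in round_edges]
    return graph


# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(sorted(k_mst_graph(points, 1)))
print(len(k_mst_graph(points, 2)))
```

The resulting edge set could then be turned into a similarity matrix whose spectrum is analyzed, which is the part of the method this sketch leaves out.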
High-throughput platform for the discovery of elicitors of silent bacterial gene clusters.
Seyedsayamdost, Mohammad R
2014-05-20
Over the past decade, bacterial genome sequences have revealed an immense reservoir of biosynthetic gene clusters, sets of contiguous genes that have the potential to produce drugs or drug-like molecules. However, the majority of these gene clusters appear to be inactive for unknown reasons prompting terms such as "cryptic" or "silent" to describe them. Because natural products have been a major source of therapeutic molecules, methods that rationally activate these silent clusters would have a profound impact on drug discovery. Herein, a new strategy is outlined for awakening silent gene clusters using small molecule elicitors. In this method, a genetic reporter construct affords a facile read-out for activation of the silent cluster of interest, while high-throughput screening of small molecule libraries provides potential inducers. This approach was applied to two cryptic gene clusters in the pathogenic model Burkholderia thailandensis. The results not only demonstrate a prominent activation of these two clusters, but also reveal that the majority of elicitors are themselves antibiotics, most in common clinical use. Antibiotics, which kill B. thailandensis at high concentrations, act as inducers of secondary metabolism at low concentrations. One of these antibiotics, trimethoprim, served as a global activator of secondary metabolism by inducing at least five biosynthetic pathways. Further application of this strategy promises to uncover the regulatory networks that activate silent gene clusters while at the same time providing access to the vast array of cryptic molecules found in bacteria.
A conserved gene cluster as a putative functional unit in insect innate immunity.
Somogyi, Kálmán; Sipos, Botond; Pénzes, Zsolt; Andó, István
2010-11-05
The Nimrod gene superfamily is an important component of the innate immune response. The majority of its member genes are located in close proximity within the Drosophila melanogaster genome and they lie in a larger conserved cluster ("Nimrod cluster"), made up of non-related groups (families, superfamilies) of genes. This cluster has been a part of the Arthropod genomes for about 300-350 million years. The available data suggest that the Nimrod cluster is a functional module of the insect innate immune response. Copyright © 2010 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Development of an Efficient Genome Editing Tool in Bacillus licheniformis Using CRISPR-Cas9 Nickase.
Li, Kaifeng; Cai, Dongbo; Wang, Zhangqian; He, Zhili; Chen, Shouwen
2018-03-15
Bacillus strains are important industrial bacteria that can produce various biochemical products. However, low transformation efficiencies and a lack of effective genome editing tools have hindered their widespread application. Recently, clustered regularly interspaced short palindromic repeat (CRISPR)-Cas9 techniques have been utilized in many organisms as genome editing tools because of their high efficiency and easy manipulation. In this study, an efficient genome editing method was developed for Bacillus licheniformis using a CRISPR-Cas9 nickase integrated into the genome of B. licheniformis DW2 with overexpression driven by the P43 promoter. The yvmC gene was deleted using the CRISPR-Cas9n technique with homology arms of 1.0 kb as a representative example, and an efficiency of 100% was achieved. In addition, two genes were simultaneously disrupted with an efficiency of 11.6%, and the large DNA fragment bacABC (42.7 kb) was deleted with an efficiency of 79.0%. Furthermore, the heterologous reporter gene aprN, which codes for nattokinase in Bacillus subtilis, was inserted into the chromosome of B. licheniformis with an efficiency of 76.5%. The activity of nattokinase in the DWc9nΔ7/pP43SNT-S sacC strain reached 59.7 fibrinolytic units (FU)/ml, which was 25.7% higher than that of DWc9n/pP43SNT-S sacC. Finally, the engineered strain DWc9nΔ7 (Δepr ΔwprA Δmpr ΔaprE Δvpr ΔbprA ΔbacABC), with multiple disrupted genes, was constructed using the CRISPR-Cas9n technique. Taken together, we have developed an efficient genome editing tool based on CRISPR-Cas9n in B. licheniformis. This tool could be applied to strain improvement for future research. IMPORTANCE: As important industrial bacteria, Bacillus strains have attracted significant attention due to their production of biological products. However, genetic manipulation of these bacteria is difficult.
The CRISPR-Cas9 system has been applied to genome editing in some bacteria, and CRISPR-Cas9n was proven to be an efficient and precise tool in previous reports. The significance of our research is the development of an efficient, more precise, and systematic genome editing method for single-gene deletion, multiple-gene disruption, large DNA fragment deletion, and single-gene integration in Bacillus licheniformis via Cas9 nickase. We also applied this method to the genetic engineering of the host strain for protein expression. Copyright © 2018 American Society for Microbiology.
Zhang, Xiujun; Parry, Ronald J.
2007-01-01
The pyrrolomycins are a family of polyketide antibiotics, some of which contain a nitro group. To gain insight into the nitration mechanism associated with the formation of these antibiotics, the pyrrolomycin biosynthetic gene cluster from Actinosporangium vitaminophilum was cloned. Sequencing of ca. 56 kb of A. vitaminophilum DNA revealed 35 open reading frames (ORFs). Sequence analysis revealed a clear relationship between some of these ORFs and the biosynthetic gene cluster for pyoluteorin, a structurally related antibiotic. Since a gene transfer system could not be devised for A. vitaminophilum, additional proof for the identity of the cloned gene cluster was sought by cloning the pyrrolomycin gene cluster from Streptomyces sp. strain UC 11065, a transformable pyrrolomycin producer. Sequencing of ca. 26 kb of UC 11065 DNA revealed the presence of 17 ORFs, 15 of which exhibit strong similarity to ORFs in the A. vitaminophilum cluster as well as a nearly identical organization. Single-crossover disruption of two genes in the UC 11065 cluster abolished pyrrolomycin production in both cases. These results confirm that the genetic locus cloned from UC 11065 is essential for pyrrolomycin production, and they also confirm that the highly similar locus in A. vitaminophilum encodes pyrrolomycin biosynthetic genes. Sequence analysis revealed that both clusters contain genes encoding the two components of an assimilatory nitrate reductase. This finding suggests that nitrite is required for the formation of the nitrated pyrrolomycins. However, sequence analysis did not provide additional insights into the nitration process, suggesting the operation of a novel nitration mechanism. PMID:17158935
Zhang, Junyong; Chang, Shaoqing; Suryanto, Bryan H R; Gong, Chunhua; Zeng, Xianghua; Zhao, Chuan; Zeng, Qingdao; Xie, Jingli
2016-06-06
Taking advantage of a continuous-flow apparatus, the iridium(III)-containing polytungstate cluster K12Na2H2[Ir2Cl8P2W20O72]·37H2O (1) was obtained in a reasonable yield (13% based on IrCl3·H2O). Compound 1 was characterized by Fourier transform IR, UV-visible, 31P NMR, electrospray ionization mass spectrometry (ESI-MS), and thermogravimetric analysis measurements. 31P NMR, ESI-MS, and elemental analysis all indicated 1 was a new polytungstate cluster compared with the reported K14[(IrCl4)KP2W20O72] compound. Intriguingly, the successful isolation of 1 relied on the custom-built flow apparatus, demonstrating the uniqueness of continuous-flow chemistry to achieve crystalline materials. The catalytic properties of 1 were assessed by investigating the activity on catalyzing the electro-oxidation of ruthenium tris-2,2'-bipyridine [Ru(bpy)3]2+/3+. The voltammetric behavior suggested a coupled catalytic behavior between [Ru(bpy)3]3+/2+ and 1. Furthermore, on the highly oriented pyrolytic graphite surface, 1,3,5-tris(10-carboxydecyloxy) benzene (TCDB) was used as the two-dimensional host network to coassemble cluster 1; the surface morphology was observed by scanning tunneling microscope technique. An "S" shape of 1 was observed, indicating that the cluster could be accommodated in the cavity formed by two TCDB host molecules, leading to a TCDB/cluster binary structure.
Fast and robust generation of feature maps for region-based visual attention.
Aziz, Muhammad Zaheer; Mertsching, Bärbel
2008-05-01
Visual attention is one of the important phenomena in biological vision which can be followed to achieve more efficiency, intelligence, and robustness in artificial vision systems. This paper investigates a region-based approach that performs pixel clustering prior to the processes of attention in contrast to late clustering as done by contemporary methods. The foundation steps of feature map construction for the region-based attention model are proposed here. The color contrast map is generated based upon the extended findings from the color theory, the symmetry map is constructed using a novel scanning-based method, and a new algorithm is proposed to compute a size contrast map as a formal feature channel. Eccentricity and orientation are computed using the moments of obtained regions and then saliency is evaluated using the rarity criteria. The efficient design of the proposed algorithms allows incorporating five feature channels while maintaining a processing rate of multiple frames per second. Another salient advantage over the existing techniques is the reusability of the salient regions in the high-level machine vision procedures due to preservation of their shapes and precise locations. The results indicate that the proposed model has the potential to efficiently integrate the phenomenon of attention into the main stream of machine vision and systems with restricted computing resources such as mobile robots can benefit from its advantages.
Targeting Clusters, Achieving Excellence.
ERIC Educational Resources Information Center
Rosenfeld, Stuart; Jacobs, Jim; Liston, Cynthia
2003-01-01
Suggests that groups, or clusters, of industries form partnerships with community colleges in order to positively impact economic development. Asserts that a cluster-oriented community college system requires innovation, specialized resources and expertise, knowledge of trends, and links to industry. Offers suggestions for developing such a…
Cao, Huojun; Amendt, Brad A
2016-11-01
Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis). A Python package (pySAPC) implementing a sparse affinity propagation clustering algorithm for large datasets was developed. Whole genome pair-wise similarity was calculated from expression patterns across 45 microarrays covering several stages of odontogenesis. pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis. Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects. By using a sparse similarity matrix, pySAPC uses much less memory and CPU time compared with the original affinity propagation program that uses a full similarity matrix. This Python package will be useful for many applications where dataset(s) are too large to use a full similarity matrix. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016. Published by Elsevier B.V.
Evolution of Chemical Diversity in Echinocandin Lipopeptide Antifungal Metabolites
Yue, Qun; Chen, Li; Zhang, Xiaoling; Li, Kuan; Sun, Jingzu; Liu, Xingzhong
2015-01-01
The echinocandins are a class of antifungal drugs that includes caspofungin, micafungin, and anidulafungin. Gene clusters encoding most of the structural complexity of the echinocandins provided a framework for hypotheses about the evolutionary history and chemical logic of echinocandin biosynthesis. Gene orthologs among echinocandin-producing fungi were identified. Pathway genes, including the nonribosomal peptide synthetases (NRPSs), were analyzed phylogenetically to address the hypothesis that these pathways represent descent from a common ancestor. The clusters share cooperative gene contents and linkages among the different strains. Individual pathway genes analyzed in the context of similar genes formed unique echinocandin-exclusive phylogenetic lineages. The echinocandin NRPSs, along with the NRPS from the inp gene cluster in Aspergillus nidulans and its orthologs, comprise a novel lineage among fungal NRPSs. NRPS adenylation domains from different species exhibited a one-to-one correspondence between modules and amino acid specificity that is consistent with models of tandem duplication and subfunctionalization. Pathway gene trees and Ascomycota phylogenies are congruent and consistent with the hypothesis that the echinocandin gene clusters have a common origin. The disjunct Eurotiomycete-Leotiomycete distribution appears to be consistent with a scenario of vertical descent accompanied by incomplete lineage sorting and loss of the clusters from most lineages of the Ascomycota. We present evidence for a single evolutionary origin of the echinocandin family of gene clusters and a progression of structural diversification in two fungal classes that diverged approximately 290 to 390 million years ago. Lineage-specific gene cluster evolution driven by selection of new chemotypes contributed to diversification of the molecular functionalities. PMID:26024901
[The application of genome editing in identification of plant gene function and crop breeding].
Zhou, Xiang-chun; Xing, Yong-zhong
2016-03-01
Plant genomes can be modified via current biotechnology with high specificity and excellent efficiency. Zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system are the key engineered nucleases used in genome editing. Genome editing techniques enable gene targeted mutagenesis, gene knock-out, gene insertion or replacement at the target sites during the endogenous DNA repair process, including non-homologous end joining (NHEJ) and homologous recombination (HR), triggered by the induction of DNA double-strand break (DSB). Genome editing has been successfully applied in the genome modification of diverse plant species, such as Arabidopsis thaliana, Oryza sativa, and Nicotiana tabacum. In this review, we summarize the application of genome editing in identification of plant gene function and crop breeding. Moreover, we also discuss aspects of genome editing that need improvement for precise genetic improvement of crops, as directions for further study.
Verslues, Paul E.; Lasky, Jesse R.; Juenger, Thomas E.; Liu, Tzu-Wen; Kumar, M. Nagaraj
2014-01-01
Arabidopsis (Arabidopsis thaliana) exhibits natural genetic variation in drought response, including varying levels of proline (Pro) accumulation under low water potential. As Pro accumulation is potentially important for stress tolerance and cellular redox control, we conducted a genome-wide association (GWAS) study of low water potential-induced Pro accumulation using a panel of natural accessions and publicly available single-nucleotide polymorphism (SNP) data sets. Candidate genomic regions were prioritized for subsequent study using metrics considering both the strength and spatial clustering of the association signal. These analyses found many candidate regions likely containing gene(s) influencing Pro accumulation. Reverse genetic analysis of several candidates identified new Pro effector genes, including thioredoxins and several genes encoding Universal Stress Protein A domain proteins. These new Pro effector genes further link Pro accumulation to cellular redox and energy status. Additional new Pro effector genes found include the mitochondrial protease LON1, ribosomal protein RPL24A, protein phosphatase 2A subunit A3, a MADS box protein, and a nucleoside triphosphate hydrolase. Several of these new Pro effector genes were from regions with multiple SNPs, each having moderate association with Pro accumulation. This pattern supports the use of summary approaches that incorporate clusters of SNP associations in addition to consideration of individual SNP probability values. Further GWAS-guided reverse genetics promises to find additional effectors of Pro accumulation. The combination of GWAS and reverse genetics to efficiently identify new effector genes may be especially applicable for traits difficult to analyze by other genetic screening methods. PMID:24218491
A bacterial aromatic aldehyde dehydrogenase critical for the efficient catabolism of syringaldehyde.
Kamimura, Naofumi; Goto, Takayuki; Takahashi, Kenji; Kasai, Daisuke; Otsuka, Yuichiro; Nakamura, Masaya; Katayama, Yoshihiro; Fukuda, Masao; Masai, Eiji
2017-03-15
Vanillin and syringaldehyde obtained from lignin are essential intermediates for the production of basic chemicals using microbial cell factories. However, in contrast to vanillin, the microbial conversion of syringaldehyde is poorly understood. Here, we identified an aromatic aldehyde dehydrogenase (ALDH) gene responsible for syringaldehyde catabolism from 20 putative ALDH genes of Sphingobium sp. strain SYK-6. All these genes were expressed in Escherichia coli, and nine gene products, including previously characterized BzaA, BzaB, and vanillin dehydrogenase (LigV), exhibited oxidation activities for syringaldehyde to produce syringate. Among these genes, SLG_28320 (desV) and ligV were most highly and constitutively transcribed in the SYK-6 cells. Disruption of desV in SYK-6 resulted in a significant reduction in growth on syringaldehyde and in syringaldehyde oxidation activity. Furthermore, a desV ligV double mutant almost completely lost its ability to grow on syringaldehyde. Purified DesV showed similar kcat/Km values for syringaldehyde (2100 s−1·mM−1) and vanillin (1700 s−1·mM−1), whereas LigV substantially preferred vanillin (8800 s−1·mM−1) over syringaldehyde (1.4 s−1·mM−1). These results clearly demonstrate that desV plays a major role in syringaldehyde catabolism. Phylogenetic analyses showed that DesV-like ALDHs formed a distinct phylogenetic cluster separated from the vanillin dehydrogenase cluster.
A bacterial aromatic aldehyde dehydrogenase critical for the efficient catabolism of syringaldehyde
Kamimura, Naofumi; Goto, Takayuki; Takahashi, Kenji; Kasai, Daisuke; Otsuka, Yuichiro; Nakamura, Masaya; Katayama, Yoshihiro; Fukuda, Masao; Masai, Eiji
2017-01-01
Vanillin and syringaldehyde obtained from lignin are essential intermediates for the production of basic chemicals using microbial cell factories. However, in contrast to vanillin, the microbial conversion of syringaldehyde is poorly understood. Here, we identified an aromatic aldehyde dehydrogenase (ALDH) gene responsible for syringaldehyde catabolism from 20 putative ALDH genes of Sphingobium sp. strain SYK-6. All these genes were expressed in Escherichia coli, and nine gene products, including previously characterized BzaA, BzaB, and vanillin dehydrogenase (LigV), exhibited oxidation activities for syringaldehyde to produce syringate. Among these genes, SLG_28320 (desV) and ligV were most highly and constitutively transcribed in the SYK-6 cells. Disruption of desV in SYK-6 resulted in a significant reduction in growth on syringaldehyde and in syringaldehyde oxidation activity. Furthermore, a desV ligV double mutant almost completely lost its ability to grow on syringaldehyde. Purified DesV showed similar kcat/Km values for syringaldehyde (2100 s−1·mM−1) and vanillin (1700 s−1·mM−1), whereas LigV substantially preferred vanillin (8800 s−1·mM−1) over syringaldehyde (1.4 s−1·mM−1). These results clearly demonstrate that desV plays a major role in syringaldehyde catabolism. Phylogenetic analyses showed that DesV-like ALDHs formed a distinct phylogenetic cluster separated from the vanillin dehydrogenase cluster. PMID:28294121
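The substrate preference described in this abstract can be made concrete by taking the ratio of the reported kcat/Km values. The snippet below is only a worked-arithmetic sketch: the four numbers come from the abstract, while the dictionary layout and the function name `preference_ratio` are illustrative choices, not anything the authors provide.

```python
# kcat/Km values in s^-1 mM^-1, copied from the abstract.
# The data layout and function name are assumptions made for illustration.
KCAT_OVER_KM = {
    "DesV": {"syringaldehyde": 2100.0, "vanillin": 1700.0},
    "LigV": {"syringaldehyde": 1.4, "vanillin": 8800.0},
}


def preference_ratio(enzyme: str, substrate_a: str, substrate_b: str) -> float:
    """Return (kcat/Km for substrate_a) divided by (kcat/Km for substrate_b)."""
    values = KCAT_OVER_KM[enzyme]
    return values[substrate_a] / values[substrate_b]


if __name__ == "__main__":
    for enzyme in KCAT_OVER_KM:
        ratio = preference_ratio(enzyme, "vanillin", "syringaldehyde")
        print(f"{enzyme}: vanillin / syringaldehyde = {ratio:.2f}")
```

A ratio close to 1 (DesV) corresponds to the "similar" efficiencies reported, while a ratio in the thousands (LigV) corresponds to the strong preference for vanillin.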
Clusters of antibiotic resistance genes enriched together stay together in swine agriculture
Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; ...
2016-04-12
Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics.
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. In addition, as governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically.
Clusters of antibiotic resistance genes enriched together stay together in swine agriculture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong
Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics.
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. In addition, as governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically.
Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture.
Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M
2016-04-12
Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. 
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. As governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically. Copyright © 2016 Johnson et al.
Biophysical Constraints Arising from Compositional Context in Synthetic Gene Networks.
Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M
2017-07-26
Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.
Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun
2017-12-01
Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search for gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two-dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression patterns in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. Contact: sunkim.bioinfo@snu.ac.kr.
Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
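To make step (i) of the TimesVector abstract more tangible, the following is a minimal sketch of clustering "time-condition concatenated vectors". It is not the TimesVector implementation: the dimension-reduction part is omitted, a plain cosine-similarity k-means is swapped in as a stand-in clustering method, and every name here (`concatenate_conditions`, `cosine_kmeans`, the toy gene and condition labels) is hypothetical.

```python
import math


def concatenate_conditions(expr_by_condition):
    """Join each gene's time series across conditions into one long vector.

    expr_by_condition maps condition -> gene -> list of expression values
    over time (hypothetical input layout chosen for this sketch).
    """
    conditions = sorted(expr_by_condition)
    genes = sorted(expr_by_condition[conditions[0]])
    return {
        gene: [v for cond in conditions for v in expr_by_condition[cond][gene]]
        for gene in genes
    }


def unit(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_kmeans(vectors, seed_genes, n_iter=20):
    """Assign each gene to the centroid with the highest cosine similarity.

    seed_genes names the genes whose vectors initialise the centroids, which
    keeps this sketch deterministic.
    """
    data = {gene: unit(vec) for gene, vec in vectors.items()}
    centroids = [data[gene] for gene in seed_genes]
    labels = {}
    for _ in range(n_iter):
        labels = {
            gene: max(range(len(centroids)), key=lambda k, v=vec: dot(v, centroids[k]))
            for gene, vec in data.items()
        }
        for k in range(len(centroids)):
            members = [data[g] for g, lab in labels.items() if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = unit([sum(col) for col in zip(*members)])
    return labels


if __name__ == "__main__":
    toy = {
        "control": {"g1": [1, 2, 3], "g2": [1, 2, 4], "g3": [1, 2, 3], "g4": [1, 2, 3]},
        "stress": {"g1": [1, 2, 3], "g2": [1, 2, 3], "g3": [3, 2, 1], "g4": [4, 2, 1]},
    }
    print(cosine_kmeans(concatenate_conditions(toy), seed_genes=["g1", "g3"]))
```

In the toy data, g1 and g2 rise in both conditions while g3 and g4 rise in one and fall in the other; concatenating the conditions before clustering is what lets the two groups separate even though they look alike within the control condition.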
ORGANIZATION OF THE nif GENES OF THE NONHETEROCYSTOUS CYANOBACTERIUM TRICHODESMIUM SP. IMS101.
Dominic, Benny; Zani, Sabino; Chen, Yi-Bu; Mellon, Mark T; Zehr, Jonathan P
2000-08-26
An approximately 16-kb fragment of the Trichodesmium sp. IMS101 (a nonheterocystous filamentous cyanobacterium) "conventional" nif gene cluster was cloned and sequenced. The gene organization of the Trichodesmium and Anabaena variabilis vegetative (nif2) nitrogenase gene clusters spanning the region from nifB to nifW is similar except for the absence of two open reading frames (ORF3 and ORF1) in Trichodesmium. The Trichodesmium nifEN genes encode a fused NifEN polypeptide that does not appear to be processed into individual NifE and NifN polypeptides. Fused nifEN genes were previously found in the A. variabilis nif2 genes, but we have found that fused nifEN genes are widespread in the nonheterocystous cyanobacteria. Although the gene organization of the nonheterocystous filamentous Trichodesmium nif gene cluster is very similar to that of the A. variabilis vegetative nif2 gene cluster, phylogenetic analysis of nif sequences does not support close relatedness of Trichodesmium and A. variabilis vegetative (nif2) nitrogenase genes.
Neural Differentiation of Embryonic Stem Cells In Vitro: A Road Map to Neurogenesis in the Embryo
Abranches, Elsa; Silva, Margarida; Pradier, Laurent; Schulz, Herbert; Hummel, Oliver; Henrique, Domingos; Bekman, Evguenia
2009-01-01
Background: The in vitro generation of neurons from embryonic stem (ES) cells is a promising approach to produce cells suitable for neural tissue repair and cell-based replacement therapies of the nervous system. Available methods to promote ES cell differentiation towards neural lineages attempt to replicate, in different ways, the multistep process of embryonic neural development. However, to achieve this aim in an efficient and reproducible way, a better knowledge of the cellular and molecular events that are involved in the process, from the initial specification of neuroepithelial progenitors to their terminal differentiation into neurons and glial cells, is required. Methodology/Principal Findings: In this work, we characterize the main stages and transitions that occur when ES cells are driven into a neural fate, using an adherent monolayer culture system. We established improved conditions to routinely produce highly homogeneous cultures of neuroepithelial progenitors, which organize into neural tube-like rosettes when they acquire competence for neuronal production. Within rosettes, neuroepithelial progenitors display morphological and functional characteristics of their embryonic counterparts, namely, apico-basal polarity, active Notch signalling, and proper timing of production of neurons and glia. In order to characterize the global gene activity correlated with each particular stage of neural development, the full transcriptome of different cell populations that arise during the in vitro differentiation protocol was determined by microarray analysis. By using embryo-oriented criteria to cluster the differentially expressed genes, we define five gene expression signatures that correlate with successive stages in the path from ES cells to neurons.
These include a gene signature for a primitive ectoderm-like stage that appears after ES cells enter differentiation, and three gene signatures for subsequent stages of neural progenitor development, from an early stage that follows neural induction to a final stage preceding terminal differentiation. Conclusions/Significance: Overall, our work confirms and extends the cellular and molecular parallels between monolayer ES cell neural differentiation and embryonic neural development, revealing in addition novel aspects of the genetic network underlying the multistep process that leads from uncommitted cells to differentiated neurons. PMID:19621087
Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference
Stone, Eric A.; Ayroles, Julien F.
2009-01-01
In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and clustering genes is a common way to reduce scale and improve inference. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation. PMID:19424432
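The objective function named in this abstract, modularity, can be illustrated with a short sketch. The code below evaluates the standard Newman modularity of a given partition of a weighted undirected graph; it does not perform the edge-weight modulation or the maximization that MMC adds on top, and the function name and toy graph are assumptions made for illustration only.

```python
def modularity(weights, communities):
    """Newman modularity Q for a partition of an undirected weighted graph.

    weights: symmetric matrix (list of lists); weights[i][j] is the edge weight.
    communities: list with the community label of each vertex.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree of each vertex
    two_m = sum(strength)                     # every edge counted from both ends
    if two_m == 0:
        return 0.0
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                # observed weight minus the weight expected by chance
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 weight matrix from a list of undirected edges."""
    weights = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        weights[a][b] = 1.0
        weights[b][a] = 1.0
    return weights


if __name__ == "__main__":
    # Two triangles joined by a single bridging edge (toy example).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    graph = adjacency_from_edges(6, edges)
    print(modularity(graph, [0, 0, 0, 1, 1, 1]))  # split along the bridge
    print(modularity(graph, [0, 0, 0, 0, 0, 0]))  # everything in one group
```

Splitting the toy graph along the bridge scores higher than lumping all vertices together, which is the sense in which a higher modularity indicates tightly connected groups.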
A mixture model-based approach to the clustering of microarray expression data.
McLachlan, G J; Bean, R W; Peel, D
2002-03-01
This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
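The gene-screening step described in this abstract can be written compactly. The notation below is an assumption introduced for illustration (the abstract gives no symbols): for gene j, L_{1j} and L_{2j} denote the maximized likelihoods of the one- and two-component t mixtures fitted to that gene's values across tissues, s_j is the size of the smaller of the two fitted clusters, and b_1 and b_2 are the two thresholds. Reading the cluster-size threshold as a minimum on the smaller cluster is likewise an interpretation, not something the abstract states.

```latex
% Likelihood ratio statistic for one versus two mixture components (gene j)
-2\log\lambda_j \;=\; 2\left[\log L_{2j} - \log L_{1j}\right]

% Gene j is retained for clustering the tissues when both thresholds are met
-2\log\lambda_j > b_1
\quad\text{and}\quad
s_j \ge b_2
```

Genes are then ranked by the size of the statistic, so that genes whose expression is clearly better described by two groups of tissues than by one are the ones carried forward.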
Cho, Bomsoo; Pierre-Louis, Gandhy; Sagner, Andreas; Eaton, Suzanne; Axelrod, Jeffrey D
2015-05-01
The core components of the planar cell polarity (PCP) signaling system, including both transmembrane and peripheral membrane associated proteins, form asymmetric complexes that bridge apical intercellular junctions. While these can assemble in either orientation, coordinated cell polarization requires the enrichment of complexes of a given orientation at specific junctions. This might occur by both positive and negative feedback between oppositely oriented complexes, and requires the peripheral membrane associated PCP components. However, the molecular mechanisms underlying feedback are not understood. We find that the E3 ubiquitin ligase complex Cullin1(Cul1)/SkpA/Supernumerary limbs(Slimb) regulates the stability of one of the peripheral membrane components, Prickle (Pk). Excess Pk disrupts PCP feedback and prevents asymmetry. We show that Pk participates in negative feedback by mediating internalization of PCP complexes containing the transmembrane components Van Gogh (Vang) and Flamingo (Fmi), and that internalization is activated by oppositely oriented complexes within clusters. Pk also participates in positive feedback through an unknown mechanism promoting clustering. Our results therefore identify a molecular mechanism underlying generation of asymmetry in PCP signaling.
Cell Alignment Required in Differentiation of Myxococcus xanthus
NASA Astrophysics Data System (ADS)
Kim, Seung K.; Kaiser, Dale
1990-08-01
During fruiting body morphogenesis of Myxococcus xanthus, cell movement is required for transmission of C-factor, a short range intercellular signaling protein necessary for sporulation and developmental gene expression. Nonmotile cells fail to sporulate and to express C-factor-dependent genes, but both defects were rescued by a simple manipulation of cell position that oriented the cells in aligned, parallel groups. A similar pattern of aligned cells normally results from coordinated recruitment of wild-type cells into multicellular aggregates, which later form mature fruiting bodies. It is proposed that directed cell movement establishes critical contacts between adjacent cells, which are required for efficient intercellular C-factor transmission.
Yamamoto, Ayumu; West, Robert R.; McIntosh, J. Richard; Hiraoka, Yasushi
1999-01-01
Meiotic recombination requires pairing of homologous chromosomes, the mechanisms of which remain largely unknown. When pairing occurs during meiotic prophase in fission yeast, the nucleus oscillates between the cell poles driven by astral microtubules. During these oscillations, the telomeres are clustered at the spindle pole body (SPB), located at the leading edge of the moving nucleus and the rest of each chromosome dangles behind. Here, we show that the oscillatory nuclear movement of meiotic prophase is dependent on cytoplasmic dynein. We have cloned the gene encoding a cytoplasmic dynein heavy chain of fission yeast. Most of the cells disrupted for the gene show no gross defect during mitosis and complete meiosis to form four viable spores, but they lack the nuclear movements of meiotic prophase. Thus, the dynein heavy chain is required for these oscillatory movements. Consistent with its essential role in such nuclear movement, dynein heavy chain tagged with green fluorescent protein (GFP) is localized at astral microtubules and the SPB during the movements. In dynein-disrupted cells, meiotic recombination is significantly reduced, indicating that the dynein function is also required for efficient meiotic recombination. In accordance with the reduced recombination, which leads to reduced crossing over, chromosome missegregation is increased in the mutant. Moreover, both the formation of a single cluster of centromeres and the colocalization of homologous regions on a pair of homologous chromosomes are significantly inhibited in the mutant. These results strongly suggest that the dynein-driven nuclear movements of meiotic prophase are necessary for efficient pairing of homologous chromosomes in fission yeast, which in turn promotes efficient meiotic recombination. PMID:10366596
Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu
2017-01-10
VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as the genetic contexts related to the transfer of these traits, in newly sequenced pathogenic bacterial genomes. Its backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. By integrating the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or with a varied set of bacterial genomes. VRprofile might help to meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Ten Billion Years of Brightest Cluster Galaxy Alignments
NASA Astrophysics Data System (ADS)
West, Michael J.
2017-07-01
Astronomers long assumed that galaxies are randomly oriented in space. However, it's now clear that some have preferred orientations with respect to their surroundings. Chief among these are the giant ellipticals found at the centers of rich galaxy clusters, whose major axes are often aligned with those of their host clusters - a remarkable coherence of structures over millions of light years. A better understanding of these alignments can yield new insights into the processes that have shaped galaxies over the history of the universe. Using Hubble Space Telescope observations of high-redshift galaxy clusters, we show for the first time that such alignments are seen at epochs when the universe was only one-third its current age. These results suggest that the brightest galaxies in clusters are the product of a special formation history, one influenced by development of the cosmic web over billions of years.
Fragmentation of an aflatoxin-like gene cluster in a forest pathogen
USDA-ARS's Scientific Manuscript database
Secondary metabolic pathway genes are typically clustered in fungi. An exception to this paradigm is seen for genes required for the production of dothistromin, an aflatoxin-like virulence factor produced by the pine needle pathogen Dothistroma septosporum. In contrast to the tight clustering of gen...
Zhou, Zhenxing; Xu, Qingqing; Bu, Qingting; Guo, Yuanyang; Liu, Shuiping; Liu, Yu; Du, Yiling; Li, Yongquan
2015-02-09
Genomic sequencing of actinomycetes has revealed the presence of numerous gene clusters seemingly capable of natural product biosynthesis, yet most clusters are cryptic under laboratory conditions. Bioinformatics analysis of the completely sequenced genome of Streptomyces chattanoogensis L10 (CGMCC 2644) revealed a silent angucycline biosynthetic gene cluster. The overexpression of a pathway-specific activator gene under the constitutive ermE* promoter successfully triggered the expression of the angucycline biosynthetic genes. Two novel members of the angucycline antibiotic family, chattamycins A and B, were further isolated and elucidated. Biological activity assays demonstrated that chattamycin B possesses good antitumor activities against human cancer cell lines and moderate antibacterial activities. The results presented here provide a feasible method to activate silent angucycline biosynthetic gene clusters to discover potential new drug leads. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The ergot alkaloid gene cluster: functional analyses and evolutionary aspects.
Lorenz, Nicole; Haarmann, Thomas; Pazoutová, Sylvie; Jung, Manfred; Tudzynski, Paul
2009-01-01
Ergot alkaloids and their derivatives have traditionally been used as therapeutic agents for migraine, for blood pressure regulation, and as aids in childbirth and abortion. Their production in submerged culture is a long-established biotechnological process. Ergot alkaloids are produced mainly by members of the genus Claviceps, with Claviceps purpurea the best-investigated species with respect to the biochemistry of ergot alkaloid synthesis (EAS). Genes encoding enzymes involved in EAS have been shown to be clustered; functional analyses of EAS cluster genes have allowed specific functions to be assigned to several gene products. Various Claviceps species differ with respect to their host specificity and their alkaloid content; comparison of the ergot alkaloid clusters in these species (and of clavine alkaloid clusters in other genera) yields interesting insights into the evolution of cluster structure. This review focuses on recently published and as yet unpublished data on the structure and evolution of the EAS gene cluster and on the function and regulation of cluster genes. These analyses also have significant biotechnological implications: the characterization of non-ribosomal peptide synthetases (NRPS) involved in the synthesis of the peptide moiety of ergopeptines has opened interesting perspectives for the synthesis of ergot alkaloids; in addition, defined mutants could be generated that produce interesting intermediates or only single peptide alkaloids (instead of the alkaloid mixtures usually produced by industrial strains).
Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S
2016-12-01
Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408
Rapid generation of genetic diversity by multiplex CRISPR/Cas9 genome editing in rice.
Shen, Lan; Hua, Yufeng; Fu, Yaping; Li, Jian; Liu, Qing; Jiao, Xiaozhen; Xin, Gaowei; Wang, Junjie; Wang, Xingchun; Yan, Changjie; Wang, Kejian
2017-05-01
The clustered regularly interspaced short palindromic repeats (CRISPR)-associated endonuclease 9 (CRISPR/Cas9) system has emerged as a promising technology for specific genome editing in many species. Here we constructed one vector targeting eight agronomic genes in rice using the CRISPR/Cas9 multiplex genome editing system. By subsequent genetic transformation and DNA sequencing, we found that the eight target genes had high mutation efficiencies in the T0 generation. Both heterozygous and homozygous mutations of all edited genes were obtained in T0 plants. In addition, homozygous sextuple, septuple, and octuple mutants were identified. Owing to the abundant genotypes in T0 transgenic plants, various phenotypes related to the edited genes were observed. The findings demonstrate the potential of the CRISPR/Cas9 system for rapid introduction of genetic diversity during crop breeding.
Hu, Peinan; Zhao, Xueying; Zhang, Qinghua; Li, Weiming; Zu, Yao
2018-03-02
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system has been proven to be an efficient and precise genome editing technology in various organisms. However, the gene editing efficiencies of Cas9 proteins with a nuclear localization signal (NLS) fused to different termini and Cas9 mRNA have not been systematically compared. Here, we compared the ability of Cas9 proteins with NLS fused to the N-, C-, or both the N- and C-termini and N-NLS-Cas9-NLS-C mRNA to target two sites in the tyr gene and two sites in the gol gene related to pigmentation in zebrafish. Phenotypic analysis revealed that all types of Cas9 led to hypopigmentation in similar proportions of injected embryos. Genome analysis by T7 Endonuclease I (T7E1) assays demonstrated that all types of Cas9 similarly induced mutagenesis in four target sites. Sequencing results further confirmed that a high frequency of indels occurred in the target sites ( tyr1 > 66%, tyr2 > 73%, gol1 > 50%, and gol2 > 35%), as well as various types (more than six) of indel mutations observed in all four types of Cas9-injected embryos. Furthermore, all types of Cas9 showed efficient targeted mutagenesis on multiplex genome editing, resulting in multiple phenotypes simultaneously. Collectively, we conclude that various NLS-fused Cas9 proteins and Cas9 mRNAs have similar genome editing efficiencies on targeting single or multiple genes, suggesting that the efficiency of CRISPR/Cas9 genome editing is highly dependent on guide RNAs (gRNAs) and gene loci. These findings may help to simplify the selection of Cas9 for gene editing using the CRISPR/Cas9 system. Copyright © 2018 Hu et al.
Danielsson, Rebecca; Dicksved, Johan; Sun, Li; Gonda, Horacio; Müller, Bettina; Schnürer, Anna; Bertilsson, Jan
2017-01-01
Methane (CH4) is produced as an end product from feed fermentation in the rumen. Yield of CH4 varies between individuals despite identical feeding conditions. To get a better understanding of factors behind the individual variation, 73 dairy cows given the same feed but differing in CH4 emissions were investigated with focus on fiber digestion, fermentation end products and bacterial and archaeal composition. In total 21 cows (12 Holstein, 9 Swedish Red) identified as persistent low, medium or high CH4 emitters over a 3 month period were furthermore chosen for analysis of microbial community structure in rumen fluid. This was assessed by sequencing the V4 region of 16S rRNA gene and by quantitative qPCR of targeted Methanobrevibacter groups. The results showed a positive correlation between low CH4 emitters and higher abundance of Methanobrevibacter ruminantium clade. Principal coordinate analysis (PCoA) on operational taxonomic unit (OTU) level of bacteria showed two distinct clusters (P < 0.01) that were related to CH4 production. One cluster was associated with low CH4 production (referred to as cluster L) whereas the other cluster was associated with high CH4 production (cluster H) and the medium emitters occurred in both clusters. The differences between clusters were primarily linked to differential abundances of certain OTUs belonging to Prevotella. Moreover, several OTUs belonging to the family Succinivibrionaceae were dominant in samples belonging to cluster L. Fermentation pattern of volatile fatty acids showed that proportion of propionate was higher in cluster L, while proportion of butyrate was higher in cluster H. No difference was found in milk production or organic matter digestibility between cows. 
Cows in cluster L had lower CH4/kg energy corrected milk (ECM) than cows in cluster H (8.3 compared to 9.7 g CH4/kg ECM), showing that low-CH4 cows utilized the feed more efficiently for milk production, which might indicate a more efficient microbial population or host genetic differences that are reflected in bacterial and archaeal (or methanogen) populations. PMID:28261182
Scoring clustering solutions by their biological relevance.
Gat-Viks, I; Sharan, R; Shamir, R
2003-12-12
A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.
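As a rough illustration of the variance-ratio idea mentioned in the abstract above, the sketch below computes the ratio of between-group to within-group variance for values that have already been projected onto the real line. The function name, the toy data, and the use of the classical one-way ANOVA F form are assumptions made for illustration; the paper's own variance estimators, its projection step, and its non-parametric test are not reproduced here.

```python
from statistics import mean


def variance_ratio(groups):
    """Between-group over within-group variance (one-way ANOVA F form).

    `groups` is a list of clusters, each a list of 1-D projected values.
    A higher ratio means the clusters are better separated on this axis.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of clusters
    n = len(all_values)      # number of clustered elements
    # Spread of the cluster means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    # Spread of the elements around their own cluster mean.
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within


# Hypothetical toy data: the same six values grouped in two different ways.
tight = [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9]]   # clusters match the structure
loose = [[1.0, 5.1, 0.9], [5.0, 1.1, 4.9]]   # clusters mix the two modes
```

Comparing `variance_ratio(tight)` with `variance_ratio(loose)` ranks the two candidate solutions, which is the kind of comparison between clustering solutions the abstract describes.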
Wu, Changsheng; Du, Chao; Gubbens, Jacob; Choi, Young Hae; van Wezel, Gilles P
2015-10-23
Actinomycetes are a major source of antimicrobials, anticancer compounds, and other medically important products, and their genomes harbor extensive biosynthetic potential. Major challenges in the screening of these microorganisms are to activate the expression of cryptic biosynthetic gene clusters and the development of technologies for efficient dereplication of known molecules. Here we report the identification of a previously unidentified isatin-type antibiotic produced by Streptomyces sp. MBT28, following a strategy based on NMR-based metabolomics combined with the introduction of streptomycin resistance in the producer strain. NMR-guided isolation by tracking the target proton signal resulted in the characterization of 7-prenylisatin (1) with antimicrobial activity against Bacillus subtilis. The metabolite-guided genome mining of Streptomyces sp. MBT28 combined with proteomics identified a gene cluster with an indole prenyltransferase that catalyzes the conversion of tryptophan into 7-prenylisatin. This study underlines the applicability of NMR-based metabolomics in facilitating the discovery of novel antibiotics.
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
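The general idea of adding a size-related term to the K-means assignment cost can be sketched as follows. This is only a minimal illustration under stated assumptions: the penalty used here (a fixed weight `lam` times the current class size) is a hypothetical stand-in and is not the adaptive constraint term of the paper, and the points are plain 2-D coordinates rather than cryo-EM images.

```python
def penalized_kmeans(points, centers, lam=0.0, iters=20):
    """K-means on 2-D points with a class-size penalty (illustrative only).

    Assignment cost = squared distance to the centre + lam * current size
    of that class, so crowded classes become less attractive as they fill.
    With lam = 0 this reduces to ordinary K-means (Lloyd's algorithm).
    """
    k = len(centers)
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        sizes = [0] * k  # class sizes are rebuilt on every pass
        for i, (x, y) in enumerate(points):
            costs = [(x - cx) ** 2 + (y - cy) ** 2 + lam * sizes[j]
                     for j, (cx, cy) in enumerate(centers)]
            labels[i] = costs.index(min(costs))
            sizes[labels[i]] += 1
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a class ends up empty
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels, centers


# Hypothetical toy data: five points near the origin and one far away.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
plain_labels, _ = penalized_kmeans(pts, [(0, 0), (10, 10)], lam=0.0)
balanced_labels, _ = penalized_kmeans(pts, [(0, 0), (10, 10)], lam=1000.0)
```

With `lam=0.0` the class sizes follow the data (five and one here); with a very large `lam` the sizes are forced towards balance at the expense of accuracy, which shows why a fixed penalty is a crude choice and an adaptive term is attractive.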
Oxygen Vacancy Linear Clustering in a Perovskite Oxide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eom, Kitae; Choi, Euiyoung; Choi, Minsu; ...
2017-07-14
Oxygen vacancies have been implicitly assumed to be isolated, and an understanding of oxide materials that may contain oxygen vacancies remains elusive within the isolated-vacancy scheme, even though oxygen vacancies play a decisive role in oxide materials. We report the presence of oxygen vacancy linear clusters and their orientation along a specific crystallographic direction in SrTiO3, a representative perovskite oxide. The presence of the linear clusters and associated electron localization was revealed by an electronic structure showing an increase in the Ti2+ valence state, or the corresponding Ti 3d2 electronic configuration, along with divacancy cluster model analysis and transport measurements. The orientation of the linear clusters along the [001] direction in perovskite SrTiO3 was verified by further X-ray diffuse scattering analysis. Because SrTiO3 is an archetypical perovskite oxide, the vacancy linear clustering with its specific alignment direction and electron localization can be extended to a wide variety of perovskite oxides.
Gil-Serna, Jessica; García-Díaz, Marta; González-Jaén, María Teresa; Vázquez, Covadonga; Patiño, Belén
2018-03-02
Ochratoxin A (OTA), which is produced by several Aspergillus and Penicillium species, is one of the most important mycotoxins because of its toxic properties and worldwide distribution. Knowledge of the OTA biosynthetic genes and an understanding of the mechanisms involved in their regulation are essential. In this work, we obtained a clear picture of the organization of the biosynthetic genes in the main OTA-producing Aspergillus and Penicillium species (A. steynii, A. westerdijkiae, A. niger, A. carbonarius and P. nordicum) using complete genome sequences obtained in this work or previously available in databases. The results revealed a region containing five ORFs predicted to encode five proteins (halogenase, bZIP transcription factor, cytochrome P450 monooxygenase, non-ribosomal peptide synthetase and polyketide synthase) in all five species. Genetic synteny was conserved in both Penicillium and Aspergillus species, although genomic location seemed to differ since the clusters had different flanking regions (except for A. steynii and A. westerdijkiae); these observations support the hypothesis that this genomic region is orthologous and might have been acquired by horizontal transfer. New real-time RT-PCR assays for quantification of the expression of these OTA biosynthetic genes were developed. In all species, the five genes were consistently expressed in OTA-producing strains in permissive conditions. These protocols might favour future studies on the regulation of biosynthetic genes in order to develop new efficient control methods to avoid OTA entering the food chain. Copyright © 2018 Elsevier B.V. All rights reserved.
Strategic groups, performance, and strategic response in the nursing home industry.
Zinn, J S; Aaronson, W E; Rosko, M D
1994-01-01
OBJECTIVE. This study examines the effect of strategic group membership on nursing home performance and strategic behavior. DATA SOURCES AND STUDY SETTING. Data from the 1987 Medicare and Medicaid Automated Certification Survey were combined with data from the 1987 and 1989 Pennsylvania Long Term Care Facility Questionnaire. The sample consisted of 383 Pennsylvania nursing homes. STUDY DESIGN. Cluster analysis was used to place the 383 nursing homes into strategic groups on the basis of variables measuring scope and resource deployment. Performance was measured by indicators of the quality of nursing home care (rates of pressure ulcers, catheterization, and restraint usage) and efficiency in services provision. Changes in Medicare participation after passage of the 1988 Medicare Catastrophic Coverage Act (MCCA) measured strategic behavior. MANOVA and Tukey HSD post hoc means tests determined if significant differences were associated with strategic group membership. FINDINGS. Cluster analysis produced an optimal seven-group solution. Differences in group means were significant for the clustering, performance, and conduct variables (p < .0001). Strategic groups characterized by facilities providing a continuum of care services had the best patient care outcomes. The most efficient groups were characterized by facilities with high Medicare census. While all strategic groups increased Medicare census following passage of the MCCA, those dominated by for-profits had the greatest increases. CONCLUSIONS. Our analysis demonstrates that strategic orientation influences nursing home response to regulatory initiatives, a factor that should be recognized in policy formation directed at nursing home reform. PMID:8005789
Salim, Shelly; Moh, Sangman; Choi, Dongmin; Chung, Ilyong
2014-08-11
A cognitive radio sensor network (CRSN) is a wireless sensor network whose sensor nodes are equipped with cognitive radio capability. Clustering is one of the most challenging issues in CRSNs, as all sensor nodes, including the cluster head, have to use the same frequency band in order to form a cluster. However, due to the nature of heterogeneous channels in cognitive radio, it is difficult for sensor nodes to find a cluster head. This paper proposes a novel energy-efficient and compact clustering scheme named clustering with temporary support nodes (CENTRE). CENTRE efficiently achieves a compact cluster formation by adopting two-phase cluster formation with fixed duration. By introducing a novel concept of temporary support nodes to improve the cluster formation, the proposed scheme enables sensor nodes in a network to find a cluster head efficiently. The performance study shows that not only is the clustering process efficient and compact but it also results in remarkable energy savings that prolong the overall network lifetime. In addition, the proposed scheme decreases both the clustering overhead and the average distance between cluster heads and their members. PMID:25116905
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebhaber, S.A.; Weiss, I.; Cash, F.E.
Synthesis of normal human hemoglobin A, α2β2, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5′ to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located approximately 30 kilobases 5′ from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes.
Calculating the Motion and Direction of Flux Transfer Events with Cluster
NASA Technical Reports Server (NTRS)
Collado-Vega, Yaireska M.; Sibeck, David Gary
2011-01-01
We use multi-point timing analysis to determine the orientation and motion of flux transfer events (FTEs) detected by the four Cluster spacecraft on the high-latitude dayside and flank magnetopause during 2002 and 2003. During these years, the distances between the Cluster spacecraft were greater than 1000 km, providing the tetrahedral configuration needed to select events and determine velocities. Each velocity and location will be examined in detail and compared to the velocities and locations determined by the predictions of the component and antiparallel reconnection models for event formation, orientation, motion, and acceleration for a wide range of spacecraft locations and solar wind conditions.
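A common textbook form of four-spacecraft timing analysis can be sketched as below. It assumes a locally planar structure moving at constant velocity V along its unit normal n, so that (r_i - r_0) · n = V (t_i - t_0) for each spacecraft i; solving the resulting 3x3 linear system for m = n / V gives both the normal and the speed. The function names and the example geometry are hypothetical, and the abstract does not state which formulation the authors used.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def timing_analysis(positions, times):
    """Estimate (unit normal, speed) from four crossings of a planar structure.

    positions: four (x, y, z) tuples in km; times: four crossing times in s.
    Solves D m = dt with Cramer's rule, where the rows of D are r_i - r_0
    and m = n / V. Returns the unit normal n and the speed V in km/s.
    """
    r0, t0 = positions[0], times[0]
    D = [[p[k] - r0[k] for k in range(3)] for p in positions[1:]]
    dt = [t - t0 for t in times[1:]]
    d = det3(D)
    if abs(d) < 1e-12:
        raise ValueError("spacecraft are coplanar; a tetrahedron is required")
    m = []
    for col in range(3):
        Dc = [row[:] for row in D]
        for r in range(3):
            Dc[r][col] = dt[r]   # replace one column by the time delays
        m.append(det3(Dc) / d)
    norm = math.sqrt(sum(c * c for c in m))
    if norm == 0.0:
        raise ValueError("all crossing times are identical")
    return [c / norm for c in m], 1.0 / norm


# Hypothetical geometry: 1000 km separations, structure moving along +x
# at 100 km/s, so only the spacecraft displaced in x sees a 10 s delay.
sc_positions = [(0, 0, 0), (1000, 0, 0), (0, 1000, 0), (0, 0, 1000)]
sc_times = [0.0, 10.0, 0.0, 0.0]
normal, speed = timing_analysis(sc_positions, sc_times)
```

The coplanarity check reflects the point made in the abstract that a tetrahedral configuration is needed: if the four spacecraft lie in one plane, the system is singular and the velocity cannot be determined.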
Discovery of a Phosphonoacetic Acid Derived Natural Product by Pathway Refactoring.
Freestone, Todd S; Ju, Kou-San; Wang, Bin; Zhao, Huimin
2017-02-17
The activation of silent natural product gene clusters is a synthetic biology problem of great interest. As the rate at which gene clusters are identified outpaces the discovery rate of new molecules, this unknown chemical space is rapidly growing, as too are the rewards for developing technologies to exploit it. One class of natural products that has been underrepresented is phosphonic acids, which have important medical and agricultural uses. Hundreds of phosphonic acid biosynthetic gene clusters have been identified encoding for unknown molecules. Although methods exist to elicit secondary metabolite gene clusters in native hosts, they require the strain to be amenable to genetic manipulation. One method to circumvent this is pathway refactoring, which we implemented in an effort to discover new phosphonic acids from a gene cluster from Streptomyces sp. strain NRRL F-525. By reengineering this cluster for expression in the production host Streptomyces lividans, utility of refactoring is demonstrated with the isolation of a novel phosphonic acid, O-phosphonoacetic acid serine, and the characterization of its biosynthesis. In addition, a new biosynthetic branch point is identified with a phosphonoacetaldehyde dehydrogenase, which was used to identify additional phosphonic acid gene clusters that share phosphonoacetic acid as an intermediate.
The intact dupA cluster is a more reliable Helicobacter pylori virulence marker than dupA alone.
Jung, Sung Woo; Sugimoto, Mitsushige; Shiota, Seiji; Graham, David Y; Yamaoka, Yoshio
2012-01-01
The duodenal ulcer promoting (dupA) gene, located in the plasticity region of Helicobacter pylori, is associated with duodenal ulcer development. dupA was predicted to form a type IV secretory system (T4SS) with vir genes around dupA (dupA cluster). We investigated the prevalence of dupA and dupA clusters and clarified associations between the dupA cluster status and clinical outcomes in the U.S. population. In all, 245 H. pylori strains were examined using PCR to evaluate the status of dupA and the adjacent vir genes predicted to form T4SS, in addition to the status of the cag pathogenicity island (PAI). The associations between dupA cluster status and interleukin-8 (IL-8) and IL-12 production were also examined. The presence of dupA and all adjacent vir genes was defined as a complete dupA cluster. Many variations related to the status of dupA and dupA cluster genes were identified. Concurrent H. pylori infection and the presence of a complete dupA cluster increased duodenal ulcer risk compared to H. pylori infection with an incomplete dupA cluster or without the dupA gene, independent of cag PAI status (adjusted odds ratio, 2.13; 95% confidence interval, 1.13 to 4.03). Gastric mucosal IL-8 levels were also significantly higher in the complete dupA cluster group than in other groups (P = 0.01). In conclusion, although the causal relationship between the dupA cluster and duodenal ulcer development has not been proved, the presence of a complete dupA cluster, but not dupA alone, is associated with duodenal ulcer development. PMID:22038914
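For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below shows the unadjusted calculation from a 2x2 table using the standard log-odds (Woolf) interval. The counts are hypothetical and chosen only for illustration; the adjusted odds ratio reported in the abstract comes from the authors' own adjusted analysis, which this sketch does not reproduce.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-odds) confidence interval.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    z = 1.96 gives an approximate 95% interval. All counts must be > 0.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval whose lower bound stays above 1, as with the 1.13 to 4.03 interval quoted in the abstract, is what supports describing the association as statistically significant.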
Fast gene ontology based clustering for microarray experiments.
Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa
2008-11-21
Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology (GO) annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R-based open-source semantic similarity package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram, genes sharing a GO term can be identified, and differences in their gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Campbell, Elsie L; Hagen, Kari D; Chen, Rui; Risser, Douglas D; Ferreira, Daniela P; Meeks, John C
2015-02-15
In cyanobacterial Nostoc species, substratum-dependent gliding motility is confined to specialized nongrowing filaments called hormogonia, which differentiate from vegetative filaments as part of a conditional life cycle and function as dispersal units. Here we confirm that Nostoc punctiforme hormogonia are positively phototactic to white light over a wide range of intensities. N. punctiforme contains two gene clusters (clusters 2 and 2i), each of which encodes modular cyanobacteriochrome-methyl-accepting chemotaxis proteins (MCPs) and other proteins that putatively constitute a basic chemotaxis-like signal transduction complex. Transcriptional analysis established that all genes in clusters 2 and 2i, plus two additional clusters (clusters 1 and 3) with genes encoding MCPs lacking cyanobacteriochrome sensory domains, are upregulated during the differentiation of hormogonia. Mutational analysis determined that only genes in cluster 2i are essential for positive phototaxis in N. punctiforme hormogonia; here these genes are designated ptx (for phototaxis) genes. The cluster is unusual in containing complete or partial duplicates of genes encoding proteins homologous to the well-described chemotaxis elements CheY, CheW, MCP, and CheA. The cyanobacteriochrome-MCP gene (ptxD) lacks transmembrane domains and has 7 potential binding sites for bilins. The transcriptional start site of the ptx genes does not resemble a sigma 70 consensus recognition sequence; moreover, it is upstream of two genes encoding gas vesicle proteins (gvpA and gvpC), which also are expressed only in the hormogonium filaments of N. punctiforme.
van der Geize, R.; de Jong, W.; Hessels, G. I.; Grommen, A. W. F.; Jacobs, A. A. C.; Dijkhuizen, L.
2008-01-01
A novel method to efficiently generate unmarked in-frame gene deletions in Rhodococcus equi was developed, exploiting the cytotoxic effect of 5-fluorocytosine (5-FC) by the action of cytosine deaminase (CD) and uracil phosphoribosyltransferase (UPRT) enzymes. The opportunistic, intracellular pathogen R. equi is resistant to high concentrations of 5-FC. Introduction of Escherichia coli genes encoding CD and UPRT conferred conditional lethality to R. equi cells incubated with 5-FC. To exemplify the use of the codA::upp cassette as counter-selectable marker, an unmarked in-frame gene deletion mutant of R. equi was constructed. The supA and supB genes, part of a putative cholesterol catabolic gene cluster, were efficiently deleted from the R. equi wild-type genome. Phenotypic analysis of the generated ΔsupAB mutant confirmed that supAB are essential for growth of R. equi on cholesterol. Macrophage survival assays revealed that the ΔsupAB mutant is able to survive and proliferate in macrophages comparable to wild type. Thus, cholesterol metabolism does not appear to be essential for macrophage survival of R. equi. The CD-UPRT based 5-FC counter-selection may become a useful asset in the generation of unmarked in-frame gene deletions in other actinobacteria as well, as actinobacteria generally appear to be 5-FC resistant and 5-FU sensitive. PMID:18984616
On three-dimensional misorientation spaces
Bennett, Robbie J.; Vukmanovic, Zoja; Solano-Alvarez, Wilberth; Lainé, Steven J.; Einsle, Joshua F.; Midgley, Paul A.; Rae, Catherine M. F.; Hielscher, Ralf
2017-01-01
Determining the local orientation of crystals in engineering and geological materials has become routine with the advent of modern crystallographic mapping techniques. These techniques enable many thousands of orientation measurements to be made, directing attention towards how such orientation data are best studied. Here, we provide a guide to the visualization of misorientation data in three-dimensional vector spaces, reduced by crystal symmetry, to reveal crystallographic orientation relationships. Domains for all point group symmetries are presented and an analysis methodology is developed and applied to identify crystallographic relationships, indicated by clusters in the misorientation space, in examples from materials science and geology. This analysis aids the determination of active deformation mechanisms and evaluation of cluster centres and spread enables more accurate description of transformation processes supporting arguments regarding provenance. PMID:29118660
Identifying a gene expression signature of cluster headache in blood
Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.
2017-01-01
Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859
Proven and novel strategies for efficient editing of the human genome.
Mussolino, Claudio; Mlambo, Tafadzwa; Cathomen, Toni
2015-10-01
Targeted gene editing with designer nucleases has become increasingly popular. The most commonly used designer nuclease platforms are engineered meganucleases, zinc-finger nucleases, transcription activator-like effector nucleases and the clustered regularly interspaced short palindromic repeat/Cas9 system. These powerful tools have greatly facilitated the generation of plant and animal models for basic research, and harbor an enormous potential for applications in biotechnology and gene therapy. This review recapitulates proven concepts of targeted genome engineering in primary human cells and elaborates on novel concepts that became possible with the dawn of RNA-guided nucleases and RNA-guided transcription factors.
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.
Tan, Qihua; Thomassen, Mads; Burton, Mark; Mose, Kristian Fredløv; Andersen, Klaus Ejner; Hjelmborg, Jacob; Kruse, Torben
2017-06-06
Modeling complex time-course patterns is a challenging issue in microarray studies because gene expression responds in complex ways over the course of the experiment. We introduce the generalized correlation coefficient and propose a combinatory approach for detecting, testing and clustering heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature of generalized correlation analysis could make it a useful and efficient tool for analyzing microarray time-course data and for exploring complex relationships in omics data to study their association with disease and health.
Gene essentiality, conservation index and co-evolution of genes in cyanobacteria.
Tiruveedula, Gopi Siva Sai; Wangikar, Pramod P
2017-01-01
Cyanobacteria, a group of photosynthetic prokaryotes, dominate the earth with ~10^15 g wet biomass. Despite diversity in habitats and an ancient origin, the cyanobacterial phylum has retained a significant core genome. Cyanobacteria are being explored for direct conversion of solar energy and carbon dioxide into biofuels. For this, efficient cyanobacterial strains will need to be designed via metabolic engineering. This will require identification of target knockouts to channel the flow of carbon toward the product of interest while minimizing deletions of essential genes. We propose the "Gene Conservation Index" (GCI) as a quick measure to predict gene essentiality in cyanobacteria. GCI is based on the phylogenetic profile of a gene constructed with a reduced dataset of cyanobacterial genomes. GCI is the percentage of organism clusters in which the query gene is present in the reduced dataset. Of the 750 genes deemed to be essential in the experimental study on S. elongatus PCC 7942, we found 494 to be conserved across the phylum; these largely comprise the essential metabolic pathways. In contrast, the conserved but non-essential genes broadly comprise genes required under stress conditions. Exceptions to this rule include genes such as the glycogen synthesis and degradation enzymes, deoxyribose-phosphate aldolase (DERA), glucose-6-phosphate 1-dehydrogenase (zwf) and fructose-1,6-bisphosphatase class 1, which are conserved but non-essential. While the essential genes are to be avoided during gene knockout studies as potentially lethal deletions, the non-essential but conserved set of genes could be interesting targets for metabolic engineering. Further, we identify clusters of co-evolving genes (CCG), which provide insights that may be useful in annotation. Principal component analysis (PCA) plots of the CCGs are demonstrated as data visualization tools that are complementary to the conventional heatmaps. Our dataset consists of phylogenetic profiles for 23,643 non-redundant cyanobacterial genes. We believe that the data and the analysis presented here will be a great resource to the scientific community interested in cyanobacteria.
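The GCI definition above (percentage of organism clusters in which the query gene is present) can be illustrated with a minimal sketch. The function name, the dictionary layout, and the rule that a gene counts as present in a cluster when any member genome carries it are assumptions made for illustration, not details taken from the paper.

```python
def gene_conservation_index(genomes_with_gene, organism_clusters):
    """Percentage of organism clusters in which the query gene is present.

    genomes_with_gene: set of genome identifiers carrying the gene
        (the gene's phylogenetic profile, in set form).
    organism_clusters: dict mapping a cluster name to its member genomes.
    Assumption: a cluster counts as "present" if any member has the gene.
    """
    if not organism_clusters:
        raise ValueError("at least one organism cluster is required")
    present = sum(
        1
        for members in organism_clusters.values()
        if any(genome in genomes_with_gene for genome in members)
    )
    return 100.0 * present / len(organism_clusters)


# Hypothetical toy data: four organism clusters, gene found in two of them.
toy_clusters = {"A": ["g1", "g2"], "B": ["g3"], "C": ["g4"], "D": ["g5"]}
toy_gci = gene_conservation_index({"g2", "g4"}, toy_clusters)
```

Under this reading, a gene found in every cluster scores 100 and a gene confined to a single cluster scores 100 divided by the number of clusters.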
Neubauer, Lisa; Dopstadt, Julian; Humpf, Hans-Ulrich; Tudzynski, Paul
2016-01-01
Claviceps purpurea is a phytopathogenic fungus infecting a broad range of grasses including economically important cereal crop plants. The infection cycle ends with the formation of the typical purple-black pigmented sclerotia containing the toxic ergot alkaloids. Besides these ergot alkaloids little is known about the secondary metabolism of the fungus. Red anthraquinone derivatives and yellow xanthone dimers (ergochromes) have been isolated from sclerotia and described as ergot pigments, but the corresponding gene cluster has remained unknown. Fungal pigments gain increasing interest for example as environmentally friendly alternatives to existing dyes. Furthermore, several pigments show biological activities and may have some pharmaceutical value. This study identified the gene cluster responsible for the synthesis of the ergot pigments. Overexpression of the cluster-specific transcription factor led to activation of the gene cluster and to the production of several known ergot pigments. Knock out of the cluster key enzyme, a nonreducing polyketide synthase, clearly showed that this cluster is responsible for the production of red anthraquinones as well as yellow ergochromes. Furthermore, a tentative biosynthetic pathway for the ergot pigments is proposed. By changing the culture conditions, pigment production was activated in axenic culture so that high concentration of phosphate and low concentration of sucrose induced pigment syntheses. This is the first functional analysis of a secondary metabolite gene cluster in the ergot fungus besides that for the classical ergot alkaloids. We demonstrated that this gene cluster is responsible for the typical purple-black color of the ergot sclerotia and showed that the red and yellow ergot pigments are products of the same biosynthetic pathway. Activation of the gene cluster in axenic culture opened up new possibilities for biotechnological applications like the dye production or the development of new pharmaceuticals.
Brain imaging registry for neurologic diagnosis and research
NASA Astrophysics Data System (ADS)
Hoo, Kent S., Jr.; Wong, Stephen T. C.; Knowlton, Robert C.; Young, Geoffrey S.; Walker, John; Cao, Xinhua; Dillon, William P.; Hawkins, Randall A.; Laxer, Kenneth D.
2002-05-01
The purpose of this paper is to demonstrate the importance of building a brain imaging registry (BIR) on top of existing medical information systems including Picture Archiving Communication Systems (PACS) environment. We describe the design framework for a cluster of data marts whose purpose is to provide clinicians and researchers efficient access to a large volume of raw and processed patient images and associated data originating from multiple operational systems over time and spread out across different hospital departments and laboratories. The framework is designed using object-oriented analysis and design methodology. The BIR data marts each contain complete image and textual data relating to patients with a particular disease.
Reuse of imputed data in microarray analysis increases imputation efficiency
Kim, Ki-Yeol; Kim, Byoung-Jin; Yi, Gwan-Su
2004-01-01
Background: The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analyses require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked. Results: We developed a new cluster-based imputation method called the sequential K-nearest neighbor (SKNN) method. It imputes the missing values sequentially, starting from the gene with the fewest missing values, and uses the imputed values for later imputation. Although it uses imputed values, the new method is greatly improved in accuracy and computational complexity over the conventional KNN-based method and other methods based on maximum likelihood estimation. The performance of SKNN was in particular higher than that of other imputation methods for data with high missing rates and a large number of experiments. Application of Expectation Maximization (EM) to the SKNN method improved the accuracy, but increased computational time in proportion to the number of iterations. The Multiple Imputation (MI) method, which is well known but had not previously been applied to microarray data, showed accuracy similarly high to that of the SKNN method, with slightly higher dependency on the type of data set. Conclusions: Sequential reuse of imputed data in KNN-based imputation greatly increases the efficiency of imputation. The SKNN method should be practically useful for rescuing the data of microarray experiments that have many missing entries. The SKNN method generates reliable imputed values which can be used for further cluster-based analysis of microarray data. PMID:15504240
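The sequential idea described above (impute the gene with the fewest missing values first, then reuse it as a reference for later genes) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function name, the Euclidean distance over observed columns, and the inverse-distance weighting are all assumptions.

```python
import numpy as np


def sknn_impute(data, k=2):
    """Sequential KNN imputation sketch (genes in rows, experiments in columns).

    Missing entries are NaN. Incomplete genes are processed from fewest to
    most missing values; each imputed gene joins the reference set.
    """
    x = np.array(data, dtype=float)
    missing = np.isnan(x)
    n_missing = missing.sum(axis=1)
    reference = [i for i in range(x.shape[0]) if n_missing[i] == 0]
    if not reference:
        raise ValueError("at least one complete gene is required")
    incomplete = sorted(
        (i for i in range(x.shape[0]) if n_missing[i] > 0),
        key=lambda i: n_missing[i],
    )
    for i in incomplete:
        observed = ~missing[i]
        ref = x[reference]
        # Assumed metric: Euclidean distance over the observed experiments.
        dist = np.sqrt(((ref[:, observed] - x[i, observed]) ** 2).sum(axis=1))
        nearest = np.argsort(dist)[:k]
        # Assumed weighting: inverse distance, guarded against division by zero.
        weights = 1.0 / (dist[nearest] + 1e-12)
        neighbours = ref[nearest][:, ~observed]
        x[i, ~observed] = (weights[:, None] * neighbours).sum(axis=0) / weights.sum()
        reference.append(i)  # reuse the imputed gene for later imputations
    return x


# Hypothetical toy matrix: two complete genes, two with missing values.
toy = [
    [1.0, 2.0, 3.0],
    [5.0, 6.0, 7.0],
    [1.0, 2.0, np.nan],
    [5.0, np.nan, np.nan],
]
toy_filled = sknn_impute(toy, k=1)
```

With `k=1` the third gene copies the missing value from its closest complete neighbour, and the fourth gene is then imputed against a reference set that already includes the third.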
2012-01-01
Background: Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions: Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154
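A first-order autoregressive structure of the kind mentioned above implies that the correlation between two time points decays with their separation. The sketch below builds such a covariance matrix under stated assumptions: equally spaced time points, a stationary process, and a single marginal variance. The function name `ar1_covariance` and its parameters are hypothetical and are not taken from the EMMIX-WIRE package.

```python
def ar1_covariance(n_times, rho, variance=1.0):
    """Covariance matrix with entries variance * rho ** |i - j|.

    Assumes equally spaced time points and a stationary AR(1) process;
    rho is the lag-one autocorrelation.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [
        [variance * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


# Three time points, lag-one correlation 0.5, marginal variance 2.
toy_cov = ar1_covariance(3, 0.5, 2.0)
```

Here measurements one step apart have covariance 1.0 and measurements two steps apart have covariance 0.5, which is the decay a random effect of this type contributes within a cluster.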
Neck formation and deformation effects in a preformed cluster model of exotic cluster decays
NASA Astrophysics Data System (ADS)
Kumar, Satish; Gupta, Raj K.
1997-01-01
Using the nuclear proximity approach and the two center nuclear shape parametrization, the interaction potential between two deformed and pole-to-pole oriented nuclei forming a necked configuration in the overlap region is calculated and its role is studied for the cluster decay half-lives. The barrier is found to move to a larger relative separation, with its proximity minimum lying in the neighborhood of the Q value of decay and its height and width reduced considerably. For cluster decay calculations in the preformed cluster model of Malik and Gupta, due to deformations and orientations of nuclei, the (empirical) preformation factor is found to get reduced considerably and agrees nicely with other model calculations known to be successful for their predictions of cluster decay half-lives. Comparison with the earlier case of nuclei treated as spheres suggests that the effects of both deformations and neck formation get compensated by choosing the position of cluster preformation and the inner classical turning point for penetrability calculations at the touching configuration of spherical nuclei.
An Empirical Study on the Effect of Work/Life Commitment to Work-Life Conflict
NASA Astrophysics Data System (ADS)
Ma, Li; Yin, Jie-lin
This study examined the relation between work and life orientation and the interference of work with personal life, and of personal life with work, among employees in China. Cluster analysis showed four orientation profiles: work orientation, life orientation, integration, and disengagement. The profiles differ significantly in how much work interferes with personal life and how much personal life interferes with work.
Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji
2016-07-01
To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.
Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.
2013-01-01
The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role of secondary metabolite gene clusters and their metabolites in fungal biology. PMID:23818858
Outcome-Driven Cluster Analysis with Application to Microarray Data.
Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A
2015-01-01
One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
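The random effects model described above can be written compactly. The notation below is introduced here for illustration and is an assumption: i indexes observations (patients), j indexes genes, c(j) is the cluster assigned to gene j, and C is the number of clusters. Any intercept, scaling, or outcome noise term used in the full paper is omitted, because the abstract does not specify it.

```latex
% Expression of gene j in observation i: cluster-level random effect plus independent error
X_{ij} = U_{i,c(j)} + \varepsilon_{ij}

% Outcome for observation i: linear combination of that observation's cluster random effects
Y_i = \sum_{c=1}^{C} \beta_c \, U_{ic}
```

Under this reading, genes in the same cluster share the effect U_{ic} and are therefore correlated, while the coefficient beta_c ties the whole cluster to the recovery outcome.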
Orientation domains: A mobile grid clustering algorithm with spherical corrections
NASA Astrophysics Data System (ADS)
Mencos, Joana; Gratacós, Oscar; Farré, Mercè; Escalante, Joan; Arbués, Pau; Muñoz, Josep Anton
2012-12-01
An algorithm has been designed and tested as a tool to assist the analysis of geological structures solely from orientation data. More specifically, the algorithm is intended for geological structures that can be approximated as planar, piecewise features, like many folded strata. Input orientation data are expressed as pairs of angles (azimuth and dip). The algorithm starts by expressing the data in Cartesian coordinates. This is followed by a search for an initial clustering solution, which is achieved by comparing the results of systematically shifting a regular rigid grid over the data. This initial solution is optimal (achieves minimum square error) once the grid size and the shift increment are fixed. Finally, the algorithm corrects for the variable spread that is generally expected from this data type using a reshaped non-rigid grid. The algorithm is size-oriented, which means that conditions on cluster size are applied throughout the process, in contrast to density-oriented algorithms, which are also widely used for spatial data. Results are obtained in a few seconds and, when tested on synthetic examples, were found to be consistent and reliable. This makes the algorithm a valuable alternative to the time-consuming traditional approaches available to geologists.
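The first step described above, moving from (azimuth, dip) pairs to Cartesian coordinates, can be sketched as follows. The conventions are assumptions chosen for illustration, since the abstract does not state them: azimuth is taken as the dip direction measured clockwise from north, dip is measured downward from horizontal, and the result is the upward unit normal (pole) of the plane in east-north-up axes. The function name is hypothetical.

```python
import math


def pole_from_azimuth_dip(azimuth_deg, dip_deg):
    """Upward unit normal of a plane as (east, north, up).

    Assumed conventions: azimuth is the dip direction, clockwise from
    north; dip is the angle below horizontal, from 0 to 90 degrees.
    """
    azimuth = math.radians(azimuth_deg)
    dip = math.radians(dip_deg)
    east = math.sin(azimuth) * math.sin(dip)
    north = math.cos(azimuth) * math.sin(dip)
    up = math.cos(dip)
    return (east, north, up)


# A horizontal bed has a vertical pole; a bed dipping 30 degrees to the east tilts it eastward.
flat_pole = pole_from_azimuth_dip(0.0, 0.0)
east_dipping_pole = pole_from_azimuth_dip(90.0, 30.0)
```

Working with unit vectors avoids the wrap-around at 0/360 degrees of azimuth, which is one reason clustering is usually easier in Cartesian form than on the raw angle pairs.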
Song, Tian-Yang; Xu, Zi-Fei; Chen, Yong-Hong; Ding, Qiu-Yan; Sun, Yu-Rong; Miao, Yang; Zhang, Ke-Qin; Niu, Xue-Mei
2017-05-24
Types of polyketide synthase-terpenoid synthase (PKS-TPS) hybrid metabolites, including arthrosporols with significant morphological regulatory activity, have been elucidated from the nematode-trapping fungus Arthrobotrys oligospora. A previous study suggested that the gene cluster AOL_s00215 in A. oligospora was involved in the production of arthrosporols. Here, we report that disruption of one cytochrome P450 monooxygenase gene, AOL_s00215g280, in the cluster resulted in a significant phenotypic difference and abundant aerial hyphae. A further bioassay indicated that the mutant showed a dramatic decrease in conidial formation but developed numerous traps and killed 85% of nematodes within 6 h of contact with prey, in sharp contrast to the wild-type strain, which showed no obvious response. Chemical investigation revealed a large accumulation of three new PKS-TPS epoxycyclohexone derivatives with different oxygenation patterns around the epoxycyclohexone moiety and the absence of arthrosporols in the culture broth of the mutant ΔAOL_s00215g280. These findings suggest that studying the biosynthetic pathway for morphological regulatory metabolites in nematode-trapping fungi would provide an efficient way to develop new fungal biocontrol agents.
Cancer Detection in Microarray Data Using a Modified Cat Swarm Optimization Clustering Approach
M, Pandi; R, Balamurugan; N, Sadhasivam
2017-12-29
Objective: A better understanding of functional genomics can be obtained by extracting patterns hidden in gene expression data. This could have paramount implications for cancer diagnosis, gene treatments and other domains. Clustering may reveal natural structures and identify interesting patterns in underlying data. The main objective of this research was to derive a heuristic approach for detecting highly co-expressed genes related to cancer from gene expression data with minimum Mean Squared Error (MSE). Methods: A modified Cat Swarm Optimization (CSO) algorithm using Harmony Search (MCSO-HS) was applied to cluster cancer gene expression data. Experimental results were analyzed using two cancer gene expression benchmark datasets, one for leukaemia and one for breast cancer. Result: In terms of MSE, MCSO-HS improved on HS and CSO by 13% and 9%, respectively, with the leukaemia dataset, and by 22% and 17%, respectively, with the breast cancer dataset. Conclusion: The results showed MCSO-HS to outperform HS and CSO with both benchmark datasets. To validate the clustering results, this work was tested with internal and external cluster validation indices. This work also points to biological validation of clusters with gene ontology in terms of function, process and component.
Wan, B; Yarbrough, J W; Schultz, T W
2008-01-01
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 µM for 24 hours, and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to those outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Fox, Ellen M.; Gardiner, Donald M.; Keller, Nancy P.; Howlett, Barbara J.
2008-01-01
A gene, sirZ, encoding a Zn(II)2Cys6 DNA binding protein is present in a cluster of genes responsible for the biosynthesis of the epipolythiodioxopiperazine (ETP) toxin, sirodesmin PL in the ascomycete plant pathogen, Leptosphaeria maculans. RNA-mediated silencing of sirZ gives rise to transformants that produce only residual amounts of sirodesmin PL and display a decrease in the transcription of several sirodesmin PL biosynthetic genes. This indicates that SirZ is a major regulator of this gene cluster. Proteins similar to SirZ are encoded in the gliotoxin biosynthetic gene cluster of Aspergillus fumigatus (gliZ) and in an ETP-like cluster in Penicillium lilacinoechinulatum (PlgliZ). Despite its high level of sequence similarity to gliZ, PlgliZ is unable to complement the gliotoxin-deficiency of a mutant of gliZ in A. fumigatus. Putative binding sites for these regulatory proteins in the promoters of genes in these clusters were predicted using bioinformatic analysis. These sites are similar to those commonly bound by other proteins with Zn(II)2Cys6 DNA binding domains. PMID:18023597
Evidence against the selfish operon theory.
Pál, Csaba; Hurst, Laurence D
2004-06-01
According to the selfish operon hypothesis, the clustering of genes and their subsequent organization into operons is beneficial for the constituent genes because it enables the horizontal gene transfer of weakly selected, functionally coupled genes. The majority of these are expected to be non-essential genes. From our analysis of the Escherichia coli genome, we conclude that the selfish operon hypothesis is unlikely to provide a general explanation for clustering nor can it account for the gene composition of operons. Contrary to expectations, essential genes with related functions have an especially strong tendency to cluster, even if they are not in operons. Moreover, essential genes are particularly abundant in operons.
Valine/isoleucine variants drive selective pressure in the VP1 sequence of EV-A71 enteroviruses.
Duy, Nghia Ngu; Huong, Le Thi Thanh; Ravel, Patrice; Huong, Le Thi Song; Dwivedi, Ankit; Sessions, October Michael; Hou, Yan'An; Chua, Robert; Kister, Guilhem; Afelt, Aneta; Moulia, Catherine; Gubler, Duane J; Thiem, Vu Dinh; Thanh, Nguyen Thi Hien; Devaux, Christian; Duong, Tran Nhu; Hien, Nguyen Tran; Cornillot, Emmanuel; Gavotte, Laurent; Frutos, Roger
2017-05-08
In 2011-2012, Northern Vietnam experienced its first large scale hand foot and mouth disease (HFMD) epidemic. In 2011, a major HFMD epidemic was also reported in South Vietnam with fatal cases. This 2011-2012 outbreak was the first one to occur in North Vietnam providing grounds to study the etiology, origin and dynamic of the disease. We report here the analysis of the VP1 gene of strains isolated throughout North Vietnam during the 2011-2012 outbreak and before. The VP1 gene of 106 EV-A71 isolates from North Vietnam and 2 from Central Vietnam were sequenced. Sequence alignments were analyzed at the nucleic acid and protein level. Gene polymorphism was also analyzed. A Factorial Correspondence Analysis was performed to correlate amino acid mutations with clinical parameters. The sequences were distributed into four phylogenetic clusters. Three clusters corresponded to the subgenogroup C4 and the last one corresponded to the subgenogroup C5. Each cluster displayed different polymorphism characteristics. Proteins were highly conserved but three sites bearing only Isoleucine (I) or Valine (V) were characterized. The isoleucine/valine variability matched the clusters. Spatiotemporal analysis of the I/V variants showed that all variants which emerged in 2011 and then in 2012 were not the same but were all present in the region prior to the 2011-2012 outbreak. Some correlation was found between certain I/V variants and ethnicity and severity. The 2011-2012 outbreak was not caused by an exogenous strain coming from South Vietnam or elsewhere but by strains already present and circulating at low level in North Vietnam. However, what triggered the outbreak remains unclear. A selective pressure is applied on I/V variants which matches the genetic clusters. I/V variants were shown on other viruses to correlate with pathogenicity. This should be investigated in EV-A71. I/V variants are an easy and efficient way to survey and identify circulating EV-A71 strains.
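As a rough illustration of how sites "bearing only Isoleucine (I) or Valine (V)" could be located in aligned protein sequences, the sketch below scans alignment columns for positions where both residues occur and no others. The function name and the toy sequences are hypothetical and are not taken from the study's VP1 data.

```python
def iv_variable_sites(aligned_seqs):
    """Return 0-based alignment columns whose residues are exactly {I, V}."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for col in range(length):
        residues = {s[col] for s in aligned_seqs}
        # Keep columns where both I and V occur and nothing else does.
        if residues == {"I", "V"}:
            sites.append(col)
    return sites


# Toy alignment (hypothetical sequences, not real VP1 data).
toy_alignment = ["MGIQA", "MGVQA", "MGIQV", "MAVQI"]
iv_sites = iv_variable_sites(toy_alignment)
```

In the toy alignment only the third column qualifies: the last column also contains alanine, so it is excluded.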
Ushijima, Masaru; Mashima, Tetsuo; Tomida, Akihiro; Dan, Shingo; Saito, Sakae; Furuno, Aki; Tsukahara, Satomi; Seimiya, Hiroyuki; Yamori, Takao; Matsuura, Masaaki
2013-03-01
Genome-wide transcriptional expression analysis is a powerful strategy for characterizing the biological activity of anticancer compounds. It is often instructive to identify gene sets involved in the activity of a given drug compound for comparison with different compounds. Currently, however, there is no comprehensive gene expression database and related application system that is (i) specialized in anticancer agents, (ii) easy to use, and (iii) open to the public. To develop a public gene expression database of antitumor agents, we first examined gene expression profiles in human cancer cells after exposure to 35 compounds including 25 clinically used anticancer agents. Gene signatures were extracted that were classified as upregulated or downregulated after exposure to the drug. Hierarchical clustering showed that drugs with similar mechanisms of action, such as genotoxic drugs, were clustered. Connectivity map analysis further revealed that our gene signature data reflected modes of action of the respective agents. Together with the database, we developed analysis programs that calculate scores for ranking changes in gene expression and for searching statistically significant pathways from the Kyoto Encyclopedia of Genes and Genomes database in order to analyze the datasets more easily. Our database and the analysis programs are available online at our website (http://scads.jfcr.or.jp/db/cs/). Using these systems, we successfully showed that proteasome inhibitors are selectively classified as endoplasmic reticulum stress inducers and induce atypical endoplasmic reticulum stress. Thus, our public access database and related analysis programs constitute a set of efficient tools to evaluate the mode of action of novel compounds and identify promising anticancer lead compounds. © 2012 Japanese Cancer Association.
Heat Shock-Enhanced Conjugation Efficiency in Standard Campylobacter jejuni Strains.
Zeng, Ximin; Ardeshna, Devarshi; Lin, Jun
2015-07-01
Campylobacter jejuni, the leading bacterial cause of human gastroenteritis in the United States, displays significant strain diversity due to horizontal gene transfer. Conjugation is an important horizontal gene transfer mechanism contributing to the evolution of bacterial pathogenesis and antimicrobial resistance. It has been observed that heat shock could increase transformation efficiency in some bacteria. In this study, the effect of heat shock on C. jejuni conjugation efficiency and the underlying mechanisms were examined. With a modified Escherichia coli donor strain, different C. jejuni recipient strains displayed significant variation in conjugation efficiency ranging from 6.2 × 10^-8 to 6.0 × 10^-3 CFU per recipient cell. Despite reduced viability, heat shock of standard C. jejuni NCTC 11168 and 81-176 strains (e.g., 48 to 54°C for 30 to 60 min) could dramatically enhance C. jejuni conjugation efficiency up to 1,000-fold. The phenotype of the heat shock-enhanced conjugation in C. jejuni recipient cells could be sustained for at least 9 h. Filtered supernatant from the heat shock-treated C. jejuni cells could not enhance conjugation efficiency, which suggests that the enhanced conjugation efficiency is independent of secreted substances. Mutagenesis analysis indicated that the clustered regularly interspaced short palindromic repeats system and the selected restriction-modification systems (Cj0030/Cj0031, Cj0139/Cj0140, Cj0690c, and HsdR) were dispensable for heat shock-enhanced conjugation in C. jejuni. Taking all results together, this study demonstrated a heat shock-enhanced conjugation efficiency in standard C. jejuni strains, leading to an optimized conjugation protocol for molecular manipulation of this organism. The findings from this study also represent a significant step toward elucidation of the molecular mechanism of conjugative gene transfer in C. jejuni. Copyright © 2015, American Society for Microbiology. All Rights Reserved. PMID:25911489
Missing link in the evolution of Hox clusters.
Ogishima, Soichi; Tanaka, Hiroshi
2007-01-31
The Hox cluster has key roles in regulating the patterning of the antero-posterior axis in a metazoan embryo. It consists of the anterior, central and posterior genes; the central genes have been identified only in bilaterians, not in cnidarians, and are responsible for achieving morphological complexity in bilaterian development. However, their evolutionary history has not been revealed; that is, there has been a "missing link". Here we show the evolutionary history of the Hox clusters of 18 bilaterians and 2 cnidarians by using a new method, "motif-based reconstruction", which examines the gain/loss processes of evolutionarily conserved sequences, "motifs", outside the homeodomain. We successfully identified the missing link in the evolution of Hox clusters between the cnidarian-bilaterian ancestor and the bilaterians as the ancestor of the central genes, which we call the proto-central gene. Searching for a gene corresponding to the proto-central gene, we found that one of the acoela Hox genes has the same motif repertory as the proto-central gene. This interesting finding suggests that the acoela Hox cluster corresponds with the missing link in the evolution of the Hox cluster between the cnidarian-bilaterian ancestor and the bilaterians. Our findings suggest that motif gains/diversifications led to the explosive diversity of the bilaterian body plan.
Reyes-Dominguez, Yazmid; Boedi, Stefan; Sulyok, Michael; Wiesenberger, Gerlinde; Stoppacher, Norbert; Krska, Rudolf; Strauss, Joseph
2012-01-01
Chromatin modifications and heterochromatic marks have been shown to be involved in the regulation of secondary metabolism gene clusters in the fungal model system Aspergillus nidulans. We examine here the role of HEP1, the heterochromatin protein homolog of Fusarium graminearum, in the production of secondary metabolites. Deletion of Hep1 in a PH-1 background strongly influences expression of genes required for the production of aurofusarin and the main trichothecene metabolite DON. In the Hep1 deletion strains, AUR genes are highly up-regulated and aurofusarin production is greatly enhanced, suggesting a repressive role for heterochromatin on gene expression of this cluster. Unexpectedly, gene expression and metabolites are lower for the trichothecene cluster, suggesting a positive function of Hep1 for DON biosynthesis. However, analysis of histone modifications in chromatin of AUR and DON gene promoters reveals that in both gene clusters the H3K9me3 heterochromatic mark is strongly reduced in the Hep1 deletion strain. This, and the finding that a DON-cluster flanking gene is up-regulated, suggests that the DON biosynthetic cluster is repressed by HEP1 directly and indirectly. Results from this study point to a conserved mode of secondary metabolite (SM) biosynthesis regulation in fungi by chromatin modifications and the formation of facultative heterochromatin. PMID:22100541
Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata.
Hu, Wei; Zaveri, Amrapali; Qiu, Honglei; Dumontier, Michel
2017-09-18
The ability to efficiently search and filter datasets depends on access to high quality metadata. While most biomedical repositories require data submitters to provide a minimal set of metadata, some, such as the Gene Expression Omnibus (GEO), allow users to specify additional metadata in the form of textual key-value pairs (e.g. sex: female). However, because there is no structured vocabulary to guide submitters on which metadata terms to use, the 44,000,000+ key-value pairs in GEO suffer from numerous quality issues including redundancy, heterogeneity, inconsistency, and incompleteness. Such issues hinder the ability of scientists to home in on datasets that meet their requirements and point to a need for accurate, structured and complete description of the data. In this study, we propose a clustering-based approach to address data quality issues in biomedical, specifically gene expression, metadata. First, we present three different kinds of similarity measures to compare metadata keys. Second, we design a scalable agglomerative clustering algorithm to cluster similar keys together. Our agglomerative clustering algorithm identified metadata keys that were similar to each other, based on (i) name, (ii) core concept and (iii) value similarities, and grouped them together. We evaluated our method using a manually created gold standard in which 359 keys were grouped into 27 clusters based on six types of characteristics: (i) age, (ii) cell line, (iii) disease, (iv) strain, (v) tissue and (vi) treatment. As a result, the algorithm generated 18 clusters containing 355 keys (four clusters with only one key were excluded). Within the 18 clusters, 13 keys were not related to the cluster they were assigned to; the remaining keys were correctly identified as related to their cluster. We compared our approach with four other published methods. Our approach significantly outperformed them for most metadata keys and achieved the best average F-Score (0.63).
Our algorithm identified keys that were similar to each other and grouped them together. Our intuition that underpins cleaning by clustering is that, dividing keys into different clusters resolves the scalability issues for data observation and cleaning, and keys in the same cluster with duplicates and errors can easily be found. Our algorithm can also be applied to other biomedical data types.
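The idea of grouping similar metadata keys can be sketched with a small single-linkage agglomerative procedure. This is only an illustration under stated assumptions: it uses name similarity alone (via Python's standard `difflib.SequenceMatcher`), whereas the abstract also describes core-concept and value similarities; the threshold, function names and example keys are hypothetical and are not the authors' implementation.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in for the name measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_keys(keys, threshold=0.9):
    """Single-linkage agglomerative clustering of metadata keys.

    Starts with one cluster per key and repeatedly merges the most similar
    pair of clusters until no pair reaches the (assumed) threshold.
    """
    clusters = [[k] for k in keys]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                sim = max(name_similarity(a, b)
                          for a in clusters[i] for b in clusters[j])
                if sim >= threshold and (best is None or sim > best[0]):
                    best = (sim, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i].extend(clusters.pop(j))


# Hypothetical keys of the kind submitters might enter.
example_keys = ["age", "Age", "tissue", "Tissue", "cell line"]
key_clusters = cluster_keys(example_keys)
```

Once keys are grouped this way, duplicates and inconsistent spellings within a cluster can be reviewed together rather than across the whole key set.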
2011-01-01
Background: Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods: Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published gene lists and independent datasets. Results: Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion: This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue.
Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755
ERIC Educational Resources Information Center
Hallman, Clemens L.; And Others
This teacher training monograph deals with value orientations of cultures in general and with specific reference to United States Culture. The first two sections discuss the conceptual issues of value orientation and give axiological definitions of the six clusters used to describe cultural orientation: self, the family, society, human nature,…
Investigation of deformation effects on the decay properties of 12C + α cluster states in 16O
NASA Astrophysics Data System (ADS)
Soylu, A.; Koyuncu, F.; Coban, A.; Bayrak, O.; Freer, M.
2018-04-01
We have analyzed the elastic scattering angular distribution data of the α + 12C reaction over a wide energy range (Elab = 28.2 to 35.5 MeV) within the framework of the Optical Model formalism. A double folding (DF) type real potential was used with a phenomenological Woods-Saxon-squared (WS2) type imaginary potential. Good agreement between the calculations and experimental data was obtained. Using the real DF potential, we have calculated the properties of the α-cluster states in 16O with the Gamow code, as well as the α-decay widths with the WKB method. We implemented a 12C + α cluster framework for the calculation of the excitation energies and decay widths of 16O as a function of the orientation of the planar 12C nucleus with respect to the α-particle. These calculations showed strong sensitivity of the widths and excitation energies to the orientation. Branching ratios were also calculated and, though they are less sensitive to the 12C orientation, it was found that the 12Cgs + α structure, with the α-particle orbiting the 12C in its ground state, is dominant. This work demonstrates that the deformation and orientation of 12C play a crucial role in understanding the nature of the α-cluster states in 16O.
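For readers unfamiliar with the Woods-Saxon-squared shape mentioned above, a minimal sketch of its commonly used form is W(r) = -W0 / [1 + exp((r - R)/a)]^2. The abstract gives no parameter values, so the depth, radius and diffuseness below are placeholders chosen only for illustration, and the function name is hypothetical.

```python
import math


def ws2_imaginary(r, w0, radius, diffuseness):
    """Woods-Saxon-squared form: -w0 / (1 + exp((r - radius) / diffuseness))**2.

    All parameters are placeholders; the abstract does not report fitted values.
    """
    f = 1.0 / (1.0 + math.exp((r - radius) / diffuseness))
    return -w0 * f * f


# Placeholder parameters (MeV, fm, fm), not taken from the paper.
W0, R, A = 20.0, 3.0, 0.6
at_radius = ws2_imaginary(R, W0, R, A)        # equals -W0 / 4 at r = R
far_outside = ws2_imaginary(30.0, W0, R, A)   # essentially zero far from the nucleus
```

Squaring the Woods-Saxon factor makes the potential fall off faster at the surface than the plain Woods-Saxon shape with the same radius and diffuseness.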
Patel, Vidushi S; Ezaz, Tariq; Deakin, Janine E; Graves, Jennifer A Marshall
2010-12-01
The haemoglobin protein, required for oxygen transportation in the body, is encoded by α- and β-globin genes that are arranged in clusters. The transpositional model for the evolution of distinct α-globin and β-globin clusters in amniotes is much simpler than the previously proposed whole genome duplication model. According to this model, all jawed vertebrates share one ancient region containing α- and β-globin genes and several flanking genes in the order MPG-C16orf35-(α-β)-GBY-LUC7L that has been conserved for more than 410 million years, whereas amniotes evolved a distinct β-globin cluster by insertion of a transposed β-globin gene from this ancient region into a cluster of olfactory receptors flanked by CCKBR and RRM1. It could not be determined whether this organisation is conserved in all amniotes because of the paucity of information from non-avian reptiles. To fill in this gap, we examined globin gene organisation in a squamate reptile, the Australian bearded dragon lizard, Pogona vitticeps (Agamidae). We report here that the α-globin cluster (HBK, HBA) is flanked by C16orf35 and GBY and is located on a pair of microchromosomes, whereas the β-globin cluster is flanked by RRM1 on the 3' end and is located on the long arm of chromosome 3. However, the CCKBR gene that flanks the β-globin cluster on the 5' end in other amniotes is located on the short arm of chromosome 5 in P. vitticeps, indicating that a chromosomal break between the β-globin cluster and CCKBR occurred at least in the agamid lineage. Our data from a reptile species provide further evidence to support the transpositional model for the evolution of β-globin gene cluster in amniotes.
Advancing chimeric antigen receptor T cell therapy with CRISPR/Cas9.
Ren, Jiangtao; Zhao, Yangbing
2017-09-01
The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (CRISPR/Cas9) system, an RNA-guided DNA targeting technology, is triggering a revolution in the field of biology. CRISPR/Cas9 has demonstrated great potential for genetic manipulation. In this review, we discuss the current development of CRISPR/Cas9 technologies for therapeutic applications, especially chimeric antigen receptor (CAR) T cell-based adoptive immunotherapy. Different methods used to facilitate efficient CRISPR delivery and gene editing in T cells are compared. The potential of genetic manipulation using CRISPR/Cas9 system to generate universal CAR T cells and potent T cells that are resistant to exhaustion and inhibition is explored. We also address the safety concerns associated with the use of CRISPR/Cas9 gene editing and provide potential solutions and future directions of CRISPR application in the field of CAR T cell immunotherapy. As an integration-free gene insertion method, CRISPR/Cas9 holds great promise as an efficient gene knock-in platform. Given the tremendous progress that has been made in the past few years, we believe that the CRISPR/Cas9 technology holds immense promise for advancing immunotherapy.
Camilleri, Michael; Carlson, Paula; Valentin, Nelson; Acosta, Andres; O'Neill, Jessica; Eckert, Deborah; Dyer, Roy; Na, Jie; Klee, Eric W; Murray, Joseph A
2016-09-01
Prior studies in patients with irritable bowel syndrome with diarrhea (IBS-D) showed immune activation, secretion, and barrier dysfunction in jejunal or colorectal mucosa. We measured mRNA expression by RT-PCR of 91 genes reflecting tight junction proteins, chemokines, innate immunity, ion channels, transmitters, housekeeping genes, and controls for DNA contamination and PCR efficiency in small intestinal mucosa from 15 IBS-D patients and 7 controls (biopsies negative for celiac disease). Fold change was calculated using the 2^(-ΔΔCT) formula. Nominal P values (P < 0.05) were interpreted with false discovery rate (FDR) correction (q value). Cluster analysis with Lens for Enrichment and Network Studies (LENS) explored connectivity of mechanisms. Upregulated genes (uncorrected P < 0.05) were related to ion transport (INADL, MAGI1, and SONS1), barrier (TJP1, 2, and 3 and CLDN) or immune functions (TLR3, IL15, and MAPKAPK5), or histamine metabolism (HNMT); downregulated genes were related to immune function (IL-1β, TGF-β1, and CCL20) or antigen detection (TLR1 and 8). The following genes were significantly upregulated (q < 0.05) in IBS-D: INADL, MAGI1, PPP2R5C, MAPKAPK5, TLR3, and IL-15. Among the 14 nominally upregulated genes, there was clustering of barrier and PDZ domains (TJP1, TJP2, TJP3, CLDN4, INADL, and MAGI1) and clustering of downregulated genes (CCL20, TLR1, IL1B, and TLR8). Protein expression of PPP2R5C in nuclear lysates was greater in patients with IBS-D than in controls. There was an increase in INADL protein (median 9.4 ng/ml) in patients with IBS-D relative to controls (median 5.8 ng/ml, P > 0.05). In conclusion, altered transcriptome (and to a lesser extent protein) expression of ion transport, barrier, immune, and mast cell mechanisms in small bowel may reflect different alterations in function and deserves further study in IBS-D. Copyright © 2016 the American Physiological Society.
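The fold-change formula cited above can be made concrete with a short sketch of the standard 2^(-ΔΔCT) calculation: ΔCT is the target gene's cycle threshold minus the reference (housekeeping) gene's, and ΔΔCT is the patient ΔCT minus the control ΔCT. The function name and the CT values below are hypothetical and are not data from the study.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCT) method."""
    delta_case = ct_target_case - ct_ref_case   # dCT in the patient sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # dCT in the control sample
    ddct = delta_case - delta_ctrl
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target amplifies one cycle earlier in the
# patient sample relative to the reference gene, i.e. a twofold increase.
fold = fold_change_ddct(24.0, 18.0, 25.0, 18.0)
```

A value above 1 indicates upregulation relative to controls, a value below 1 indicates downregulation, and identical ΔCT values give exactly 1.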
Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae
Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu
2018-01-01
A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species. PMID:29686660
Darbani, Behrooz; Motawia, Mohammed Saddik; Olsen, Carl Erik; Nour-Eldin, Hussam H.; Møller, Birger Lindberg; Rook, Fred
2016-01-01
Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expressed with the biosynthetic genes. The predicted localisation of SbMATE2 to the vacuolar membrane was demonstrated experimentally by transient expression of a SbMATE2-YFP fusion protein and confocal microscopy. Transport studies in Xenopus laevis oocytes demonstrate that SbMATE2 is able to transport dhurrin. In addition, SbMATE2 was able to transport non-endogenous cyanogenic glucosides, but not the anthocyanin cyanidin 3-O-glucoside or the glucosinolate indol-3-yl-methyl glucosinolate. The genomic co-localisation of a transporter gene with the biosynthetic genes producing the transported compound is discussed in relation to the role self-toxicity of chemical defence compounds may play in the formation of gene clusters. PMID:27841372
ERIC Educational Resources Information Center
Zettergren, Peter
2007-01-01
A modern clustering technique was applied to age-10 and age-13 sociometric data with the purpose of identifying longitudinally stable peer status clusters. The study included 445 girls from a Swedish longitudinal study. The identified temporally stable clusters of rejected, popular, and average girls were essentially larger than corresponding…