Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements
Tharakaraman, Kannan; Mariño-Ramírez, Leonardo; Sheetlin, Sergey L; Landsman, David; Spouge, John L
2006-01-01
Background Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. Results We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. Conclusion Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances. PMID:16961919
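The scanning step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not A-GLAM's implementation: a plain log-odds score cutoff (`min_score`) stands in for the E-value threshold, a simple pseudocount stands in for the Bayesian update, and the names `build_pssm`, `score`, and `iterative_scan` are hypothetical. Sequences are assumed to contain only uppercase A, C, G, T, and overlapping hits are not filtered.

```python
import math

BASES = "ACGT"

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PSSM (one dict per column) from equal-length aligned sites."""
    pssm = []
    for column in zip(*sites):
        counts = {b: pseudocount for b in BASES}
        for base in column:
            counts[base] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pssm

def score(pssm, window):
    """Individual score of one window: sum of per-column log-odds."""
    return sum(col[base] for col, base in zip(pssm, window))

def iterative_scan(sequences, seed_positions, width, min_score, max_rounds=20):
    """Rescan until no new window joins the set that defines the PSSM.

    seed_positions: (sequence_index, start) pairs, e.g. one per sequence
    from a prior sampling step. min_score is an assumed stand-in for an
    E-value cutoff.
    """
    accepted = set(seed_positions)
    for _ in range(max_rounds):
        pssm = build_pssm([sequences[i][s:s + width] for i, s in sorted(accepted)])
        hits = {
            (i, s)
            for i, seq in enumerate(sequences)
            for s in range(len(seq) - width + 1)
            if score(pssm, seq[s:s + width]) >= min_score
        }
        if hits <= accepted:  # convergence: nothing new contributes
            break
        accepted |= hits
    return sorted(accepted)

sequences = ["TTTGATTACATTTGATTACATT", "CCGATTACACC", "GGGGGGGGGG"]
print(iterative_scan(sequences, [(0, 3), (1, 2)], width=7, min_score=6.5))
```

Starting from one seed site per sequence, the loop can pick up a second instance in the same sequence, which is the behavior the abstract attributes to the scanning step.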
eShadow: A tool for comparing closely related sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.
2004-01-15
Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
Palzkill, T G; Oliver, S G; Newlon, C S
1986-01-01
Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
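The "at least 10 out of 11 bases" criterion can be illustrated with a small window scan. The sketch below is an assumption-laden example rather than the authors' procedure: the abstract does not spell out the consensus, so the string used in the usage line is the commonly cited ARS core form, supplied here as an assumed parameter, and `near_matches` is a hypothetical helper. Only the given strand is scanned.

```python
# IUPAC nucleotide codes needed to express a degenerate consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def near_matches(sequence, consensus, min_matches):
    """Return (start, matches) for every window that agrees with the
    consensus at min_matches or more positions."""
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        matches = sum(base in IUPAC[code] for base, code in zip(window, consensus))
        if matches >= min_matches:
            hits.append((start, matches))
    return hits

# Assumed consensus (not given in the abstract); toy input sequence.
print(near_matches("CCCCATTTATATTTCCCCC", "WTTTATRTTTW", min_matches=10))
```

To cover both strands, the same function could be run on the reverse complement of the input.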
Insertion sequences enrichment in extreme Red sea brine pool vent.
Elbehery, Ali H A; Aziz, Ramy K; Siam, Rania
2017-03-01
Mobile genetic elements are major agents of genome diversification and evolution. Limited studies addressed their characteristics, including abundance, and role in extreme habitats. One of the rare natural habitats exposed to multiple-extreme conditions, including high temperature, salinity and concentration of heavy metals, are the Red Sea brine pools. We assessed the abundance and distribution of different mobile genetic elements in four Red Sea brine pools including the world's largest known multiple-extreme deep-sea environment, the Red Sea Atlantis II Deep. We report a gradient in the abundance of mobile genetic elements, dramatically increasing in the harshest environment of the pool. Additionally, we identified a strong association between the abundance of insertion sequences and extreme conditions, being highest in the harshest and deepest layer of the Red Sea Atlantis II Deep. Our comparative analyses of mobile genetic elements in secluded, extreme and relatively non-extreme environments, suggest that insertion sequences predominantly contribute to polyextremophiles genome plasticity.
Li, Ruichao; Xie, Miaomiao; Dong, Ning; Lin, Dachuan; Yang, Xuemei; Wong, Marcus Ho Yin; Chan, Edward Wai-Chi; Chen, Sheng
2018-03-01
Multidrug resistance (MDR)-encoding plasmids are considered major molecular vehicles responsible for transmission of antibiotic resistance genes among bacteria of the same or different species. Delineating the complete sequences of such plasmids could provide valuable insight into the evolution and transmission mechanisms underlying bacterial antibiotic resistance development. However, due to the presence of multiple repeats of mobile elements, complete sequencing of MDR plasmids remains technically complicated, expensive, and time-consuming. Here, we demonstrate a rapid and efficient approach to obtaining multiple MDR plasmid sequences through the use of the MinION nanopore sequencing platform, which is incorporated in a portable device. By assembling the long sequencing reads generated by a single MinION run according to a rapid barcoding sequencing protocol, we obtained the complete sequences of 20 plasmids harbored by multiple bacterial strains. Importantly, single long reads covering a plasmid end-to-end were recorded, indicating that de novo assembly may be unnecessary if the single reads exhibit high accuracy. This workflow represents a convenient and cost-effective approach for systematic assessment of MDR plasmids responsible for treatment failure of bacterial infections, offering the opportunity to perform detailed molecular epidemiological studies to probe the evolutionary and transmission mechanisms of MDR-encoding elements.
2012-01-01
Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Motor programming when sequencing multiple elements of the same duration.
Magnuson, Curt E; Robin, Donald A; Wright, David L
2008-11-01
Motor programming at the self-select paradigm was adopted in 2 experiments to examine the processing demands of independent processes. One process (INT) is responsible for organizing the internal features of the individual elements in a movement (e.g., response duration). The 2nd process (SEQ) is responsible for placing the elements into the proper serial order before execution. Participants in Experiment 1 performed tasks involving 1 key press or sequences of 4 key presses of the same duration. Implementing INT and SEQ was more time consuming for key-pressing sequences than for single key-press tasks. Experiment 2 examined whether the INT costs resulting from the increase in sequence length observed in Experiment 1 resulted from independent planning of each sequence element or via a separate "multiplier" process that handled repetitions of elements of the same duration. Findings from Experiment 2, in which participants performed single key presses or double or triple key sequences of the same duration, suggested that INT is involved with the independent organization of each element contained in the sequence. Researchers offer an elaboration of the 2-process account of motor programming to incorporate the present findings and the findings from other recent sequence-learning research.
Eickbush, D. G.; Eickbush, T. H.
1995-01-01
R1 and R2 are non-long-terminal repeat retrotransposable elements that insert into specific sequences of insect 28S ribosomal RNA genes. These elements have been extensively described in Drosophila melanogaster. To determine whether these elements have been horizontally or vertically transmitted, we characterized R1 and R2 elements from the seven other members of the melanogaster species subgroup by genomic blotting and nucleotide sequencing. Each species was found to have homogeneous families of R1 and R2 elements with the exception of erecta and orena, which have no R2 elements. The DNA sequences of multiple R1 and R2 copies from each species indicated nucleotide divergence within each species averaged only 0.48% for R1 and 0.35% for R2, well below the level of divergence among the species. Most copies of R1 and R2 (40 of 47) sequenced from the seven species were potentially functional, as indicated by the absence of premature termination codons or translational frameshifts that would destroy the open reading frame of the element. The sequence relationships of both the R1 and R2 elements from the various members of the melanogaster subgroup closely followed that of the species phylogeny, suggesting that R1 and R2 have been stably maintained by vertical transmission since the origin of this species subgroup 17-20 million years ago. The remarkable stability of R1 and R2, compared to what has been suggested for transposable elements that insert at multiple locations in these same species, may be due to their unique specificity for sites in the rRNA gene locus. Under low copy number conditions, when it is essential for any mobile element to transpose, the insertion specificities of R1 and R2 ensure uniform developmentally regulated target sites that can be occupied with little or no detrimental effect on the host. PMID:7713424
Guimond, A; Moss, T
1999-02-01
We have used a differential cloning approach to isolate ribosomal/non-ribosomal frontier sequences from Xenopus laevis. A ribosomal intergenic spacer sequence (IGS) was cloned and shown not to be physically linked with the ribosomal locus. This ribosomal orphon contained the IGS sequences found immediately downstream of the 28S gene and included an array of enhancer repetitions and a non-functional spacer promoter. The orphon sequence was flanked by a member of the novel 'Frt' low copy repetitive element family. Three individual Frt repeats were sequenced and all members of this family were shown to lie clustered at two chromosomal sites, one of which contained the ribosomal orphon. One of the Frt elements contained an insertion of 297 bp that showed extensive homology to sequences within at least three other Xenopus genes. Each homology region was flanked by members of the T2 family of short interspersed repetitive elements, (SINEs), and by its target insertion sequence, suggesting multiple translocation events. The data are discussed in terms of the evolution of the ribosomal gene locus.
Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations.
Feusier, Julie; Witherspoon, David J; Scott Watkins, W; Goubert, Clément; Sasani, Thomas A; Jorde, Lynn B
2017-01-01
Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. Alu Yb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. We identified 5,288 putative Alu insertion events, including several hundred novel Alu Yb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare Alu Yb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare Alu Yb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. 
We improved our false-negative rate as an indicator of sample quality for future ME-Scan experiments. In conclusion, we demonstrate that ME-Scan is a good supplement for next-generation sequencing methods and is well-suited for population-level analyses.
Ehrmann, M A; Vogel, R E
2001-11-01
An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB that is presumed to be the transposase. The large orfAB of this element encodes a 342 amino acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In L. sanfranciscensis type strain DSM 20451T, multiple truncated IS elements were identified. Inverse PCR was used to analyze the target sites of four of these elements, but apart from their highly AT-rich character, no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. The resulting hybridization patterns were shown to differentiate between organisms at the strain level, in contrast to a probe targeted against the 16S rDNA. Using a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius, and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici, and P. pentosaceus.
BIPAD: A web server for modeling bipartite sequence elements
Bi, Chengpeng; Rogan, Peter K
2006-01-01
Background Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. Results We introduce the Bipad Server [1], a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWM's) with a gap distribution, or a single PWM matrix for contiguous single block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half site and gap lengths. Conclusion The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins. PMID:16503993
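The information content that such models maximize can be computed per alignment column. The following is a minimal sketch under stated assumptions: a uniform background of 0.25 per base, no small-sample correction, and a hypothetical helper name `information_profile`; it is not the Bipad program's own scoring code.

```python
import math
from collections import Counter

def information_profile(sites, background=0.25):
    """Per-column information (bits) of aligned, equal-length DNA sites.

    Each column contributes sum_b p_b * log2(p_b / background), which is
    2 bits for a fully conserved column and 0 bits for a uniform one.
    """
    profile = []
    for column in zip(*sites):
        n = len(column)
        counts = Counter(column)
        bits = sum((c / n) * math.log2((c / n) / background) for c in counts.values())
        profile.append(bits)
    return profile

half_site = ["TGACC", "TGACC", "TGTCC", "TGACA"]
print(information_profile(half_site))
```

For a bipartite model, the same profile would be computed separately for each half-site block, with the gap-length distribution handled as its own component.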
Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F
2012-01-01
Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
Low-pass sequencing for microbial comparative genomics
Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy
2004-01-01
Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067
Castresana, C; Garcia-Luque, I; Alonso, E; Malik, V S; Cashmore, A R
1988-01-01
We have analyzed promoter regulatory elements from a photoregulated CAB gene (Cab-E) isolated from Nicotiana plumbaginifolia. These studies have been performed by introducing chimeric gene constructs into tobacco cells via Agrobacterium tumefaciens-mediated transformation. Expression studies on the regenerated transgenic plants have allowed us to characterize three positive and one negative cis-acting elements that influence photoregulated expression of the Cab-E gene. Within the upstream sequences we have identified two positive regulatory elements (PRE1 and PRE2) which confer maximum levels of photoregulated expression. These sequences contain multiple repeated elements related to the sequence -ACCGGCCCACTT-. We have also identified within the upstream region a negative regulatory element (NRE) extremely rich in AT sequences, which reduces the level of gene expression in the light. We have defined a light regulatory element (LRE) within the promoter region extending from -396 to -186 bp which confers photoregulated expression when fused to a constitutive nopaline synthase ('nos') promoter. Within this region there is a 132-bp element, extending from -368 to -234 bp, which on deletion from the Cab-E promoter reduces gene expression from high levels to undetectable levels. Finally, we have demonstrated for a full length Cab-E promoter conferring high levels of photoregulated expression, that sequences proximal to the Cab-E TATA box are not replaceable by corresponding sequences from a 'nos' promoter. This contrasts with the apparent equivalence of these Cab-E and 'nos' TATA box-proximal sequences in truncated promoters conferring low levels of photoregulated expression. PMID:2901343
Large diversity of the piggyBac-like elements in the genome of Tribolium castaneum
Wang, Jianjun; Du, Yuzhou; Wang, Suzhi; Brown, Sue; Park, Yoonseong
2011-01-01
The piggyBac transposable element, originally discovered in the cabbage looper, Trichoplusia ni, has been widely used in insect transgenesis including the red flour beetle Tribolium castaneum. We surveyed piggyBac-like (PLE) sequences in the genome of Tribolium castaneum by homology searches using as queries the diverse PLE sequences that have been described previously. The search yielded a total of 32 piggyBac-like elements (TcPLEs) which were classified into 14 distinct groups. Most of the TcPLEs contain defective functional motifs in that they are lacking inverted terminal repeats or have disrupted open reading frames. Only one single copy of TcPLE1 appears to be intact with imperfect 16 bp inverted terminal repeats flanking an open reading frame encoding a transposase of 571 amino acid residues. Many copies of TcPLEs were found to be inserted into or close to other transposon-like sequences. This large diversity of TcPLEs with generally low copy numbers suggests multiple invasions of the TcPLEs over a long evolutionary time without extensive multiplications or occurrence of rapid loss of TcPLEs copies. PMID:18342253
Marenda, Marc; Barbe, Valérie; Gourgues, Géraldine; Mangenot, Sophie; Sagne, Evelyne; Citti, Christine
2006-01-01
An integrative conjugative element, ICEA, was characterized in Mycoplasma agalactiae strain 5632, in which it occurs as multiple chromosomal copies and as a free circular form. The distribution of ICEA sequences in M. agalactiae strains and their occurrence in Mycoplasma bovis suggest the spreading of the element within or between species. PMID:16707706
NASA Astrophysics Data System (ADS)
Furrer, Julien; Kramer, Frank; Marino, John P.; Glaser, Steffen J.; Luy, Burkhard
2004-01-01
Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding ~10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.
Furrer, Julien; Kramer, Frank; Marino, John P; Glaser, Steffen J; Luy, Burkhard
2004-01-01
Homonuclear Hartmann-Hahn transfer is one of the most important building blocks in modern high-resolution NMR. It constitutes a very efficient transfer element for the assignment of proteins, nucleic acids, and oligosaccharides. Nevertheless, in macromolecules exceeding approximately 10 kDa TOCSY-experiments can show decreasing sensitivity due to fast transverse relaxation processes that are active during the mixing periods. In this article we propose the MOCCA-XY16 multiple pulse sequence, originally developed for efficient TOCSY transfer through residual dipolar couplings, as a homonuclear Hartmann-Hahn sequence with improved relaxation properties. A theoretical analysis of the coherence transfer via scalar couplings and its relaxation behavior as well as experimental transfer curves for MOCCA-XY16 relative to the well-characterized DIPSI-2 multiple pulse sequence are given.
Sequence Segmentation with changeptGUI.
Tasker, Edward; Keith, Jonathan M
2017-01-01
Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.
Mink, S; Härtig, E; Jennewein, P; Doppler, W; Cato, A C
1992-01-01
Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus involved in the neoplastic transformation of mouse mammary gland cells. The expression of this virus is regulated by mammary cell type-specific factors, steroid hormones, and polypeptide growth factors. Sequences for mammary cell-specific expression are located in an enhancer element in the extreme 5' end of the long terminal repeat region of this virus. This enhancer, when cloned in front of the herpes simplex thymidine kinase promoter, endows the promoter with mammary cell-specific response. Using functional and DNA-protein-binding studies with constructs mutated in the MMTV long terminal repeat enhancer, we have identified two main regulatory elements necessary for the mammary cell-specific response. These elements consist of binding sites for a transcription factor in the family of CTF/NFI proteins and the transcription factor mammary cell-activating factor (MAF) that recognizes the sequence G Pu Pu G C/G A A G G/T. Combinations of CTF/NFI- and MAF-binding sites or multiple copies of either one of these binding sites but not solitary binding sites mediate mammary cell-specific expression. The functional activities of these two regulatory elements are enhanced by another factor that binds to the core sequence ACAAAG. Interdigitated binding sites for CTF/NFI, MAF, and/or the ACAAAG factor are also found in the 5' upstream regions of genes encoding whey milk proteins from different species. These findings suggest that mammary cell-specific regulation is achieved by a concerted action of factors binding to multiple regulatory sites. PMID:1328867
Berthier, Y; Thierry, D; Lemattre, M; Guesdon, J L
1994-01-01
A new insertion sequence was isolated from Xanthomonas campestris pv. dieffenbachiae. Sequence analysis showed that this element is 1,158 bp long and has 15-bp inverted repeat ends containing two mismatches. Comparison of this sequence with sequences in databases revealed significant homology with Escherichia coli IS5. IS1051, which detected multiple restriction fragment length polymorphisms, was used as a probe to characterize strains from the pathovar dieffenbachiae. PMID:7906933
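Counting mismatches between the inverted repeat ends of an element, as reported above, amounts to comparing the left end with the reverse complement of the right end. The sketch below is illustrative only; `terminal_repeat_mismatches` is a hypothetical helper, the input is assumed to be uppercase A, C, G, T, and the toy element in the usage line is invented rather than taken from IS1051.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_repeat_mismatches(element, repeat_length):
    """Number of positions at which the left terminal repeat differs from
    the reverse complement of the right terminal repeat."""
    left = element[:repeat_length]
    right_rc = reverse_complement(element[-repeat_length:])
    return sum(a != b for a, b in zip(left, right_rc))

# Invented toy element: 6-bp ends that are inverted repeats with one mismatch.
print(terminal_repeat_mismatches("GGCTAGAAAACTAGCA", repeat_length=6))
```

A perfect inverted repeat gives zero; an element like the one described would give two over its 15-bp ends.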
PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.
Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S
2007-10-11
By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
Jie Jin, Feng; Hara, Seiichi; Sato, Atsushi; Koyama, Yasuji
2014-01-01
Wild-type Aspergillus oryzae RIB40 contains two copies of the AO090005001597 gene. We previously constructed an A. oryzae RIB40 strain, RKuAF8B, with multiple chromosomal deletions, in which the AO090005001597 copy number was found to be increased significantly. Sequence analysis indicated that AO090005001597 is part of a putative 6,000-bp retrotransposable element, flanked by two long terminal repeats (LTRs) of 669 bp, with characteristics of retroviruses and retrotransposons, and thus designated AoLTR (A. oryzae LTR-retrotransposable element). AoLTR comprised putative reverse transcriptase, RNase H, and integrase domains. The deduced amino acid sequence alignment of AoLTR showed 94% overall identity with AFLAV, an A. flavus Tf1/sushi retrotransposon. Quantitative real-time RT-PCR showed that AoLTR gene expression was significantly increased in RKuAF8B, in accordance with the increased copy number. Inverse PCR indicated that the full-length retrotransposable element was randomly integrated into multiple genomic locations. However, no obvious phenotypic changes were associated with the increased AoLTR gene copy number.
Circular RNA expression in basal cell carcinoma.
Sand, Michael; Bechara, Falk G; Sand, Daniel; Gambichler, Thilo; Hahn, Stephan A; Bromba, Michael; Stockfleth, Eggert; Hessam, Schapoor
2016-05-01
Circular RNAs (circRNAs) are non-protein-coding RNAs consisting of a circular loop with multiple miRNA binding sites, called miRNA response elements (MREs), that function as miRNA sponges. This study was performed to identify differentially expressed circRNAs and their MREs in basal cell carcinoma (BCC). Microarray circRNA expression profiles were acquired from BCC and control, followed by qRT-PCR validation. Bioinformatic target prediction revealed multiple MREs. Sequence analysis was performed concerning MRE interaction potential with the BCC miRNome. We identified 23 upregulated and 48 downregulated circRNAs with 354 miRNA response elements capable of sequestering miRNA target sequences of the BCC miRNome. The present study describes a variety of circRNAs that are potentially involved in the molecular pathogenesis of BCC.
Spuesens, Emiel B M; Oduber, Minoushka; Hoogenboezem, Theo; Sluijter, Marcel; Hartwig, Nico G; van Rossum, Annemarie M C; Vink, Cornelis
2009-07-01
The gene encoding major adhesin protein P1 of Mycoplasma pneumoniae, MPN141, contains two DNA sequence stretches, designated RepMP2/3 and RepMP4, which display variation among strains. This variation allows strains to be differentiated into two major P1 genotypes (1 and 2) and several variants. Interestingly, multiple versions of the RepMP2/3 and RepMP4 elements exist at other sites within the bacterial genome. Because these versions are closely related in sequence, but not identical, it has been hypothesized that they have the capacity to recombine with their counterparts within MPN141, and thereby serve as a source of sequence variation of the P1 protein. In order to determine the variation within the RepMP2/3 and RepMP4 elements, both within the bacterial genome and among strains, we analysed the DNA sequences of all RepMP2/3 and RepMP4 elements within the genomes of 23 M. pneumoniae strains. Our data demonstrate that: (i) recombination is likely to have occurred between two RepMP2/3 elements in four of the strains, and (ii) all previously described P1 genotypes can be explained by inter-RepMP recombination events. Moreover, the difference between the two major P1 genotypes was reflected in all RepMP elements, such that subtype 1 and 2 strains can be differentiated on the basis of sequence variation in each RepMP element. This implies that subtype 1 and subtype 2 strains represent evolutionarily diverged strain lineages. Finally, a classification scheme is proposed in which the P1 genotype of M. pneumoniae isolates can be described in a sequence-based, universal fashion.
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.
Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M
2003-02-28
Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell-specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to a 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Kuno, Sotaro; Yoshida, Takashi; Kamikawa, Ryoma; Hosoda, Naohiko; Sako, Yoshihiko
2010-01-01
The cyanophage Ma-LMM01, which specifically infects Microcystis aeruginosa, has an insertion sequence (IS) element, which we named IS607-cp, showing high nucleotide similarity to a counterpart in the genome of the cyanobacterium Cyanothece sp. We tested 21 strains of M. aeruginosa for the presence of IS607-cp using PCR and detected the element in strains NIES90, NIES112, NIES604, and RM6. Thermal asymmetric interlaced PCR (TAIL-PCR) revealed that each of these strains has multiple copies of IS607-cp. Some of the ISs were classified into three types based on their inserted positions: IS607-cp-1 is common in strains NIES90, NIES112 and NIES604, whereas IS607-cp-2 and IS607-cp-3 are specific to strains NIES90 and RM6, respectively. This multiplicity may reflect the replicative transposition of IS607-cp. The sequence of IS607-cp in Ma-LMM01 showed robust affinity to those found in M. aeruginosa and Cyanothece spp. in a phylogenetic tree inferred from counterparts of various bacteria. This suggests the transfer of IS607-cp between the cyanobacterium and its cyanophage. We discuss the potential role of Ma-LMM01-related phages as donors of IS elements that may mediate the transfer of IS607-cp and thereby partially contribute to the genome plasticity of M. aeruginosa.
Mango: multiple alignment with N gapped oligos.
Zhang, Zefeng; Lin, Hao; Li, Ming
2008-06-01
Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
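The contrast the abstract draws between spaced seeds and consecutive k-mers can be sketched in a few lines of Python. This is a generic illustration of spaced-seed matching, not MANGO's algorithm: the seed pattern `1101` (1 = position must match, 0 = position is ignored), the function names, and the toy sequences are all assumptions made for the example.

```python
from collections import defaultdict

def spaced_kmer(seq, start, seed):
    """Collect the characters of seq that fall under the '1' positions of the seed."""
    return "".join(seq[start + i] for i, c in enumerate(seed) if c == "1")

def seed_hits(a, b, seed="1101"):
    """Return (i, j) pairs where a[i:] and b[j:] agree on every '1' position of the seed."""
    span = len(seed)
    index = defaultdict(list)
    # Index every window of the first sequence by its spaced k-mer.
    for i in range(len(a) - span + 1):
        index[spaced_kmer(a, i, seed)].append(i)
    hits = []
    # Look up every window of the second sequence in that index.
    for j in range(len(b) - span + 1):
        for i in index.get(spaced_kmer(b, j, seed), []):
            hits.append((i, j))
    return hits

if __name__ == "__main__":
    # The two toy sequences differ at the third base only.
    print(seed_hits("ACGTAC", "ACTTAC", seed="1101"))  # spaced seed tolerates the mismatch
    print(seed_hits("ACGTAC", "ACTTAC", seed="1111"))  # consecutive 4-mer finds no hit
```

In this toy case the spaced seed still anchors the two sequences despite the single mismatch, whereas the consecutive 4-mer does not.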
Martoni, Francesco; Eickbush, Danna G.; Scavariello, Claudia; Luchetti, Andrea; Mantovani, Barbara
2015-01-01
R2 is an extensively investigated non-LTR retrotransposon that specifically inserts into the 28S rRNA gene sequences of a wide range of metazoans, disrupting their functionality. During R2 integration, first strand synthesis can be incomplete so that 5'-end-deleted copies are occasionally inserted. While active R2 copies repopulate the locus by retrotransposing, the non-functional truncated elements should frequently be eliminated by molecular drive processes leading to the concerted evolution of the rDNA array(s). Although multiple R2 lineages have been discovered in the genome of many animals, the rDNA of the stick insect Bacillus rossius exhibits a peculiar situation: it harbors both a canonical, functional R2 element (R2Brfun) and a full-length but degenerate element (R2Brdeg). An intensive sequencing survey in the present study reveals that all truncated variants in stick insects are present in multiple copies, suggesting they were duplicated by unequal recombination. Sequencing results also demonstrate that all R2Brdeg copies are full-length, i.e., they have no associated 5' end deletions, and functional assays indicate they have lost the active ribozyme necessary for R2 RNA maturation. Although it cannot be completely ruled out, it seems unlikely that the degenerate elements replicate via reverse transcription, exploiting the R2Brfun element enzymatic machinery, but rather via genomic amplification of inserted 28S by unequal recombination. That inactive copies (both R2Brdeg and 5'-truncated elements) are not eliminated in the short term in stick insects contrasts with findings for the Drosophila R2, suggesting a widely different management of rDNA loci and a lower efficiency of the molecular drive while achieving the concerted evolution. PMID:25799008
Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald
2013-01-01
Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. PMID:23648487
Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome
Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Brown, James B.; Bickel, Peter; Holmes, Ian; Mullikin, James C.; Ureta-Vidal, Abel; Paten, Benedict; Stone, Eric A.; Rosenbloom, Kate R.; Kent, W. James; Bouffard, Gerard G.; Guan, Xiaobin; Hansen, Nancy F.; Idol, Jacquelyn R.; Maduro, Valerie V.B.; Maskeri, Baishali; McDowell, Jennifer C.; Park, Morgan; Thomas, Pamela J.; Young, Alice C.; Blakesley, Robert W.; Muzny, Donna M.; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Jiang, Huaiyang; Weinstock, George M.; Gibbs, Richard A.; Graves, Tina; Fulton, Robert; Mardis, Elaine R.; Wilson, Richard K.; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B.; Chang, Jean L.; Lindblad-Toh, Kerstin; Lander, Eric S.; Hinrichs, Angie; Trumbower, Heather; Clawson, Hiram; Zweig, Ann; Kuhn, Robert M.; Barber, Galt; Harte, Rachel; Karolchik, Donna; Field, Matthew A.; Moore, Richard A.; Matthewson, Carrie A.; Schein, Jacqueline E.; Marra, Marco A.; Antonarakis, Stylianos E.; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross; Haussler, David; Miller, Webb; Pachter, Lior; Green, Eric D.; Sidow, Arend
2007-01-01
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. PMID:17567995
Brookfield, John F. Y.; Johnson, Louise J.
2006-01-01
Some families of mammalian interspersed repetitive DNA, such as the Alu SINE sequence, appear to have evolved by the serial replacement of one active sequence with another, consistent with there being a single source of transposition: the “master gene.” Alternative models, in which multiple source sequences are simultaneously active, have been called “transposon models.” Transposon models differ in the proportion of elements that are active and in whether inactivation occurs at the moment of transposition or later. Here we examine the predictions of various types of transposon model regarding the patterns of sequence variation expected at an equilibrium between transposition, inactivation, and deletion. Under the master gene model, all bifurcations in the true tree of elements occur in a single lineage. We show that this property will also hold approximately for transposon models in which most elements are inactive and where at least some of the inactivation events occur after transposition. Such tree shapes are therefore not conclusive evidence for a single source of transposition. PMID:16790583
Musetti, Rita; Pagliari, Laura; Buxa, Stefanie V.; Degola, Francesca; De Marco, Federica; Loschi, Alberto; Kogel, Karl-Heinz; van Bel, Aart J. E.
2016-01-01
Phytoplasmas are among the most recently discovered plant-pathogenic microorganisms, so many traits of their interactions with host plants and insect vectors are still unclear and need to be investigated. At present, it is impossible to determine the precise sequences leading to the onset of the relationship with the plant host cell. It is still unclear how phytoplasmas, located in the phloem sieve elements, exploit the host cell to draw nutrition for their metabolism, growth and multiplication. In this work, based on microscopical observations, we give insight into the structural interactions established between phytoplasmas and the sieve element plasma membrane, cytoskeleton, and sieve endoplasmic reticulum, and speculate about a possible functional role. PMID:26795235
NASA Technical Reports Server (NTRS)
Dietrich, F. J.; Koloboff, G. J.; Martel, R. J.; Johnson, C. C. (Inventor)
1974-01-01
A spin stabilized satellite has an electronically despun antenna array comprising a multiplicity of peripheral antenna elements. A high gain energy beam is established by connecting a suitable fraction or array of the elements in phase. The beam is steered or caused to scan by switching elements in sequence into one end of the array as elements at the other end of the array are switched out. The switching transients normally associated with such steering are avoided by an amplitude control system. Instead of abruptly switching from one element to the next, a fixed value of power is gradually transferred from the element at the trailing edge of the array to the element next to the leading edge.
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0
Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M.; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P.; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M.; Latorre, Amparo; Moya, Andres
2011-01-01
This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing and that, owing to the high molecular diversity of mobile elements, must be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers to participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org. PMID:21036865
Potvin, Eric; Beuret, Laurent; Cadrin-Girard, Jean-François; Carter, Marcelle; Roy, Sophie; Tremblay, Michel; Charron, Jean
2010-11-01
The precise expression of the N-myc proto-oncogene is essential for normal mammalian development, whereas altered N-myc gene regulation is known to be a determinant factor in tumor formation. Using transgenic mouse embryos, we show that N-myc sequences from kb -8.7 to kb +7.2 are sufficient to reproduce the N-myc embryonic expression profile in developing branchial arches and limb buds. These sequences encompass several regulatory elements dispersed throughout the N-myc locus, including an upstream limb bud enhancer, a downstream somite enhancer, a branchial arch enhancer in the second intron, and a negative regulatory element in the first intron. N-myc expression in the limb buds is under the dominant control of the limb bud enhancer. The expression in the branchial arches necessitates the interplay of three regulatory domains. The branchial arch enhancer cooperates with the somite enhancer region to prevent an inhibitory activity contained in the first intron. The characterization of the branchial arch enhancer has revealed a specific role of the transcription factor GATA3 in the regulation of N-myc expression. Together, these data demonstrate that correct N-myc developmental expression is achieved via cooperation of multiple positive and negative regulatory elements.
Dobinson, K F; Harris, R E; Hamer, J E
1993-01-01
The fungal phytopathogen Magnaporthe grisea parasitizes a wide variety of gramineous hosts. In the course of investigating the genetic relationship between pathogen genotype and host specificity we identified a retroelement that is present in some strains of M. grisea that infect finger millet and goosegrass (members of the plant genus Eleusine). The element, designated grasshopper (grh), is present in multiple copies and dispersed throughout the genome. DNA sequence analysis showed that grasshopper contains 198 base pair direct, long terminal repeats (LTRs) with features characteristic of retroviral and retrotransposon LTRs. Within the element we identified an open reading frame with sequences homologous to the reverse transcriptase, RNaseH, and integrase domains of retroelement pol genes. Comparison of the open reading frame with sequences from other retroelements showed that grh is related to the gypsy family of retrotransposons. Comparisons of the distribution of the grasshopper element with other dispersed repeated DNA sequences in M. grisea indicated that grasshopper was present in a broadly dispersed subgroup of Eleusine pathogens, suggesting that the element was acquired subsequent to the evolution of this host-specific form. We present arguments that the amplification of different retroelements within populations of M. grisea is a consequence of the clonal organization of the fungal populations.
MUSCLE: multiple sequence alignment with high accuracy and high throughput.
Edgar, Robert C
2004-01-01
We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
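To make "fast distance estimation using kmer counting" concrete, here is a minimal Python sketch of one common k-mer distance: one minus the fraction of k-mers shared by two unaligned sequences. The formula, the default k = 3, and the function names are assumptions chosen for illustration; the abstract does not give the exact measure MUSCLE uses.

```python
from collections import Counter

def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    """Alignment-free distance: 1 - (shared k-mers / k-mers in the shorter sequence)."""
    denom = min(len(a), len(b)) - k + 1
    if denom <= 0:
        raise ValueError("both sequences must be at least k residues long")
    # Counter intersection keeps the minimum count of each k-mer present in both.
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1.0 - shared / denom

if __name__ == "__main__":
    print(kmer_distance("ACGTACGT", "ACGTACGT"))  # identical sequences
    print(kmer_distance("AAAAAA", "CCCCCC"))      # no k-mers in common
```

Because no alignment is computed, a distance like this can be evaluated for every pair of sequences cheaply and then used to build a guide tree for progressive alignment.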
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan
2015-01-01
Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
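The two ingredients named in the abstract above, matching words over the IUPAC alphabet and scoring conservation by branch length, can be illustrated with a small Python sketch. This is not BLSSpeller's Java implementation: the tree layout, the species names, the toy promoters and the function names are all hypothetical, and the branch length score is taken here, as an assumption, to be the total length of the minimal subtree connecting the species in which the word occurs.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def word_occurs(word, promoter):
    """Return True if the IUPAC word matches anywhere on the forward strand."""
    pattern = "".join(IUPAC[c] for c in word.upper())
    return re.search(pattern, promoter.upper()) is not None


def branch_length_score(tree, present):
    """Sum the branch lengths of the minimal subtree connecting `present` leaves.

    `tree` is a nested tuple (name, branch_length, children); leaves have an
    empty children list. This layout is an assumption made for the sketch.
    """
    total_present = len(present)
    score = 0.0

    def visit(node):
        nonlocal score
        name, length, children = node
        if children:
            below = sum(visit(child) for child in children)
        else:
            below = 1 if name in present else 0
        # A branch belongs to the connecting subtree only when conserved
        # leaves lie on both sides of it.
        if 0 < below < total_present:
            score += length
        return below

    visit(tree)
    return score


# Hypothetical three-species tree and promoters, for illustration only.
TOY_TREE = ("root", 0.0, [
    ("X", 0.2, [("spA", 0.1, []), ("spB", 0.1, [])]),
    ("spC", 0.3, []),
])
PROMOTERS = {"spA": "TTCACGTGAA", "spB": "GGCACGCGTT", "spC": "AAAATTTTCC"}

conserved = {sp for sp, seq in PROMOTERS.items() if word_occurs("CACGYG", seq)}
bls = branch_length_score(TOY_TREE, conserved)
```

In this toy case the word CACGYG occurs in spA and spB only, so the score is the sum of their two terminal branches. Dividing by the total tree length would give a normalized score; whether and how the published method normalizes is not stated in the abstract.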
Anish, Ramakrishnan; Hossain, Mohammad B.; Jacobson, Raymond H.; Takada, Shinako
2009-01-01
Background More than 80% of mammalian protein-coding genes are driven by TATA-less promoters which often show multiple transcriptional start sites (TSSs). However, little is known about the core promoter DNA sequences or mechanisms of transcriptional initiation for this class of promoters. Methodology/Principal Findings Here we identify a new core promoter element XCPE2 (X core promoter element 2) (consensus sequence: A/C/G-C-C/T-C-G/A-T-T-G/A-C-C/A+1-C/T) that can direct specific transcription from the second TSS of hepatitis B virus X gene mRNA. XCPE2 sequences can also be found in human promoter regions and typically appear to drive one of the start sites within multiple TSS-containing TATA-less promoters. To gain insight into mechanisms of transcriptional initiation from this class of promoters, we examined requirements of several general transcription factors by in vitro transcription experiments using immunodepleted nuclear extracts and purified factors. Our results show that XCPE2-driven transcription uses at least TFIIB, either TFIID or free TBP, RNA polymerase II (RNA pol II) and the MED26-containing mediator complex but not Gcn5. Therefore, XCPE2-driven transcription can be carried out by a mechanism which differs from previously described TAF-dependent mechanisms for initiator (Inr)- or downstream promoter element (DPE)-containing promoters, the TBP- and SAGA (Spt-Ada-Gcn5-acetyltransferase)-dependent mechanism for yeast TATA-containing promoters, or the TFTC (TBP-free-TAF-containing complex)-dependent mechanism for certain Inr-containing TATA-less promoters. EMSA assays using XCPE2 promoter and purified factors further suggest that XCPE2 promoter recognition requires a set of factors different from those for TATA box, Inr, or DPE promoter recognition. Conclusions/Significance We identified a new core promoter element XCPE2 that is found in multiple TSS-containing TATA-less promoters. Mechanisms of promoter recognition and transcriptional initiation for XCPE2-driven promoters appear different from previously shown mechanisms for classical promoters that show single “focused” TSSs. Our studies provide insight into novel mechanisms of RNA Pol II transcription from multiple TSS-containing TATA-less promoters. PMID:19337366
Identifying Multiple Populations in M71 using CN
NASA Astrophysics Data System (ADS)
Gerber, Jeffrey M.; Friel, Eileen D.; Vesperini, Enrico
2018-01-01
It is now well established that globular clusters (GCs) host multiple stellar populations characterized by differences in several light elements. While these populations have been found in nearly all GCs, we still lack an entirely successful model to explain their formation. A key constraint on these models is the detailed pattern of light element abundances seen among the populations; different techniques for identifying these populations probe different elements and do not always yield the same results. We study a large sample of stars in the GC M71 for light elements C and N, using the CN and CH band strengths to identify multiple populations. Our measurements come from low-resolution spectroscopy obtained with the WIYN-3.5m telescope for ~150 stars from the tip of the red-giant branch down to the main-sequence turn-off. The large number of stars and broad spatial coverage of our sample (out to ~3.5 half-light radii) allow us to carry out a comprehensive characterization of the multiple populations in M71. We use a combination of the various spectroscopic and photometric indicators to draw a more complete picture of the properties of the populations and to investigate the consistency of classifications using different techniques.
Hargreaves, Katherine R.; Flores, Cesar O.; Lawley, Trevor D.
2014-01-01
ABSTRACT Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. PMID:25161187
Multi-Objective Optimization of Spacecraft Trajectories for Small-Body Coverage Missions
NASA Technical Reports Server (NTRS)
Hinckley, David, Jr.; Englander, Jacob; Hitt, Darren
2017-01-01
Visual coverage of surface elements of a small-body object requires multiple images to be taken that meet many requirements on their viewing angles, illumination angles, times of day, and combinations thereof. Designing trajectories capable of maximizing total possible coverage may not be useful since the image target sequence and the feasibility of said sequence given the rotation-rate limitations of the spacecraft are not taken into account. This work presents a means of optimizing, in a multi-objective manner, surface target sequences that account for such limitations.
Pu, Jian; Sun, Haina; Wang, Jinda; Wu, Min; Wang, Kangxu; Denholm, Ian; Han, Zhaojun
2016-11-01
As well as arising from single point mutations in binding sites or detoxifying enzymes, it is likely that insecticide resistance mechanisms are frequently controlled by multiple genetic factors, resulting in resistance being inherited as a quantitative trait. However, empirical evidence for this is still rare. Here we analyse the causes of up-regulation of CYP6FU1, a monooxygenase implicated in resistance to deltamethrin in the rice pest Laodelphax striatellus. The 5'-flanking region of this gene was cloned and sequenced from individuals of a susceptible and a resistant strain. A luminescent reporter assay was used to evaluate different 5'-flanking regions and their fragments for promoter activity. Mutations enhancing promoter activity in various fragments were characterized, singly and in combination, by site mutation recovery. Nucleotide diversity in flanking sequences was greatly reduced in deltamethrin-resistant insects compared to susceptible ones. Phylogenetic sequence analysis found that CYP6FU1 had five different types of 5'-flanking region. All five types were present in a susceptible strain but only a single type showing the highest promoter activity was present in a resistant strain. Four cis-acting elements were identified whose influence on up-regulation was much more pronounced in combination than when present singly. Of these, two were new transcription factor (TF) binding sites produced by mutations, another one was also a new TF binding site altered from an existing one, and the fourth was a unique transcription start site. These results demonstrate that multiple cis-acting elements are involved in up-regulating CYP6FU1 to generate a resistance phenotype. Copyright © 2016 Elsevier Ltd. All rights reserved.
Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V
2003-01-01
In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
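The 1-median definition and the randomized-tournament idea in the abstract above can be sketched in Python as follows. This is an illustrative reading, not the AntiClusAl code: the Hamming distance, the group size of 3, the fixed random seed and all function names are assumptions made for the example.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def one_median(items, dist=hamming):
    """Exact 1-median: the element with the smallest total (hence average)
    distance to the others. Needs a quadratic number of distance calls."""
    return min(items, key=lambda s: sum(dist(s, t) for t in items))


def tournament_median(items, group_size=3, dist=hamming, seed=0):
    """Approximate 1-median by repeated small tournaments.

    Each round shuffles the candidates, splits them into small groups and
    keeps only the exact 1-median of every group, until one group remains.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = random.Random(seed)
    candidates = list(items)
    while len(candidates) > group_size:
        rng.shuffle(candidates)
        groups = [candidates[i:i + group_size]
                  for i in range(0, len(candidates), group_size)]
        candidates = [one_median(group, dist) for group in groups]
    return one_median(candidates, dist)


SEQS = ["AAAA", "AAAT", "AACT", "TTTT"]  # toy input, for illustration only
exact = one_median(SEQS)
approx = tournament_median(SEQS)
```

Because every group has a bounded size, each round costs a number of distance computations proportional to the number of surviving candidates, which is where the roughly linear running time comes from. The tournament winner is always a member of the input set but is not guaranteed to be the exact 1-median.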
Splicing predictions reliably classify different types of alternative splicing
Busch, Anke; Hertel, Klemens J.
2015-01-01
Alternative splicing is a key player in the creation of complex mammalian transcriptomes and its misregulation is associated with many human diseases. Multiple mRNA isoforms are generated from most human genes, a process mediated by the interplay of various RNA signature elements and trans-acting factors that guide spliceosomal assembly and intron removal. Here, we introduce a splicing predictor that evaluates hundreds of RNA features simultaneously to successfully differentiate between exons that are constitutively spliced, exons that undergo alternative 5′ or 3′ splice-site selection, and alternative cassette-type exons. Surprisingly, the splicing predictor did not feature strong discriminatory contributions from binding sites for known splicing regulators. Rather, the ability of an exon to be involved in one or multiple types of alternative splicing is dictated by its immediate sequence context, mainly driven by the identity of the exon's splice sites, the conservation around them, and its exon/intron architecture. Thus, the splicing behavior of human exons can be reliably predicted based on basic RNA sequence elements. PMID:25805853
2014-01-01
Background Retroviral elements are pervasively transcribed and dynamically regulated during development. While multiple histone- and DNA-modifying enzymes have broadly been associated with their global silencing, little is known about how the many diverse retroviral families are each selectively recognized. Results Here we show that the zinc finger protein Krüppel-like Factor 3 (KLF3) specifically silences transcription from the ORR1A0 long terminal repeat in murine fetal and adult erythroid cells. In the absence of KLF3, we detect widespread transcription from ORR1A0 elements driven by the master erythroid regulator KLF1. In several instances these aberrant transcripts are spliced to downstream genic exons. One such chimeric transcript produces a novel, dominant negative isoform of PU.1 that can induce erythroid differentiation. Conclusions We propose that KLF3 ensures the integrity of the murine erythroid transcriptome through the selective repression of a particular retroelement and is likely one of multiple sequence-specific factors that cooperate to achieve global silencing. PMID:24946810
Kwarciak, Kamil; Radom, Marcin; Formanowicz, Piotr
2016-04-01
Classical sequencing by hybridization takes into account only binary information about sequence composition: a given element from an oligonucleotide library either is or is not a part of the target sequence. However, DNA chip technology has developed to the point where it can provide partial information about the multiplicity of each oligonucleotide in the analyzed sequence. Currently, it is not possible to obtain exact data of this type, but even partial information should be very useful. Two realistic multiplicity information models are taken into consideration in this paper. The first one, called "one and many", assumes that it is possible to determine whether a given oligonucleotide occurs in a reconstructed sequence once or more than once. According to the second model, called "one, two and many", a biochemical experiment can reveal whether a given oligonucleotide is present in an analyzed sequence once, twice or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare with existing algorithms for sequencing by hybridization which utilize the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Computational experiment results confirm that using even partial information about multiplicity leads to increased quality of reconstructed sequences. Moreover, they also show that the more precise model yields better solutions and that the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available at: http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip. Copyright © 2016 Elsevier Ltd. All rights reserved.
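The two multiplicity models described above can be made concrete with a short Python sketch that builds an ideal, error-free spectrum from a known sequence. The function names and the toy sequence are hypothetical, and a real experiment would add the hybridization errors that the abstract mentions.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping oligonucleotide of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def multiplicity_spectrum(sequence, k, model="one and many"):
    """Label each k-mer with the partial multiplicity a chip could report.

    "one and many"      -> "one" or "many" (two or more occurrences)
    "one, two and many" -> "one", "two" or "many" (three or more occurrences)
    """
    cap = {"one and many": 2, "one, two and many": 3}[model]
    labels = {1: "one", 2: "two"}
    spectrum = {}
    for kmer, n in kmer_counts(sequence, k).items():
        spectrum[kmer] = "many" if n >= cap else labels[n]
    return spectrum


TARGET = "ACGACGACGT"  # toy target sequence, for illustration only
coarse = multiplicity_spectrum(TARGET, 3, "one and many")
fine = multiplicity_spectrum(TARGET, 3, "one, two and many")
```

Here CGA occurs twice, so the coarser model can only report it as "many", whereas the finer model distinguishes it from ACG, which occurs three times. That extra resolution is the information a reconstruction algorithm could exploit.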
Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A
2018-04-01
Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes which can contribute to increasing its pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study with a multidrug-resistant profile and a high-frequency PFGE clonal profile which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. We show evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype that may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help in future molecular epidemiological studies. The analysis described here reveals a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.
Conrad, Liza J; Brutnell, Thomas P
2005-12-01
We have identified and characterized a novel Activator (Ac) element that is incapable of excision yet contributes to the canonical negative dosage effect of Ac. Cloning and sequence analysis of this immobilized Ac (Ac-im) revealed that it is identical to Ac with the exception of a 10-bp deletion of sequences at the left end of the element. In screens of approximately 6800 seeds, no germinal transpositions of Ac-im were detected. Importantly, Ac-im catalyzes germinal excisions of a Ds element resident at the r1 locus resulting in the recovery of independent transposed Ds insertions in approximately 4.5% of progeny kernels. Many of these transposition events occur during gametophytic development. Furthermore, we demonstrate that Ac-im transactivates multiple Ds insertions in somatic tissues including those in reporter alleles at bronze1, anthocyaninless1, and anthocyaninless2. We propose a model for the generation of Ac-im as an aberrant transposition event that failed to generate an 8-bp target site duplication and resulted in the deletion of Ac end sequences. We also discuss the utility of Ac-im in two-component Ac/Ds gene-tagging programs in maize.
Crustal architecture and tectonic evolution of the Cauvery Suture Zone, southern India
NASA Astrophysics Data System (ADS)
Chetty, T. R. K.; Yellappa, T.; Santosh, M.
2016-11-01
The Cauvery suture zone (CSZ) in southern India has witnessed multiple deformations associated with a history of multiple subduction-collision events, with the related accretionary belts incorporated sequentially into the southern continental margin of the Archaean Dharwar craton from the Neoarchean to the Neoproterozoic. The accreted tectonic elements include suprasubduction complexes of arc magmatic sequences, high-grade supracrustals, thrust duplexes, ophiolites, and younger intrusions that are dispersed along the suture. The intra-oceanic Neoarchean-Neoproterozoic arc assemblages are well exposed in the form of tectonic mélanges dominantly towards the eastern sector of the CSZ and are typically subjected to complex and multiple deformation events. Multi-scale analysis of structural elements with detailed geological mapping of the sub-regions and their structural cross sections, geochemical and geochronological data and integrated geophysical observations suggest that the CSZ is an important zone that preserves the imprints of multiple cycles of Precambrian plate tectonic regimes.
Kordes, Sebastian; Kössl, Manfred
2017-01-01
For the purpose of orientation, echolocating bats emit highly repetitive and spatially directed sonar calls. Echoes arising from call reflections are used to create an acoustic image of the environment. The inferior colliculus (IC) represents an important auditory stage for initial processing of echolocation signals. The present study addresses the following questions: (1) how does the temporal context of an echolocation sequence mimicking an approach flight of an animal affect neuronal processing of distance information to echo delays? (2) how does the IC process complex echolocation sequences containing echo information from multiple objects (multiobject sequence)? Here, we conducted neurophysiological recordings from the IC of ketamine-anaesthetized bats of the species Carollia perspicillata and compared the results from the IC with the ones from the auditory cortex (AC). Neuronal responses to an echolocation sequence were suppressed when compared to the responses to temporally isolated and randomized segments of the sequence. The neuronal suppression was weaker in the IC than in the AC. In contrast to the cortex, the time course of the acoustic events is reflected by IC activity. In the IC, suppression sharpens the neuronal tuning to specific call-echo elements and increases the signal-to-noise ratio in the units’ responses. When presenting multiple-object sequences, despite collicular suppression, the neurons responded to each object-specific echo. The latter allows parallel processing of multiple echolocation streams at the IC level. Altogether, our data suggest that temporally precise neuronal responses in the IC could allow fast and parallel processing of multiple acoustic streams. PMID:29242823
Beetz, M Jerome; Kordes, Sebastian; García-Rosales, Francisco; Kössl, Manfred; Hechavarría, Julio C
2017-01-01
For the purpose of orientation, echolocating bats emit highly repetitive and spatially directed sonar calls. Echoes arising from call reflections are used to create an acoustic image of the environment. The inferior colliculus (IC) represents an important auditory stage for initial processing of echolocation signals. The present study addresses the following questions: (1) how does the temporal context of an echolocation sequence mimicking an approach flight of an animal affect neuronal processing of distance information to echo delays? (2) how does the IC process complex echolocation sequences containing echo information from multiple objects (multiobject sequence)? Here, we conducted neurophysiological recordings from the IC of ketamine-anaesthetized bats of the species Carollia perspicillata and compared the results from the IC with the ones from the auditory cortex (AC). Neuronal responses to an echolocation sequence were suppressed when compared to the responses to temporally isolated and randomized segments of the sequence. The neuronal suppression was weaker in the IC than in the AC. In contrast to the cortex, the time course of the acoustic events is reflected by IC activity. In the IC, suppression sharpens the neuronal tuning to specific call-echo elements and increases the signal-to-noise ratio in the units' responses. When presenting multiple-object sequences, despite collicular suppression, the neurons responded to each object-specific echo. The latter allows parallel processing of multiple echolocation streams at the IC level. Altogether, our data suggest that temporally precise neuronal responses in the IC could allow fast and parallel processing of multiple acoustic streams.
Wymbs, Nicholas F.; Bassett, Danielle S.; Mucha, Peter J.; Porter, Mason A.; Grafton, Scott T.
2012-01-01
Motor chunking facilitates movement production by combining motor elements into integrated units of behavior. Previous research suggests that chunking involves two processes: concatenation, aimed at the formation of motor-motor associations between elements or sets of elements; and segmentation, aimed at the parsing of multiple contiguous elements into shorter action sets. We used fMRI to measure the trial-wise recruitment of brain regions associated with these chunking processes as healthy subjects performed a cued sequence production task. A novel dynamic network analysis identified chunking structure for a set of motor sequences acquired during fMRI and collected on three days of training. Activity in the bilateral sensorimotor putamen positively correlated with chunk concatenation, whereas a left hemisphere frontoparietal network was correlated with chunk segmentation. Across subjects, there was an aggregate increase in chunk strength (concatenation) with training, suggesting that subcortical circuits play a direct role in the creation of fluid transitions across chunks. PMID:22681696
Wymbs, Nicholas F; Bassett, Danielle S; Mucha, Peter J; Porter, Mason A; Grafton, Scott T
2012-06-07
Motor chunking facilitates movement production by combining motor elements into integrated units of behavior. Previous research suggests that chunking involves two processes: concatenation, aimed at the formation of motor-motor associations between elements or sets of elements, and segmentation, aimed at the parsing of multiple contiguous elements into shorter action sets. We used fMRI to measure the trial-wise recruitment of brain regions associated with these chunking processes as healthy subjects performed a cued-sequence production task. A dynamic network analysis identified chunking structure for a set of motor sequences acquired during fMRI and collected over 3 days of training. Activity in the bilateral sensorimotor putamen positively correlated with chunk concatenation, whereas a left-hemisphere frontoparietal network was correlated with chunk segmentation. Across subjects, there was an aggregate increase in chunk strength (concatenation) with training, suggesting that subcortical circuits play a direct role in the creation of fluid transitions across chunks. Copyright © 2012 Elsevier Inc. All rights reserved.
RNA motif search with data-driven element ordering.
Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa
2016-05-18
In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo .
Graw, J; Liebstein, A; Pietrowski, D; Schmitt-John, T; Werner, T
1993-12-22
The murine genes, gamma B-cry and gamma C-cry, encoding the gamma B- and gamma C-crystallins, were isolated from a genomic DNA library. The complete nucleotide (nt) sequences of both genes were determined from 661 and 711 bp, respectively, upstream from the first exon to the corresponding polyadenylation sites, comprising more than 2650 and 2890 bp, respectively. The new sequences were compared to the partial cDNA sequences available for the murine gamma B-cry and gamma C-cry, as well as to the corresponding genomic sequences from rat and man, at both the nt and predicted amino acid (aa) sequence levels. In the gamma B-cry promoter region, a canonical CCAAT-box, a TATA-box, putative NF-I and C/EBP sites were detected. An R-repeat is inserted 366 bp upstream from the transcription start point. In contrast, the gamma C-cry promoter does not contain a CCAAT-box, but some other putative binding sites for transcription factors (AP-2, UBP-1, LBP-1) were located by computer analysis. The promoter regions of all six gamma-cry from mouse, rat and human, except human psi gamma F-cry, were analyzed for common sequence elements. A complex sequence element of about 70-80 bp was found in the proximal promoter, which contains a gamma-cry-specific and almost invariant sequence (crygpel) of 14 nt, and ends with the also invariant TATA-box. Within the complex sequence element, a minimum of three further features specific for the gamma A-, gamma B- and gamma D/E/F-cry genes can be defined, at least two of which were recently shown to be functional. In addition to these four sequence elements, a subtype-specific structure of inverted repeats with different-sized spacers can be deduced from the multiple sequence alignment. A phylogenetic analysis based on the promoter region, as well as the complete exon 3 of all gamma-cry from mouse, rat and man, suggests separation of only five gamma-cry subtypes (gamma A-, gamma B-, gamma C-, gamma D- and gamma E/F-cry) prior to species separation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buchman, A.R.; Kimmerly, W.J.; Rine, J.
1988-01-01
Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCAN NCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
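As a small illustration of how the two consensus sequences quoted above could be used to scan DNA, the following Python sketch converts the "(A/G)" alternation notation and the wildcard N into a regular expression. The function names and test sequences are hypothetical; the sketch scans the forward strand only and treats every listed alternative as equally acceptable, which ignores the affinity differences the abstract describes.

```python
import re


def consensus_to_regex(consensus):
    """Translate notation such as '(A/G)(A/C)ACCCAN' into a regex string."""
    compact = consensus.replace(" ", "").upper()
    # '(A/G)' becomes the character class '[AG]'.
    compact = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        compact,
    )
    # 'N' stands for any nucleotide.
    return compact.replace("N", "[ACGT]")


def find_sites(consensus, dna):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = "(?=" + consensus_to_regex(consensus) + ")"
    return [m.start() for m in re.finditer(pattern, dna.upper())]


# Consensus strings copied from the abstract above.
GRFI = "(A/G)(A/C)ACCCAN NCA(T/C)(T/C)"
ABFI = "TATCATTNNNNACGA"

grfi_hits = find_sites(GRFI, "TTGAACCCATACATTGG")    # made-up test sequence
abfi_hits = find_sites(ABFI, "GGTATCATTACGTACGACC")  # made-up test sequence
```

Scanning the reverse complement as well, or replacing the regular expression with a weight matrix, would be natural extensions but go beyond what the abstract specifies.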
The nonamer UUAUUUAUU is the key AU-rich sequence motif that mediates mRNA degradation.
Zubiaga, A M; Belasco, J G; Greenberg, M E
1995-01-01
Labile mRNAs that encode cytokine and immediate-early gene products often contain AU-rich sequences within their 3' untranslated region (UTR). These AU-rich sequences appear to be key determinants of the short half-lives of these mRNAs, although the sequence features of these elements and the mechanism by which they target mRNAs for rapid decay have not been fully defined. We have examined the features of AU-rich elements (AREs) that are crucial for their function as determinants of mRNA instability in mammalian cells by testing the ability of various mutant c-fos AREs and synthetic AREs to direct rapid mRNA deadenylation and decay when inserted within the 3' UTR of the normally stable beta-globin mRNA. Evidence is presented that the pentamer AUUUA, which previously was suggested to be the minimal determinant of instability present in mammalian AREs, cannot direct rapid mRNA deadenylation and decay. Instead, the nonamer UUAUUUAUU is the elemental AU-rich sequence motif that destabilizes mRNA. Removal of one uridine residue from either end of the nonamer (UUAUUUAU or UAUUUAUU) results in a decrease of potency of the element, while removal of a uridine residue from both ends of the nonamer (UAUUUAU) eliminates detectable destabilizing activity. The inclusion of an additional uridine residue at both ends of the nonamer (UUUAUUUAUUU) does not further increase the efficacy of the element. Taken together, these findings suggest that the nonamer UUAUUUAUU is the minimal AU-rich motif that effectively destabilizes mRNA. Additional ARE potency is achieved by combining multiple copies of this nonamer in a single mRNA 3' UTR. Furthermore, analysis of poly(A) shortening rates for ARE-containing mRNAs reveals that the UUAUUUAUU sequence also accelerates mRNA deadenylation and suggests that the UUAUUUAUU motif targets mRNA for rapid deadenylation as an early step in the mRNA decay process. PMID:7891716
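Since the abstract above stresses that potency grows with multiple copies of the nonamer, a minimal Python sketch of counting possibly overlapping copies in a 3' UTR may be useful. The function name and the test sequences are made up for illustration, and DNA-style input is accepted by converting T to U, which is an assumption of convenience.

```python
import re

NONAMER = "UUAUUUAUU"  # motif reported as the minimal destabilizing element
PENTAMER = "AUUUA"     # shorter motif reported as insufficient on its own


def motif_positions(utr, motif=NONAMER):
    """Return 0-based start positions of every copy of `motif`,
    counting overlapping copies separately."""
    rna = utr.upper().replace("T", "U")
    # A zero-width lookahead lets consecutive matches overlap.
    return [m.start() for m in re.finditer(f"(?={motif})", rna)]


overlapping = motif_positions("UUAUUUAUUUAUU")         # two overlapping nonamers
pentamer_only = motif_positions("GGAUUUAGG")           # pentamer, no nonamer
pentamer_hits = motif_positions("GGAUUUAGG", PENTAMER)
```

The second test sequence contains the pentamer but not the nonamer, the situation the abstract reports as insufficient for rapid deadenylation and decay.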
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.
1988-08-01
The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two eggshell protein genes closely linked, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5' end of the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotides. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.
Capillarics: pre-programmed, self-powered microfluidic circuits built from capillary elements.
Safavieh, Roozbeh; Juncker, David
2013-11-07
Microfluidic capillary systems employ surface tension effects to manipulate liquids, and are thus self-powered and self-regulated as liquid handling is structurally and chemically encoded in microscale conduits. However, capillary systems have been limited to performing simple fluidic operations. Here, we introduce complex capillary flow circuits that encode sequential flow of multiple liquids with distinct flow rates and flow reversal. We first introduce two novel microfluidic capillary elements including (i) retention burst valves and (ii) robust low aspect ratio trigger valves. These elements are combined with flow resistors, capillary retention valves, capillary pumps, and open and closed reservoirs to build a capillary circuit that, following sample addition, autonomously delivers a defined sequence of multiple chemicals according to a preprogrammed and predetermined flow rate and time. Such a circuit was used to measure the concentration of C-reactive protein. This work illustrates that, as in electronics, complex capillary circuits may be built by combining simple capillary elements. We define such circuits as "capillarics", and introduce symbolic representations. We believe that more complex circuits will become possible by expanding the library of building elements and formulating abstract design rules.
Heideman, Simone G; van Ede, Freek; Nobre, Anna C
2018-05-24
In daily life, temporal expectations may derive from incidental learning of recurring patterns of intervals. We investigated the incidental acquisition and utilisation of combined temporal-ordinal (spatial/effector) structure in complex visual-motor sequences using a modified version of a serial reaction time (SRT) task. In this task, not only the series of targets/responses, but also the series of intervals between subsequent targets was repeated across multiple presentations of the same sequence. Each participant completed three sessions. In the first session, only the repeating sequence was presented. During the second and third session, occasional probe blocks were presented, where a new (unlearned) spatial-temporal sequence was introduced. We first confirm that participants not only got faster over time, but that they were slower and less accurate during probe blocks, indicating that they incidentally learned the sequence structure. Having established a robust behavioural benefit induced by the repeating spatial-temporal sequence, we next addressed our central hypothesis that implicit temporal orienting (evoked by the learned temporal structure) would have the largest influence on performance for targets following short (as opposed to longer) intervals between temporally structured sequence elements, paralleling classical observations in tasks using explicit temporal cues. We found that indeed, reaction time differences between new and repeated sequences were largest for the short interval, compared to the medium and long intervals, and that this was the case, even when comparing late blocks (where the repeated sequence had been incidentally learned), to early blocks (where this sequence was still unfamiliar). We conclude that incidentally acquired temporal expectations that follow a sequential structure can have a robust facilitatory influence on visually-guided behavioural responses and that, like more explicit forms of temporal orienting, this effect is most pronounced for sequence elements that are expected at short inter-element intervals. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Aptamer-conjugated nanoparticles for cancer cell detection.
Medley, Colin D; Bamrungsap, Suwussa; Tan, Weihong; Smith, Joshua E
2011-02-01
Aptamer-conjugated nanoparticles (ACNPs) have been used for a variety of applications, particularly dual nanoparticles for magnetic extraction and fluorescent labeling. In this type of assay, silica-coated magnetic and fluorophore-doped silica nanoparticles are conjugated to highly selective aptamers to detect and extract targeted cells in a variety of matrixes. However, considerable improvements are required in order to increase the selectivity and sensitivity of this two-particle assay to be useful in a clinical setting. To accomplish this, several parameters were investigated, including nanoparticle size, conjugation chemistry, use of multiple aptamer sequences on the nanoparticles, and use of multiple nanoparticles with different aptamer sequences. After identifying the best-performing elements, the improvements made to this assay's conditional parameters were combined to illustrate the overall enhanced sensitivity and selectivity of the two-particle assay using an innovative multiple aptamer approach, signifying a critical feature in the advancement of this technique.
Multiple splicing defects in an intronic false exon.
Sun, H; Chasin, L A
2000-09-01
Splice site consensus sequences alone are insufficient to dictate the recognition of real constitutive splice sites within the typically large transcripts of higher eukaryotes, and large numbers of pseudoexons flanked by pseudosplice sites with good matches to the consensus sequences can be easily designated. In an attempt to identify elements that prevent pseudoexon splicing, we have systematically altered known splicing signals, as well as immediately adjacent flanking sequences, of an arbitrarily chosen pseudoexon from intron 1 of the human hprt gene. The substitution of a 5' splice site that perfectly matches the 5' consensus combined with mutation to match the CAG/G sequence of the 3' consensus failed to get this model pseudoexon included as the central exon in a dhfr minigene context. Provision of a real 3' splice site and a consensus 5' splice site and removal of an upstream inhibitory sequence were necessary and sufficient to confer splicing on the pseudoexon. This activated context also supported the splicing of a second pseudoexon sequence containing no apparent enhancer. Thus, both the 5' splice site sequence and the polypyrimidine tract of the pseudoexon are defective despite their good agreement with the consensus. On the other hand, the pseudoexon body did not exert a negative influence on splicing. The introduction into the pseudoexon of a sequence selected for binding to ASF/SF2 or its replacement with beta-globin exon 2 only partially reversed the effect of the upstream negative element and the defective polypyrimidine tract. These results support the idea that exon-bridging enhancers are not a prerequisite for constitutive exon definition and suggest that intrinsically defective splice sites and negative elements play important roles in distinguishing the real splicing signal from the vast number of false splicing signals.
Nandi, Tannistha; Holden, Matthew T.G.; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A.; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J.; Titball, Richard; Chen, Swaine L.; Parkhill, Julian
2015-01-01
Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617
The repetitive landscape of the chicken genome
Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert
2005-01-01
Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
The ENCODE Project at UC Santa Cruz.
Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James
2007-01-01
The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.
Functional and mechanistic diversity of distal transcription enhancers
Bulger, Michael; Groudine, Mark
2013-01-01
Biological differences among metazoans, and between cell types in a given organism, arise in large part due to differences in gene expression patterns. The sequencing of multiple metazoan genomes, coupled with recent advances in genome-wide analysis of histone modifications and transcription factor binding, has revealed that among regulatory DNA sequences, gene-distal enhancers appear to exhibit the greatest diversity and cell-type specificity. Moreover, such elements are emerging as important targets for mutations that can give rise to disease and to genetic variability that underlies evolutionary change. Studies of long-range interactions between distal genomic sequences in the nucleus indicate that enhancers are often important determinants of nuclear organization, contributing to a general model for enhancer function that involves direct enhancer-promoter contact. In a number of systems, however, mechanisms for enhancer function are emerging that do not fit solely within such a model, suggesting that enhancers as a class of DNA regulatory element may be functionally and mechanistically diverse. PMID:21295696
Transposition of the maize transposable element Ac in barley (Hordeum vulgare L.).
Scholz, S; Lörz, H; Lütticke, S
2001-01-01
Transposition of the maize autonomous element Ac (Activator) was investigated in barley (Hordeum vulgare L.) with the aim of developing a transposon tagging system for the latter. The Ac element was introduced into meristematic tissue of barley by microprojectile bombardment. Transposon activity was then examined in the resulting transgenic plants. Multiple excision events were detected in leaf tissue of all plant lines. The mobile elements generated empty donor sites with small DNA sequence alterations, similar to those found in maize. Reintegration of Ac at independent genomic loci in somatic tissue was demonstrated by isolation of new element-flanking regions by AIMS-PCR (amplification of insertion-mutagenized sites). In addition, transmission of transposed Ac elements to progeny plants was confirmed. The results indicate that the introduced Ac element is able to transpose in barley. This is a first step towards the establishment of a transposon tagging system in this economically important crop.
Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C
2017-09-01
Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS). TFIS detects TFBSs, for the first time, using a collection of Java programs and is mainly based on TFBS detection using position weight matrices (PWMs). Position frequency matrices (PFMs) are obtained from JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used when converting PFMs to PWMs, and TFBS detection is carried out with a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using gene identification numbers and accession numbers, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect potential TFBSs in both directions at the same time, which increases its efficiency; the results of this dual detection are presented in different colors specific to the orientation of the binding site. Results obtained by TFIS are more detailed and specific to the detected TFs because the result pages integrate informative links to related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs such as SP1, AP1 and NF-KB of the amyloid beta precursor gene are easily detected using TFIS, along with multiple binding sites. In another scenario, the embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
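The PFM-to-PWM conversion with pseudo-counts and the two-strand, percent-score scan described in this abstract can be sketched roughly as follows. This is an illustrative sketch in Python, not the TFIS code (which the abstract says is written in Java): the pseudo-count value, the uniform background, the definition of "percent score" as the position of a window's score between the minimum and maximum attainable PWM scores, and the toy matrix are all assumptions.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert per-column base counts into log2-odds weights (assumed formula)."""
    pwm = []
    for col in pfm:
        total = sum(col[b] for b in BASES) + pseudocount
        pwm.append({
            # The pseudo-count is split across bases in proportion to background.
            b: math.log2((col[b] + pseudocount * background) / total / background)
            for b in BASES
        })
    return pwm

def score(pwm, site):
    """Sum the column weights for one candidate site of matching width."""
    return sum(col[base] for col, base in zip(pwm, site))

def scan(pwm, seq, percent=80.0):
    """Return (start, strand, percent score) for every window reaching the threshold."""
    seq = seq.upper()
    width = len(pwm)
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if not set(window) <= set(BASES):
            continue  # skip windows containing ambiguous symbols such as N
        reverse = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", reverse)):
            pct = 100.0 * (score(pwm, site) - lowest) / (highest - lowest)
            if pct >= percent:
                hits.append((i, strand, round(pct, 1)))
    return hits

# Toy 4-column PFM for the invented motif TGAC (counts are hypothetical).
toy_pfm = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pwm = pfm_to_pwm(toy_pfm)
hits = scan(toy_pwm, "AATGACGGGTCAAA", percent=90.0)
print(hits)
```

Reporting the strand with each hit mirrors the abstract's point that sites are detected in both orientations in one pass; a real scan would take its matrices from a profile database rather than a hand-written toy.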
Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT
Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong
2006-01-01
Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417
Shakoor, Nadia; Ziegler, Greg; Dilkes, Brian P; Brenton, Zachary; Boyles, Richard; Connolly, Erin L; Kresovich, Stephen; Baxter, Ivan
2016-04-01
Seedling establishment and seed nutritional quality require the sequestration of sufficient element nutrients. The identification of genes and alleles that modify element content in the grains of cereals, including sorghum (Sorghum bicolor), is fundamental to developing breeding and selection methods aimed at increasing bioavailable element content and improving crop growth. We have developed a high-throughput work flow for the simultaneous measurement of multiple elements in sorghum seeds. We measured seed element levels in the genotyped Sorghum Association Panel, representing all major cultivated sorghum races from diverse geographic and climatic regions, and mapped alleles contributing to seed element variation across three environments by genome-wide association. We observed significant phenotypic and genetic correlation between several elements across multiple years and diverse environments. The power of combining high-precision measurements with genome-wide association was demonstrated by implementing rank transformation and a multilocus mixed model to map alleles controlling 20 element traits, identifying 255 loci affecting the sorghum seed ionome. Sequence similarity to genes characterized in previous studies identified likely causative genes for the accumulation of zinc, manganese, nickel, calcium, and cadmium in sorghum seeds. In addition to strong candidates for these five elements, we provide a list of candidate loci for several other elements. Our approach enabled the identification of single-nucleotide polymorphisms in strong linkage disequilibrium with causative polymorphisms that can be evaluated in targeted selection strategies for plant breeding and improvement. © 2016 American Society of Plant Biologists. All Rights Reserved.
Namouchi, Amine; Mardassi, Helmi
2006-11-01
Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli, and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either the T7 or the T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band.
NASA Astrophysics Data System (ADS)
Chen, LeuJen; Kim, Seong Heon; Lee, Alfred K. H.; de Lozanne, Alex
2012-01-01
We describe a new type of circuit designed for driving piezoelectric positioners that rely on the stick-slip phenomenon. The circuit can be used for inertial positioners that have only one piezoelectric element (or multiple elements that are moved simultaneously) or for designs using a sequential movement of independent piezoelectric elements. A relay switches the piezoelectric elements between a high voltage source and ground, thus creating a fast voltage step followed by a slow ramp produced by the exponential discharging of the piezoelectric elements through a series resistor. A timing cascade is generated by having each relay power the next relay in the sequence. This design is simple and inexpensive. While it was developed for scanning probe microscopes, it may be useful for any piezoelectric motor based on a fast jump followed by a slow relaxation.
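The "fast step followed by a slow ramp" waveform in this abstract can be approximated by treating the piezoelectric element as an ideal capacitor C that discharges through the series resistor R, so that V(t) = V0 · exp(-t / (R·C)) after the relay switches. The sketch below is only that textbook model; the component values, the constant names and the function name `discharge_voltage` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical component values, chosen only to give a 1 ms time constant.
V_HIGH = 100.0    # volts on the piezo just after the fast step
R_OHM = 1.0e5     # series resistor, ohms
C_FARAD = 1.0e-8  # piezo capacitance, farads (ideal-capacitor assumption)
TAU = R_OHM * C_FARAD

def discharge_voltage(t_s, v0=V_HIGH, r_ohm=R_OHM, c_farad=C_FARAD):
    """Voltage across the piezo t_s seconds after it starts discharging."""
    tau = r_ohm * c_farad
    return v0 * math.exp(-t_s / tau)

# Sample the slow ramp at whole multiples of the time constant.
for k in range(4):
    t = k * TAU
    print(f"t = {t * 1e3:.1f} ms  V = {discharge_voltage(t):.2f} V")
```

Under this model the ramp duration is set entirely by the product R·C, which is consistent with the abstract's description of a ramp produced by discharging through a series resistor.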
A different approach to multiplicity-edited heteronuclear single quantum correlation spectroscopy
NASA Astrophysics Data System (ADS)
Sakhaii, Peyman; Bermel, Wolfgang
2015-10-01
A new experiment for recording multiplicity-edited HSQC spectra is presented. In standard multiplicity-edited HSQC experiments, the amplitude of CH2 signals is negative compared to those of CH and CH3 groups. We propose to reverse the sign of 13C frequencies of CH2 groups in t1 as the criterion for editing. Basically, a modified [BIRD]r,x element (Bilinear Rotation Pulses and Delays) is inserted in a standard HSQC pulse sequence with States-TPPI frequency detection in t1 for this purpose. The modified BIRD element was designed in such a way as to pass or stop the evolution of the heteronuclear 1JHC coupling. This is achieved by adding a 180° proton RF pulse in each of the 1/2J periods. Depending on their position, the evolution is switched on or off. Usually, the BIRD element is applied on real and imaginary increments of an HSQC experiment to achieve the editing between multiplicities. Here, we restrict the application of the modified BIRD element to either real or imaginary increments of the HSQC. With this new scheme for editing, changing the frequency and/or amplitude of the CH2 signals becomes available. Reversing the chemical shift axis for CH2 signals simplifies overcrowded frequency regions and thus avoids accidental signal cancellation in conventional edited HSQC experiments. The practical implementation is demonstrated on the protein Lysozyme. Advantages and limitations of the idea are discussed.
A Fitness Cost Associated With the Antibiotic Resistance Enzyme SME-1 β-Lactamase
Marciano, David C.; Karkouti, Omid Y.; Palzkill, Timothy
2007-01-01
The blaTEM-1 β-lactamase gene has become widespread due to the selective pressure of β-lactam use and its stable maintenance on transferable DNA elements. In contrast, blaSME-1 is rarely isolated and is confined to the chromosome of carbapenem-resistant Serratia marcescens strains. Dissemination of blaSME-1 via transfer to a mobile DNA element could hinder the use of carbapenems. In this study, blaSME-1 was determined to impart a fitness cost upon Escherichia coli in multiple genetic contexts and assays. Genetic screens and designed SME-1 mutants were utilized to identify the source of this fitness cost. These experiments established that the SME-1 protein was required for the fitness cost but also that the enzyme activity of SME-1 was not associated with the fitness cost. The genetic screens suggested that the SME-1 signal sequence was involved in the fitness cost. Consistent with these findings, exchange of the SME-1 signal sequence for the TEM-1 signal sequence alleviated the fitness cost while replacing the TEM-1 signal sequence with the SME-1 signal sequence imparted a fitness cost to TEM-1 β-lactamase. Taken together, these results suggest that fitness costs associated with some β-lactamases may limit their dissemination. PMID:17565956
Aubrey, Wayne; Riley, Michael C.; Young, Michael; King, Ross D.; Oliver, Stephen G.; Clare, Amanda
2015-01-01
Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method’s primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome. PMID:26630677
Nishinaka, Toru; Ichijo, Yusuke; Ito, Maki; Kimura, Masayoshi; Katsuyama, Masato; Iwata, Kazumi; Miura, Takeshi; Terada, Tomoyuki; Yabe-Nishimura, Chihiro
2007-05-15
Curcumin is a plant-derived diferuloylmethane compound extracted from Curcuma longa, possessing antioxidative and anticarcinogenic properties. Antioxidants and oxidative stress are known to induce the expression of certain classes of detoxification enzymes. Since the upregulation of detoxifying enzymes affects the drug metabolism and cell defense system, it is important to understand the gene regulation by such agents. In this study, we demonstrated that curcumin could induce the expression of human glutathione S-transferase P1 (GSTP1). In HepG2 cells treated with 20 μM curcumin, the level of GSTP1 mRNA was significantly increased. In luciferase reporter assays, curcumin augmented the promoter activity of a reporter construct carrying 336 bp upstream of the 5'-flanking region of the GSTP1 gene. Mutation analyses revealed that the region including the antioxidant response element (ARE), which overlaps AP1 in sequence, was essential to the response to curcumin. While the introduction of a wild-type Nrf2 expression construct augmented the promoter activity of the GSTP1 gene, co-expression of a dominant-negative Nrf2 abolished the responsiveness to curcumin. In addition, curcumin activated the expression of the luciferase gene from a reporter construct carrying multiple ARE consensus sequences but not one with multiple AP1 sites. In a gel mobility shift assay with an oligonucleotide containing the GSTP1 ARE, an increase in the amount of the binding complex was observed in the nuclear extracts of curcumin-treated HepG2 cells. These results suggested that ARE is the primary sequence for the curcumin-induced transactivation of the GSTP1 gene. The induction of GSTP1 may be one of the mechanisms underlying the multiple actions of curcumin.
Kikhno, Irina
2014-01-01
Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153
Gerencsér, Ákos; Barta, Endre; Boa, Simon; Kastanis, Petros; Bösze, Zsuzsanna; Whitelaw, C Bruce A
2002-01-01
κ-casein plays an essential role in the formation, stabilisation and aggregation of milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. We determined the 5'-flanking sequences for the murine, rabbit and human κ-casein genes and compared them to the published ruminant sequences. The most conserved region was not the proximal promoter region but an approximately 400 bp long region centred 800 bp upstream of the TATA box. This region contained two highly conserved MGF/STAT5 sites with common spacing relative to each other. In this region, six conserved short stretches of similarity were also found which did not correspond to known transcription factor consensus sites. In contrast to the ruminant and human 5' regulatory sequences, the rabbit and murine 5'-flanking regions did not harbour any kind of repetitive elements. We generated a phylogenetic tree of the six species based on multiple alignment of the κ-casein sequences. This study identified conserved candidate transcriptional regulatory elements within the κ-casein gene promoter. PMID:11929628
Hargreaves, Katherine R; Flores, Cesar O; Lawley, Trevor D; Clokie, Martha R J
2014-08-26
Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. Clostridium difficile is a significant bacterial human pathogen which undergoes continual genome evolution, resulting in the emergence of new virulent strains. Phages are major facilitators of genome evolution in other bacterial species, and we use sequence analysis-based approaches in order to examine whether the CRISPR/Cas system could control these interactions across divergent C. difficile strains. 
The presence of spacer sequences in prophages that are homologous to phage genomes raises an extra level of complexity in this predator-prey microbial system. Our results demonstrate that the impact of phage infection in this system is widespread and that the CRISPR/Cas system is likely to be an important aspect of the evolutionary dynamics in C. difficile. Copyright © 2014 Hargreaves et al.
Gillespie, J J; Johnston, J S; Cannone, J J; Gutell, R R
2006-01-01
As an accompanying manuscript to the release of the honey bee genome, we report the entire sequence of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) ribosomal RNA (rRNA)-encoding gene sequences (rDNA) and related internally and externally transcribed spacer regions of Apis mellifera (Insecta: Hymenoptera: Apocrita). Additionally, we predict secondary structures for the mature rRNA molecules based on comparative sequence analyses with other arthropod taxa and reference to recently published crystal structures of the ribosome. In general, the structures of honey bee rRNAs are in agreement with previously predicted rRNA models from other arthropods in core regions of the rRNA, with little additional expansion in non-conserved regions. Our multiple sequence alignments are made available on several public databases and provide a preliminary establishment of a global structural model of all rRNAs from the insects. Additionally, we provide conserved stretches of sequences flanking the rDNA cistrons that comprise the externally transcribed spacer regions (ETS) and part of the intergenic spacer region (IGS), including several repetitive motifs. Finally, we report the occurrence of retrotransposition in the nuclear large subunit rDNA, as R2 elements are present in the usual insertion points found in other arthropods. Interestingly, functional R1 elements usually present in the genomes of insects were not detected in the honey bee rRNA genes. The reverse transcriptase products of the R2 elements are deduced from their putative open reading frames and structurally aligned with those from another hymenopteran insect, the jewel wasp Nasonia (Pteromalidae). Stretches of conserved amino acids shared between Apis and Nasonia are illustrated and serve as potential sites for primer design, as target amplicons within these R2 elements may serve as novel phylogenetic markers for Hymenoptera. 
Given the impending completion of the sequencing of the Nasonia genome, we expect our report eventually to shed light on the evolution of the hymenopteran genome within higher insects, particularly regarding the relative maintenance of conserved rDNA genes, related variable spacer regions and retrotransposable elements. PMID:17069639
Pavia, Paula X; Thomas, M Carmen; López, Manuel C; Puerta, Concepción J
2012-10-01
Repetitive sequences constitute an important proportion of the Trypanosoma cruzi genome; hence, they have been used as molecular markers and as amplification targets to identify the parasite presence via PCR. In this study, a molecular characterization of the SIRE repetitive element was performed in the six discrete typing units (DTUs) of T. cruzi. The results showed that this element, located in multiple chromosomes, was interspersed in the genome of all DTUs of the parasite. The presence of several motifs implicated in element insertion, duplication, and functionality suggests that SIRE could be an active element in the parasite genome. Of interest, there were SIRE-specific Alu I fragments that allowed DTU I to be discriminated from the other DTUs. Moreover, a UPGMA phenetic tree constructed from fragment-sharing Southern blot data showed that T. cruzi I isolates form a cluster separated from the T. cruzi II-VI isolates. When the relative number of SIRE copies was determined, a variation from 105 to 2,000 copies per haploid genome was observed among the different isolates, with no relationship to DTU. In all, these findings suggest that the SIRE sequence is a good target for parasite DNA amplification. Copyright © 2012 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Suzuki, Kazuo; Yasunami, Michio; Matsuda, Yoichi
1996-09-01
Embryonic TEA domain-containing factor (ETF) belongs to the family of proteins structurally related to transcriptional enhancer factor-1 (TEF-1) and is implicated in neural development. Isolation and characterization of the cosmid clones encoding the mouse ETF gene (Etdf) revealed that Etdf spans approximately 17.9 kb and consists of 12 exons. The exon-intron structure of Etdf closely resembles that of the Drosophila scalloped gene, indicating that these genes may have evolved from a common ancestor. The multiple transcription initiation sites revealed by S1 protection and primer extension analyses are consistent with the absence of the canonical TATA and CAAT boxes in the 5'-flanking region, which contains many potential regulatory sequences, such as the E-box, N-box, Sp1 element, GATA-1 element, TAATGARAT element, and B2 short interspersed element (SINE) as well as several direct and inverted repeat sequences. The Etdf locus was assigned to the proximal region of mouse chromosome 7 using fluorescence in situ hybridization and linkage mapping analyses. These results provide the molecular basis for studying the regulation, in vivo function, and evolution of Etdf. 29 refs., 5 figs., 1 tab.
Suzuki, K; Yasunami, M; Matsuda, Y; Maeda, T; Kobayashi, H; Terasaki, H; Ohkubo, H
1996-09-01
Embryonic TEA domain-containing factor (ETF) belongs to the family of proteins structurally related to transcriptional enhancer factor-1 (TEF-1) and is implicated in neural development. Isolation and characterization of the cosmid clones encoding the mouse ETF gene (Etdf) revealed that Etdf spans approximately 17.9 kb and consists of 12 exons. The exon-intron structure of Etdf closely resembles that of the Drosophila scalloped gene, indicating that these genes may have evolved from a common ancestor. The multiple transcription initiation sites revealed by S1 protection and primer extension analyses are consistent with the absence of the canonical TATA and CAAT boxes in the 5'-flanking region, which contains many potential regulatory sequences, such as the E-box, N-box, Sp1 element, GATA-1 element, TAATGARAT element, and B2 short interspersed element (SINE) as well as several direct and inverted repeat sequences. The Etdf locus was assigned to the proximal region of mouse chromosome 7 using fluorescence in situ hybridization and linkage mapping analyses. These results provide the molecular basis for studying the regulation, in vivo function, and evolution of Etdf.
Late Holocene volcanic activity and environmental change in Highland Guatemala
NASA Astrophysics Data System (ADS)
Lohse, Jon C.; Hamilton, W. Derek; Brenner, Mark; Curtis, Jason; Inomata, Takeshi; Morgan, Molly; Cardona, Karla; Aoyama, Kazuo; Yonenobu, Hitoshi
2018-07-01
We present a record of late Holocene volcanic eruptions with elemental data for a sequence of sampled tephras from Lake Amatitlan in Highland Guatemala. Our tephrochronology is anchored by a Bayesian P_Sequence age-depth model based on multiple AMS radiocarbon dates. We compare our record against a previously published study from the same area to understand the record of volcanism and environmental changes. This work has implications for understanding the effects of climate and other environmental changes that may be related to the emission of volcanic aerosols at local, regional and global scales.
Canver, Matthew C; Lessard, Samuel; Pinello, Luca; Wu, Yuxuan; Ilboudo, Yann; Stern, Emily N; Needleman, Austen J; Galactéros, Frédéric; Brugnara, Carlo; Kutlar, Abdullah; McKenzie, Colin; Reid, Marvin; Chen, Diane D; Das, Partha Pratim; A Cole, Mitchel; Zeng, Jing; Kurita, Ryo; Nakamura, Yukio; Yuan, Guo-Cheng; Lettre, Guillaume; Bauer, Daniel E; Orkin, Stuart H
2017-04-01
Cas9-mediated, high-throughput, saturating in situ mutagenesis permits fine-mapping of function across genomic segments. Disease- and trait-associated variants identified in genome-wide association studies largely cluster at regulatory loci. Here we demonstrate the use of multiple designer nucleases and variant-aware library design to interrogate trait-associated regulatory DNA at high resolution. We developed a computational tool for the creation of saturating-mutagenesis libraries with single or multiple nucleases with incorporation of variants. We applied this methodology to the HBS1L-MYB intergenic region, which is associated with red-blood-cell traits, including fetal hemoglobin levels. This approach identified putative regulatory elements that control MYB expression. Analysis of genomic copy number highlighted potential false-positive regions, thus emphasizing the importance of off-target analysis in the design of saturating-mutagenesis experiments. Together, these data establish a widely applicable high-throughput and high-resolution methodology to identify minimal functional sequences within large disease- and trait-associated regions.
Gordon, Christopher T.; Attanasio, Catia; Bhatia, Shipra; Benko, Sabina; Ansari, Morad; Tan, Tiong Y.; Munnich, Arnold; Pennacchio, Len A.; Abadie, Véronique; Temple, I. Karen; Goldenberg, Alice; van Heyningen, Veronica; Amiel, Jeanne; FitzPatrick, David; Kleinjan, Dirk A.; Visel, Axel; Lyonnet, Stanislas
2015-01-01
Mutations in the coding sequence of SOX9 cause campomelic dysplasia (CD), a disorder of skeletal development associated with 46,XY disorders of sex development (DSDs). Translocations, deletions and duplications within a ~2 Mb region upstream of SOX9 can recapitulate the CD-DSD phenotype fully or partially, suggesting the existence of an unusually large cis-regulatory control region. Pierre Robin sequence (PRS) is a craniofacial disorder that is frequently an endophenotype of CD and a locus for isolated PRS at ~1.2-1.5 Mb upstream of SOX9 has been previously reported. The craniofacial regulatory potential within this locus, and within the greater genomic domain surrounding SOX9, remains poorly defined. We report two novel deletions upstream of SOX9 in families with PRS, allowing refinement of the regions harbouring candidate craniofacial regulatory elements. In parallel, ChIP-Seq for p300 binding sites in mouse craniofacial tissue led to the identification of several novel craniofacial enhancers at the SOX9 locus, which were validated in transgenic reporter mice and zebrafish. Notably, some of the functionally validated elements fall within the PRS deletions. These studies suggest that multiple non-coding elements contribute to the craniofacial regulation of SOX9 expression, and that their disruption results in PRS. PMID:24934569
Sequence information signal processor
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1999-01-01
An electronic circuit is used to compare two sequences, such as genetic sequences, to determine which alignment of the sequences produces the greatest similarity. The circuit includes a linear array of series-connected processors, each of which stores a single element from one of the sequences and compares that element with each successive element in the other sequence. For each comparison, the processor generates a scoring parameter that indicates which segment ending at those two elements produces the greatest degree of similarity between the sequences. The processor uses the scoring parameter to generate a similar scoring parameter for a comparison between the stored element and the next successive element from the other sequence. The processor also delivers the scoring parameter to the next processor in the array for use in generating a similar scoring parameter for another pair of elements. The electronic circuit determines which processor and alignment of the sequences produce the scoring parameter with the highest value.
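The per-cell scoring described in this abstract can be illustrated in software. The following is a minimal sketch only, assuming a local-alignment recurrence in the style of Smith-Waterman with a linear gap penalty; the abstract does not name the recurrence or the scoring values, so the function name `best_local_score` and the `match`, `mismatch`, and `gap` parameters are hypothetical choices made for illustration. Each inner-loop step plays the role of one processor comparing its stored element with the next element of the other sequence.

```python
def best_local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, (i, j)) for the best-scoring local segment pair.

    The scoring values are illustrative assumptions, not taken from the source.
    (i, j) are 1-based end positions in a and b; (0, 0) means no positive score.
    """
    # prev[j] is the best score of any aligned segment ending at a[i-2], b[j-1]
    prev = [0] * (len(b) + 1)
    best, best_end = 0, (0, 0)
    for i, x in enumerate(a, start=1):
        curr = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A score never drops below 0: a segment may start at any cell.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_end = curr[j], (i, j)
        prev = curr
    return best, best_end
```

In this sketch only one previous row is kept, which loosely mirrors how each processor in the described array would pass its scoring parameter on to its neighbour rather than storing a full matrix.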
Repetitive element transcripts are elevated in the brain of C9orf72 ALS/FTLD patients.
Prudencio, Mercedes; Gonzales, Patrick K; Cook, Casey N; Gendron, Tania F; Daughrity, Lillian M; Song, Yuping; Ebbert, Mark T W; van Blitterswijk, Marka; Zhang, Yong-Jie; Jansen-West, Karen; Baker, Matthew C; DeTure, Michael; Rademakers, Rosa; Boylan, Kevin B; Dickson, Dennis W; Petrucelli, Leonard; Link, Christopher D
2017-09-01
Significant transcriptome alterations are detected in the brain of patients with amyotrophic lateral sclerosis (ALS), including carriers of the C9orf72 repeat expansion and C9orf72-negative sporadic cases. Recently, the expression of repetitive element transcripts has been associated with toxicity and, while increased repetitive element expression has been observed in several neurodegenerative diseases, little is known about their contribution to ALS. To assess whether aberrant expression of repetitive element sequences is observed in ALS, we analysed RNA sequencing data from C9orf72-positive and sporadic ALS cases, as well as healthy controls. Transcripts from multiple classes and subclasses of repetitive elements (LINEs, endogenous retroviruses, DNA transposons, simple repeats, etc.) were significantly increased in the frontal cortex of C9orf72 ALS patients. A large collection of patient samples, representing both C9orf72 positive and negative ALS, ALS/FTLD, and FTLD cases, was used to validate the levels of several repetitive element transcripts. These analyses confirmed that repetitive element expression was significantly increased in C9orf72-positive compared to C9orf72-negative or control cases. While previous studies suggest an important link between TDP-43 and repetitive element biology, our data indicate that TDP-43 pathology alone is insufficient to account for the observed changes in repetitive elements in ALS/FTLD. Instead, we found that repetitive element expression positively correlated with RNA polymerase II activity in postmortem brain, and pharmacologic modulation of RNA polymerase II activity altered repetitive element expression in vitro. We conclude that increased RNA polymerase II activity in ALS/FTLD may lead to increased repetitive element transcript expression, a novel pathological feature of ALS/FTLD. © The Author 2017. Published by Oxford University Press.
Imbert, J; Zafarullah, M; Culotta, V C; Gedamu, L; Hamer, D
1989-01-01
Metallothionein (MT) gene promoters in higher eucaryotes contain multiple metal regulatory elements (MREs) that are responsible for the metal induction of MT gene transcription. We identified and purified to near homogeneity a 74-kilodalton mouse nuclear protein that specifically binds to certain MRE sequences. This protein, MBF-I, was purified employing as an affinity reagent a trout MRE that is shown to be functional in mouse cells but which lacks the G+C-rich and SP1-like sequences found in many mammalian MT gene promoters. Using point-mutated MREs, we showed that there is a strong correlation between DNA binding in vitro and MT gene regulation in vivo, suggesting a direct role of MBF-I in MT gene transcription. We also showed that MBF-I can induce MT gene transcription in vitro in a mouse extract and that this stimulation requires zinc. PMID:2586522
Jaffrey, S R; Haile, D J; Klausner, R D; Harford, J B
1993-09-25
To assess the influence of RNA sequence/structure on the interaction of RNAs with the iron-responsive element binding protein (IRE-BP), twenty-eight altered RNAs were tested as competitors for an RNA corresponding to the ferritin H chain IRE. All changes in the loop of the predicted IRE hairpin and in the unpaired cytosine residue characteristically found in IRE stems significantly decreased the apparent affinity of the RNA for the IRE-BP. Similarly, alteration in the spacing and/or orientation of the loop and the unpaired cytosine of the stem by either increasing or decreasing the number of base pairs separating them significantly reduced efficacy as a competitor. It is inferred that the IRE-BP forms multiple contacts with its cognate RNA, and that these contacts, acting in concert, provide the basis for the high affinity of this interaction.
Spike-Based Bayesian-Hebbian Learning of Temporal Sequences
Lindén, Henrik; Lansner, Anders
2016-01-01
Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neocortical microcircuits can reliably encode and replay such sequences of information. To better understand this, a modular attractor memory network is proposed in which meta-stable sequential attractor transitions are learned through changes to synaptic weights and intrinsic excitabilities via the spike-based Bayesian Confidence Propagation Neural Network (BCPNN) learning rule. We find that the formation of distributed memories, embodied by increased periods of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model’s feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning and speed of sequence replay depends on a confluence of biophysically relevant parameters including stimulus duration, level of background noise, ratio of synaptic currents, and strengths of short-term depression and adaptation. Moreover, sequence elements are shown to flexibly participate multiple times in the sequence, suggesting that spiking attractor networks of this type can support an efficient combinatorial code. The model provides a principled approach towards understanding how multiple interacting plasticity mechanisms can coordinate hetero-associative learning in unison. PMID:27213810
The Thiamine-Pyrophosphate-Motif
NASA Technical Reports Server (NTRS)
Ciszak, Ewa; Dominiak, Paulina
2004-01-01
Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.
Ishii, Satoshi; Sadowsky, Michael J
2009-04-01
A large number of repetitive DNA sequences are found in multiple sites in the genomes of numerous bacteria, archaea and eukarya. While the functions of many of these repetitive sequence elements are unknown, they have proven to be useful as the basis of several powerful tools for use in molecular diagnostics, medical microbiology, epidemiological analyses and environmental microbiology. The repetitive sequence-based PCR or rep-PCR DNA fingerprint technique uses primers targeting several of these repetitive elements and PCR to generate unique DNA profiles or 'fingerprints' of individual microbial strains. Although this technique has been extensively used to examine diversity among a variety of prokaryotic microorganisms, rep-PCR DNA fingerprinting can also be applied to microbial ecology and microbial evolution studies since it has the power to distinguish microbes at the strain or isolate level. Recent advances in rep-PCR methodology have resulted in increased accuracy, reproducibility and throughput. In this minireview, we summarize recent improvements in rep-PCR DNA fingerprinting methodology, and discuss its applications to address fundamentally important questions in microbial ecology and evolution.
Jay, Z. J.; Beam, J. P.; Dohnalkova, A.; Lohmayer, R.; Bodle, B.; Planer-Friedrich, B.; Romine, M.
2015-01-01
Thermoproteales (phylum Crenarchaeota) populations are abundant in high-temperature (>70°C) environments of Yellowstone National Park (YNP) and are important in mediating the biogeochemical cycles of sulfur, arsenic, and carbon. The objectives of this study were to determine the specific physiological attributes of the isolate Pyrobaculum yellowstonensis strain WP30, which was obtained from an elemental sulfur sediment (Joseph's Coat Hot Spring [JCHS], 80°C, pH 6.1, 135 μM As) and relate this organism to geochemical processes occurring in situ. Strain WP30 is a chemoorganoheterotroph and requires elemental sulfur and/or arsenate as an electron acceptor. Growth in the presence of elemental sulfur and arsenate resulted in the formation of thioarsenates and polysulfides. The complete genome of this organism was sequenced (1.99 Mb, 58% G+C content), revealing numerous metabolic pathways for the degradation of carbohydrates, amino acids, and lipids. Multiple dimethyl sulfoxide-molybdopterin (DMSO-MPT) oxidoreductase genes, which are implicated in the reduction of sulfur and arsenic, were identified. Pathways for the de novo synthesis of nearly all required cofactors and metabolites were identified. The comparative genomics of P. yellowstonensis and the assembled metagenome sequence from JCHS showed that this organism is highly related (∼95% average nucleotide sequence identity) to in situ populations. The physiological attributes and metabolic capabilities of P. yellowstonensis provide an important foundation for developing an understanding of the distribution and function of these populations in YNP. PMID:26092468
Gaji, Rajshekhar Y; Howe, Daniel K
2009-07-01
The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
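As a small illustration of how motifs such as GAGACGC or the degenerate WGAGACG could be located in a 5'-flanking sequence, the sketch below scans both strands with a regular expression. It is an assumption-laden example, not the authors' analysis pipeline: the helper names `revcomp` and `find_motif`, the reduced IUPAC table (only A, C, G, T and W = A or T), and the example sequence used in the usage note are all hypothetical.

```python
import re

# Reduced IUPAC table (assumption: only the codes needed for these motifs).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand) hits, with 0-based starts on the given strand."""
    # The lookahead lets overlapping occurrences be reported as separate hits.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in motif) + "))")
    seq = seq.upper()
    n, k = len(seq), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Map positions found on the reverse complement back to forward coordinates.
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(revcomp(seq))]
    return sorted(hits)
```

For the made-up sequence `TTGAGACGCAA`, `find_motif(seq, "GAGACGC")` reports a single plus-strand hit at position 2, and `find_motif(seq, "WGAGACG")` reports one at position 1. Reporting the strand matters here because the abstract notes that the orientation of the 5'-most motif was critical for promoter function.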
von Both, Ulrich; Berk, Maurice; Agapow, Paul-Michael; Wright, Joseph D; Git, Anna; Hamilton, Melissa Shea; Goldgof, Greg; Siddiqui, Nazneen; Bellos, Evangelos; Wright, Victoria J; Coin, Lachlan J; Newton, Sandra M; Levin, Michael
2018-01-12
Mycobacterium tuberculosis (M. tuberculosis) survives and multiplies inside human macrophages by subversion of immune mechanisms. Although these immune evasion strategies are well characterised functionally, the underlying molecular mechanisms are poorly understood. Here we show that during infection of human whole blood with M. tuberculosis, host gene transcriptional suppression, rather than activation, is the predominant response. Spatial, temporal and functional characterisation of repressed genes revealed their involvement in pathogen sensing and phagocytosis, degradation within the phagolysosome and antigen processing and presentation. To identify mechanisms underlying suppression of multiple immune genes we undertook epigenetic analyses. We identified significantly differentially expressed microRNAs with known targets in suppressed genes. In addition, after searching regions upstream of the start of transcription of suppressed genes for common sequence motifs, we discovered novel enriched composite sequence patterns, which corresponded to Alu repeat elements, transposable elements known to have wide ranging influences on gene expression. Our findings suggest that to survive within infected cells, mycobacteria exploit a complex immune "molecular off switch" controlled by both microRNAs and Alu regulatory elements.
Ribosomal DNA Organization Before and After Magnification in Drosophila melanogaster
Bianciardi, Alessio; Boschi, Manuela; Swanson, Ellen E.; Belloni, Massimo; Robbins, Leonard G.
2012-01-01
In all eukaryotes, the ribosomal RNA genes are stably inherited redundant elements. In Drosophila melanogaster, the presence of a Ybb− chromosome in males, or the maternal presence of the Ribosomal exchange (Rex) element, induces magnification: a heritable increase of rDNA copy number. To date, several alternative classes of mechanisms have been proposed for magnification: in situ replication or extra-chromosomal replication, either of which might act on short or extended strings of rDNA units, or unequal sister chromatid exchange. To eliminate some of these hypotheses, none of which has been clearly proven, we examined molecular-variant composition and compared genetic maps of the rDNA in the bb2 mutant and in some magnified bb+ alleles. The genetic markers used are molecular-length variants of IGS sequences and of R1 and R2 mobile elements present in many 28S sequences. Direct comparison of PCR products does not reveal any particularly intensified electrophoretic bands in magnified alleles compared to the nonmagnified bb2 allele. Hence, the increase of rDNA copy number is diluted among multiple variants. We can therefore reject mechanisms of magnification based on multiple rounds of replication of short strings. Moreover, we find no changes of marker order when pre- and postmagnification maps are compared. Thus, we can further restrict the possible mechanisms to two: replication in situ of an extended string of rDNA units or unequal exchange between sister chromatids. PMID:22505623
Pecora, Nicole D; Li, Ning; Allard, Marc; Li, Cong; Albano, Esperanza; Delaney, Mary; Dubois, Andrea; Onderdonk, Andrew B; Bry, Lynn
2015-07-28
Carbapenem-resistant Enterobacteriaceae (CRE) are an urgent public health concern. Rapid identification of the resistance genes, their mobilization capacity, and strains carrying them is essential to direct hospital resources to prevent spread and improve patient outcomes. Whole-genome sequencing allows refined tracking of both chromosomal traits and associated mobile genetic elements that harbor resistance genes. To enhance surveillance of CREs, clinical isolates with phenotypic resistance to carbapenem antibiotics underwent whole-genome sequencing. Analysis of 41 isolates of Klebsiella pneumoniae and Enterobacter cloacae, collected over a 3-year period, identified K. pneumoniae carbapenemase (KPC) genes encoding KPC-2, -3, and -4 and OXA-48 carbapenemases. All occurred within transposons, including multiple Tn4401 transposon isoforms, embedded within more than 10 distinct plasmids representing incompatibility (Inc) groups IncR, -N, -A/C, -H, and -X. Using short-read sequencing, draft maps were generated of new KPC-carrying vectors, several of which were derivatives of the IncN plasmid pBK31551. Two strains also had Tn4401 chromosomal insertions. Integrated analyses of plasmid profiles and chromosomal single-nucleotide polymorphism (SNP) profiles refined the strain patterns and provided a baseline hospital mobilome to facilitate analysis of new isolates. When incorporated with patient epidemiological data, the findings identified limited outbreaks against a broader 3-year period of sporadic external entry of many different strains and resistance vectors into the hospital. These findings highlight the utility of genomic analyses in internal and external surveillance efforts to stem the transmission of drug-resistant strains within and across health care institutions. We demonstrate how detection of resistance genes within mobile elements and resistance-carrying strains furthers active surveillance efforts for drug resistance. 
Whole-genome sequencing is increasingly available in hospital laboratories and provides a powerful and nuanced means to define the local landscape of drug resistance. In this study, isolates of Klebsiella pneumoniae and Enterobacter cloacae with resistance to carbapenem antibiotics were sequenced. Multiple carbapenemase genes were identified that resided in distinct transposons and plasmids. This mobilome, or population of mobile elements capable of mobilizing drug resistance, further highlighted the degree of strain heterogeneity while providing a detailed timeline of carbapenemase entry into the hospital over a 3-year period. These surveillance efforts support effective targeting of infection control resources and the development of institution-specific repositories of resistance genes and the mobile elements that carry them. Copyright © 2015 Pecora et al.
IBS: an illustrator for the presentation and visualization of biological sequences.
Liu, Wenzhong; Xie, Yubin; Ma, Jiyong; Luo, Xiaotong; Nie, Peng; Zuo, Zhixiang; Lahrmann, Urs; Zhao, Qi; Zheng, Yueyuan; Zhao, Yong; Xue, Yu; Ren, Jian
2015-10-15
Biological sequence diagrams are fundamental for visualizing various functional elements in protein or nucleotide sequences that enable a summarization and presentation of existing information as well as means of intuitive new discoveries. Here, we present a software package called illustrator of biological sequences (IBS) that can be used for representing the organization of either protein or nucleotide sequences in a convenient, efficient and precise manner. Multiple options are provided in IBS, and biological sequences can be manipulated, recolored or rescaled in a user-defined mode. Also, the final representational artwork can be directly exported into a publication-quality figure. The standalone package of IBS was implemented in JAVA, while the online service was implemented in HTML5 and JavaScript. Both the standalone package and online service are freely available at http://ibs.biocuckoo.org. Contact: renjian.sysu@gmail.com or xueyu@hust.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants.
Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun
2017-10-24
Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation.
IBS: an illustrator for the presentation and visualization of biological sequences
Liu, Wenzhong; Xie, Yubin; Ma, Jiyong; Luo, Xiaotong; Nie, Peng; Zuo, Zhixiang; Lahrmann, Urs; Zhao, Qi; Zheng, Yueyuan; Zhao, Yong; Xue, Yu; Ren, Jian
2015-01-01
Summary: Biological sequence diagrams are fundamental for visualizing various functional elements in protein or nucleotide sequences that enable a summarization and presentation of existing information as well as means of intuitive new discoveries. Here, we present a software package called illustrator of biological sequences (IBS) that can be used for representing the organization of either protein or nucleotide sequences in a convenient, efficient and precise manner. Multiple options are provided in IBS, and biological sequences can be manipulated, recolored or rescaled in a user-defined mode. Also, the final representational artwork can be directly exported into a publication-quality figure. Availability and implementation: The standalone package of IBS was implemented in JAVA, while the online service was implemented in HTML5 and JavaScript. Both the standalone package and online service are freely available at http://ibs.biocuckoo.org. Contact: renjian.sysu@gmail.com or xueyu@hust.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26069263
High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome
2013-01-01
Background Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate. Results Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny. Conclusions We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts. PMID:23368932
Zhao, Yi; Cao, Xiangyu; Gao, Jun; Liu, Xiao; Li, Sijia
2016-05-16
We demonstrate a simple reconfigurable metasurface with multiple functions. Anisotropic tiles are investigated and manufactured as fundamental elements. The tiles are then combined in a chosen sequence to construct a metasurface. Each tile can be adjusted independently, like a piece of a jigsaw puzzle, and the whole metasurface achieves diverse functions through different layouts. For demonstration purposes, we realize polarization conversion, anomalous reflection and diffusion with a jigsaw puzzle metasurface of 6 × 6 anisotropic tiles. Simulated and measured results show that our method offers a simple and effective strategy for metasurface design.
Eastman, Alexander W.; Yuan, Ze-Chun
2015-01-01
Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, all located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provide a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects.
PMID:25653642
Sand, Olivier; Thomas-Chollier, Morgane; Vervisch, Eric; van Helden, Jacques
2008-01-01
This protocol shows how to access the Regulatory Sequence Analysis Tools (RSAT) via a programmatic interface in order to automate the analysis of multiple data sets. We describe the steps for writing a Perl client that connects to the RSAT Web services and implements a workflow to discover putative cis-acting elements in promoters of gene clusters. In the presented example, we apply this workflow to lists of transcription factor target genes resulting from ChIP-chip experiments. For each factor, the protocol predicts the binding motifs by detecting significantly overrepresented hexanucleotides in the target promoters and generates a feature map that displays the positions of putative binding sites along the promoter sequences. This protocol is addressed to bioinformaticians and biologists with programming skills (notions of Perl). Running time is approximately 6 min on the example data set.
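The idea of detecting significantly overrepresented hexanucleotides can be sketched in a few lines. This is a simplified stand-in and not the RSAT implementation or its Web services client (the protocol itself uses Perl): the function names, the toy sequences, the pseudocount and the threshold `alpha` are all assumptions made for illustration. The sketch counts hexamers on one strand only, estimates expected frequencies from a background set, and scores each hexamer with an upper-tail binomial probability, without any multiple-testing correction.

```python
from collections import Counter
from math import comb


def count_kmers(sequences, k=6):
    """Count every overlapping k-mer on the given strand of each sequence."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def binomial_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, j) * p**j * (1 - p)**(trials - j)
               for j in range(successes, trials + 1))


def overrepresented(promoters, background, k=6, alpha=1e-3):
    """Return (k-mer, observed count, p-value) tuples with p-value < alpha."""
    fg = count_kmers(promoters, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    results = []
    for kmer, observed in fg.items():
        # Pseudocount (an assumption of this sketch) so that k-mers absent
        # from the background still get a nonzero expected frequency.
        p = (bg[kmer] + 1) / (bg_total + 4**k)
        pval = binomial_tail(observed, fg_total, p)
        if pval < alpha:
            results.append((kmer, observed, pval))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: three "promoters" sharing a planted CACGTG hexamer.
promoters = ["TTCACGTGAA", "GGCACGTGTT", "AACACGTGCC"]
background = ["ACGTTGCAACGGTCATGCTA", "GGATCCTTAGCAAGTCCGAT"]
for kmer, observed, pval in overrepresented(promoters, background):
    print(kmer, observed, f"{pval:.2e}")
```

On this toy input only the planted hexamer passes the threshold. A real analysis would use a much larger background, count both strands, and correct for the number of hexamers tested.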
Longo, Mark S; Carone, Dawn M; Green, Eric D; O'Neill, Michael J; O'Neill, Rachel J
2009-01-01
Background Large-scale genome rearrangements brought about by chromosome breaks underlie numerous inherited diseases, initiate or promote many cancers and are also associated with karyotype diversification during species evolution. Recent research has shown that these breakpoints are nonrandomly distributed throughout the mammalian genome and many, termed "evolutionary breakpoints" (EB), are specific genomic locations that are "reused" during karyotypic evolution. When the phylogenetic trajectory of orthologous chromosome segments is considered, many of these EB are coincident with ancient centromere activity as well as new centromere formation. While EB have been characterized as repeat-rich regions, it has not been determined whether specific sequences have been retained during evolution that would indicate previous centromere activity or a propensity for new centromere formation. Likewise, the conservation of specific sequence motifs or classes at EBs among divergent mammalian taxa has not been determined. Results To define conserved sequence features of EBs associated with centromere evolution, we performed comparative sequence analysis of more than 4.8 Mb within the tammar wallaby, Macropus eugenii, derived from centromeric regions (CEN), euchromatic regions (EU), and an evolutionary breakpoint (EB) that has undergone convergent breakpoint reuse and past centromere activity in marsupials. We found a dramatic enrichment for long interspersed nucleotide elements (LINE1s) and endogenous retroviruses (ERVs) and a depletion of short interspersed nucleotide elements (SINEs) shared between CEN and EBs. We analyzed the orthologous human EB (14q32.33), known to be associated with translocations in many cancers including multiple myelomas and plasma cell leukemias, and found a conserved distribution of similar repetitive elements. 
Conclusion Our data indicate that EBs tracked within the class Mammalia harbor sequence features retained since the divergence of marsupials and eutherians that may have predisposed these genomic regions to large-scale chromosomal instability. PMID:19630942
Fukami, Maki; Dateki, Sumito; Kato, Fumiko; Hasegawa, Yukihiro; Mochizuki, Hiroshi; Horikawa, Reiko; Ogata, Tsutomu
2008-01-01
Although short-stature homeobox-containing gene (SHOX) haploinsufficiency is responsible for Léri-Weill dyschondrosteosis (LWD), the molecular defect has not been identified in approximately 20% of Japanese LWD patients. Furthermore, although the high prevalence of microdeletions affecting SHOX is primarily ascribed to the presence of repeat sequences such as Alu elements around SHOX, it remains to be determined whether microdeletions are actually mediated by repeat sequences. We performed a multiplex ligation-dependent probe amplification (MLPA) assay in six Japanese LWD patients with apparently normal SHOX, followed by fluorescent in situ hybridization (FISH) analysis and sequencing of polymerase chain reaction (PCR) products encompassing the deletion junctions in patients with abnormal MLPA patterns. Consequently, heterozygous intragenic deletions were identified in three cases, i.e., a 5,906-bp deletion involving exons 4-5 in case 1, a 5,594-bp deletion involving exons 4-6a in case 2, and a 50,199-bp deletion involving exons 4-6b in case 3. The deletion breakpoints of cases 1 and 2 were present in nonrepeat sequences, whereas those of case 3 resided within Alu elements. The results suggest that cryptic SHOX intragenic deletions account for a small fraction of LWD and that microdeletions affecting SHOX can be generated by repeat-sequence-mediated aberrant recombinations and by nonhomologous end joining.
Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes
Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong
2014-01-01
Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, the fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and vertebrate evolution. PMID:25523484
Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong
2014-12-19
Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, the fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and vertebrate evolution.
Compositions and methods for the expression of selenoproteins in eukaryotic cells
Gladyshev, Vadim [Lincoln, NE]; Novoselov, Sergey [Puschino, RU]
2012-09-25
Recombinant nucleic acid constructs for the efficient expression of eukaryotic selenoproteins and related methods for production of recombinant selenoproteins are provided. The nucleic acid constructs comprise novel selenocysteine insertion sequence (SECIS) elements. Certain novel SECIS elements of the invention contain non-canonical quartet sequences. Other novel SECIS elements provided by the invention are chimeric SECIS elements comprising a canonical SECIS element that contains a non-canonical quartet sequence and chimeric SECIS elements comprising a non-canonical SECIS element that contains a canonical quartet sequence. The novel SECIS elements of the invention facilitate the insertion of selenocysteine residues into recombinant polypeptides.
Identification, variation and transcription of pneumococcal repeat sequences
2011-01-01
Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
Kohmoto, Tomohiro; Naruto, Takuya; Watanabe, Miki; Fujita, Yuji; Ujiro, Sae; Okamoto, Nana; Horikawa, Hideaki; Masuda, Kiyoshi; Imoto, Issei
2017-04-01
Mesomelia-synostoses syndrome (MSS) is a rare, autosomal-dominant, syndromal osteochondrodysplasia characterized by mesomelic limb shortening, acral synostoses, and multiple congenital malformations due to a non-recurrent deletion at 8q13 that always encompasses two coding genes, SULF1 and SLCO5A1. To date, five unrelated patients have been reported worldwide, and MSS was previously proposed to not be a genomic disorder associated with deletions recurring from non-allelic homologous recombination (NAHR) in at least two analyzed cases. We conducted targeted gene panel sequencing and subsequent array-based copy number analysis in an 11-year-old undiagnosed Japanese female patient with multiple congenital anomalies that included mesomelic limb shortening and detected a novel 590 kb deletion at 8q13 encompassing the same gene set as reported previously, resulting in the diagnosis of MSS. Breakpoint sequences of the deleted region in our case demonstrated the first LINE-1 (L1)-mediated unequal NAHR event utilizing two distant L1 elements as homology substrates in this disease, which may represent a novel causative mechanism of the 8q13 deletion, expanding the range of mechanisms involved in the chromosomal rearrangements responsible for MSS. © 2017 Wiley Periodicals, Inc.
Barbosa, Patrícia; de Oliveira, Luiz Antonio; Pucci, Marcela Baer; Santos, Mateus Henrique; Moreira-Filho, Orlando; Vicari, Marcelo Ricardo; Nogaroto, Viviane; de Almeida, Mara Cristina; Artoni, Roberto Ferreira
2015-02-01
Most of the eukaryotic genome is composed of repeated sequences or multiple copies of DNA, which were once considered "junk DNA" and may be associated with heterochromatin. In this study, three populations of Astyanax aff. scabripinnis from the Brazilian rivers of Guaratinguetá and Pindamonhangaba (São Paulo) and a population from Maringá (Paraná) were analyzed concerning the localization of the nucleolar organizer regions (Ag-NORs), the As51 satellite DNA, the 18S ribosomal DNA (rDNA), and the 5S rDNA. Repeated sequences were also isolated and identified by the Cot-1 method, which indicated similarity (90%) with the LINE UnaL2 retrotransposon. Fluorescence in situ hybridization (FISH) showed the retrotransposon dispersed, with more concentrated markers in centromeric and telomeric chromosomal regions. These sequences were co-localized and interspersed with 18S and 5S rDNA and As51, as confirmed by fiber-FISH assay. The B chromosome found in these populations showed conspicuous hybridization with the LINE probe, which is also co-located with As51 sequences. The NORs were active at unique sites of a homologous pair in the three populations. There was no evidence in our analyses that transposable elements and repetitive DNA influenced the transcriptional regulation of ribosomal genes.
Naville, Magali; Gautheret, Daniel
2010-01-01
Bacterial transcription attenuation occurs through a variety of cis-regulatory elements that control gene expression in response to a wide range of signals. The signal-sensing structures in attenuators are so diverse and rapidly evolving that only a small fraction have been properly annotated and characterized to date. Here we apply a broad-spectrum detection tool in order to achieve a more complete view of the transcriptional attenuation complement of key bacterial species. Our protocol seeks gene families with an unusual frequency of 5' terminators found across multiple species. Many of the detected attenuators are part of annotated elements, such as riboswitches or T-boxes, which often operate through transcriptional attenuation. However, a significant fraction of candidates were not previously characterized in spite of their unmistakable footprint. We further characterized some of these new elements using sequence and secondary structure analysis. We also present elements that may control the expression of several non-homologous genes, suggesting co-transcription and response to common signals. An important class of such elements, which we called mobile attenuators, is provided by 3' terminators of insertion sequences or prophages that may be exapted as 5' regulators when inserted directly upstream of a cellular gene. We show here that attenuators involve a complex landscape of signal-detection structures spanning the entire bacterial domain. We discuss possible scenarios through which these diverse 5' regulatory structures may arise or evolve.
Fomukong, N G; Tang, T H; al-Maamary, S; Ibrahim, W A; Ramayah, S; Yates, M; Zainuddin, Z F; Dale, J W
1994-12-01
DNA fingerprinting with the insertion sequence IS6110 (also known as IS986) has become established as a major tool for investigating the spread of tuberculosis. Most strains of Mycobacterium tuberculosis have multiple copies of IS6110, but a small minority carry a single copy only. We have examined selected strains from Malaysia, Tanzania and Oman, in comparison with M. bovis isolates and BCG strains carrying one or two copies of IS6110. The insertion sequence appears to be present in the same position in all these strains, which suggests that in these organisms the element is defective in transposition and that the loss of transposability may have occurred at an early stage in the evolution of the M. tuberculosis complex.
Adamczuk, Marcin; Dziewit, Lukasz
2017-01-01
The draft genome of multidrug-resistant Aeromonas sp. ARM81 isolated from a wastewater treatment plant in Warsaw (Poland) was obtained. Sequence analysis revealed multiple genes conferring resistance to aminoglycosides, β-lactams or tetracycline. Three different β-lactamase genes were identified, including an extended-spectrum β-lactamase gene blaPER-1. The antibiotic susceptibility was experimentally tested. Genome sequencing also allowed us to investigate the plasmidome and transposable mobilome of ARM81. Four plasmids, of which two carry phenotypic modules (i.e., genes encoding a zinc transporter ZitB and a putative glucosyltransferase), and 28 putative transposase genes were identified. The mobility of three insertion sequences (isoforms of previously identified elements ISAs12, ISKpn9 and ISAs26) was confirmed using trap plasmids.
Genomic epidemiology of global Klebsiella pneumoniae carbapenemase (KPC)-producing Escherichia coli.
Stoesser, N; Sheppard, A E; Peirano, G; Anson, L W; Pankhurst, L; Sebra, R; Phan, H T T; Kasarskis, A; Mathers, A J; Peto, T E A; Bradford, P; Motyl, M R; Walker, A S; Crook, D W; Pitout, J D
2017-07-19
The dissemination of carbapenem resistance in Escherichia coli has major implications for the management of common infections. blaKPC, encoding a transmissible carbapenemase (KPC), has historically largely been associated with Klebsiella pneumoniae, a predominant plasmid (pKpQIL), and a specific transposable element (Tn4401, ~10 kb). Here we characterize the genetic features of blaKPC emergence in global E. coli, 2008-2013, using both long- and short-read whole-genome sequencing. Amongst 43/45 successfully sequenced blaKPC-E. coli strains, we identified substantial strain diversity (n = 21 sequence types, 18% of annotated genes in the core genome); substantial plasmid diversity (≥9 replicon types); and substantial blaKPC-associated, mobile genetic element (MGE) diversity (50% not within complete Tn4401 elements). We also found evidence of inter-species, regional and international plasmid spread. In several cases blaKPC was found on high copy number, small Col-like plasmids, previously associated with horizontal transmission of resistance genes in the absence of antimicrobial selection pressures. E. coli is a common human pathogen, but also a commensal in multiple environmental and animal reservoirs, and easily transmissible. The association of blaKPC with a range of MGEs previously linked to the successful spread of widely endemic resistance mechanisms (e.g. blaTEM, blaCTX-M) suggests that it may become similarly prevalent.
Sequence Composition and Gene Content of the Short Arm of Rye (Secale cereale) Chromosome 1
Fluch, Silvia; Kopecky, Dieter; Burg, Kornel; Šimková, Hana; Taudien, Stefan; Petzold, Andreas; Kubaláková, Marie; Platzer, Matthias; Berenyi, Maria; Krainer, Siegfried; Doležel, Jaroslav; Lelley, Tamas
2012-01-01
Background The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. Methodology/Principal Findings Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. Conclusions The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye. PMID:22328922
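The coverage figure quoted in the abstract above can be checked with simple arithmetic: dividing the total sequence yield by the fold coverage gives the chromosome-arm size that the authors' numbers imply. The following is a minimal sketch in Python; the function name and variable names are hypothetical, and the result is only the size implied by the two reported figures, not an independently measured value.

```python
def implied_target_size(total_bases: int, fold_coverage: float) -> float:
    """Return the target size in bp implied by a sequence yield and a fold coverage."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases / fold_coverage


# Both figures are taken from the abstract; everything else is illustrative.
TOTAL_BASES = 195_313_589   # bp of 454 sequence reported for 1RS
FOLD_COVERAGE = 0.43        # reported sequence coverage of the 1RS arm

arm_size_bp = implied_target_size(TOTAL_BASES, FOLD_COVERAGE)
print(f"Implied 1RS arm size: about {arm_size_bp / 1e6:.0f} Mb")
```

Under these two reported values the implied arm size is roughly 454 Mb.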
Muramoto, Hiroki; Yagi, Shintaro; Hirabayashi, Keiji; Sato, Shinya; Ohgane, Jun; Tanaka, Satoshi; Shiota, Kunio
2010-08-01
Embryonic stem cells (ESCs) have a distinctive epigenome, which includes their genome-wide DNA methylation modification status, as represented by the ESC-specific hypomethylation of tissue-dependent and differentially methylated regions (T-DMRs) of Pou5f1 and Nanog. Here, we conducted a genome-wide investigation of sequence characteristics associated with T-DMRs that were differentially methylated between ESCs and somatic cells, by focusing on transposable elements including short interspersed elements (SINEs), long interspersed elements (LINEs) and long terminal repeats (LTRs). We found that hypomethylated T-DMRs were predominantly present in SINE-rich/LINE-poor genomic loci. The enrichment for SINEs spread over 300 kb in cis and there existed SINE-rich genomic domains spreading continuously over 1 Mb, which contained multiple hypomethylated T-DMRs. The characterization of sequence information showed that the enriched SINEs were relatively CpG rich and belonged to specific subfamilies. A subset of the enriched SINEs were hypomethylated T-DMRs in ESCs at Dppa3 gene locus, although SINEs are overall methylated in both ESCs and the liver. In conclusion, we propose that SINE enrichment is the genomic property of regions harboring hypomethylated T-DMRs in ESCs, which is a novel aspect of the ESC-specific epigenomic information.
Liu, Yun-Hua; Zhang, Meiping; Wu, Chengcang; Huang, James J; Zhang, Hong-Bin
2014-01-01
Knowledge of how a genome is structured and organized from its constituent elements is crucial to understanding its biology and evolution. Here, we report the genome structuring and organization pattern as revealed by systems analysis of the sequences of three model species, Arabidopsis, rice and yeast, at the whole-genome and chromosome levels. We found that all fundamental function elements (FFE) constituting the genomes, including genes (GEN), DNA transposable elements (DTE), retrotransposable elements (RTE), simple sequence repeats (SSR), and (or) low complexity repeats (LCR), are structured in a nonrandom and correlative manner, thus leading to a hypothesis that the DNA of the species is structured as a linear "jigsaw puzzle". Furthermore, we showed that different FFE differ in their importance in the formation and evolution of the DNA jigsaw puzzle structure between species. DTE and RTE play more important roles than GEN, LCR, and SSR in Arabidopsis, whereas GEN and RTE play more important roles than LCR, SSR, and DTE in rice. The genes having multiple recognized functions play more important roles than those having single functions. These results provide useful knowledge necessary for better understanding genome biology and evolution of the species and for effective molecular breeding of rice.
vonHoldt, Bridgett M; Ji, Sarah S; Aardema, Matthew L; Stahler, Daniel; Udell, Monique A R; Sinsheimer, Janet S
2018-06-01
In canines, transposon dynamics have been associated with a hyper-social behavioral syndrome, although the functional mechanism has yet to be described. We investigate the epigenetic and transcriptional consequences of these behavior-associated mobile element insertions in dogs and Yellowstone wolves. We posit that the transposons themselves may not be the causative feature; rather, their transcriptional regulation may exert the functional impact. We survey four outlier transposons associated with hyper-sociability, with the expectation that they are targeted for epigenetic silencing. We predict hyper-methylation of mobile element insertions (MEIs), suggestive that the epigenetic silencing of and not the MEIs themselves may be driving dysregulation of nearby genes. We found that transposon-derived sequences are significantly hyper-methylated, regardless of their copy number or species. Further, we have assessed transcriptome sequence data and found evidence that mobile element insertions impact the expression levels of six genes (WBSCR17, LIMK1, GTF2I, WBSCR27, BAZ1B, and BCL7B), all of which have known roles in human Williams-Beuren syndrome due to changes in copy number, typically hemizygosity. Although further evidence is needed, our results suggest that a few insertions alter local expression at multiple genes, likely through a cis-regulatory mechanism that excludes proximal methylation.
Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis
2005-06-13
Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS>200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.
Gowda, Malali
2016-01-01
Blast disease caused by the Magnaporthe species is a major factor affecting the productivity of rice, wheat and millets. This study was aimed at generating genomic information for rice and non-rice Magnaporthe isolates to understand the extent of genetic variation. We have sequenced the whole genome of the Magnaporthe isolates, infecting rice (leaf and neck), finger millet (leaf and neck), foxtail millet (leaf) and buffel grass (leaf). Rice and finger millet isolates infecting both leaf and neck tissues were sequenced, since the damage and yield loss caused due to neck blast is much higher as compared to leaf blast. The genome-wide comparison was carried out to study the variability in gene content, candidate effectors, repeat element distribution, genes involved in carbohydrate metabolism and SNPs. The analysis of repeat element footprints revealed some genes such as naringenin, 2-oxoglutarate 3-dioxygenase being targeted by Pot2 and Occan, in isolates from different host species. Some repeat insertions were host-specific while other insertions were randomly shared between isolates. The distributions of repeat elements, secretory proteins, CAZymes and SNPs showed significant variation across host-specific lineages of Magnaporthe indicating an independent genome evolution orchestrated by multiple genomic factors. PMID:27658241
Yeakley, J M; Hedjran, F; Morfin, J P; Merillat, N; Rosenfeld, M G; Emeson, R B
1993-01-01
The calcitonin/calcitonin gene-related peptide (CGRP) primary transcript is alternatively spliced in thyroid C cells and neurons, resulting in the tissue-specific production of calcitonin and CGRP mRNAs. Analyses of mutated calcitonin/CGRP transcription units in permanently transfected cell lines have indicated that alternative splicing is regulated by a differential capacity to utilize the calcitonin-specific splice acceptor. The analysis of an extensive series of mutations suggests that tissue-specific regulation of calcitonin mRNA production does not depend on the presence of a single, unique cis-active element but instead appears to be a consequence of suboptimal constitutive splicing signals. While only those mutations that altered constitutive splicing signals affected splice choices, the action of multiple regulatory sequences cannot be formally excluded. Further, we have identified a 13-nucleotide purine-rich element from a constitutive exon that, when placed in exon 4, entirely switches splice site usage in CGRP-producing cells. These data suggest that specific exon recruitment sequences, in combination with other constitutive elements, serve an important function in exon recognition. These results are consistent with the hypothesis that tissue-specific alternative splicing of the calcitonin/CGRP primary transcript is mediated by cell-specific differences in components of the constitutive splicing machinery. PMID:8413203
Dillon, Laura; Collins, Meaghan; Conway, Maura; Cunningham, Kate
2013-01-01
Three experiments examined the implicit learning of sequences under conditions in which the elements comprising a sequence were equated in terms of reinforcement probability. In Experiment 1 cotton-top tamarins (Saguinus oedipus) experienced a five-element sequence displayed serially on a touch screen in which reinforcement probability was equated across elements at .16 per element. Tamarins demonstrated learning of this sequence with higher latencies during a random test as compared to baseline sequence training. In Experiments 2 and 3, manipulations of the procedure used in the first experiment were undertaken to rule out a confound owing to the fact that the elements in Experiment 1 bore different temporal relations to the intertrial interval (ITI), an inhibitory period. The results of Experiments 2 and 3 indicated that the implicit learning observed in Experiment 1 was not due to temporal proximity between some elements and the inhibitory ITI. The results taken together support two conclusions: first, that tamarins engaged in sequence learning whether or not there was contingent reinforcement for learning the sequence, and second, that this learning was not due to subtle differences in associative strength between the elements of the sequence. PMID:23344718
Jay, Z J; Beam, J P; Dohnalkova, A; Lohmayer, R; Bodle, B; Planer-Friedrich, B; Romine, M; Inskeep, W P
2015-09-01
Thermoproteales (phylum Crenarchaeota) populations are abundant in high-temperature (>70°C) environments of Yellowstone National Park (YNP) and are important in mediating the biogeochemical cycles of sulfur, arsenic, and carbon. The objectives of this study were to determine the specific physiological attributes of the isolate Pyrobaculum yellowstonensis strain WP30, which was obtained from an elemental sulfur sediment (Joseph's Coat Hot Spring [JCHS], 80°C, pH 6.1, 135 μM As) and relate this organism to geochemical processes occurring in situ. Strain WP30 is a chemoorganoheterotroph and requires elemental sulfur and/or arsenate as an electron acceptor. Growth in the presence of elemental sulfur and arsenate resulted in the formation of thioarsenates and polysulfides. The complete genome of this organism was sequenced (1.99 Mb, 58% G+C content), revealing numerous metabolic pathways for the degradation of carbohydrates, amino acids, and lipids. Multiple dimethyl sulfoxide-molybdopterin (DMSO-MPT) oxidoreductase genes, which are implicated in the reduction of sulfur and arsenic, were identified. Pathways for the de novo synthesis of nearly all required cofactors and metabolites were identified. The comparative genomics of P. yellowstonensis and the assembled metagenome sequence from JCHS showed that this organism is highly related (∼95% average nucleotide sequence identity) to in situ populations. The physiological attributes and metabolic capabilities of P. yellowstonensis provide an important foundation for developing an understanding of the distribution and function of these populations in YNP. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Brütting, Christine; Emmer, Alexander; Kornhuber, Malte; Staege, Martin S
2016-08-01
Although multiple sclerosis (MS) is one of the most common central nervous system diseases in young adults, little is known about its etiology. Several human endogenous retroviruses (ERVs) are considered to play a role in MS. We are interested in which ERVs can be identified in the vicinity of MS-associated genetic markers to find potential initiators of MS. We analysed the chromosomal regions surrounding 58 single nucleotide polymorphisms (SNPs) that are associated with MS identified in one of the last major genome-wide association studies. We scanned these regions for putative endogenous retrovirus sequences with large open reading frames (ORFs). We observed that more retrovirus-related putative ORFs exist in the relatively close vicinity of SNP marker indices in multiple sclerosis compared to control SNPs. We found very high homologies to HERV-K, HCML-ARV, XMRV, Galidia ERV, HERV-H/env62 and XMRV-like mouse endogenous retrovirus mERV-XL. The associated genes (CYP27B1, CD6, CD58, MPV17L2, IL12RB1, CXCR5, PTGER4, TAGAP, TYK2, ICAM3, CD86, GALC, GPR65 as well as the HLA DRB1*1501) are mainly involved in the immune system, but also in vitamin D regulation. The most frequently detected ERV sequences are related to the multiple sclerosis-associated retrovirus, the human immunodeficiency virus 1, HERV-K, and the Simian foamy virus. Our data show that there is a relation between MS-associated SNPs and the number of retroviral elements compared to control. Our data identify new ERV sequences that have not so far been associated with MS.
Concerted formation of macromolecular Suppressor–mutator transposition complexes
Raina, Ramesh; Schläppi, Michael; Karunanandaa, Balasulojini; Elhofy, Adam; Fedoroff, Nina
1998-01-01
Transposition of the maize Suppressor–mutator (Spm) transposon requires two element-encoded proteins, TnpA and TnpD. Although there are multiple TnpA binding sites near each element end, binding of TnpA to DNA is not cooperative, and the binding affinity is not markedly affected by the number of binding sites per DNA fragment. However, intermolecular complexes form cooperatively between DNA fragments with three or more TnpA binding sites. TnpD, itself not a sequence-specific DNA-binding protein, binds to TnpA and stabilizes the TnpA–DNA complex. The high redundancy of TnpA binding sites at both element ends and the protein–protein interactions between DNA-bound TnpA complexes and between these and TnpD imply a concerted transition of the element from a linear to a protein crosslinked transposition complex within a very narrow protein concentration range. PMID:9671711
Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti
2016-08-01
The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving in a concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location for this sequence was established for some species, but little molecular information was obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species aiming to identify evolutionary dynamics. For both species, two arrays were identified, a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller (named type-II) with truncated 5S rDNA gene plus short NTS that was considered a pseudogene. For type-I sequences, the gene-corresponding region contained the internal control region and poly-T motif and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From a chromosomal point of view, the type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.
Wallau, Gabriel Luz; Capy, Pierre; Loreto, Elgion; Le Rouzic, Arnaud; Hua-Van, Aurélie
2016-04-01
Transposable elements (TEs) are genomic repeated sequences that display complex evolutionary patterns. They are usually inherited vertically, but can occasionally be transmitted between sexually independent species, through so-called horizontal transposon transfers (HTTs). Recurrent HTTs are supposed to be essential in the life cycle of TEs, which are otherwise destined for eventual decay. HTTs also impact the host genome evolution. However, the extent of HTTs in eukaryotes is largely unknown, due to the lack of efficient, statistically supported methods that can be applied to multiple species sequence data sets. Here, we developed a new automated method available as an R package "vhica" that discriminates whether a given TE family was vertically or horizontally transferred, and potentially infers donor and receptor species. The method is well suited for TE sequences extracted from complete genomes, and applicable to multiple TEs and species at the same time. We first validated our method using Drosophila TE families with well-known evolutionary histories, displaying both HTTs and vertical transmission. We then tested 26 different lineages of mariner elements recently characterized in 20 Drosophila genomes, and found HTTs in 24 of them. Furthermore, several independent HTT events could often be detected within the same mariner lineage. The VHICA (Vertical and Horizontal Inheritance Consistence Analysis) method thus appears as a valuable tool to analyze the evolutionary history of TEs across a large range of species. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Roychoudhury, Aryadeep; Paul, Saikat; Basu, Supratim
2013-07-01
Salinity, drought and low temperature are the common forms of abiotic stress encountered by land plants. To cope with these adverse environmental factors, plants execute several physiological and metabolic responses. Both osmotic stress (elicited by water deficit or high salt) and cold stress increase the endogenous level of the phytohormone abscisic acid (ABA). ABA-dependent stomatal closure to reduce water loss is associated with small signaling molecules like nitric oxide, reactive oxygen species and cytosolic free calcium, and mediated by rapidly altering ion fluxes in guard cells. ABA also triggers the expression of osmotic stress-responsive (OR) genes, which usually contain single/multiple copies of cis-acting sequence called abscisic acid-responsive element (ABRE) in their upstream regions, mostly recognized by the basic leucine zipper-transcription factors (TFs), namely, ABA-responsive element-binding protein/ABA-binding factor. Another conserved sequence called the dehydration-responsive element (DRE)/C-repeat, responding to cold or osmotic stress, but not to ABA, occurs in some OR promoters, to which the DRE-binding protein/C-repeat-binding factor binds. In contrast, there are genes or TFs containing both DRE/CRT and ABRE, which can integrate input stimuli from salinity, drought, cold and ABA signaling pathways, thereby enabling cross-tolerance to multiple stresses. A strong candidate that mediates such cross-talk is calcium, which serves as a common second messenger for abiotic stress conditions and ABA. The present review highlights the involvement of both ABA-dependent and ABA-independent signaling components and their interaction or convergence in activating the stress genes. We restrict our discussion to salinity, drought and cold stress.
2010-01-01
Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
The contribution of Alu elements to mutagenic DNA double-strand break repair.
Morales, Maria E; White, Travis B; Streva, Vincent A; DeFreece, Cecily B; Hedges, Dale J; Deininger, Prescott L
2015-03-01
Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. 
This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both the rate and nature of DNA repair events.
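The divergence percentages quoted in the abstract above can be made concrete with a small calculation. The sketch below, in Python, computes percent divergence as the share of mismatched positions between two pre-aligned, equal-length sequences. This is an illustrative definition only: the abstract does not state how divergence was measured, the function name is hypothetical, and the 300 bp element length and the toy sequences are assumptions chosen for the example.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of mismatched positions between two aligned, equal-length sequences.

    Gap characters are compared like any other symbol, so a gap against a base
    counts as a mismatch (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * mismatches / len(seq_a)


# Toy 300 bp "element" (assumed length, not an Alu sequence) with two substitutions.
reference = "ACGT" * 75
variant = list(reference)
variant[10] = "T"    # reference has G at index 10
variant[200] = "C"   # reference has A at index 200
variant = "".join(variant)

print(f"{percent_divergence(reference, variant):.2f}% divergence")
```

With these toy inputs, two substitutions in 300 positions give about 0.67%, which is the same order as the 0.7% divergence that the abstract reports as already reducing recombination.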
Denz, Christopher R; Zhang, Chi; Jia, Pingping; Du, Jianfeng; Huang, Xupei; Dube, Syamalima; Thomas, Anish; Poiesz, Bernard J; Dube, Dipak K
2011-09-01
Tropomyosins are a family of actin-binding proteins that show cell-specific diversity by a combination of multiple genes and alternative RNA splicing. Of the 4 different tropomyosin genes, TPM4 plays a pivotal role in myofibrillogenesis as well as cardiac contractility in amphibians. In this study, we amplified and sequenced the upstream regulatory region of the TPM4 gene from both normal and mutant axolotl hearts. To identify the cis-elements that are essential for the expression of the TPM4, we created various deletion mutants of the TPM4 promoter DNA, inserted the deleted segments into PGL3 vector, and performed promoter-reporter assay using luciferase as the reporter gene. Comparison of sequences of the promoter region of the TPM4 gene from normal and mutant axolotl revealed no mutations in the promoter sequence of the mutant TPM4 gene. CArG box elements that are generally involved in controlling the expression of several other muscle-specific gene promoters were not found in the upstream regulatory region of the TPM4 gene. In deletion experiments, loss of activity of the reporter gene was noted upon deletion which was then restored upon further deletion suggesting the presence of both positive and negative cis-elements in the upstream regulatory region of the TPM4 gene. We believe that this is the first axolotl promoter that has ever been cloned and studied with clear evidence that it functions in mammalian cell lines. Although striated muscle-specific cis-acting elements are absent from the promoter region of TPM4 gene, our results suggest the presence of positive and negative cis-elements in the promoter region, which in conjunction with positive and negative trans-elements may be involved in regulating the expression of TPM4 gene in a tissue-specific manner.
Canale, Aneth S; Venev, Sergey V; Whitfield, Troy W; Caffrey, Daniel R; Marasco, Wayne A; Schiffer, Celia A; Kowalik, Timothy F; Jensen, Jeffrey D; Finberg, Robert W; Zeldovich, Konstantin B; Wang, Jennifer P; Bolon, Daniel N A
2018-04-13
The fitness effects of synonymous mutations can provide insights into biological and evolutionary mechanisms. We analyzed the experimental fitness effects of all single-nucleotide mutations, including synonymous substitutions, at the beginning of the influenza A virus hemagglutinin (HA) gene. Many synonymous substitutions were deleterious both in bulk competition and for individually isolated clones. Investigating protein and RNA levels of a subset of individually expressed HA variants revealed that multiple biochemical properties contribute to the observed experimental fitness effects. Our results indicate that a structural element in the HA segment viral RNA may influence fitness. Examination of naturally evolved sequences in human hosts indicates a preference for the unfolded state of this structural element compared to that found in swine hosts. Our overall results reveal that synonymous mutations may have greater fitness consequences than indicated by simple models of sequence conservation, and we discuss the implications of this finding for commonly used evolutionary tests and analyses. Copyright © 2018. Published by Elsevier Ltd.
Farias, Pedro; Espírito Santo, Christophe; Branco, Rita; Francisco, Romeu; Santos, Susana; Hansen, Lars; Sorensen, Soren
2015-01-01
Microorganisms are responsible for multiple antibiotic resistances that have been associated with resistance/tolerance to heavy metals, with consequences to public health. Many genes conferring these resistances are located on mobile genetic elements, easily exchanged among phylogenetically distant bacteria. The objective of the present work was to isolate arsenic-, antimonite-, and antibiotic-resistant strains and to determine the existence of plasmids harboring antibiotic/arsenic/antimonite resistance traits in phenotypically resistant strains, in a nonanthropogenically impacted environment. The hydrothermal Lucky Strike field in the Azores archipelago (North Atlantic, between 11°N and 38°N), at the Mid-Atlantic Ridge, protected under the OSPAR Convention, was sampled as a metal-rich pristine environment. A total of 35 strains from 8 different species were isolated in the presence of arsenate, arsenite, and antimonite. ACR3 and arsB genes were amplified from the sediment's total DNA, and 4 isolates also carried ACR3 genes. Phenotypic multiple resistances were found in all strains, and 7 strains had recoverable plasmids. Purified plasmids were sequenced by Illumina and assembled by EDENA V3, and contig annotation was performed using the “Rapid Annotation using the Subsystems Technology” server. Determinants of resistance to copper, zinc, cadmium, cobalt, and chromium as well as to the antibiotics β-lactams and fluoroquinolones were found in the 3 sequenced plasmids. Genes coding for heavy metal resistance and antibiotic resistance in the same mobile element were found, suggesting the possibility of horizontal gene transfer and distribution of these resistances in the bacterial population. PMID:25636836
FARME DB: a functional antibiotic resistance element database
Wallace, James C.; Port, Jesse A.; Smith, Marissa N.; Faustman, Elaine M.
2017-01-01
Antibiotic resistance (AR) is a major global public health threat but few resources exist that catalog AR genes outside of a clinical context. Current AR sequence databases are assembled almost exclusively from genomic sequences derived from clinical bacterial isolates and thus do not include many microbial sequences derived from environmental samples that confer resistance in functional metagenomic studies. These environmental metagenomic sequences often show little or no similarity to AR sequences from clinical isolates using standard classification criteria. In addition, existing AR databases provide no information about flanking sequences containing regulatory or mobile genetic elements. To help address this issue, we created an annotated database of DNA and protein sequences derived exclusively from environmental metagenomic sequences showing AR in laboratory experiments. Our Functional Antibiotic Resistant Metagenomic Element (FARME) database is a compilation of publicly available DNA sequences and predicted protein sequences conferring AR as well as regulatory elements, mobile genetic elements and predicted proteins flanking antibiotic resistant genes. FARME is the first database to focus on functional metagenomic AR gene elements and provides a resource to better understand AR in the 99% of bacteria which cannot be cultured and the relationship between environmental AR sequences and antibiotic resistant genes derived from cultured isolates. Database URL: http://staff.washington.edu/jwallace/farme PMID:28077567
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants
Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun
2017-01-01
Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation. PMID:29064432
2012-01-01
Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave the way for bulk production of environment friendly biopesticides. PMID:22958331
Batty, Elizabeth M; Chaemchuen, Suwittra; Blacksell, Stuart; Richards, Allen L; Paris, Daniel; Bowden, Rory; Chan, Caroline; Lachumanan, Ramkumar; Day, Nicholas; Donnelly, Peter; Chen, Swaine; Salje, Jeanne
2018-06-01
Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.
NASA Astrophysics Data System (ADS)
Tene, Yair; Tene, Noam; Tene, G.
1993-08-01
An interactive data fusion methodology of video, audio, and nonlinear structural dynamic analysis for potential application in forensic engineering is presented. The methodology was developed and successfully demonstrated in the analysis of the collapse of a heavy transportable bridge during preparation for testing. Multiple bridge element failures were identified after the collapse, including fracture, cracks and rupture of high performance structural materials. Videotape recording by a hand-held camcorder was the only source of information about the collapse sequence. The interactive data fusion methodology resulted in extracting relevant information from the videotape and from dynamic nonlinear structural analysis, leading to a full account of the sequence of events during the bridge collapse.
A modularized pulse programmer for NMR spectroscopy
NASA Astrophysics Data System (ADS)
Mao, Wenping; Bao, Qingjia; Yang, Liang; Chen, Yiqun; Liu, Chaoyang; Qiu, Jianqing; Ye, Chaohui
2011-02-01
A modularized pulse programmer for an NMR spectrometer is described. It consists of a networked PCI-104 single-board computer and a field programmable gate array (FPGA). The PCI-104 is dedicated to translating the pulse sequence elements from the host computer into 48-bit binary words and downloading these words to the FPGA, while the FPGA functions as a sequencer to execute these binary words. High-resolution NMR spectra obtained on a home-built spectrometer with four pulse programmers working concurrently demonstrate the effectiveness of the pulse programmer. Advantages of the module include (1) once designed it can be duplicated and used to construct a scalable NMR/MRI system with multiple transmitter and receiver channels, (2) it is a totally programmable system in which all specific applications are determined by software, and (3) it provides enough reserve for possible new pulse sequences.
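The translation of pulse sequence elements into 48-bit words can be illustrated with a small sketch. The abstract does not state how the 48 bits are laid out, so the layout below (16 bits of output-line states followed by a 32-bit duration in clock ticks) is purely hypothetical, as are the function names; it only shows the general idea of packing one sequence element into a fixed-width word.

```python
# Hypothetical layout (NOT taken from the paper):
#   bits 47..32  state of 16 digital output lines
#   bits 31..0   duration of the element in clock ticks

def pack_word(outputs, ticks):
    """Pack one pulse sequence element into a 48-bit integer."""
    if not 0 <= outputs < (1 << 16):
        raise ValueError("outputs must fit in 16 bits")
    if not 0 <= ticks < (1 << 32):
        raise ValueError("ticks must fit in 32 bits")
    return (outputs << 32) | ticks

def unpack_word(word):
    """Recover (outputs, ticks) from a 48-bit integer."""
    return (word >> 32) & 0xFFFF, word & 0xFFFFFFFF

def to_bytes(word):
    """Serialize a 48-bit word as 6 big-endian bytes for download."""
    return word.to_bytes(6, "big")
```

A sequencer reading such words would hold the output lines in the given state for the given number of ticks before fetching the next word; the real instrument may organize its words quite differently.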
A User's Guide to the Encyclopedia of DNA Elements (ENCODE)
2011-01-01
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome. PMID:21526222
Bentley, Stephen D.; Corton, Craig; Brown, Susan E.; Barron, Andrew; Clark, Louise; Doggett, Jon; Harris, Barbara; Ormond, Doug; Quail, Michael A.; May, Georgiana; Francis, David; Knudson, Dennis; Parkhill, Julian; Ishimaru, Carol A.
2008-01-01
Clavibacter michiganensis subsp. sepedonicus is a plant-pathogenic bacterium and the causative agent of bacterial ring rot, a devastating agricultural disease under strict quarantine control and zero tolerance in the seed potato industry. This organism appears to be largely restricted to an endophytic lifestyle, proliferating within plant tissues and unable to persist in the absence of plant material. Analysis of the genome sequence of C. michiganensis subsp. sepedonicus and comparison with the genome sequences of related plant pathogens revealed a dramatic recent evolutionary history. The genome contains 106 insertion sequence elements, which appear to have been active in extensive rearrangement of the chromosome compared to that of Clavibacter michiganensis subsp. michiganensis. There are 110 pseudogenes with overrepresentation in functions associated with carbohydrate metabolism, transcriptional regulation, and pathogenicity. Genome comparisons also indicated that there is substantial gene content diversity within the species, probably due to differential gene acquisition and loss. These genomic features and evolutionary dating suggest that there was recent adaptation for life in a restricted niche where nutrient diversity and perhaps competition are low, correlated with a reduced ability to exploit previously occupied complex niches outside the plant. Toleration of factors such as multiplication and integration of insertion sequence elements, genome rearrangements, and functional disruption of many genes and operons seems to indicate that there has been general relaxation of selective pressure on a large proportion of the genome. PMID:18192393
NASA Astrophysics Data System (ADS)
Gopinath, T.; Veglia, Gianluigi
2016-06-01
Conventional multidimensional magic angle spinning (MAS) solid-state NMR (ssNMR) experiments detect the signal arising from the decay of a single coherence transfer pathway (FID), resulting in one spectrum per acquisition time. Recently, we introduced two new strategies, namely DUMAS (DUal acquisition Magic Angle Spinning) and MEIOSIS (Multiple ExperIments via Orphan SpIn operatorS), that enable the simultaneous acquisitions of multidimensional ssNMR experiments using multiple coherence transfer pathways. Here, we combined the main elements of DUMAS and MEIOSIS to harness both orphan spin operators and residual polarization and increase the number of simultaneous acquisitions. We show that it is possible to acquire up to eight two-dimensional experiments using four acquisition periods per each scan. This new suite of pulse sequences, called MAeSTOSO for Multiple Acquisitions via Sequential Transfer of Orphan Spin pOlarization, relies on residual polarization of both 13C and 15N pathways and combines low- and high-sensitivity experiments into a single pulse sequence using one receiver and commercial ssNMR probes. The acquisition of multiple experiments does not affect the sensitivity of the main experiment; rather it recovers the lost coherences that are discarded, resulting in a significant gain in experimental time. Both merits and limitations of this approach are discussed.
Sveinsson, Saemundur; Gill, Navdeep; Kane, Nolan C; Cronk, Quentin
2013-07-24
Transposable elements (TEs) and other repetitive elements are a large and dynamically evolving part of eukaryotic genomes, especially in plants where they can account for a significant proportion of genome size. Their dynamic nature gives them the potential for use in identifying and characterizing crop germplasm. However, their repetitive nature makes them challenging to study using conventional methods of molecular biology. Next generation sequencing and new computational tools have greatly facilitated the investigation of TE variation within species and among closely related species. (i) We generated low-coverage Illumina whole genome shotgun sequencing reads for multiple individuals of cacao (Theobroma cacao) and related species. These reads were analysed using both an alignment/mapping approach and a de novo (graph based clustering) approach. (ii) A standard set of ultra-conserved orthologous sequences (UCOS) standardized TE data between samples and provided phylogenetic information on the relatedness of samples. (iii) The mapping approach proved highly effective within the reference species but underestimated TE abundance in interspecific comparisons relative to the de novo methods. (iv) Individual T. cacao accessions have unique patterns of TE abundance indicating that the TE composition of the genome is evolving actively within this species. (v) LTR/Gypsy elements are the most abundant, comprising c.10% of the genome. (vi) Within T. cacao the retroelement families show an order of magnitude greater sequence variability than the DNA transposon families. (vii) Theobroma grandiflorum has a similar TE composition to T. cacao, but the related genus Herrania is rather different, with LTRs making up a lower proportion of the genome, perhaps because of a massive presence (c. 20%) of distinctive low complexity satellite-like repeats in this genome. 
(i) Short read alignment/mapping to reference TE contigs provides a simple and effective method of investigating intraspecific differences in TE composition. It is not appropriate for comparing repetitive elements across the species boundaries, for which de novo methods are more appropriate. (ii) Individual T. cacao accessions have unique spectra of TE composition indicating active evolution of TE abundance within this species. TE patterns could potentially be used as a "fingerprint" to identify and characterize cacao accessions.
Sroubek, Jakub; Krishnan, Yamini; McDonald, Thomas V.
2013-01-01
Human ether-á-gogo-related gene (HERG) encodes a potassium channel that is highly susceptible to deleterious mutations resulting in susceptibility to fatal cardiac arrhythmias. Most mutations adversely affect HERG channel assembly and trafficking. Why the channel is so vulnerable to missense mutations is not well understood. Since nothing is known of how mRNA structural elements factor in channel processing, we synthesized a codon-modified HERG cDNA (HERG-CM) where the codons were synonymously changed to reduce GC content, secondary structure, and rare codon usage. HERG-CM produced typical IKr-like currents; however, channel synthesis and processing were markedly different. Translation efficiency was reduced for HERG-CM, as determined by heterologous expression, in vitro translation, and polysomal profiling. Trafficking efficiency to the cell surface was greatly enhanced, as assayed by immunofluorescence, subcellular fractionation, and surface labeling. Chimeras of HERG-NT/CM indicated that trafficking efficiency was largely dependent on 5′ sequences, while translation efficiency involved multiple areas. These results suggest that HERG translation and trafficking rates are independently governed by noncoding information in various regions of the mRNA molecule. Noncoding information embedded within the mRNA may play a role in the pathogenesis of hereditary arrhythmia syndromes and could provide an avenue for targeted therapeutics.—Sroubek, J., Krishnan, Y., McDonald, T V. Sequence- and structure-specific elements of HERG mRNA determine channel synthesis and trafficking efficiency. PMID:23608144
Staufen1 senses overall transcript secondary structure to regulate translation
Ricci, Emiliano P; Kucukural, Alper; Cenik, Can; Mercier, Blandine C; Singh, Guramrit; Heyer, Erin E; Ashar-Patel, Ami; Peng, Lingtao; Moore, Melissa J
2015-01-01
Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3′ untranslated regions (UTRs) or in ‘strongly distal’ 3′ UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3′ UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins. PMID:24336223
DynaMIT: the dynamic motif integration toolkit
Dassi, Erik; Quattrone, Alessandro
2016-01-01
De-novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequence sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, there is the need to apply multiple tools to the same sequences and merge the obtained results. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform that allows users to identify motifs employing multiple algorithms, integrate them by means of a user-selected strategy and visualize results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org. PMID:26253738
Liu, Xiaochuan; Freitas, Jaime; Zheng, Dinghai; Oliveira, Marta S; Hoque, Mainul; Martins, Torcato; Henriques, Telmo; Tian, Bin; Moreira, Alexandra
2017-12-01
Alternative polyadenylation (APA) is a mechanism that generates multiple mRNA isoforms with different 3'UTRs and/or coding sequences from a single gene. Here, using 3' region extraction and deep sequencing (3'READS), we have systematically mapped cleavage and polyadenylation sites (PASs) in Drosophila melanogaster, expanding the total repertoire of PASs previously identified for the species, especially those located in A-rich genomic sequences. Cis-element analysis revealed distinct sequence motifs around fly PASs when compared to mammalian ones, including the greater enrichment of upstream UAUA elements and the less prominent presence of downstream UGUG elements. We found that over 75% of mRNA genes in Drosophila melanogaster undergo APA. The head tissue tends to use distal PASs when compared to the body, leading to preferential expression of APA isoforms with long 3'UTRs as well as with distal terminal exons. The distance between the APA sites and intron location of PAS are important parameters for APA difference between body and head, suggesting distinct PAS selection contexts. APA analysis of the RpII215 C4 mutant strain, which harbors a mutant RNA polymerase II (RNAPII) with a slower elongation rate, revealed that a 50% decrease in transcriptional elongation rate leads to a mild trend of more usage of proximal, weaker PASs, both in 3'UTRs and in introns, consistent with the "first come, first served" model of APA regulation. However, this trend was not observed in the head, suggesting a different regulatory context in neuronal cells. Together, our data expand the PAS collection for Drosophila melanogaster and reveal a tissue-specific effect of APA regulation by RNAPII elongation rate. © 2017 Liu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Chishima, Takafumi; Iwakiri, Junichi
2018-01-01
It has been recently suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by some examples such as the human endogenous retrovirus subfamily H (HERVH) elements contained within lncRNAs and expressed specifically in human embryonic stem cells (hESCs), as required to maintain hESC identity. There are at least two unanswered questions about all lncRNAs. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identify TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher’s exact tests were performed to verify whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE–tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression. PMID:29315213
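The Fisher's exact test step described above can be sketched for a single TE-tissue pair. This is a minimal illustration, not the authors' pipeline: the 2x2 table layout (lncRNAs with or without the TE, tissue-specific or not), the one-sided alternative, and the function name are all assumptions made for the example.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table

                     tissue-specific   not tissue-specific
        TE present          a                  b
        TE absent           c                  d

    Returns P(X >= a) under the hypergeometric null, i.e. the
    probability of seeing at least `a` tissue-specific lncRNAs
    among those carrying the TE if TE presence were unrelated
    to tissue specificity.
    """
    row1 = a + b          # lncRNAs carrying the TE
    col1 = a + c          # tissue-specific lncRNAs
    n = a + b + c + d     # all lncRNAs
    total = comb(n, col1)
    upper = min(row1, col1)
    favourable = 0
    for k in range(a, upper + 1):
        # math.comb returns 0 when the lower index exceeds the upper
        favourable += comb(row1, k) * comb(n - row1, col1 - k)
    return favourable / total
```

For example, `fisher_exact_greater(3, 0, 0, 3)` gives 1/20. Because one such test is run per TE-tissue pair, a real analysis would also need a multiple-testing correction, which is outside this sketch.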
Sadofsky, M; Connelly, S; Manley, J L; Alwine, J C
1985-01-01
Our previous studies of the 3'-end processing of simian virus 40 late mRNAs indicated the existence of an essential element (or elements) downstream of the AAUAAA signal. We report here the use of transient expression analysis to study a functional element which we located within the sequence AGGUUUUUU, beginning 59 nucleotides downstream of the recognized signal AAUAAA. Deletion of this element resulted in (i) at least a 75% drop in 3'-end processing at the normal site and (ii) appearance of readthrough transcripts with alternate 3' ends. Some flexibility in the downstream position of this element relative to the AAUAAA was noted by deletion analysis. Using computer sequence comparison, we located homologous regions within downstream sequences of other genes, suggesting a generalized sequence element. In addition, specific complementarity is noted between the downstream element and U4 RNA. The possibility that this complementarity could participate in 3'-end site selection is discussed. PMID:3016512
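The kind of sequence search described above (a downstream element at some distance from AAUAAA) can be sketched as a simple exact-match scan. This is only an illustration: the function name, the distance convention (from the end of the signal to the start of the element), and the `max_distance` window are assumptions, and the original comparison looked for homologous rather than identical regions.

```python
def find_downstream_elements(rna, signal="AAUAAA",
                             element="AGGUUUUUU", max_distance=100):
    """Return (signal_start, element_start, distance) tuples.

    `distance` counts nucleotides between the end of the signal
    and the start of the element (an assumed convention). DNA
    input is accepted and converted to RNA letters.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    sig_pos = rna.find(signal)
    while sig_pos != -1:
        sig_end = sig_pos + len(signal)
        el_pos = rna.find(element, sig_end)
        # collect every element occurrence inside the window
        while el_pos != -1 and el_pos - sig_end <= max_distance:
            hits.append((sig_pos, el_pos, el_pos - sig_end))
            el_pos = rna.find(element, el_pos + 1)
        sig_pos = rna.find(signal, sig_pos + 1)
    return hits
```

Allowing a window rather than a fixed offset reflects the positional flexibility noted in the deletion analysis; tolerating mismatches would require a scoring scheme that the abstract does not specify.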
Multiple-mouse MRI with multiple arrays of receive coils.
Ramirez, Marc S; Esparza-Coss, Emilio; Bankson, James A
2010-03-01
Compared to traditional single-animal imaging methods, multiple-mouse MRI has been shown to dramatically improve imaging throughput and reduce the potentially prohibitive cost for instrument access. To date, up to a single radiofrequency coil has been dedicated to each animal being simultaneously scanned, thus limiting the sensitivity, flexibility, and ultimate throughput. The purpose of this study was to investigate the feasibility of multiple-mouse MRI with a phased-array coil dedicated to each animal. A dual-mouse imaging system, consisting of a pair of two-element phased-array coils, was developed and used to achieve acceleration factors greater than the number of animals scanned at once. By simultaneously scanning two mice with a retrospectively gated cardiac cine MRI sequence, a 3-fold acceleration was achieved with signal-to-noise ratio in the heart that is equivalent to that achieved with an unaccelerated scan using a commercial mouse birdcage coil. (c) 2010 Wiley-Liss, Inc.
Badr, Eman; ElHefnawi, Mahmoud; Heath, Lenwood S
2016-01-01
Alternative splicing is a vital process for regulating gene expression and promoting proteomic diversity. It plays a key role in tissue-specific expressed genes. This specificity is mainly regulated by splicing factors that bind to specific sequences called splicing regulatory elements (SREs). Here, we report a genome-wide analysis to study alternative splicing on multiple tissues, including brain, heart, liver, and muscle. We propose a pipeline to identify differential exons across tissues and hence tissue-specific SREs. In our pipeline, we utilize the DEXSeq package along with our previously reported algorithms. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, we identified 28,100 differentially used exons across the four tissues. We identified tissue-specific exonic splicing enhancers that overlap with various previously published experimental and computational databases. A complicated exonic enhancer regulatory network was revealed, where multiple exonic enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the exonic enhancers are found to be co-occurring with multiple exonic silencers and vice versa, which demonstrates a complicated relationship between tissue-specific exonic enhancers and silencers.
A multiple multicomponent approach to chimeric peptide-peptoid podands.
Rivera, Daniel G; León, Fredy; Concepción, Odette; Morales, Fidel E; Wessjohann, Ludger A
2013-05-10
The success of multi-armed, peptide-based receptors in supramolecular chemistry traditionally is not only based on the sequence but equally on an appropriate positioning of various peptidic chains to create a multivalent array of binding elements. As a faster, more versatile and alternative access toward (pseudo)peptidic receptors, a new approach based on multiple Ugi four-component reactions (Ugi-4CR) is proposed as a means of simultaneously incorporating several binding and catalytic elements into organizing scaffolds. By employing α-amino acids either as the amino or acid components of the Ugi-4CRs, this multiple multicomponent process allows for the one-pot assembly of podands bearing chimeric peptide-peptoid chains as appended arms. Tripodal, bowl-shaped, and concave polyfunctional skeletons are employed as topologically varied platforms for positioning the multiple peptidic chains formed by Ugi-4CRs. In a similar approach, steroidal building blocks with several axially-oriented isocyano groups are synthesized and utilized to align the chimeric chains with conformational constraints, thus providing an alternative to the classical peptido-steroidal receptors. The branched and hybrid peptide-peptoid appendages allow new possibilities for both rational design and combinatorial production of synthetic receptors. The concept is also expandable to other multicomponent reactions. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Multiple conformations are a conserved and regulatory feature of the RB1 5′ UTR
Kutchko, Katrina M.; Sanders, Wes; Ziehr, Ben; Phillips, Gabriela; Solem, Amanda; Halvorsen, Matthew; Weeks, Kevin M.; Moorman, Nathaniel
2015-01-01
Folding to a well-defined conformation is essential for the function of structured ribonucleic acids (RNAs) like the ribosome and tRNA. Structured elements in the untranslated regions (UTRs) of specific messenger RNAs (mRNAs) are known to control expression. The importance of unstructured regions adopting multiple conformations, however, is still poorly understood. High-resolution SHAPE-directed Boltzmann suboptimal sampling of the Homo sapiens Retinoblastoma 1 (RB1) 5′ UTR yields three distinct conformations compatible with the experimental data. Private single nucleotide variants (SNVs) identified in two patients with retinoblastoma each collapse the structural ensemble to a single but distinct well-defined conformation. The RB1 5′ UTRs from Bos taurus (cow) and Trichechus manatus latirostris (manatee) are divergent in sequence from H. sapiens (human) yet maintain structural compatibility with high-probability base pairs. SHAPE chemical probing of the cow and manatee RB1 5′ UTRs reveals that they also adopt multiple conformations. Luciferase reporter assays reveal that 5′ UTR mutations alter RB1 expression. In a traditional model of disease, causative SNVs disrupt a key structural element in the RNA. For the subset of patients with heritable retinoblastoma-associated SNVs in the RB1 5′ UTR, the absence of multiple structures is likely causative of the cancer. Our data therefore suggest that selective pressure will favor multiple conformations in eukaryotic UTRs to regulate expression. PMID:25999316
Requena, Jose M; Folgueira, Cristina; López, Manuel C; Thomas, M Carmen
2008-06-02
Protozoan parasites of the genus Leishmania are causative agents of a diverse spectrum of human diseases collectively known as leishmaniasis. These eukaryotic pathogens that diverged early from the main eukaryotic lineage possess a number of unusual genomic, molecular and biochemical features. The completion of the genome projects for three Leishmania species has generated invaluable information enabling a direct analysis of genome structure and organization. By using DNA macroarrays, made with Leishmania infantum genomic clones and hybridized with total DNA from the parasite, we identified a clone containing a repeated sequence. An analysis of the recently completed genome sequence of L. infantum, using this repeated sequence as bait, led to the identification of a new class of repeated elements that are interspersed along the different L. infantum chromosomes. These elements turned out to be homologues of SIDER2 sequences, which were recently identified in the Leishmania major genome; thus, we adopted this nomenclature for the Leishmania elements described herein. Since SIDER2 elements are very heterogeneous in sequence, their precise identification is rather laborious. We have characterized 54 LiSIDER2 elements in chromosome 32 and 27 in chromosome 20. The mean size for these elements is 550 bp and their sequence is G+C rich (mean value of 66.5%). On the basis of sequence similarity, these elements can be grouped in subfamilies that show a remarkable relationship of proximity, i.e. SIDER2s of a given subfamily are located close together in a chromosomal region without intercalating elements. For comparative purposes, we have identified the SIDER2 elements existing in L. major and Leishmania braziliensis chromosomes 32. While SIDER2 elements are highly conserved both in number and location between L. infantum and L. major, no such conservation exists when comparing with SIDER2s in L. braziliensis chromosome 32.
SIDER2 elements constitute a relevant piece in the Leishmania genome organization. Sequence characteristics, genomic distribution and evolutionarily conservation of SIDER2s are suggestive of relevant functions for these elements in Leishmania. Apart from a proved involvement in post-transcriptional mechanisms of gene regulation, SIDER2 elements could be involved in DNA amplification processes and, perhaps, in chromosome segregation as centromeric sequences.
Ho, Pak Leung; Lo, Wai U.; Yeung, Man Kiu; Lin, Chi Ho; Chow, Kin Hung; Ang, Irene; Tong, Amy Hin Yan; Bao, Jessie Yun-Juan; Lok, Si; Lo, Janice Yee Chi
2011-01-01
Background The emergence of plasmid-mediated carbapenemases, such as NDM-1, in Enterobacteriaceae is a major public health issue. Since they mediate resistance to virtually all β-lactam antibiotics and there is often co-resistance to other antibiotic classes, the therapeutic options for infections caused by these organisms are very limited. Methodology We characterized the first NDM-1 producing E. coli isolate recovered in Hong Kong. The plasmid encoding the metallo-β-lactamase gene was sequenced. Principal Findings The plasmid, pNDM-HK, readily transferred to E. coli J53 at high frequencies. It belongs to the broad host range IncL/M incompatibility group and is 88803 bp in size. Sequence alignment showed that pNDM-HK has a 55 kb backbone which shared 97% homology with pEL60 originating from the plant pathogen, Erwinia amylovora, in Lebanon and a 28.9 kb variable region. The plasmid backbone includes the mucAB genes mediating ultraviolet light resistance. The 28.9 kb region has a composite transposon-like structure which includes intact or truncated genes associated with resistance to β-lactams (bla TEM-1, bla NDM-1, Δbla DHA-1), aminoglycosides (aacC2, armA), sulphonamides (sul1) and macrolides (mel, mph2). It also harbors the following mobile elements: IS26, ISCR1, tnpU, tnpAcp2, tnpD, ΔtnpATn1 and insL. Certain blocks within the 28.9 kb variable region had homology with the corresponding sequences in the widely disseminated plasmids, pCTX-M3, pMUR050 and pKP048 originating from bacteria in Poland in 1996, in Spain in 2002 and in China in 2006, respectively. Significance The genetic support of the NDM-1 gene suggests that it has evolved through complex pathways. The association with a broad host range plasmid and multiple mobile genetic elements explains its observed horizontal mobility in multiple bacterial taxa. PMID:21445317
Surface apposition and multiple cell contacts promote myoblast fusion in Drosophila flight muscles
Dhanyasi, Nagaraju; Segal, Dagan; Shimoni, Eyal; Shinder, Vera
2015-01-01
Fusion of individual myoblasts to form multinucleated myofibers constitutes a widely conserved program for growth of the somatic musculature. We have used electron microscopy methods to study this key form of cell–cell fusion during development of the indirect flight muscles (IFMs) of Drosophila melanogaster. We find that IFM myoblast–myotube fusion proceeds in a stepwise fashion and is governed by apparent cross talk between transmembrane and cytoskeletal elements. Our analysis suggests that cell adhesion is necessary for bringing myoblasts to within a minimal distance from the myotubes. The branched actin polymerization machinery acts subsequently to promote tight apposition between the surfaces of the two cell types and formation of multiple sites of cell–cell contact, giving rise to nascent fusion pores whose expansion establishes full cytoplasmic continuity. Given the conserved features of IFM myogenesis, this sequence of cell interactions and membrane events and the mechanistic significance of cell adhesion elements and the actin-based cytoskeleton are likely to represent general principles of the myoblast fusion process. PMID:26459604
Chandrashekar, Darshan Shimoga; Dey, Poulami; Acharya, Kshitish K.
2015-01-01
Background Genome-wide repeat sequences, such as LINEs, SINEs and LTRs share a considerable part of the mammalian nuclear genomes. These repeat elements seem to be important for multiple functions including the regulation of transcription initiation, alternative splicing and DNA methylation. But it is not possible to study all repeats and, hence, it would help to short-list before exploring their potential functional significance via experimental studies and/or detailed in silico analyses. Result We developed the ‘Genomic Repeat Element Analyzer for Mammals’ (GREAM) for analysis, screening and selection of potentially important mammalian genomic repeats. This web-server offers many novel utilities. For example, this is the only tool that can reveal a categorized list of specific types of transposons, retro-transposons and other genome-wide repetitive elements that are statistically over-/under-represented in regions around a set of genes, such as those expressed differentially in a disease condition. The output displays the position and frequency of identified elements within the specified regions. In addition, GREAM offers two other types of analyses of genomic repeat sequences: a) enrichment within chromosomal region(s) of interest, and b) comparative distribution across the neighborhood of orthologous genes. GREAM successfully short-listed a repeat element (MER20) known to contain functional motifs. In other case studies, we could use GREAM to short-list repetitive elements in the azoospermia factor a (AZFa) region of the human Y chromosome and those around the genes associated with rat liver injury. GREAM could also identify five over-represented repeats around some of the human and mouse transcription factor coding genes that had conserved expression patterns across the two species. 
Conclusion GREAM has been developed to provide an impetus to research on the role of repetitive sequences in mammalian genomes by offering easy selection of more interesting repeats in various contexts/regions. GREAM is freely available at http://resource.ibab.ac.in/GREAM/. PMID:26208093
Palindromic repetitive DNA elements with coding potential in Methanocaldococcus jannaschii.
Suyama, Mikita; Lathe, Warren C; Bork, Peer
2005-10-10
We have identified 141 novel palindromic repetitive elements in the genome of euryarchaeon Methanocaldococcus jannaschii. The total length of these elements is 14.3kb, which corresponds to 0.9% of the total genomic sequence and 6.3% of all extragenic regions. The elements can be divided into three groups (MJRE1-3) based on the sequence similarity. The low sequence identity within each of the groups suggests rather old origin of these elements in M. jannaschii. Three MJRE2 elements were located within the protein coding regions without disrupting the coding potential of the host genes, indicating that insertion of repeats might be a widespread mechanism to enhance sequence diversity in coding regions.
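To make the term "palindromic" concrete: a DNA element is palindromic when it reads the same 5' to 3' on both strands, that is, when it equals its own reverse complement. The minimal Python sketch below tests that property. The function names and the example sequences are hypothetical illustrations and are not the detection method used in the study, which the abstract does not describe.

```python
# Illustrative sketch only; helper names and test sequences are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for candidate in ("GAATTC", "GATTACA"):
        print(candidate, is_palindromic(candidate))
```

Real repetitive elements are rarely perfect palindromes, so a practical search would tolerate mismatches and a non-palindromic spacer; this sketch checks only the exact case.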
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.
Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario
2011-01-01
Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Global Organization of a Positive-strand RNA Virus Genome
Wu, Baodong; Grigull, Jörg; Ore, Moriam O.; Morin, Sylvie; White, K. Andrew
2013-01-01
The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is that they are local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2′-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context. PMID:23717202
A bipolar population counter using wave pipelining to achieve 2.5 x normal clock frequency
NASA Technical Reports Server (NTRS)
Wong, Derek C.; De Micheli, Giovanni; Flynn, Michael J.; Huston, Robert E.
1992-01-01
Wave pipelining is a technique for pipelining digital systems that can increase clock frequency in practical circuits without increasing the number of storage elements. In wave pipelining, multiple coherent waves of data are sent through a block of combinational logic by applying new inputs faster than the delay through the logic. The throughput of a 63-b CML population counter was increased from 97 to 250 MHz using wave pipelining. The internal circuit is flowthrough combinational logic. Novel CAD methods have balanced all input-to-output paths to about the same delay. This allows multiple data waves to propagate in sequence when the circuit is clocked faster than its propagation delay.
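The reported figures can be checked with a short calculation. The Python sketch below computes the clock speedup from the two frequencies given in the abstract and estimates how many data waves are in flight at once. It assumes, purely for illustration, that the logic delay equals one period of the original 97 MHz clock; the abstract does not state the actual delay, and the function names are hypothetical.

```python
# Illustrative arithmetic only; function names are assumptions.

def clock_speedup(base_mhz: float, wave_mhz: float) -> float:
    """Ratio of the wave-pipelined clock to the conventional clock."""
    return wave_mhz / base_mhz


def waves_in_flight(logic_delay_ns: float, clock_mhz: float) -> float:
    """Average number of data waves inside the logic block at one time."""
    clock_period_ns = 1000.0 / clock_mhz
    return logic_delay_ns / clock_period_ns


if __name__ == "__main__":
    # Assumption: propagation delay equals one period of the 97 MHz clock.
    assumed_delay_ns = 1000.0 / 97.0
    print(round(clock_speedup(97.0, 250.0), 2))
    print(round(waves_in_flight(assumed_delay_ns, 250.0), 2))
```

Under that assumption both quantities come to about 2.58, which agrees with the roughly 2.5-fold figure in the title.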
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Kanhayuwa, Lakkhana; Coutts, Robert H. A.
2016-01-01
Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4–14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140–493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3’-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50–65% and 60–75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259–343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity. PMID:27736869
Self-adaptive calibration for staring infrared sensors
NASA Astrophysics Data System (ADS)
Kendall, William B.; Stocker, Alan D.
1993-10-01
This paper presents a new, self-adaptive technique for the correction of non-uniformities (fixed-pattern noise) in high-density infrared focal-plane detector arrays. We have developed a new approach to non-uniformity correction in which we use multiple image frames of the scene itself, and take advantage of the aim-point wander caused by jitter, residual tracking errors, or deliberately induced motion. Such wander causes each detector in the array to view multiple scene elements, and each scene element to be viewed by multiple detectors. It is therefore possible to formulate (and solve) a set of simultaneous equations from which correction parameters can be computed for the detectors. We have tested our approach with actual images collected by the ARPA-sponsored MUSIC infrared sensor. For these tests we employed a 60-frame (0.75-second) sequence of terrain images for which an out-of-date calibration was deliberately used. The sensor was aimed at a point on the ground via an operator-assisted tracking system having a maximum aim-point wander on the order of ten pixels. With these data, we were able to improve the calibration accuracy by a factor of approximately 100.
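The idea that one scene element seen by two detectors yields an equation relating those detectors can be shown with a deliberately simplified case. The Python sketch below assumes a one-dimensional array, an offset-only error model (no gain error), noise-free data, and exactly two frames separated by a one-pixel shift. All of these are assumptions made for illustration, and the function name is hypothetical; the paper's actual formulation is not given in the abstract and is certainly more general.

```python
# Simplified illustration only; the model and names are assumptions.
# Model: frame_a[d] = scene[d]     + offset[d]
#        frame_b[d] = scene[d + 1] + offset[d]   (aim point moved one pixel)

def estimate_relative_offsets(frame_a, frame_b):
    """Return each detector's offset relative to detector 0."""
    relative = [0.0]
    for d in range(len(frame_a) - 1):
        # Detector d in frame_b and detector d+1 in frame_a both viewed
        # scene[d + 1], so the scene term cancels in the difference:
        # frame_b[d] - frame_a[d + 1] = offset[d] - offset[d + 1]
        difference = frame_b[d] - frame_a[d + 1]
        relative.append(relative[-1] - difference)
    return relative


if __name__ == "__main__":
    scene = [10.0, 12.0, 15.0, 11.0, 9.0]      # made-up scene radiances
    offsets = [0.5, -1.0, 2.0, 0.0]            # made-up detector offsets
    frame_a = [scene[d] + offsets[d] for d in range(4)]
    frame_b = [scene[d + 1] + offsets[d] for d in range(4)]
    print(estimate_relative_offsets(frame_a, frame_b))
```

Only offset differences are recoverable in this model, because adding a constant to every offset and subtracting it from the scene leaves the frames unchanged. With many frames, noise and gain terms, the same relations would be solved together, for example by least squares.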
Alternative DNA structure formation in the mutagenic human c-MYC promoter
del Mundo, Imee Marie A.; Zewail-Foote, Maha; Kerwin, Sean M.
2017-01-01
Abstract Mutation ‘hotspot’ regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. PMID:28334873
NASA Technical Reports Server (NTRS)
Smith, T. B., Jr.; Lala, J. H.
1983-01-01
The basic organization of the fault-tolerant multiprocessor (FTMP) is that of a general-purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements, together with hardware voting, are employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.
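The voting step can be illustrated in software. The Python sketch below takes the outputs of replicated units, returns the majority value, and reports which units disagreed so that they could be reconfigured out. It is a hedged software analogy with hypothetical names; FTMP performs voting in hardware, and its actual mechanism is not described in the abstract.

```python
# Software analogy only; FTMP votes in hardware. Names are assumptions.
from collections import Counter


def majority_vote(outputs):
    """Return (majority_value, indices_of_disagreeing_units).

    Raises ValueError when no value is held by a strict majority.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replicated outputs")
    faulty = [i for i, v in enumerate(outputs) if v != value]
    return value, faulty


if __name__ == "__main__":
    # Unit 2 has a single fault; the vote masks it and identifies the unit.
    print(majority_vote([7, 7, 9]))
```

With three replicas a single faulty unit is always outvoted, which is why a second fault can be tolerated only after the first has been repaired.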
Evolutionary interaction between W/Y chromosome and transposable elements.
Śliwińska, Ewa B; Martyka, Rafał; Tryjanowski, Piotr
2016-06-01
The W/Y chromosome is unique among chromosomes as it does not recombine in its mature form. The main side effect of cessation of recombination is evolutionary instability and degeneration of the W/Y chromosome, or frequent W/Y chromosome turnovers. Another important feature of W/Y chromosome degeneration is transposable element (TE) accumulation. Transposon accumulation has been confirmed for all W/Y chromosomes that have been sequenced so far. Models of W/Y chromosome instability include the assemblage of deleterious mutations in protein coding genes, but do not include the influence of transposable elements that are accumulated gradually in the non-recombining genome. The multiple roles of genomic TEs, and the interactions between retrotransposons and genome defense proteins are currently being studied intensively. Small RNAs originating from retrotransposon transcripts appear to be, in some cases, the only mediators of W/Y chromosome function. Based on a review of the most recent publications, we present current knowledge on W/Y evolution in relation to retrotransposable element accumulation.
Yin, Hao; Du, Jianchang; Li, Leiting; Jin, Cong; Fan, Lian; Li, Meng; Wu, Jun; Zhang, Shaoling
2014-01-01
Cassandra transposable elements belong to a specific group of terminal-repeat retrotransposons in miniature (TRIM). Although Cassandra TRIM elements have been found in almost all vascular plants, detailed investigations on the nature, abundance, amplification timeframe, and evolution have not been performed in an individual genome. We therefore conducted a comprehensive analysis of Cassandra retrotransposons using the newly sequenced pear genome along with four other Rosaceae species, including apple, peach, mei, and woodland strawberry. Our data reveal several interesting findings for this particular retrotransposon family: 1) A large number of the intact copies contain three, four, or five long terminal repeats (LTRs) (∼20% in pear); 2) intact copies and solo LTRs with or without target site duplications are both common (∼80% vs. 20%) in each genome; 3) the elements exhibit an overall unbiased distribution among the chromosomes; 4) the elements are most successfully amplified in pear (5,032 copies); and 5) the evolutionary relationships of these elements vary among different lineages, species, and evolutionary time. These results indicate that Cassandra retrotransposons contain more complex structures (elements with multiple LTRs) than what we have known previously, and that frequent interelement unequal recombination followed by transposition may play a critical role in shaping and reshaping host genomes. Thus this study provides insights into the property, propensity, and molecular mechanisms governing the formation and amplification of Cassandra retrotransposons, and enhances our understanding of the structural variation, evolutionary history, and transposition process of LTR retrotransposons in plants. PMID:24899073
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk
Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B.; Huson, Daniel H.; Frick, Julia-Stefanie
2016-01-01
Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. PMID:27071651
Huang, Zhihong; Pan, Mengjia; Zhu, Silei; Zhang, Hao; Wu, Wenbi; Yuan, Meijin; Yang, Kai
2017-03-01
Baculoviridae is a family of insect-specific viruses that have a circular double-stranded DNA genome packaged within a rod-shaped capsid. The mechanism of baculovirus nucleocapsid assembly remains unclear. Previous studies have shown that deletion of the ac83 gene of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) blocks viral nucleocapsid assembly. Interestingly, the ac83-encoded protein Ac83 is not a component of the nucleocapsid, implying a particular role for ac83 in nucleocapsid assembly that may be independent of its protein product. To examine this possibility, Ac83 synthesis was disrupted by insertion of a chloramphenicol resistance gene into its coding sequence or by deleting its promoter and translation start codon. Both mutants produced progeny viruses normally, indicating that the Ac83 protein is not required for nucleocapsid assembly. Subsequently, complementation assays showed that the production of progeny viruses required the presence of ac83 in the AcMNPV genome instead of its presence in trans. Therefore, we reasoned that ac83 is involved in nucleocapsid assembly via an internal cis-acting element, which we named the nucleocapsid assembly-essential element (NAE). The NAE was identified to lie within nucleotides 1651 to 1850 of ac83 and had 8 conserved A/T-rich regions. Sequences homologous to the NAE were found only in alphabaculoviruses and have a conserved positional relationship with another essential cis-acting element that was recently identified. The identification of the NAE may help to connect the data of viral cis-acting elements and related proteins in the baculovirus nucleocapsid assembly, which is important for elucidating DNA-protein interaction events during this process.
IMPORTANCE Virus nucleocapsid assembly usually requires specific cis-acting elements in the viral genome for various processes, such as the selection of the viral genome from the cellular nucleic acids, the cleavage of concatemeric viral genome replication intermediates, and the encapsidation of the viral genome into procapsids. In linear DNA viruses, such elements are generally located at the ends of the viral genome; however, most of these elements remain unidentified in circular DNA viruses (including baculovirus) due to their circular genomic conformation. Here, we identified a nucleocapsid assembly-essential element in the AcMNPV (the archetype of baculovirus) genome. This finding provides an important reference for studies of nucleocapsid assembly-related elements in baculoviruses and other circular DNA viruses. Moreover, as most of the previous studies of baculovirus nucleocapsid assembly have been focused on viral proteins, our study provides a novel entry point to investigate this mechanism via cis-acting elements in the viral genome. Copyright © 2017 American Society for Microbiology.
Gao, Feng; Simon, Anne E.
2016-01-01
Programmed -1 ribosomal frameshifting (-1 PRF) is used by many positive-strand RNA viruses for translation of required products. Despite extensive studies, it remains unresolved how cis-elements just downstream of the recoding site promote a precise level of frameshifting. The Umbravirus Pea enation mosaic virus RNA2 expresses its RNA polymerase by -1 PRF of the 5′-proximal ORF (p33). Three hairpins located in the vicinity of the recoding site are phylogenetically conserved among Umbraviruses. The central Recoding Stimulatory Element (RSE), located downstream of the p33 termination codon, is a large hairpin with two asymmetric internal loops. Mutational analyses revealed that sequences throughout the RSE and the RSE lower stem (LS) structure are important for frameshifting. SHAPE probing of mutants indicated the presence of higher order structure, and sequences in the LS may also adopt an alternative conformation. Long-distance pairing between the RSE and a 3′ terminal hairpin was less critical when the LS structure was stabilized. A basal level of frameshifting occurring in the absence of the RSE increases to 72% of wild-type when a hairpin upstream of the slippery site is also deleted. These results suggest that suppression of frameshifting may be needed in the absence of an active RSE conformation. PMID:26578603
Simon, J R; Treger, J M; McEntee, K
1999-02-01
Transcription of the polyubiquitin gene UBI4 of Saccharomyces cerevisiae is strongly induced by a variety of environmental stresses, such as heat shock, nutrient depletion and exposure to DNA-damaging agents. This transcriptional response of UBI4 is likely to be the primary mechanism for increasing the pool of ubiquitin for degradation of stress-damaged proteins. Deletion and promoter fusion studies of the 5' regulatory sequences indicated that two different elements, heat shock elements (HSEs) and stress response elements (STREs), contributed independently to heat shock regulation of the UBI4 gene. In the absence of HSEs, STRE sequences localized to the intervals -264 to -238 and -215 to -183 were needed for stress control of transcription after heat shock. Site-directed mutagenesis of the STRE (AG4) at -252 to -248 abolished heat shock induction of UBI4 transcription. Northern analysis demonstrated that cells containing either a temperature-sensitive HSF or non-functional Msn2p/Msn4p transcription factors induced high levels of UBI4 transcripts after heat shock. In cells deficient in both heat stress pathways, heat-induced UBI4 transcript levels were considerably lower but not abolished, suggesting a role for another factor(s) in stress control of its expression.
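Locating a short element such as the STRE (written AG4 above, i.e. AGGGG) in an upstream region, and reporting it in negative promoter coordinates, can be sketched as follows. The Python example is a minimal illustration: the function name, the convention that the last base of the supplied upstream sequence is position -1, and the test sequences are all assumptions, and the sequences are invented rather than taken from the UBI4 promoter.

```python
# Illustrative sketch only; names, coordinate convention and sequences
# are assumptions, not data from the study.
STRE_FORWARD = "AGGGG"   # AG4 on the given strand
STRE_REVERSE = "CCCCT"   # the same element on the opposite strand


def find_stre(upstream: str):
    """Return (start, end, strand) tuples in promoter coordinates.

    The last base of `upstream` is taken to be position -1.
    """
    seq = upstream.upper()
    length = len(seq)
    width = len(STRE_FORWARD)
    hits = []
    for i in range(length - width + 1):
        window = seq[i:i + width]
        if window == STRE_FORWARD:
            hits.append((i - length, i - length + width - 1, "+"))
        elif window == STRE_REVERSE:
            hits.append((i - length, i - length + width - 1, "-"))
    return hits


if __name__ == "__main__":
    print(find_stre("TTTTTTTTTTAGGGGTTTTT"))
```

A five-base window matches the width of the mutated site reported at -252 to -248.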
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
Lammers, P J; McLaughlin, S; Papin, S; Trujillo-Provencio, C; Ryncarz, A J
1990-01-01
An 11-kbp DNA element of unknown function interrupts the nifD gene in vegetative cells of Anabaena sp. strain PCC 7120. In developing heterocysts the nifD element excises from the chromosome via site-specific recombination between short repeat sequences that flank the element. The nucleotide sequence of the nifH-proximal half of the element was determined to elucidate the genetic potential of the element. Four open reading frames with the same relative orientation as the nifD element-encoded xisA gene were identified in the sequenced region. Each of the open reading frames was preceded by a reasonable ribosome-binding site and had biased codon utilization preferences consistent with low levels of expression. Open reading frame 3 was highly homologous with three cytochrome P-450 omega-hydroxylase proteins and showed regional homology to functionally significant domains common to the cytochrome P-450 superfamily. The sequence encoding open reading frame 2 was the most highly conserved portion of the sequenced region based on heterologous hybridization experiments with three genera of heterocystous cyanobacteria. PMID:2123860
ERIC Educational Resources Information Center
Noell, George H.; Gresham, Frank M.
2001-01-01
Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…
Continuous Influx of Genetic Material from Host to Virus Populations
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane
2016-01-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors. PMID:26829124
Continuous Influx of Genetic Material from Host to Virus Populations.
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane; Cordaux, Richard; Herniou, Elisabeth A
2016-02-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors.
Yin, Huaqun; Zhang, Xian; Li, Xiaoqi; He, Zhili; Liang, Yili; Guo, Xue; Hu, Qi; Xiao, Yunhua; Cong, Jing; Ma, Liyuan; Niu, Jiaojiao; Liu, Xueduan
2014-07-04
Acidithiobacillus thiooxidans (A. thiooxidans), a chemolithoautotrophic extremophile, is widely used in the industrial recovery of copper (bioleaching or biomining). The organism grows and survives by autotrophically utilizing energy derived from the oxidation of elemental sulfur and reduced inorganic sulfur compounds (RISCs). However, the lack of genetic manipulation systems has restricted our exploration of its physiology. With the development of high-throughput sequencing technology, the whole genome sequence analysis of A. thiooxidans has allowed preliminary models to be built for genes/enzymes involved in key energy pathways like sulfur oxidation. The genome of A. thiooxidans A01 was sequenced and annotated. It contains key sulfur oxidation enzymes involved in the oxidation of elemental sulfur and RISCs, such as sulfur dioxygenase (SDO), sulfide quinone reductase (SQR), thiosulfate:quinone oxidoreductase (TQO), tetrathionate hydrolase (TetH), sulfur oxidizing protein (Sox) system and their associated electron transport components. Also, the sulfur oxygenase reductase (SOR) gene was detected in the draft genome sequence of A. thiooxidans A01, and multiple sequence alignment was performed to explore the function of groups of related protein sequences. In addition, another putative pathway was found in the cytoplasm of A. thiooxidans, which catalyzes sulfite to sulfate as the final product by phosphoadenosine phosphosulfate (PAPS) reductase and adenylylsulfate (APS) kinase. This differs from its closest relative, Acidithiobacillus caldus, in which this step is performed by sulfate adenylyltransferase (SAT). Furthermore, real-time quantitative PCR analysis showed that most of the sulfur oxidation genes were more strongly expressed in the S0 medium than in the Na2S2O3 medium at the mid-log phase. A sulfur oxidation model of A. thiooxidans A01 has been constructed based on previous studies from other sulfur oxidizing strains and its genome sequence analyses, providing insights into our understanding of its physiology and further analysis of potential functions of key sulfur oxidation genes.
2014-01-01
Background Acidithiobacillus thiooxidans (A. thiooxidans), a chemolithoautotrophic extremophile, is widely used in the industrial recovery of copper (bioleaching or biomining). The organism grows and survives by autotrophically utilizing energy derived from the oxidation of elemental sulfur and reduced inorganic sulfur compounds (RISCs). However, the lack of genetic manipulation systems has restricted our exploration of its physiology. With the development of high-throughput sequencing technology, the whole genome sequence analysis of A. thiooxidans has allowed preliminary models to be built for genes/enzymes involved in key energy pathways like sulfur oxidation. Results The genome of A. thiooxidans A01 was sequenced and annotated. It contains key sulfur oxidation enzymes involved in the oxidation of elemental sulfur and RISCs, such as sulfur dioxygenase (SDO), sulfide quinone reductase (SQR), thiosulfate:quinone oxidoreductase (TQO), tetrathionate hydrolase (TetH), sulfur oxidizing protein (Sox) system and their associated electron transport components. Also, the sulfur oxygenase reductase (SOR) gene was detected in the draft genome sequence of A. thiooxidans A01, and multiple sequence alignment was performed to explore the function of groups of related protein sequences. In addition, another putative pathway was found in the cytoplasm of A. thiooxidans, which catalyzes sulfite to sulfate as the final product by phosphoadenosine phosphosulfate (PAPS) reductase and adenylylsulfate (APS) kinase. This differs from its closest relative, Acidithiobacillus caldus, in which this step is performed by sulfate adenylyltransferase (SAT). Furthermore, real-time quantitative PCR analysis showed that most of the sulfur oxidation genes were more strongly expressed in the S0 medium than in the Na2S2O3 medium at the mid-log phase. Conclusion A sulfur oxidation model of A. thiooxidans A01 has been constructed based on previous studies from other sulfur oxidizing strains and its genome sequence analyses, providing insights into our understanding of its physiology and further analysis of potential functions of key sulfur oxidation genes. PMID:24993543
“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files
2014-01-01
Background Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results We have developed a perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. 
In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
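The parsing step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' perl tool and does not reproduce its copy-merging rules; it only reads the whitespace-separated columns of a RepeatMasker .out file and groups hits that share the same query sequence and the same value in the final ID column, as a rough stand-in for "hits belonging to one copy". The function names (`parse_out`, `group_copies`) and the sample records (`TE_A`, `TE_B`, `chr1`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical .out content: three header lines followed by three made-up hits.
SAMPLE = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin  end (left)  ID

 1200   10.0  1.0  0.5  chr1       100   400  (9600) +  TE_A      DNA/hAT        1    300   (700)   1
 3000    8.0  0.0  0.0  chr1       900  1400  (8600) +  TE_A      DNA/hAT      500   1000     (0)   1
  500   15.0  2.0  1.0  chr1      2000  2199  (7801) C  TE_B      SINE/Alu     (50)   250     51    2
"""


def parse_out(text):
    """Return one dict per hit line of RepeatMasker .out text."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Hit lines begin with an integer score; header and blank lines do not.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        hits.append({
            "query": fields[4],
            "start": int(fields[5]),
            "end": int(fields[6]),
            "strand": "-" if fields[8] == "C" else "+",  # "C" marks the complement strand
            "repeat": fields[9],
            "family": fields[10],
            "id": fields[14],
        })
    return hits


def group_copies(hits):
    """Group hits by (query, ID); an assumption standing in for copy reconstruction."""
    copies = defaultdict(list)
    for hit in hits:
        copies[(hit["query"], hit["id"])].append(hit)
    return dict(copies)


if __name__ == "__main__":
    for key, fragments in group_copies(parse_out(SAMPLE)).items():
        covered = sum(f["end"] - f["start"] + 1 for f in fragments)
        print(key, fragments[0]["repeat"], len(fragments), "fragment(s),", covered, "bp")
```

Grouping on the ID column keeps the sketch short; a real analysis would also need to handle the cases the abstract lists, such as large insertions or deletions and separate consensus sequences for one full-length element.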
Unusually long-lived pause required for regulation of a Rho-dependent transcription terminator.
Hollands, Kerry; Sevostiyanova, Anastasia; Groisman, Eduardo A
2014-05-13
Up to half of all transcription termination events in bacteria rely on the RNA-dependent helicase Rho. However, the nucleic acid sequences that promote Rho-dependent termination remain poorly characterized. Defining the molecular determinants that confer Rho-dependent termination is especially important for understanding how such terminators can be regulated in response to specific signals. Here, we identify an extraordinarily long-lived pause at the site where Rho terminates transcription in the 5'-leader region of the Mg(2+) transporter gene mgtA in Salmonella enterica. We dissect the sequence elements required for prolonged pausing in the mgtA leader and establish that the remarkable longevity of this pause is required for a riboswitch to stimulate Rho-dependent termination in the mgtA leader region in response to Mg(2+) availability. Unlike Rho-dependent terminators described previously, where termination occurs at multiple pause sites, there is a single site of transcription termination directed by Rho in the mgtA leader. Our data suggest that Rho-dependent termination events that are subject to regulation may require elements distinct from those operating at constitutive Rho-dependent terminators.
Self-expressive Dictionary Learning for Dynamic 3D Reconstruction.
Zheng, Enliang; Ji, Dinghuang; Dunn, Enrique; Frahm, Jan-Michael
2017-08-22
We target the problem of sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as the problem of dictionary learning, where the dictionary is defined as an aggregation of the temporally varying 3D structures. Given the smooth motion of dynamic objects, we observe any element in the dictionary can be well approximated by a sparse linear combination of other elements in the same dictionary (i.e. self-expression). Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams, as well as motion smoothness across estimates from common video sources. We further analyze the reconstructability of our approach under different capture scenarios, and its comparison and relation to existing methods. Experimental results on large amounts of synthetic data as well as real imagery demonstrate the effectiveness of our approach.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Short interspersed elements (SINEs) are a major source of canine genomic diversity.
Wang, Wei; Kirkness, Ewen F
2005-12-01
SINEs are retrotransposons that have enjoyed remarkable reproductive success during the course of mammalian evolution, and have played a major role in shaping mammalian genomes. Previously, an analysis of survey-sequence data from an individual dog (a poodle) indicated that canine genomes harbor a high frequency of alleles that differ only by the absence or presence of a SINEC_Cf repeat. Comparison of this survey-sequence data with a draft genome sequence of a distinct dog (a boxer) has confirmed this prediction, and revealed the chromosomal coordinates for >10,000 loci that are bimorphic for SINEC_Cf insertions. Analysis of SINE insertion sites from the genomes of nine additional dogs indicates that 3%-5% are absent from either the poodle or boxer genome sequences--suggesting that an additional 10,000 bimorphic loci could be readily identified in the general dog population. We describe a methodology that can be used to identify these loci, and could be adapted to exploit these bimorphic loci for genotyping purposes. Approximately half of all annotated canine genes contain SINEC_Cf repeats, and these elements are occasionally transcribed. When transcribed in the antisense orientation, they provide splice acceptor sites that can result in incorporation of novel exons. The high frequency of bimorphic SINE insertions in the dog population is predicted to provide numerous examples of allele-specific transcription patterns that will be valuable for the study of differential gene expression among multiple dog breeds.
Alu expression in human cell lines and their retrotranspositional potential.
Oler, Andrew J; Traina-Dorge, Stephen; Derbes, Rebecca S; Canella, Donatella; Cairns, Brad R; Roy-Engel, Astrid M
2012-06-20
The vast majority of the 1.1 million Alu elements are retrotranspositionally inactive, where only a few loci referred to as 'source elements' can generate new Alu insertions. The first step in identifying the active Alu sources is to determine the loci transcribed by RNA polymerase III (pol III). Previous genome-wide analyses from normal and transformed cell lines identified multiple Alu loci occupied by pol III factors, making them candidate source elements. Analysis of the data from these genome-wide studies determined that the majority of pol III-bound Alus belonged to the older subfamilies Alu S and Alu J, which varied between cell lines from 62.5% to 98.7% of the identified loci. The pol III-bound Alus were further scored for estimated retrotransposition potential (ERP) based on the absence or presence of selected sequence features associated with Alu retrotransposition capability. Our analyses indicate that most of the pol III-bound Alu loci candidates identified lack the sequence characteristics important for retrotransposition. These data suggest that Alu expression likely varies by cell type, growth conditions and transformation state. This variation could extend to where the same cell lines in different laboratories present different Alu expression patterns. The vast majority of Alu loci potentially transcribed by RNA pol III lack important sequence features for retrotransposition and the majority of potentially active Alu loci in the genome (scored high ERP) belong to young Alu subfamilies. Our observations suggest that in an in vivo scenario, the contribution of Alu activity on somatic genetic damage may significantly vary between individuals and tissues.
DeFranco, D; Yamamoto, K R
1986-01-01
The expression of genes fused downstream of the Moloney murine sarcoma virus (MoMSV) long terminal repeat is stimulated by glucocorticoids. We mapped the glucocorticoid response element that conferred this hormonal regulation and found that it is a hormone-dependent transcriptional enhancer, designated Sg; it resides within DNA fragments that also carry a previously described enhancer element (B. Levinson, G. Khoury, G. Vande Woude, and P. Gruss, Nature [London] 295:568-572, 1982), here termed Sa, whose activity is independent of the hormone. Nuclease footprinting revealed that purified glucocorticoid receptor bound at multiple discrete sites within and at the borders of the tandemly repeated sequence motif that defines Sa. The Sa and Sg activities stimulated the apparent efficiency of cognate or heterologous promoter utilization, individually providing modest enhancement and in concert yielding higher levels of activity. A deletion mutant lacking most of the tandem repeat but retaining a single receptor footprint sequence lost Sa activity but still conferred Sg activity. The two enhancer components could also be distinguished physiologically: both were operative within cultured rat fibroblasts, but only Sg activity was detectable in rat exocrine pancreas cells. Therefore, the sequence determinants of Sa and Sg activity may be interdigitated, and when both components are active, the receptor and a putative Sa factor can apparently bind and act simultaneously. We concluded that MoMSV enhancer activity is effected by at least two distinct binding factors, suggesting that combinatorial regulation of promoter function can be mediated even from a single genetic element. PMID:3023887
Majoros, William H; Ohler, Uwe
2010-12-16
The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
Satellite phage TLCφ enables toxigenic conversion by CTX phage through dif site alteration.
Hassan, Faizule; Kamruzzaman, M; Mekalanos, John J; Faruque, Shah M
2010-10-21
Bacterial chromosomes often carry integrated genetic elements (for example plasmids, transposons, prophages and islands) whose precise function and contribution to the evolutionary fitness of the host bacterium are unknown. The CTXφ prophage, which encodes cholera toxin in Vibrio cholerae, is known to be adjacent to a chromosomally integrated element of unknown function termed the toxin-linked cryptic (TLC). Here we report the characterization of a TLC-related element that corresponds to the genome of a satellite filamentous phage (TLC-Knφ1), which uses the morphogenesis genes of another filamentous phage (fs2φ) to form infectious TLC-Knφ1 phage particles. The TLC-Knφ1 phage genome carries a sequence similar to the dif recombination sequence, which functions in chromosome dimer resolution using XerC and XerD recombinases. The dif sequence is also exploited by lysogenic filamentous phages (for example CTXφ) for chromosomal integration of their genomes. Bacterial cells defective in the dimer resolution often show an aberrant filamentous cell morphology. We found that acquisition and chromosomal integration of the TLC-Knφ1 genome restored a perfect dif site and normal morphology to V. cholerae wild-type and mutant strains with dif(-) filamentation phenotypes. Furthermore, lysogeny of a dif(-) non-toxigenic V. cholerae with TLC-Knφ1 promoted its subsequent toxigenic conversion through integration of CTXφ into the restored dif site. These results reveal a remarkable level of cooperative interactions between multiple filamentous phages in the emergence of the bacterial pathogen that causes cholera.
Optically intraconnected computer employing dynamically reconfigurable holographic optical element
NASA Technical Reports Server (NTRS)
Bergman, Larry A. (Inventor)
1992-01-01
An optically intraconnected computer and a reconfigurable holographic optical element employed therein. The basic computer comprises a memory for holding a sequence of instructions to be executed; logic for accessing the instructions in sequence; logic for determining, for each instruction, the function to be performed and its effective address; a plurality of individual elements on a common support substrate optimized to perform certain logical sequences employed in executing the instructions; and element selection logic, connected to the logic that determines the function to be performed for each instruction, for determining the class of each function and for causing the instruction to be executed by those elements that perform the associated logical sequences affecting the instruction execution in an optimum manner. In the optically intraconnected version, the element selection logic is adapted for transmitting and switching signals to the elements optically.
Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.
Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J
2018-05-10
Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. The objective was to examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. The main measures were type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015 the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%). 
Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.
Yan, Fan; Di, Shaokang; Takahashi, Ryoji
2015-08-01
The R gene of soybean, presumably encoding a MYB transcription factor, controls seed coat color. The gene consists of multiple alleles, R (black), r-m (black spots and (or) concentric streaks on brown seed), and r (brown seed). This study was conducted to determine the structure of the MYB transcription factor gene in a near-isogenic line (NIL) having r-m allele. PCR amplification of a fragment of the candidate gene Glyma.09G235100 generated a fragment of about 1 kb in the soybean cultivar Clark, whereas a fragment of about 14 kb in addition to fragments of 1 and 1.4 kb were produced in L72-2040, a Clark 63 NIL with the r-m allele. Clark 63 is a NIL of Clark with the rxp and Rps1 alleles. A DNA fragment of 13 060 bp was inserted in the intron of Glyma.09G235100 in L72-2040. The fragment had the CACTA motif at both ends, imperfect terminal inverted repeats (TIR), inverse repetition of short sequence motifs close to the 5' and 3' ends, and a duplication of three nucleotides at the site of integration, indicating that it belongs to a CACTA-superfamily transposable element. We designated the element as Tgm11. Overall nucleotide sequence, motifs of TIR, and subterminal repeats were similar to those of Tgm1 and Tgs1, suggesting that these elements comprise a family.
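The structural hallmarks listed in this abstract (a CACTA motif at both ends, imperfect terminal inverted repeats, and a three-nucleotide duplication at the integration site) lend themselves to a simple check. The Python sketch below is illustrative only and is not the authors' analysis: the function names, the TIR length of 10, the mismatch tolerance of 2, and the toy sequences are all assumptions, and the 3' end is tested as the reverse complement of CACTA, which is how an inverted terminal motif appears on the top strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tir_mismatches(element, length):
    """Count mismatches between the 5' end and the reverse-complemented 3' end."""
    left = element[:length]
    right_rc = revcomp(element[-length:])
    return sum(a != b for a, b in zip(left, right_rc))


def looks_like_cacta(left_flank, element, right_flank,
                     tir_len=10, max_mismatch=2, tsd_len=3):
    """Heuristic test for the three features named in the abstract (thresholds assumed)."""
    has_motif = element.startswith("CACTA") and revcomp(element[-5:]) == "CACTA"
    has_tsd = left_flank[-tsd_len:] == right_flank[:tsd_len]  # target site duplication
    has_tir = tir_mismatches(element, tir_len) <= max_mismatch  # imperfect TIR allowed
    return has_motif and has_tsd and has_tir


if __name__ == "__main__":
    # Toy element: 10-bp terminus, filler, and an inverted terminus with one mismatch.
    element = "CACTACAAGA" + "GGGGTTTTCC" + "TCATGTAGTG"
    print(tir_mismatches(element, 10))
    print(looks_like_cacta("AAGCTTGAT", element, "GATCCGTAA"))
```

Allowing a small number of mismatches reflects the "imperfect" inverted repeats reported for the element; the tolerance would need tuning against real Tgm-family sequences.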
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jay, Z.; Beam, Jake; Dohnalkova, Alice
Thermoproteales populations (phylum Crenarchaeota) are abundant in high-temperature (>70 °C) environments of Yellowstone National Park (YNP) and are important in mediating biogeochemical cycles of sulfur, arsenic and carbon. The objectives of this study were to determine specific physiological attributes of the isolate Pyrobaculum yellowstonensis strain WP30, which was obtained from an elemental sulfur sediment (Joseph’s Coat Hot Spring [JCHS]; 80 °C; pH 6.1), and relate this organism to geochemical processes occurring in situ. Strain WP30 is a chemoheterotroph that utilizes organic carbon as a source of carbon and electrons and requires elemental sulfur and/or arsenic as electron acceptors. Growth in the presence of elemental sulfur and arsenate resulted in the production of thioarsenates and polysulfides relative to sterile controls. The complete genome of this organism was sequenced (1.99 Mb, 58% G+C) and revealed numerous metabolic pathways for the degradation of carbohydrates, amino acids and lipids, multiple dimethylsulfoxide molybdopterin (DMSO-MPT) oxidoreductase genes, which are implicated in the reduction of sulfur and arsenic, and pathways for the de novo synthesis of nearly all required cofactors and metabolites. Comparative genomics of P. yellowstonensis versus assembled metagenome sequence from JCHS showed that this organism is highly related (~95% average nucleotide identity) to in situ populations. The physiological attributes and metabolic capabilities of P. yellowstonensis provide important information towards understanding the distribution and function of these populations in YNP.
Mahelka, Václav; Krak, Karol; Kopecký, David; Fehrer, Judith; Šafář, Jan; Bartoš, Jan; Hobza, Roman; Blavet, Nicolas; Blattner, Frank R
2017-02-14
The movement of nuclear DNA from one vascular plant species to another in the absence of fertilization is thought to be rare. Here, nonnative rRNA gene [ribosomal DNA (rDNA)] copies were identified in a set of 16 diploid barley (Hordeum) species; their origin was traceable via their internal transcribed spacer (ITS) sequence to five distinct Panicoideae genera, a lineage that split from the Pooideae about 60 Mya. Phylogenetic, cytogenetic, and genomic analyses implied that the nonnative sequences were acquired between 1 and 5 Mya after a series of multiple events, with the result that some current Hordeum sp. individuals harbor up to five different panicoid rDNA units in addition to the native Hordeum rDNA copies. There was no evidence that any of the nonnative rDNA units were transcribed; some showed indications of having been silenced via pseudogenization. A single copy of a Panicum sp. rDNA unit present in H. bogdanii had been interrupted by a native transposable element and was surrounded by about 70 kbp of mostly noncoding sequence of panicoid origin. The data suggest that horizontal gene transfer between vascular plants is not a rare event, that it is not necessarily restricted to one or a few genes only, and that it can be selectively neutral.
Marques, André; Ribeiro, Tiago; Neumann, Pavel; Macas, Jiří; Novák, Petr; Schubert, Veit; Pellino, Marco; Fuchs, Jörg; Ma, Wei; Kuhlmann, Markus; Brandt, Ronny; Vanzela, André L L; Beseda, Tomáš; Šimková, Hana; Pedrosa-Harand, Andrea; Houben, Andreas
2015-11-03
Holocentric chromosomes lack a primary constriction, in contrast to monocentrics. They form kinetochores distributed along almost the entire poleward surface of the chromatids, to which spindle fibers attach. No centromere-specific DNA sequence has been found for any holocentric organism studied so far. It was proposed that centromeric repeats, typical for many monocentric species, could not occur in holocentrics, most likely because of differences in the centromere organization. Here we show that the holokinetic centromeres of the Cyperaceae Rhynchospora pubera are highly enriched by a centromeric histone H3 variant-interacting centromere-specific satellite family designated "Tyba" and by centromeric retrotransposons (i.e., CRRh) occurring as genome-wide interspersed arrays. Centromeric arrays vary in length from 3 to 16 kb and are intermingled with gene-coding sequences and transposable elements. We show that holocentromeres of metaphase chromosomes are composed of multiple centromeric units rather than possessing a diffuse organization, thus favoring the polycentric model. A cell-cycle-dependent shuffling of multiple centromeric units results in the formation of functional (poly)centromeres during mitosis. The genome-wide distribution of centromeric repeat arrays interspersing the euchromatin provides a previously unidentified type of centromeric chromatin organization among eukaryotes. Thus, different types of holocentromeres exist in different species, namely with and without centromeric repetitive sequences.
Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop
Hatakeyama, Masaomi; Aluri, Sirisha; Balachadran, Mathi Thumilan; Sivarajan, Sajeevan Radha; Patrignani, Andrea; Grüter, Simon; Poveda, Lucy; Shimizu-Inatsugi, Rie; Baeten, John; Francoijs, Kees-Jan; Nataraja, Karaba N; Reddy, Yellodu A Nanja; Phadnis, Shamprasad; Ravikumar, Ramapura L; Schlapbach, Ralph; Sreeman, Sheshshayee M; Shimizu, Kentaro K
2018-01-01
Abstract Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate changes. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered as an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which hampers genome assembly compared with a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes. PMID:28985356
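The N50 statistic quoted in the abstract above can be illustrated with a minimal sketch. It assumes the common definition (the length L such that scaffolds of length at least L together cover at least half of the total assembly); the function name `n50` and the toy lengths are hypothetical and are not taken from the finger millet assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Assumed definition: walk the scaffolds from longest to shortest and
    return the length at which the running total first reaches half of
    the summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

A higher N50 for the same total length indicates that more of the assembly sits in long scaffolds, which is why it is usually reported alongside the scaffold count.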
Zaba: a novel miniature transposable element present in genomes of legume plants.
Macas, J; Neumann, P; Pozárková, D
2003-08-01
A novel family of miniature transposable elements, named Zaba, was identified in pea (Pisum sativum) and subsequently also in other legume species using computer analysis of their DNA sequences. Zaba elements are 141-190 bp long, generate 10-bp target site duplications, and their terminal inverted repeats make up most of the sequence. Zaba elements thus resemble class 3 foldback transposons. The elements are only moderately repetitive in pea (tens to hundreds of copies per haploid genome), but they are present in up to thousands of copies in the genomes of several Medicago and Vicia species. More detailed analysis of the elements from pea, including isolation of new sequences from a genomic library, revealed that a fraction of these elements are truncated, and that their last transposition probably did not occur recently. A search for Zaba sequences in EST databases showed that at least some elements are transcribed, most probably due to their association with genic regions.
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.
Šatović, Eva; Plohl, Miroslav
2017-10-01
Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity, but lack any nucleotide sequence similarity to each other. In a search for related elements in databases, blast search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
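Terminal inverted repeats of the kind described above can be illustrated with a small sketch: the 3' end of an element, read as its reverse complement, matches the 5' end. The function below measures the length of a perfect TIR only; the name `tir_length` and the toy sequence are hypothetical, and real MITE annotation would normally tolerate mismatches and use dedicated tools.

```python
# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def tir_length(element):
    """Return the length of the perfect terminal inverted repeat.

    Base i from the 5' end must be the complement of base i from the
    3' end. Counting stops at the first mismatch, at an ambiguous base,
    or at the middle of the element.
    """
    seq = element.upper()
    n = 0
    while n < len(seq) // 2 and COMPLEMENT.get(seq[n]) == seq[-1 - n]:
        n += 1
    return n


# Toy element (hypothetical): a 7-bp TIR flanking a 5-bp spacer.
toy_element = "CACTAGG" + "TTTTT" + "CCTAGTG"
print(tir_length(toy_element))
```

Because the comparison is exact, imperfect TIRs would be reported as shorter than they appear by eye; allowing a mismatch budget is a natural extension.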
Sela, Dotan; Chen, Lu; Martin-Brown, Skylar; Washburn, Michael P; Florens, Laurence; Conaway, Joan Weliky; Conaway, Ronald C
2012-06-29
The basic leucine zipper transcription factor ATF6α functions as a master regulator of endoplasmic reticulum (ER) stress response genes. Previous studies have established that, in response to ER stress, ATF6α translocates to the nucleus and activates transcription of ER stress response genes upon binding sequence specifically to ER stress response enhancer elements in their promoters. In this study, we investigate the biochemical mechanism by which ATF6α activates transcription. By exploiting a combination of biochemical and multidimensional protein identification technology-based mass spectrometry approaches, we have obtained evidence that ATF6α functions at least in part by recruiting to the ER stress response enhancer elements of ER stress response genes a collection of RNA polymerase II coregulatory complexes, including the Mediator and multiple histone acetyltransferase complexes, among which are the Spt-Ada-Gcn5 acetyltransferase (SAGA) and Ada-Two-A-containing (ATAC) complexes. Our findings shed new light on the mechanism of action of ATF6α, and they outline a straightforward strategy for applying multidimensional protein identification technology mass spectrometry to determine which RNA polymerase II transcription factors and coregulators are recruited to promoters and other regulatory elements to control transcription.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
2013-01-01
Background Transposable elements (TEs) and other repetitive elements are a large and dynamically evolving part of eukaryotic genomes, especially in plants where they can account for a significant proportion of genome size. Their dynamic nature gives them the potential for use in identifying and characterizing crop germplasm. However, their repetitive nature makes them challenging to study using conventional methods of molecular biology. Next generation sequencing and new computational tools have greatly facilitated the investigation of TE variation within species and among closely related species. Results (i) We generated low-coverage Illumina whole genome shotgun sequencing reads for multiple individuals of cacao (Theobroma cacao) and related species. These reads were analysed using both an alignment/mapping approach and a de novo (graph based clustering) approach. (ii) A standard set of ultra-conserved orthologous sequences (UCOS) standardized TE data between samples and provided phylogenetic information on the relatedness of samples. (iii) The mapping approach proved highly effective within the reference species but underestimated TE abundance in interspecific comparisons relative to the de novo methods. (iv) Individual T. cacao accessions have unique patterns of TE abundance indicating that the TE composition of the genome is evolving actively within this species. (v) LTR/Gypsy elements are the most abundant, comprising c.10% of the genome. (vi) Within T. cacao the retroelement families show an order of magnitude greater sequence variability than the DNA transposon families. (vii) Theobroma grandiflorum has a similar TE composition to T. cacao, but the related genus Herrania is rather different, with LTRs making up a lower proportion of the genome, perhaps because of a massive presence (c. 20%) of distinctive low complexity satellite-like repeats in this genome. 
Conclusions (i) Short read alignment/mapping to reference TE contigs provides a simple and effective method of investigating intraspecific differences in TE composition. It is not appropriate for comparing repetitive elements across the species boundaries, for which de novo methods are more appropriate. (ii) Individual T. cacao accessions have unique spectra of TE composition indicating active evolution of TE abundance within this species. TE patterns could potentially be used as a “fingerprint” to identify and characterize cacao accessions. PMID:23883295
Crammed signaling motifs in the T-cell receptor.
Borroto, Aldo; Abia, David; Alarcón, Balbino
2014-09-01
Although the T cell antigen receptor (TCR) is long known to contain multiple signaling subunits (CD3γ, CD3δ, CD3ɛ and CD3ζ), their role in signal transduction is still not well understood. The presence of at least one immunoreceptor tyrosine-based activation motif (ITAM) in each CD3 subunit has led to the idea that the multiplication of such elements essentially serves to amplify signals. However, the evolutionary conservation of non-ITAM sequences suggests that each CD3 subunit is likely to have specific non-redundant roles at some stage of development or in mature T cell function. The CD3ɛ subunit is paradigmatic because in a relatively short cytoplasmic sequence (∼55 amino acids) it contains several docking sites for proteins involved in intracellular trafficking and signaling, proteins whose relevance in T cell activation is slowly starting to be revealed. In this review we will summarize our current knowledge on the signaling effectors that bind directly to the TCR and we will propose a hierarchy in their response to TCR triggering. Copyright © 2014 Elsevier B.V. All rights reserved.
The genetic structure of the A mating-type locus of Lentinula edodes.
Au, Chun Hang; Wong, Man Chun; Bao, Dapeng; Zhang, Meiyan; Song, Chunyan; Song, Wenhua; Law, Patrick Tik Wan; Kües, Ursula; Kwan, Hoi Shan
2014-02-10
The Shiitake mushroom, Lentinula edodes (Berk.) Pegler is a tetrapolar basidiomycete with two unlinked mating-type loci, commonly called the A and B loci. Identifying the mating-types in shiitake is important for enhancing the breeding and cultivation of this economically-important edible mushroom. Here, we identified the A mating-type locus from the first draft genome sequence of L. edodes and characterized multiple alleles from different monokaryotic strains. Two intron-length polymorphism markers were developed to facilitate rapid molecular determination of A mating-type. L. edodes sequences were compared with those of known tetrapolar and bipolar basidiomycete species. The A mating-type genes are conserved at the homeodomain region across the order Agaricales. However, we observed unique genomic organization of the locus in L. edodes which exhibits atypical gene order and multiple repetitive elements around its A locus. To our knowledge, this is the first known exception among Homobasidiomycetes, in which the mitochondrial intermediate peptidase (mip) gene is not closely linked to A locus. Copyright © 2013 Elsevier B.V. All rights reserved.
Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M
2018-05-01
Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionally conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.
Alternative DNA structure formation in the mutagenic human c-MYC promoter.
Del Mundo, Imee Marie A; Zewail-Foote, Maha; Kerwin, Sean M; Vasquez, Karen M
2017-05-05
Mutation 'hotspot' regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization of the FB-NOF Transposable Element of Drosophila melanogaster
Harden, N.; Ashburner, M.
1990-01-01
FB-NOF is a composite transposable element of Drosophila melanogaster. It is composed of foldback sequences, of variable length, which flank a 4-kb NOF sequence with 308-bp inverted repeat termini. The NOF sequence could potentially code for a 120-kD polypeptide. The FB-NOF element is responsible for unstable mutations of the white gene (w(c) and w(DZL)) and is associated with the large TEs of G. Ising. Although most strains of D. melanogaster have 20-30 sites of FB insertion, FB-NOF elements are usually rare; many strains lack this composite element or have only one copy of it. A few strains, including w(DZL) and Basc, have many (8-21) copies of FB-NOF, and these show a tendency to insert at "hot-spots." These strains also have an increased number of FB elements. The DNA sequence of the NOF region associated with TE146(Z) has been determined. PMID:2174013
The twilight zone of cis element alignments.
Sebastian, Alvaro; Contreras-Moreira, Bruno
2013-02-01
Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
The twilight zone of cis element alignments
Sebastian, Alvaro; Contreras-Moreira, Bruno
2013-01-01
Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451
Liang, Chanjuan; van Dijk, Jeroen P; Scholtens, Ingrid M J; Staats, Martijn; Prins, Theo W; Voorhuijzen, Marleen M; da Silva, Andrea M; Arisi, Ana Carolina Maisonnave; den Dunnen, Johan T; Kok, Esther J
2014-04-01
The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.
Mulcahy, Nicholas J; Call, Josep; Dunbar, Robin I M
2005-02-01
Two important elements in problem solving are the abilities to encode relevant task features and to combine multiple actions to achieve the goal. The authors investigated these 2 elements in a task in which gorillas (Gorilla gorilla) and orangutans (Pongo pygmaeus) had to use a tool to retrieve an out-of-reach reward. Subjects were able to select tools of an appropriate length to reach the reward even when the position of the reward and tools were not simultaneously visible. When presented with tools that were too short to retrieve the reward, subjects were more likely to refuse to use them than when tools were the appropriate length. Subjects were proficient at using tools in sequence to retrieve the reward.
Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice.
Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun
2015-12-11
As an important part of synthetic biology, synthetic promoters have gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated that 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with a synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of the GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice.
Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torella, JP; Boehm, CR; Lienert, F
2013-12-28
In vitro recombination methods have enabled one-step construction of large DNA sequences from multiple parts. Although synthetic biological circuits can in principle be assembled in the same fashion, they typically contain repeated sequence elements such as standard promoters and terminators that interfere with homologous recombination. Here we use a computational approach to design synthetic, biologically inactive unique nucleotide sequences (UNSes) that facilitate accurate ordered assembly. Importantly, our designed UNSes make it possible to assemble parts with repeated terminator and insulator sequences, and thereby create insulated functional genetic circuits in bacteria and mammalian cells. Using UNS-guided assembly to construct repeating promoter-gene-terminator parts, we systematically varied gene expression to optimize production of a deoxychromoviridans biosynthetic pathway in Escherichia coli. We then used this system to construct complex eukaryotic AND-logic gates for genomic integration into embryonic stem cells. Construction was performed by using a standardized series of UNS-bearing BioBrick-compatible vectors, which enable modular assembly and facilitate reuse of individual parts. UNS-guided isothermal assembly is broadly applicable to the construction and optimization of genetic circuits and particularly those requiring tight insulation, such as complex biosynthetic pathways, sensors, counters and logic gates.
CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs.
Gilbert, N; Labuda, D
1999-03-16
A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.
CORE-SINEs: Eukaryotic short interspersed retroposing elements with common sequence motifs
Gilbert, Nicolas; Labuda, Damian
1999-01-01
A 65-bp “core” sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3′ ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome. PMID:10077603
ElemeNT: a computational tool for detecting core promoter elements.
Sloutskin, Anna; Danino, Yehuda M; Orenstein, Yaron; Zehavi, Yonathan; Doniger, Tirza; Shamir, Ron; Juven-Gershon, Tamar
2015-01-01
Core promoter elements play a pivotal role in the transcriptional output, yet they are often detected manually within sequences of interest. Here, we present 2 contributions to the detection and curation of core promoter elements within given sequences. First, the Elements Navigation Tool (ElemeNT) is a user-friendly web-based, interactive tool for prediction and display of putative core promoter elements and their biologically-relevant combinations. Second, the CORE database summarizes ElemeNT-predicted core promoter elements near CAGE and RNA-seq-defined Drosophila melanogaster transcription start sites (TSSs). ElemeNT's predictions are based on biologically-functional core promoter elements, and can be used to infer core promoter compositions. ElemeNT does not assume prior knowledge of the actual TSS position, and can therefore assist in annotation of any given sequence. These resources, freely accessible at http://lifefaculty.biu.ac.il/gershon-tamar/index.php/resources, facilitate the identification of core promoter elements as active contributors to gene expression.
Bobrova, E V; Liakhovetskiĭ, V A; Borshchevskaia, E R
2011-01-01
The dependence of errors during reproduction of a sequence of hand movements without visual feedback on the previous right- and left-hand performance ("prehistory") and on positions in space of sequence elements (random or ordered by the explicit rule) was analyzed. It was shown that the preceding information about the ordered positions of the sequence elements was used during right-hand movements, whereas left-hand movements were performed with involvement of the information about the random sequence. The data testify to a central mechanism of the analysis of spatial structure of sequence elements. This mechanism activates movement coding specific for the left hemisphere (vector coding) in case of an ordered sequence structure and positional coding specific for the right hemisphere in case of a random sequence structure.
Prediction and phylogenetic analysis of mammalian short interspersed elements (SINEs).
Rogozin, I B; Mayorov, V I; Lavrentieva, M V; Milanesi, L; Adkison, L R
2000-09-01
The presence of repetitive elements can create serious problems for sequence analysis, especially in the case of homology searches in nucleotide sequence databases. Repetitive elements should be treated carefully by using special programs and databases. In this paper, various aspects of SINE (short interspersed repetitive element) identification, analysis and evolution are discussed.
Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.
2017-01-01
RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.
Schnitzler, P; Darai, G
1989-09-01
The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Kovacs, A; Kandala, J C; Weber, K T; Guntaka, R V
1996-01-19
Type I and III fibrillar collagens are the major structural proteins of the extracellular matrix found in various organs including the myocardium. Abnormal and progressive accumulation of fibrillar type I collagen in the interstitial spaces compromises organ function and therefore, the study of transcriptional regulation of this gene and specific targeting of its expression is of major interest. Transient transfection of adult cardiac fibroblasts indicates that the polypurine-polypyrimidine sequence of the alpha 1(I) collagen promoter between nucleotides -200 and -140 represents an overall positive regulatory element. DNase I footprinting and electrophoretic mobility shift assays suggest that multiple factors bind to different elements of this promoter region. We further demonstrate that the unique polypyrimidine sequence between -172 and -138 of the promoter represents a suitable target for a single-stranded polypurine oligonucleotide (TFO) to form a triple helix DNA structure. Modified electrophoretic mobility shift assays show that this TFO specifically inhibits the protein-DNA interaction within the target region. In vitro transcription assays and transient transfection experiments demonstrate that the transcriptional activity of the promoter is inhibited by this oligonucleotide. We propose that TFOs have therapeutic potential to specifically influence the expression of the alpha 1(I) collagen gene in various disease states where abnormal type I collagen accumulation is known to occur.
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk.
Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B; Huson, Daniel H; Frick, Julia-Stefanie
2016-04-25
Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Recessive mutations in the INS gene result in neonatal diabetes through reduced insulin biosynthesis
Garin, Intza; Edghill, Emma L.; Akerman, Ildem; Rubio-Cabezas, Oscar; Rica, Itxaso; Locke, Jonathan M.; Maestro, Miguel Angel; Alshaikh, Adnan; Bundak, Ruveyde; del Castillo, Gabriel; Deeb, Asma; Deiss, Dorothee; Fernandez, Juan M.; Godbole, Koumudi; Hussain, Khalid; O’Connell, Michele; Klupa, Thomasz; Kolouskova, Stanislava; Mohsin, Fauzia; Perlman, Kusiel; Sumnik, Zdenek; Rial, Jose M.; Ugarte, Estibaliz; Vasanthi, Thiruvengadam; Johnstone, Karen; Flanagan, Sarah E.; Martínez, Rosa; Castaño, Carlos; Patch, Ann-Marie; Fernández-Rebollo, Eduardo; Raile, Klemens; Morgan, Noel; Harries, Lorna W.; Castaño, Luis; Ellard, Sian; Ferrer, Jorge; de Nanclares, Guiomar Perez; Hattersley, Andrew T.
2010-01-01
Heterozygous coding mutations in the INS gene that encodes preproinsulin were recently shown to be an important cause of permanent neonatal diabetes. These dominantly acting mutations prevent normal folding of proinsulin, which leads to beta-cell death through endoplasmic reticulum stress and apoptosis. We now report 10 different recessive INS mutations in 15 probands with neonatal diabetes. Functional studies showed that recessive mutations resulted in diabetes because of decreased insulin biosynthesis through distinct mechanisms, including gene deletion, lack of the translation initiation signal, and altered mRNA stability because of the disruption of a polyadenylation signal. A subset of recessive mutations caused abnormal INS transcription, including the deletion of the C1 and E1 cis regulatory elements, or three different single base-pair substitutions in a CC dinucleotide sequence located between E1 and A1 elements. In keeping with an earlier and more severe beta-cell defect, patients with recessive INS mutations had a lower birth weight (−3.2 SD score vs. −2.0 SD score) and were diagnosed earlier (median 1 week vs. 10 weeks) compared to those with dominant INS mutations. Mutations in the insulin gene can therefore result in neonatal diabetes as a result of two contrasting pathogenic mechanisms. Moreover, the recessively inherited mutations provide a genetic demonstration of the essential role of multiple sequence elements that regulate the biosynthesis of insulin in man. PMID:20133622
Zhang, Rong-Xiang; Qin, Li-Jun; Zhao, De-Gang
2017-07-20
Inositol is a cyclic polyol that is involved in various physiological processes, including signal transduction and stress adaptation in plants. L-myo-inositol monophosphatase (IMPase) is one of the metal-dependent phosphatase family members and catalyzes the last reaction step of biosynthesis of inositol. Although increased IMPase activity induced by abiotic stress has been reported in chickpea plants, the role and regulation of the IMP gene in rice (Oryza sativa L.) remains poorly understood. In the present work, we obtained a full-length cDNA sequence coding IMPase in the cold tolerant rice landraces in Gaogonggui, which is named OsIMP. Multiple alignment results showed that this sequence has characteristic signature motifs and conserved enzyme active sites of the phosphatase super family. Phylogenetic analysis showed that IMPase is most closely related to that of the wild rice Oryza brachyantha, while transcript analysis revealed that the expression of OsIMP is significantly induced by cold stress and exogenous abscisic acid (ABA) treatment. Meanwhile, we cloned the 5' flanking promoter sequence of the OsIMP gene and identified several important cis-acting elements, such as LTR (low-temperature responsiveness), TCA-element (salicylic acid responsiveness), ABRE-element (abscisic acid responsiveness), GARE-motif (gibberellin responsive), MBS (MYB Binding Site) and other cis-acting elements related to defense and stress responsiveness. To further investigate the potential function of the OsIMP gene, we generated transgenic tobacco plants overexpressing the OsIMP gene and the cold tolerance test indicated that these transgenic tobacco plants exhibit improved cold tolerance.
Furthermore, transgenic tobacco plants have a lower level of hydrogen peroxide (H₂O₂) and malondialdehyde (MDA), and a higher content of total chlorophyll as well as increased antioxidant enzyme activities of superoxide dismutase (SOD), catalase (CAT) and peroxidase (POD), when compared to wild type (WT) tobacco plants under normal and cold stress conditions.
White, Eleanor; Kamieniarz-Gdula, Kinga; Dye, Michael J.; Proudfoot, Nick J.
2013-01-01
RNA Polymerase II (Pol II) termination is dependent on RNA processing signals as well as specific terminator elements located downstream of the poly(A) site. One of the two major terminator classes described so far is the Co-Transcriptional Cleavage (CoTC) element. We show that homopolymer A/T tracts within the human β-globin CoTC-mediated terminator element play a critical role in Pol II termination. These short A/T tracts, dispersed within seemingly random sequences, are strong terminator elements, and bioinformatics analysis confirms the presence of such sequences in 70% of the putative terminator regions (PTRs) genome-wide. PMID:23258704
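As a rough illustration of what locating homopolymer A/T tracts in a sequence could look like, here is a minimal Python sketch. The function name `at_tracts` and the minimum tract length of 4 are assumptions made for this example; the abstract does not state the length cutoff or the method the authors used.

```python
import re

def at_tracts(seq, min_len=4):
    """Return (start, tract) for each run of A's or T's at least min_len long.

    min_len=4 is an illustrative assumption, not a value from the abstract.
    Start positions are 0-based.
    """
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]

# Toy sequence (hypothetical, not from the beta-globin terminator).
print(at_tracts("GCAAAAAGCTTTTGCAT"))
```

Scanning a single strand is enough here, because an A tract on one strand is a T tract on the other and both are matched.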
Li, Shu-Fen; Zhang, Guo-Jun; Yuan, Jin-Hong; Deng, Chuan-Liang; Gao, Wu-Jun
2016-05-01
The present review discusses the roles that repetitive sequences play in plant sex chromosome evolution, and highlights epigenetic modification as a potential mechanism by which repetitive sequences are involved in sex chromosome evolution. Sex determination in plants is mostly based on sex chromosomes. Classic theory proposes that sex chromosomes evolve from a specific pair of autosomes with emergence of a sex-determining gene(s). Subsequently, the newly formed sex chromosomes stop recombination in a small region around the sex-determining locus, and over time, the non-recombining region expands to almost all parts of the sex chromosomes. Accumulation of repetitive sequences, mostly transposable elements and tandem repeats, is a conspicuous feature of the non-recombining region of the Y chromosome, even in primitive ones. Repetitive sequences may play multiple roles in sex chromosome evolution, such as triggering heterochromatization and causing recombination suppression, leading to structural and morphological differentiation of sex chromosomes, and promoting Y chromosome degeneration and X chromosome dosage compensation. In this article, we review the current status of this field, and based on preliminary evidence, we posit that repetitive sequences are involved in sex chromosome evolution probably via epigenetic modification, such as DNA and histone methylation, with small interfering RNAs as the mediator.
Sequenced drive for rotary valves
Mittell, Larry C.
1981-01-01
A sequenced drive for rotary valves which provides the benefits of applying rotary and linear motions to the movable sealing element of the valve. The sequenced drive provides a close approximation of linear motion while engaging or disengaging the movable element with the seat minimizing wear and damage due to scrubbing action. The rotary motion of the drive swings the movable element out of the flowpath thus eliminating obstruction to flow through the valve.
Genetic exchange between endogenous and exogenous LINE-1 repetitive elements in mouse cells.
Belmaaza, A; Wallenburg, J C; Brouillette, S; Gusew, N; Chartrand, P
1990-01-01
The repetitive LINE (L1) elements of the mouse, which are present at about 10^5 copies per genome and share over 80% sequence homology, were examined for their ability to undergo genetic exchange with exogenous L1 sequences. The exogenous L1 sequences, carried by a shuttle vector, consisted of an internal fragment from L1Md-A2, a previously described member of the L1 family of the mouse. Using an assay that does not require the reconstitution of a selectable marker we found that this vector, in either circular or linear form, acquired DNA sequences from endogenous L1 elements at a frequency of 10^-3 to 10^-4 per rescued vector. Physical analysis of the acquired L1 sequences revealed that distinct endogenous L1 elements acted as donors and that different subfamilies participated. These results demonstrate that L1 elements are readily capable of genetic exchange. Apart from gene conversion events, the acquisition of L1 sequences outside the region of homology suggested that a second mechanism was also involved in the genetic exchange. A model which accounts for this mechanism is presented and its potential implication on the rearrangement of L1 elements is discussed. PMID:1978749
Dcode.org anthology of comparative genomic tools.
Loots, Gabriela G; Ovcharenko, Ivan
2005-07-01
Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.
Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.
Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima
2017-10-16
Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon. © FEMS 2017.
A proposed model for the flowering signaling pathway of sugarcane under photoperiodic control.
Coelho, C P; Costa Netto, A P; Colasanti, J; Chalfun-Júnior, A
2013-04-25
Molecular analysis of floral induction in Arabidopsis has identified several flowering time genes related to 4 response networks defined by the autonomous, gibberellin, photoperiod, and vernalization pathways. Although grass flowering processes include ancestral functions shared by both mono- and dicots, they have developed their own mechanisms to transmit floral induction signals. Despite its high production capacity and its important role in biofuel production, almost no information is available about the flowering process in sugarcane. We searched the Sugarcane Expressed Sequence Tags database to look for elements of the flowering signaling pathway under photoperiodic control. Sequences showing significant similarity to flowering time genes of other species were clustered, annotated, and analyzed for conserved domains. Multiple alignments comparing the sequences found in the sugarcane database and those from other species were performed and their phylogenetic relationship assessed using the MEGA 4.0 software. Electronic Northerns were run with Cluster and TreeView programs, allowing us to identify putative members of the photoperiod-controlled flowering pathway of sugarcane.
The Evolution of SINEs and LINEs in the genus Chironomus (Diptera).
Papusheva, Ekaterina; Gruhl, Mary C; Berezikov, Eugene; Groudieva, Tatiana; Scherbik, Svetlana V; Martin, Jon; Blinov, Alexander; Bergtrom, Gerald
2004-03-01
Genomic DNA amplification from 51 species of the family Chironomidae shows that most contain relatives of NLRCth1 LINE and CTRT1 SINE retrotransposons first found in Chironomus thummi. More than 300 cloned PCR products were sequenced. The amplified region of the reverse transcriptase gene in the LINEs is intact and highly conserved, suggesting active elements. The SINEs are less conserved, consistent with minimal/no selection after transposition. A mitochondrial gene phylogeny resolves the Chironomus genus into six lineages (Guryev et al. 2001). LINE and SINE phylogenies resolve five of these lineages, indicating their monophyletic origin and vertical inheritance. However, both the LINE and the SINE tree topologies differ from the species phylogeny, resolving the elements into "clusters I-IV" and "cluster V" families. The data suggest a descent of all LINE and SINE subfamilies from two major families. Based on the species phylogeny, a few LINEs and a larger number of SINEs are cladistically misplaced. Most misbranch with LINEs or SINEs from species with the same families of elements. From sequence comparisons, cladistically misplaced LINEs and several misplaced SINEs arose by convergent base substitutions. More diverged SINEs result from early transposition and some are derived from multiple source SINEs in the same species. SINEs from two species (C. dorsalis, C. pallidivittatus), expected to belong to the clusters I-IV family, branch instead with cluster V family SINEs; apparently both families predate separation of cluster V from clusters I-IV species. Correlation of the distribution of active SINEs and LINEs, as well as similar 3' sequence motifs in CTRT1 and NLRCth1, suggests coevolving retrotransposon pairs in which CTRT1 transposition depends on enzymes active during NLRCth1 LINE mobility.
Characterization of the human gene (TBXAS1) encoding thromboxane synthase.
Miyata, A; Yokoyama, C; Ihara, H; Bandoh, S; Takeda, O; Takahashi, E; Tanabe, T
1994-09-01
The gene encoding human thromboxane synthase (TBXAS1) was isolated from a human EMBL3 genomic library using human platelet thromboxane synthase cDNA as a probe. Nucleotide sequencing revealed that the human thromboxane synthase gene spans more than 75 kb and consists of 13 exons and 12 introns, of which the splice donor and acceptor sites conform to the GT/AG rule. The exon-intron boundaries of the thromboxane synthase gene were similar to those of the human cytochrome P450 nifedipine oxidase gene (CYP3A4) except for introns 9 and 10, although the primary sequences of these enzymes exhibited only 35.8% identity with each other. The 1.2-kb sequence of the 5'-flanking region contained potential binding sites for several transcription factors (AP-1, AP-2, GATA-1, CCAAT box, xenobiotic-response element, PEA-3, LF-A1, myb, basic transcription element and cAMP-response element). Primer-extension analysis indicated multiple transcription-start sites, and the major start site was identified as an adenine residue located 142 bases upstream of the translation-initiation site. However, neither a typical TATA box nor a typical CAAT box was found within the 100 bp upstream of the translation-initiation site. Southern-blot analysis revealed the presence of one copy of the thromboxane synthase gene per haploid genome. Furthermore, a fluorescence in situ hybridization study revealed that the human gene for thromboxane synthase is localized to band q33-q34 of the long arm of chromosome 7. A tissue-distribution study demonstrated that thromboxane synthase mRNA is widely expressed in human tissues and is particularly abundant in peripheral blood leukocytes, spleen, lung and liver. Low but significant levels of mRNA were observed in kidney, placenta and thymus.
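The GT/AG rule mentioned above says that an intron begins with the dinucleotide GT at its splice donor site and ends with AG at its acceptor site. A minimal Python sketch of that check follows; the function name `follows_gt_ag` and the toy sequences are assumptions for illustration and are not taken from the TBXAS1 gene.

```python
def follows_gt_ag(intron):
    """Return True if an intron sequence starts with GT and ends with AG.

    Accepts DNA or RNA letters (U is treated as T). The sequence is
    expected to be the intron only, written 5' to 3' on the coding strand.
    """
    intron = intron.upper().replace("U", "T")
    # Require at least 4 bases so the donor and acceptor dinucleotides
    # cannot overlap.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

# Hypothetical toy introns.
print(follows_gt_ag("GTAAGTCCTTTTCAG"))
print(follows_gt_ag("GCAAGTCCTTTTCAG"))
```

This only tests the two terminal dinucleotides; it says nothing about branch points or the surrounding consensus.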
Weterings, Veronica; Bosch, Thijs; Witteveen, Sandra; Landman, Fabian; Schouls, Leo; Kluytmans, Jan
2017-09-01
Resistance to methicillin in Staphylococcus aureus is caused primarily by the mecA gene, which is carried on a mobile genetic element, the staphylococcal cassette chromosome mec (SCCmec). Horizontal transfer of this element is supposed to be an important factor in the emergence of new clones of methicillin-resistant Staphylococcus aureus (MRSA) but has been rarely observed in real time. In 2012, an outbreak occurred involving a health care worker (HCW) and three patients, all carrying a fusidic acid-resistant MRSA strain. The husband of the HCW was screened for MRSA carriage, but only a methicillin-susceptible S. aureus (MSSA) strain, which was also resistant to fusidic acid, was detected. Multiple-locus variable-number tandem-repeat analysis (MLVA) typing showed that both the MSSA and MRSA isolates were MT4053-MC0005. This finding led to the hypothesis that the MSSA strain acquired the SCCmec and subsequently caused an outbreak. To support this hypothesis, next-generation sequencing of the MSSA and MRSA isolates was performed. This study showed that the MSSA isolate clustered closely with the outbreak isolates based on whole-genome multilocus sequence typing and single-nucleotide polymorphism (SNP) analysis, with a genetic distance of 17 genes and 44 SNPs, respectively. Remarkably, there were relatively large differences in the mobile genetic elements in strains within and between individuals. The limited genetic distance between the MSSA and MRSA isolates in combination with a clear epidemiologic link supports the hypothesis that the MSSA isolate acquired an SCCmec and that the resulting MRSA strain caused an outbreak. Copyright © 2017 American Society for Microbiology.
Xu, Li; Ji, Jin-Jun; Le, Wangping; Xu, Yan S; Dou, Dandan; Pan, Jieli; Jiao, Yifeng; Zhong, Tianfei; Wu, Dehong; Wang, Yumei; Wen, Chengping; Xie, Guan-Qun; Yao, Feng; Zhao, Heng; Fan, Yong-Sheng; Chin, Y Eugene
2015-10-15
Cytokine or growth factor activated STAT3 undergoes multiple post-translational modifications, dimerization and translocation into nuclei, where it binds to serum-inducible element (SIE, 'TTC(N3)GAA')-bearing promoters to activate transcription. The STAT3 DNA binding domain (DBD, 320-494) mutation in hyper immunoglobulin E syndrome (HIES), called the HIES mutation (R382Q, R382W or V463Δ), which elevates IgE synthesis, inhibits SIE binding activity and sensitizes genes such as TNF-α for expression. However, the mechanism by which the HIES mutation sensitizes STAT3 in gene induction remains elusive. Here, we report that STAT3 binds directly to the AGG-element with the consensus sequence 'AGG(N3)AGG'. Surprisingly, the helical N-terminal region (1-355), rather than the canonical STAT3 DBD, is responsible for AGG-element binding. The HIES mutation markedly enhances STAT3 AGG-element binding and AGG-promoter activation activity. Thus, STAT3 is a dual specificity transcription factor that promotes gene expression not only via SIE- but also AGG-promoter activity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
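The two consensus sequences quoted in this abstract, 'TTC(N3)GAA' for the SIE and 'AGG(N3)AGG' for the AGG-element, translate directly into regular expressions, with N3 read as any three bases. The Python sketch below scans one strand of a sequence for both; the function name `find_elements` and the toy promoter are assumptions for illustration, not part of the study.

```python
import re

# Consensus motifs as quoted in the abstract; N3 = any three bases.
# Lookaheads are used so that overlapping occurrences are all reported.
MOTIFS = {
    "SIE": re.compile(r"(?=(TTC[ACGT]{3}GAA))"),
    "AGG": re.compile(r"(?=(AGG[ACGT]{3}AGG))"),
}

def find_elements(seq):
    """Return (name, 0-based start, matched site) tuples sorted by position."""
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for m in pattern.finditer(seq):
            hits.append((name, m.start(), m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical toy promoter containing one site of each kind.
promoter = "GGTTCCCGGAATTAGGCTTAGGAC"
for hit in find_elements(promoter):
    print(hit)
```

The outer SIE half-sites (TTC and GAA) are reverse complements of each other, so a one-strand scan matches the same consensus on either strand. The AGG consensus is not symmetric, so a fuller scan would also search the reverse complement of the input.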
Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M
2017-03-27
Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W
1996-02-15
Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplasts of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic intron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N. plumbaginifolia by UV cross-linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular masses of 50 and 54 kDa and showed affinity for oligouridylates present in synGC introns or for poly(U).
Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W
1996-01-01
Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplasts of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic intron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N. plumbaginifolia by UV cross-linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular masses of 50 and 54 kDa and showed affinity for oligouridylates present in synGC introns or for poly(U). PMID:8604302
Identification of G-quadruplex forming sequences in three manatee papillomaviruses
Zahin, Maryam; Dean, William L.; Ghim, Shin-je; Joh, Joongho; Gray, Robert D.; Khanal, Sujita; Bossart, Gregory D.; Mignucci-Giannoni, Antonio A.; Rouchka, Eric C.; Jenson, Alfred B.; Trent, John O.; Chaires, Jonathan B.
2018-01-01
The Florida manatee (Trichechus manatus latirostris) is a threatened aquatic mammal in United States coastal waters. Over the past decade, the appearance of papillomavirus-induced lesions and viral papillomatosis in manatees has been a concern for those involved in the management and rehabilitation of this species. To date, three manatee papillomaviruses (TmPVs) have been identified in Florida manatees, one forming cutaneous lesions (TmPV1) and two forming genital lesions (TmPV3 and TmPV4). We identified DNA sequences with the potential to form G-quadruplex structures (G4) across the three genomes. G4 were located on both DNA strands and across coding and non-coding regions on all TmPVs, offering multiple targets for viral control. Although G4 have been identified in several viral genomes, including human PVs, most research has focused on canonical structures comprised of three G-tetrads. In contrast, the vast majority of sequences we identified would allow the formation of non-canonical structures with only two G-tetrads. Our biophysical analysis confirmed the formation of G4 with parallel topology in three such sequences from the E2 region. Two of the structures appear comprised of multiple stacked two G-tetrad structures, perhaps serving to increase structural stability. Computational analysis demonstrated enrichment of G4 sequences on all TmPVs on the reverse strand in the E2/E4 region and on both strands in the L2 region. Several G4 sequences occurred at similar regional locations on all PVs, most notably on the reverse strand in the E2 region. In other cases, G4 were identified at similar regional locations only on PVs forming genital lesions. On all TmPVs, G4 sequences were located in the non-coding region near putative E2 binding sites. Together, these findings suggest that G4 are possible regulatory elements in TmPVs. PMID:29630682
Burke, W D; Calalang, C C; Eickbush, T H
1987-01-01
Two classes of DNA elements interrupt a fraction of the rRNA repeats of Bombyx mori. We have analyzed by genomic blotting and sequence analysis one class of these elements which we have named R2. These elements occupy approximately 9% of the rDNA units of B. mori and appear to be homologous to the type II rDNA insertions detected in Drosophila melanogaster. Approximately 25 copies of R2 exist within the B. mori genome, of which at least 20 are located at a precise location within otherwise typical rDNA units. Nucleotide sequence analysis has revealed that the 4.2-kilobase-pair R2 element has a single large open reading frame, occupying over 82% of the total length of the element. The central region of this 1,151-amino-acid open reading frame shows homology to the reverse transcriptase enzymes found in retroviruses and certain transposable elements. Amino acid homology of this region is highest to the mobile line 1 elements of mammals, followed by the mitochondrial type II introns of fungi, and the pol gene of retroviruses. Less homology exists with transposable elements of D. melanogaster and Saccharomyces cerevisiae. Two additional regions of sequence homology between L1 and R2 elements were also found outside the reverse transcriptase region. We suggest that the R2 elements are retrotransposons that are site specific in their insertion into the genome. Such mobility would enable these elements to occupy a small fraction of the rDNA units of B. mori despite their continual elimination from the rDNA locus by sequence turnover. PMID:2439905
Structure, replication efficiency and fragility of yeast ARS elements.
Dhar, Manoj K; Sehgal, Shelly; Kaul, Sanjana
2012-05-01
DNA replication in eukaryotes initiates at specific sites known as origins of replication, or replicators. These replication origins occur throughout the genome, though the propensity of their occurrence depends on the type of organism. In eukaryotes, zones of initiation of replication spanning from about 100 to 50,000 base pairs have been reported. The characteristics of eukaryotic replication origins are best understood in the budding yeast Saccharomyces cerevisiae, where some autonomously replicating sequences, or ARS elements, confer origin activity. ARS elements are short DNA sequences of a few hundred base pairs, identified by their efficiency at initiating a replication event when cloned in a plasmid. ARS elements, although structurally diverse, maintain a basic structure composed of three domains, A, B and C. Domain A is comprised of a consensus sequence designated ACS (ARS consensus sequence), while the B domain has the DNA unwinding element and the C domain is important for DNA-protein interactions. Although there are ∼400 ARS elements in the yeast genome, not all of them are active origins of replication. Different groups within the genus Saccharomyces have ARS elements as components of replication origin. The present paper provides a comprehensive review of various aspects of ARSs, starting from their structural conservation to sequence thermodynamics. All significant and conserved functional sequence motifs within different types of ARS elements have been extensively described. Issues like silencing at ARSs, their inherent fragility and factors governing their replication efficiency have also been addressed. Progress in understanding crucial components associated with the replication machinery and timing at these ARS elements is discussed in the section entitled "The replicon revisited". Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Soccorso, Giampiero; Puls, Florian; Richards, Cathy; Pringle, Howard; Nour, Shawqui
2009-01-01
We present a case of intestinal ganglioneuroma (GN) of the sigmoid colon in a 5-year-old girl, which caused intermittent colocolic intussusception. Ganglioneuromas are rare benign tumors of the autonomic nervous system composed of mature ganglion cells and satellite cells. Colonic GNs are uncommon. The unusual intramural proliferation of neural elements in this case resembled the diffuse intestinal ganglioneuromatosis, which is known to be associated with multiple endocrine neoplasia type 2B. However, the specific mutations of multiple endocrine neoplasia type 2B were not found by genetic sequencing. This is the first pediatric case described in the literature of a solitary polypoid GN presenting as a colocolic intussusception. We present a brief overview of intestinal ganglioneuromatous lesions and associated conditions.
Bryant, Kendall A.; Van Schooneveld, Trevor C.; Thapa, Ishwor; Bastola, Dhundy; Williams, Laurina O.; Safranek, Thomas J.; Hinrichs, Steven H.; Rupp, Mark E.
2013-01-01
We describe the transfer of blaKPC-4 from Enterobacter cloacae to Serratia marcescens in a single patient. DNA sequencing revealed that KPC-4 was encoded on an IncL/M plasmid, pNE1280, closely related to pCTX-M360. Further analysis found that KPC-4 was encoded within a novel Tn4401 element (Tn4401f) containing a truncated tnpA and lacking tnpR, ISKpn7 left, and Tn4401 IRL-1, which are conserved in other Tn4401 transposons. This study highlights the continued evolution of Tn4401 transposons and movement to multiple plasmid backbones that results in acquisition by multiple species of Gram-negative bacilli. PMID:23070154
Miras, Manuel; Rodríguez-Hernández, Ana M; Romero-López, Cristina; Berzal-Herranz, Alfredo; Colchero, Jaime; Aranda, Miguel A; Truniger, Verónica
2018-01-01
In eukaryotes, the formation of a 5'-cap and 3'-poly(A) dependent protein-protein bridge is required for translation of mRNAs. In contrast, several plant virus RNA genomes lack both of these mRNA features, but instead have a 3'-CITE (for cap-independent translation enhancer), an RNA element present in their 3'-untranslated region that recruits translation initiation factors and is able to control its cap-independent translation. For several 3'-CITEs, direct RNA-RNA long-distance interactions based on sequence complementarity between the 5'- and 3'-ends are required for efficient translation, as they bring the translation initiation factors bound to the 3'-CITE to the 5'-end. For the carmovirus melon necrotic spot virus (MNSV), a 3'-CITE has been identified, and the presence of its 5'-end in cis has been shown to be required for its activity. Here, we analyze the secondary structure of the 5'-end of the MNSV RNA genome and identify two highly conserved nucleotide sequence stretches that are complementary to the apical loop of its 3'-CITE. In in vivo cap-independent translation assays with mutant constructs, by disrupting and restoring sequence complementarity, we show that the interaction between the 3'-CITE and at least one complementary sequence in the 5'-end is essential for virus RNA translation, although efficient virus translation and multiplication require both connections. The complementary sequence stretches are invariant in all MNSV isolates, suggesting that the dual 5'-3' RNA:RNA interactions are required for optimal MNSV cap-independent translation and multiplication.
Durand-Dubief, Mickaël; Absalon, Sabrina; Menzer, Linda; Ngwabyt, Sandra; Ersfeld, Klaus; Bastin, Philippe
2007-12-01
The protist Trypanosoma brucei possesses a single Argonaute gene called TbAGO1 that is necessary for RNAi silencing. We previously showed that in strain 427, TbAGO1 knock-out leads to a slow growth phenotype and to chromosome segregation defects. Here we report that the slow growth phenotype is linked to defects in segregation of both large and mini-chromosome populations, with large chromosomes being the most affected. These phenotypes are completely reversed upon inducible re-expression of TbAGO1 fused to GFP, demonstrating their link with TbAGO1. Trypanosomes that do not express TbAGO1 show a general increase in the abundance of transcripts derived from the short retroposon RIME (Ribosomal Interspersed Mobile Element). Supplementary large RIME transcripts emerge in the absence of RNAi, a phenomenon coupled to the disappearance of short transcripts. These fluctuations are reversed by inducible expression of GFP::TbAGO1. Furthermore, we use a combination of Northern blots, RT-PCR and sequencing to reveal that RNAi controls expression of transcripts derived from RHS (Retrotransposon Hot Spot) pseudogenes (RHS genes with retro-element(s) integrated within their coding sequence). Absence of RNAi also leads to an increase of steady-state transcripts from regular RHS genes (those without retro-element), indicating a role for pseudogene in control of gene expression. However, analysis of retroposon abundance and arrangement in the genome of multiple clonal cell lines of TbAGO1-/- failed to reveal movement of mobile elements despite the increased amounts of retroposon transcripts.
An experimental and analytical investigation on the response of GR/EP composite I-frames
NASA Technical Reports Server (NTRS)
Moas, E., Jr.; Boitnott, R. L.; Griffin, O. H., Jr.
1991-01-01
Six-foot diameter, semicircular graphite/epoxy specimens representative of generic aircraft frames were loaded quasi-statically to determine their load response and failure mechanisms for large deflections that occur in an airplane crash. These frame-skin specimens consisted of a cylindrical skin section cocured with a semicircular I-frame. Various frame laminate stacking sequences and geometries were evaluated by statically loading the specimen until multiple failures occurred. Two analytical methods were compared for modeling the frame-skin specimens: a two-dimensional branched-shell finite element analysis and a one-dimensional, closed-form, curved beam solution derived using an energy method. Excellent correlation was obtained between experimental results and the finite element predictions of the linear response of the frames prior to the initial failure. The beam solution was used for rapid parameter and design studies, and was found to be stiff in comparison with the finite element analysis. The specimens were found to be useful for evaluating composite frame designs.
Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures
Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Park, Seung-Won; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J.; Robson, Bryanne E.; Aerts, Stein; van Helden, Jacques; Hassan, Bassem; Gilbert, Donald G.; Eastman, Deborah A.; Rice, Michael; Weir, Michael; Hahn, Matthew W.; Park, Yongkyu; Dewey, Colin N.; Pachter, Lior; Kent, W. James; Haussler, David; Lai, Eric C.; Bartel, David P.; Hannon, Gregory J.; Kaufman, Thomas C.; Eisen, Michael B.; Clark, Andrew G.; Smith, Douglas; Celniker, Susan E.; Gelbart, William M.; Kellis, Manolis
2008-01-01
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies. PMID:17994088
Myers, Katie N; Barone, Giancarlo; Ganesh, Anil; Staples, Christopher J; Howard, Anna E; Beveridge, Ryan D; Maslen, Sarah; Skehel, J Mark; Collis, Spencer J
2016-10-14
It was recently discovered that vertebrate genomes contain multiple endogenised nucleotide sequences derived from the non-retroviral RNA bornavirus. Strikingly, some of these elements have been evolutionarily maintained as open reading frames in host genomes for over 40 million years, suggesting that some endogenised bornavirus-derived elements (EBL) might encode functional proteins. EBLN1 is one such element established through endogenisation of the bornavirus N gene (BDV N). Here, we functionally characterise human EBLN1 as a novel regulator of genome stability. Cells depleted of human EBLN1 accumulate DNA damage both under non-stressed conditions and following exogenously induced DNA damage. EBLN1-depleted cells also exhibit cell cycle abnormalities and defects in microtubule organisation as well as premature centrosome splitting, which we attribute, in part, to improper localisation of the nuclear envelope protein TPR. Our data therefore reveal that human EBLN1 possesses important cellular functions within human cells, and suggest that other EBLs present within vertebrate genomes may also possess important cellular functions.
MANGO: a new approach to multiple sequence alignment.
Zhang, Zefeng; Lin, Hao; Li, Ming
2007-01-01
Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
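The difference between a spaced seed and a consecutive k-mer, as contrasted in the abstract above, can be illustrated with a minimal sketch. It is not MANGO's algorithm: the function name `spaced_seed_hits`, the seed pattern `1101`, and the toy sequences are assumptions chosen for illustration only. A `1` in the pattern marks a position that must match; a `0` marks a "don't care" position.

```python
def spaced_seed_hits(a, b, pattern):
    """Return (i, j) pairs where windows a[i:] and b[j:] agree at every '1'
    position of the seed pattern (hypothetical helper, illustration only)."""
    care = [k for k, c in enumerate(pattern) if c == "1"]
    span = len(pattern)

    # Index every window of `a` by the letters at the "care" positions.
    index = {}
    for i in range(len(a) - span + 1):
        key = "".join(a[i + k] for k in care)
        index.setdefault(key, []).append(i)

    # Look up every window of `b` in that index.
    hits = []
    for j in range(len(b) - span + 1):
        key = "".join(b[j + k] for k in care)
        for i in index.get(key, []):
            hits.append((i, j))
    return hits


# The two toy sequences differ at one position (G vs T).
print(spaced_seed_hits("ACGTAC", "ACTTAC", "1101"))  # spaced seed
print(spaced_seed_hits("ACGTAC", "ACTTAC", "1111"))  # consecutive 4-mer
```

In this toy case the spaced seed tolerates the single mismatch that defeats the consecutive 4-mer of the same span, which is the intuition behind the sensitivity claim; how MANGO chooses and combines its optimized seeds is not described in the abstract.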
Horta-Valerdi, Guillermo; Sanchez-Alonso, Maria Patricia; Perez-Marquez, Victor M; Negrete-Abascal, Erasmo; Vaca-Pacheco, Sergio; Hernandez-Gonzalez, Ismael; Gomez-Lunar, Zulema; Olmedo-Álvarez, Gabriela; Vázquez-Cruz, Candelario
2017-04-13
The draft genome sequence of Avibacterium paragallinarum strain CL serovar C is reported here. The genome comprises 154 contigs corresponding to 2.4 Mb with 41% G+C content and many insertion sequence (IS) elements, a characteristic not previously reported in A. paragallinarum. Copyright © 2017 Horta-Valerdi et al.
A retrotransposable element from the mosquito Anopheles gambiae.
Besansky, N J
1990-01-01
A family of middle repetitive elements from the African malaria vector Anopheles gambiae is described. Approximately 100 copies of the element, designated T1Ag, are dispersed in the genome. Full-length elements are 4.6 kilobase pairs in length, but truncation of the 5' end is common. Nucleotide sequences of one full-length, two 5'-truncated, and two 5' ends of T1Ag elements were determined and aligned to define a consensus sequence. Sequence analysis revealed two long, overlapping open reading frames followed by a polyadenylation signal, AATAAA, and a tail consisting of tandem repetitions of the motif TGAAA. No direct or inverted long terminal repeats (LTRs) were detected. The first open reading frame, 442 amino acids in length, includes a domain resembling that of nucleic acid-binding proteins. The second open reading frame, 975 amino acids long, resembles the reverse transcriptases of a category of retrotransposable elements without LTRs, variously termed class II retrotransposons, class III elements or non-LTR retrotransposons. Similarity at the sequence and structural levels places T1Ag in this category. PMID:1689457
Unusually long-lived pause required for regulation of a Rho-dependent transcription terminator
Hollands, Kerry; Sevostiyanova, Anastasia; Groisman, Eduardo A.
2014-01-01
Up to half of all transcription termination events in bacteria rely on the RNA-dependent helicase Rho. However, the nucleic acid sequences that promote Rho-dependent termination remain poorly characterized. Defining the molecular determinants that confer Rho-dependent termination is especially important for understanding how such terminators can be regulated in response to specific signals. Here, we identify an extraordinarily long-lived pause at the site where Rho terminates transcription in the 5′-leader region of the Mg2+ transporter gene mgtA in Salmonella enterica. We dissect the sequence elements required for prolonged pausing in the mgtA leader and establish that the remarkable longevity of this pause is required for a riboswitch to stimulate Rho-dependent termination in the mgtA leader region in response to Mg2+ availability. Unlike Rho-dependent terminators described previously, where termination occurs at multiple pause sites, there is a single site of transcription termination directed by Rho in the mgtA leader. Our data suggest that Rho-dependent termination events that are subject to regulation may require elements distinct from those operating at constitutive Rho-dependent terminators. PMID:24778260
Gbadegesin, M A; Beeching, J R
2011-06-07
Cassava can be cultivated on impoverished soils with minimum inputs, and its storage roots are a staple food for millions in Africa. However, these roots are low in bioavailable nutrients and in protein content, contain cyanogenic glycosides, and suffer from a very short post-harvest shelf-life, and the plant is susceptible to viral and bacterial diseases prevalent in Africa. The demand for improvement of cassava with respect to these traits comes from both farmers and national agricultural institutions. Genetic improvement of cassava cultivars by molecular biology techniques requires the availability of appropriate genes, a system to introduce these genes into cassava, and the use of suitable gene promoters. Cassava root-specific promoter for auxin-repressed protein was isolated using the gene walking approach, starting with a cDNA sequence. In silico analysis of promoter sequences revealed putative cis-acting regulatory elements, including root-specific elements, which may be required for gene expression in vascular tissues. Research on the activities of this promoter is continuing, with the development of plant expression cassettes for transformation into major African elite lines and farmers' preferred cassava cultivars to enable testing of tissue-specific expression patterns in the field.
Gene organization and alternative splicing of human prohormone convertase PC8.
Goodge, K A; Thomas, R J; Martin, T J; Gillespie, M T
1998-01-01
The mammalian Ca2+-dependent serine protease prohormone convertase PC8 is expressed ubiquitously, being transcribed as 3.5, 4.3 and 6.0 kb mRNA isoforms in various tissues. To determine the origin of these various mRNA isoforms we report the characterization of the human PC8 gene, which has been previously localized to chromosome 11q23-24. Consisting of 16 exons, the human PC8 gene spans approx. 27 kb. A comparison of the position of intron-exon junctions of the human PC8 gene with the gene structures of previously reported prohormone convertase genes demonstrated a divergence of the human PC8 from the highly conserved nature of the gene organization of this enzyme family. The nucleotide sequence of the 5'-flanking region of the human PC8 is reported and possesses putative promoter elements characteristic of a GC-rich promoter. Further supporting the potential role of a GC-rich promoter element, multiple transcriptional initiation sites within a 200 bp region were demonstrated. We propose that the various mRNA isoforms of PC8 result from the inclusion of intronic sequences within transcripts. PMID:9820811
Description of the PMAD DC test bed architecture and integration sequence
NASA Technical Reports Server (NTRS)
Beach, R. F.; Trash, L.; Fong, D.; Bolerjack, B.
1991-01-01
NASA-Lewis is responsible for the development, fabrication, and assembly of the electric power system (EPS) for the Space Station Freedom (SSF). The SSF power system is radically different from previous spacecraft power systems in both the size and complexity of the system. Unlike past spacecraft power systems, the SSF EPS will grow and be maintained on orbit and must be flexible to meet changing user power needs. The SSF power system is also unique in comparison with terrestrial power systems because it is dominated by power electronic converters which regulate and control the power. Although spacecraft historically have used power converters for regulation, they typically involved only a single series regulating element. The SSF EPS involves multiple regulating elements, two or more in series, prior to the load. These unique system features required the construction of a testbed which would allow the development of spacecraft power system technology. A description is provided of the Power Management and Distribution (PMAD) DC Testbed which was assembled to support the design and early evaluation of the SSF EPS. A description of the integration process used in the assembly sequence is also given along with a description of the support facility.
Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M
1996-08-01
DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
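Counting copies of a short repeat unit such as TTTAGGG, and finding the longest uninterrupted tandem run, can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function name `tandem_stats` and the toy sequence are assumptions, and only exact copies are counted, whereas the abstract notes that the real elements are highly mutated.

```python
import re


def tandem_stats(seq, unit):
    """Return (exact copies of `unit`, length in copies of the longest
    uninterrupted tandem run). Hypothetical helper for illustration."""
    copies = seq.count(unit)  # non-overlapping exact matches
    runs = re.finditer("(?:%s)+" % re.escape(unit), seq)
    longest = max((len(m.group(0)) // len(unit) for m in runs), default=0)
    return copies, longest


# Toy sequence: three tandem copies, a spacer, then one isolated copy.
toy = "AC" + "TTTAGGG" * 3 + "CA" + "TTTAGGG" + "A"
print(tandem_stats(toy, "TTTAGGG"))
```

Handling mutated copies would need approximate matching, which this sketch deliberately leaves out.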
Context-dependent control of alternative splicing by RNA-binding proteins
Fu, Xiang-Dong; Ares, Manuel
2015-01-01
Sequence-specific RNA-binding proteins (RBPs) bind to pre-mRNA to control alternative splicing, but it is not yet possible to read the ‘splicing code’ that dictates splicing regulation on the basis of genome sequence. Each alternative splicing event is controlled by multiple RBPs, the combined action of which creates a distribution of alternatively spliced products in a given cell type. As each cell type expresses a distinct array of RBPs, the interpretation of regulatory information on a given RNA target is exceedingly dependent on the cell type. RBPs also control each other’s functions at many levels, including by mutual modulation of their binding activities on specific regulatory RNA elements. In this Review, we describe some of the emerging rules that govern the highly context-dependent and combinatorial nature of alternative splicing regulation. PMID:25112293
Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop.
Hatakeyama, Masaomi; Aluri, Sirisha; Balachadran, Mathi Thumilan; Sivarajan, Sajeevan Radha; Patrignani, Andrea; Grüter, Simon; Poveda, Lucy; Shimizu-Inatsugi, Rie; Baeten, John; Francoijs, Kees-Jan; Nataraja, Karaba N; Reddy, Yellodu A Nanja; Phadnis, Shamprasad; Ravikumar, Ramapura L; Schlapbach, Ralph; Sreeman, Sheshshayee M; Shimizu, Kentaro K
2017-09-05
Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate changes. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered as an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which hampers genome assembly compared with a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
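The N50 statistic quoted in the abstract above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch follows; the function name `n50` and the toy lengths are illustrative assumptions, not the finger millet data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example: total length 20; the two longest scaffolds (6 + 5) reach half.
print(n50([2, 3, 4, 5, 6]))
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why the abstract reports ortholog detection alongside it.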
Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C
2016-01-01
Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.
Two cis elements collaborate to spatially repress transcription from a sea urchin promoter
NASA Technical Reports Server (NTRS)
Frudakis, T. N.; Wilt, F.
1995-01-01
The expression pattern of many territory-specific genes in metazoan embryos is maintained by an active process of negative spatial regulation. However, the mechanism of this strategy of gene regulation is not well understood in any system. Here we show that reporter constructs containing regulatory sequence for the SM30-alpha gene of Strongylocentrotus purpuratus are expressed in a pattern congruent with that of the endogenous SM30 gene(s), largely as a result of active transcriptional repression in cell lineages in which the gene is not normally expressed. Chloramphenicol acetyl transferase assays of deletion constructs from the 2600-bp upstream region showed that repressive elements were present in the region from -1628 to -300. In situ hybridization analysis showed that the spatial fidelity of expression was severely compromised when the region from -1628 to -300 was deleted. Two highly repetitive sequence motifs, (G/A/C)CCCCT and (T/C)(T/A/C)CTTTT(T/A/C), are present in the -1628 to -300 region. Representatives of these elements were analyzed by gel mobility shift experiments and were found to interact specifically with protein in crude nuclear extracts. When oligonucleotides containing either sequence element were co-injected with a correctly regulated reporter as potential competitors, the reporter was expressed in inappropriate cells. When composite oligonucleotides, containing both sequence elements, were fused to a misregulated reporter, the expression of the reporter in inappropriate cells was suppressed. Comparison of composite oligonucleotides with oligonucleotides containing single constituent elements shows that both sequence elements are required for effective spatial regulation. Thus, both individual elements are required, but only a composite element containing both elements is sufficient to function as a tissue-specific repressive element.
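The two degenerate motifs quoted in this abstract map directly onto regular-expression character classes. The following is a minimal sketch of such a scan, not the authors' method: the function name `find_motifs` and the test sequence are hypothetical, and only the forward strand is searched.

```python
import re

# The two motifs from the abstract, written as regex character classes.
# A lookahead is used so that overlapping occurrences are all reported.
MOTIFS = {
    "(G/A/C)CCCCT": re.compile(r"(?=([GAC]CCCCT))"),
    "(T/C)(T/A/C)CTTTT(T/A/C)": re.compile(r"(?=([TC][TAC]CTTTT[TAC]))"),
}


def find_motifs(seq):
    """Return {motif: [(0-based start, matched text), ...]} for the forward strand."""
    seq = seq.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, not taken from the SM30-alpha upstream region.
example = "AAGCCCCTAATTCTTTTAGG"
hits = find_motifs(example)
for name, found in hits.items():
    print(name, found)
```

Scanning the reverse complement as well would be needed to cover both strands; that step is omitted here to keep the sketch short.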
FASMA: a service to format and analyze sequences in multiple alignments.
Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M
2007-12-01
Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou
2016-01-01
Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region.
Taken together, this article highlights the high suitability of contemporary sequencing methods in future analyses of human biology in relation to evolutionary acquired retroviruses in the human genome. © 2016 APMIS. Published by John Wiley & Sons Ltd.
Tanaka, Mizuki; Sakai, Yoshifumi; Yamada, Osamu; Shintani, Takahiro; Gomi, Katsuya
2011-01-01
To investigate 3′-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3′-untranslated region (3′ UTR) plus 100 nucleotides (nt) sequence downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3′ UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3′ UTR and 100 nt sequence downstream of the poly(A) site is notably U-rich, while the region located 15–30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, a mammalian polyadenylation signal. We identified that putative 3′-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprised four sequence elements: the furthest upstream U-rich element, A-rich sequence, cleavage site, and downstream U-rich element flanking the cleavage site. Although these putative 3′-end-processing signals are similar to those in yeast and plants, some notable differences exist between them. PMID:21586533
2013-01-01
Background Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. Results In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Conclusions Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome. PMID:23374229
Marzo, Mar; Bello, Xabier; Puig, Marta; Maside, Xulio; Ruiz, Alfredo
2013-02-04
Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome.
Influence of gag and RRE Sequences on HIV-1 RNA Packaging Signal Structure and Function.
Kharytonchyk, Siarhei; Brown, Joshua D; Stilger, Krista; Yasin, Saif; Iyer, Aishwarya S; Collins, John; Summers, Michael F; Telesnitsky, Alice
2018-07-06
The packaging signal (Ψ) and Rev-responsive element (RRE) enable unspliced HIV-1 RNAs' export from the nucleus and packaging into virions. For some retroviruses, engrafting Ψ onto a heterologous RNA is sufficient to direct encapsidation. In contrast, HIV-1 RNA packaging requires 5' leader Ψ elements plus poorly defined additional features. We previously defined minimal 5' leader sequences competitive with intact Ψ for HIV-1 packaging, and here examined the potential roles of additional downstream elements. The findings confirmed that together, HIV-1 5' leader Ψ sequences plus a nuclear export element are sufficient to specify packaging. However, RNAs trafficked using a heterologous export element did not compete well with RNAs using HIV-1's RRE. Furthermore, some RNA additions to well-packaged minimal vectors rendered them packaging-defective. These defects were rescued by extending gag sequences in their native context. To understand these packaging defects' causes, in vitro dimerization properties of RNAs containing minimal packaging elements were compared to RNAs with sequence extensions that were or were not compatible with packaging. In vitro dimerization was found to correlate with packaging phenotypes, suggesting that HIV-1 evolved to prevent 5' leader residues' base pairing with downstream residues and misfolding of the packaging signal. Our findings explain why gag sequences have been implicated in packaging and show that RRE's packaging contributions appear more specific than nuclear export alone. Paired with recent work showing that sequences upstream of Ψ can dictate RNA folds, the current work explains how genetic context of minimal packaging elements contributes to HIV-1 RNA fate determination. Copyright © 2018 Elsevier Ltd. All rights reserved.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome
2013-01-01
Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
Delimiting regulatory sequences of the Drosophila melanogaster Ddc gene.
Hirsh, J; Morgan, B A; Scholnick, S B
1986-01-01
We delimited sequences necessary for in vivo expression of the Drosophila melanogaster dopa decarboxylase gene Ddc. The expression of in vitro-altered genes was assayed following germ line integration via P-element vectors. Sequences between -209 and -24 were necessary for normally regulated expression, although genes lacking these sequences could be expressed at 10 to 50% of wild-type levels at specific developmental times. These genes showed components of normal developmental expression, which suggests that they retain some regulatory elements. All Ddc genes lacking the normal immediate 5'-flanking sequences were grossly deficient in larval central nervous system expression. Thus, this upstream region must contain at least one element necessary for this expression. A mutated Ddc gene without a normal TATA boxlike sequence used the normal RNA start points, indicating that this sequence is not required for start point specificity. PMID:3099170
Mandl, C W; Holzmann, H; Kunz, C; Heinz, F X
1993-05-01
The complete nucleotide sequence of the positive-stranded RNA genome of the tick-borne flavivirus Powassan (10,839 nucleotides) was elucidated and the amino acid sequence of all viral proteins was derived. Based on this sequence as well as serological data, Powassan virus represents the most divergent member of the tick-borne serocomplex within the genus Flavivirus, family Flaviviridae. The primary nucleotide sequence and potential RNA secondary structures of the Powassan virus genome as well as the protein sequences and the reactivities of the virion with a panel of monoclonal antibodies were compared to other tick-borne and mosquito-borne flaviviruses. These analyses corroborated significant differences between tick-borne and mosquito-borne flaviviruses, but also emphasized structural elements that are conserved among both vector groups. The comparisons among tick-borne flaviviruses revealed conserved sequence elements that might represent important determinants of the tick-borne flavivirus phenotype.
Identification and characterization of cell-specific enhancer elements for the mouse ETF/Tead2 gene.
Tanoue, Y; Yasunami, M; Suzuki, K; Ohkubo, H
2001-12-21
We have identified and characterized by transient transfection assays the cell-specific 117-bp enhancer sequence in the first intron of the mouse ETF (Embryonic TEA domain-containing factor)/Tead2 gene required for transcriptional activation in ETF/Tead2 gene-expressing cells, such as P19 cells. The 117-bp enhancer contains one GC-rich sequence (5'-GGGGCGGGG-3'), termed the GC box, and two tandemly repeated GA-rich sequences (5'-GGGGGAGGGG-3'), termed the proximal and distal GA elements. Further analyses, including transfection studies and electrophoretic mobility shift assays using a series of deletion and mutation constructs, indicated that Sp1, a putative activator, may be required to predominate over its competition with another unknown putative repressor, termed the GA element-binding factor, for binding to both the GC box, which overlapped with the proximal GA element, and the distal GA element in the 117-bp sequence in order to achieve a full enhancer activity. We also discuss a possible mechanism underlying the cell-specific enhancer activity of the 117-bp sequence.
SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments
Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric
2014-01-01
This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831
Identifying micro-inversions using high-throughput sequencing reads.
He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu
2016-01-11
The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are an important type of genomic variation and may play roles in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads. The algorithm of MID is designed based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp. To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which is suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from Cancer Cell Line Encyclopedia (CCLE). Therefore our tool is expected to be useful to improve the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID.
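To make the notion of a micro-inversion concrete: on double-stranded DNA, an inverted segment appears in a read as the reverse complement of the reference segment. The sketch below is a brute-force illustration of that definition only. It is not MID's dynamic programming path-finding algorithm, it assumes the read and the reference window have equal length and differ by a single inversion, and all function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def apply_inversion(ref, start, end):
    """Return ref with the half-open interval [start, end) inverted."""
    return ref[:start] + reverse_complement(ref[start:end]) + ref[end:]


def find_single_inversion(ref, read):
    """Return the shortest (start, end) whose inversion turns ref into read,
    or None if no single inversion explains the read.

    Brute force over intervals that cover every mismatching position.
    """
    if len(ref) != len(read) or ref == read:
        return None
    diffs = [i for i, (a, b) in enumerate(zip(ref, read)) if a != b]
    lo, hi = diffs[0], diffs[-1] + 1
    candidates = [
        (e - s, s, e)
        for s in range(lo + 1)
        for e in range(hi, len(ref) + 1)
        if apply_inversion(ref, s, e) == read
    ]
    if not candidates:
        return None
    _, s, e = min(candidates)
    return s, e


# Hypothetical toy example: invert positions 3..7 of the reference.
ref = "GGGAACTGCCC"
read = apply_inversion(ref, 3, 8)
print(read, find_single_inversion(ref, read))
```

In this toy case wider intervals such as (2, 9) reproduce the same read because the flanking bases are complementary, so the sketch reports the shortest explanation. Breakpoint ambiguity of this kind is one reason real detectors need more careful scoring.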
Li, Ruichao; Xie, Miaomiao; Lv, Jingzhang; Wai-Chi Chan, Edward; Chen, Sheng
2017-03-01
To investigate the genetic features of three plasmids recovered from an MCR-1 and ESBL-producing Escherichia coli strain, HYEC7, and characterize the transmission mechanism of mcr-1. The genetic profiles of three plasmids were determined by PCR, S1-PFGE, Southern hybridization and WGS analysis. The ability of the mcr-1-bearing plasmid to undergo conjugation was also assessed. The mcr-1-bearing transposon Tn6330 was characterized by PCR and DNA sequencing. Complete sequences of three plasmids were obtained. A non-conjugative phage P7-like plasmid, pHYEC7-mcr1, was found to harbour the mcr-1-bearing transposon Tn6330, which could be excised from the plasmid by generating a circular intermediate harbouring mcr-1 and the ISApl1 element. The insertion of the circular intermediate into another plasmid, pHYEC7-IncHI2, could form pHNSHP45-2, the original IncHI2-type mcr-1-carrying plasmid that was reported. The third plasmid, pHYEC7-110, harboured two replicons, IncX1 and IncFIB, and comprised multiple antimicrobial resistance mobile elements, some of which were shared by pHYEC7-IncHI2. The Tn6330 element located in the phage-like plasmid pHYEC7-mcr1 could be excised from the plasmid and formed a circular intermediate that could be integrated into plasmids containing the ISApl1 element. This phenomenon indicated that Tn6330 is a key element responsible for widespread dissemination of mcr-1 among various types of plasmids and bacterial chromosomes. The dissemination rate of such an element may be further enhanced upon translocation into phage-like vectors, which may also be transmitted via transduction events. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Palazzo, Antonio; Lovero, Domenica; D'Addabbo, Pietro; Caizzi, Ruggiero; Marsano, René Massimiliano
2016-01-01
Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the Bari transposon's evolutionary dynamics and increases our understanding of the Tc1-mariner elements' biology.
Surveying DNA Elements within Functional Genes of Heterocyst-Forming Cyanobacteria
Hilton, Jason A.; Meeks, John C.; Zehr, Jonathan P.
2016-01-01
Some cyanobacteria are capable of differentiating a variety of cell types in response to environmental factors. For instance, in low nitrogen conditions, some cyanobacteria form heterocysts, which are specialized for N2 fixation. Many heterocyst-forming cyanobacteria have DNA elements interrupting key N2 fixation genes, elements that are excised during heterocyst differentiation. While the mechanism for the excision of the element has been well-studied, many questions remain regarding the introduction of the elements into the cyanobacterial lineage and whether they have been retained ever since or have been lost and reintroduced. To examine the evolutionary relationships and possible function of DNA sequences that interrupt genes of heterocyst-forming cyanobacteria, we identified and compared 101 interruption element sequences within genes from 38 heterocyst-forming cyanobacterial genomes. The interruption element lengths ranged from about 1 kb (the minimum able to encode the recombinase responsible for element excision), up to nearly 1 Mb. The recombinase gene sequences served as genetic markers that were common across the interruption elements and were used to track element evolution. Elements were found that interrupted 22 different orthologs, only five of which had been previously observed to be interrupted by an element. Most of the newly identified interrupted orthologs encode proteins that have been shown to have heterocyst-specific activity. However, the presence of interruption elements within genes with no known role in N2 fixation, as well as in three non-heterocyst-forming cyanobacteria, indicates that the processes that trigger the excision of elements may not be limited to heterocyst development or that the elements move randomly within genomes. 
This comprehensive analysis provides the framework to study the history and behavior of these unique sequences, and offers new insight regarding the frequency and persistence of interruption elements in heterocyst-forming cyanobacteria. PMID:27206019
Genomic Organization of the Drosophila Telomere Retrotransposable Elements
DOE Office of Scientific and Technical Information (OSTI.GOV)
George, J.A.; DeBaryshe, P.G.; Traverse, K.L.
2006-10-16
The emerging sequence of the heterochromatic portion of the Drosophila melanogaster genome, with the most recent update of euchromatic sequence, gives the first genome-wide view of the chromosomal distribution of the telomeric retrotransposons, HeT-A, TART, and Tahre. As expected, these elements are entirely excluded from euchromatin, although sequence fragments of HeT-A and TART 3' untranslated regions are found in nontelomeric heterochromatin on the Y chromosome. The proximal ends of HeT-A/TART arrays appear to be a transition zone because only here do other transposable elements mix in the array. The sharp distinction between the distribution of telomeric elements and that of other transposable elements suggests that chromatin structure is important in telomere element localization. Measurements reported here show (1) D. melanogaster telomeres are very long, in the size range reported for inbred mouse strains (averaging 46 kb per chromosome end in Drosophila stock 2057). As in organisms with telomerase, their length varies depending on genotype. There is also slight under-replication in polytene nuclei. (2) Surprisingly, the relationship between the number of HeT-A and TART elements is not stochastic but is strongly correlated across stocks, supporting the idea that the two elements are interdependent. Although currently assembled portions of the HeT-A/TART arrays are from the most-proximal part of long arrays, approximately 61% of the total HeT-A sequence in these regions consists of intact, potentially active elements with little evidence of sequence decay, making it likely that the content of the telomere arrays turns over more extensively than has been thought.
Multiplexed Fragaria chloroplast genome sequencing
W. Njuguna; A. Liston; R. Cronn; N.V. Bassil
2010-01-01
A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...
Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar
2002-02-01
The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5 to 4 × 10^5, and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.
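The figures in this abstract (element length, copy number, genome share) can be cross-checked with simple arithmetic. The sketch below does so under one loudly stated assumption: a haploid genome size of about 3 × 10^9 bp, which is not given in the abstract and is used here only as a round illustrative value. The function name `genome_fraction` is hypothetical.

```python
def genome_fraction(element_bp, copies, genome_bp):
    """Fraction of the genome occupied by `copies` elements of `element_bp` each."""
    return element_bp * copies / genome_bp


# ASSUMPTION: haploid genome size of ~3e9 bp (not stated in the abstract).
GENOME_BP = 3.0e9

low = genome_fraction(500, 2e5, GENOME_BP)   # lower copy-number bound
high = genome_fraction(500, 4e5, GENOME_BP)  # upper copy-number bound
print(f"{low:.1%} to {high:.1%}")
```

Under that assumed genome size the estimate brackets the "about 4%" stated in the abstract, so the reported length, copy number and genome share are mutually consistent to within rounding.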
Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun
2013-08-28
The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, including the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exons' expression level and expression variability. We also investigated the biological implications of an assembly of highly-expressed TE exons. Our analysis confirmed prior studies from the following four aspects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3' (5') untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs.
Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts is also true for exonized TEs. Furthermore, our study resulted in several novel findings. They include: (1) for the TE exons with non-zero expression and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to their lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3'UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
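The RPKM measure named in this abstract can be illustrated with a minimal sketch. It shows only the standard RPKM normalization (reads per kilobase of exon per million mapped reads); the additional "rescaling" step the authors apply is not described in the abstract and is not reproduced here. The function name and the example numbers are hypothetical.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of exon per million mapped reads.

    This is the textbook normalization only; the study's "rescaled RPKM"
    adds a second normalization that the abstract does not specify.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical exon: 500 reads on a 1,000 bp exon in a 10-million-read library.
example_value = rpkm(500, 1000, 10_000_000)
```

Dividing by both exon length and library size is what makes values comparable across exons of different lengths and samples of different sequencing depth.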
Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou
2008-04-01
Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.
Embedding strategies for effective use of information from multiple sequence alignments.
Henikoff, S.; Henikoff, J. G.
1997-01-01
We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Löpprich, Martin; Krauss, Felix; Ganzinger, Matthias; Senghas, Karsten; Riezler, Stefan; Knaup, Petra
2016-08-05
In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze whether it is possible to make the manual documentation process more efficient by using methods of natural language processing for multiclass classification of free-text diagnostic reports to automatically document the diagnosis and state of disease of myeloma patients. The first objective was to create a corpus consisting of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, and its manual annotation of relevant data elements by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. The main diagnoses paragraph was extracted from the clinical reports of a randomly selected third of the patients in the multiple myeloma research database from Heidelberg University Hospital (in total 737 selected patients). An EDC system was set up and two data entry specialists independently performed a manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total 15 different pipelines were examined and assessed by a ten-fold cross-validation, reiterated 100 times. For quality indication the average error rate and the average F1-score were calculated. For significance testing the approximate randomization test was used.
The created annotated corpus consists of 737 different diagnoses paragraphs with a total number of 865 coded diagnoses. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However, the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even if the F1-score decreased only slightly. The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NLP methods. However, it was also shown that there is no best practice for an automatic classification of data elements from free-text diagnostic reports.
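The approximate randomization test mentioned for significance testing can be sketched as follows. This is a generic paired version, assuming per-item scores for two pipelines and the absolute difference in means as the test statistic; the abstract does not state which statistic or how many shuffles the authors used, so the function name, `trials`, and `seed` are illustrative assumptions.

```python
import random


def approximate_randomization(scores_a, scores_b, trials=2000, seed=0):
    """Paired approximate randomization test on the mean score difference.

    For each trial, the two systems' scores are swapped item by item with
    probability 0.5, and the shuffled mean difference is compared with the
    observed one. Returns an estimated two-sided p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same items")
    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n
    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a  # randomly reassign the pair to the other system
            diff += a - b
        if abs(diff) / n >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small returned value suggests that the observed difference between two pipelines would rarely arise if their outputs were interchangeable.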
Huang, Xi; Duan, Min; Liao, Jiakai; Yuan, Xi; Chen, Hui; Feng, Jiejie; Huang, Ji; Zhang, Hong-Sheng
2014-01-01
Homeodomain-leucine zipper type I (HD-Zip I) proteins are involved in the regulation of plant development and response to environmental stresses. In this study, OsSLI1 (Oryza sativa stress largely induced 1), encoding a member of the HD-Zip I subfamily, was isolated from rice. The expression of OsSLI1 was dramatically induced by multiple abiotic stresses and exogenous abscisic acid (ABA). In silico sequence analysis discovered several cis-acting elements including multiple ABREs (ABA-responsive elements) in the upstream promoter region of OsSLI1. The OsSLI1-GFP fusion protein was localized in the nucleus of rice protoplast cells and the transcriptional activity of OsSLI1 was confirmed by the yeast hybrid system. Further, it was found that OsSLI1 expression was enhanced in an ABI5-Like1 (ABL1) deficiency rice mutant abl1 under stress conditions, suggesting that ABL1 probably negatively regulates OsSLI1 gene expression. Moreover, it was found that OsSLI1 was regulated in panicle development. Taken together, OsSLI1 may be a transcriptional activator regulating stress-responsive gene expression and panicle development in rice.
Stochastic nature of Landsat MSS data
NASA Technical Reports Server (NTRS)
Labovitz, M. L.; Masuoka, E. J.
1987-01-01
A multiple series generalization of the ARIMA models is used to model Landsat MSS scan lines as sequences of vectors, each vector having four elements (bands). The purpose of this work is to investigate whether Landsat scan lines can be described by a general multiple series linear stochastic model and whether the coefficients of such a model vary as a function of satellite system and target attributes. To accomplish this objective, an exploratory experimental design was set up incorporating six factors, four representing target attributes - location, cloud cover, row (within location), and column (within location) - and two factors representing system attributes - satellite number and detector bank. Each factor was included in the design at two levels and, with two replicates per treatment, 128 scan lines were analyzed. The results of the analysis suggest that a multiple AR(4) model is an adequate representation across all scan lines. Furthermore, the coefficients of the AR(4) model vary with location, particularly changes in physiography (slope regimes), and with percent cloud cover, but are insensitive to changes in system attributes.
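To make the "multiple AR(4)" idea concrete, here is a minimal sketch of a one-step prediction from a vector autoregressive model, x_t = A_1 x_{t-1} + ... + A_p x_{t-p} + noise, where each x_t would be the four-band pixel vector. The coefficient matrices and data below are made up for illustration (and use two bands and two lags for brevity); the abstract does not report fitted coefficients, and `var_predict` is a hypothetical helper name.

```python
def var_predict(history, coeffs):
    """One-step prediction from a vector autoregressive (VAR) model.

    history: list of observation vectors, most recent last.
    coeffs:  list of p square matrices [A_1, ..., A_p]; A_k multiplies the
             observation k steps back. For the abstract's model, p = 4 and
             each vector has four elements (one per MSS band).
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    dim = len(history[-1])
    prediction = [0.0] * dim
    for lag, matrix in enumerate(coeffs, start=1):
        past = history[-lag]
        for i in range(dim):
            prediction[i] += sum(matrix[i][j] * past[j] for j in range(dim))
    return prediction


# Illustrative two-band, two-lag example with made-up coefficients.
A1 = [[0.5, 0.0], [0.0, 0.5]]
A2 = [[0.25, 0.0], [0.0, 0.25]]
next_pixel = var_predict([[4.0, 8.0], [2.0, 6.0]], [A1, A2])
```

Off-diagonal entries in the matrices would let one band's past values inform another band's prediction, which is what distinguishes a multiple-series model from four independent scalar AR models.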
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hiraiwa, Akikazu; Yamanaka, Katsuo; Kwok, W.W.
Although HLA genes have been shown to be associated with certain diseases, the basis for this association is unknown. Recent studies, however, have documented patterns of nucleotide sequence variation among some HLA genes associated with a particular disease. For rheumatoid arthritis, HLA genes in most patients have a shared nucleotide sequence encoding a key structural element of an HLA class II polypeptide; this sequence element is critical for the interaction of the HLA molecule with antigenic peptides and with responding T cells, suggestive of a direct role for this sequence element in disease susceptibility. The authors describe the serological and cellular immunologic characteristics encoded by this rheumatoid arthritis-associated sequence element. Site-directed mutagenesis of the DRB1 gene was used to define amino acids critical for antibody and T-cell recognition of this structural element, focusing on residues that distinguish the rheumatoid arthritis-associated alleles Dw4 and Dw14 from a closely related allele, Dw10, not associated with disease. Both the gain and loss of rheumatoid arthritis-associated epitopes were highly dependent on three residues within a discrete domain of the HLA-DR molecule. Recognition was most strongly influenced by the following amino acids (in order): 70 > 71 > 67. Some alloreactive T-cell clones were also influenced by amino acid variation in portions of the DR molecule lying outside the shared sequence element.
Liu, Dong; Zhu, Guoli; Tang, Wenqiao; Yang, Jinquan; Guo, Hongyi
2012-01-01
Short interspersed nucleotide elements (SINEs), a type of retrotransposon, are widely distributed in various genomes with multiple copies arranged in different orientations, and cause changes to genes and genomes during evolutionary history. This can provide the basis for determining genome diversity, genetic variation and molecular phylogeny, etc. SINE DNA is transcribed into RNA by polymerase III from an internal promoter, which is composed of two conserved boxes, box A and box B. Here we present an approach to isolate novel SINEs based on these promoter elements. Box A of a SINE is obtained via PCR with only one primer identical to box B (B-PCR). Box B and its downstream sequence are acquired by PCR with one primer corresponding to box A (A-PCR). The SINE clone produced by A-PCR is selected as a template to label a probe with biotin. The full-length SINEs are isolated from the genomic pool through complex capture using the biotinylated probe bound to magnetic particles. Using this approach, a novel SINE family, Cn-SINE, from the genomes of Coilia nasus, was isolated. The members are 180–360 bp long. Sequence homology suggests that Cn-SINEs evolved from a leucine tRNA gene. This is the first report of a tRNA(Leu)-related SINE obtained without the use of a genomic library or inverse PCR. These results provide new insights into the origin of SINEs. PMID:22408437
Chen, Huan; Je, Jihyun; Song, Chieun; Hwang, Jung Eun; Lim, Chae Oh
2012-09-01
The dehydration-responsive element-binding factor 2C (DREB2C) is a member of the CBF/DREB subfamily of proteins, which contains a single APETALA2/Ethylene responsive element-binding factor (AP2/ERF) domain. To identify the expression pattern of the DREB2C gene, which contains multiple transcription cis-regulatory elements in its promoter, an approximately 1.4 kb upstream DREB2C sequence was fused to the β-glucuronidase reporter gene (GUS) and the recombinant p1244 construct was transformed into Arabidopsis thaliana (L.) Heynh. The promoter of the gene directed prominent GUS activity in the vasculature in diverse young dividing tissues. Upon applying heat stress (HS), GUS staining was also enhanced in the vasculature of the growing tissues. Analysis of a series of 5'-deletions of the DREB2C promoter revealed that a proximal upstream sequence sufficient for the tissue-specific spatial and temporal induction of GUS expression by HS is localized in the promoter region between -204 and -34 bps relative to the transcriptional start site. Furthermore, electrophoretic mobility shift assay (EMSA) demonstrated that nuclear protein binding activities specific to a -120 to -32 bp promoter fragment increased after HS. These results indicate that the TATA-proximal region and some latent trans-acting factors may cooperate in HS-induced activation of the Arabidopsis DREB2C promoter. © 2012 Institute of Botany, Chinese Academy of Sciences.
ICEPmu1, an integrative conjugative element (ICE) of Pasteurella multocida: structure and transfer.
Michael, Geovana Brenner; Kadlec, Kristina; Sweeney, Michael T; Brzuszkiewicz, Elzbieta; Liesegang, Heiko; Daniel, Rolf; Murray, Robert W; Watts, Jeffrey L; Schwarz, Stefan
2012-01-01
Integrative and conjugative elements (ICEs) have not been detected in Pasteurella multocida. In this study the multiresistance ICEPmu1 from bovine P. multocida was analysed for its core genes and its ability to conjugatively transfer into strains of the same and different genera. ICEPmu1 was identified during whole genome sequencing. Coding sequences were predicted by bioinformatic tools and manually curated using the annotation software ERGO. Conjugation into P. multocida, Mannheimia haemolytica and Escherichia coli recipients was performed by mating assays. The presence of ICEPmu1 and its circular intermediate in the recipient strains was confirmed by PCR and sequence analysis. Integration sites were sequenced. Susceptibility testing of the ICEPmu1-carrying recipients was conducted by broth microdilution. The 82 214 bp ICEPmu1 harbours 88 genes. The core genes of ICEPmu1, which are involved in excision/integration and conjugative transfer, resemble those found in a 66 641 bp ICE from Histophilus somni. ICEPmu1 integrates into a tRNA(Leu) and is flanked by 13 bp direct repeats. It is able to conjugatively transfer to P. multocida, M. haemolytica and E. coli, where it also uses a tRNA(Leu) for integration and produces closely related 13 bp direct repeats. PCR assays and susceptibility testing confirmed the presence and the functional activity of the ICEPmu1-associated resistance genes in the recipient strains. The observation that the multiresistance ICEPmu1 is present in a bovine P. multocida and can easily spread across strain and genus boundaries underlines the risk of a rapid dissemination of multiple resistance genes, which will distinctly decrease the therapeutic options.
Simultaneous phylogeny reconstruction and multiple sequence alignment
Yue, Feng; Shi, Jian; Tang, Jijun
2009-01-01
Background A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110
Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R
2005-09-01
We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
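The idea of finding oligo sequences overrepresented in coregulated promoters can be sketched as below. This is not the MotifFinder program: the abstract does not give its statistic, so the sketch assumes a simple smoothed ratio of the fraction of target promoters containing a k-mer to the fraction of background promoters containing it. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_presence(sequences, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(target, background, k, min_ratio=2.0, pseudocount=1.0):
    """Return (k-mer, ratio) pairs enriched in target vs. background promoters.

    ratio = smoothed fraction of target sequences containing the k-mer
            divided by the smoothed fraction of background sequences.
    This scoring rule is an assumption made for illustration only.
    """
    target_counts = kmer_presence(target, k)
    background_counts = kmer_presence(background, k)
    results = []
    for kmer, count in target_counts.items():
        target_frac = (count + pseudocount) / (len(target) + 2 * pseudocount)
        background_frac = (background_counts[kmer] + pseudocount) / (
            len(background) + 2 * pseudocount
        )
        ratio = target_frac / background_frac
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy promoters sharing an ACGTG core (the ACGT core of ABREs plus one base).
toy_target = ["AACGTGAA", "TTACGTGT", "GGACGTGC"]
toy_background = ["AAAAAAAA", "TTTTTTTT", "GGGGGGGG"]
hits = enriched_kmers(toy_target, toy_background, k=5, min_ratio=3.0)
```

A real analysis would replace the ratio with a proper significance test and correct for the number of k-mers examined; the sketch only shows the counting step that such a test would build on.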
Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B
1995-01-01
Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Bellerophon: A program to detect chimeric sequences in multiple sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2003-12-23
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Development of Multiple-Element Flame Emission Spectrometer Using CCD Detection
ERIC Educational Resources Information Center
Seney, Caryn S.; Sinclair, Karen V.; Bright, Robin M.; Momoh, Paul O.; Bozeman, Amelia D.
2005-01-01
When the full wavelength coverage of a charge-coupled device (CCD) detector is coupled with an echelle spectrograph, the system allows simultaneous multiple-element spectroscopy to be performed. The multiple-element flame spectrometer was built and characterized through the analysis of environmentally significant elements such as Ca, K, Na, Cu,…
“Guest list” or “Black list”? Heritable Small RNAs as Immunogenic Memories
Rechavi, Oded
2016-01-01
Small RNA-mediated gene silencing plays a pivotal role in genome immunity by recognizing and eliminating viruses and transposons which otherwise may colonize the genome. However, this can be challenging since individual genomic parasites are highly diverse, and employ multiple immune evasion techniques. In this review, I discuss a new theory proposing that the integrity of the germline is maintained by transgenerationally-transmitted RNA “memories” that record ancestral gene expression patterns, and delineate “Self” from “Foreign” sequences. To maintain such recollection two tactics are employed in parallel: “black listing” of invading nucleic acids, and “guest listing” of endogenous genes. Studies in a number of organisms have shown that this memorization is used by the next generation small RNAs to act as “Inherited Vaccines” that ambush invading elements, or as “Inherited Licenses” that grant the transcription of autogenous sequences. PMID:24231398
NASA Astrophysics Data System (ADS)
Dhakshnamoorthy, Balasundaresan; Rohaim, Ahmed; Rui, Huan; Blachowicz, Lydia; Roux, Benoît
2016-09-01
The selectivity filter is an essential functional element of K+ channels that is highly conserved both in terms of its primary sequence and its three-dimensional structure. Here, we investigate the properties of an ion channel from the Gram-positive bacterium Tsukamurella paurometabola with a selectivity filter formed by an uncommon proline-rich sequence. Electrophysiological recordings show that it is a non-selective cation channel and that its activity depends on Ca2+ concentration. In the crystal structure, the selectivity filter adopts a novel conformation with Ca2+ ions bound within the filter near the pore helix where they are coordinated by backbone oxygen atoms, a recurrent motif found in multiple proteins. The binding of Ca2+ ion in the selectivity filter controls the widening of the pore as shown in crystal structures and in molecular dynamics simulations. The structural, functional and computational data provide a characterization of this calcium-gated cationic channel.
Bouallaga, I; Massicard, S; Yaniv, M; Thierry, F
2000-11-01
Recent studies have reported new mechanisms that mediate the transcriptional synergy of strong tissue-specific enhancers, involving the cooperative assembly of higher-order nucleoprotein complexes called enhanceosomes. Here we show that the HPV18 enhancer, which controls the epithelial-specific transcription of the E6 and E7 transforming genes, exhibits characteristic features of these structures. We used deletion experiments to show that a core enhancer element cooperates, in a specific helical phasing, with distant essential factors binding to the ends of the enhancer. This core sequence, binding a Jun B/Fra-2 heterodimer, cooperatively recruits the architectural protein HMG-I(Y) in a nucleoprotein complex, where they interact with each other. Therefore, in HeLa cells, HPV18 transcription seems to depend upon the assembly of an enhanceosome containing multiple cellular factors recruited by a core sequence interacting with AP1 and HMG-I(Y).
Transposable elements in cancer.
Burns, Kathleen H
2017-07-01
Transposable elements give rise to interspersed repeats, sequences that comprise most of our genomes. These mobile DNAs have been historically underappreciated - both because they have been presumed to be unimportant, and because their high copy number and variability pose unique technical challenges. Neither impediment now seems steadfast. Interest in the human mobilome has never been greater, and methods enabling its study are maturing at a fast pace. This Review describes the activity of transposable elements in human cancers, particularly long interspersed element-1 (LINE-1). LINE-1 sequences are self-propagating, protein-coding retrotransposons, and their activity results in somatically acquired insertions in cancer genomes. Altered expression of transposable elements and animation of genomic LINE-1 sequences appear to be hallmarks of cancer, and can be responsible for driving mutations in tumorigenesis.
Pike, William A; Riensche, Roderick M; Best, Daniel M; Roberts, Ian E; Whyatt, Marie V; Hart, Michelle L; Carr, Norman J; Thomas, James J
2012-09-18
Systems and computer-implemented processes for storage and management of information artifacts collected by information analysts using a computing device. The processes and systems can capture a sequence of interactive operation elements that are performed by the information analyst, who is collecting an information artifact from at least one of the plurality of software applications. The information artifact can then be stored together with the interactive operation elements as a snippet on a memory device, which is operably connected to the processor. The snippet comprises a view from an analysis application, data contained in the view, and the sequence of interactive operation elements stored as a provenance representation comprising operation element class, timestamp, and data object attributes for each interactive operation element in the sequence.
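The snippet structure described in this abstract can be pictured as a small data model. The following is only a minimal sketch, not the patented implementation: the class names `OperationElement` and `Snippet`, the `record` helper, and the example operation classes are all hypothetical; only the three provenance fields (operation element class, timestamp, data object attributes) and the three snippet parts (view, data, operation sequence) come from the abstract.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class OperationElement:
    """One interactive operation performed by the analyst (hypothetical schema)."""
    element_class: str                       # e.g. "select" or "copy" (made-up labels)
    timestamp: datetime                      # when the operation was captured
    data_object_attributes: Dict[str, Any]   # attributes of the object acted on


@dataclass
class Snippet:
    """An information artifact stored together with its provenance."""
    view: str                                # identifier of the analysis-application view
    data: Any                                # data contained in that view
    provenance: List[OperationElement] = field(default_factory=list)

    def record(self, element_class: str, **attributes: Any) -> None:
        """Append one operation element, preserving capture order."""
        self.provenance.append(
            OperationElement(element_class, datetime.now(timezone.utc), dict(attributes))
        )

    def operation_classes(self) -> List[str]:
        """Return the sequence of operation classes in the order captured."""
        return [op.element_class for op in self.provenance]
```

Keeping the operations in an ordered list mirrors the abstract's emphasis on a sequence of interactive operation elements rather than an unordered set.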
Lambertini, Elisabetta; Tavanti, Elisa; Torreggiani, Elena; Penolazzi, Letizia; Gambari, Roberto; Piva, Roberta
2008-07-01
Estrogen-responsive genes often have an estrogen response element (ERE) positioned next to activator protein-1 (AP-1) binding sites. Considering that the interaction between ERE and AP-1 elements has been described for the modulation of bone-specific genes, we investigated the 17-beta-estradiol responsiveness and the role of these cis-elements present in the F promoter of the human estrogen receptor alpha (ERalpha) gene. The F promoter, containing the sequence analyzed here, is one of the multiple promoters of the human ERalpha gene and is the only active promoter in bone tissue. Through electrophoretic mobility shift (EMSA), chromatin immunoprecipitation (ChIP), and re-ChIP assays, we investigated the binding of ERalpha and four members of the AP-1 family (c-Jun, c-fos, Fra-2, and ATF2) to a region located approximately 800 bp upstream of the transcriptional start site of exon F of the human ERalpha gene in SaOS-2 osteoblast-like cells. Reporter gene assay experiments in combination with DNA binding assays demonstrated that F promoter activity is under the control of upstream cis-acting elements which are recognized by specific combinations of ERalpha, c-Jun, c-fos, and ATF2 homo- and heterodimers. Moreover, ChIP and re-ChIP experiments showed that these nuclear factors bind the F promoter in vivo with a simultaneous occupancy stimulated by 17-beta-estradiol. Taken together, our findings support a model in which ERalpha/AP-1 complexes modulate F promoter activity under conditions of 17-beta-estradiol stimulation. (c) 2008 Wiley-Liss, Inc.
Identification of a Recently Active Mammalian SINE Derived from Ribosomal RNA
Longo, Mark S.; Brown, Judy D.; Zhang, Chu; O’Neill, Michael J.; O’Neill, Rachel J.
2015-01-01
Complex eukaryotic genomes are riddled with repeated sequences whose derivation does not coincide with phylogenetic history and thus is often unknown. Among such sequences, the capacity for transcriptional activity coupled with the adaptive use of reverse transcription can lead to a diverse group of genomic elements across taxa, otherwise known as selfish elements or mobile elements. Short interspersed nuclear elements (SINEs) are nonautonomous mobile elements found in eukaryotic genomes, typically derived from cellular RNAs such as tRNAs, 7SL or 5S rRNA. Here, we identify and characterize a previously unknown SINE derived from the 3′-end of the large ribosomal subunit (LSU or 28S rDNA) and transcribed via RNA polymerase III. This new element, SINE28, is represented in low-copy numbers in the human reference genome assembly, wherein we have identified 27 discrete loci. Phylogenetic analysis indicates these elements have been transpositionally active within primate lineages as recently as 6 MYA while modern humans still carry transcriptionally active copies. Moreover, we have identified SINE28s in all currently available assembled mammalian genome sequences. Phylogenetic comparisons indicate that these elements are frequently rederived from the highly conserved LSU rRNA sequences in a lineage-specific manner. We propose that this element has not been previously recognized as a SINE given its high identity to the canonical LSU, and that SINE28 likely represents one of possibly many unidentified, active transposable elements within mammalian genomes. PMID:25637222
Means, A L; Farnham, P J
1990-02-01
We have identified a sequence element that specifies the position of transcription initiation for the dihydrofolate reductase gene. Unlike the functionally analogous TATA box that directs RNA polymerase II to initiate transcription 30 nucleotides downstream, the positioning element of the dihydrofolate reductase promoter is located directly at the site of transcription initiation. By using DNase I footprint analysis, we have shown that a protein binds to this initiator element. Transcription initiated at the dihydrofolate reductase initiator element when 28 nucleotides were inserted between it and all other upstream sequences, or when it was placed on either side of the DNA helix, suggesting that there is no strict spatial requirement between the initiator and an upstream element. Although neither a single Sp1-binding site nor a single initiator element was sufficient for transcriptional activity, the combination of one Sp1-binding site and the dihydrofolate reductase initiator element cloned into a plasmid vector resulted in transcription starting at the initiator element. We have also shown that the simian virus 40 late major initiation site has striking sequence homology to the dihydrofolate reductase initiation site and that the same, or a similar, protein binds to both sites. Examination of the sequences at other RNA polymerase II initiation sites suggests that we have identified an element that is important in the transcription of other housekeeping genes. We have thus named the protein that binds to the initiator element HIP1 (Housekeeping Initiator Protein 1).
Lyu, Haomin; He, Ziwen; Wu, Chung-I; Shi, Suhua
2018-01-01
Several clades of mangrove trees independently invade the interface between land and sea at the margin of woody plant distribution. As phenotypic convergence among mangroves is common, the possibility of convergent adaptation in their genomes is quite intriguing. To study this molecular convergence, we sequenced multiple mangrove genomes. In this study, we focused on the evolution of transposable elements (TEs) in relation to the genome size evolution. TEs, generally considered genomic parasites, are the most common components of woody plant genomes. Analyzing the long terminal repeat-retrotransposon (LTR-RT) type of TE, we estimated their death rates by counting solo-LTRs and truncated elements. We found that all lineages of mangroves massively and convergently reduce TE loads in comparison to their nonmangrove relatives; as a consequence, genome size reduction happens independently in all six mangrove lineages; TE load reduction in mangroves can be attributed to the paucity of young elements; the rarity of young LTR-RTs is a consequence of fewer births rather than excess death. In conclusion, mangrove genomes employ a convergent strategy of TE load reduction by suppressing element origination in their independent adaptation to a new environment. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Localized Overheating Phenomena and Optimization of Spark-Plasma Sintering Tooling Design
Giuntini, Diletta; Olevsky, Eugene A.; Garcia-Cardona, Cristina; Maximenko, Andrey L.; Yurlova, Maria S.; Haines, Christopher D.; Martin, Darold G.; Kapoor, Deepak
2013-01-01
The present paper shows the application of a three-dimensional coupled electrical, thermal, mechanical finite element macro-scale modeling framework of Spark Plasma Sintering (SPS) to an actual problem of SPS tooling overheating, encountered during SPS experimentation. The overheating phenomenon is analyzed by varying the geometry of the tooling that exhibits the problem, namely by modeling various tooling configurations involving sequences of disk-shape spacers with step-wise increasing radii. The analysis is conducted by means of finite element simulations, intended to obtain temperature spatial distributions in the graphite press-forms, including punches, dies, and spacers; to identify the temperature peaks and their respective timing, and to propose a more suitable SPS tooling configuration with the avoidance of the overheating as a final aim. Electric currents-based Joule heating, heat transfer, mechanical conditions, and densification are imbedded in the model, utilizing the finite-element software COMSOL™, which possesses a distinguishing ability of coupling multiple physics. Thereby the implementation of a finite element method applicable to a broad range of SPS procedures is carried out, together with the more specific optimization of the SPS tooling design when dealing with excessive heating phenomena. PMID:28811398
Chelkha, Nisrine; Colson, Philippe; Levasseur, Anthony; La Scola, Bernard
2018-06-02
Giant viruses infect protozoa, especially amoebae of the genus Acanthamoeba. These viruses harbor mobile genetic elements collectively termed the mobilome. So far, this mobilome comprises provirophages, which are integrated into the genome of their hosts, transpovirons, and Maverick/Polintons. Virophages replicate inside virus factories within Acanthamoeba and can decrease the infectivity of giant viruses. The virophage infecting CroV was found to be integrated in the host of CroV, Cafeteria roenbergensis, thus protecting C. roenbergensis by reduction of CroV multiplication. Because of this unique property, assessment of the mechanisms of replication of virophages and their relationship with giant viruses is a key element of this investigation. This work aimed at evaluating the presence and the dynamics of these mobile elements in sixteen Acanthamoeba genomes. No significant traces of the integration of genomes or sequences from known virophages were identified in any of the available Acanthamoeba genomes. These results brought us to hypothesize that the interactions between mimiviruses and their virophages might occur through different mechanisms, or at low frequency. An additional explanation could be that our knowledge of the diversity of virophages is still very limited. Copyright © 2018 Elsevier B.V. All rights reserved.
The application of the high throughput sequencing technology in the transposable elements.
Liu, Zhen; Xu, Jian-hong
2015-09-01
High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
D’Addabbo, Pietro; Caizzi, Ruggiero
2016-01-01
Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the Bari transposon’s evolutionary dynamics and increases our understanding of the Tc1-mariner elements’ biology. PMID:27213270
Meyer, C; Pouteau, S; Rouzé, P; Caboche, M
1994-01-01
By Northern blot analysis of nitrate reductase-deficient mutants of Nicotiana plumbaginifolia, we identified a mutant (mutant D65), obtained after gamma-ray irradiation of protoplasts, which contained an insertion sequence in the nitrate reductase (NR) mRNA. This insertion sequence was localized by polymerase chain reaction (PCR) in the first exon of NR and was also shown to be present in the NR gene. The mutant gene contained a 565 bp insertion sequence that exhibits the sequence characteristics of a transposable element, which was thus named dTnp1. The dTnp1 element has 14 bp terminal inverted repeats and is flanked by an 8-bp target site duplication generated upon transposition. These inverted repeats have significant sequence homology with those of other transposable elements. Judging by its size and the absence of a long open reading frame, dTnp1 appears to represent a defective, although mobile, transposable element. The octamer motif TTTAGGCC was found several times in direct orientation near the 5' and 3' ends of dTnp1 together with a perfect palindrome located after the 5' inverted repeat. Southern blot analysis using an internal probe of dTnp1 suggested that this element occurs as a single copy in the genome of N. plumbaginifolia. It is also present in N. tabacum, but absent in tomato or petunia. The dTnp1 element is therefore of potential use for gene tagging in Nicotiana species.
A novel approach to multiple sequence alignment using hadoop data grids.
Sudha Sadasivam, G; Baktavatchalam, G
2010-01-01
Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
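To make concrete why dynamic programming alignment is described here as computation intensive, the sketch below scores a pairwise global alignment with the textbook Needleman-Wunsch recurrence. It is not the Hadoop-based method of this abstract, whose details are not given; the function name and the scoring values (match, mismatch, gap) are illustrative assumptions, and a real protein aligner would use a substitution matrix rather than a flat match/mismatch score.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Fills a (len(a)+1) x (len(b)+1) table row by row, so time grows with
    len(a) * len(b); only the previous row is kept in memory.
    Scoring values are assumptions chosen for illustration.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b` costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]
```

Because every cell depends on three neighbours, the cost is quadratic for two sequences and grows much faster when many sequences are aligned at once, which is the usual motivation for splitting the work across a data grid.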
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2013-11-05
Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
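The sequence of operations in this abstract (vector load, load and splat, multiply add, accumulate) can be imitated in plain Python. This is a sketch of the general idea only, with lists standing in for vector registers; the function name, the choice of loading columns of the first matrix, and the loop order are assumptions and are not taken from the patent.

```python
def matmul_load_splat(a, b):
    """Compute C = A @ B for A (m x k) and B (k x n) given as lists of rows.

    Each output column is built from k partial products, each formed by a
    "vector load" of a column of A and a "splat" of a single element of B.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for j in range(n):                               # one output column per pass
        acc = [0.0] * m                              # accumulator "register"
        for p in range(k):
            col = [a[i][p] for i in range(m)]        # vector load: column p of A
            splat = [b[p][j]] * m                    # load and splat: B[p][j] replicated
            # multiply add: accumulate this partial product into acc
            acc = [x + y * z for x, y, z in zip(acc, col, splat)]
        for i in range(m):
            c[i][j] = acc[i]
    return c
```

Replicating one scalar across a whole register lets a single element-wise multiply-add update every row of an output column at once, which is what makes the splat form attractive on SIMD hardware.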
Hyder, S M; Stancel, G M; Nawaz, Z; McDonnell, D P; Loose-Mitchell, D S
1992-09-05
We have used transient transfection assays with reporter plasmids expressing chloramphenicol acetyltransferase, linked to regions of mouse c-fos, to identify a specific estrogen response element (ERE) in this protooncogene. This element is located in the untranslated 3'-flanking region of the c-fos gene, 5 kilobases (kb) downstream from the c-fos promoter and 1.5 kb downstream of the poly(A) signal. This element confers estrogen responsiveness to chloramphenicol acetyltransferase reporters linked to both the herpes simplex virus thymidine kinase promoter and the homologous c-fos promoter. Deletion analysis localized the response element to a 200-base pair fragment which contains the element GGTCACCACAGCC that resembles the consensus ERE sequence GGTCACAGTGACC originally identified in Xenopus vitellogenin A2 gene. A synthetic 36-base pair oligodeoxynucleotide containing this c-fos sequence conferred estrogen inducibility to the thymidine kinase promoter. The corresponding sequence also induced reporter activity when present in the c-fos gene fragment 3 kb from the thymidine kinase promoter. Gel-shift experiments demonstrated that synthetic oligonucleotides containing either the consensus ERE or the c-fos element bind human estrogen receptor obtained from a yeast expression system. However, the mobility of the shifted band is faster for the fos-ERE-complex than the consensus ERE complex suggesting that the three-dimensional structure of the protein-DNA complexes is different or that other factors are differentially involved in the two reactions. When the 5'-GGTCA sequence present in the c-fos ERE is mutated to 5'-TTTCA, transcriptional activation and receptor binding activities are both lost. Mutation of the CAGCC-3' element corresponding to the second half-site of the c-fos sequence also led to the loss of receptor binding activity, suggesting that both half-sites of this element are involved in this function. 
The estrogen induction mediated by either the c-fos or the consensus ERE was blunted by the antiestrogen tamoxifen. Based on these studies, we believe the 3'-fos ERE sequence we have identified may be a major cis-acting element involved in the physiological regulation of the gene by estrogens in vivo.
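The relationship between the two 13-nucleotide elements quoted in this abstract can be checked with a few lines of code. The sketch below is illustrative only: the two sequences are copied from the abstract, while the helper names and the assumption that each half-site is the outer five nucleotides are ours.

```python
CONSENSUS_ERE = "GGTCACAGTGACC"   # consensus ERE quoted in the abstract
CFOS_ERE = "GGTCACCACAGCC"        # c-fos element quoted in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def half_sites(element, half_len=5):
    """Split an element into its 5' and 3' half-sites (outer half_len bases; an assumption)."""
    return element[:half_len], element[-half_len:]
```

With these helpers, the consensus half-sites GGTCA and TGACC turn out to be reverse complements of each other, the c-fos element keeps the 5' half-site GGTCA unchanged, and the two elements differ at five central and 3' positions, consistent with the abstract's description of the c-fos element as resembling, rather than matching, the consensus.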
A fully decompressed synthetic bacteriophage øX174 genome assembled and archived in yeast.
Jaschke, Paul R; Lieberman, Erica K; Rodriguez, Jon; Sierra, Adrian; Endy, Drew
2012-12-20
The 5386 nucleotide bacteriophage øX174 genome has a complicated architecture that encodes 11 gene products via overlapping protein coding sequences spanning multiple reading frames. We designed a 6302 nucleotide synthetic surrogate, øX174.1, that fully separates all primary phage protein coding sequences along with cognate translation control elements. To specify øX174.1f, a decompressed genome the same length as wild type, we truncated the gene F coding sequence. We synthesized DNA encoding fragments of øX174.1f and used a combination of in vitro- and yeast-based assembly to produce yeast vectors encoding natural or designer bacteriophage genomes. We isolated clonal preparations of yeast plasmid DNA and transfected E. coli C strains. We recovered viable øX174 particles containing the øX174.1f genome from E. coli C strains that independently express full-length gene F. We expect that yeast can serve as a genomic 'drydock' within which to maintain and manipulate clonal lineages of other obligate lytic phage. Copyright © 2012 Elsevier Inc. All rights reserved.
Fink, J S; Verhave, M; Kasper, S; Tsukada, T; Mandel, G; Goodman, R H
1988-01-01
cAMP-regulated transcription of the human vasoactive intestinal peptide gene is dependent upon a 17-base-pair DNA element located 70 base pairs upstream from the transcriptional initiation site. This element is similar to sequences in other genes known to be regulated by cAMP and to sequences in several viral enhancers. We have demonstrated that the vasoactive intestinal peptide regulatory element is an enhancer that depends upon the integrity of two CGTCA sequence motifs for biological activity. Mutations in either of the CGTCA motifs diminish the ability of the element to respond to cAMP. Enhancers containing the CGTCA motif from the somatostatin and adenovirus genes compete for binding of nuclear proteins from C6 glioma and PC12 cells to the vasoactive intestinal peptide enhancer, suggesting that CGTCA-containing enhancers interact with similar transacting factors. PMID:2842787
Kumar, Rajesh; Grover, Sunita; Kaushik, Jai K; Batish, Virender Kumar
2014-01-01
Lactobacillus plantarum is a flexible and versatile microorganism that inhabits a variety of niches, and its genome may express up to four bsh genes to maximize its survival in the mammalian gut. However, the ecological significance of multiple bsh genes in L. plantarum is still not clearly understood. Hence, this study demonstrated the disruption of the bile salt hydrolase (bsh1) gene due to the insertion of a transposable element in L. plantarum Lp20, a wild strain of human fecal origin. Surprisingly, L. plantarum strain Lp20 produced a ∼2.0 kb bsh1 amplicon against the normal size (∼1.0 kb) bsh1 amplicon of Bsh(+) L. plantarum Lp21. Strain Lp20 exhibited minimal Bsh activity in spite of having intact bsh2, bsh3 and bsh4 genes in its genome and hence had a Bsh(-) phenotype. Cloning and sequence characterization of the Lp20 bsh1 gene predicted four individual open reading frames (ORFs) within this region. BLAST analysis of ORF1 and ORF2 revealed significant sequence similarity to the L. plantarum bsh1 gene while ORF3 and ORF4 showed high sequence homology to IS30-family transposases. Because an IS30-related transposon element was inserted within the Lp20 bsh1 gene in reverse orientation (3'-5'), it introduced several stop codons and disrupted the protein reading frames of both Bsh1 and transposase. Inverted terminal repeats (GGCAGATTG) of the transposon mediated its insertion at nucleotide positions 255-263 and 1301-1309 of the Lp20 bsh1 gene. In conclusion, insertion of an IS30-related transposon within the bsh1 gene sequence of L. plantarum strain Lp20 destroyed the integrity and functionality of the Bsh1 enzyme. Additionally, this transposon DNA sequence remains active among various Lactobacillus spp. and hence harbors the potential to be explored in the development of an efficient insertion mutagenesis system. Copyright © 2013 Elsevier GmbH. All rights reserved.
Sleep-dependent learning and motor-skill complexity
Kuriyama, Kenichi; Stickgold, Robert; Walker, Matthew P.
2004-01-01
Learning of a procedural motor-skill task is known to progress through a series of unique memory stages. Performance initially improves during training, and continues to improve, without further rehearsal, across subsequent periods of sleep. Here, we investigate how this delayed sleep-dependent learning is affected when the task characteristics are varied across several degrees of difficulty, and whether this improvement differentially enhances individual transitions of the motor-sequence pattern being learned. We report that subjects show similar overnight improvements in speed whether learning a five-element unimanual sequence (17.7% improvement), a nine-element unimanual sequence (20.2%), or a five-element bimanual sequence (17.5%), but show markedly increased overnight improvement (28.9%) with a nine-element bimanual sequence. In addition, individual transitions within the motor-sequence pattern that appeared most difficult at the end of training showed a significant 17.8% increase in speed overnight, whereas those transitions that were performed most rapidly at the end of training showed only a non-significant 1.4% improvement. Together, these findings suggest that the sleep-dependent learning process selectively provides maximum benefit to motor-skill procedures that proved to be most difficult prior to sleep. PMID:15576888
DeVry, C G; Tsai, W; Clarke, S
1996-11-15
The protein L-isoaspartyl/D-aspartyl O-methyltransferase (EC 2.1.1.77) catalyzes the first step in the repair of proteins damaged in the aging process by isomerization or racemization reactions at aspartyl and asparaginyl residues. A single gene has been localized to human chromosome 6 and multiple transcripts arising through alternative splicing have been identified. Restriction enzyme mapping, subcloning, and DNA sequence analysis of three overlapping clones from a human genomic library in bacteriophage P1 indicate that the gene spans approximately 60 kb and is composed of 8 exons interrupted by 7 introns. Analysis of intron/exon splice junctions reveals that all of the donor and acceptor splice sites are in agreement with the mammalian consensus splicing sequence. Determination of transcription initiation sites by primer extension analysis of poly(A)+ mRNA from human brain identifies multiple start sites, with a major site 159 nucleotides upstream from the ATG start codon. Sequence analysis of the 5'-untranslated region demonstrates several potential cis-acting DNA elements including SP1, ETF, AP1, AP2, ARE, XRE, CREB, MED-1, and half-palindromic ERE motifs. The promoter of this methyltransferase gene lacks an identifiable TATA box but is characterized by a CpG island which begins approximately 723 nucleotides upstream of the major transcriptional start site and extends through exon 1 and into the first intron. These features are characteristic of housekeeping genes and are consistent with the wide tissue distribution observed for this methyltransferase activity.
Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kass, D.H.; Batzer, M.A.; Deininger, P.L.
The Alu repetitive family of short interspersed elements (SINEs) in primates can be subdivided into distinct subfamilies by specific diagnostic nucleotide changes. The older subfamilies are generally very abundant, while the younger subfamilies have fewer copies. Some of the youngest Alu elements are absent in the orthologous loci of nonhuman primates, indicative of recent retroposition events, the primary mode of SINE evolution. PCR analysis of one young Alu subfamily (Sb2) member found in the low-density lipoprotein receptor gene apparently revealed the presence of this element in the green monkey, orangutan, gorilla, and chimpanzee genomes, as well as the human genome. However, sequence analysis of these genomes revealed a highly mutated, older, primate-specific Alu element was present at this position in the nonhuman primates. Comparison of the flanking DNA sequences upstream of this Alu insertion corresponded to evolution expected for standard primate phylogeny, but comparison of the Alu repeat sequences revealed that the human element departed from this phylogeny. The change in the human sequence apparently occurred by a gene conversion event only within the Alu element itself, converting it from one of the oldest to one of the youngest Alu subfamilies. Although gene conversions of Alu elements are clearly very rare, this finding shows that such events can occur and contribute to specific cases of SINE subfamily evolution.
NASA Astrophysics Data System (ADS)
Niaz, Mansoor
The main objectives of this study are: (1) to elaborate a framework based on a rational reconstruction of developments that led to the formulation of the laws of definite and multiple proportions; (2) to ascertain students' views of the two laws; (3) to formulate criteria based on the framework for evaluating chemistry textbooks' treatment of the two laws; and (4) to provide a rationale for chemistry teachers to respond to the question: Can we teach chemistry without the laws of definite and multiple proportions? Results obtained show that most of the textbooks present the laws of definite and multiple proportions within an inductivist perspective, characterized by the following sequence: experimental findings showed that chemical elements combined in fixed/multiple proportions, followed by the formulation of the laws of definite and multiple proportions, and finally Dalton's atomic theory was postulated to explain the laws. Students were found to be reluctant to question the laws that they learnt as the building blocks of chemistry. It is concluded that by emphasizing the laws of definite and multiple proportions, textbooks inevitably endorse the dichotomy between theories and laws, which is questioned by philosophers of science (Lakatos 1970; Giere 1995a, b). An alternative approach is presented which shows that we can teach chemistry without the laws of definite and multiple proportions.
Laimins, L; Holmgren-König, M; Khoury, G
1986-01-01
The enhancer elements from either simian virus 40 or murine sarcoma virus activate the expression of a transfected rat insulin 1 (rI1) gene when placed within 2.0 kilobases or less of the rI1 gene cap site. Inclusion of 4.0 kilobases of upstream rI1 sequence, however, results in a substantial reduction in the enhancer-dependent insulin gene expression. These observations suggested that a negative transcriptional regulatory element was present between 2.0 and 4.0 kilobases of the rI1 sequence. To test this notion, we employed a heterologous enhancer-dependent transcription assay in which the simian virus 40 72-base-pair repeat is linked to a human beta-globin gene. Addition of the upstream rI1 element to this system decreased the level of enhancer-dependent beta-globin transcription by a factor of 5 to 15. This rI1 "silencer" element functions in a manner relatively independent of position and orientation and requires a cis-dependent relationship to the transcription unit on which it acts. Thus, the silencer sequence seems to have a number of the characteristics of enhancer elements, and we suggest that it may function by the converse of the enhancer mechanism. The rI1 silencer sequence was identified as a member of a long interspersed rat repetitive family. Thus, a potential role for certain repetitive sequences interspersed throughout the eukaryotic genome may be to regulate gene expression by retaining transcriptional activity within defined domains. PMID:3010279
Matsumura, Ritsuko; Akashi, Makoto
2017-09-29
Cell-autonomous oscillation in clock gene expression drives circadian rhythms. The development of comprehensive analytical techniques, such as bioinformatics and ChIP-sequencing, has enabled the genome-wide identification of potential circadian transcriptional elements that regulate the transcriptional oscillation of clock genes. However, detailed analyses using traditional biochemical and molecular-biological approaches, such as binding and reporter assays, are still necessary to determine whether these potential circadian transcriptional elements are actually functional and how significantly they contribute to driving transcriptional oscillation. Here, we focused on the molecular mechanism of transcriptional oscillations in the mammalian clock gene Period3 (Per3). The PER3 protein is essential for robust peripheral clocks and is a key component in circadian output processes. We found three E box-like elements located upstream of human Per3 transcription start sites that additively contributed to cell-autonomous transcriptional oscillation. However, we also found that Per3 is still expressed in a circadian manner when all three E box-like elements are functionally impaired. We noted that Per3 transcription was activated by the synergistic actions of two D box-like elements and the three E box-like elements, leading to a drastic increase in circadian amplitude. Interestingly, circadian expression of Per3 was completely disrupted only when all five transcriptional elements were functionally impaired. These results indicate that three E box-like and two D box-like elements cooperatively and redundantly regulate cell-autonomous transcriptional oscillation of Per3. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Cis-acting elements in the promoter region of the human aldolase C gene.
Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F
1993-08-16
We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.
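The half-palindromic ERE motif (GGTCA) mentioned in the abstract above can be located in a sequence with a simple two-strand scan. The following is a minimal sketch, not the authors' analysis: the function names and the example sequence are hypothetical, and only exact matches are reported.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="GGTCA"):
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    The '-' strand is searched by looking for the reverse complement of the
    motif in the given (forward) sequence; overlapping matches are included.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical sequence with one forward and one reverse-strand occurrence.
example = "TTGGTCAAACCTGACCGG"
print(find_motif(example))
```

An exact-match scan like this only flags candidate sites; binding would still need experimental support such as the footprinting and competition assays the abstract describes.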
Gent, Jonathan I; Wang, Na; Dawe, R Kelly
2017-06-21
Paradoxically, centromeres are known both for their characteristic repeat sequences (satellite DNA) and for being epigenetically defined. Maize (Zea mays mays) is an attractive model for studying centromere positioning because many of its large (~2 Mb) centromeres are not dominated by satellite DNA. These centromeres, which we call complex centromeres, allow for both assembly into reference genomes and for mapping short reads from ChIP-seq with antibodies to centromeric histone H3 (cenH3). We found frequent complex centromeres in maize and its wild relatives Z. mays parviglumis, Z. mays mexicana, and particularly Z. mays huehuetenangensis. Analysis of individual plants reveals minor variation in the positions of complex centromeres among siblings. However, such positional shifts are stochastic and not heritable, consistent with prior findings that centromere positioning is stable at the population level. Centromeres are also stable in multiple F1 hybrid contexts. Analysis of repeats in Z. mays and other species (Zea diploperennis, Zea luxurians, and Tripsacum dactyloides) reveals tenfold differences in abundance of the major satellite CentC, but similar high levels of sequence polymorphism in individual CentC copies. Deviation from the CentC consensus has little or no effect on binding of cenH3. These data indicate that complex centromeres are neither a peculiarity of cultivation nor inbreeding in Z. mays. While extensive arrays of CentC may be the norm for other Zea and Tripsacum species, these data also reveal that a wide diversity of DNA sequences and multiple types of genetic elements in and near centromeres support centromere function and constrain centromere positions.
NASA Technical Reports Server (NTRS)
Ji, C.; Chen, Y.; McCarthy, T. L.; Centrella, M.
1999-01-01
Transforming growth factor-beta binds to three high affinity cell surface molecules that directly or indirectly regulate its biological effects. The type III receptor (TRIII) is a proteoglycan that lacks significant intracellular signaling or enzymatic motifs but may facilitate transforming growth factor-beta binding to other receptors, stabilize multimeric receptor complexes, or segregate growth factor from activating receptors. Because various agents or events that regulate osteoblast function rapidly modulate TRIII expression, we cloned the 5' region of the rat TRIII gene to assess possible control elements. DNA fragments from this region directed high reporter gene expression in osteoblasts. Sequencing showed no consensus TATA or CCAAT boxes, whereas several nuclear factor binding sequences within the 3' region of the promoter co-mapped with multiple transcription initiation sites, DNase I footprints, gel mobility shift analysis, or loss of activity by deletion or mutation. An upstream enhancer was evident 5' proximal to nucleotide -979, and a silencer region occurred between nucleotides -2014 and -2194. Glucocorticoid sensitivity mapped between nucleotides -687 and -253, whereas bone morphogenetic protein 2 sensitivity co-mapped within the silencer region. Thus, the TRIII promoter contains cooperative basal elements and dispersed growth factor- and hormone-sensitive regulatory regions that can control TRIII expression by osteoblasts.
Kutar, Braj M. R. N. S.; Rajpara, Neha; Upadhyay, Hardik; Ramamurthy, Thandavarayan; Bhardwaj, Ashima K.
2013-01-01
Background The increase in the number of multidrug-resistant pathogens and the accompanying rise in case fatality rates have hampered the treatment of many infectious diseases, including cholera. Unraveling the mechanisms responsible for multidrug resistance in clinical isolates of Vibrio cholerae would help in understanding the evolution of these pathogenic bacteria and their epidemic potential. This study was carried out to identify genetic factors responsible for multiple drug resistance in clinical isolates of Vibrio cholerae O1, serotype Ogawa, biotype El Tor, isolated from patients admitted to the Infectious Diseases Hospital, Kolkata, India, in 2009. Methodology/Principal Findings One hundred and nineteen clinical isolates of V. cholerae were analysed for their antibiotic resistance phenotypes. Antibiogram analysis revealed that the majority of the isolates showed resistance to co-trimoxazole, nalidixic acid, polymyxin B and streptomycin. In PCR, SXT integrase was detected in 117 isolates and its sequence showed 99% identity notably to ICEVchInd5 from Sevagram, India, ICEVchBan5 from Bangladesh and the VC1786ICE sequence from the Haiti outbreak, among others. Antibiotic resistance traits corresponding to the SXT element were transferred from the parent Vibrio isolate to recipient E. coli XL-1 Blue cells during conjugation. Double-mismatch-amplification mutation assay (DMAMA) revealed the presence of the Haitian type ctxB allele of genotype 7 in 55 isolates and the classical ctxB allele of genotype 1 in 59 isolates. Analysis of topoisomerase sequences revealed the presence of the mutations Ser83 → Ile in gyrA and Ser85 → Leu in parC. This clearly showed the circulation of SXT-containing V. cholerae as the causative agent of cholera in Kolkata. Conclusions There was a predominance of the SXT element in these clinical isolates from the Kolkata region, which also accounted for their antibiotic resistance phenotype typical of this element.
DMAMA PCR showed them to be a mixture of isolates with different ctxB alleles like classical, El Tor and Haitian variants. PMID:23431378
Novikova, Olga; Śliwińska, Ewa; Fet, Victor; Settele, Josef; Blinov, Alexander; Woyciechowski, Michal
2007-01-01
Background Non-long terminal repeat (non-LTR) retrotransposons are mobile genetic elements that propagate themselves by reverse transcription of an RNA intermediate. Non-LTR retrotransposons are known to evolve mainly via vertical transmission and random loss. Horizontal transmission is believed to be a very rare event in non-LTR retrotransposons. Our knowledge of distribution and diversity of insect non-LTR retrotransposons is limited to a few species – mainly model organisms such as dipteran genera Drosophila, Anopheles, and Aedes. However, diversity of non-LTR retroelements in arthropods seems to be much richer. The present study extends the analysis of non-LTR retroelements to CR1 clade from four butterfly species of genus Maculinea (Lepidoptera: Lycaenidae). The lycaenid genus Maculinea, the object of interest for evolutionary biologists and also a model group for European biodiversity studies, possesses a unique, specialized myrmecophilous lifestyle at larval stage. Their caterpillars, after three weeks of phytophagous life on specific food plants drop to the ground where they are adopted to the ant nest by Myrmica foraging workers. Results We found that the genome of Maculinea butterflies contains multiple CR1 lineages of non-LTR retrotransposons, including those from MacCR1A, MacCR1B and T1Q families. A comparative analysis of RT nucleotide sequences demonstrated an extremely high similarity among elements both in interspecific and intraspecific comparisons. CR1A-like elements were found only in family Lycaenidae. In contrast, MacCR1B lineage clones were extremely similar to CR1B non-LTR retrotransposons from Bombycidae moths: silkworm Bombyx mori and Oberthueria caeca. Conclusion The degree of coding sequence similarity of the studied elements, their discontinuous distribution, and results of divergence-versus-age analysis make it highly unlikely that these sequences diverged at the same time as their host taxa. 
The only reasonable alternative explanation is horizontal transfer. In addition, phylogenetic markers for population analysis of Maculinea could be developed based on the described non-LTR retrotransposons. PMID:17588269
Yao, Peng; Potdar, Alka A.; Arif, Abul; Ray, Partho Sarothi; Mukhopadhyay, Rupak; Willard, Belinda; Xu, Yichi; Yan, Jun; Saidel, Gerald M.; Fox, Paul L.
2012-01-01
SUMMARY Post-transcriptional regulatory mechanisms superimpose “fine-tuning” control upon “on-off” switches characteristic of gene transcription. We have exploited computational modeling with experimental validation to resolve an anomalous relationship between mRNA expression and protein synthesis. Differential GAIT (Gamma-interferon Activated Inhibitor of Translation) complex activation repressed VEGF-A synthesis to a low, constant rate despite high, variable VEGFA mRNA expression. Dynamic model simulations indicated the presence of an unidentified, inhibitory GAIT element-interacting factor. We discovered a truncated form of glutamyl-prolyl tRNA synthetase (EPRS), the GAIT constituent that binds the 3’-UTR GAIT element in target transcripts. The truncated protein, EPRSN1, prevents binding of functional GAIT complex. EPRSN1 mRNA is generated by a remarkable polyadenylation-directed conversion of a Tyr codon in the EPRS coding sequence to a stop codon (PAY*). By low-level protection of GAIT element-bearing transcripts, EPRSN1 imposes a robust “translational trickle” of target protein expression. Genome-wide analysis shows PAY* generates multiple truncated transcripts thereby contributing to transcriptome expansion. PMID:22386318
The Dfam database of repetitive DNA families.
Hubley, Robert; Finn, Robert D; Clements, Jody; Eddy, Sean R; Jones, Thomas A; Bao, Weidong; Smit, Arian F A; Wheeler, Travis J
2016-01-04
Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Myers, Katie N.; Barone, Giancarlo; Ganesh, Anil; Staples, Christopher J.; Howard, Anna E.; Beveridge, Ryan D.; Maslen, Sarah; Skehel, J. Mark; Collis, Spencer J.
2016-01-01
It was recently discovered that vertebrate genomes contain multiple endogenised nucleotide sequences derived from the non-retroviral RNA bornavirus. Strikingly, some of these elements have been evolutionarily maintained as open reading frames in host genomes for over 40 million years, suggesting that some endogenised bornavirus-derived elements (EBL) might encode functional proteins. EBLN1 is one such element established through endogenisation of the bornavirus N gene (BDV N). Here, we functionally characterise human EBLN1 as a novel regulator of genome stability. Cells depleted of human EBLN1 accumulate DNA damage both under non-stressed conditions and following exogenously induced DNA damage. EBLN1-depleted cells also exhibit cell cycle abnormalities and defects in microtubule organisation as well as premature centrosome splitting, which we attribute, in part, to improper localisation of the nuclear envelope protein TPR. Our data therefore reveal that human EBLN1 possesses important cellular functions within human cells, and suggest that other EBLs present within vertebrate genomes may also possess important cellular functions. PMID:27739501
McMurchy, Alicia N; Stempor, Przemyslaw; Gaarenstroom, Tessa; Wysolmerski, Brian; Dong, Yan; Aussianikava, Darya; Appert, Alex; Huang, Ni; Kolasinska-Zwierz, Paulina; Sapetschnig, Alexandra; Miska, Eric A; Ahringer, Julie
2017-01-01
Repetitive sequences derived from transposons make up a large fraction of eukaryotic genomes and must be silenced to protect genome integrity. Repetitive elements are often found in heterochromatin; however, the roles and interactions of heterochromatin proteins in repeat regulation are poorly understood. Here we show that a diverse set of C. elegans heterochromatin proteins act together with the piRNA and nuclear RNAi pathways to silence repetitive elements and prevent genotoxic stress in the germ line. Mutants in genes encoding HPL-2/HP1, LIN-13, LIN-61, LET-418/Mi-2, and H3K9me2 histone methyltransferase MET-2/SETDB1 also show functionally redundant sterility, increased germline apoptosis, DNA repair defects, and interactions with small RNA pathways. Remarkably, fertility of heterochromatin mutants could be partially restored by inhibiting cep-1/p53, endogenous meiotic double strand breaks, or the expression of MIRAGE1 DNA transposons. Functional redundancy among factors and pathways underlies the importance of safeguarding the genome through multiple means. DOI: http://dx.doi.org/10.7554/eLife.21666.001 PMID:28294943
Al-Jarbou, Ahmed Nasser
2012-01-01
Bacterial pathogens possess an astounding arsenal of virulence factors that allow them to conquer many different niches throughout the course of infection. Particularly fascinating is the fact that some bacterial species are able to induce different diseases by expressing different combinations of virulence factors. Nevertheless, studies screening for the presence of bacteriophages in humans have been limited. Such screening procedures would eventually lead to the identification of phage-encoded properties that impart increased bacterial fitness and/or virulence in a particular niche, and hence could potentially be used to reverse the course of bacterial infections. The human oral cavity represents a rich and dynamic ecosystem for several upper respiratory tract pathogens. However, little is known about virus diversity in human dental plaque, which is an important reservoir. We applied a culture-independent approach to characterize virus diversity in human dental plaque, making a library from a virus DNA fraction amplified using a multiple displacement method, and sequenced 80 clones. Of the resulting sequences, 44% showed significant identity to GenBank database entries by TBLASTX analysis. TBLASTX homology comparisons showed that 66% were viral, 18% eukaryal, 10% bacterial, and 6% mobile elements. These sequences were sorted into 6 contigs and 45 single sequences, of which 4 contigs and a single sequence showed significant identity to a small region of a putative prophage in the Corynebacterium diphtheriae genome. These findings highlight the uniqueness of over half of the sequences, whilst the dominance of pathogen-specific prophage sequences implies a role for them in virulence.
NASA Astrophysics Data System (ADS)
Miyatake, Teruhiko; Chiba, Kazuki; Hamamura, Masanori; Tachikawa, Shin'ichi
We propose a novel asynchronous direct-sequence code-division multiple access (DS-CDMA) scheme using feedback-controlled spreading sequences (FCSSs) (FCSS/DS-CDMA). At the receiver of FCSS/DS-CDMA, the code-orthogonalizing filter (COF) produces a spreading sequence, and the receiver returns the spreading sequence to the transmitter. The transmitter then uses the returned spreading sequence as its updated version. The performance of FCSS/DS-CDMA is evaluated over time-dispersive channels. The results indicate that FCSS/DS-CDMA greatly suppresses both intersymbol interference (ISI) and multiple access interference (MAI) over time-invariant channels. FCSS/DS-CDMA is applicable to decentralized multiple access.
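For readers unfamiliar with direct-sequence spreading, the sketch below shows the basic operation that FCSS/DS-CDMA builds on: two users spread their bits with orthogonal codes, share a channel, and are separated by correlation. It is an illustration under simplifying assumptions (synchronous users, fixed Walsh-style codes, a noiseless channel) and does not implement the feedback-controlled sequence update or the code-orthogonalizing filter described in the abstract; all names are hypothetical.

```python
def spread(bits, code):
    """Spread each +1/-1 data bit into len(code) chips."""
    return [bit * chip for bit in bits for chip in code]


def despread(chips, code):
    """Correlate each chip block with the code and take the sign as the bit."""
    n = len(code)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(1 if corr >= 0 else -1)
    return bits


# Two orthogonal spreading codes (assumed for illustration only).
code_a = [1, 1, 1, 1]
code_b = [1, -1, 1, -1]

bits_a = [1, -1, 1]
bits_b = [-1, -1, 1]

# Both users transmit at once; the channel simply adds their chip streams.
channel = [x + y for x, y in zip(spread(bits_a, code_a), spread(bits_b, code_b))]

print(despread(channel, code_a))  # recovers user A's bits
print(despread(channel, code_b))  # recovers user B's bits
```

With orthogonal, chip-aligned codes the cross-correlation is zero, so there is no multiple access interference here. In asynchronous or time-dispersive channels that orthogonality is lost, which is the situation the adaptive spreading sequences in the abstract are meant to address.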
Isotopic and trace element characteristics of an unusual refractory inclusion from Essebi
NASA Technical Reports Server (NTRS)
Deloule, E.; Kennedy, A. K.; Hutcheon, I. D.; Elgoresy, A.
1993-01-01
The isotopic and chemical properties of Ca-Al-rich inclusions (CAI) provide important clues to the early solar nebula environment. While the abundances of refractory major and trace elements are similar to those expected for high temperature condensates, the variety of textural, chemical, and isotopic signatures indicate most CAI experienced complex, multi-stage histories involving repeated episodes of condensation, evaporation, and metamorphism. Evidence of multiple processes is especially apparent in an unusual refractory inclusion from Essebi (URIE) described by El Goresy et al. The melilite (mel)-rich core of URIE contains polygonal framboids of spinel (sp) and hibonite (hb) or sp and fassaite (fas) and is surrounded by a rim sequence consisting of five layers. In contrast to rims on Allende, the mineralogy of the URIE rim layers becomes increasingly refractory from the core outwards, ending in a layer of spinel-Al2O3 solid solution + Sc-rich fassaite. The chemical and mineralogical features of URIE are inconsistent with crystallization from a homogeneous melt, and El Goresy et al. proposed a multi-step history involving condensation of sp + hb and aggregation into framboids, capture of framboids by a refractory silicate melt droplet, condensation of rim layers, and alteration of mel to calcite and feldspathoid. The PANURGE ion probe was used to investigate the isotopic and trace element characteristics of URIE to develop a more complete picture of the multiple processes leading to formation and metamorphism.
Cis-acting elements in its 3′ UTR mediate post-transcriptional regulation of KRAS
Kim, Minlee; Kogan, Nicole; Slack, Frank J.
2016-01-01
Multiple RNA-binding proteins and non-coding RNAs, such as microRNAs (miRNAs), are involved in post-transcriptional gene regulation through recognition motifs in the 3′ untranslated region (UTR) of their target genes. The KRAS gene encodes a key signaling protein, and its messenger RNA (mRNA) contains an exceptionally long 3′ UTR; this suggests that it may be subject to a highly complex set of regulatory processes. However, 3′ UTR-dependent regulation of KRAS expression has not been explored in detail. Using extensive deletion and mutational analyses combined with luciferase reporter assays, we have identified inhibitory and stabilizing cis-acting regions within the KRAS 3′ UTR that may interact with miRNAs and RNA-binding proteins, such as HuR. Particularly, we have identified an AU-rich 49-nt fragment in the KRAS 3′ UTR that is required for KRAS 3′ UTR reporter repression. This element contains a miR-185 complementary element, and we show that overexpression of miR-185 represses endogenous KRAS mRNA and protein in vitro. In addition, we have identified another 49-nt fragment that is required to promote KRAS 3′ UTR reporter expression. These findings indicate that multiple cis-regulatory motifs in the 3′ UTR of KRAS finely modulate its expression, and sequence alterations within a binding motif may disrupt the precise functions of trans-regulatory factors, potentially leading to aberrant KRAS expression. PMID:26930719
Genome-Wide Mutagenesis in Borrelia burgdorferi.
Lin, Tao; Gao, Lihui
2018-01-01
Signature-tagged mutagenesis (STM) is a functional genomics approach to identify bacterial virulence determinants and virulence factors by simultaneously screening multiple mutants in a single host animal, and has been utilized extensively for the study of bacterial pathogenesis, host-pathogen interactions, and spirochete and tick biology. Signature-tagged transposon mutagenesis has been developed to investigate virulence determinants and pathogenesis of Borrelia burgdorferi. Mutants in genes important in virulence are identified by negative selection in which the mutants fail to colonize or disseminate in the animal host and tick vector. The STM procedure combined with Luminex Flex ® Map™ technology and next-generation sequencing (e.g., Tn-seq) are powerful high-throughput tools for the determination of Borrelia burgdorferi virulence determinants. The assessment of multiple tissue sites and two DNA resources at two different time points using Luminex Flex ® Map™ technology provides a robust data set. B. burgdorferi transposon mutant screening indicates that a high proportion of genes are novel virulence determinants that are required for mouse and tick infection. In this protocol, an effective signature-tagged Himar1-based transposon suicide vector was developed and used to generate a sequence-defined library of nearly 4800 mutants in the infectious B. burgdorferi B31 clone. In STM, signature-tagged suicide vectors are constructed by inserting unique DNA sequences (tags) into the transposable elements. The signature-tagged transposon mutants are generated when transposon suicide vectors are transformed into an infectious B. burgdorferi clone, and the transposable element is transposed into the 5'-TA-3' sequence in the B. burgdorferi genome with the signature tag. The transposon library is created and consists of many sub-libraries; each sub-library has several hundred mutants with the same tags.
A group of mice or ticks is infected with a mixed population of mutants with different tags; after recovery from different tissues of infected mice and ticks, mutants from the output pool and input pool are detected using high-throughput, semi-quantitative Luminex ® FLEXMAP™ or next-generation sequencing (Tn-seq) technologies. Thus far, we have created a high-density, sequence-defined transposon library of over 6600 STM mutants for the efficient genome-wide investigation of genes and gene products required for wild-type pathogenesis, host-pathogen interactions, in vitro growth, in vivo survival, physiology, morphology, chemotaxis, motility, structure, metabolism, gene regulation, plasmid maintenance and replication, etc. The insertion sites of 4480 transposon mutants have been determined. About 800 predicted protein-encoding genes in the genome were disrupted in the STM transposon library. The infectivity and some functions of 800 mutants in 500 genes have been determined. Analysis of these transposon mutants has yielded valuable information regarding the genes and gene products important in the pathogenesis and biology of B. burgdorferi and its tick vectors.
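The negative-selection logic described above (mutants that fail to colonize are under-represented in the output pool relative to the input pool) can be sketched as a comparison of relative tag abundances. This is a minimal illustration, not the protocol's actual analysis: the function name, the tag names, the counts, and the 0.1 ratio threshold are all hypothetical assumptions.

```python
def depleted_tags(input_counts, output_counts, max_ratio=0.1):
    """Return {tag: output/input abundance ratio} for tags depleted in the output pool.

    Counts are converted to fractions of their pool so that pools of
    different total size can be compared. Tags absent from the input pool
    are skipped because no ratio can be formed for them.
    """
    in_total = sum(input_counts.values())
    out_total = sum(output_counts.values())
    if in_total == 0 or out_total == 0:
        raise ValueError("both pools must contain at least one count")

    flagged = {}
    for tag, n_in in input_counts.items():
        if n_in == 0:
            continue
        frac_in = n_in / in_total
        frac_out = output_counts.get(tag, 0) / out_total
        ratio = frac_out / frac_in
        if ratio <= max_ratio:  # hypothetical cutoff for "fails to colonize"
            flagged[tag] = ratio
    return flagged


# Hypothetical signal counts for three tagged mutants.
input_pool = {"tag01": 100, "tag02": 100, "tag03": 100}
output_pool = {"tag01": 150, "tag02": 150, "tag03": 0}
print(depleted_tags(input_pool, output_pool))
```

A real screen would also need replicates and a statistical test before calling a mutant attenuated; the fixed cutoff here only conveys the input-versus-output comparison.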
Mobile element biology – new possibilities with high-throughput sequencing
Xing, Jinchuan; Witherspoon, David J.; Jorde, Lynn B.
2014-01-01
Mobile elements compose more than half of the human genome, but until recently their large-scale detection was time-consuming and challenging. With the development of new high-throughput sequencing technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. We review several high-throughput methods, with an emphasis on techniques that specifically target mobile element insertions in humans, and we highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human cancers. PMID:23312846
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
2003-12-31
Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Dubrana, K; Le Mouël, A; Amar, L
1997-01-01
Ciliated protozoa undergo thousands of site-specific DNA deletion events during the programmed development of micronuclear genomes to macronuclear genomes. Two deletion elements, W1 and W2, were identified in the Paramecium primaurelia wild-type 156 strain. Here, we report the characterization of both elements in wild-type strain 168 and show that they display variant deletion patterns when compared with those of strain 156. The W1(168) element is defective for deletion. The W2(168) element is excised utilizing two alternative boundaries on one side; both are different from the boundary utilized to excise the W2(156) element. By crossing the 156 and 168 strains, we demonstrate that the definition of each deletion endpoint is controlled by cis-acting determinant(s) rather than by strain-specific trans-acting factor(s). Sequence comparison of all deleted DNA segments indicates that the 5'-TA-3' terminal sequence is strictly required at their ends. Furthermore, the identity of the first eight base pairs of these ends to a previously established consensus sequence correlates with the frequency of the corresponding deletion events. Our data imply the existence of an adaptive convergent evolution of these Paramecium deleted DNA segment end sequences. PMID:9171098
Whisson, Stephen C; Avrova, Anna O; Lavrova, Olga; Pritchard, Leighton
2005-04-01
The first known families of tRNA-related short interspersed elements (SINEs) in the oomycetes were identified by exploiting the genomic DNA sequence resources for the potato late blight pathogen, Phytophthora infestans. Fifteen families of tRNA-related SINEs, as well as predicted tRNAs, and other possible RNA polymerase III-transcribed sequences were identified. The size of individual elements ranges from 101 to 392 bp, representing sequences present from low (1) to highly abundant (over 2000) copy number in the P. infestans genome, based on quantitative PCR analysis. Putative short direct repeat sequences (6-14 bp) flanking the elements were also identified for eight of the SINEs. Predicted SINEs were named in a series prefixed infSINE (for infestans-SINE). Two SINEs were apparently present as multimers of tRNA-related units; four copies of a related unit for infSINEr, and two unrelated units for infSINEz. Two SINEs, infSINEh and infSINEi, were typically located within 400 bp of each other. These were also the only two elements identified as being actively transcribed in the mycelial stage of P. infestans by RT-PCR. It is possible that infSINEh and infSINEi represent active retrotransposons in P. infestans. Based on the quantitative PCR estimates of copy number for all of the elements identified, tRNA-related SINEs were estimated to comprise 0.3% of the 250 Mb P. infestans genome. InfSINE-related sequences were found to occur in species throughout the genus Phytophthora. However, seven elements were shown to be exclusive to P. infestans.
A variant Tc4 transposable element in the nematode C. elegans could encode a novel protein.
Li, W; Shaw, J E
1993-01-01
A variant C. elegans Tc4 transposable element, Tc4-rh1030, has been sequenced and is 3483 bp long. The Tc4 element that had been analyzed previously is 1605 bp long, consists of two 774-bp nearly perfect inverted terminal repeats connected by a 57-bp loop, and lacks significant open reading frames. In Tc4-rh1030, by comparison, a 2343-bp novel sequence is present in place of a 477-bp segment in one of the inverted repeats. The novel sequence of Tc4-rh1030 is present about five times per haploid genome and is invariably associated with Tc4 elements; we have used the designation Tc4v to denote this variant subfamily of Tc4 elements. Sequence analysis of three cDNA clones suggests that a Tc4v element contains at least five exons that could encode a novel basic protein of 537 amino acid residues. On northern blots, a 1.6-kb Tc4v-specific transcript was detected in the mutator strain TR679 but not in the wild-type strain N2; Tc4 elements are known to transpose in TR679 but appear to be quiescent in N2. We have analyzed transcripts produced by an unc-33 gene that has the Tc4-rh1030 insertional mutation in its transcribed region; all or almost all of the Tc4v sequence is frequently spliced out of the mutant unc-33 transcripts, sometimes by means of non-consensus splice acceptor sites. PMID:8382791
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequencer 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequencer 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate < 0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5'-labelled with a cytosine are heavily overrepresented among the final sequences, while those 5'-labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method by which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
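The tag-based assignment described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tag table, the primer sequence, the reads, and the function name `demultiplex` are all hypothetical, and the sketch assumes each read begins with a fixed-length tag immediately followed by the shared primer.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_source, primer):
    """Assign each read to a source using the tag that precedes the primer.

    Assumes (hypothetically) that every tag has the same length and that a
    valid read is laid out as tag + primer + insert.
    """
    tag_len = len(next(iter(tag_to_source)))
    assigned = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, rest = read[:tag_len], read[tag_len:]
        # Require both a known tag and the primer directly after it, so that
        # reads with a damaged 5' end are set aside rather than misassigned.
        if tag in tag_to_source and rest.startswith(primer):
            assigned[tag_to_source[tag]].append(rest)
        else:
            unassigned.append(read)
    return dict(assigned), unassigned


# Hypothetical dinucleotide tags, primer, and reads (illustration only).
tags = {"AC": "specimen_1", "CA": "specimen_2", "TG": "specimen_3"}
primer = "CCTAG"
reads = ["ACCCTAGTTTT", "CACCTAGAAAA", "TGCCTAGGGGG", "GGCCTAGCCCC", "ACTTTTT"]

assigned, unassigned = demultiplex(reads, tags, primer)
```

Counting the reads per tag in `assigned` is also the natural place to look for the kind of tag-dependent representation bias the abstract reports.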
A multiplexed system for quantitative comparisons of chromatin landscapes
van Galen, Peter; Viny, Aaron D.; Ram, Oren; Ryan, Russell J.H.; Cotton, Matthew J.; Donohue, Laura; Sievers, Cem; Drier, Yotam; Liau, Brian B.; Gillespie, Shawn M.; Carroll, Kaitlin M.; Cross, Michael B.; Levine, Ross L.; Bernstein, Bradley E.
2015-01-01
Genome-wide profiling of histone modifications can provide systematic insight into the regulatory elements and programs engaged in a given cell type. However, conventional chromatin immunoprecipitation and sequencing (ChIP-seq) does not capture quantitative information on histone modification levels, requires large amounts of starting material, and involves tedious processing of each individual sample. Here we address these limitations with a technology that leverages DNA barcoding to profile chromatin quantitatively and in multiplexed format. We concurrently map relative levels of multiple histone modifications across multiple samples, each comprising as few as a thousand cells. We demonstrate the technology by monitoring dynamic changes following inhibition of P300, EZH2 or KDM5, by linking altered epigenetic landscapes to chromatin regulator mutations, and by mapping active and repressive marks in purified human hematopoietic stem cells. Hence, this technology enables quantitative studies of chromatin state dynamics across rare cell types, genotypes, environmental conditions and drug treatments. PMID:26687680
Colistin-Resistant Acinetobacter baumannii Clinical Strains with Deficient Biofilm Formation
Dafopoulou, Konstantina; Xavier, Basil Britto; Hotterbeekx, An; Janssens, Lore; Lammens, Christine; Dé, Emmanuelle; Goossens, Herman; Tsakris, Athanasios; Malhotra-Kumar, Surbhi
2015-01-01
In two pairs of clinical colistin-susceptible/colistin-resistant (Csts/Cstr) Acinetobacter baumannii strains, the Cstr strains showed significantly decreased biofilm formation in static and dynamic assays (P < 0.001) and lower relative fitness (P < 0.05) compared with those of the Csts counterparts. The whole-genome sequencing comparison of strain pairs identified a mutation converting a stop codon to lysine (*241K) in LpsB (involved in lipopolysaccharide [LPS] synthesis) in one Cstr strain and a frameshift mutation in CarO and the loss of a 47,969-bp element containing multiple genes associated with biofilm production in the other. PMID:26666921
Rosales, Alirio
2017-04-01
Theories are composed of multiple interacting components. I argue that some theories have narratives as essential components, and that narratives function as integrative devices of the mathematical components of theories. Narratives represent complex processes unfolding in time as a sequence of stages, and hold the mathematical elements together as pieces in the investigation of a given process. I present two case studies from population genetics: R. A. Fisher's "mass selection" theory, and Sewall Wright's shifting balance theory. I apply my analysis to an early episode of the "R. A. Fisher - Sewall Wright controversy." Copyright © 2017 Elsevier Ltd. All rights reserved.
2012-01-01
Background The aim of this study was to clarify the role of global hypomethylation of repetitive elements in determining the genetic and clinical features of multiple myeloma (MM). Methods We assessed global methylation levels using four repetitive elements (long interspersed nuclear element-1 (LINE-1), Alu Ya5, Alu Yb8, and Satellite-α) in clinical samples comprising 74 MM samples and 11 benign control samples (7 cases of monoclonal gammopathy of undetermined significance (MGUS) and 4 samples of normal plasma cells (NPC)). We also evaluated copy-number alterations using array-based comparative genomic hybridization, and performed methyl-CpG binding domain sequencing (MBD-seq). Results Global levels of the repetitive-element methylation declined with the degree of malignancy of plasma cells (NPC>MGUS>MM), and there was a significant inverse correlation between the degree of genomic loss and the LINE-1 methylation levels. We identified 80 genomic loci as common breakpoints (CBPs) around commonly lost regions, which were significantly associated with increased LINE-1 densities. MBD-seq analysis revealed that average DNA-methylation levels at the CBP loci and relative methylation levels in regions with higher LINE-1 densities also declined during the development of MM. We confirmed that levels of methylation of the 5' untranslated region of respective LINE-1 loci correlated strongly with global LINE-1 methylation levels. Finally, there was a significant association between LINE-1 hypomethylation and poorer overall survival (hazard ratio 2.8, P = 0.015). Conclusion Global hypomethylation of LINE-1 is associated with the progression of and poorer prognosis for MM, possibly due to frequent copy-number loss. PMID:23259664
A Systematic Analysis of 2 Monoisocentric Techniques for the Treatment of Multiple Brain Metastases.
Narayanasamy, Ganesh; Stathakis, Sotirios; Gutierrez, Alonso N; Pappas, Evangelos; Crownover, Richard; Floyd, John R; Papanikolaou, Niko
2017-10-01
In this treatment planning study, we compare the plan quality and delivery parameters for the treatment of multiple brain metastases using 2 monoisocentric techniques: the Multiple Metastases Element from Brainlab and the RapidArc volumetric-modulated arc therapy from Varian Medical Systems. Eight patients who were treated in our institution for multiple metastases (3-7 lesions) were replanned with Multiple Metastases Element using noncoplanar dynamic conformal arcs. The same patients were replanned with the RapidArc technique in Eclipse using 4 noncoplanar arcs. Both techniques were designed using a single isocenter. Plan quality metrics (conformity index, homogeneity index, gradient index, and R50%), monitor unit, and the planning time were recorded. Comparison of the Multiple Metastases Element and RapidArc plans was performed using Shapiro-Wilk test, paired Student t test, and Wilcoxon signed rank test. A paired Wilcoxon signed rank test between Multiple Metastases Element and RapidArc showed comparable plan quality metrics and dose to brain. Mean ± standard deviation values of conformity index were 1.8 ± 0.7 and 1.7 ± 0.6, homogeneity index were 1.3 ± 0.1 and 1.3 ± 0.1, gradient index were 5.0 ± 1.8 and 5.1 ± 1.9, and R50% were 4.9 ± 1.8 and 5.0 ± 1.9 for Multiple Metastases Element and RapidArc plans, respectively. Mean brain dose was 2.3 and 2.7 Gy for Multiple Metastases Element and RapidArc plans, respectively. The mean value of monitor units in Multiple Metastases Element plan was 7286 ± 1065, which is significantly lower than the RapidArc monitor units of 9966 ± 1533 (P < .05). For the planning of multiple brain lesions to be treated with stereotactic radiosurgery, Multiple Metastases Element planning software produced equivalent conformity, homogeneity, dose falloff, and brain V12 Gy but required significantly lower monitor units, when compared to RapidArc plans.
A Systematic Analysis of 2 Monoisocentric Techniques for the Treatment of Multiple Brain Metastases
Stathakis, Sotirios; Gutierrez, Alonso N.; Pappas, Evangelos; Crownover, Richard; Floyd, John R.; Papanikolaou, Niko
2016-01-01
Background: In this treatment planning study, we compare the plan quality and delivery parameters for the treatment of multiple brain metastases using 2 monoisocentric techniques: the Multiple Metastases Element from Brainlab and the RapidArc volumetric-modulated arc therapy from Varian Medical Systems. Methods: Eight patients who were treated in our institution for multiple metastases (3-7 lesions) were replanned with Multiple Metastases Element using noncoplanar dynamic conformal arcs. The same patients were replanned with the RapidArc technique in Eclipse using 4 noncoplanar arcs. Both techniques were designed using a single isocenter. Plan quality metrics (conformity index, homogeneity index, gradient index, and R50%), monitor unit, and the planning time were recorded. Comparison of the Multiple Metastases Element and RapidArc plans was performed using Shapiro-Wilk test, paired Student t test, and Wilcoxon signed rank test. Results: A paired Wilcoxon signed rank test between Multiple Metastases Element and RapidArc showed comparable plan quality metrics and dose to brain. Mean ± standard deviation values of conformity index were 1.8 ± 0.7 and 1.7 ± 0.6, homogeneity index were 1.3 ± 0.1 and 1.3 ± 0.1, gradient index were 5.0 ± 1.8 and 5.1 ± 1.9, and R50% were 4.9 ± 1.8 and 5.0 ± 1.9 for Multiple Metastases Element and RapidArc plans, respectively. Mean brain dose was 2.3 and 2.7 Gy for Multiple Metastases Element and RapidArc plans, respectively. The mean value of monitor units in Multiple Metastases Element plan was 7286 ± 1065, which is significantly lower than the RapidArc monitor units of 9966 ± 1533 (P < .05). Conclusion: For the planning of multiple brain lesions to be treated with stereotactic radiosurgery, Multiple Metastases Element planning software produced equivalent conformity, homogeneity, dose falloff, and brain V12 Gy but required significantly lower monitor units, when compared to RapidArc plans. PMID:27612917
High-speed optical phase-shifting apparatus
Zortman, William A.
2016-11-08
An optical phase shifter includes an optical waveguide, a plurality of partial phase shifting elements arranged sequentially, and control circuitry electrically coupled to the partial phase shifting elements. The control circuitry is adapted to provide an activating signal to each of the N partial phase shifting elements such that the signal is delayed by a clock cycle between adjacent partial phase shifting elements in the sequence. The transit time for a guided optical pulse train between the input edges of consecutive partial phase shifting elements in the sequence is arranged to be equal to a clock cycle, thereby enabling pipelined processing of the optical pulses.
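The pipelining argument in this abstract can be checked with a small Python model. This is a sketch under stated assumptions, not the patented design: the function name `accumulated_phase`, the bit stream, and the per-element phase value are hypothetical, and time is counted in whole clock cycles. Pulse k enters the first element at cycle k and reaches element i at cycle k + i, while element i receives the control stream delayed by i cycles.

```python
def accumulated_phase(bits, n_elements, partial_phase, delay_cycles=1):
    """Total phase picked up by each optical pulse in a train.

    bits          -- control bit intended for each pulse (1 = shift, 0 = no shift)
    n_elements    -- number of partial phase-shifting elements in sequence
    partial_phase -- phase added by one activated element (arbitrary units)
    delay_cycles  -- control-signal delay between adjacent elements, in clock
                     cycles; the transit time between elements is one cycle
    """
    phases = []
    for k in range(len(bits)):
        total = 0.0
        for i in range(n_elements):
            arrival = k + i                         # cycle when pulse k reaches element i
            bit_index = arrival - i * delay_cycles  # control bit present at that cycle
            if 0 <= bit_index < len(bits):
                total += bits[bit_index] * partial_phase
        phases.append(total)
    return phases


bits = [1, 0, 1, 1]  # hypothetical control stream
matched = accumulated_phase(bits, n_elements=4, partial_phase=0.25)
unmatched = accumulated_phase(bits, n_elements=4, partial_phase=0.25, delay_cycles=0)
```

With the delay matched to the transit time, every element applies the same bit to a given pulse, so `matched` is the full shift or none for each pulse. With no delay, neighbouring bits leak into each pulse and `unmatched` contains intermediate values.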
Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.
2010-01-01
Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10–20% nucleotide deviation from the canonical ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966
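Scoring candidate sites by their deviation from the canonical ERE can be sketched in Python as follows. The consensus used, GGTCAnnnTGACC (two inverted half-sites separated by a 3-bp spacer), is the standard palindromic ERE. Everything else is an assumption for illustration: the abstract does not say how deviation was computed, so this sketch counts mismatches over the 10 half-site positions only, and the function name, threshold, and example sequence are hypothetical.

```python
HALF_LEFT = "GGTCA"   # canonical ERE: GGTCA nnn TGACC
HALF_RIGHT = "TGACC"
SPACER_LEN = 3


def scan_ere(seq, max_deviation=0.2):
    """Return (start, site, spacer, deviation) for windows close to the ERE.

    deviation = mismatches in the two half-sites / 10 scored positions.
    The spacer is reported but not scored, so its composition can be
    tabulated separately.
    """
    seq = seq.upper()
    width = len(HALF_LEFT) + SPACER_LEN + len(HALF_RIGHT)
    consensus = HALF_LEFT + HALF_RIGHT
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        left = window[:len(HALF_LEFT)]
        spacer = window[len(HALF_LEFT):len(HALF_LEFT) + SPACER_LEN]
        right = window[len(HALF_LEFT) + SPACER_LEN:]
        mismatches = sum(a != b for a, b in zip(left + right, consensus))
        deviation = mismatches / len(consensus)
        if deviation <= max_deviation:
            hits.append((start, window, spacer, deviation))
    return hits


# Hypothetical sequence: a perfect ERE at offset 3 and a one-mismatch
# variant (TGTCC instead of TGACC) at offset 18.
example = "TTA" + "GGTCA" + "CAG" + "TGACC" + "TT" + "GGTCA" + "TTG" + "TGTCC" + "AA"
hits = scan_ere(example)
```

Collecting the `spacer` field across hits is one way to look for the kind of spacer enrichment the abstract describes.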
Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B
2010-04-01
Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10-20% nucleotide deviation from the canonical ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.
Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.
2004-01-01
We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169
Using a Sequence of Earcons to Monitor Multiple Simulated Patients.
Hickling, Anna; Brecknell, Birgit; Loeb, Robert G; Sanderson, Penelope
2017-03-01
The aim of this study was to determine whether a sequence of earcons can effectively convey the status of multiple processes, such as the status of multiple patients in a clinical setting. Clinicians often monitor multiple patients. An auditory display that intermittently conveys the status of multiple patients may help. Nonclinician participants listened to sequences of 500-ms earcons that each represented the heart rate (HR) and oxygen saturation (SpO2) levels of a different simulated patient. In each sequence, one, two, or three patients had an abnormal level of HR and/or SpO2. In Experiment 1, participants reported which of nine patients in a sequence were abnormal. In Experiment 2, participants identified the vital signs of one, two, or three abnormal patients in sequences of one, five, or nine patients, where the interstimulus interval (ISI) between earcons was 150 ms. Experiment 3 used the five-sequence condition of Experiment 2, but the ISI was either 150 ms or 800 ms. Participants reported which patient(s) were abnormal with median 95% accuracy. Identification accuracy for vital signs decreased as the number of abnormal patients increased from one to three, p < .001, but accuracy was unaffected by number of patients in a sequence. Overall, identification accuracy was significantly higher with an ISI of 800 ms (89%) compared with an ISI of 150 ms (83%), p < .001. A multiple-patient display can be created by cycling through earcons that represent individual patients. The principles underlying the multiple-patient display can be extended to other vital signs, designs, and domains.
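The timing of one pass through such a display follows from the figures in this abstract. The short Python sketch below assumes, as an interpretation, that the ISI is the silent gap between the end of one earcon and the start of the next. The function name is hypothetical.

```python
def sequence_duration_ms(n_patients, earcon_ms=500, isi_ms=150):
    """Duration of one cycle: n earcons separated by n - 1 silent gaps."""
    if n_patients < 1:
        raise ValueError("need at least one patient")
    return n_patients * earcon_ms + (n_patients - 1) * isi_ms


nine_short_isi = sequence_duration_ms(9, isi_ms=150)  # nine patients, 150-ms ISI
five_short_isi = sequence_duration_ms(5, isi_ms=150)  # five patients, 150-ms ISI
five_long_isi = sequence_duration_ms(5, isi_ms=800)   # five patients, 800-ms ISI
```

Under that assumption, five patients at the 800-ms ISI take as long to cycle through as nine patients at the 150-ms ISI, which makes the accuracy gain at the longer ISI a direct trade against update rate.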
An Enterotoxin-Bearing Pathogenicity Island in Staphylococcus epidermidis
Madhusoodanan, Jyoti; Seo, Keun Seok; Remortel, Brian; Park, Joo Youn; Hwang, Sun Young; Fox, Lawrence K.; Park, Yong Ho; Deobald, Claudia F.; Wang, Dan; Liu, Song; Daugherty, Sean C.; Gill, Ann Lindley; Bohach, Gregory A.; Gill, Steven R.
2011-01-01
Cocolonization of human mucosal surfaces causes frequent encounters between various staphylococcal species, creating opportunities for the horizontal acquisition of mobile genetic elements. The majority of Staphylococcus aureus toxins and virulence factors are encoded on S. aureus pathogenicity islands (SaPIs). Horizontal movement of SaPIs between S. aureus strains plays a role in the evolution of virulent clinical isolates. Although there have been reports of the production of toxic shock syndrome toxin 1 (TSST-1), enterotoxin, and other superantigens by coagulase-negative staphylococci, no associated pathogenicity islands have been found in the genome of Staphylococcus epidermidis, a generally less virulent relative of S. aureus. We show here the first evidence of a composite S. epidermidis pathogenicity island (SePI), the product of multiple insertions in the genome of a clinical isolate. The taxonomic placement of S. epidermidis strain FRI909 was confirmed by a number of biochemical tests and multilocus sequence typing. The genome sequence of this strain was analyzed for other unique gene clusters and their locations. This pathogenicity island encodes and expresses staphylococcal enterotoxin C3 (SEC3) and staphylococcal enterotoxin-like toxin L (SElL), as confirmed by quantitative reverse transcription-PCR (qRT-PCR) and immunoblotting. We present here an initial characterization of this novel pathogenicity island, and we establish that it is stable, expresses enterotoxins, and is not obviously transmissible by phage transduction. We also describe the genome sequence, excision, replication, and packaging of a novel bacteriophage in S. epidermidis FRI909, as well as attempts to mobilize the SePI element by this phage. PMID:21317317
Comparative analysis of CRISPR-Cas systems in Klebsiella genomes.
Shen, Juntao; Lv, Li; Wang, Xudong; Xiu, Zhilong; Chen, Guoqiang
2017-04-01
The prokaryotic CRISPR-Cas system provides adaptive immunity against invasive genetic elements. Bacteria of the genus Klebsiella are important nosocomial opportunistic pathogens. However, the CRISPR-Cas systems of Klebsiella remain largely uncharacterized. Here, we analyzed the CRISPR-Cas systems of 68 complete genomes of Klebsiella representing four species. All the elements of the CRISPR-Cas system (cas genes, repeats, leader sequences, and PAMs) were characterized. Besides the typical Type I-E and I-F CRISPR-Cas systems, a new subtype I system located in the ABC transport system-glyoxalase region was found. The conservation of the new subtype CRISPR system between different species provided new evidence for CRISPR horizontal transfer. CRISPR polymorphism was strongly correlated with both species and multilocus sequence types. Several results were consistent with a function in adaptive immunity: most spacers (112 of 124) matched prophages and plasmids, and none matched housekeeping genes; new spacer acquisition was observed within the same sequence type (ST) and the same clonal complex; and identical spacers were observed only in the ancient positions (far from the leader) between different STs and clonal complexes. Interestingly, a high ratio of self-targeting spacers (7.5%, 31 of 416) was found in CRISPR-bearing Klebsiella pneumoniae (61%, 11 of 18). In some strains, there were even multiple fully matching self-targeting spacers. Some self-targeting spacers were conserved even between different STs. These results indicate that unknown mechanisms exist that compromise self-targeting by CRISPR-Cas systems in K. pneumoniae. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Directions for Space-Based Low-Frequency Radio Astronomy 2. Telescopes
NASA Astrophysics Data System (ADS)
Basart, J. P.; Burns, J. O.; Dennison, B. K.; Weiler, K. W.; Kassim, N. E.; Castillo, S. P.; McCune, B. M.
Astronomical studies of celestial sources at low radio frequencies (0.3 to 30 MHz) lag far behind the investigations of celestial sources at high radio frequencies. In a companion paper [Basart et al., this issue] we discussed the need for low-frequency investigations, and in this paper we discuss the telescopes required to make the observations. Radio telescopes for use in the low-frequency range can be built principally from "off-the-shelf" components. For relatively little cost for a space mission, great strides can be made in deploying arrays of antennas and receivers in space that would produce data contributing significantly to our understanding of galaxies and galactic nebulae. In this paper we discuss an evolutionary sequence of telescopes, antenna systems, receivers, and (u,v) plane coverage. The telescopes are space-based because of the disruptive aspects of the Earth's ionosphere on low-frequency celestial signals traveling to the Earth's surface. Orbiting antennas consisting of array elements deposited on a Kevlar balloon have strong advantages of nearly identical multiple beams over 4π steradians and few mechanical aspects in deployment and operation. The relatively narrow beam width of these antennas can significantly help reduce the "confusion" problem. The evolutionary sequence of telescopes starts with an Earth-orbiting spectrometer to measure the low-frequency radio environment in space, proceeds to a two-element interferometer, then to an orbiting array, and ends with a telescope on the lunar farside. The sequence is in the order of increasing capability which is also the order of increasing complexity and cost. All the missions can be accomplished with current technology.
Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J
2000-12-01
The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28-nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant; therefore, Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant; this GA-binding activity is largely masked in wild-type extracts by the high-affinity binding of the Dd-STATa protein. We suggest that, in addition to the long-range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.
[Learning and Repetitive Reproduction of Memorized Sequences by the Right and the Left Hand].
Bobrova, E V; Lyakhovetskii, V A; Bogacheva, I N
2015-01-01
An important stage of learning a new skill is repetitive reproduction of one and the same sequence of movements, which plays a significant role in the formation of movement stereotypes. Two groups of right-handers repeatedly memorized (6-10 repetitions) sequences in which the experimenter moved their hand through 6 positions, first with the right hand (RH) and then with the left hand (LH), or vice versa. Random sequences previously unknown to the volunteers were reproduced in the 11 series. Modified sequences were tested in the 2nd and 3rd series, where the same elements' positions were presented in a different order. The processes of repetitive sequence reproduction were similar for RH and LH. However, the learning of the modified sequences differed: information about the elements' positions, regardless of reproduction order, was used only when LH initiated task performance. This information was not used when LH followed RH or when RH performed the task. Consequently, the type of information coding activated by LH helped participants learn the positions of sequence elements, while the type of information coding activated by RH prevented learning. This is presumably connected with the predominant role of the right hemisphere in the processes of positional coding and motor learning.
Hobo, T; Asada, M; Kowyama, Y; Hattori, T
1999-09-01
ACGT-containing ABA response elements (ABREs) have been functionally identified in the promoters of various genes. In addition, single copies of ABRE have been found to require a cis-acting, coupling element to achieve ABA induction. A coupling element 3 (CE3) sequence, originally identified as such in the barley HVA1 promoter, is found approximately 30 bp downstream of motif A (ACGT-containing ABRE) in the promoter of the Osem gene. The relationship between these two elements was further defined by linker-scan analyses of a 55 bp fragment of the Osem promoter, which is sufficient for ABA-responsiveness and VP1 activation. The analyses revealed that both motif A and the CE3 sequence were required not only for ABA-responsiveness but also for VP1 activation. Since the sequences of motif A and CE3 were found to be similar, motif-exchange experiments were carried out. The experiments demonstrated that motif A and CE3 were interchangeable with each other with respect to both ABA and VP1 regulation. In addition, both sequences were shown to be recognized by a VP1-interacting, ABA-responsive bZIP factor TRAB1. These results indicate that ACGT-containing ABREs and CE3 are functionally equivalent cis-acting elements. Furthermore, TRAB1 was shown to bind two other non-ACGT ABREs. Based on these results, all these ABREs including CE3 are proposed to be categorized into a single class of cis-acting elements.
2009-01-01
Background Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. Results To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, specific functional clusters are delineated by comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer), and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA- and redox protection, maintenance and protein recycling. Specific regulatory elements regulate tardigrade mRNA stability such as lox P DICE elements whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation are rapidly identified by sequence and/or pattern search on the web-tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The work-bench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments including species-specific repositories of all analysed data. Conclusion Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences. PMID:19821996
Förster, Frank; Liang, Chunguang; Shkumatov, Alexander; Beisser, Daniela; Engelmann, Julia C; Schnölzer, Martina; Frohme, Marcus; Müller, Tobias; Schill, Ralph O; Dandekar, Thomas
2009-10-12
Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, specific functional clusters are delineated by comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer), and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA- and redox protection, maintenance and protein recycling. Specific regulatory elements regulate tardigrade mRNA stability such as lox P DICE elements whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation are rapidly identified by sequence and/or pattern search on the web-tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The work-bench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments including species-specific repositories of all analysed data. Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences.
Gautier, Philippe; Loosli, Felix; Tay, Boon-Hui; Tay, Alice; Murdoch, Emma; Coutinho, Pedro; van Heyningen, Veronica; Brenner, Sydney; Venkatesh, Byrappa; Kleinjan, Dirk A.
2013-01-01
Pax6 is a developmental control gene essential for eye development throughout the animal kingdom. In addition, Pax6 plays key roles in other parts of the CNS, olfactory system, and pancreas. In mammals a single Pax6 gene encoding multiple isoforms delivers these pleiotropic functions. Here we provide evidence that the genomes of many other vertebrate species contain multiple Pax6 loci. We sequenced Pax6-containing BACs from the cartilaginous elephant shark (Callorhinchus milii) and found two distinct Pax6 loci. Pax6.1 is highly similar to mammalian Pax6, while Pax6.2 encodes a paired-less Pax6. Using synteny relationships, we identify homologs of this novel paired-less Pax6.2 gene in lizard and in frog, as well as in zebrafish and in other teleosts. In zebrafish two full-length Pax6 duplicates were known previously, originating from the fish-specific genome duplication (FSGD) and expressed in divergent patterns due to paralog-specific loss of cis-elements. We show that teleosts other than zebrafish also maintain duplicate full-length Pax6 loci, but differences in gene and regulatory domain structure suggest that these Pax6 paralogs originate from a more ancient duplication event and are hence renamed as Pax6.3. Sequence comparisons between mammalian and elephant shark Pax6.1 loci highlight the presence of short- and long-range conserved noncoding elements (CNEs). Functional analysis demonstrates the ancient role of long-range enhancers for Pax6 transcription. We show that the paired-less Pax6.2 ortholog in zebrafish is expressed specifically in the developing retina. Transgenic analysis of elephant shark and zebrafish Pax6.2 CNEs with homology to the mouse NRE/Pα internal promoter revealed highly specific retinal expression. Finally, morpholino depletion of zebrafish Pax6.2 resulted in a “small eye” phenotype, supporting a role in retinal development. 
In summary, our study reveals that the pleiotropic functions of Pax6 in vertebrates are served by a divergent family of Pax6 genes, forged by ancient duplication events and by independent, lineage-specific gene losses. PMID:23359656
2011-01-01
Background One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. 
Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues. PMID:21226900
Xiong, Y; Eickbush, T H
1988-01-01
Two types of insertion elements, R1 and R2 (previously called type I and type II), are known to interrupt the 28S ribosomal genes of several insect species. In the silkmoth, Bombyx mori, each element occupies approximately 10% of the estimated 240 ribosomal DNA units, while at most only a few copies are located outside the ribosomal DNA units. We present here the complete nucleotide sequence of an R1 insertion from B. mori (R1Bm). This 5.1-kilobase element contains two overlapping open reading frames (ORFs) which together occupy 88% of its length. ORF1 is 461 amino acids in length and exhibits characteristics of retroviral gag genes. ORF2 is 1,051 amino acids in length and contains homology to reverse transcriptase-like enzymes. The analysis of 3' and 5' ends of independent isolates from the ribosomal locus supports the suggestion that R1 is still functioning as a transposable element. The precise location of the element within the genome implies that its transposition must occur with remarkable insertion sequence specificity. Comparison of the deduced amino acid sequences from six retrotransposons, R1 and R2 of B. mori, I factor and F element of Drosophila melanogaster, L1 of Mus domesticus, and Ingi of Trypanosoma brucei, reveals a relatively high level of sequence homology in the reverse transcriptase region. Like R1, these elements lack long terminal repeats. We have therefore named this class of related elements the non-long-terminal-repeat (non-LTR) retrotransposons. PMID:2447482
Oliviero, S; Cortese, R
1989-01-01
Transcription of the human haptoglobin (Hp) gene is induced by interleukin-6 (IL-6) in the human hepatoma cell line Hep3B. Cis-acting elements responsible for this response are localized within the first 186 bp of the 5'-flanking region. Site-specific mutants of the Hp promoter fused to the chloramphenicol acetyl transferase (CAT) gene were analysed by transient transfection into uninduced and IL-6-treated Hep3B cells. We identified three regions, A, B and C, defined by mutation, which are important for the IL-6 response. Band shift experiments using nuclear extracts from untreated or IL-6-treated cells revealed the presence of IL-6-inducible DNA binding activities when DNA fragments containing the A or the C sequences were used. Competition experiments showed that both sequences bind to the same nuclear factors. Polymers of oligonucleotides containing either the A or the C regions confer IL-6 responsiveness to a truncated SV40 promoter. The B region forms several complexes with specific DNA-binding proteins different from those which bind to the A and C region. The B region complexes are identical in nuclear extracts from IL-6-treated and untreated cells. While important for IL-6 induction in the context of the haptoglobin promoter, the B site does not confer IL-6 inducibility to the SV40 promoter. Our results indicate that the IL-6 response of the haptoglobin promoter is dependent on the presence of multiple, partly redundant, cis-acting elements. PMID:2787245
cis-Acting elements important for retroviral RNA packaging specificity.
Beasley, Benjamin E; Hu, Wei-Shau
2002-05-01
Spleen necrosis virus (SNV) proteins can package RNA from distantly related murine leukemia virus (MLV), whereas MLV proteins cannot package SNV RNA efficiently. We used this nonreciprocal recognition to investigate regions of packaging signals that influence viral RNA encapsidation specificity. Although the MLV and SNV packaging signals (Psi and E, respectively) do not contain significant sequence homology, they both contain a pair of hairpins. This hairpin pair was previously proposed to be the core element in MLV Psi. In the present study, MLV-based vectors were generated to contain chimeric SNV/MLV packaging signals in which the hairpins were replaced with the heterologous counterpart. The interactions between these chimeras and MLV or SNV proteins were examined by virus replication and RNA analyses. SNV proteins recognized all of the chimeras, indicating that these chimeras were functional. We found that replacing the hairpin pair did not drastically alter the ability of MLV proteins to package these chimeras. These results indicate that, despite the important role of the hairpin pair in RNA packaging, it is not the major motif responsible for the ability of MLV proteins to discriminate between the MLV and SNV packaging signals. To determine the role of sequences flanking the hairpins in RNA packaging specificity, vectors with swapped flanking regions were generated and evaluated. SNV proteins packaged all of these chimeras efficiently. In contrast, MLV proteins strongly favored chimeras with the MLV 5'-flanking regions. These data indicated that MLV Gag recognizes multiple elements in the viral packaging signal, including the hairpin structure and flanking regions.
Richard, D; Ravigné, V; Rieux, A; Facon, B; Boyer, C; Boyer, K; Grygiel, P; Javegny, S; Terville, M; Canteros, B I; Robène, I; Vernière, C; Chabirand, A; Pruvost, O; Lefeuvre, P
2017-04-01
Copper-based antimicrobial compounds are widely used to control plant bacterial pathogens. Pathogens have adapted in response to this selective pressure. Xanthomonas citri pv. citri, a major citrus pathogen causing Asiatic citrus canker, was first reported to carry plasmid-encoded copper resistance in Argentina. This phenotype was conferred by the copLAB gene system. The emergence of resistant strains has since been reported in Réunion and Martinique. Using microsatellite-based genotyping and copLAB PCR, we demonstrated that the genetic structure of the copper-resistant strains from these three regions was made up of two distant clusters and varied for the detection of copLAB amplicons. In order to investigate this pattern more closely, we sequenced six copper-resistant X. citri pv. citri strains from Argentina, Martinique and Réunion, together with reference copper-resistant Xanthomonas and Stenotrophomonas strains using long-read sequencing technology. Genes involved in copper resistance were found to be strain dependent with the novel identification in X. citri pv. citri of copABCD and a cus heavy metal efflux resistance-nodulation-division system. The genes providing the adaptive trait were part of a mobile genetic element similar to Tn3-like transposons and included in a conjugative plasmid. This indicates the system's great versatility. The mining of all available bacterial genomes suggested that, within the bacterial community, the spread of copper resistance associated with mobile elements and their plasmid environments was primarily restricted to the Xanthomonadaceae family. © 2017 John Wiley & Sons Ltd.
Limitations and potentials of current motif discovery algorithms
Hu, Jianjun; Li, Bin; Kihara, Daisuke
2005-01-01
Computational methods for de novo identification of gene regulation elements, such as transcription factor binding sites, have proved to be useful for deciphering genetic regulatory networks. However, despite the availability of a large number of algorithms, their strengths and weaknesses are not sufficiently understood. Here, we designed a comprehensive set of performance measures and benchmarked five modern sequence-based motif discovery algorithms using large datasets generated from Escherichia coli RegulonDB. Factors that affect the prediction accuracy, scalability and reliability are characterized. It is revealed that the nucleotide and the binding site level accuracy are very low, while the motif level accuracy is relatively high, which indicates that the algorithms can usually capture at least one correct motif in an input sequence. To exploit diverse predictions from multiple runs of one or more algorithms, a consensus ensemble algorithm has been developed, which achieved 6–45% improvement over the base algorithms by increasing both the sensitivity and specificity. Our study illustrates limitations and potentials of existing sequence-based motif discovery algorithms. Taking advantage of the revealed potentials, several promising directions for further improvements are discussed. Since the sequence-based algorithms are the baseline of most of the modern motif discovery algorithms, this paper suggests substantial improvements would be possible for them. PMID:16284194
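The nucleotide-level accuracy mentioned above can be illustrated with a minimal sketch. It assumes sites are given as half-open `(start, end)` intervals on a single sequence and that accuracy is summarised as sensitivity and positive predictive value; the abstract does not spell out its exact measures, so the function names and the choice of metrics here are illustrative assumptions, not the benchmark's definitions.

```python
def covered_positions(intervals):
    """Return the set of nucleotide positions covered by half-open intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end))
    return positions


def nucleotide_level_scores(known_sites, predicted_sites):
    """Hypothetical helper: nucleotide-level sensitivity and PPV.

    known_sites / predicted_sites: lists of (start, end) half-open intervals.
    Returns (sensitivity, positive_predictive_value).
    """
    known = covered_positions(known_sites)
    predicted = covered_positions(predicted_sites)
    tp = len(known & predicted)   # annotated positions that were predicted
    fn = len(known - predicted)   # annotated positions that were missed
    fp = len(predicted - known)   # predicted positions with no annotation
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv


# One known 10-nt site and a prediction shifted by 5 nt.
scores = nucleotide_level_scores([(10, 20)], [(15, 25)])
```

A prediction that overlaps a true site by half scores only 0.5 on both measures at the nucleotide level, even though it would count as a hit at the motif level; this is one way the two levels of accuracy can diverge.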
Botelho, Ana; Canto, Ana; Leão, Célia; Cunha, Mónica V
2015-01-01
Typical CRISPR (clustered, regularly interspaced, short palindromic repeat) regions are constituted by short direct repeats (DRs), interspersed with similarly sized non-repetitive spacers, derived from transmissible genetic elements, acquired when the cell is challenged with foreign DNA. The analysis of the structure, in number and nature, of CRISPR spacers is a valuable tool for molecular typing since these loci are polymorphic among strains, giving rise to characteristic signatures. The existence of CRISPR structures in the genome of the members of Mycobacterium tuberculosis complex (MTBC) enabled the development of a genotyping method, based on the analysis of the presence or absence of 43 oligonucleotide spacers separated by conserved DRs. This method, called spoligotyping, consists of PCR amplification of the DR chromosomal region and recognition, after hybridization, of the spacers that are present. In the workflow underlying this methodology, the PCR products are applied to a membrane containing synthetic oligonucleotides that have complementary sequences to the spacer sequences. Lack of hybridization of the PCR products to a specific oligonucleotide sequence indicates absence of the corresponding spacer sequence in the examined strain. Spoligotyping gained great notoriety as a robust identification and typing tool for members of MTBC, enabling multiple epidemiological studies on human and animal tuberculosis.
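The presence/absence readout described above lends itself to a simple binary representation. The following is a minimal sketch, assuming spacers are numbered 1 to 43 in membrane order and that a hybridization signal is recorded as "present"; the function names are hypothetical and do not correspond to any published tool.

```python
N_SPACERS = 43  # number of spacer oligonucleotides probed, as stated in the abstract


def spoligo_pattern(present_spacers):
    """Hypothetical helper: encode hybridization results as a 43-character string.

    present_spacers: iterable of spacer numbers (1..43) that gave a signal.
    Returns a string with '1' for a present spacer and '0' for an absent one.
    """
    present = set(present_spacers)
    invalid = [s for s in present if not 1 <= s <= N_SPACERS]
    if invalid:
        raise ValueError(f"spacer numbers out of range: {sorted(invalid)}")
    return "".join("1" if s in present else "0" for s in range(1, N_SPACERS + 1))


def absent_spacers(pattern):
    """List the 1-based spacer numbers marked absent ('0') in a pattern."""
    return [i + 1 for i, flag in enumerate(pattern) if flag == "0"]


# Example strain: every spacer hybridizes except 3, 9 and 10.
signals = [s for s in range(1, N_SPACERS + 1) if s not in (3, 9, 10)]
pattern = spoligo_pattern(signals)
```

Two strains can then be compared by comparing their pattern strings directly, since identical strings mean identical spacer content at the probed positions.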
Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P
2018-01-01
Abstract Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048
Fischer, Susan; Maier, Lisa-Katharina; Stoll, Britta; Brendel, Jutta; Fischer, Eike; Pfeiffer, Friedhelm; Dyall-Smith, Mike; Marchfelder, Anita
2012-01-01
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated (Cas) system provides adaptive and heritable immunity against foreign genetic elements in most archaea and many bacteria. Although this system is widespread and diverse with many subtypes, only a few species have been investigated to elucidate the precise mechanisms for the defense of viruses or plasmids. Approximately 90% of all sequenced archaea encode CRISPR/Cas systems, but their molecular details have so far only been examined in three archaeal species: Sulfolobus solfataricus, Sulfolobus islandicus, and Pyrococcus furiosus. Here, we analyzed the CRISPR/Cas system of Haloferax volcanii using a plasmid-based invader assay. Haloferax encodes a type I-B CRISPR/Cas system with eight Cas proteins and three CRISPR loci for which the identity of protospacer adjacent motifs (PAMs) was unknown until now. We identified six different PAM sequences that are required upstream of the protospacer to permit target DNA recognition. This is only the second archaeon for which PAM sequences have been determined, and the first CRISPR group with such a high number of PAM sequences. Cells could survive the plasmid challenge if their CRISPR/Cas system was altered or defective, e.g. by deletion of the cas gene cassette. Experimental PAM data were supplemented with bioinformatics data on Haloferax and Haloquadratum. PMID:22767603
Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P
2018-03-01
Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.
Approximation algorithm for the problem of partitioning a sequence into clusters
NASA Astrophysics Data System (ADS)
Kel'manov, A. V.; Mikhailova, L. V.; Khamidullin, S. A.; Khandeev, V. I.
2017-08-01
We consider the problem of partitioning a finite sequence of Euclidean points into a given number of clusters (subsequences) using the criterion of the minimal sum (over all clusters) of intracluster sums of squared distances from the elements of the clusters to their centers. It is assumed that the center of one of the desired clusters is at the origin, while the center of each of the other clusters is unknown and determined as the mean value over all elements in this cluster. Additionally, the partition obeys two structural constraints on the indices of sequence elements contained in the clusters with unknown centers: (1) the concatenation of the indices of elements in these clusters is an increasing sequence, and (2) the difference between an index and the preceding one is bounded above and below by prescribed constants. It is shown that this problem is strongly NP-hard. A 2-approximation algorithm is constructed that is polynomial-time for a fixed number of clusters.
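To make the objective and the two index constraints concrete, here is a minimal sketch that only evaluates a given partition; it is not the 2-approximation algorithm from the paper. It assumes 0-based indices, non-empty clusters, and that every point not assigned to a cluster with an unknown center belongs to the cluster centered at the origin. The names `partition_cost`, `indices_feasible`, `t_min` and `t_max` are illustrative.

```python
def sq_norm(vector):
    """Squared Euclidean norm of a vector given as a sequence of floats."""
    return sum(x * x for x in vector)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def partition_cost(sequence, clusters):
    """Sum of within-cluster squared distances for a given partition.

    sequence: list of points; clusters: list of index lists, one per cluster
    with an unknown center. Unassigned points are charged their squared
    distance to the origin.
    """
    assigned = set()
    cost = 0.0
    for indices in clusters:
        points = [sequence[i] for i in indices]
        center = centroid(points)
        cost += sum(sq_norm([a - c for a, c in zip(p, center)]) for p in points)
        assigned.update(indices)
    cost += sum(sq_norm(sequence[i]) for i in range(len(sequence)) if i not in assigned)
    return cost


def indices_feasible(clusters, t_min, t_max):
    """Check constraints (1) and (2) on the concatenated cluster indices."""
    flat = [i for indices in clusters for i in indices]
    gaps = [b - a for a, b in zip(flat, flat[1:])]
    return all(g > 0 and t_min <= g <= t_max for g in gaps)


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
clusters = [[1, 2], [3, 4]]
cost = partition_cost(points, clusters)  # 2.0 + 2.0 + 0.0 for the origin cluster
```

Evaluating a candidate partition is cheap; the hardness result in the abstract concerns the search over all partitions that satisfy the index constraints.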
Characterization of an endogenous retrovirus class in elephants and their relatives
Greenwood, Alex D; Englbrecht, Claudia C; MacPhee, Ross DE
2004-01-01
Background Endogenous retrovirus-like elements (ERV-Ls, primed with tRNA leucine) are a diverse group of reiterated sequences related to foamy viruses and widely distributed among mammals. As shown in previous investigations, in many primates and rodents this class of elements has remained transpositionally active, as reflected by increased copy number and high sequence diversity within and among taxa. Results Here we examine whether proviral-like sequences may be suitable molecular probes for investigating the phylogeny of groups known to have high element diversity. As a test we characterized ERV-Ls occurring in a sample of extant members of superorder Uranotheria (Asian and African elephants, manatees, and hyraxes). The ERV-L complement in this group is even more diverse than previously suspected, and there is sequence evidence for active expansion, particularly in elephantids. Many of the elements characterized have protein coding potential suggestive of activity. Conclusions In general, the evidence supports the hypothesis that the complement had a single origin within basal Uranotheria. PMID:15476555
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chris Amemiya
2003-04-01
The goals of this project were to isolate, characterize, and sequence the Dlx3/Dlx7 bigene cluster from twelve different species of mammals. The Dlx3 and Dlx7 genes are known to encode homeobox transcription factors involved in patterning of structures in the vertebrate jaw as well as vertebrate limbs. Genomic sequences from the respective taxa will subsequently be compared in order to identify conserved non-coding sequences that are potential cis-regulatory elements. Based on the comparisons they will fashion transgenic mouse experiments to functionally test the strength of the potential cis-regulatory elements. A goal of the project is to attempt to identify those elements that may function in coordinately regulating both Dlx3 and Dlx7 functions.
Wong, S W; Schaffer, P A
1991-05-01
Like other DNA-containing viruses, the three origins of herpes simplex virus type 1 (HSV-1) DNA replication are flanked by sequences containing transcriptional regulatory elements. In a transient plasmid replication assay, deletion of sequences comprising the transcriptional regulatory elements of ICP4 and ICP22/47, which flank oriS, resulted in a greater than 80-fold decrease in origin function compared with a plasmid, pOS-822, which retains these sequences. In an effort to identify specific cis-acting elements responsible for this effect, we conducted systematic deletion analysis of the flanking region with plasmid pOS-822 and tested the resulting mutant plasmids for origin function. Stimulation by cis-acting elements was shown to be both distance and orientation dependent, as changes in either parameter resulted in a decrease in oriS function. Additional evidence for the stimulatory effect of flanking sequences on origin function was demonstrated by replacement of these sequences with the cytomegalovirus immediate-early promoter, resulting in nearly wild-type levels of oriS function. In competition experiments, cotransfection of cells with the test plasmid, pOS-822, and increasing molar concentrations of a competitor plasmid which contained the ICP4 and ICP22/47 transcriptional regulatory regions but lacked core origin sequences resulted in a significant reduction in the replication efficiency of pOS-822, demonstrating that factors which bind specifically to the oriS-flanking sequences are likely involved as auxiliary proteins in oriS function. Together, these studies demonstrate that trans-acting factors and the sites to which they bind play a critical role in the efficiency of HSV-1 DNA replication from oriS in transient-replication assays.
Combinatorial events of insertion sequences and ICE in Gram-negative bacteria.
Toleman, Mark A; Walsh, Timothy R
2011-09-01
The emergence of antibiotic and antimicrobial resistance in Gram-negative bacteria is incremental and linked to genetic elements that function in a so-called 'one-ended transposition' manner, including ISEcp1, ISCR elements and Tn3-like transposons. The power of these elements lies in their inability to consistently recognize one of their own terminal sequences, while recognizing more genetically distant surrogate sequences. This has the effect of mobilizing the DNA sequence found adjacent to their initial location. In general, resistance in Gram-negatives is closely linked to a few one-off events. These include the capture of the class 1 integron by a Tn5090-like transposon; the formation of the 3' conserved segment (3'-CS); and the fusion of the ISCR1 element to the 3'-CS. The structures formed by these rare events have been massively amplified and disseminated in Gram-negative bacteria, but hitherto, are rarely found in Gram-positives. Such events dominate current resistance gene acquisition and are instrumental in the construction of large resistance gene islands on chromosomes and plasmids. Similar combinatorial events appear to have occurred between conjugative plasmids and phages constructing hybrid elements called integrative and conjugative elements or conjugative transposons. These elements are beginning to be closely linked to some of the more powerful resistance mechanisms such as the extended spectrum β-lactamases, metallo- and AmpC type β-lactamases. Antibiotic resistance in Gram-negative bacteria is dominated by unusual combinatorial mistakes of Insertion sequences and gene fusions which have been selected and amplified by antibiotic pressure enabling the formation of extended resistance islands. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Kinnevey, Peter M.; Shore, Anna C.; Brennan, Grainne I.; Sullivan, Derek J.; Ehricht, Ralf; Monecke, Stefan; Slickers, Peter
2013-01-01
Methicillin-resistant Staphylococcus aureus (MRSA) has been a major cause of nosocomial infection in Irish hospitals for 4 decades, and replacement of predominant MRSA clones has occurred several times. An MRSA isolate recovered in 2006 as part of a larger study of sporadic MRSA exhibited a rare spa (t878) and multilocus sequence (ST779) type and was nontypeable by PCR- and DNA microarray-based staphylococcal cassette chromosome mec (SCCmec) element typing. Whole-genome sequencing revealed the presence of a novel 51-kb composite island (CI) element with three distinct domains, each flanked by direct repeat and inverted repeat sequences, including (i) a pseudo SCCmec element (16.3 kb) carrying mecA with a novel mec class region, a fusidic acid resistance gene (fusC), and two copper resistance genes (copB and copC) but lacking ccr genes; (ii) an SCC element (17.5 kb) carrying a novel ccrAB4 allele; and (iii) an SCC element (17.4 kb) carrying a novel ccrC allele and a clustered regularly interspaced short palindromic repeat (CRISPR) region. The novel CI was subsequently identified by PCR in an additional 13 t878/ST779 MRSA isolates, six from bloodstream infections, recovered between 2006 and 2011 in 11 hospitals. Analysis of open reading frames (ORFs) carried by the CI showed amino acid sequence similarity of 44 to 100% to ORFs from S. aureus and coagulase-negative staphylococci (CoNS). These findings provide further evidence of genetic transfer between S. aureus and CoNS and show how this contributes to the emergence of novel SCCmec elements and MRSA strains. Ongoing surveillance of this MRSA strain is warranted and will require updating of currently used SCCmec typing methods. PMID:23147725
He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei
2015-04-18
Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop, especially in eastern Asia. Radish root contains various nutrients which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand the genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the dominant elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 was located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3 specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish. 
Proportions of repetitive elements in radish were estimated, and satellite repeats were the dominant elements. Fine karyotyping analysis was established, which allows each individual somatic metaphase chromosome to be identified easily. Immunofluorescence- and ChIP-based assays demonstrated the functional significance of the satellite repeat and centromere-specific retrotransposon at centromeres. Our study provides a valuable basis for future genomic studies in radish.
Evolutionary trajectory of Pack-MULEs is determined by their epigenetic status
USDA-ARS's Scientific Manuscript database
Acquisition and rearrangement of host genes by transposable elements is one mechanism to increase gene diversity. The rice genome is replete in such sequences and while ~3,000 Pack-Mutator-like transposable elements containing gene sequences (Pack-MULEs) have been identified, their function remains...
Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2004-09-22
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl
Disruption of Boundary Encoding During Sensorimotor Sequence Learning: An MEG Study.
Michail, Georgios; Nikulin, Vadim V; Curio, Gabriel; Maess, Burkhard; Herrojo Ruiz, María
2018-01-01
Music performance relies on the ability to learn and execute actions and their associated sounds. The process of learning these auditory-motor contingencies depends on the proper encoding of the serial order of the actions and sounds. Among the different serial positions of a behavioral sequence, the first and last (boundary) elements are particularly relevant. Animal and patient studies have demonstrated a specific neural representation for boundary elements in prefrontal cortical regions and in the basal ganglia, highlighting the relevance of their proper encoding. The neural mechanisms underlying the encoding of sequence boundaries in the general human population remain, however, largely unknown. In this study, we examined how alterations of auditory feedback, introduced at different ordinal positions (boundary or within-sequence element), affect the neural and behavioral responses during sensorimotor sequence learning. Analysing the neuromagnetic signals from 20 participants while they performed short piano sequences under the occasional effect of altered feedback (AF), we found that at around 150-200 ms post-keystroke, the neural activities in the dorsolateral prefrontal cortex (DLPFC) and supplementary motor area (SMA) were dissociated for boundary and within-sequence elements. Furthermore, the behavioral data demonstrated that feedback alterations on boundaries led to greater performance costs, such as more errors in the subsequent keystrokes. These findings jointly support the idea that the proper encoding of boundaries is critical in acquiring sensorimotor sequences. They also provide evidence for the involvement of a distinct neural circuitry in humans including prefrontal and higher-order motor areas during the encoding of the different classes of serial order.
Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross
1999-01-01
Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978
Link, Gerhard
1984-01-01
A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic `−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the `TATA' box of genes transcribed by RNA polymerase II (or B). The region between the `TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic `−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a `promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540
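The three sequence elements named in this abstract (TTGACA, TATATAA, TATACT) can be located in a candidate upstream region with a simple exact-match scan. The sketch below is only illustrative: the function name `find_elements`, the motif labels, and the `upstream` string are assumptions made up for the example, and the string is not the real mustard plastid sequence.

```python
def find_elements(sequence, motifs):
    """Return {label: [0-based start positions]} for exact motif matches."""
    sequence = sequence.upper()
    hits = {}
    for label, motif in motifs.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            # Advance by one base so overlapping matches are also reported.
            start = sequence.find(motif, start + 1)
        hits[label] = positions
    return hits

# The three elements quoted in the abstract; the labels are illustrative.
MOTIFS = {"-35-like": "TTGACA", "TATA-like": "TATATAA", "-10-like": "TATACT"}

# Made-up upstream region for demonstration only (hypothetical sequence).
upstream = "ccgTTGACAgtcaTATATAAgcgtgcaTATACTagg"
hits = find_elements(upstream, MOTIFS)
print(hits)
```

Exact matching is the simplest choice here; a scan that tolerates mismatches would need a scoring scheme, which the abstract does not describe.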
Senescence responsive transcriptional element
Campisi, Judith; Testori, Alessandro
1999-01-01
Recombinant polynucleotides have expression control sequences that have a senescence responsive element and a minimal promoter, and which are operatively linked to a heterologous nucleotide sequence. The molecules are useful for achieving high levels of expression of genes in senescent cells. Methods of inhibiting expression of genes in senescent cells also are provided.
DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.
Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano
2018-01-01
Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
Molecular and bioinformatic analysis of the FB-NOF transposable element.
Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol
2006-04-12
The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing process of 2 kb long FB inverted repeats of a complete FB-NOF element, with high precision and reliability. This achievement has been possible using a new map of the FB repetitive region, which identifies unambiguously each repeat with new features that can be used as landmarks. With this new vision of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous works that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.
Statistical learning of movement.
Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M
2016-12-01
The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are comprised of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.
Repetitive Elements May Comprise Over Two-Thirds of the Human Genome
de Koning, A. P. Jason; Gu, Wanjun; Castoe, Todd A.; Batzer, Mark A.; Pollock, David D.
2011-01-01
Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed. PMID:22144907
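The idea of flagging repeat-like sequence from high-abundance oligonucleotides and their close neighbours in sequence space can be illustrated with a toy k-mer count. This is a loose sketch and not the P-clouds algorithm itself: the function names, the two thresholds (`core_min`, `outer_min`), the one-substitution neighbourhood, and the toy sequence are all assumptions chosen for the example.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count every overlapping k-mer in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def repeat_like_oligos(counts, core_min, outer_min):
    """Return high-abundance 'core' k-mers plus k-mers that are one
    substitution away from a core and seen at least outer_min times."""
    cores = {kmer for kmer, n in counts.items() if n >= core_min}
    flagged = set(cores)
    for kmer, n in counts.items():
        if n >= outer_min and any(hamming(kmer, core) == 1 for core in cores):
            flagged.add(kmer)
    return flagged

# Toy sequence (hypothetical): a short repeat, a diverged copy, and filler.
toy = "ACGTACGTACGTTTTTACGAACGAGGGG"
counts = kmer_counts(toy, 4)
flagged = repeat_like_oligos(counts, core_min=3, outer_min=2)
print(sorted(flagged))
```

In this toy case the diverged copy ACGA is picked up only because it sits one substitution away from the abundant core ACGT, which is the intuition behind grouping related oligos rather than thresholding counts alone.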
Kumar, Vinod; Sun, Peng; Vamathevan, Jessica; Li, Yong; Ingraham, Karen; Palmer, Leslie; Huang, Jianzhong; Brown, James R.
2011-01-01
There is a global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae, a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. While the epidemiology of K. pneumoniae strains and occurrences of specific antibiotic resistance genes, such as plasmid-borne extended-spectrum β-lactamases (ESBLs), have been extensively studied, only four complete genomes of K. pneumoniae are available. To better understand the multidrug resistance factors in K. pneumoniae, we determined by pyrosequencing the nearly complete genome DNA sequences of two strains with disparate antibiotic resistance profiles, broadly drug-susceptible strain JH1 and strain 1162281, which is resistant to multiple clinically used antibiotics, including extended-spectrum β-lactams, fluoroquinolones, aminoglycosides, trimethoprim, and sulfamethoxazoles. Comparative genomic analysis of JH1, 1162281, and other published K. pneumoniae genomes revealed a core set of 3,631 conserved orthologous proteins, which were used for reconstruction of whole-genome phylogenetic trees. The close evolutionary relationship between JH1 and 1162281 relative to other K. pneumoniae strains suggests that a large component of the genetic and phenotypic diversity of clinical isolates is due to horizontal gene transfer. Using curated lists of over 400 antibiotic resistance genes, we identified all of the elements that differentiated the antibiotic profile of MDR strain 1162281 from that of susceptible strain JH1, such as the presence of additional efflux pumps, ESBLs, and multiple mechanisms of fluoroquinolone resistance. Our study adds new and significant DNA sequence data on K. pneumoniae strains and demonstrates the value of whole-genome sequencing in characterizing multidrug resistance in clinical isolates. PMID:21746949
copia-like retrotransposons are ubiquitous among plants.
Voytas, D F; Cummings, M P; Koniczny, A; Ausubel, F M; Rodermel, S R
1992-01-01
Transposable genetic elements are assumed to be a feature of all eukaryotic genomes. Their identification, however, has largely been haphazard, limited principally to organisms subjected to molecular or genetic scrutiny. We assessed the phylogenetic distribution of copia-like retrotransposons, a class of transposable element that proliferates by reverse transcription, using a polymerase chain reaction assay designed to detect copia-like element reverse transcriptase sequences. copia-like retrotransposons were identified in 64 plant species as well as the photosynthetic protist Volvox carteri. The plant species included representatives from 9 of 10 plant divisions, including bryophytes, lycopods, ferns, gymnosperms, and angiosperms. DNA sequence analysis of 29 cloned PCR products and of a maize retrotransposon cDNA confirmed the identity of these sequences as copia-like reverse transcriptase sequences, thereby demonstrating that this class of retrotransposons is a ubiquitous component of plant genomes. PMID:1379734
A Mobile Element in mutS Drives Hypermutation in a Marine Vibrio
Chu, Nathaniel D.; Clarke, Sean A.; Timberlake, Sonia; Polz, Martin F.; Grossman, Alan D.
2017-01-01
Bacteria face a trade-off between genetic fidelity, which reduces deleterious mistakes in the genome, and genetic innovation, which allows organisms to adapt. Evidence suggests that many bacteria balance this trade-off by modulating their mutation rates, but few mechanisms have been described for such modulation. Following experimental evolution and whole-genome resequencing of the marine bacterium Vibrio splendidus 12B01, we discovered one such mechanism, which allows this bacterium to switch to an elevated mutation rate. This switch is driven by the excision of a mobile element residing in mutS, which encodes a DNA mismatch repair protein. When integrated within the bacterial genome, the mobile element provides independent promoter and translation start sequences for mutS—different from the bacterium’s original mutS promoter region—which allow the bacterium to make a functional mutS gene product. Excision of this mobile element rejoins the mutS gene with host promoter and translation start sequences but leaves a 2-bp deletion in the mutS sequence, resulting in a frameshift and a hypermutator phenotype. We further identified hundreds of clinical and environmental bacteria across Betaproteobacteria and Gammaproteobacteria that possess putative mobile elements within the same amino acid motif in mutS. In a subset of these bacteria, we detected excision of the element but not a frameshift mutation; the mobile elements leave an intact mutS coding sequence after excision. Our findings reveal a novel mechanism by which one bacterium alters its mutation rate and hint at a possible evolutionary role for mobile elements within mutS in other bacteria. PMID:28174306
Williams, Kelly P.
2003-01-01
A partial screen for genetic elements integrated into completely sequenced bacterial genomes shows more significant bias in specificity for the tmRNA gene (ssrA) than for any type of tRNA gene. Horizontal gene transfer, a major avenue of bacterial evolution, was assessed by focusing on elements using this single attachment locus. Diverse elements use ssrA; among enterobacteria alone, at least four different integrase subfamilies have independently evolved specificity for ssrA, and almost every strain analyzed presents a unique set of integrated elements. Even elements using essentially the same integrase can be very diverse, as is a group with an ssrA-specific integrase of the P4 subfamily. This same integrase appears to promote damage routinely at attachment sites, which may be adaptive. Elements in arrays can recombine; one such event mediated by invertible DNA segments within neighboring elements likely explains the monophasic nature of Salmonella enterica serovar Typhi. One of a limited set of conserved sequences occurs at the attachment site of each enterobacterial element, apparently serving as a transcriptional terminator for ssrA. Elements were usually found integrated into tRNA-like sequence at the 3′ end of ssrA, at subsites corresponding to those used in tRNA genes; an exception was found at the non-tRNA-like 3′ end produced by ssrA gene permutation in cyanobacteria, suggesting that, during the evolution of new site specificity by integrases, tropism toward a conserved 3′ end of an RNA gene may be as strong as toward a tRNA-like sequence. The proximity of ssrA and smpB, which act in concert, was also surveyed. PMID:12533482
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment
2013-01-01
Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
Nagar, Anurag; Hahsler, Michael
2013-01-01
Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
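The core idea of comparing segments by p-mer frequency counts instead of aligning them can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the choice of p = 3, and the Manhattan distance between count profiles are picked for the example and are not taken from the authors' implementation; the data stream clustering step is not shown.

```python
from collections import Counter

def pmer_profile(segment, p=3):
    """Count every overlapping p-mer in a sequence segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))

def profile_distance(seg_a, seg_b, p=3):
    """Manhattan distance between the p-mer count profiles of two segments.
    Used here as a cheap, alignment-free stand-in for edit distance."""
    prof_a, prof_b = pmer_profile(seg_a, p), pmer_profile(seg_b, p)
    # Counter returns 0 for p-mers that are absent from one profile.
    return sum(abs(prof_a[m] - prof_b[m]) for m in set(prof_a) | set(prof_b))

# Hypothetical segments: b differs from a by one substitution, c is unrelated.
a = "ACGTACGT"
b = "ACGTACGA"
c = "TTTTGGGG"
print(profile_distance(a, a), profile_distance(a, b), profile_distance(a, c))
```

Each profile is built in a single pass over the segment, which is consistent with the linear run time claimed in the abstract; similar segments end up with small distances and can then be grouped by a clustering step.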
Responses of stream microbes to multiple anthropogenic stressors in a mesocosm study.
Nuy, Julia K; Lange, Anja; Beermann, Arne J; Jensen, Manfred; Elbrecht, Vasco; Röhl, Oliver; Peršoh, Derek; Begerow, Dominik; Leese, Florian; Boenigk, Jens
2018-08-15
Stream ecosystems are affected by multiple anthropogenic stressors worldwide. Even though effects of many single stressors are comparatively well studied, the effects of multiple stressors are difficult to predict. In particular bacteria and protists, which are responsible for the majority of ecosystem respiration and element flows, are infrequently studied with respect to multiple stressor responses. We conducted a stream mesocosm experiment to characterize the responses of microbiota to single and multiple stressors. Two functionally important stream habitats, leaf litter and benthic phototrophic rock biofilms, were exposed to three stressors in a full factorial design: fine sediment deposition, increased chloride concentration (salinization) and reduced flow velocity. We analyzed the microbial composition in the two habitat types of the mesocosms using an amplicon sequencing approach. Community analysis on different taxonomic levels as well as principal component analyses (PCoAs) based on relative abundances of operational taxonomic units (OTUs) showed treatment-specific shifts in the eukaryotic biofilm community. Analysis of variance (ANOVA) revealed that Bacillariophyta responded positively to salinity and sediment increase, while the relative read abundance of chlorophyte taxa decreased. The combined effects of multiple stressors were mainly antagonistic. Therefore, the community composition in multiply stressed environments resembled the composition of the unstressed control community in terms of OTU occurrence and relative abundances.
Deletion of ultraconserved elements yields viable mice
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahituv, Nadav; Zhu, Yiwen; Visel, Axel
2007-07-15
Ultraconserved elements have been suggested to retain extended perfect sequence identity between the human, mouse, and rat genomes due to essential functional properties. To investigate the necessities of these elements in vivo, we removed four non-coding ultraconserved elements (ranging in length from 222 to 731 base pairs) from the mouse genome. To maximize the likelihood of observing a phenotype, we chose to delete elements that function as enhancers in a mouse transgenic assay and that are near genes that exhibit marked phenotypes both when completely inactivated in the mouse as well as when their expression is altered due to other genomic modifications. Remarkably, all four resulting lines of mice lacking these ultraconserved elements were viable and fertile, and failed to reveal any critical abnormalities when assayed for a variety of phenotypes including growth, longevity, pathology and metabolism. In addition, more targeted screens, informed by the abnormalities observed in mice where genes in proximity to the investigated elements had been altered, also failed to reveal notable abnormalities. These results, while not inclusive of all the possible phenotypic impact of the deleted sequences, indicate that extreme sequence constraint does not necessarily reflect crucial functions required for viability.
DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.
Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard
2004-09-09
Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be substantially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
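Strategy (a), distributing independent pairwise comparisons across workers, can be sketched with Python's standard `concurrent.futures` module. This is an illustration of the scheduling pattern only, not DIALIGN's code: `pair_score` is a placeholder that counts matching positions without gaps, and the sequence names and contents are invented. Threads are used to keep the example portable; for CPU-bound alignment work in CPython, `ProcessPoolExecutor` would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def pair_score(pair):
    """Placeholder pairwise comparison (not a real alignment):
    counts identical positions between two sequences, without gaps."""
    (name_a, seq_a), (name_b, seq_b) = pair
    score = sum(x == y for x, y in zip(seq_a, seq_b))
    return (name_a, name_b), score

def all_pair_scores(sequences, workers=4):
    """Score every sequence pair; each pair is an independent task,
    so the result does not depend on how tasks are scheduled."""
    pairs = list(combinations(sorted(sequences.items()), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(pair_score, pairs))

# Hypothetical input sequences for demonstration.
seqs = {"s1": "ACGTAC", "s2": "ACGTTC", "s3": "TCGTAC"}
scores = all_pair_scores(seqs)
print(scores)
```

Because no pair depends on another, running with one worker or many gives the same scores, which mirrors the abstract's point that distribution has no effect on the output alignments.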
Gallus, Susanne; Lammers, Fritjof
2016-01-01
The autonomous transposable element LINE-1 is a highly abundant element that makes up between 15% and 20% of therian mammal genomes. Since their origin before the divergence of marsupials and placental mammals, LINE-1 elements have contributed actively to the genome landscape. A previous in silico screen of the Tasmanian devil genome revealed a lack of functional coding LINE-1 sequences. In this study we present the results of an in vitro analysis from a partial LINE-1 reverse transcriptase coding sequence in five marsupial species. Our experimental screen supports the in silico findings of the genome-wide degradation of LINE-1 sequences in the Tasmanian devil, and identifies a high frequency of degraded LINE-1 sequences in other Australian marsupials. The comparison between the experimentally obtained LINE-1 sequences and reference genome assemblies suggests that conclusions from in silico analyses of retrotransposition activity can be influenced by incomplete genome assemblies from short reads. PMID:27389686
Computer-aided visualization and analysis system for sequence evaluation
Chee, M.S.
1998-08-18
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.
2004-05-11
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
1998-08-18
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
2003-08-19
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Phase discriminating capacitive array sensor system
NASA Technical Reports Server (NTRS)
Vranish, John M. (Inventor); Rahim, Wadi (Inventor)
1993-01-01
A phase discriminating capacitive sensor array system which provides multiple sensor elements which are maintained at a phase and amplitude based on a frequency reference provided by a single frequency stabilized oscillator. Sensor signals provided by the multiple sensor elements are controlled by multiple phase control units, which correspond to the multiple sensor elements, to adjust the sensor signals from the multiple sensor elements based on the frequency reference. The adjustment made to the sensor signals is indicated by output signals which indicate the proximity of the object. The output signals may also indicate the closing speed of the object based on the rate of change of the adjustment made, and the edges of the object based on a sudden decrease in the adjustment made.
Insights into Structural and Mechanistic Features of Viral IRES Elements
Martinez-Salas, Encarnacion; Francisco-Velilla, Rosario; Fernandez-Chamorro, Javier; Embarek, Azman M.
2018-01-01
Internal ribosome entry site (IRES) elements are cis-acting RNA regions that promote internal initiation of protein synthesis using cap-independent mechanisms. However, distinct types of IRES elements present in the genome of various RNA viruses perform the same function despite lacking conservation of sequence and secondary RNA structure. Likewise, IRES elements differ in host factor requirement to recruit the ribosomal subunits. In spite of this diversity, evolutionarily conserved motifs in each family of RNA viruses preserve sequences impacting on RNA structure and RNA–protein interactions important for IRES activity. Indeed, IRES elements adopting remarkably different structural organizations contain RNA structural motifs that play an essential role in recruiting ribosomes, initiation factors and/or RNA-binding proteins using different mechanisms. Therefore, given that a universal IRES motif remains elusive, it is critical to understand how diverse structural motifs deliver functions relevant for IRES activity. This will be useful for understanding the molecular mechanisms beyond cap-independent translation, as well as the evolutionary history of these regulatory elements. Moreover, it could improve the accuracy of predicting IRES-like motifs hidden in genome sequences. This review summarizes recent advances on the diversity and biological relevance of RNA structural motifs for viral IRES elements. PMID:29354113
Multiple Access Interference Reduction Using Received Response Code Sequence for DS-CDMA UWB System
NASA Astrophysics Data System (ADS)
Toh, Keat Beng; Tachikawa, Shin'ichi
This paper proposes a combination of novel Received Response (RR) sequence at the transmitter and a Matched Filter-RAKE (MF-RAKE) combining scheme receiver system for the Direct Sequence-Code Division Multiple Access Ultra Wideband (DS-CDMA UWB) multipath channel model. This paper also demonstrates the effectiveness of the RR sequence in Multiple Access Interference (MAI) reduction for the DS-CDMA UWB system. It suggests that by using conventional binary code sequence such as the M sequence or the Gold sequence, there is a possibility of generating extra MAI in the UWB system. Therefore, it is quite difficult to collect the energy efficiently although the RAKE reception method is applied at the receiver. The main purpose of the proposed system is to overcome the performance degradation for UWB transmission due to the occurrence of MAI during multiple accessing in the DS-CDMA UWB system. The proposed system improves the system performance by improving the RAKE reception performance using the RR sequence which can reduce the MAI effect significantly. Simulation results verify that significant improvement can be obtained by the proposed system in the UWB multipath channel models.
DNA capture elements for rapid detection and identification of biological agents
NASA Astrophysics Data System (ADS)
Kiel, Johnathan L.; Parker, Jill E.; Holwitt, Eric A.; Vivekananda, Jeeva
2004-08-01
DNA capture elements (DCEs; aptamers) are artificial DNA sequences, from a random pool of sequences, selected for their specific binding to potential biological warfare agents. These sequences were selected by an affinity method using filters to which the target agent was attached and the DNA isolated and amplified by polymerase chain reaction (PCR) in an iterative, increasingly stringent, process. Reporter molecules were attached to the finished sequences. To date, we have made DCEs to Bacillus anthracis spores, Shiga toxin, Venezuelan Equine Encephalitis (VEE) virus, and Francisella tularensis. These DCEs have demonstrated specificity and sensitivity equal to or better than antibody.
EXTENDED STAR FORMATION IN THE INTERMEDIATE-AGE LARGE MAGELLANIC CLOUD STAR CLUSTER NGC 2209
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keller, Stefan C.; Mackey, A. Dougal; Da Costa, Gary S.
2012-12-10
We present observations of the 1 Gyr old star cluster NGC 2209 in the Large Magellanic Cloud made with the GMOS imager on the Gemini South Telescope. These observations show that the cluster exhibits a main-sequence turnoff that spans a broader range in luminosity than can be explained by a single-aged stellar population. This places NGC 2209 amongst a growing list of intermediate-age (1-3 Gyr) clusters that show evidence for extended or multiple epochs of star formation of between 50 and 460 Myr in extent. The extended main-sequence turnoff observed in NGC 2209 is a confirmation of the prediction in Keller et al. made on the basis of the cluster's large core radius. We propose that secondary star formation is a defining feature of the evolution of massive star clusters. Dissolution of lower mass clusters through evaporation results in only clusters that have experienced secondary star formation surviving for a Hubble time, thus providing a natural connection between the extended main-sequence turnoff phenomenon and the ubiquitous light-element abundance ranges seen in the ancient Galactic globular clusters.
Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arakaki, Tracy; Le Trong, Isolde; Structural Genomics of Pathogenic Protozoa
2006-03-01
The crystal structure of a conserved hypothetical protein from L. major, Pfam sequence family PF04543, structural genomics target ID Lmaj006129AAA, has been determined at a resolution of 1.6 Å. The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. 11 truncations were selected for expression, purification and crystallization based upon secondary-structure elements and disorder. The structure of one of these variants, Lmaj006129AAH, was solved by multiple-wavelength anomalous diffraction (MAD) using ELVES, an automatic protein crystal structure-determination system. This model was then successfully used as a molecular-replacement probe for the parent full-length target, Lmaj006129AAA. The final structure of Lmaj006129AAA was refined to an R value of 0.185 (R_free = 0.229) at 1.60 Å resolution. Structure and sequence comparisons based on Lmaj006129AAA suggest that proteins belonging to Pfam sequence families PF04543 and PF01878 may share a common ligand-binding motif.
The timing of sequences of saccades in visual search.
Van Loon, E M; Hooge, I Th C; Van den Berg, A V
2002-01-01
According to the LATER model (linear approach to thresholds with ergodic rate), the latency of a single saccade in response to target appearance can be understood as a decision process, which is subject to (i) variations in the rate of (visual) information processing; and (ii) the threshold for the decision. We tested whether the LATER model can also be applied to the sequences of saccades in a multiple fixation search, during which latencies of second and subsequent saccades are typically shorter than that of the initial saccade. We found that the distributions of the reciprocal latencies for later saccades, unlike those of the first saccade, are highly asymmetrical, much like a gamma distribution. This suggests that the normal distribution of the rate r, which the LATER model assumes, is not appropriate to describe the rate distributions of subsequent saccades in a scanning sequence. By contrast, the gamma distribution is also appropriate to describe the distribution of reciprocal latencies for the first saccade. The change of the gamma distribution parameters as a function of the ordinal number of the saccade suggests a lowering of the threshold for second and later saccades, as well as a reduction in the number of target elements analysed. PMID:12184827
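The decision process described in this abstract can be illustrated with a small simulation. This is a minimal sketch, assuming the usual reading of LATER in which a decision signal rises linearly at rate r to a fixed threshold, so that latency = threshold / r with r normally distributed. The function name and all parameter values are hypothetical and are not taken from the paper.

```python
import random
import statistics

def simulate_later_latencies(n, mean_rate, sd_rate, threshold, seed=0):
    """Draw n latencies from a LATER-style process (illustrative only).

    A decision signal rises linearly at rate r until it reaches `threshold`,
    so latency = threshold / r, with r drawn from a normal distribution.
    """
    rng = random.Random(seed)
    latencies = []
    while len(latencies) < n:
        r = rng.gauss(mean_rate, sd_rate)
        if r <= 0:
            # A non-positive rate never reaches the threshold; redraw.
            continue
        latencies.append(threshold / r)
    return latencies

# Hypothetical parameter values chosen only for illustration.
latencies = simulate_later_latencies(n=10000, mean_rate=5.0, sd_rate=1.0, threshold=1.0)

# Under the LATER assumption the reciprocal latencies are close to normal,
# centred near mean_rate / threshold.
reciprocals = [1.0 / t for t in latencies]
print(round(statistics.mean(reciprocals), 2))
```

In this sketch, lowering `threshold` shortens every latency by the same factor, which is one way to picture the threshold reduction the abstract suggests for second and later saccades. The gamma-distributed rates reported for later saccades would require replacing the `rng.gauss` draw with a different distribution.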
Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)
Singh, Ranjan K.; Tanner, John J.
2013-01-01
Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760
Nony, P; Tessier, J; Chadeuf, G; Ward, P; Giraud, A; Dugast, M; Linden, R M; Moullier, P; Salvetti, A
2001-10-01
This study identifies a region of the adeno-associated virus type 2 (AAV-2) rep gene (nucleotides 190 to 540 of wild-type AAV-2) as a cis-acting Rep-dependent element able to promote the replication of transiently transfected plasmids. This viral element is also shown to be involved in the amplification of integrated sequences in the presence of adenovirus and Rep proteins.
GrTEdb: the first web-based database of transposable elements in cotton (Gossypium raimondii).
Xu, Zhenzhen; Liu, Jing; Ni, Wanchao; Peng, Zhen; Guo, Yue; Ye, Wuwei; Huang, Fang; Zhang, Xianggui; Xu, Peng; Guo, Qi; Shen, Xinlian; Du, Jianchang
2017-01-01
Although several diploid and tetraploid Gossypium species genomes have been sequenced, a well-annotated web-based transposable elements (TEs) database is lacking. To better understand the roles of TEs in structural, functional and evolutionary dynamics of the cotton genome, a comprehensive, specific, and user-friendly web-based database, Gossypium raimondii transposable elements database (GrTEdb), was constructed. A total of 14 332 TEs were structurally annotated and clearly categorized in the G. raimondii genome, and these elements have been classified into seven distinct superfamilies based on the order of protein-coding domains, structures and/or sequence similarity, including 2929 Copia-like elements, 10 368 Gypsy-like elements, 299 L1, 12 Mutators, 435 PIF-Harbingers, 275 CACTAs and 14 Helitrons. Meanwhile, web-based sequence browsing, searching, downloading and BLAST tools were implemented to help users easily and effectively annotate the TEs or TE fragments in genomic sequences from G. raimondii and other closely related Gossypium species. GrTEdb provides resources and information related to TEs in G. raimondii, and will facilitate gene and genome analyses within or across Gossypium species, evaluating the impact of TEs on their host genomes, and investigating the potential interaction between TEs and protein-coding genes in Gossypium species. http://www.grtedb.org/. © The Author(s) 2017. Published by Oxford University Press.
DCODE.ORG Anthology of Comparative Genomic Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loots, G G; Ovcharenko, I
2005-01-11
Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.
Meteyer, C.U.; Loeffler, I.K.; Fallon, J.F.; Converse, K.A.; Green, E.; Helgen, J.C.; Kersten, S.; Levey, R.; Eaton-Poole, L.; Burkhart, J.G.
2000-01-01
Background Reports of malformed frogs have increased throughout the North American continent in recent years. Most of the observed malformations have involved the hind limbs. The goal of this study was to accurately characterize the hind limb malformations in wild frogs as an important step toward understanding the possible etiologies. Methods During 1997 and 1998, 182 recently metamorphosed northern leopard frogs (Rana pipiens) were collected from Minnesota, Vermont, and Maine. Malformed hind limbs were present in 157 (86%) of these frogs, which underwent necropsy and radiographic evaluation at the National Wildlife Health Center. These malformations are described in detail and classified into four major categories: (1) no limb (amelia); (2) multiple limbs or limb elements (polymelia, polydactyly, polyphalangy); (3) reduced limb segments or elements (phocomelia, ectromelia, ectrodactyly, and brachydactyly); and (4) distally complete but malformed limb (bone rotations, bridging, skin webbing, and micromelia). Results Amelia and reduced segments and/or elements were the most common finding. Frogs with bilateral hind limb malformations were not common, and in only eight of these 22 frogs were the malformations symmetrical. Malformations of a given type tended to occur in frogs collected from the same site, but the types of malformations varied widely among all three states, and between study sites within Minnesota. Conclusions Clustering of malformation type suggests that developmental events may produce a variety of phenotypes depending on the timing, sequence, and severity of the environmental insult. Hind limb malformations in free-living frogs transcend current mechanistic explanations of tetrapod limb development.
Genomic comparison of virulent and non-virulent Streptococcus agalactiae in fish.
Delannoy, C M J; Zadoks, R N; Crumlish, M; Rodgers, D; Lainson, F A; Ferguson, H W; Turnbull, J; Fontaine, M C
2016-01-01
Streptococcus agalactiae infections in fish are predominantly caused by beta-haemolytic strains of clonal complex (CC) 7, notably its namesake sequence type (ST) 7, or by non-haemolytic strains of CC552, including the globally distributed ST260. In contrast, CC23, including its namesake ST23, has been associated with a wide homeothermic and poikilothermic host range, but never with fish. The aim of this study was to determine whether ST23 is virulent in fish and to identify genomic markers of fish adaptation of S. agalactiae. Intraperitoneal challenge of Nile tilapia, Oreochromis niloticus (Linnaeus), showed that ST260 is lethal at doses down to 10^2 cfu per fish, whereas ST23 does not cause disease at 10^7 cfu per fish. Comparison of the genome sequence of ST260 and ST23 with those of strains derived from fish, cattle and humans revealed the presence of genomic elements that are unique to subpopulations of S. agalactiae that have the ability to infect fish (CC7 and CC552). These loci occurred in clusters exhibiting typical signatures of mobile genetic elements. PCR-based screening of a collection of isolates from multiple host species confirmed the association of selected genes with fish-derived strains. Several fish-associated genes encode proteins that potentially provide fitness in the aquatic environment. © 2014 John Wiley & Sons Ltd.
Tanaka, Kanji; Watanabe, Katsumi
2016-02-01
The present study examined whether sequence learning led to more accurate and shorter performance time if people who are learning a sequence start over from the beginning when they make an error (i.e., practice the whole sequence) or only from the point of error (i.e., practice a part of the sequence). We used a visuomotor sequence learning paradigm with a trial-and-error procedure. In Experiment 1, we found fewer errors, and shorter performance time for those who restarted their performance from the beginning of the sequence as compared to those who restarted from the point at which an error occurred, indicating better learning of spatial and motor representations of the sequence. This might be because the learned elements were repeated when the next performance started over from the beginning. In subsequent experiments, we increased the occasions for the repetitions of learned elements by modulating the number of fresh start points in the sequence after errors. The results showed that fewer fresh start points were likely to lead to fewer errors and shorter performance time, indicating that the repetitions of learned elements enabled participants to develop stronger spatial and motor representations of the sequence. Thus, a single or two fresh start points in the sequence (i.e., starting over only from the beginning or from the beginning or midpoint of the sequence after errors) is likely to lead to more accurate and faster performance. Copyright © 2016 Elsevier B.V. All rights reserved.
Elements of Mathematics, Book O: Intuitive Background. Chapter 1, Operational Systems.
ERIC Educational Resources Information Center
Exner, Robert; And Others
The sixteen chapters of this book provide the core material for the Elements of Mathematics Program, a secondary sequence developed for highly motivated students with strong verbal abilities. The sequence is based on a functional-relational approach to mathematics teaching, and emphasizes teaching by analysis of real-life situations. This text is…
Elements of Mathematics, Book O: Intuitive Background. Chapter 5, Mappings.
ERIC Educational Resources Information Center
Exner, Robert; And Others
The sixteen chapters of this book provide the core material for the Elements of Mathematics Program, a secondary sequence developed for highly motivated students with strong verbal abilities. The sequence is based on a functional-relational approach to mathematics teaching, and emphasizes teaching by analysis of real-life situations. This text is…
Elements of Mathematics, Book O: Intuitive Background. Chapter 2, The Integers.
ERIC Educational Resources Information Center
Exner, Robert; And Others
The sixteen chapters of this book provide the core materials for the Elements of Mathematics Program, a secondary sequence developed for highly motivated students with strong verbal abilities. The sequence is based on a functional-relational approach to mathematics teaching, and emphasizes teaching by analysis of real-life situations. This text is…
A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band
NASA Astrophysics Data System (ADS)
Zou, Quan; Shan, Xiao; Jiang, Yi
Multiple sequence alignment is one of the most important topics in computational biology, but so far it cannot cope with large data sets. With the development of copy-number variant (CNV) and single nucleotide polymorphism (SNP) research, many researchers want to align large numbers of similar sequences to detect CNVs and SNPs. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It aligns more quickly and accurately, which will be helpful for mining CNVs and SNPs. Experiments demonstrate the performance of our algorithm.
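The abstract does not give the recurrences, so the following is only a sketch of the two ingredients it names, applied to a single pair of sequences. It assumes the standard Gotoh three-state recurrence for affine gaps and restricts the computation to cells within k of the main diagonal. The function name, scoring values and gap convention (a gap of length L costs gap_open + (L - 1) * gap_extend) are assumptions for illustration and are not the authors' implementation. The center star step, which would combine many such pairwise alignments, is not shown.

```python
NEG = float("-inf")

def banded_affine_score(a, b, k, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Global alignment score of a and b, restricted to the band |i - j| <= k."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        raise ValueError("band too narrow to reach the final cell")
    # Three states per cell (Gotoh): M ends in a match or mismatch,
    # X ends in a gap in b (a[i-1] unpaired), Y ends in a gap in a.
    # Full tables are allocated for clarity; only band cells are filled,
    # so the work is O(n * k) even though the memory here is O(n * m).
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, min(n, k) + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, min(m, k) + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(max(1, i - k), min(m, i + k) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])

# Two similar sequences differing by one inserted base.
print(banded_affine_score("ACGTACGT", "ACGTTACGT", k=2))
```

For highly similar sequences the optimal path stays near the diagonal, so a narrow band gives the same score as the full table at a fraction of the work. That is the usual motivation for a k-band when aligning many near-identical sequences.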
Multiple intensity distributions from a single optical element
NASA Astrophysics Data System (ADS)
Berens, Michael; Bruneton, Adrien; Bäuerle, Axel; Traub, Martin; Wester, Rolf; Stollenwerk, Jochen; Loosen, Peter
2013-09-01
We report on an extension of the previously published two-step freeform optics tailoring algorithm using a Monge-Kantorovich mass transportation framework. The algorithm's ability to design multiple freeform surfaces allows for the inclusion of multiple distinct light paths and hence the implementation of multiple lighting functions in a single optical element. We demonstrate the procedure in the context of automotive lighting, in which a fog lamp and a daytime running lamp are integrated in a single optical element illuminated by two distinct groups of LEDs.
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
1999-10-26
A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Computer-aided visualization and analysis system for sequence evaluation
Chee, Mark S.
2001-06-05
A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Method and apparatus for signal processing in a sensor system for use in spectroscopy
O'Connor, Paul [Bellport, NY; DeGeronimo, Gianluigi [Nesconset, NY; Grosholz, Joseph [Natrona Heights, PA
2008-05-27
A method for processing pulses arriving randomly in time on at least one channel using multiple peak detectors includes asynchronously selecting a non-busy peak detector (PD) in response to a pulse-generated trigger signal, connecting the channel to the selected PD in response to the trigger signal, and detecting a pulse peak amplitude. Amplitude and time of arrival data are output in first-in first-out (FIFO) sequence. An apparatus includes trigger comparators to generate the trigger signal for the pulse-receiving channel, PDs, a switch for connecting the channel to the selected PD, and logic circuitry which maintains the write pointer. Also included, time-to-amplitude converters (TACs) convert time of arrival to analog voltage and an analog multiplexer provides FIFO output. A multi-element sensor system for spectroscopy includes detector elements, channels, trigger comparators, PDs, a switch, and a logic circuit with asynchronous write pointer. The system includes TACs, a multiplexer and analog-to-digital converter.
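The selection and readout logic described in this abstract can be pictured with a small software model. This is a sketch under stated assumptions, not the patented circuit: each peak detector (PD) is assumed to stay busy for a fixed `hold_time` after it is triggered, the first non-busy PD is chosen, and a pulse that finds every PD busy is simply dropped. The function name, the record fields and the drop policy are all hypothetical.

```python
from collections import deque

def process_pulses(pulses, num_detectors, hold_time):
    """Model of PD selection with FIFO readout (illustrative only).

    pulses: iterable of (arrival_time, channel, amplitude), sorted by time.
    Returns (fifo, dropped): the readout queue in arrival order and the
    pulses that found no free detector.
    """
    busy_until = [0.0] * num_detectors   # time at which each PD is free again
    fifo = deque()                        # appending plays the role of the write pointer
    dropped = []
    for arrival, channel, amplitude in pulses:
        # Select the first PD that is not busy at the trigger time.
        free = next((pd for pd, t in enumerate(busy_until) if t <= arrival), None)
        if free is None:
            dropped.append((arrival, channel))   # assumed policy: pulse is lost
            continue
        busy_until[free] = arrival + hold_time
        fifo.append({"pd": free, "channel": channel,
                     "time": arrival, "amplitude": amplitude})
    return fifo, dropped

# Hypothetical pulse train: (arrival_time, channel, amplitude).
pulses = [(0.0, 0, 1.2), (0.5, 1, 0.7), (0.6, 0, 2.0), (2.5, 1, 0.9)]
fifo, dropped = process_pulses(pulses, num_detectors=2, hold_time=2.0)
while fifo:
    print(fifo.popleft())   # records leave in first-in first-out order
print("dropped:", dropped)
```

With two detectors and a hold time of 2.0, the third pulse arrives while both detectors are busy and is dropped. The remaining three are read out in arrival order with their amplitude and time of arrival.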
Lee, Younghee; Han, Seonggyun; Kim, Dongwook; Kim, Dokyoon; Horgousluoglu, Emrin; Risacher, Shannon L; Saykin, Andrew J; Nho, Kwangsik
2018-01-01
Genetic variation in cis-regulatory elements related to splicing machinery and splicing regulatory elements (SREs) results in exon skipping and undesired protein products. We developed a splicing decision model to identify actionable loci among common SNPs for gene regulation. The splicing decision model identified SNPs affecting exon skipping by analyzing sequence-driven alternative splicing (AS) models and by scanning the genome for the regions with putative SRE motifs. We used non-Hispanic Caucasians with neuroimaging and fluid biomarkers for Alzheimer's disease (AD) and identified 17,088 common exonic SNPs affecting exon skipping. GWAS identified one SNP (rs1140317) in HLA-DQB1 as significantly associated with entorhinal cortical thickness, an AD neuroimaging biomarker, after controlling for multiple testing. Further analysis revealed that rs1140317 was significantly associated with brain amyloid-β deposition (PET and CSF). HLA-DQB1 is an essential immune gene and may regulate AS, thereby contributing to AD pathology. SRE may hold potential as novel therapeutic targets for AD.
2012-01-01
Background Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion This study provides evidence highlighting the importance of LGT in the evolution of the bacteria S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human bacteria (Streptococcus urinalis) is cause for concern, as it highlights the possibility for continued acquisition of human virulence factors for this emerging zoonotic pathogen. PMID:23244770
Verwey, Willem B; Lammens, Robin; van Honk, Jack
2002-01-01
Participants practiced two discrete six-key sequences for a total of 420 trials. The 1 x 6 sequence had a unique order of key presses while the 2 x 3 sequence involved repetition of a three-key segment. Both sequences showed a long interkey interval halfway through the sequence, indicating hierarchical sequence control in that not only the 2 x 3 but also the 1 x 6 sequence was executed as two successive motor chunks. In addition, the second part of both sequences was executed faster than the first part. This supports the earlier notion of a motor processor executing the elements of familiar motor chunks and a cognitive processor triggering either these motor chunks or individual sequence elements. Low-frequency, off-line transcranial magnetic stimulation (TMS) of the supplementary motor area (SMA) counteracted normal improvement with practice of key presses at all sequence positions. Together, these results are in line with the notion that with moderate practice, the SMA executes short sequence fragments that are concatenated by other brain structures.
Solar Pumped Solid State Lasers for Space Solar Power: Experimental Path
NASA Technical Reports Server (NTRS)
Fork, Richard L.; Carrington, Connie K.; Walker, Wesley W.; Cole, Spencer T.; Green, Jason J. A.; Laycock, Rustin L.
2003-01-01
We outline an experimentally based strategy designed to lead to solar pumped solid state laser oscillators useful for space solar power. Our method involves solar pumping a novel solid state gain element specifically designed to provide efficient conversion of sunlight in space to coherent laser light. Kilowatt and higher average power is sought from each gain element. Multiple such modular gain elements can be used to accumulate total average power of interest for power beaming in space, e.g., 100 kilowatts and more. Where desirable the high average power can also be produced as a train of pulses having high peak power (e.g., greater than 10^10 watts). The modular nature of the basic gain element supports an experimental strategy in which the core technology can be validated by experiments on a single gain element. We propose to do this experimental validation both in terrestrial locations and also on a smaller scale in space. We describe a terrestrial experiment that includes diagnostics and the option of locating the laser beam path in vacuum environment. We describe a space based experiment designed to be compatible with the Japanese Experimental Module (JEM) on the International Space Station (ISS). We anticipate the gain elements will be based on low temperature (approx. 100 degrees Kelvin) operation of high thermal conductivity (k approx. 100 W/cm-K) diamond and sapphire (k approx. 4 W/cm-K). The basic gain element will be formed by sequences of thin alternating layers of diamond and Ti:sapphire with special attention given to the material interfaces. We anticipate this strategy will lead to a particularly simple, robust, and easily maintained low mass modelocked multi-element laser oscillator useful for space solar power.
Genetic Basis of Melanin Pigmentation in Butterfly Wings
Zhang, Linlin; Martin, Arnaud; Perry, Michael W.; van der Burg, Karin R. L.; Matsuoka, Yuji; Monteiro, Antónia; Reed, Robert D.
2017-01-01
Despite the variety, prominence, and adaptive significance of butterfly wing patterns, surprisingly little is known about the genetic basis of wing color diversity. Even though there is intense interest in wing pattern evolution and development, the technical challenge of genetically manipulating butterflies has slowed efforts to functionally characterize color pattern development genes. To identify candidate wing pigmentation genes, we used RNA sequencing to characterize transcription across multiple stages of butterfly wing development, and between different color pattern elements, in the painted lady butterfly Vanessa cardui. This allowed us to pinpoint genes specifically associated with red and black pigment patterns. To test the functions of a subset of genes associated with presumptive melanin pigmentation, we used clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 genome editing in four different butterfly genera. pale, Ddc, and yellow knockouts displayed reduction of melanin pigmentation, consistent with previous findings in other insects. Interestingly, however, yellow-d, ebony, and black knockouts revealed that these genes have localized effects on tuning the color of red, brown, and ochre pattern elements. These results point to previously undescribed mechanisms for modulating the color of specific wing pattern elements in butterflies, and provide an expanded portrait of the insect melanin pathway. PMID:28193726
Successful Gene Tagging in Lettuce Using the Tnt1 Retrotransposon from Tobacco
Mazier, Marianne; Botton, Emmanuel; Flamain, Fabrice; Bouchet, Jean-Paul; Courtial, Béatrice; Chupeau, Marie-Christine; Chupeau, Yves; Maisonneuve, Brigitte; Lucas, Hélène
2007-01-01
The tobacco (Nicotiana tabacum) element Tnt1 is one of the few identified active retrotransposons in plants. These elements possess unique properties that make them ideal genetic tools for gene tagging. Here, we demonstrate the feasibility of gene tagging using the retrotransposon Tnt1 in lettuce (Lactuca sativa), which is the largest genome tested for retrotransposon mutagenesis so far. Of 10 different transgenic bushes carrying a complete Tnt1 containing T-DNA, eight contained multiple transposed copies of Tnt1. The number of transposed copies of the element per plant was particularly high, the smallest number being 28. Tnt1 transposition in lettuce can be induced by a very simple in vitro culture protocol. Tnt1 insertions were stable in the progeny of the primary transformants and could be segregated genetically. Characterization of the sequences flanking some insertion sites revealed that Tnt1 often inserted into genes. The progeny of some primary transformants showed phenotypic alterations due to recessive mutations. One of these mutations was due to Tnt1 insertion in the gibberellin 3β-hydroxylase gene. Taken together, these results indicate that Tnt1 is a powerful tool for insertion mutagenesis especially in plants with a large genome. PMID:17351058
The sequence, structure and evolutionary features of HOTAIR in mammals
2011-01-01
Background An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. 
Conclusions HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR. PMID:21496275
Colonization of heterochromatic genes by transposable elements in Drosophila.
Dimitri, Patrizio; Junakovic, Nikolaj; Arcà, Bruno
2003-04-01
As a further step toward understanding transposable element-host genome interactions, we investigated the molecular anatomy of introns from five heterochromatic and 22 euchromatic protein-coding genes of Drosophila melanogaster. A total of 79 kb of intronic sequences from heterochromatic genes and 355 kb of intronic sequences from euchromatic genes have been used in Blast searches against Drosophila transposable elements (TEs). The results show that TE-homologous sequences belonging to 19 different families represent about 50% of intronic DNA from heterochromatic genes. In contrast, only 0.1% of the euchromatic intron DNA exhibits homology to known TEs. Intraspecific and interspecific size polymorphisms of introns were found, which are likely to be associated with changes in TE-related sequences. Together, the enrichment in TEs and the apparent dynamic state of heterochromatic introns suggest that TEs contribute significantly to the evolution of genes located in heterochromatin.
Episodic sequence memory is supported by a theta-gamma phase code.
Heusser, Andrew C; Poeppel, David; Ezzyat, Youssef; Davachi, Lila
2016-10-01
The meaning we derive from our experiences is not a simple static extraction of the elements but is largely based on the order in which those elements occur. Models propose that sequence encoding is supported by interactions between high- and low-frequency oscillations, such that elements within an experience are represented by neural cell assemblies firing at higher frequencies (gamma) and sequential order is encoded by the specific timing of firing with respect to a lower frequency oscillation (theta). During episodic sequence memory formation in humans, we provide evidence that items in different sequence positions exhibit greater gamma power along distinct phases of a theta oscillation. Furthermore, this segregation is related to successful temporal order memory. Our results provide compelling evidence that memory for order, a core component of an episodic memory, capitalizes on the ubiquitous physiological mechanism of theta-gamma phase-amplitude coupling.
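As an illustration of what theta-gamma phase-amplitude coupling means, the sketch below computes a mean vector length, one common coupling measure. It is a toy on synthetic numbers and is not taken from the study: the function name, the synthetic signals, and the choice of measure are all assumptions, and the authors may have used a different analysis.

```python
import cmath
import math

def mean_vector_length(phases, amplitudes):
    """Magnitude of the amplitude-weighted mean of unit phase vectors.

    phases: low-frequency (theta) phase in radians, one value per sample.
    amplitudes: high-frequency (gamma) amplitude at the same samples.
    """
    total = sum(a * cmath.exp(1j * p) for p, a in zip(phases, amplitudes))
    return abs(total) / len(phases)

# Synthetic example (assumed data): phases spread evenly over one theta cycle.
n = 1000
phases = [2 * math.pi * i / n for i in range(n)]

# Gamma amplitude that peaks at theta phase 0 (coupled) versus constant (uncoupled).
coupled = [1 + math.cos(p) for p in phases]
uncoupled = [1.0] * n

print(round(mean_vector_length(phases, coupled), 3))    # close to 0.5
print(round(mean_vector_length(phases, uncoupled), 3))  # close to 0.0
```

When gamma amplitude is concentrated at a preferred theta phase the weighted vectors do not cancel, so the measure is clearly above zero; with no phase preference they cancel out.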
Dungan, M.A.; Wulff, A.; Thompson, R.
2001-01-01
The Quaternary Tatara-San Pedro volcanic complex (36°S, Chilean Andes) comprises eight or more unconformity-bound volcanic sequences, representing variably preserved erosional remnants of volcanic centers generated during 930 ky of activity. The internal eruptive histories of several dominantly mafic to intermediate sequences have been reconstructed, on the basis of correlations of whole-rock major and trace element chemistry of flows between multiple sampled sections, but with critical contributions from photogrammetric, geochronologic, and paleomagnetic data. Many groups of flows representing discrete eruptive events define internal variation trends that reflect extrusion of heterogeneous or rapidly evolving magma batches from conduit-reservoir systems in which open-system processes typically played a large role. Long-term progressive evolution trends are extremely rare and the magma compositions of successive eruptive events rarely lie on precisely the same differentiation trend, even where they have evolved from similar parent magmas by similar processes. These observations are not consistent with magma differentiation in large long-lived reservoirs, but they may be accommodated by diverse interactions between newly arrived magma inputs and multiple resident pockets of evolved magma and/or crystal mush residing in conduit-dominated subvolcanic reservoirs. Without constraints provided by the reconstructed stratigraphic relations, the framework for petrologic modeling would be far different. A well-established eruptive stratigraphy may provide independent constraints on the petrologic processes involved in magma evolution, simply on the basis of the specific order in which diverse, broadly cogenetic magmas have been erupted.
The Tatara-San Pedro complex includes lavas ranging from primitive basalt to high-SiO2 rhyolite, and although the dominant erupted magma type was basaltic andesite (52-55 wt % SiO2), each sequence is characterized by unique proportions of mafic, intermediate, and silicic eruptive products. Intermediate lava compositions also record different evolution paths, both within and between sequences. No systematic long-term pattern is evident from comparisons at the level of sequences. The considerable diversity of mafic and evolved magmas of the Tatara-San Pedro complex bears on interpretations of regional geochemical trends. The variable role of open-system processes in shaping the compositions of evolved Tatara-San Pedro complex magmas, and even some basaltic magmas, leads to the conclusion that addressing problems such as arc magma genesis and elemental fluxes through subduction zones on the basis of averaged or regressed reconnaissance geochemical datasets is a tenuous exercise. Such compositional indices are highly instructive for identifying broad regional trends and first-order problems, but they should be used with extreme caution in attempts to quantify processes and magma sources, including crustal components, implicated in these trends.
A proposal to rename the hyperthermophile Pyrococcus woesei as Pyrococcus furiosus subsp. woesei.
Kanoksilapatham, Wirojne; González, Juan M; Maeder, Dennis L; DiRuggiero, Jocelyne; Robb, Frank T
2004-10-01
Pyrococcus species are hyperthermophilic members of the order Thermococcales, with optimal growth temperatures approaching 100 degrees C. All species grow heterotrophically and produce H2 or, in the presence of elemental sulfur (S(o)), H2S. Pyrococcus woesei and P. furiosus were isolated from marine sediments at the same Vulcano Island beach site and share many morphological and physiological characteristics. We report here that the rDNA operons of these strains have identical sequences, including their intergenic spacer regions and part of the 23S rRNA. Both species grow rapidly and produce H2 in the presence of 0.1% maltose and 10-100 microM sodium tungstate in S(o)-free medium. However, P. woesei shows more extensive autolysis than P. furiosus in the stationary phase. Pyrococcus furiosus and P. woesei share three closely related families of insertion sequences (ISs). A Southern blot performed with IS probes showed extensive colinearity between the genomes of P. woesei and P. furiosus. Cloning and sequencing of ISs that were in different contexts in P. woesei and P. furiosus revealed that the napA gene in P. woesei is disrupted by a type III IS element, whereas in P. furiosus, this gene is intact. A type I IS element, closely linked to the napA gene, was observed in the same context in both P. furiosus and P. woesei genomes. Our results suggest that the IS elements are implicated in genomic rearrangements and reshuffling in these closely related strains. We propose to rename P. woesei a subspecies of P. furiosus based on their identical rDNA operon sequences, many common IS elements that are shared genomic markers, and the observation that all P. woesei nucleotide sequences deposited in GenBank to date are > 99% identical to P. furiosus sequences.
Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites
Prouse, Michael B.; Campbell, Malcolm M.
2013-01-01
Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators. PMID:23741471
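To make the notion of locating AC-I elements in a regulatory region concrete, here is a minimal sketch that scans a DNA string for exact matches to ACCTAC on both strands. Only the motif itself comes from the abstract; the function names and the example sequence are hypothetical, and exact string matching is a simplification of how binding sites are usually predicted.

```python
AC_I = "ACCTAC"  # preferred AtMYB61 target reported in the abstract
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_motif(seq, motif=AC_I):
    """Return (0-based start, strand) for every exact match on either strand."""
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits

# Made-up sequence for illustration only.
promoter = "TTACCTACGGGTAGGTAA"
print(find_motif(promoter))  # [(2, '+'), (10, '-')]
```

The minus-strand hit corresponds to GTAGGT, the reverse complement of ACCTAC, which is why both strands have to be checked.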
Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus
Brinton, Margo A.; Basu, Mausumi
2015-01-01
The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures have been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times of the infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510
Fort, Philippe; Albertini, Aurélie; Van-Hua, Aurélie; Berthomieu, Arnaud; Roche, Stéphane; Delsuc, Frédéric; Pasteur, Nicole; Capy, Pierre; Gaudin, Yves; Weill, Mylène
2012-01-01
Retroelements represent a considerable fraction of many eukaryotic genomes and are considered major drivers for adaptive genetic innovations. Recent discoveries showed that despite not normally using DNA intermediates like retroviruses do, Mononegaviruses (i.e., viruses with nonsegmented, negative-sense RNA genomes) can integrate gene fragments into the genomes of their hosts. This was shown for Bornaviridae and Filoviridae, the sequences of which have been found integrated into the germ line cells of many vertebrate hosts. Here, we show that Rhabdoviridae sequences, the major Mononegavirales family, have integrated only into the genomes of arthropod species. We identified 185 integrated rhabdoviral elements (IREs) coding for nucleoproteins, glycoproteins, or RNA-dependent RNA polymerases; they were mostly found in the genomes of the mosquito Aedes aegypti and the blacklegged tick Ixodes scapularis. Phylogenetic analyses showed that most IREs in A. aegypti derived from multiple independent integration events. Since RNA viruses are subject to much higher substitution rates than their hosts, IREs thus represent fossil traces of the diversity of extinct Rhabdoviruses. Furthermore, analyses of orthologous IREs in A. aegypti field mosquitoes sampled worldwide identified an integrated polymerase IRE fragment that appeared under purifying selection within several million years, which supports a functional role in the host's biology. These results show that A. aegypti was subjected to repeated Rhabdovirus infectious episodes during its evolutionary history, which led to the accumulation of many integrated sequences. They also suggest that like retroviruses, integrated rhabdoviral sequences may participate actively in the evolution of their hosts.
Mechanism for DNA transposons to generate introns on genomic scales
Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.
2017-01-01
Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology [1]. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns [2,3]. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain [2,3]. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals [4–8]. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases [2] and prevalence of nucleosome-sized exons [9] observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution [2,3]. PMID:27760113
Regulatory elements in vivo in the promoter of the abscisic acid responsive gene rab17 from maize.
Busk, P K; Jensen, A B; Pagès, M
1997-06-01
The rab17 gene from maize is transcribed in late embryonic development and is responsive to abscisic acid and water stress in embryo and vegetative tissues. In vivo footprinting and transient transformation of rab17 were performed in embryos and vegetative tissues to characterize the cis-elements involved in regulation of the gene. By in vivo footprinting, protein binding was observed to nine elements in the promoter, which correspond to five putative ABREs (abscisic acid responsive elements) and four other sequences. The footprints indicated that distinct proteins interact with these elements in the two developmental stages. In transient transformation, six of the elements were important for high level expression of the rab17 promoter in embryos, whereas only three elements were important in leaves. The cis-acting sequences can be divided into embryo-specific, ABA-specific and leaf-specific elements on the basis of protein binding and the ability to confer expression of rab17. We found one new positive element, called GRA, with the sequence CACTGGCCGCCC. This element was important for transcription in leaves but not in embryos. Two other non-ABRE elements that stimulated transcription from the rab17 promoter resemble previously described abscisic acid and drought-inducible elements. There were differences in protein binding and function of the five ABREs in the rab17 promoter. The possible reasons for these differences are discussed. The in vivo data obtained suggest that an embryo-specific pathway regulates transcription of the rab genes during development, whereas another pathway is responsible for induction in response to ABA and drought in vegetative tissues.
2011-01-01
Background Insertion sequence (IS) elements are important mediators of genome plasticity and are widespread among bacterial and archaeal genomes. The 1.88 Mbp genome of the obligate intracellular amoeba symbiont Amoebophilus asiaticus contains an unusually large number of transposase genes (n = 354; 23% of all genes). Results The transposase genes in the A. asiaticus genome can be assigned to 16 different IS elements termed ISCaa1 to ISCaa16, which are represented by 2 to 24 full-length copies, respectively. Despite this high IS element load, the A. asiaticus genome displays a GC skew pattern typical for most bacterial genomes, indicating that no major rearrangements have occurred recently. Additionally, the high sequence divergence of some IS elements, the high number of truncated IS element copies (n = 143), as well as the absence of direct repeats in most IS elements suggest that the IS elements of A. asiaticus are transpositionally inactive. Although we could show transcription of 13 IS elements, we did not find experimental evidence for transpositional activity, corroborating our results from sequence analyses. However, we detected contiguous transcripts between IS elements and their downstream genes at nine loci in the A. asiaticus genome, indicating that some IS elements influence the transcription of downstream genes, some of which might be important for host cell interaction. Conclusions Taken together, the IS elements in the A. asiaticus genome are currently in the process of degradation and largely represent reflections of the evolutionary past of A. asiaticus in which its genome was shaped by their activity. PMID:21943072
Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment
2011-01-01
Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
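The divide, solve, and recombine idea behind vertical decomposition can be sketched as follows. This is a toy under stated assumptions, not the VDGA implementation: the function names are hypothetical, the cut points are simply proportional to sequence length, and `pad_align` is a placeholder that only pads with gaps where the paper aligns each block with a guide tree approach inside a genetic algorithm.

```python
def vertical_split(seqs, parts):
    """Cut every sequence into `parts` contiguous pieces at proportional positions.

    Returns one block per part; block p holds the p-th piece of every sequence.
    """
    blocks = [[] for _ in range(parts)]
    for s in seqs:
        cuts = [len(s) * p // parts for p in range(parts + 1)]
        for p in range(parts):
            blocks[p].append(s[cuts[p]:cuts[p + 1]])
    return blocks

def pad_align(block):
    """Placeholder aligner: right-pad with gaps to equal width.

    This stands in for a real per-block alignment and is NOT a guide-tree method.
    """
    width = max(len(s) for s in block)
    return [s.ljust(width, "-") for s in block]

def combine(aligned_blocks):
    """Concatenate the aligned pieces of each sequence back into full rows."""
    return ["".join(pieces) for pieces in zip(*aligned_blocks)]

seqs = ["ACGTACGT", "ACGACG", "ACGTTACGT"]  # made-up input
blocks = vertical_split(seqs, 3)
alignment = combine([pad_align(b) for b in blocks])
for row in alignment:
    print(row)
# AC-GTACGT
# AC-GA-CG-
# ACGTTACGT
```

Because each block is aligned to a common width before recombination, the concatenated rows always have equal length, and removing the gaps recovers the input sequences.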
Characterization of the UGA-recoding and SECIS-binding activities of SECIS-binding protein 2.
Bubenik, Jodi L; Miniard, Angela C; Driscoll, Donna M
2014-01-01
Selenium, a micronutrient, is primarily incorporated into human physiology as selenocysteine (Sec). The 25 Sec-containing proteins in humans are known as selenoproteins. Their synthesis depends on the translational recoding of the UGA stop codon to allow Sec insertion. This requires a stem-loop structure in the 3' untranslated region of eukaryotic mRNAs known as the Selenocysteine Insertion Sequence (SECIS). The SECIS is recognized by SECIS-binding protein 2 (SBP2) and this RNA:protein interaction is essential for UGA recoding to occur. Genetic mutations cause SBP2 deficiency in humans, resulting in a broad set of symptoms due to differential effects on individual selenoproteins. Progress on understanding the different phenotypes requires developing robust tools to investigate SBP2 structure and function. In this study we demonstrate that SBP2 protein produced by in vitro translation discriminates among SECIS elements in a competitive UGA recoding assay and has a much higher specific activity than bacterially expressed protein. We also show that a purified recombinant protein encompassing amino acids 517-777 of SBP2 binds to SECIS elements with high affinity and selectivity. The affinity of the SBP2:SECIS interaction correlated with the ability of a SECIS to compete for UGA recoding activity in vitro. The identification of a 250 amino acid sequence that mediates specific, selective SECIS-binding will facilitate future structural studies of the SBP2:SECIS complex. Finally, we identify an evolutionarily conserved core cysteine signature in SBP2 sequences from the vertebrate lineage. Mutation of multiple, but not single, cysteines impaired SECIS-binding but did not affect protein localization in cells.
Spuesens, Emiel B M; van de Kreeke, Nick; Estevão, Silvia; Hoogenboezem, Theo; Sluijter, Marcel; Hartwig, Nico G; van Rossum, Annemarie M C; Vink, Cornelis
2011-02-01
Mycoplasma pneumoniae is a human pathogen that causes a range of respiratory tract infections. The first step in infection is adherence of the bacteria to the respiratory epithelium. This step is mediated by a specialized organelle, which contains several proteins (cytadhesins) that have an important function in adherence. Two of these cytadhesins, P40 and P90, represent the proteolytic products from a single 130 kDa protein precursor, which is encoded by the MPN142 gene. Interestingly, MPN142 contains a repetitive DNA element, termed RepMP5, of which homologues are found at seven other loci within the M. pneumoniae genome. It has been hypothesized that these RepMP5 elements, which are similar but not identical in sequence, recombine with their counterpart within MPN142 and thereby provide a source of sequence variation for this gene. As this variation may give rise to amino acid changes within P40 and P90, the recombination between RepMP5 elements may constitute the basis of antigenic variation and, possibly, immune evasion by M. pneumoniae. To investigate the sequence variation of MPN142 in relation to inter-RepMP5 recombination, we determined the sequences of all RepMP5 elements in a collection of 25 strains. The results indicate that: (i) inter-RepMP5 recombination events have occurred in seven of the strains, and (ii) putative RepMP5 recombination events involving MPN142 have induced amino acid changes in a surface-exposed part of the P40 protein in two of the strains. We conclude that recombination between RepMP5 elements is a common phenomenon that may lead to sequence variation of MPN142-encoded proteins.
Zhou, Kaixin; Xie, Lianyan; Han, Lizhong; Guo, Xiaokui; Wang, Yong; Sun, Jingyong
2017-01-01
ICESag37, a novel integrative and conjugative element carrying multidrug resistance and potential virulence factors, was characterized in a clinical isolate of Streptococcus agalactiae. Two clinical strains of S. agalactiae, Sag37 and Sag158, were isolated from blood samples of newborns with bacteremia. Sag37 was highly resistant to erythromycin and tetracycline, and susceptible to levofloxacin and penicillin, while Sag158 was resistant to tetracycline and levofloxacin, and susceptible to erythromycin. Transfer experiments were performed and selection was carried out with suitable antibiotic concentrations. Through mating experiments, the erythromycin resistance gene was found to be transferable from Sag37 to Sag158. SmaI-PFGE revealed a new SmaI fragment, confirming the transfer of the fragment containing the erythromycin resistance gene. Whole genome sequencing and sequence analysis revealed a mobile element, ICESag37, which was characterized using several molecular methods and in silico analyses. ICESag37 was excised to generate a covalent circular intermediate, which was transferable to S. agalactiae. Inverse PCR was performed to detect the circular form. A serine family integrase mediated its chromosomal integration into rumA, which is a known hotspot for the integration of streptococcal ICEs. The integration site was confirmed using PCR. ICESag37 carried genes for resistance to multiple antibiotics, including erythromycin [erm(B)], tetracycline [tet(O)], and aminoglycosides [aadE, aphA, and ant(6)]. Potential virulence factors, including a two-component signal transduction system (nisK/nisR), were also observed in ICESag37. S1-PFGE analysis ruled out the existence of plasmids. ICESag37 is the first ICESa2603 family-like element identified in S. agalactiae carrying both resistance and potential virulence determinants. It might act as a vehicle for the dissemination of multidrug resistance and pathogenicity among S. agalactiae.
Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing.
Ryland, Georgina L; Jones, Kate; Chin, Melody; Markham, John; Aydogan, Elle; Kankanige, Yamuna; Caruso, Marisa; Guinto, Jerick; Dickinson, Michael; Prince, H Miles; Yong, Kwee; Blombery, Piers
2018-05-14
Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using a bioinformatics pipeline developed in-house. At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology.
Itzhaki, H; Maxson, J M; Woodson, W R
1994-09-13
The increased production of ethylene during carnation petal senescence regulates the transcription of the GST1 gene encoding a subunit of glutathione-S-transferase. We have investigated the molecular basis for this ethylene-responsive transcription by examining the cis elements and trans-acting factors involved in the expression of the GST1 gene. Transient expression assays following delivery of GST1 5' flanking DNA fused to a beta-glucuronidase reporter gene were used to functionally define sequences responsible for ethylene-responsive expression. Deletion analysis of the 5' flanking sequences of GST1 identified a single positive regulatory element of 197 bp between -667 and -470 necessary for ethylene-responsive expression. The sequences within this ethylene-responsive region were further localized to 126 bp between -596 and -470. The ethylene-responsive element (ERE) within this region conferred ethylene-regulated expression upon a minimal cauliflower mosaic virus-35S TATA-box promoter in an orientation-independent manner. Gel electrophoresis mobility-shift assays and DNase I footprinting were used to identify proteins that bind to sequences within the ERE. Nuclear proteins from carnation petals were shown to specifically interact with the 126-bp ERE and the presence and binding of these proteins were independent of ethylene or petal senescence. DNase I footprinting defined DNA sequences between -510 and -488 within the ERE specifically protected by bound protein. An 8-bp sequence (ATTTCAAA) within the protected region shares significant homology with promoter sequences required for ethylene responsiveness from the tomato fruit-ripening E4 gene.
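The coordinate arithmetic in this abstract can be checked with a few lines. The sketch below only restates numbers given in the text; the helper names are hypothetical, and coordinates are treated as positions relative to the transcription start, with length taken as the simple difference between the two ends.

```python
def span_bp(upstream, downstream):
    """Length implied by two promoter coordinates, e.g. -667 and -470."""
    return downstream - upstream

def contains(outer, inner):
    """True if the `inner` interval lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

deletion_region = (-667, -470)  # region identified by deletion analysis
ere = (-596, -470)              # ethylene-responsive element
footprint = (-510, -488)        # DNase I protected sequences
motif = "ATTTCAAA"              # sequence within the protected region

print(span_bp(*deletion_region))         # 197
print(span_bp(*ere))                     # 126
print(contains(deletion_region, ere))    # True
print(contains(ere, footprint))          # True
print(len(motif))                        # 8
```

The nesting confirms the narrowing described in the text: the footprint lies inside the 126-bp ERE, which lies inside the 197-bp region.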
Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav
2013-07-18
Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
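As a minimal sketch of what "reverse-complement aware" means for a double-stranded library (this is not the authors' de Bruijn graph decomposition, and the function names are illustrative assumptions), a k-mer and its reverse complement can be treated as one item, because a double-stranded oligomer presents both at once:

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the sequence read off the opposite strand, 5' to 3'."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(k):
    """All distinct k-mers once strand orientation is ignored."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=k)}


def kmers_covered(oligo, k):
    """Canonical k-mers presented by a double-stranded oligomer."""
    return {canonical(oligo[i:i + k]) for i in range(len(oligo) - k + 1)}


if __name__ == "__main__":
    # 4,096 6-mers collapse to 2,080 classes (64 are their own reverse complement).
    print(len(canonical_kmers(6)))
    # A 15-bp oligomer has 10 windows of length 6 per strand.
    print(len(kmers_covered("ACGTTGCAAGGCTAC", 6)))
```

Counting in canonical form is only the bookkeeping step; designing oligomers that cover every 6-mer exactly once is the harder problem the abstract attributes to the graph decomposition.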
A mobile element in mutS drives hypermutation in a marine Vibrio
Chu, Nathaniel D.; Clarke, Sean A.; Timberlake, Sonia; ...
2017-02-07
Bacteria face a trade-off between genetic fidelity, which reduces deleterious mistakes in the genome, and genetic innovation, which allows organisms to adapt. Evidence suggests that many bacteria balance this trade-off by modulating their mutation rates, but few mechanisms have been described for such modulation. Following experimental evolution and whole-genome resequencing of the marine bacterium Vibrio splendidus 12B01, we discovered one such mechanism, which allows this bacterium to switch to an elevated mutation rate. This switch is driven by the excision of a mobile element residing in mutS, which encodes a DNA mismatch repair protein. When integrated within the bacterial genome, the mobile element provides independent promoter and translation start sequences for mutS—different from the bacterium’s original mutS promoter region—which allow the bacterium to make a functional mutS gene product. Excision of this mobile element rejoins the mutS gene with host promoter and translation start sequences but leaves a 2-bp deletion in the mutS sequence, resulting in a frameshift and a hypermutator phenotype. We further identified hundreds of clinical and environmental bacteria across Betaproteobacteria and Gammaproteobacteria that possess putative mobile elements within the same amino acid motif in mutS. In a subset of these bacteria, we detected excision of the element but not a frameshift mutation; the mobile elements leave an intact mutS coding sequence after excision. Finally, our findings reveal a novel mechanism by which one bacterium alters its mutation rate and hint at a possible evolutionary role for mobile elements within mutS in other bacteria.
LINE-1 Elements in Structural Variation and Disease
Beck, Christine R.; Garcia-Perez, José Luis; Badge, Richard M.; Moran, John V.
2014-01-01
The completion of the human genome reference sequence ushered in a new era for the study and discovery of human transposable elements. It now is undeniable that transposable elements, historically dismissed as junk DNA, have had an instrumental role in sculpting the structure and function of our genomes. In particular, long interspersed element-1 (LINE-1 or L1) and short interspersed elements (SINEs) continue to affect our genome, and their movement can lead to sporadic cases of disease. Here, we briefly review the types of transposable elements present in the human genome and their mechanisms of mobility. We next highlight how advances in DNA sequencing and genomic technologies have enabled the discovery of novel retrotransposons in individual genomes. Finally, we discuss how L1-mediated retrotransposition events impact human genomes. PMID:21801021
NASA Astrophysics Data System (ADS)
Archer, C.; Noble, P. J.; Mensing, S. A.; Tunno, I.; Sagnotti, L.; Florindo, F.; Cifnani, G.; Zimmerman, S. R. H.; Piovesan, G.
2014-12-01
A 14.4 m thick sedimentary sequence was recovered in multiple cores from Lago Lungo in the Rieti Basin, an intrapenninic extensional basin ~80 km north of Rome, Italy. This sequence provides a high-resolution record of environmental change related to climatic influence and anthropogenic landscape alteration. Pollen analyses, corroborated with historical records of land-use change, define the major shifts in forest composition and their historical context. An age model of the sequence was built using ties to regional cultigen datums and archaeomagnetic reference curves. Here we focus on sedimentologic and geochemical data (scanning XRF) from the Roman Period through the Little Ice Age (LIA). The base of the sequence (ca. 680 BCE-1 CE) is marked by a steady increase in fine-grained detrital elements Ti, Rb, and K, and corresponding decrease in Ca, representing a transition from the unaltered system after the Romans constructed a channel that drained the basin. The Medieval Period (MP; 900-1350 CE) is lithologically distinct, composed of varicolored bands of alternating silt, clay, and calcareous concretions. Low counts of Ca, high detrital elements and frequent abrupt peaks in levels of the redox elements Fe and Mn indicate episodic clastic influx. Pollen data indicate that the greatest degree of deforestation and erosion occurred during the MP, supported by mean sedimentation rates of ca. 1 cm/year, over twice the rate of the underlying interval. The Medieval climate was warmer and more stable, population increased, and elevations >1000 m were exploited for agriculture. The influence of the Velino River on the lake appears to increase during the MP through channel migration, increased flooding, or increased overland flow. The next transition (1350 CE) marks the start of the LIA and is coincident with the Black Plague.
Historical records document a large earthquake in 1349 that severely struck Central Italy, with possible effects on the lake's depositional and hydrochemical regime. Clastic input abruptly ceases at the start of the LIA, and peaks in Sr, Ca, and S may be attributed to changes in lake inflow. Core analysis results, corroborated with historical documentation, provide new insights into the basin history and the underlying causes of environmental change.
Cell type-specific termination of transcription by transposable element sequences.
Conley, Andrew B; Jordan, I King
2012-09-30
Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3' UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young.
The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
NASA Technical Reports Server (NTRS)
Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris
2008-01-01
The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there are also considerable savings in time and effort. With the Sequence History Update Tool, what previously took minutes is now done in less than 30 seconds, and the tool provides a more accurate archival record of the sequence commanding for MRO.
Differential evolution-simulated annealing for multiple sequence alignment
NASA Astrophysics Data System (ADS)
Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.
2017-10-01
Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
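Two of the three objectives named above, non-gap percentage and totally conserved columns, are simple to compute for a candidate alignment. The sketch below is one plausible reading of those terms rather than the DESA-MSA scoring code: the function names, the `-` gap symbol, and the rule that a conserved column contains no gaps are assumptions, and the structural-information objective is omitted.

```python
def non_gap_percentage(alignment):
    """Percentage of alignment cells that hold a residue rather than a gap."""
    total_cells = sum(len(row) for row in alignment)
    residues = sum(ch != "-" for row in alignment for ch in row)
    return 100.0 * residues / total_cells


def totally_conserved_columns(alignment):
    """Number of columns where every sequence carries the same residue."""
    conserved = 0
    for column in zip(*alignment):  # iterate column by column
        if "-" not in column and len(set(column)) == 1:
            conserved += 1
    return conserved


if __name__ == "__main__":
    # Toy alignment of three equal-length rows (hypothetical data).
    toy = ["AC-GT", "ACAGT", "AC-GA"]
    print(round(non_gap_percentage(toy), 2))  # 13 of 15 cells are residues
    print(totally_conserved_columns(toy))     # columns 1, 2 and 4 are conserved
```

A multi-objective optimizer would evaluate functions like these on each candidate alignment and trade them off against one another.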
Fine-tuning the onset of myogenesis by homeobox proteins that interact with the Myf5 limb enhancer
Daubas, Philippe; Duval, Nathalie; Bajard, Lola; Langa Vives, Francina; Robert, Benoît; Mankoo, Baljinder S.; Buckingham, Margaret
2015-01-01
ABSTRACT Skeletal myogenesis in vertebrates is initiated at different sites of skeletal muscle formation during development, by activation of specific control elements of the myogenic regulatory genes. In the mouse embryo, Myf5 is the first myogenic determination gene to be expressed and its spatiotemporal regulation requires multiple enhancer sequences, extending over 120 kb upstream of the Mrf4-Myf5 locus. An enhancer, located at −57/−58 kb from Myf5, is responsible for its activation in myogenic cells derived from the hypaxial domain of the somite, that will form limb muscles. Pax3 and Six1/4 transcription factors are essential activators of this enhancer, acting on a 145-bp core element. Myogenic progenitor cells that will form the future muscle masses of the limbs express the factors necessary for Myf5 activation when they delaminate from the hypaxial dermomyotome and migrate into the forelimb bud, however they do not activate Myf5 and the myogenic programme until they have populated the prospective muscle masses. We show that Msx1 and Meox2 homeodomain-containing transcription factors bind in vitro and in vivo to specific sites in the 145-bp element, and are implicated in fine-tuning activation of Myf5 in the forelimb. Msx1, when bound between Pax and Six sites, prevents the binding of these key activators, thus inhibiting transcription of Myf5 and consequent premature myogenic differentiation. Meox2 is required for Myf5 activation at the onset of myogenesis via direct binding to other homeodomain sites in this sequence. Thus, these homeodomain factors, acting in addition to Pax3 and Six1/4, fine-tune the entry of progenitor cells into myogenesis at early stages of forelimb development. PMID:26538636
Smith, L. Courtney; Lun, Cheng Man
2017-01-01
The complex innate immune system of sea urchins is underpinned by several multigene families including the SpTransformer family (SpTrf; formerly Sp185/333) with estimates of ~50 members, although the family size is likely variable among individuals of Strongylocentrotus purpuratus. The genes are small with similar structure, are tightly clustered, and have several types of repeats in the second of two exons and that surround each gene. The density of repeats suggests that the genes are positioned within regions of genomic instability, which may be required to drive sequence diversification. The second exon encodes the mature protein and is composed of blocks of sequence called elements that are present in mosaics of defined element patterns and are the major source of sequence diversity. The SpTrf genes respond swiftly to immune challenge, but only a single gene is expressed per phagocyte. Many of the mRNAs appear to be edited and encode proteins with altered and/or missense sequence that are often truncated, of which some may be functional. The standard SpTrf protein structure is an N-terminal glycine-rich region, a central RGD motif, a histidine-rich region, and a C-terminal region. Function is predicted from a recombinant protein, rSpTransformer-E1 (rSpTrf-E1), which binds to Vibrio and Saccharomyces, but not to Bacillus, and binds tightly to lipopolysaccharide, β-1,3-glucan, and flagellin, but not to peptidoglycan. rSpTrf-E1 is intrinsically disordered but transforms to α helical structure in the presence of binding targets including lipopolysaccharide, which may underpin the characteristics of binding to multiple targets. SpTrf proteins associate with coelomocyte membranes, and rSpTrf-E1 binds specifically to phosphatidic acid (PA). When rSpTrf-E1 is bound to PA in liposome membranes, it induces morphological changes in liposomes that correlate with PA clustering and leakage of luminal contents, and it extracts or removes PA from the bilayer. 
The multitasking activities of rSpTrf-E1 imply multiple and perhaps overlapping activities for the hundreds of native SpTrf proteins that are produced by individual sea urchins. This likely generates a flexible and highly protective immune system for the sea urchin in its marine habitat that it shares with broad arrays of microbes that may be pathogens and opportunists. PMID:28713368
Badal, Martí; Xamena, Noel; Cabré, Oriol
2013-09-10
Most foldback elements are defective due to the lack of coding sequences but some are associated with coding sequences and may represent the entire element. This is the case of the NOF sequences found in the FB of Drosophila melanogaster, formerly considered as an autonomous TE and currently proposed as part of the so-called FB-NOF element, the transposon that would be complete and fully functional. NOF is always associated with FB and never seen apart from the FB inverted repeats (IR). This is the reason why the FB-NOF composite element can be considered the complete element. At least one of its ORFs encodes a protein that has always been considered its transposase, but no detailed studies have been carried out to verify this. In this work we test the hypothesis that FB-NOF is an active transposon nowadays. We search for its expression product, obtaining its cDNA, and propose the ORF and the sequence of its potential protein. We found that the NOF protein is not a transposase as it lacks any of the motifs of known transposases and also shows structural homology with hydrolases, therefore FB-NOF cannot belong to the superfamily MuDR/foldback, as up to now it has been classified, and can be considered as a non-autonomous transposable element. The alignment with the published genomes of 12 Drosophila species shows that NOF presence is restricted only to the 6 Drosophila species belonging to the melanogaster group. Copyright © 2013 Elsevier B.V. All rights reserved.
Wang, Xiaoli; Xie, Yingzhou; Li, Gang; Liu, Jialin; Li, Xiaobin; Tian, Lijun; Sun, Jingyong; Ou, Hong-Yu; Qu, Hongping
2018-01-01
Hypervirulent K. pneumoniae variants (hvKP) have been increasingly reported worldwide, causing metastasis of severe infections such as liver abscesses and bacteremia. The capsular serotype K2 hvKP strains show diverse multi-locus sequence types (MLSTs), but with limited genetics and virulence information. In this study, we report a hypermucoviscous K. pneumoniae strain, RJF293, isolated from a human bloodstream sample in a Chinese hospital. It caused a metastatic infection and fatal septic shock in a critical patient. The microbiological features and genetic background were investigated with multiple approaches. Strain RJF293 was determined to be multilocus sequence type (ST) 374 and serotype K2, displayed a median lethal dose (LD50) of 1.5 × 10² CFU in BALB/c mice and was as virulent as the ST23 K1 serotype hvKP strain NTUH-K2044 in a mouse lethality assay. Whole genome sequencing revealed that the RJF293 genome codes for 32 putative virulence factors and exhibits a unique presence/absence pattern in comparison to the other 105 completely sequenced K. pneumoniae genomes. Whole genome SNP-based phylogenetic analysis revealed that strain RJF293 formed a single clade, distant from those containing either ST66 or ST86 hvKP. Compared to the other sequenced hvKP chromosomes, RJF293 contains several strain-variable regions, including one prophage, one ICEKp1 family integrative and conjugative element and six large genomic islands. The sequencing of the first complete genome of an ST374 K2 hvKP clinical strain should reinforce our understanding of the epidemiology and virulence mechanisms of this bloodstream infection-causing hvKP with clinical significance.
Wang, Xiaoli; Xie, Yingzhou; Li, Gang; Liu, Jialin; Li, Xiaobin; Tian, Lijun; Sun, Jingyong; Qu, Hongping
2018-01-01
ABSTRACT Hypervirulent K. pneumoniae variants (hvKP) have been increasingly reported worldwide, causing metastasis of severe infections such as liver abscesses and bacteremia. The capsular serotype K2 hvKP strains show diverse multi-locus sequence types (MLSTs), but with limited genetics and virulence information. In this study, we report a hypermucoviscous K. pneumoniae strain, RJF293, isolated from a human bloodstream sample in a Chinese hospital. It caused a metastatic infection and fatal septic shock in a critical patient. The microbiological features and genetic background were investigated with multiple approaches. Strain RJF293 was determined to be multilocus sequence type (ST) 374 and serotype K2, displayed a median lethal dose (LD50) of 1.5 × 10² CFU in BALB/c mice and was as virulent as the ST23 K1 serotype hvKP strain NTUH-K2044 in a mouse lethality assay. Whole genome sequencing revealed that the RJF293 genome codes for 32 putative virulence factors and exhibits a unique presence/absence pattern in comparison to the other 105 completely sequenced K. pneumoniae genomes. Whole genome SNP-based phylogenetic analysis revealed that strain RJF293 formed a single clade, distant from those containing either ST66 or ST86 hvKP. Compared to the other sequenced hvKP chromosomes, RJF293 contains several strain-variable regions, including one prophage, one ICEKp1 family integrative and conjugative element and six large genomic islands. The sequencing of the first complete genome of an ST374 K2 hvKP clinical strain should reinforce our understanding of the epidemiology and virulence mechanisms of this bloodstream infection-causing hvKP with clinical significance. PMID:29338592
Legault, Boris A; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R Thane
2006-01-01
Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allow comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities. PMID:16820057
Glinsky, Gennadi V.
2015-01-01
Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. 
Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements. PMID:25956794
Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman
1985-01-01
A family of long, highly-repetitive sequences, referred to previously as `HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the `HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small `cruciforms' or `Z-DNA' structures which might be recognised during the recombination process. PMID:16453652
Animal vocal sequences: not the Markov chains we thought they were
Kershenbaum, Arik; Bowles, Ann E.; Freeberg, Todd M.; Jin, Dezhe Z.; Lameira, Adriano R.; Bohn, Kirsten
2014-01-01
Many animals produce vocal sequences that appear complex. Most researchers assume that these sequences are well characterized as Markov chains (i.e. that the probability of a particular vocal element can be calculated from the history of only a finite number of preceding elements). However, this assumption has never been explicitly tested. Furthermore, it is unclear how language could evolve in a single step from a Markovian origin, as is frequently assumed, as no intermediate forms have been found between animal communication and human language. Here, we assess whether animal taxa produce vocal sequences that are better described by Markov chains, or by non-Markovian dynamics such as the ‘renewal process’ (RP), characterized by a strong tendency to repeat elements. We examined vocal sequences of seven taxa: Bengalese finches Lonchura striata domestica, Carolina chickadees Poecile carolinensis, free-tailed bats Tadarida brasiliensis, rock hyraxes Procavia capensis, pilot whales Globicephala macrorhynchus, killer whales Orcinus orca and orangutans Pongo spp. The vocal systems of most of these species are more consistent with a non-Markovian RP than with the Markovian models traditionally assumed. Our data suggest that non-Markovian vocal sequences may be more common than Markov sequences, which must be taken into account when evaluating alternative hypotheses for the evolution of signalling complexity, and perhaps human language origins. PMID:25143037
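To make the Markov assumption concrete, the sketch below estimates first-order transition probabilities, i.e. the chance of each vocal element given only the element immediately before it. It illustrates the definition quoted above in its simplest (order 1) form and is not the authors' analysis; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Maximum-likelihood first-order transition table from one sequence."""
    counts = defaultdict(Counter)
    # Count how often each element follows each preceding element.
    for previous, current in zip(sequence, sequence[1:]):
        counts[previous][current] += 1
    # Normalise each row of counts into probabilities.
    return {
        previous: {cur: n / sum(row.values()) for cur, n in row.items()}
        for previous, row in counts.items()
    }


if __name__ == "__main__":
    # Toy sequence of two element types, A and B (hypothetical data).
    song = list("AABABBBA")
    for previous, row in transition_probabilities(song).items():
        print(previous, row)
```

A renewal-type process would not be captured by such a table, because under it the probability of repeating an element depends on how long the current run has already lasted, not just on the previous element.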
Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach
NASA Astrophysics Data System (ADS)
Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.
2012-10-01
In spite of the medical advances in recent years, the world needs new sources of agents to counter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To give easy access to information about RIPs, there is a need to analyse RIPs with a view to constructing a database on RIPs. Multiple sequence alignment was also performed to screen for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and were further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. Sequence length information for the various RIPs, indicating whether the full or a partial sequence is available, was also found. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
Primate-Specific Evolution of an LDLR Enhancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qian-fei; Prabhakar, Shyam; Wang, Qianben
2006-06-28
Sequence changes in regulatory regions have often been invoked to explain phenotypic divergence among species, but molecular examples of this have been difficult to obtain. In this study, we identified an anthropoid primate specific sequence element that contributed to the regulatory evolution of the LDL receptor. Using a combination of close and distant species genomic sequence comparisons coupled with in vivo and in vitro studies, we show that a functional cholesterol-sensing sequence motif arose and was fixed within a pre-existing enhancer in the common ancestor of anthropoid primates. Our study demonstrates one molecular mechanism by which ancestral mammalian regulatory elements can evolve to perform new functions in the primate lineage leading to human.
ERIC Educational Resources Information Center
Exner, Robert; And Others
The sixteen chapters of this book provide the core material for the Elements of Mathematics Program, a secondary sequence developed for highly motivated students with strong verbal abilities. The sequence is based on a functional-relational approach to mathematics teaching, and emphasizes teaching by analysis of real-life situations. This text is…
ERIC Educational Resources Information Center
Stott, Jon C.
1987-01-01
Suggests that children, even in early elementary grades, can grasp basic elements of children's literature using a spiralled sequence story curriculum, which helps them examine types of character, such as the trickster; elements of plot, such as the journey; and generally see patterns in the stories they read. (JC)
FAST - FREEDOM ASSEMBLY SEQUENCING TOOL PROTOTYPE
NASA Technical Reports Server (NTRS)
Borden, C. S.
1994-01-01
FAST is a project management tool designed to optimize the assembly sequence of Space Station Freedom. An appropriate assembly sequence coordinates engineering, design, utilization, transportation availability, and operations requirements. Since complex designs tend to change frequently, FAST assesses the system level effects of detailed changes and produces output metrics that identify preferred assembly sequences. FAST incorporates Space Shuttle integration, Space Station hardware, on-orbit operations, and programmatic drivers as either precedence relations or numerical data. Hardware sequencing information can either be input directly and evaluated via the "specified" mode of operation or evaluated from the input precedence relations in the "flexible" mode. In the specified mode, FAST takes as its input a list of the cargo elements assigned to each flight. The program determines positions for the cargo elements that maximize the center of gravity (c.g.) margin. These positions are restricted by the geometry of the cargo elements and the location of attachment fittings both in the orbiter and on the cargo elements. FAST calculates every permutation of cargo element location according to its height, trunnion fitting locations, and required intercargo element spacing. Each cargo element is tested in both its normal and reversed orientation (rotated 180 degrees). The best solution is that which maximizes the c.g. margin for each flight. In the flexible mode, FAST begins with the first flight and determines all feasible combinations of cargo elements according to mass, volume, EVA, and precedence relation constraints. The program generates an assembly sequence that meets mass, volume, position, EVA, and precedence constraints while minimizing the total number of Shuttle flights required. Issues associated with ground operations, spacecraft performance, logistics requirements and user requirements will be addressed in future versions of the model. 
FAST is written in C-Language and has been implemented on DEC VAX series computers running VMS. The program is distributed in executable form. The source code is also provided, but it cannot be compiled without the Tree Manipulation Based Routines (TMBR) package from the Jet Propulsion Laboratory, which is not currently available from COSMIC. The main memory requirement is based on the data used to drive the FAST program. All applications should easily run on an installation with 10Mb of main memory. FAST was developed in 1990 and is a copyrighted work with all copyright vested in NASA. DEC, VAX and VMS are trademarks of Digital Equipment Corporation.
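The "specified" mode described above, which tries every permutation and both orientations of the cargo elements and keeps the layout with the largest c.g. margin, can be sketched as a brute-force search. This is only an illustration under simplifying assumptions: the cargo data, the one-dimensional end-to-end layout, the `spacing` parameter and the c.g. limits are all hypothetical, and the trunnion fitting constraints mentioned in the abstract are not modelled.

```python
from itertools import permutations, product


def center_of_gravity(order, flips, spacing):
    """Longitudinal c.g. of cargo elements laid end to end starting at x = 0."""
    x = 0.0
    moment = 0.0
    total_mass = 0.0
    for (name, mass, length, cg_offset), flipped in zip(order, flips):
        # A reversed element (rotated 180 degrees) mirrors its own c.g. offset.
        local = length - cg_offset if flipped else cg_offset
        moment += mass * (x + local)
        total_mass += mass
        x += length + spacing
    return moment / total_mass


def best_layout(cargo, cg_min, cg_max, spacing=0.5):
    """Try every ordering and orientation; keep the largest c.g. margin."""
    best = None
    for order in permutations(cargo):
        for flips in product((False, True), repeat=len(order)):
            cg = center_of_gravity(order, flips, spacing)
            margin = min(cg - cg_min, cg_max - cg)  # distance to nearer limit
            if best is None or margin > best[0]:
                best = (margin, [c[0] for c in order], flips, cg)
    return best


# Hypothetical cargo: (name, mass, length, c.g. offset from the element's front).
cargo = [("truss", 4000.0, 6.0, 2.0), ("module", 9000.0, 8.0, 4.0)]
margin, order, flips, cg = best_layout(cargo, cg_min=5.0, cg_max=9.0)
print(order, flips, round(cg, 3), round(margin, 3))
```

The search grows factorially with the number of cargo elements, which is acceptable only because a single flight carries a small number of elements.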
Sequence information signal processor for local and global string comparisons
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1997-01-01
A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16-bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
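For readers unfamiliar with the underlying recurrence, here is a minimal software sketch of Smith-Waterman local alignment scoring. It is not a description of the chip's circuitry: the match, mismatch and linear gap values are assumed for illustration, whereas the abstract describes configurable weight functions and similarity measures. Scores are clamped to the signed 16-bit maximum to mirror the stated score range.

```python
INT16_MAX = 32767  # upper end of the 16-bit two's complement range


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, (row, col) of the best cell)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                    # start a fresh local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b (deletion)
                H[i][j - 1] + gap,    # gap in a (insertion)
            )
            score = min(score, INT16_MAX)  # saturate like 16-bit hardware
            H[i][j] = score
            if score > best:
                best, best_pos = score, (i, j)
    return best, best_pos


print(smith_waterman("TTACGTTT", "GGACGGG"))
```

Each anti-diagonal of `H` depends only on the two previous anti-diagonals, which is what makes the recurrence suitable for a linear systolic array with one processing element per column.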
AlignMe—a membrane protein sequence alignment web server
Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.
2014-01-01
We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
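The family-averaged hydropathy profile used in the HP mode can be sketched as a column-wise average over a multiple sequence alignment. This is an illustration only: it assumes the Kyte-Doolittle hydropathy scale and simple gap skipping, and the function name is hypothetical. The abstract does not specify which scale, smoothing or gap handling AlignMe applies.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def averaged_hydropathy_profile(msa):
    """Average hydropathy per alignment column, ignoring gaps and unknowns."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*msa):
        values = [KYTE_DOOLITTLE[r] for r in column if r in KYTE_DOOLITTLE]
        # A column with no scorable residue has no defined average.
        profile.append(sum(values) / len(values) if values else None)
    return profile


# Toy alignment (hypothetical sequences); '-' marks a gap.
msa = ["AI-L", "AV-L", "GIKL"]
profile = averaged_hydropathy_profile(msa)
print(profile)
```

Two such profiles, one per input alignment, would then be aligned against each other, for example with a dynamic programming scheme that scores columns by the similarity of their averaged values.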
Sun, Cheng; Wyngaard, Grace; Walton, D Brian; Wichman, Holly A; Mueller, Rachel Lockridge
2014-03-11
Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution, including some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15–75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline-soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms. PMID:24618421
Enhanced sequencing coverage with digital droplet multiple displacement amplification
Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.
2016-01-01
Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carte, Jason; Wang, Ruiying; Li, Hong
An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here we identified Pyrococcus furiosus Cas6 as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5′ region of the CRISPR repeat element and cleaves at a defined site within the 3′ region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.
Andrews, T Daniel; Gojobori, Takashi
2004-01-01
The PilE protein is the major component of the Neisseria meningitidis pilus, which is encoded by the pilE/pilS locus that includes an expressed gene and eight homologous silent fragments. The silent gene fragments have been shown to recombine through gene conversion with the expressed gene and thereby provide a means by which novel antigenic variants of the PilE protein can be generated. We have analyzed the evolutionary rate of the pilE gene using the nucleotide sequence of two complete pilE/pilS loci. The very high rate of evolution displayed by the PilE protein appears driven by both recombination and positive selection. Within the semivariable region of the pilE and pilS genes, recombination appears to occur within multiple small sequence blocks that lie between conserved sequence elements. Within the hypervariable region, positive selection was identified from comparison of the silent and expressed genes. The unusual gene conversion mechanism that operates at the pilE/pilS locus is a strategy employed by N. meningitidis to enhance mutation of certain regions of the PilE protein. The silent copies of the gene effectively allow "parallelized" evolution of pilE, thus enabling the encoded protein to rapidly explore a large area of sequence space in an effort to find novel antigenic variants.
Characterization of short interspersed elements (SINEs) in a red alga, Porphyra yezoensis.
Zhang, Wenbo; Lin, Xiaofei; Peddigari, Suresh; Takechi, Katsuaki; Takano, Hiroyoshi; Takio, Susumu
2007-02-01
Short interspersed element (SINE)-like sequences referred to as PySN1 and PySN2 were identified in a red alga, Porphyra yezoensis. Both elements contained an internal promoter with motifs (A box and B box) recognized by RNA polymerase III, and target site duplications at both ends. Genomic Southern blot analysis revealed that both elements were widely and abundantly distributed on the genome. 3' and 5' RACE suggested that PySN1 was expressed as a chimera transcript with flanking SINE-unrelated sequences and possessed the poly-A tail at the same position near the 3' end of PySN1.
ISEScan: automated identification of insertion sequence elements in prokaryotic genomes.
Xie, Zhiqun; Tang, Haixu
2017-11-01
The insertion sequence (IS) elements are the smallest but most abundant autonomous transposable elements in prokaryotic genomes, which play a key role in prokaryotic genome organization and evolution. With the fast growing genomic data, it is becoming increasingly critical for biology researchers to be able to accurately and automatically annotate ISs in prokaryotic genome sequences. The available automatic IS annotation systems are either providing only incomplete IS annotation or relying on the availability of existing genome annotations. Here, we present a new IS elements annotation pipeline to address these issues. ISEScan is a highly sensitive software pipeline based on profile hidden Markov models constructed from manually curated IS elements. ISEScan performs better than existing IS annotation systems when tested on prokaryotic genomes with curated annotations of IS elements. Applying it to 2784 prokaryotic genomes, we report the global distribution of IS families across taxonomic clades in Archaea and Bacteria. ISEScan is implemented in Python and released as an open source software at https://github.com/xiezhq/ISEScan. Supplementary data are available at Bioinformatics online.
Searching for nuclear export elements in hepatitis D virus RNA.
Freitas, Natália; Cunha, Celso
2013-08-12
This study searched for cis elements in hepatitis D virus (HDV) genomic and antigenomic RNA capable of promoting nuclear export. We made use of a well characterized chloramphenicol acetyl-transferase reporter system based on plasmid pDM138. Twenty cDNA fragments corresponding to different HDV genomic and antigenomic RNA sequences were inserted in plasmid pDM138, and used in transfection experiments in Huh7 cells. The relative amounts of HDV RNA in nuclear and cytoplasmic fractions were then determined by real-time polymerase chain reaction and Northern blotting. The secondary structure of the RNA sequences that displayed nuclear export ability was further predicted using a web interface. Finally, the sensitivity to leptomycin B was assessed in order to investigate possible cellular pathways involved in HDV RNA nuclear export. Analysis of genomic RNA sequences did not identify an unequivocal nuclear export element. However, two regions were found to promote the export of reporter mRNAs with efficiency higher than the negative controls albeit lower than the positive control. These regions correspond to nucleotides 266-489 and 584-920, respectively. In addition, when analyzing antigenomic RNA sequences, a nuclear export element was found in positions 214-417. Export mediated by the nuclear export element of HDV antigenomic RNA is sensitive to leptomycin B, suggesting a possible role of CRM1 in this transport pathway. A cis-acting nuclear export element is present in nucleotides 214-417 of HDV antigenomic RNA.
Taylor, James; Tyekucheva, Svitlana; King, David C; Hardison, Ross C; Miller, Webb; Chiaromonte, Francesca
2006-12-01
Genomic sequence signals - such as base composition, presence of particular motifs, or evolutionary constraint - have been used effectively to identify functional elements. However, approaches based only on specific signals known to correlate with function can be quite limiting. When training data are available, application of computational learning algorithms to multispecies alignments has the potential to capture broader and more informative sequence and evolutionary patterns that better characterize a class of elements. However, effective exploitation of patterns in multispecies alignments is impeded by the vast number of possible alignment columns and by a limited understanding of which particular strings of columns may characterize a given class. We have developed a computational method, called ESPERR (evolutionary and sequence pattern extraction through reduced representations), which uses training examples to learn encodings of multispecies alignments into reduced forms tailored for the prediction of chosen classes of functional elements. ESPERR produces a greatly improved Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (approximately 94%). This score captures strong signals (GC content and conservation), as well as subtler signals (with small contributions from many different alignment patterns) that characterize the regulatory elements in our training set. ESPERR is also effective for predicting other classes of functional elements, as we show for DNaseI hypersensitive sites and highly conserved regions with developmental enhancer activity. Our software, training data, and genome-wide predictions are available from our Web site (http://www.bx.psu.edu/projects/esperr).
System and method for image registration of multiple video streams
Dillavou, Marcus W.; Shum, Phillip Corey; Guthrie, Baron L.; Shenai, Mahesh B.; Deaton, Drew Steven; May, Matthew Benton
2018-02-06
Provided herein are methods and systems for image registration from multiple sources. A method for image registration includes rendering a common field of interest that reflects a presence of a plurality of elements, wherein at least one of the elements is a remote element located remotely from another of the elements and updating the common field of interest such that the presence of the at least one of the elements is registered relative to another of the elements.
Conserved interdomain linker promotes phase separation of the multivalent adaptor protein Nck
Banjade, Sudeep; Wu, Qiong; Mittal, Anuradha; Peeples, William B.; Pappu, Rohit V.; Rosen, Michael K.
2015-01-01
The organization of membranes, the cytosol, and the nucleus of eukaryotic cells can be controlled through phase separation of lipids, proteins, and nucleic acids. Collective interactions of multivalent molecules mediated by modular binding domains can induce gelation and phase separation in several cytosolic and membrane-associated systems. The adaptor protein Nck has three SRC-homology 3 (SH3) domains that bind multiple proline-rich segments in the actin regulatory protein neuronal Wiskott-Aldrich syndrome protein (N-WASP) and an SH2 domain that binds to multiple phosphotyrosine sites in the adhesion protein nephrin, leading to phase separation. Here, we show that the 50-residue linker between the first two SH3 domains of Nck enhances phase separation of Nck/N-WASP/nephrin assemblies. Two linear motifs within this element, as well as its overall positively charged character, are important for this effect. The linker increases the driving force for self-assembly of Nck, likely through weak interactions with the second SH3 domain, and this effect appears to promote phase separation. The linker sequence is highly conserved, suggesting that the sequence determinants of the driving forces for phase separation may be generally important to Nck functions. Our studies demonstrate that linker regions between modular domains can contribute to the driving forces for self-assembly and phase separation of multivalent proteins. PMID:26553976
Kijima, T E; Innan, Hideki
2013-11-01
A population genetic simulation framework is developed to understand the behavior and molecular evolution of DNA sequences of transposable elements. Our model incorporates random transposition and excision of transposable element (TE) copies, two modes of selection against TEs, and degeneration of transpositional activity by point mutations. We first investigated the relationships between the behavior of the copy number of TEs and these parameters. Our results show that when selection is weak, the genome can maintain a relatively large number of TEs, but most of them are less active. In contrast, with strong selection, the genome can maintain only a limited number of TEs but the proportion of active copies is large. In such a case, there could be substantial fluctuations of the copy number over generations. We also explored how DNA sequences of TEs evolve through the simulations. In general, active copies form clusters around the original sequence, while less active copies have long branches specific to themselves, exhibiting a star-shaped phylogeny. It is demonstrated that the phylogeny of TE sequences could be informative to understand the dynamics of TE evolution.
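The kind of simulation described above can be sketched as a toy forward model. This is not the authors' framework: it assumes an asexual Wright-Fisher population, a single linear mode of selection on total copy number (the abstract mentions two modes), and hypothetical parameter names and values (`u` transposition, `v` excision, `m` inactivating mutation, `s` selection). Each individual is tracked as a pair of active and inactive copy counts.

```python
import random


def simulate(pop_size=100, generations=50, u=0.02, v=0.005,
             m=0.01, s=0.002, seed=1):
    """Return per-generation (mean active, mean inactive) TE copy numbers."""
    rng = random.Random(seed)
    pop = [(5, 0)] * pop_size  # every individual starts with 5 active copies
    history = []
    for _ in range(generations):
        # Selection: fitness declines linearly with total copy number.
        weights = [max(0.0, 1.0 - s * (a + i)) for a, i in pop]
        if sum(weights) == 0:
            break  # population extinct under this toy fitness function
        parents = rng.choices(pop, weights=weights, k=pop_size)
        next_pop = []
        for active, inactive in parents:
            # Transposition: only active copies produce new (active) copies.
            active += sum(rng.random() < u for _ in range(active))
            # Point mutations degrade active copies into inactive ones.
            lost = sum(rng.random() < m for _ in range(active))
            active -= lost
            inactive += lost
            # Excision removes copies of either kind.
            active -= sum(rng.random() < v for _ in range(active))
            inactive -= sum(rng.random() < v for _ in range(inactive))
            next_pop.append((active, inactive))
        pop = next_pop
        history.append((
            sum(a for a, _ in pop) / pop_size,
            sum(i for _, i in pop) / pop_size,
        ))
    return history


history = simulate()
print(history[-1])  # mean active and inactive copy numbers at the end
```

Varying `s` while holding the other parameters fixed is the natural way to explore the weak-selection versus strong-selection contrast that the abstract reports, although this toy model is not guaranteed to reproduce those results.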